Calibration of the logistic model

We present a calibration study of the logistic growth model provided by the openturns.usecases module.
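For reference, the population y(t) is modeled by a logistic (Verhulst) growth equation. The sketch below uses the same conventions as the Python function defined later in this example, where a is the growth rate and a/b the limiting population:

\frac{dy}{dt} = a y - b y^2, \qquad y(t_0) = y_0,
\qquad
y(t) = \frac{a y_0}{b y_0 + (a - b y_0) e^{-a (t - t_0)}}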

Analysis of the data

import openturns as ot
import numpy as np
import openturns.viewer as viewer
from matplotlib import pylab as plt
ot.Log.Show(ot.Log.NONE)

We load the logistic model from the usecases module:

from openturns.usecases import logistic_model as logistic_model
lm = logistic_model.LogisticModel()

The data consists of the population, in millions, observed at 22 dates from 1790 to 2000. The observation points are stored in the data field:

observedSample = lm.data
nbobs = observedSample.getSize()
nbobs

Out:

22
timeObservations = observedSample[:,0]
timeObservations[0:5]
      v0
0   1790
1   1800
2   1810
3   1820
4   1830


populationObservations = observedSample[:,1]
populationObservations[0:5]
      v1
0    3.9
1    5.3
2    7.2
3    9.6
4   13


graph = ot.Graph('', 'Time (years)', 'Population (Millions)', True, 'topleft')
cloud = ot.Cloud(timeObservations, populationObservations)
cloud.setLegend("Observations")
graph.add(cloud)
view = viewer.View(graph)
[Figure: observed population vs. time]

We consider the times and populations as vectors of dimension 22. The logisticModel function takes a vector of dimension 24 as input and returns a vector of dimension 22. The first 22 components of the input vector contain the dates and the remaining two components are the parameters a and c, where c = log(b).

nbdates = 22
def logisticModel(X):
    # The first 22 components are the dates, the last two are the
    # parameters a and c = log(b).
    t = [X[i] for i in range(nbdates)]
    a = X[22]
    c = X[23]
    t0 = 1790.   # initial date
    y0 = 3.9e6   # initial population
    b = np.exp(c)
    y = [0.0] * nbdates
    for i in range(nbdates):
        # Closed-form solution of the logistic equation
        y[i] = a * y0 / (b * y0 + (a - b * y0) * np.exp(-a * (t[i] - t0)))
    z = [yi / 1.e6 for yi in y]  # Convert into millions
    return z
logisticModelPy = ot.PythonFunction(24, nbdates, logisticModel)

We now define the reference values of the parameters. Because b is very small with respect to a, we work with its logarithm c = log(b).

np.log(1.5587e-10)

Out:

-22.581998789427587
a=0.03134
c=-22.58
thetaPrior = [a,c]
logisticParametric = ot.ParametricFunction(logisticModelPy,[22,23],thetaPrior)

Check that we can evaluate the parametric function.

populationPredicted = logisticParametric(timeObservations.asPoint())
populationPredicted

[3.9,5.29757,7.17769,9.69198,13.0277,17.4068,23.0769,30.2887,39.2561,50.0977,62.7691,77.0063,92.311,108.001,123.322,137.59,150.3,161.184,170.193,177.442,183.144,187.55]#22



graph = ot.Graph('', 'Time (years)', 'Population (Millions)', True, 'topleft')
# Observations
cloud = ot.Cloud(timeObservations, populationObservations)
cloud.setLegend("Observations")
cloud.setColor("blue")
graph.add(cloud)
# Predictions
cloud = ot.Cloud(timeObservations.asPoint(), populationPredicted)
cloud.setLegend("Predictions")
cloud.setColor("green")
graph.add(cloud)
view = viewer.View(graph)
[Figure: observations and predictions at the prior]

We see that the fit is not good: the observations continue to grow after 1950, while the growth of the predictions seems to fade.
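To put a number on this visual impression, we can for instance compute the residual sum of squares between the observations and the predictions at the prior. This quick check is not part of the original script and only reuses the objects defined above.

# Quantify the misfit at the prior (illustrative check).
residualsPrior = [populationObservations[i, 0] - populationPredicted[i] for i in range(nbobs)]
rssPrior = sum(r**2 for r in residualsPrior)
print("Residual sum of squares at the prior:", rssPrior)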

Calibration with linear least squares
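In outline, and assuming the usual behaviour of a linearized least squares method (the notation h, J_prior and y^obs below is introduced here for explanation only), the algorithm linearizes the parametric model around the prior point and solves the resulting linear least squares problem:

\hat{\theta} = \operatorname{arg\,min}_{\theta}
\left\| y^{obs} - h(x, \theta_{prior}) - J_{prior} (\theta - \theta_{prior}) \right\|_2^2

where h is the parametric logistic model, J_{prior} its Jacobian with respect to \theta at \theta_{prior}, and y^{obs} the vector of observed populations.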

timeObservationsVector = ot.Sample([[timeObservations[i,0] for i in range(nbobs)]])
timeObservationsVector[0:10]
     v0    v1    v2    v3    v4    v5    v6    v7    v8    v9    v10   v11   v12   v13   v14   v15   v16   v17   v18   v19   v20   v21
0    1790  1800  1810  1820  1830  1840  1850  1860  1870  1880  1890  1900  1910  1920  1930  1940  1950  1960  1970  1980  1990  2000


populationObservationsVector = ot.Sample([[populationObservations[i, 0] for i in range(nbobs)]])
populationObservationsVector[0:10]
     v0    v1    v2    v3    v4    v5    v6    v7    v8    v9    v10   v11   v12   v13   v14   v15   v16   v17   v18   v19   v20   v21
0    3.9   5.3   7.2   9.6   13    17    23    31    39    50    62    76    92    106   123   132   151   179   203   221   250   281


The reference values of the parameters.

a=0.03134
c=-22.58
thetaPrior = [a,c]
logisticParametric = ot.ParametricFunction(logisticModelPy,[22,23],thetaPrior)

Check that we can evaluate the parametric function.

populationPredicted = logisticParametric(timeObservationsVector)
populationPredicted[0:10]
     y0    y1        y2        y3        y4        y5        y6        y7       y8        y9        y10       y11      y12       y13       y14       y15       y16       y17       y18      y19       y20       y21
0    3.9   5.297571  7.177694  9.691977  13.02769  17.40682  23.07691  30.2887  39.25606  50.09767  62.76907  77.0063  92.31103  108.0009  123.3223  137.5899  150.3003  161.1843  170.193  177.4422  183.1443  187.5496


Calibration

algo = ot.LinearLeastSquaresCalibration(logisticParametric, timeObservationsVector, populationObservationsVector, thetaPrior)
algo.run()
calibrationResult = algo.getResult()
thetaMAP = calibrationResult.getParameterMAP()
thetaMAP

[0.0265958,-23.1714]
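Since the model is parametrized with c = log(b), we can recover the corresponding value of b from the calibrated parameters. This small check is illustrative and not part of the original script.

# Recover b = exp(c) from the calibrated parameter vector (a, c).
aMAP = thetaMAP[0]
bMAP = np.exp(thetaMAP[1])
print("a (MAP):", aMAP)
print("b (MAP):", bMAP)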



thetaPosterior = calibrationResult.getParameterPosterior()
thetaPosterior.computeBilateralConfidenceIntervalWithMarginalProbability(0.95)[0]

[0.0246465, 0.028545]
[-23.3182, -23.0247]
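As a quick sanity check (again illustrative, not part of the original script), we can test whether the prior values lie inside this 95% confidence interval. Note that the interval for a, [0.0246465, 0.028545], does not contain the prior value a = 0.03134, so the calibration moves a noticeably away from its prior.

# Does the 95% posterior confidence interval contain the prior parameters?
interval = thetaPosterior.computeBilateralConfidenceIntervalWithMarginalProbability(0.95)[0]
print(interval.contains(ot.Point(thetaPrior)))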



We transpose the samples so that the results are interpreted as several scalar observations, rather than as a single observation of a field model:

if calibrationResult.getInputObservations().getSize() == 1:
    calibrationResult.setInputObservations([timeObservations[i] for i in range(nbdates)])
    calibrationResult.setOutputObservations([populationObservations[i] for i in range(nbdates)])
    outputAtPrior = [[calibrationResult.getOutputAtPriorMean()[0, i]] for i in range(nbdates)]
    outputAtPosterior = [[calibrationResult.getOutputAtPosteriorMean()[0, i]] for i in range(nbdates)]
    calibrationResult.setOutputAtPriorAndPosteriorMean(outputAtPrior, outputAtPosterior)
graph = calibrationResult.drawObservationsVsInputs()
view = viewer.View(graph)
[Figure: observations vs. inputs]
graph = calibrationResult.drawObservationsVsPredictions()
view = viewer.View(graph)
[Figure: observations vs. predictions]
graph = calibrationResult.drawResiduals()
view = viewer.View(graph)
[Figure: residual analysis]
graph = calibrationResult.drawParameterDistributions()
view = viewer.View(graph)

[Figure: prior and posterior parameter distributions]

plt.show()
