Calibration of the logistic model

We present a calibration study of the logistic growth model. In this example, we calibrate the parameters of a model which predicts the dynamics of the size of a population. This shows how to calibrate a model which predicts a time-dependent output: the output time series must be viewed as a single vector.
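For reference, here is the model being calibrated, reconstructed from the closed-form expression used in the code below: the logistic ordinary differential equation and its solution.

```latex
% Logistic growth ODE with initial condition, and its closed-form solution
\frac{dy}{dt} = a\,y - b\,y^2, \qquad y(t_0) = y_0,
\qquad
y(t) = \frac{a\,y_0}{b\,y_0 + \left(a - b\,y_0\right)e^{-a\,(t - t_0)}}
```

The parameter a is the growth rate and b quantifies the competition between individuals, which makes the population saturate at the carrying capacity a/b.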

Analysis of the data

from openturns.usecases import logistic_model
import openturns as ot
import numpy as np
import openturns.viewer as viewer
from matplotlib import pylab as plt

ot.Log.Show(ot.Log.NONE)

We load the logistic model from the usecases module:

lm = logistic_model.LogisticModel()

The data is based on 22 dates from 1790 to 2000. The observation points are stored in the data field:

observedSample = lm.data
nbobs = observedSample.getSize()
nbobs
22
timeObservations = observedSample[:, 0]
timeObservations[0:5]
     v0
0    1790
1    1800
2    1810
3    1820
4    1830


populationObservations = observedSample[:, 1]
populationObservations[0:5]
     v1
0    3.9
1    5.3
2    7.2
3    9.6
4    13


graph = ot.Graph("", "Time (years)", "Population (Millions)", True, "topleft")
cloud = ot.Cloud(timeObservations, populationObservations)
cloud.setLegend("Observations")
graph.add(cloud)
view = viewer.View(graph)

We consider the times and populations as vectors of dimension 22. The logisticModel function takes a vector of dimension 24 as input and returns a vector of dimension 22. The first 22 components of the input vector contain the dates and the remaining 2 components are the parameters a and c (the logarithm of b).

nbdates = 22


def logisticModel(X):
    t = [X[i] for i in range(nbdates)]
    a = X[22]
    c = X[23]
    t0 = 1790.0  # initial time
    y0 = 3.9e6  # initial population
    b = np.exp(c)  # c is the logarithm of b
    y = [0.0] * nbdates
    for i in range(nbdates):
        y[i] = a * y0 / (b * y0 + (a - b * y0) * np.exp(-a * (t[i] - t0)))
    z = [yi / 1.0e6 for yi in y]  # convert into millions
    return z


logisticModelPy = ot.PythonFunction(24, nbdates, logisticModel)
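As a quick standalone sanity check of the closed-form solution (a hypothetical check, not part of the original study), note that at t = t0 the formula must return exactly y0, whatever the values of a and b:

```python
import numpy as np

# Prior parameter values used later in the study
a = 0.03134           # growth rate
c = -22.58            # logarithm of the interaction parameter b
b = np.exp(c)
t0, y0 = 1790.0, 3.9e6


def logistic(t):
    # Closed-form solution of dy/dt = a*y - b*y^2, converted into millions
    return a * y0 / (b * y0 + (a - b * y0) * np.exp(-a * (t - t0))) / 1.0e6


print(logistic(1790.0))  # 3.9 by construction, since the denominator reduces to a
```

The population is increasing in time, so later dates must give larger values.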

The reference values of the parameters.

Because b is so small with respect to a, we calibrate its logarithm c = log(b) instead of b itself.

np.log(1.5587e-10)
-22.581998789427587
a = 0.03134
c = -22.58
thetaPrior = [a, c]
logisticParametric = ot.ParametricFunction(logisticModelPy, [22, 23], thetaPrior)

Check that we can evaluate the parametric function.

populationPredicted = logisticParametric(timeObservations.asPoint())
populationPredicted

[3.9,5.29757,7.17769,9.69198,13.0277,17.4068,23.0769,30.2887,39.2561,50.0977,62.7691,77.0063,92.311,108.001,123.322,137.59,150.3,161.184,170.193,177.442,183.144,187.55]#22



graph = ot.Graph("", "Time (years)", "Population (Millions)", True, "topleft")
# Observations
cloud = ot.Cloud(timeObservations, populationObservations)
cloud.setLegend("Observations")
cloud.setColor("blue")
graph.add(cloud)
# Predictions
cloud = ot.Cloud(timeObservations.asPoint(), populationPredicted)
cloud.setLegend("Predictions")
cloud.setColor("green")
graph.add(cloud)
view = viewer.View(graph)

We see that the fit is not good: the observations continue to grow after 1950, while the growth of the prediction seems to fade.

Calibration with linear least squares

timeObservationsVector = ot.Sample([[timeObservations[i, 0] for i in range(nbobs)]])
timeObservationsVector[0:10]
     v0    v1    v2    v3    v4    v5    v6    v7    v8    v9    v10   v11   v12   v13   v14   v15   v16   v17   v18   v19   v20   v21
0    1790  1800  1810  1820  1830  1840  1850  1860  1870  1880  1890  1900  1910  1920  1930  1940  1950  1960  1970  1980  1990  2000


populationObservationsVector = ot.Sample(
    [[populationObservations[i, 0] for i in range(nbobs)]]
)
populationObservationsVector[0:10]
     v0   v1   v2   v3   v4   v5   v6   v7   v8   v9   v10  v11  v12  v13  v14  v15  v16  v17  v18  v19  v20  v21
0    3.9  5.3  7.2  9.6  13   17   23   31   39   50   62   76   92   106  123  132  151  179  203  221  250  281


The reference values of the parameters.

a = 0.03134
c = -22.58
thetaPrior = [a, c]
logisticParametric = ot.ParametricFunction(logisticModelPy, [22, 23], thetaPrior)

Check that we can evaluate the parametric function.

populationPredicted = logisticParametric(timeObservationsVector)
populationPredicted[0:10]
     y0   y1        y2        y3        y4        y5        y6        y7       y8        y9        y10       y11      y12       y13       y14       y15       y16       y17       y18      y19       y20       y21
0    3.9  5.297571  7.177694  9.691977  13.02769  17.40682  23.07691  30.2887  39.25606  50.09767  62.76907  77.0063  92.31103  108.0009  123.3223  137.5899  150.3003  161.1843  170.193  177.4422  183.1443  187.5496


Calibration

algo = ot.LinearLeastSquaresCalibration(
    logisticParametric, timeObservationsVector, populationObservationsVector, thetaPrior
)
algo.run()
calibrationResult = algo.getResult()
thetaMAP = calibrationResult.getParameterMAP()
thetaMAP

[0.0265958,-23.1714]



thetaPosterior = calibrationResult.getParameterPosterior()
thetaPosterior.computeBilateralConfidenceIntervalWithMarginalProbability(0.95)[0]

[0.0246465, 0.028545]
[-23.3182, -23.0247]
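To illustrate what linear least squares calibration does, here is an assumption-based sketch (not the OpenTURNS implementation): linearize the model around the prior parameters with a finite-difference Jacobian, then solve the resulting linear least squares problem for the parameter update.

```python
import numpy as np

t = np.arange(1790.0, 2001.0, 10.0)  # the 22 observation dates
y_obs = np.array([3.9, 5.3, 7.2, 9.6, 13, 17, 23, 31, 39, 50, 62,
                  76, 92, 106, 123, 132, 151, 179, 203, 221, 250, 281])


def model(theta):
    # Same closed-form logistic solution, vectorized over the dates
    a, c = theta
    b = np.exp(c)
    t0, y0 = 1790.0, 3.9e6
    return a * y0 / (b * y0 + (a - b * y0) * np.exp(-a * (t - t0))) / 1.0e6


theta_prior = np.array([0.03134, -22.58])

# Central finite-difference Jacobian of the model at the prior
J = np.empty((t.size, 2))
for j in range(2):
    step = np.zeros(2)
    step[j] = 1.0e-7 * max(1.0, abs(theta_prior[j]))
    J[:, j] = (model(theta_prior + step) - model(theta_prior - step)) / (2.0 * step[j])

# Solve J @ delta ~ y_obs - model(theta_prior) in the least squares sense
delta, *_ = np.linalg.lstsq(J, y_obs - model(theta_prior), rcond=None)
theta_hat = theta_prior + delta
print(theta_hat)
```

Under these assumptions, the single linearized step lands close to the thetaMAP reported above, and the updated parameters fit the observations better than the prior.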



We transpose the samples so that the graphs show the 22 dates as separate observations, instead of a single input/output pair of dimension 22 (the model is a field model).

if calibrationResult.getInputObservations().getSize() == 1:
    calibrationResult.setInputObservations(
        [timeObservations[i] for i in range(nbdates)]
    )
    calibrationResult.setOutputObservations(
        [populationObservations[i] for i in range(nbdates)]
    )
    outputAtPrior = [
        [calibrationResult.getOutputAtPriorMean()[0, i]] for i in range(nbdates)
    ]
    outputAtPosterior = [
        [calibrationResult.getOutputAtPosteriorMean()[0, i]] for i in range(nbdates)
    ]
    calibrationResult.setOutputAtPriorAndPosteriorMean(outputAtPrior, outputAtPosterior)
graph = calibrationResult.drawObservationsVsInputs()
view = viewer.View(graph)
graph = calibrationResult.drawObservationsVsPredictions()
view = viewer.View(graph)
graph = calibrationResult.drawResiduals()
view = viewer.View(graph)
(figure: residual analysis)
graph = calibrationResult.drawParameterDistributions()
view = viewer.View(graph)

plt.show()

Total running time of the script: (0 minutes 0.851 seconds)