.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_calibration/least_squares_and_gaussian_calibration/plot_calibration_logistic.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end ` to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_calibration_least_squares_and_gaussian_calibration_plot_calibration_logistic.py:


Calibration of the logistic model
=================================

We present a calibration study of the logistic growth model defined :ref:`here `.
The data set that we use is the U.S. population from 1790 to 2000.
One of the specific properties of this study is that the observations are real world
observations, so when we calibrate the model on the data a discrepancy between the
model and the observations remains visible.

In this example, we calibrate the parameters of a model which predicts the dynamics
of the size of a population.
This shows how we can calibrate a model which predicts a time dependent output.
The output of the model is a time series representing the evolution of the population.
This requires transposing the sample so that we can benefit from the visualization
methods.

Variables
---------

In the particular situation where we want to calibrate this model, the following list
presents which variables are observed input variables, input calibrated variables and
observed output variables.

- :math:`t`: Input. Observed.
- :math:`z_0`, :math:`a`, :math:`c`: Inputs. Calibrated.
- :math:`z`: Output. Observed.

.. GENERATED FROM PYTHON SOURCE LINES 33-35

Analysis of the data
--------------------

.. GENERATED FROM PYTHON SOURCE LINES 37-46

.. code-block:: Python

    from openturns.usecases import logistic_model
    import openturns as ot
    import numpy as np
    import openturns.viewer as otv
    from matplotlib import pylab as plt

    ot.Log.Show(ot.Log.NONE)

.. GENERATED FROM PYTHON SOURCE LINES 47-48

We load the logistic model from the usecases module:

.. GENERATED FROM PYTHON SOURCE LINES 48-52

.. code-block:: Python

    lm = logistic_model.LogisticModel()
    print("Inputs:", lm.model.getInputDescription())
    print("Outputs: ", lm.model.getOutputDescription())

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Inputs: [t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,t10,t11,t12,t13,t14,t15,t16,t17,t18,t19,t20,t21,a,c]#24
    Outputs: [z0,z1,z2,z3,z4,z5,z6,z7,z8,z9,z10,z11,z12,z13,z14,z15,z16,z17,z18,z19,z20,z21]#22

.. GENERATED FROM PYTHON SOURCE LINES 53-57

We see that there are 24 inputs.
The first 22 inputs are the timestamps and the last two inputs are the :math:`a` and
:math:`c` parameters to calibrate.
The 22 outputs are the population of the U.S. in millions.

.. GENERATED FROM PYTHON SOURCE LINES 59-61

The data is based on 22 dates from 1790 to 2000.
The observation points are stored in the `data` field:

.. GENERATED FROM PYTHON SOURCE LINES 61-64

.. code-block:: Python

    observedSample = lm.data
    observedSample[:5]

.. raw:: html
    <table>
      <tr><th></th><th>Time</th><th>U.S. Population</th></tr>
      <tr><td>0</td><td>1790</td><td>3.9</td></tr>
      <tr><td>1</td><td>1800</td><td>5.3</td></tr>
      <tr><td>2</td><td>1810</td><td>7.2</td></tr>
      <tr><td>3</td><td>1820</td><td>9.6</td></tr>
      <tr><td>4</td><td>1830</td><td>13</td></tr>
    </table>
    <br />
    <br />
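Before moving on, it can be useful to check the range of the observed sample with
standard :class:`~openturns.Sample` methods.
The next small check is an addition to the original example; it only uses the
`getMin` and `getMax` methods available on any `Sample`.

.. code-block:: Python

    # Optional check, added for this presentation: ranges of the observed data.
    # The first column is the time (years), the second one the population (millions).
    print("Min:", observedSample.getMin())
    print("Max:", observedSample.getMax())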
.. GENERATED FROM PYTHON SOURCE LINES 65-68

.. code-block:: Python

    nbobs = observedSample.getSize()
    nbobs

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    22

.. GENERATED FROM PYTHON SOURCE LINES 69-72

.. code-block:: Python

    timeObservations = observedSample[:, 0]
    timeObservations[0:5]

.. raw:: html
    <table>
      <tr><th></th><th>Time</th></tr>
      <tr><td>0</td><td>1790</td></tr>
      <tr><td>1</td><td>1800</td></tr>
      <tr><td>2</td><td>1810</td></tr>
      <tr><td>3</td><td>1820</td></tr>
      <tr><td>4</td><td>1830</td></tr>
    </table>
    <br />
    <br />
.. GENERATED FROM PYTHON SOURCE LINES 73-76

.. code-block:: Python

    populationObservations = observedSample[:, 1]
    populationObservations[0:5]

.. raw:: html
    <table>
      <tr><th></th><th>U.S. Population</th></tr>
      <tr><td>0</td><td>3.9</td></tr>
      <tr><td>1</td><td>5.3</td></tr>
      <tr><td>2</td><td>7.2</td></tr>
      <tr><td>3</td><td>9.6</td></tr>
      <tr><td>4</td><td>13</td></tr>
    </table>
    <br />
    <br />
.. GENERATED FROM PYTHON SOURCE LINES 77-91

.. code-block:: Python

    graph = ot.Graph("", "Time (years)", "Population (Millions)", True, "upper left")
    cloud = ot.Cloud(timeObservations, populationObservations)
    cloud.setLegend("Observations")
    cloud.setPointStyle(
        ot.ResourceMap.GetAsString("CalibrationResult-ObservationPointStyle")
    )
    graph.add(cloud)
    curve = ot.Curve(timeObservations, populationObservations)
    curve.setColor(graph.getDrawable(0).getColor())
    curve.setLegend("")
    curve.setLineStyle(ot.ResourceMap.GetAsString("CalibrationResult-ObservationLineStyle"))
    graph.add(curve)
    view = otv.View(graph)

.. image-sg:: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_001.png
   :alt: plot calibration logistic
   :srcset: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 92-94

We see that there is a very smooth growth of the U.S. population.
This is a good candidate for model calibration.

.. GENERATED FROM PYTHON SOURCE LINES 96-101

We consider the times and populations as dimension 22 vectors.
The `logisticModel` function takes a dimension 24 vector as input and returns a
dimension 22 vector.
The first 22 components of the input vector contain the dates and the remaining 2
components are :math:`a` and :math:`c`.

.. GENERATED FROM PYTHON SOURCE LINES 103-104

Print the number of dates:

.. GENERATED FROM PYTHON SOURCE LINES 104-106

.. code-block:: Python

    print(lm.data.getSize())

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    22

.. GENERATED FROM PYTHON SOURCE LINES 107-108

Get the physical model to calibrate.

.. GENERATED FROM PYTHON SOURCE LINES 108-110

.. code-block:: Python

    logisticModelPy = lm.model

.. GENERATED FROM PYTHON SOURCE LINES 111-117

We now define the reference values of the parameters.
Because :math:`b` is so small with respect to :math:`a`, we use the logarithm.
In other words, we calibrate :math:`c = \log(b)` instead of calibrating :math:`b`.
This makes the calibration much easier.

.. GENERATED FROM PYTHON SOURCE LINES 119-121

.. code-block:: Python

    np.log(1.5587e-10)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    -22.581998789427587

.. GENERATED FROM PYTHON SOURCE LINES 122-126

.. code-block:: Python

    a = 0.03134
    c = -22.58
    thetaPrior = [a, c]

.. GENERATED FROM PYTHON SOURCE LINES 127-155

In the physical model, the inputs and parameters are ordered as presented in the next
table.
Notice that there are no parameters in the physical model.

+-------+----------------+
| Index | Input variable |
+=======+================+
| 0     | t1             |
+-------+----------------+
| 1     | t2             |
+-------+----------------+
| ...   | ...            |
+-------+----------------+
| 21    | t22            |
+-------+----------------+

+-------+-----------+
| Index | Parameter |
+=======+===========+
| ∅     | ∅         |
+-------+-----------+

**Table 1.** Indices and names of the inputs and parameters of the physical model.

.. GENERATED FROM PYTHON SOURCE LINES 155-158

.. code-block:: Python

    print("Physical Model Inputs:", lm.model.getInputDescription())
    print("Physical Model Parameters:", lm.model.getParameterDescription())

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Physical Model Inputs: [t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,t10,t11,t12,t13,t14,t15,t16,t17,t18,t19,t20,t21,a,c]#24
    Physical Model Parameters: []
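To make the :math:`c = \log(b)` reparametrization concrete, the following standalone
sketch evaluates the classical closed-form solution of the logistic equation
:math:`z' = a z - b z^2`.
This is only an illustrative reimplementation under stated assumptions (initial date
1790, initial population :math:`z_0 = 3.9 \times 10^6` individuals, outputs converted
to millions); the `LogisticModel` class shipped with OpenTURNS remains the reference
implementation and may use different conventions.

.. code-block:: Python

    # Illustrative sketch only: a logistic growth curve assuming the usual
    # closed-form solution z(t) = a z0 / (b z0 + (a - b z0) exp(-a (t - t0))).
    # The conventions (units, initial date) are assumptions made for this sketch.
    def logistic_curve_sketch(t, t0=1790.0, z0=3.9e6, a=0.03134, c=-22.58):
        b = np.exp(c)  # recover b from its logarithm
        z = a * z0 / (b * z0 + (a - b * z0) * np.exp(-a * (t - t0)))
        return z / 1.0e6  # assumed conversion from individuals to millions

    times = np.array([1790.0, 1850.0, 1950.0, 2000.0])
    print(logistic_curve_sketch(times))
    # In this parametrization, the carrying capacity is a / b (in individuals).
    print("Carrying capacity (millions):", 0.03134 / np.exp(-22.58) / 1.0e6)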
.. GENERATED FROM PYTHON SOURCE LINES 159-186

In order to perform calibration, we have to define a parametric model, with observed
inputs and parameters to calibrate.
To do this, we create a :class:`~openturns.ParametricFunction` where the parameters
are `a` and `c`, which have the indices 22 and 23 in the physical model.

+-------+----------------+
| Index | Input variable |
+=======+================+
| 0     | t1             |
+-------+----------------+
| 1     | t2             |
+-------+----------------+
| ...   | ...            |
+-------+----------------+
| 21    | t22            |
+-------+----------------+

+-------+-----------+
| Index | Parameter |
+=======+===========+
| 0     | a         |
+-------+-----------+
| 1     | c         |
+-------+-----------+

**Table 2.** Indices and names of the inputs and parameters of the parametric model.

.. GENERATED FROM PYTHON SOURCE LINES 188-192

.. code-block:: Python

    logisticParametric = ot.ParametricFunction(logisticModelPy, [22, 23], thetaPrior)
    print("Parametric Model Inputs:", logisticParametric.getInputDescription())
    print("Parametric Model Parameters:", logisticParametric.getParameterDescription())

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Parametric Model Inputs: [t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,t10,t11,t12,t13,t14,t15,t16,t17,t18,t19,t20,t21]#22
    Parametric Model Parameters: [a,c]

.. GENERATED FROM PYTHON SOURCE LINES 193-194

Check that we can evaluate the parametric function.

.. GENERATED FROM PYTHON SOURCE LINES 196-199

.. code-block:: Python

    populationPredicted = logisticParametric(timeObservations.asPoint())
    print(populationPredicted[:5])

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    [3.9,5.29757,7.17769,9.69198,13.0277]

.. GENERATED FROM PYTHON SOURCE LINES 200-217

.. code-block:: Python

    graph = ot.Graph(
        "Before calibration", "Time (years)", "Population (Millions)", True, "upper left"
    )
    # Observations
    cloud = ot.Cloud(timeObservations, populationObservations)
    cloud.setLegend("Observations")
    cloud.setPointStyle(
        ot.ResourceMap.GetAsString("CalibrationResult-ObservationPointStyle")
    )
    graph.add(cloud)
    # Predictions
    cloud = ot.Cloud(timeObservations.asPoint(), populationPredicted)
    cloud.setLegend("Predictions")
    cloud.setPointStyle(ot.ResourceMap.GetAsString("CalibrationResult-PriorPointStyle"))
    graph.add(cloud)
    view = otv.View(graph)

.. image-sg:: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_002.png
   :alt: Before calibration
   :srcset: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_002.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 218-219

We see that the fit is not good: the observations continue to grow after 1950, while
the growth of the prediction seems to fade.

.. GENERATED FROM PYTHON SOURCE LINES 221-223

Calibration with linear least squares
-------------------------------------

.. GENERATED FROM PYTHON SOURCE LINES 225-229

The calibration algorithm takes input and output samples as arguments.
In this case, we choose to consider one single observation in dimension 22.
In order to perform calibration, we create a `Sample` of input times which has one
observation in dimension 22.

.. GENERATED FROM PYTHON SOURCE LINES 231-234

.. code-block:: Python

    timeObservationsSample = ot.Sample([timeObservations.asPoint()])
    timeObservationsSample[0, 0:5]

.. raw:: html
    <p>class=Point name=Unnamed dimension=5 values=[1790,1800,1810,1820,1830]</p>
    <br />
    <br />
.. GENERATED FROM PYTHON SOURCE LINES 235-237

Similarly, we create a `Sample` of output populations which has one observation in
dimension 22.

.. GENERATED FROM PYTHON SOURCE LINES 239-242

.. code-block:: Python

    populationObservationsSample = ot.Sample([populationObservations.asPoint()])
    populationObservationsSample[0, 0:5]

.. raw:: html
    <p>class=Point name=Unnamed dimension=5 values=[3.9,5.3,7.2,9.6,13]</p>
    <br />
    <br />
.. GENERATED FROM PYTHON SOURCE LINES 243-245

.. code-block:: Python

    logisticParametric = ot.ParametricFunction(logisticModelPy, [22, 23], thetaPrior)

.. GENERATED FROM PYTHON SOURCE LINES 246-247

Check that we can evaluate the parametric function.

.. GENERATED FROM PYTHON SOURCE LINES 249-252

.. code-block:: Python

    populationPredicted = logisticParametric(timeObservationsSample)
    populationPredicted[0, 0:5]

.. raw:: html
    <p>class=Point name=Unnamed dimension=5 values=[3.9,5.29757,7.17769,9.69198,13.0277]</p>
    <br />
    <br />
.. GENERATED FROM PYTHON SOURCE LINES 253-255

Calibration
-----------

.. GENERATED FROM PYTHON SOURCE LINES 257-259

We are now able to perform the calibration using linear least squares, with the
:class:`~openturns.LinearLeastSquaresCalibration` class.

.. GENERATED FROM PYTHON SOURCE LINES 261-270

.. code-block:: Python

    algo = ot.LinearLeastSquaresCalibration(
        logisticParametric, timeObservationsSample, populationObservationsSample, thetaPrior
    )
    algo.run()
    calibrationResult = algo.getResult()
    thetaMAP = calibrationResult.getParameterMAP()
    print("theta After = ", thetaMAP)
    print("theta Before = ", thetaPrior)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    theta After = [0.0265958,-23.1714]
    theta Before = [0.03134, -22.58]

.. GENERATED FROM PYTHON SOURCE LINES 271-273

Compared to the value of the parameters before calibration, we can see that the
parameters were significantly updated.

.. GENERATED FROM PYTHON SOURCE LINES 275-277

In order to see if the optimum parameters are sensitive to the observation errors,
we compute 95% confidence intervals.

.. GENERATED FROM PYTHON SOURCE LINES 279-282

.. code-block:: Python

    thetaPosterior = calibrationResult.getParameterPosterior()
    print(thetaPosterior.computeBilateralConfidenceIntervalWithMarginalProbability(0.95)[0])

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    [0.0246465, 0.028545] [-23.3182, -23.0247]

.. GENERATED FROM PYTHON SOURCE LINES 283-286

The 95% confidence intervals of the optimum are relatively narrow: this means that
the optimum is not very sensitive to the observation errors.
In this case, the parameters are accurately estimated.
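To build intuition about what :class:`~openturns.LinearLeastSquaresCalibration`
computes, the following sketch reproduces the linearization idea by hand: the
parametric model is linearized at the prior point with a finite-difference Jacobian,
and a plain least squares solve gives the parameter update.
This is only a schematic illustration, not the library's exact algorithm: it ignores
the regularization and error-covariance details handled by the class, and the helper
name `model_output` and the step sizes `h` are arbitrary choices made for this sketch.

.. code-block:: Python

    # Schematic illustration of linear least squares calibration (added here):
    # linearize the model at the prior point, then solve a linear least squares
    # problem for the parameter update.
    theta0 = np.array(thetaPrior)
    y_obs = np.array(populationObservations.asPoint())

    def model_output(theta):
        logisticParametric.setParameter([float(v) for v in theta])
        return np.array(logisticParametric(timeObservations.asPoint()))

    f0 = model_output(theta0)
    residual0 = y_obs - f0

    # Finite-difference Jacobian with respect to the parameters (a, c).
    h = np.array([1.0e-6, 1.0e-4])  # arbitrary step sizes for this sketch
    jacobian = np.empty((len(y_obs), len(theta0)))
    for j in range(len(theta0)):
        thetaPerturbed = theta0.copy()
        thetaPerturbed[j] += h[j]
        jacobian[:, j] = (model_output(thetaPerturbed) - f0) / h[j]

    # The update is the least squares solution of the linearized problem.
    deltaTheta, *_ = np.linalg.lstsq(jacobian, residual0, rcond=None)
    print("Schematic update:", theta0 + deltaTheta)

    # Restore the prior parameter values before continuing the example.
    logisticParametric.setParameter(thetaPrior)

The printed update should be close to `thetaMAP`, up to the finite-difference
approximation and the simplifications made in this sketch.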
.. GENERATED FROM PYTHON SOURCE LINES 288-293

In this example, we have considered a single observation in dimension 22.
In this case, the parametric model has 22 outputs, which would create a large number
of plots.
Transpose samples to interpret the data as a Sample in dimension 1 with 22
observations.

.. GENERATED FROM PYTHON SOURCE LINES 295-308

.. code-block:: Python

    if calibrationResult.getInputObservations().getSize() == 1:
        calibrationResult.setInputObservations(timeObservations)
        calibrationResult.setOutputObservations(populationObservations)
        outputPriorMeanDimension22 = calibrationResult.getOutputAtPriorMean()
        outputAtPriorDimension1 = ot.Sample.BuildFromPoint(outputPriorMeanDimension22[0])
        outputPosteriorMeanDimension22 = calibrationResult.getOutputAtPosteriorMean()
        outputPosteriorMeanDimension1 = ot.Sample.BuildFromPoint(
            outputPosteriorMeanDimension22[0]
        )
        calibrationResult.setOutputAtPriorAndPosteriorMean(
            outputAtPriorDimension1, outputPosteriorMeanDimension1
        )

.. GENERATED FROM PYTHON SOURCE LINES 309-311

Increase the default number of points in the plots.
This produces smoother plots of the otherwise spiky distributions.

.. GENERATED FROM PYTHON SOURCE LINES 313-315

.. code-block:: Python

    ot.ResourceMap.SetAsUnsignedInteger("Distribution-DefaultPointNumber", 300)

.. GENERATED FROM PYTHON SOURCE LINES 316-320

The next plot presents the U.S. population depending on the time.
We see that the calibrated model fits the data better than the uncalibrated model,
especially on the second part of the time interval.

.. GENERATED FROM PYTHON SOURCE LINES 320-326

.. code-block:: Python

    # sphinx_gallery_thumbnail_number = 3
    graph = calibrationResult.drawObservationsVsInputs()
    graph.setLegendPosition("upper left")
    view = otv.View(graph)

.. image-sg:: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_003.png
   :alt: plot calibration logistic
   :srcset: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_003.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 327-341

As for a metamodel, we can compare the predicted U.S. population against the observed
U.S. population.
We see that the model fits the data accurately for a population up to 150 million
people, but diverges when the population gets larger.
On the other hand, the calibrated model under-predicts the population for the [0,150]
population interval, and over-predicts for the [150,300] interval, balancing the
errors so that the model globally fits.
The "S" shape of the graph after calibration reveals that the calibrated model has a
structure with residuals that do not follow a normal distribution (otherwise the
calibrated cloud would be spread evenly above and below the diagonal).
In other words, the model and the data do not fit very well.

.. GENERATED FROM PYTHON SOURCE LINES 343-346

.. code-block:: Python

    graph = calibrationResult.drawObservationsVsPredictions()
    view = otv.View(graph)

.. image-sg:: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_004.png
   :alt: plot calibration logistic
   :srcset: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_004.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 347-355

The residual analysis shows that some residuals were very large before calibration.
After calibration, most residuals are in the [-20,20] interval, which explains why
the calibrated model fits the data better.
We notice that the distribution of the calibrated residuals is relatively close to a
normal distribution.
This suggests that the least squares model is appropriate in this case with respect
to this criterion.

.. GENERATED FROM PYTHON SOURCE LINES 357-360

.. code-block:: Python

    graph = calibrationResult.drawResiduals()
    view = otv.View(graph)

.. image-sg:: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_005.png
   :alt: Residual analysis
   :srcset: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_005.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 361-364

The next plot shows that there is a significant improvement after the calibration:
the initial point is very different from the distribution of the optimum parameter.

.. GENERATED FROM PYTHON SOURCE LINES 366-376

.. code-block:: Python

    graph = calibrationResult.drawParameterDistributions()
    view = otv.View(
        graph,
        figure_kw={"figsize": (8.0, 4.0)},
        legend_kw={"bbox_to_anchor": (1.0, 1.0), "loc": "upper left"},
    )
    plt.subplots_adjust(right=0.8)

    otv.View.ShowAll()

.. image-sg:: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_006.png
   :alt: plot calibration logistic
   :srcset: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_006.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 377-378

Reset default settings.

.. GENERATED FROM PYTHON SOURCE LINES 378-379

.. code-block:: Python

    ot.ResourceMap.Reload()
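As a final, optional complement to the residual plot above (and independent of the
plot settings reset), one can run a quantitative normality test on the calibrated
residuals.
The sketch below is an addition to the original study: it recomputes the residuals at
the calibrated parameters by hand and applies the Anderson-Darling test, assuming the
`NormalityTest.AndersonDarlingNormal` helper is available in the installed OpenTURNS
version.

.. code-block:: Python

    # Illustrative check added here, not part of the original example.
    # Recompute the residuals at the calibrated parameters.
    logisticParametric.setParameter(thetaMAP)
    predictedAtMAP = logisticParametric(timeObservations.asPoint())
    residuals = [
        populationObservations[i, 0] - predictedAtMAP[i] for i in range(nbobs)
    ]
    residualSample = ot.Sample.BuildFromPoint(residuals)

    # Anderson-Darling normality test: a large p-value means that the normal
    # hypothesis is not rejected.
    testResult = ot.NormalityTest.AndersonDarlingNormal(residualSample)
    print("p-value:", testResult.getPValue())

    # Restore the prior parameter values.
    logisticParametric.setParameter(thetaPrior)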