.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_calibration/least_squares_and_gaussian_calibration/plot_calibration_logistic.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end ` to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_calibration_least_squares_and_gaussian_calibration_plot_calibration_logistic.py:


Calibration of the logistic model
=================================

We present a calibration study of the logistic growth model defined :ref:`here `.
The data set that we use is the U.S. population from 1790 to 2000.
A specific property of this study is that the observations are real-world
observations, so when we calibrate the model on the data a discrepancy between
the model and the observations appears.

In this example, we calibrate the parameters of a model which predicts the
dynamics of the size of a population.
This shows how we can calibrate a model which predicts a time-dependent output.
The output of the model is a time series representing the evolution of the
population.
This requires transposing the sample so that we can benefit from the dedicated
visualization methods.

Variables
---------

In the particular situation where we want to calibrate this model, the
following list presents which variables are observed input variables,
calibrated input variables and observed output variables.

- :math:`t`: Input. Observed.
- :math:`z_0`, :math:`a`, :math:`c`: Inputs. Calibrated.
- :math:`z`: Output. Observed.

.. GENERATED FROM PYTHON SOURCE LINES 34-36

Analysis of the data
--------------------

.. GENERATED FROM PYTHON SOURCE LINES 38-47

.. code-block:: Python

    from openturns.usecases import logistic_model
    import openturns as ot
    import numpy as np
    import openturns.viewer as otv
    from matplotlib import pylab as plt

    ot.Log.Show(ot.Log.NONE)

.. GENERATED FROM PYTHON SOURCE LINES 48-49

We load the logistic model from the usecases module:

.. GENERATED FROM PYTHON SOURCE LINES 49-53

.. code-block:: Python

    lm = logistic_model.LogisticModel()
    print("Inputs:", lm.model.getInputDescription())
    print("Outputs: ", lm.model.getOutputDescription())

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Inputs: [t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,t10,t11,t12,t13,t14,t15,t16,t17,t18,t19,t20,t21,a,c]#24
    Outputs: [z0,z1,z2,z3,z4,z5,z6,z7,z8,z9,z10,z11,z12,z13,z14,z15,z16,z17,z18,z19,z20,z21]#22

.. GENERATED FROM PYTHON SOURCE LINES 54-58

We see that there are 24 inputs.
The first 22 inputs are the timestamps and the last two inputs are the
parameters :math:`a` and :math:`c` to calibrate.
The 22 outputs are the population of the U.S. in millions.

.. GENERATED FROM PYTHON SOURCE LINES 60-62

The data is based on 22 dates from 1790 to 2000.
The observation points are stored in the `data` field:

.. GENERATED FROM PYTHON SOURCE LINES 62-65

.. code-block:: Python

    observedSample = lm.data
    observedSample[:5]
+---+------+-----------------+
|   | Time | U.S. Population |
+===+======+=================+
| 0 | 1790 | 3.9             |
+---+------+-----------------+
| 1 | 1800 | 5.3             |
+---+------+-----------------+
| 2 | 1810 | 7.2             |
+---+------+-----------------+
| 3 | 1820 | 9.6             |
+---+------+-----------------+
| 4 | 1830 | 13              |
+---+------+-----------------+
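Before analyzing the data further, it helps to keep the model in mind.
As a reminder (a sketch based on the standard logistic growth formulation used
by this use case; see the model definition linked above for the authoritative
statement), the population :math:`z(t)` is assumed to follow

.. math::

   \frac{dz}{dt} = a z - b z^2, \qquad z(t_0) = z_0,

where :math:`a` is the growth rate and :math:`b` limits the growth; the study
calibrates :math:`a` and :math:`c = \log(b)` rather than :math:`b` itself.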
.. GENERATED FROM PYTHON SOURCE LINES 66-69

.. code-block:: Python

    nbobs = observedSample.getSize()
    nbobs

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    22

.. GENERATED FROM PYTHON SOURCE LINES 70-73

.. code-block:: Python

    timeObservations = observedSample[:, 0]
    timeObservations[0:5]
+---+------+
|   | Time |
+===+======+
| 0 | 1790 |
+---+------+
| 1 | 1800 |
+---+------+
| 2 | 1810 |
+---+------+
| 3 | 1820 |
+---+------+
| 4 | 1830 |
+---+------+
.. GENERATED FROM PYTHON SOURCE LINES 74-77

.. code-block:: Python

    populationObservations = observedSample[:, 1]
    populationObservations[0:5]
+---+-----------------+
|   | U.S. Population |
+===+=================+
| 0 | 3.9             |
+---+-----------------+
| 1 | 5.3             |
+---+-----------------+
| 2 | 7.2             |
+---+-----------------+
| 3 | 9.6             |
+---+-----------------+
| 4 | 13              |
+---+-----------------+
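The next cell draws the observations with the OpenTURNS :class:`~openturns.Graph`
and :class:`~openturns.Cloud` classes.
For readers more familiar with Matplotlib, an equivalent plain-Matplotlib sketch
(purely illustrative, not part of the original study) would be:

.. code-block:: Python

    import matplotlib.pyplot as plt

    # Plain-Matplotlib equivalent of the observation plot built below
    years = [row[0] for row in observedSample]     # observation dates
    millions = [row[1] for row in observedSample]  # population in millions
    plt.plot(years, millions, "o-", label="Observations")
    plt.xlabel("Time (years)")
    plt.ylabel("Population (Millions)")
    plt.legend(loc="upper left")
    plt.show()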
.. GENERATED FROM PYTHON SOURCE LINES 78-92

.. code-block:: Python

    graph = ot.Graph("", "Time (years)", "Population (Millions)", True, "upper left")
    cloud = ot.Cloud(timeObservations, populationObservations)
    cloud.setLegend("Observations")
    cloud.setPointStyle(
        ot.ResourceMap.GetAsString("CalibrationResult-ObservationPointStyle")
    )
    graph.add(cloud)
    curve = ot.Curve(timeObservations, populationObservations)
    curve.setColor(graph.getDrawable(0).getColor())
    curve.setLegend("")
    curve.setLineStyle(ot.ResourceMap.GetAsString("CalibrationResult-ObservationLineStyle"))
    graph.add(curve)
    view = otv.View(graph)

.. image-sg:: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_001.png
   :alt: plot calibration logistic
   :srcset: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 93-95

We see that the U.S. population grows very smoothly.
This is a good candidate for model calibration.

.. GENERATED FROM PYTHON SOURCE LINES 97-102

We consider the times and populations as vectors of dimension 22.
The `logisticModel` function takes a vector of dimension 24 as input and
returns a vector of dimension 22.
The first 22 components of the input vector contain the dates and the
remaining two components are :math:`a` and :math:`c`.

.. GENERATED FROM PYTHON SOURCE LINES 104-105

Print the number of dates:

.. GENERATED FROM PYTHON SOURCE LINES 105-107

.. code-block:: Python

    print(lm.data.getSize())

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    22

.. GENERATED FROM PYTHON SOURCE LINES 108-109

Get the physical model to calibrate.

.. GENERATED FROM PYTHON SOURCE LINES 109-111

.. code-block:: Python

    logisticModelPy = lm.model

.. GENERATED FROM PYTHON SOURCE LINES 112-118

We now define the reference values of the parameters.
Because :math:`b` is so small with respect to :math:`a`, we use the logarithm.
In other words, we calibrate :math:`c = \log(b)` instead of calibrating
:math:`b`.
This makes the calibration much easier.

.. GENERATED FROM PYTHON SOURCE LINES 120-122

.. code-block:: Python

    np.log(1.5587e-10)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    np.float64(-22.581998789427587)

.. GENERATED FROM PYTHON SOURCE LINES 123-127

.. code-block:: Python

    a = 0.03134
    c = -22.58
    thetaPrior = [a, c]

.. GENERATED FROM PYTHON SOURCE LINES 128-156

In the physical model, the inputs and parameters are ordered as presented in
the next table.
Notice that there are no parameters in the physical model.

+-------+----------------+
| Index | Input variable |
+=======+================+
| 0     | t0             |
+-------+----------------+
| 1     | t1             |
+-------+----------------+
| ...   | ...            |
+-------+----------------+
| 21    | t21            |
+-------+----------------+
| 22    | a              |
+-------+----------------+
| 23    | c              |
+-------+----------------+

+-------+-----------+
| Index | Parameter |
+=======+===========+
| ∅     | ∅         |
+-------+-----------+

**Table 1.** Indices and names of the inputs and parameters of the physical model.

.. GENERATED FROM PYTHON SOURCE LINES 156-159

.. code-block:: Python

    print("Physical Model Inputs:", lm.model.getInputDescription())
    print("Physical Model Parameters:", lm.model.getParameterDescription())

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Physical Model Inputs: [t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,t10,t11,t12,t13,t14,t15,t16,t17,t18,t19,t20,t21,a,c]#24
    Physical Model Parameters: []
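The next step wraps this physical model into a :class:`~openturns.ParametricFunction`.
As a quick illustration of how this class freezes selected inputs into
parameters, here is a minimal, self-contained sketch on a toy function (the
function and values are illustrative only, not part of the study):

.. code-block:: Python

    import openturns as ot

    # Toy function f(x0, x1, x2) = x0 + x1 * x2
    f = ot.SymbolicFunction(["x0", "x1", "x2"], ["x0 + x1 * x2"])
    # Freeze the inputs with indices 1 and 2 as parameters set to [2.0, 3.0]:
    # the resulting function g only takes x0 as an input.
    g = ot.ParametricFunction(f, [1, 2], [2.0, 3.0])
    print(g([1.0]))                     # [7] since 1 + 2 * 3 = 7
    print(g.getParameterDescription())  # [x1,x2]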
.. GENERATED FROM PYTHON SOURCE LINES 160-187

In order to perform calibration, we must define a parametric model, with
observed inputs and parameters to calibrate.
To do so, we create a :class:`~openturns.ParametricFunction` where the
parameters are `a` and `c`, which have the indices 22 and 23 in the physical
model.

+-------+----------------+
| Index | Input variable |
+=======+================+
| 0     | t0             |
+-------+----------------+
| 1     | t1             |
+-------+----------------+
| ...   | ...            |
+-------+----------------+
| 21    | t21            |
+-------+----------------+

+-------+-----------+
| Index | Parameter |
+=======+===========+
| 0     | a         |
+-------+-----------+
| 1     | c         |
+-------+-----------+

**Table 2.** Indices and names of the inputs and parameters of the parametric model.

.. GENERATED FROM PYTHON SOURCE LINES 189-193

.. code-block:: Python

    logisticParametric = ot.ParametricFunction(logisticModelPy, [22, 23], thetaPrior)
    print("Parametric Model Inputs:", logisticParametric.getInputDescription())
    print("Parametric Model Parameters:", logisticParametric.getParameterDescription())

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Parametric Model Inputs: [t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,t10,t11,t12,t13,t14,t15,t16,t17,t18,t19,t20,t21]#22
    Parametric Model Parameters: [a,c]

.. GENERATED FROM PYTHON SOURCE LINES 194-195

Check that we can evaluate the parametric function.

.. GENERATED FROM PYTHON SOURCE LINES 197-200

.. code-block:: Python

    populationPredicted = logisticParametric(timeObservations.asPoint())
    print(populationPredicted[:5])

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    [3.9,5.29757,7.17769,9.69198,13.0277]

.. GENERATED FROM PYTHON SOURCE LINES 201-218

.. code-block:: Python

    graph = ot.Graph(
        "Before calibration", "Time (years)", "Population (Millions)", True, "upper left"
    )
    # Observations
    cloud = ot.Cloud(timeObservations, populationObservations)
    cloud.setLegend("Observations")
    cloud.setPointStyle(
        ot.ResourceMap.GetAsString("CalibrationResult-ObservationPointStyle")
    )
    graph.add(cloud)
    # Predictions
    cloud = ot.Cloud(timeObservations.asPoint(), populationPredicted)
    cloud.setLegend("Predictions")
    cloud.setPointStyle(ot.ResourceMap.GetAsString("CalibrationResult-PriorPointStyle"))
    graph.add(cloud)
    view = otv.View(graph)

.. image-sg:: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_002.png
   :alt: Before calibration
   :srcset: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_002.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 219-220

We see that the fit is not good: the observations continue to grow after 1950,
while the growth of the prediction seems to fade.

.. GENERATED FROM PYTHON SOURCE LINES 222-224

Calibration with linear least squares
-------------------------------------

.. GENERATED FROM PYTHON SOURCE LINES 226-230

The calibration algorithm takes input and output samples as arguments.
In this case, we choose to consider one single observation in dimension 22.
To do so, we create a `Sample` of input times which contains this single
observation of dimension 22.

.. GENERATED FROM PYTHON SOURCE LINES 232-235

.. code-block:: Python

    timeObservationsSample = ot.Sample([timeObservations.asPoint()])
    timeObservationsSample[0, 0:5]
.. rst-class:: sphx-glr-script-out

.. code-block:: none

    class=Point name=Unnamed dimension=5 values=[1790,1800,1810,1820,1830]
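The following illustrative check (not part of the original script) confirms
that this input sample indeed holds a single observation of dimension 22:

.. code-block:: Python

    print("Size      :", timeObservationsSample.getSize())       # 1
    print("Dimension :", timeObservationsSample.getDimension())  # 22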
.. GENERATED FROM PYTHON SOURCE LINES 236-238

Similarly, we create a `Sample` of output populations which has one
observation in dimension 22.

.. GENERATED FROM PYTHON SOURCE LINES 240-243

.. code-block:: Python

    populationObservationsSample = ot.Sample([populationObservations.asPoint()])
    populationObservationsSample[0, 0:5]
.. rst-class:: sphx-glr-script-out

.. code-block:: none

    class=Point name=Unnamed dimension=5 values=[3.9,5.3,7.2,9.6,13]
.. GENERATED FROM PYTHON SOURCE LINES 244-246

.. code-block:: Python

    logisticParametric = ot.ParametricFunction(logisticModelPy, [22, 23], thetaPrior)

.. GENERATED FROM PYTHON SOURCE LINES 247-248

Check that we can evaluate the parametric function.

.. GENERATED FROM PYTHON SOURCE LINES 250-253

.. code-block:: Python

    populationPredicted = logisticParametric(timeObservationsSample)
    populationPredicted[0, 0:5]
.. rst-class:: sphx-glr-script-out

.. code-block:: none

    class=Point name=Unnamed dimension=5 values=[3.9,5.29757,7.17769,9.69198,13.0277]
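The next section performs the calibration with linear least squares.
Before running it, it may help to recall what this estimator computes.
The sketch below is the standard linearized least squares formulation (see the
:class:`~openturns.LinearLeastSquaresCalibration` documentation for the exact
algorithm used): the parametric model :math:`h(\theta)` is linearized around
the prior point :math:`\theta_{prior}` and the resulting linear problem is
solved,

.. math::

   \hat{\theta} = \underset{\theta}{\operatorname{argmin}}
   \; \left\| y - h(\theta_{prior}) - J \, (\theta - \theta_{prior}) \right\|_2^2,

where :math:`y` gathers the 22 observed populations and :math:`J` is the
Jacobian matrix of :math:`h` with respect to :math:`\theta` at
:math:`\theta_{prior}`.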
.. GENERATED FROM PYTHON SOURCE LINES 254-256

Calibration
-----------

.. GENERATED FROM PYTHON SOURCE LINES 258-260

We are now able to perform the calibration using linear least squares, with
the :class:`~openturns.LinearLeastSquaresCalibration` class.

.. GENERATED FROM PYTHON SOURCE LINES 262-271

.. code-block:: Python

    algo = ot.LinearLeastSquaresCalibration(
        logisticParametric, timeObservationsSample, populationObservationsSample, thetaPrior
    )
    algo.run()
    calibrationResult = algo.getResult()
    thetaMAP = calibrationResult.getParameterMAP()
    print("theta After = ", thetaMAP)
    print("theta Before = ", thetaPrior)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    theta After = [0.0265958,-23.1714]
    theta Before = [0.03134, -22.58]

.. GENERATED FROM PYTHON SOURCE LINES 272-274

Compared to their values before calibration, we can see that the parameters
were significantly updated.

.. GENERATED FROM PYTHON SOURCE LINES 276-278

In order to see if the optimum parameters are sensitive to the observation
errors, we compute 95% confidence intervals.

.. GENERATED FROM PYTHON SOURCE LINES 280-283

.. code-block:: Python

    thetaPosterior = calibrationResult.getParameterPosterior()
    print(thetaPosterior.computeBilateralConfidenceIntervalWithMarginalProbability(0.95)[0])

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    [0.0246465, 0.028545]
    [-23.3182, -23.0247]

.. GENERATED FROM PYTHON SOURCE LINES 284-287

The 95% confidence intervals of the optimum are relatively narrow: this means
that the optimum is not very sensitive to the observation errors.
In this case, the parameters are accurately estimated.

.. GENERATED FROM PYTHON SOURCE LINES 289-294

In this example, we have considered a single observation in dimension 22.
As a consequence, the parametric model has 22 outputs, which would create a
large number of plots.
We therefore transpose the samples to interpret the data as a `Sample` of
dimension 1 with 22 observations.

.. GENERATED FROM PYTHON SOURCE LINES 296-309

.. code-block:: Python

    if calibrationResult.getInputObservations().getSize() == 1:
        calibrationResult.setInputObservations(timeObservations)
        calibrationResult.setOutputObservations(populationObservations)
        outputPriorMeanDimension22 = calibrationResult.getOutputAtPriorMean()
        outputAtPriorDimension1 = ot.Sample.BuildFromPoint(outputPriorMeanDimension22[0])
        outputPosteriorMeanDimension22 = calibrationResult.getOutputAtPosteriorMean()
        outputPosteriorMeanDimension1 = ot.Sample.BuildFromPoint(
            outputPosteriorMeanDimension22[0]
        )
        calibrationResult.setOutputAtPriorAndPosteriorMean(
            outputAtPriorDimension1, outputPosteriorMeanDimension1
        )

.. GENERATED FROM PYTHON SOURCE LINES 310-312

Increase the default number of points in the plots.
This produces smoother plots of the otherwise spiky distributions.

.. GENERATED FROM PYTHON SOURCE LINES 314-316

.. code-block:: Python

    ot.ResourceMap.SetAsUnsignedInteger("Distribution-DefaultPointNumber", 300)

.. GENERATED FROM PYTHON SOURCE LINES 317-321

The next plot presents the U.S. population as a function of time.
We see that the calibrated model fits the data better than the uncalibrated
model, especially on the second part of the time interval.

.. GENERATED FROM PYTHON SOURCE LINES 321-327

.. code-block:: Python

    # sphinx_gallery_thumbnail_number = 3
    graph = calibrationResult.drawObservationsVsInputs()
    graph.setLegendPosition("upper left")
    view = otv.View(graph)

.. image-sg:: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_003.png
   :alt: plot calibration logistic
   :srcset: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_003.png
   :class: sphx-glr-single-img
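To quantify the improvement seen on this plot, one can compare the
root-mean-square error of the model before and after calibration.
This is an illustrative check (not part of the original script) that reuses
the dimension-1 samples built above:

.. code-block:: Python

    import numpy as np

    obs = np.array(populationObservations.asPoint())
    priorPred = np.array(outputAtPriorDimension1.asPoint())
    posteriorPred = np.array(outputPosteriorMeanDimension1.asPoint())
    print("RMSE before calibration:", np.sqrt(np.mean((obs - priorPred) ** 2)))
    print("RMSE after calibration :", np.sqrt(np.mean((obs - posteriorPred) ** 2)))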
.. GENERATED FROM PYTHON SOURCE LINES 328-342

As in metamodel validation, we can also plot the predicted U.S. population
against the observed U.S. population.
We see that the uncalibrated model fits the data accurately for populations up
to 150 million people, but diverges when the population gets larger.
On the other hand, the calibrated model under-predicts the population on the
[0, 150] interval and over-predicts it on the [150, 300] interval, balancing
the errors so that the model globally fits.
The "S" shape of the graph after calibration reveals that the residuals of the
calibrated model keep a systematic structure (otherwise the calibrated cloud
would be spread evenly above and below the diagonal).
In other words, the model and the data do not fit very well.

.. GENERATED FROM PYTHON SOURCE LINES 344-347

.. code-block:: Python

    graph = calibrationResult.drawObservationsVsPredictions()
    view = otv.View(graph)

.. image-sg:: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_004.png
   :alt: plot calibration logistic
   :srcset: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_004.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 348-356

The residual analysis shows that some residuals were very large before
calibration.
After calibration, most residuals are in the [-20, 20] interval, which
explains why the calibrated model fits the data better.
We notice that the distribution of the calibrated residuals is relatively
close to a normal distribution.
This suggests that the least squares model is appropriate in this case with
respect to this criterion.

.. GENERATED FROM PYTHON SOURCE LINES 358-361

.. code-block:: Python

    graph = calibrationResult.drawResiduals()
    view = otv.View(graph)

.. image-sg:: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_005.png
   :alt: Residual analysis
   :srcset: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_005.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 362-365

The next plot shows that there is a significant improvement after the
calibration: the initial point is very different from the distribution of the
optimum parameter.

.. GENERATED FROM PYTHON SOURCE LINES 367-377

.. code-block:: Python

    graph = calibrationResult.drawParameterDistributions()
    view = otv.View(
        graph,
        figure_kw={"figsize": (8.0, 4.0)},
        legend_kw={"bbox_to_anchor": (1.0, 1.0), "loc": "upper left"},
    )
    plt.subplots_adjust(right=0.8)

    otv.View.ShowAll()

.. image-sg:: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_006.png
   :alt: plot calibration logistic
   :srcset: /auto_calibration/least_squares_and_gaussian_calibration/images/sphx_glr_plot_calibration_logistic_006.png
   :class: sphx-glr-single-img
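As a complementary, illustrative check (not part of the original script), the
Gaussian posterior distribution of the calibrated parameters shown in the
previous plot can also be inspected numerically:

.. code-block:: Python

    posterior = calibrationResult.getParameterPosterior()
    print("Posterior mean               :", posterior.getMean())
    print("Posterior standard deviations:", posterior.getStandardDeviation())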
.. GENERATED FROM PYTHON SOURCE LINES 378-379

Reset default settings

.. GENERATED FROM PYTHON SOURCE LINES 379-380

.. code-block:: Python

    ot.ResourceMap.Reload()


.. _sphx_glr_download_auto_calibration_least_squares_and_gaussian_calibration_plot_calibration_logistic.py:

.. only:: html

    .. container:: sphx-glr-footer sphx-glr-footer-example

        .. container:: sphx-glr-download sphx-glr-download-jupyter

            :download:`Download Jupyter notebook: plot_calibration_logistic.ipynb `

        .. container:: sphx-glr-download sphx-glr-download-python

            :download:`Download Python source code: plot_calibration_logistic.py `

        .. container:: sphx-glr-download sphx-glr-download-zip

            :download:`Download zipped: plot_calibration_logistic.zip `