.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_calibration/bayesian_calibration/plot_bayesian_calibration.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_auto_calibration_bayesian_calibration_plot_bayesian_calibration.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_calibration_bayesian_calibration_plot_bayesian_calibration.py:


Bayesian calibration of a computer code
=======================================

.. GENERATED FROM PYTHON SOURCE LINES 6-81

In this example we are going to compute the parameters of a computer model thanks to Bayesian estimation.

Let us denote :math:`(y_1, \dots, y_n)` the observation sample,
:math:`(\vect z_1, \ldots, \vect z_n) = (f(x_1|\vect\theta), \ldots, f(x_n|\vect\theta))` the model prediction,
:math:`p(y |\vect z)` the density function of observation :math:`y`
conditional on model prediction :math:`\vect z`,
and :math:`\vect\theta \in \mathbb{R}^p` the calibration parameters we wish to estimate.


The posterior distribution is given by Bayes theorem:

.. math::\pi(\vect\theta | \vect y) \quad \propto \quad L\left(\vect y | \vect\theta\right) \times \pi(\vect\theta):math:``

where :math:`\propto` means "proportional to", regarded as a function of :math:`\vect\theta`.

The posterior distribution is approximated here by the empirical distribution of the sample :math:`\vect\theta^1, \ldots, \vect\theta^N` generated by the Metropolis-Hastings algorithm. This means that any quantity characteristic of the posterior distribution (mean, variance, quantile, ...) is approximated by its empirical counterpart.

Our model (i.e. the compute code to calibrate) is a standard normal linear regression, where

.. math::
   y_i = \theta_1 + x_i \theta_2 + x_i^2 \theta_3 + \varepsilon_i

where :math:`\varepsilon_i \stackrel{i.i.d.}{\sim} \mathcal N(0, 1)`.

The "true" value of :math:`\theta` is:

.. math::
   \vect \theta_{true} = (-4.5,4.8,2.2)^T.


We use a normal prior on :math:`\vect\theta`:

.. math::
   \pi(\vect\theta) = \mathcal N(\vect{\mu}_\vect{\theta}, \mat{\Sigma}_\vect{\theta})

where

.. math::
    \vect{\mu}_\vect{\theta} =
    \begin{pmatrix}
     -3 \\
      4 \\
      1
    \end{pmatrix}

is the mean of the prior and

.. math::
   \mat{\Sigma}_\vect{\theta} =
   \begin{pmatrix}
     \sigma_{\theta_1}^2 & 0 & 0 \\
     0 & \sigma_{\theta_2}^2 & 0 \\
     0 & 0 & \sigma_{\theta_3}^2
   \end{pmatrix}

is the prior covariance matrix with

.. math::
   \sigma_{\theta_1} = 2, \qquad \sigma_{\theta_2} = 1, \qquad \sigma_{\theta_3} = 1.5.

The following objects need to be defined in order to perform Bayesian calibration:

- The conditional density :math:`p(y|\vect z)` must be defined as a probability distribution.
- The computer model must be implemented thanks to the ParametricFunction class.
  This takes a value of :math:`\vect\theta` as input, and outputs the vector of model predictions :math:`\vect z`,
  as defined above (the vector of covariates :math:`\vect x = (x_1, \ldots, x_n)` is treated as a known constant).
  When doing that, we have to keep in mind that :math:`\vect z` will be used as the vector of parameters corresponding
  to the distribution specified for :math:`p(y |\vect z)`. For instance, if :math:`p(y|\vect z)` is normal,
  this means that :math:`\vect z` must be a vector containing the mean and standard deviation of :math:`y`.
- The prior density :math:`\pi(\vect\theta)` encoding the set of possible values for the calibration parameters,
  each value being weighted by its a priori probability, reflecting the beliefs about the possible values
  of :math:`\vect\theta` before consideration of the experimental data.
  Again, this is implemented as a probability distribution.
- Metropolis-Hastings algorithm(s), possibly used in tandem with a Gibbs algorithm
  in order to sample from the posterior distribution of the calibration parameters.

.. GENERATED FROM PYTHON SOURCE LINES 83-90

.. code-block:: default

    import pylab as pl
    import openturns as ot
    import openturns.viewer as viewer
    from matplotlib import pylab as plt

    ot.Log.Show(ot.Log.NONE)


.. GENERATED FROM PYTHON SOURCE LINES 91-92

Dimension of the vector of parameters to calibrate

.. GENERATED FROM PYTHON SOURCE LINES 92-96

.. code-block:: default

    paramDim = 3
    # The number of obesrvations
    obsSize = 10


.. GENERATED FROM PYTHON SOURCE LINES 97-98

Define the observed inputs :math:`x_i`.

.. GENERATED FROM PYTHON SOURCE LINES 100-106

.. code-block:: default

    xmin = -2.0
    xmax = 3.0
    step = (xmax - xmin) / (obsSize - 1)
    rg = ot.RegularGrid(xmin, step, obsSize)
    x_obs = rg.getVertices()


.. GENERATED FROM PYTHON SOURCE LINES 107-113

Define the parametric model :math:`\vect z = f(x,\vect\theta)` that associates each
observation :math:`x` and value of :math:`\vect \theta` to the parameters
of the distribution of the corresponding observation :math:`y`:
here :math:`\vect z=(\mu, \sigma)` where :math:`\mu`,
the first output of the model, is the mean and :math:`\sigma`,
the second output of the model, is the standard deviation.

.. GENERATED FROM PYTHON SOURCE LINES 115-119

.. code-block:: default

    fullModel = ot.SymbolicFunction(
        ["x", "theta1", "theta2", "theta3"], ["theta1+theta2*x+theta3*x^2", "1.0"]
    )


.. GENERATED FROM PYTHON SOURCE LINES 120-127

To differentiate between the two classes of inputs (:math:`x` and :math:`\vect\theta`),
we define a :class:`~openturns.ParametricFunction` from `fullModel`
and make the first input (the observations :math:`x`) its *parameter*:
:math:`f_x(\vect \theta) := f(x, \vect \theta)`.
We set :math:`x = 1` as a placeholder,
but :math:`x` will actually take the values :math:`x_i` of the observations
when we sample :math:`\vect\theta`.

.. GENERATED FROM PYTHON SOURCE LINES 127-131

.. code-block:: default


    linkFunction = ot.ParametricFunction(fullModel, [0], [1.0])
    print(linkFunction)


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    ParametricEvaluation([x,theta1,theta2,theta3]->[theta1+theta2*x+theta3*x^2,1.0], parameters positions=[0], parameters=[x : 1], input positions=[1,2,3])


.. GENERATED FROM PYTHON SOURCE LINES 132-133

Define the observation noise :math:`\varepsilon {\sim} \mathcal N(0, 1)` and create a sample from it.

.. GENERATED FROM PYTHON SOURCE LINES 135-140

.. code-block:: default

    ot.RandomGenerator.SetSeed(0)
    noiseStandardDeviation = 1.0
    noise = ot.Normal(0, noiseStandardDeviation)
    noiseSample = noise.getSample(obsSize)


.. GENERATED FROM PYTHON SOURCE LINES 141-143

Define the vector of observations :math:`y_i`,
here sampled using the "true" value of :math:`\vect \theta`: :math:`\vect \theta_{true}`.

.. GENERATED FROM PYTHON SOURCE LINES 145-147

.. code-block:: default

    thetaTrue = [-4.5, 4.8, 2.2]


.. GENERATED FROM PYTHON SOURCE LINES 148-153

.. code-block:: default

    y_obs = ot.Sample(obsSize, 1)
    for i in range(obsSize):
        linkFunction.setParameter(x_obs[i])
        y_obs[i, 0] = linkFunction(thetaTrue)[0] + noiseSample[i, 0]


.. GENERATED FROM PYTHON SOURCE LINES 154-155

Draw the model predictions vs the observations.

.. GENERATED FROM PYTHON SOURCE LINES 157-167

.. code-block:: default

    functionnalModel = ot.ParametricFunction(fullModel, [1, 2, 3], thetaTrue)
    graphModel = functionnalModel.getMarginal(0).draw(xmin, xmax)
    observations = ot.Cloud(x_obs, y_obs)
    observations = ot.Cloud(x_obs, y_obs)
    observations.setColor("red")
    graphModel.add(observations)
    graphModel.setLegends(["Model", "Observations"])
    graphModel.setLegendPosition("topleft")
    view = viewer.View(graphModel)


.. image-sg:: /auto_calibration/bayesian_calibration/images/sphx_glr_plot_bayesian_calibration_001.png
   :alt: y0 as a function of x
   :srcset: /auto_calibration/bayesian_calibration/images/sphx_glr_plot_bayesian_calibration_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 168-171

Define the distribution of observations :math:`y | \vect{z}` conditional on model predictions.

Note that its parameter dimension is the one of :math:`\vect{z}`, so the model must be adjusted accordingly.

.. GENERATED FROM PYTHON SOURCE LINES 173-175

.. code-block:: default

    conditional = ot.Normal()


.. GENERATED FROM PYTHON SOURCE LINES 176-177

Define the mean :math:`\mu_\theta`, the covariance matrix :math:`\Sigma_\theta`, then the prior distribution :math:`\pi(\vect\theta)` of the parameter :math:`\vect\theta`.

.. GENERATED FROM PYTHON SOURCE LINES 179-181

.. code-block:: default

    thetaPriorMean = [-3.0, 4.0, 1.0]


.. GENERATED FROM PYTHON SOURCE LINES 182-190

.. code-block:: default

    sigma0 = [2.0, 1.0, 1.5]  # standard deviations
    thetaPriorCovarianceMatrix = ot.CovarianceMatrix(paramDim)
    for i in range(paramDim):
        thetaPriorCovarianceMatrix[i, i] = sigma0[i] ** 2

    prior = ot.Normal(thetaPriorMean, thetaPriorCovarianceMatrix)
    prior.setDescription(["theta1", "theta2", "theta3"])


.. GENERATED FROM PYTHON SOURCE LINES 191-194

The proposed steps for
:math:`\theta_1`, :math:`\theta_2` and :math:`\theta_3`
will all follow a uniform distribution.

.. GENERATED FROM PYTHON SOURCE LINES 194-197

.. code-block:: default


    proposal = ot.Uniform(-1.0, 1.0)


.. GENERATED FROM PYTHON SOURCE LINES 198-200

Test the Metropolis-Hastings sampler
------------------------------------

.. GENERATED FROM PYTHON SOURCE LINES 202-204

Creation of a single component random walk Metropolis-Hastings (RWMH) sampler.
This involves a combination of the RWMH and the Gibbs algorithms.

.. GENERATED FROM PYTHON SOURCE LINES 206-208

.. code-block:: default

    initialState = thetaPriorMean


.. GENERATED FROM PYTHON SOURCE LINES 209-212

We create a :class:`~openturns.RandomWalkMetropolisHastings` sampler for each component.
Each sampler must be aware of the joint prior distribution.
We also use the same proposal distribution, but this is not mandatory.

.. GENERATED FROM PYTHON SOURCE LINES 212-218

.. code-block:: default


    mh_coll = [
        ot.RandomWalkMetropolisHastings(prior, initialState, proposal, [i])
        for i in range(paramDim)
    ]


.. GENERATED FROM PYTHON SOURCE LINES 219-221

Each sampler must be made aware of the likelihood.
Otherwise we would sample from the prior!

.. GENERATED FROM PYTHON SOURCE LINES 221-225

.. code-block:: default


    for mh in mh_coll:
        mh.setLikelihood(conditional, y_obs, linkFunction, x_obs)


.. GENERATED FROM PYTHON SOURCE LINES 226-227

Finally, the :class:`~openturns.Gibbs` algorithm is constructed from all Metropolis-Hastings samplers.

.. GENERATED FROM PYTHON SOURCE LINES 227-230

.. code-block:: default


    sampler = ot.Gibbs(mh_coll)


.. GENERATED FROM PYTHON SOURCE LINES 231-232

Tuning of the Gibbs algorithm:

.. GENERATED FROM PYTHON SOURCE LINES 234-237

.. code-block:: default

    sampler.setThinning(1)
    sampler.setBurnIn(2000)


.. GENERATED FROM PYTHON SOURCE LINES 238-239

Generate a sample from the posterior distribution of the parameters :math:`\vect \theta`.

.. GENERATED FROM PYTHON SOURCE LINES 241-244

.. code-block:: default

    sampleSize = 10000
    sample = sampler.getSample(sampleSize)


.. GENERATED FROM PYTHON SOURCE LINES 245-248

Look at the acceptance rate (basic check of the sampling efficiency:
values close to :math:`0.2` are usually recommended
for Normal posterior distributions).

.. GENERATED FROM PYTHON SOURCE LINES 250-252

.. code-block:: default

    [mh.getAcceptanceRate() for mh in sampler.getMetropolisHastingsCollection()]


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none


    [0.46225, 0.29283333333333333, 0.12466666666666666]


.. GENERATED FROM PYTHON SOURCE LINES 253-254

Build the distribution of the posterior by kernel smoothing.

.. GENERATED FROM PYTHON SOURCE LINES 256-259

.. code-block:: default

    kernel = ot.KernelSmoothing()
    posterior = kernel.build(sample)


.. GENERATED FROM PYTHON SOURCE LINES 260-261

Display prior vs posterior for each parameter.

.. GENERATED FROM PYTHON SOURCE LINES 263-278

.. code-block:: default


    fig = pl.figure(figsize=(12, 4))

    for parameter_index in range(paramDim):
        graph = posterior.getMarginal(parameter_index).drawPDF()
        priorGraph = prior.getMarginal(parameter_index).drawPDF()
        priorGraph.setColors(["blue"])
        graph.add(priorGraph)
        graph.setLegends(["Posterior", "Prior"])
        ax = fig.add_subplot(1, paramDim, parameter_index + 1)
        _ = ot.viewer.View(graph, figure=fig, axes=[ax])

    _ = fig.suptitle("Bayesian calibration")

    plt.show()


.. image-sg:: /auto_calibration/bayesian_calibration/images/sphx_glr_plot_bayesian_calibration_002.png
   :alt: Bayesian calibration
   :srcset: /auto_calibration/bayesian_calibration/images/sphx_glr_plot_bayesian_calibration_002.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  1.766 seconds)


.. _sphx_glr_download_auto_calibration_bayesian_calibration_plot_bayesian_calibration.py:


.. only :: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example


  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: plot_bayesian_calibration.py <plot_bayesian_calibration.py>`


  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: plot_bayesian_calibration.ipynb <plot_bayesian_calibration.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_