.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "auto_calibration/bayesian_calibration/plot_bayesian_calibration.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note Click :ref:`here ` to download the full example code .. rst-class:: sphx-glr-example-title .. _sphx_glr_auto_calibration_bayesian_calibration_plot_bayesian_calibration.py: Bayesian calibration of a computer code ======================================= .. GENERATED FROM PYTHON SOURCE LINES 6-79 In this example we are going to compute the parameters of a computer model thanks to Bayesian estimation. Let us denote :math:`\underline y = (y_1, \dots, y_n)` the observation sample, :math:`\underline z = (f(x_1|\underline{\theta}), \ldots, f(x_n|\underline{\theta}))` the model prediction, :math:`p(y |z)` the density function of observation :math:`y` conditional on model prediction :math:`z`, and :math:`\underline{\theta} \in \mathbb{R}^p` the calibration parameters we wish to estimate. The posterior distribution is given by Bayes theorem: .. math::\pi(\underline{\theta} | \underline y) \quad \propto \quad L\left(\underline y | \underline{\theta}\right) \times \pi(\underline{\theta}):math:`` where :math:`\propto` means "proportional to", regarded as a function of :math:`\underline{\theta}`. The posterior distribution is approximated here by the empirical distribution of the sample :math:`\underline{\theta}^1, \ldots, \underline{\theta}^N` generated by the Metropolis-Hastings algorithm. This means that any quantity characteristic of the posterior distribution (mean, variance, quantile, ...) is approximated by its empirical counterpart. Our model (i.e. the compute code to calibrate) is a standard normal linear regression, where .. math:: y_i = \theta_1 + x_i \theta_2 + x_i^2 \theta_3 + \varepsilon_i where :math:`\varepsilon_i \stackrel{i.i.d.}{\sim} \mathcal N(0, 1)`. The "true" value of :math:`\theta` is: .. math:: \theta_{true} = (-4.5,4.8,2.2)^T. We use a normal prior on :math:`\underline{\theta}`: .. math:: \pi(\underline{\theta}) = \mathcal N(\mu_\theta, \Sigma_\theta) where .. math:: \mu_\theta = \begin{pmatrix} -3 \\ 4 \\ 1 \end{pmatrix} is the mean of the prior and .. math:: \Sigma_\theta = \begin{pmatrix} \sigma_{\theta_1}^2 & 0 & 0 \\ 0 & \sigma_{\theta_2}^2 & 0 \\ 0 & 0 & \sigma_{\theta_3}^2 \end{pmatrix} is the prior covariance matrix with .. math:: \sigma_{\theta_1} = 2, \qquad \sigma_{\theta_2} = 1, \qquad \sigma_{\theta_3} = 1.5. The following objects need to be defined in order to perform Bayesian calibration: - The conditional density :math:`p(y|z)` must be defined as a probability distribution - The computer model must be implemented thanks to the ParametricFunction class. This takes a value of :math:`\underline{\theta}` as input, and outputs the vector of model predictions :math:`\underline z`, as defined above (the vector of covariates :math:`\underline x = (x_1, \ldots, x_n)` is treated as a known constant). When doing that, we have to keep in mind that :math:`z` will be used as the vector of parameters corresponding to the distribution specified for :math:`p(y |z)`. For instance, if :math:`p(y|z)` is normal, this means that :math:`z` must be a vector containing the mean and variance of :math:`y` - The prior density :math:`\pi(\underline{\theta})` encoding the set of possible values for the calibration parameters, each value being weighted by its a priori probability, reflecting the beliefs about the possible values of :math:`\underline{\theta}` before consideration of the experimental data. Again, this is implemented as a probability distribution - The Metropolis-Hastings algorithm that samples from the posterior distribution of the calibration parameters requires a vector :math:`\underline{\theta}_0` initial values for the calibration parameters, as well as the proposal laws used to update each parameter sequentially. .. GENERATED FROM PYTHON SOURCE LINES 81-86 .. code-block:: default import openturns as ot import openturns.viewer as viewer from matplotlib import pylab as plt ot.Log.Show(ot.Log.NONE) .. GENERATED FROM PYTHON SOURCE LINES 87-88 Dimension of the vector of parameters to calibrate .. GENERATED FROM PYTHON SOURCE LINES 88-92 .. code-block:: default paramDim = 3 # The number of obesrvations obsSize = 10 .. GENERATED FROM PYTHON SOURCE LINES 93-94 - Define the observed inputs :math:`x_i` .. GENERATED FROM PYTHON SOURCE LINES 96-103 .. code-block:: default xmin = -2. xmax = 3. step = (xmax-xmin)/(obsSize-1) rg = ot.RegularGrid(xmin, step, obsSize) x_obs = rg.getVertices() x_obs .. raw:: html
t
0-2
1-1.444444
2-0.8888889
3-0.3333333
40.2222222
50.7777778
61.333333
71.888889
82.444444
93


.. GENERATED FROM PYTHON SOURCE LINES 104-105 - Define the parametric model :math:`z = f(x,\theta)` that associates each observation :math:`x_i` and values of the parameters :math:`\theta_i` to the parameters of the distribution of the corresponding observation: here :math:`z=(\mu, \sigma)` where :math:`\mu`, the first output of the model, is the mean and :math:`\sigma`, the second output of the model, is the standard deviation. .. GENERATED FROM PYTHON SOURCE LINES 107-112 .. code-block:: default fullModel = ot.SymbolicFunction( ['x1', 'theta1', 'theta2', 'theta3'], ['theta1+theta2*x1+theta3*x1^2','1.0']) model = ot.ParametricFunction(fullModel, [0], x_obs[0]) model .. raw:: html

ParametricEvaluation([x1,theta1,theta2,theta3]->[theta1+theta2*x1+theta3*x1^2,1.0], parameters positions=[0], parameters=[x1 : -2], input positions=[1,2,3])



.. GENERATED FROM PYTHON SOURCE LINES 113-114 - Define the observation noise :math:`\varepsilon {\sim} \mathcal N(0, 1)` and create a sample from it. .. GENERATED FROM PYTHON SOURCE LINES 116-122 .. code-block:: default ot.RandomGenerator.SetSeed(0) noiseStandardDeviation = 1. noise = ot.Normal(0,noiseStandardDeviation) noiseSample = noise.getSample(obsSize) noiseSample .. raw:: html
X0
00.6082017
1-1.266173
2-0.4382656
31.205478
4-2.181385
50.3500421
6-0.355007
71.437249
80.810668
90.793156


.. GENERATED FROM PYTHON SOURCE LINES 123-124 - Define the vector of observations :math:`y_i` .. GENERATED FROM PYTHON SOURCE LINES 126-127 In this model, we use a constant value of the parameter. The "true" value of :math:`\theta` is used to compute the model outputs. .. GENERATED FROM PYTHON SOURCE LINES 129-131 .. code-block:: default thetaTrue = [-4.5,4.8,2.2] .. GENERATED FROM PYTHON SOURCE LINES 132-138 .. code-block:: default y_obs = ot.Sample(obsSize,1) for i in range(obsSize): model.setParameter(x_obs[i]) y_obs[i,0] = model(thetaTrue)[0] + noiseSample[i,0] y_obs .. raw:: html
v0
0-4.691798
1-8.109383
2-7.466661
3-4.650077
4-5.506077
50.9142396
65.456104
713.8533
821.18968
930.49316


.. GENERATED FROM PYTHON SOURCE LINES 139-140 - Draw the model vs the observations. .. GENERATED FROM PYTHON SOURCE LINES 142-152 .. code-block:: default functionnalModel = ot.ParametricFunction(fullModel, [1,2,3], thetaTrue) graphModel = functionnalModel.getMarginal(0).draw(xmin,xmax) observations = ot.Cloud(x_obs,y_obs) observations = ot.Cloud(x_obs,y_obs) observations.setColor("red") graphModel.add(observations) graphModel.setLegends(["Model","Observations"]) graphModel.setLegendPosition("topleft") view = viewer.View(graphModel) .. image:: /auto_calibration/bayesian_calibration/images/sphx_glr_plot_bayesian_calibration_001.png :alt: y0 as a function of x1 :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 153-156 - Define the distribution of observations :math:`\underline{y} | \underline{z}` conditional on model predictions Note that its parameter dimension is the one of :math:`\underline{z}`, so the model must be adjusted accordingly .. GENERATED FROM PYTHON SOURCE LINES 158-161 .. code-block:: default conditional = ot.Normal() conditional .. raw:: html

Normal(mu = 0, sigma = 1)



.. GENERATED FROM PYTHON SOURCE LINES 162-163 - Define the mean :math:`\mu_\theta`, the covariance matrix :math:`\Sigma_\theta`, then the prior distribution :math:`\pi(\underline{\theta})` of the parameter :math:`\underline{\theta}`. .. GENERATED FROM PYTHON SOURCE LINES 165-167 .. code-block:: default thetaPriorMean = [-3.,4.,1.] .. GENERATED FROM PYTHON SOURCE LINES 168-177 .. code-block:: default sigma0 = [2.,1.,1.5] # standard deviations thetaPriorCovarianceMatrix = ot.CovarianceMatrix(paramDim) for i in range(paramDim): thetaPriorCovarianceMatrix[i, i] = sigma0[i]**2 prior = ot.Normal(thetaPriorMean, thetaPriorCovarianceMatrix) prior.setDescription(['theta1', 'theta2', 'theta3']) prior .. raw:: html

Normal(mu = [-3,4,1], sigma = [2,1,1.5], R = [[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ]])



.. GENERATED FROM PYTHON SOURCE LINES 178-179 - Proposal distribution: uniform. .. GENERATED FROM PYTHON SOURCE LINES 181-184 .. code-block:: default proposal = [ot.Uniform(-1., 1.)] * paramDim proposal .. rst-class:: sphx-glr-script-out Out: .. code-block:: none [class=Uniform name=Uniform dimension=1 a=-1 b=1, class=Uniform name=Uniform dimension=1 a=-1 b=1, class=Uniform name=Uniform dimension=1 a=-1 b=1] .. GENERATED FROM PYTHON SOURCE LINES 185-189 Test the MCMC sampler --------------------- The MCMC sampler essentially computes the log-likelihood of the parameters. .. GENERATED FROM PYTHON SOURCE LINES 191-193 .. code-block:: default mymcmc = ot.MCMC(prior, conditional, model, x_obs, y_obs, thetaPriorMean) .. GENERATED FROM PYTHON SOURCE LINES 194-196 .. code-block:: default mymcmc.computeLogLikelihood(thetaPriorMean) .. rst-class:: sphx-glr-script-out Out: .. code-block:: none -151.2962855240547 .. GENERATED FROM PYTHON SOURCE LINES 197-199 Test the Metropolis-Hastings sampler ------------------------------------ .. GENERATED FROM PYTHON SOURCE LINES 201-202 - Creation of the Random Walk Metropolis-Hastings (RWMH) sampler. .. GENERATED FROM PYTHON SOURCE LINES 204-206 .. code-block:: default initialState = thetaPriorMean .. GENERATED FROM PYTHON SOURCE LINES 207-210 .. code-block:: default RWMHsampler = ot.RandomWalkMetropolisHastings( prior, conditional, model, x_obs, y_obs, initialState, proposal) .. GENERATED FROM PYTHON SOURCE LINES 211-212 In order to check our model before simulating it, we compute the log-likelihood. .. GENERATED FROM PYTHON SOURCE LINES 214-216 .. code-block:: default RWMHsampler.computeLogLikelihood(initialState) .. rst-class:: sphx-glr-script-out Out: .. code-block:: none -151.2962855240547 .. GENERATED FROM PYTHON SOURCE LINES 217-218 We observe that, as expected, the previous value is equal to the output of the same method in the MCMC object. .. GENERATED FROM PYTHON SOURCE LINES 220-221 Tuning of the RWMH algorithm. .. GENERATED FROM PYTHON SOURCE LINES 223-224 Strategy of calibration for the random walk (trivial example: default). .. GENERATED FROM PYTHON SOURCE LINES 226-229 .. code-block:: default strategy = ot.CalibrationStrategyCollection(paramDim) RWMHsampler.setCalibrationStrategyPerComponent(strategy) .. GENERATED FROM PYTHON SOURCE LINES 230-231 Other parameters. .. GENERATED FROM PYTHON SOURCE LINES 233-237 .. code-block:: default RWMHsampler.setVerbose(True) RWMHsampler.setThinning(1) RWMHsampler.setBurnIn(2000) .. GENERATED FROM PYTHON SOURCE LINES 238-239 Generate a sample from the posterior distribution of the parameters theta. .. GENERATED FROM PYTHON SOURCE LINES 241-244 .. code-block:: default sampleSize = 10000 sample = RWMHsampler.getSample(sampleSize) .. GENERATED FROM PYTHON SOURCE LINES 245-246 Look at the acceptance rate (basic checking of the efficiency of the tuning; value close to 0.2 usually recommended). .. GENERATED FROM PYTHON SOURCE LINES 248-250 .. code-block:: default RWMHsampler.getAcceptanceRate() .. raw:: html

[0.456667,0.2955,0.1305]



.. GENERATED FROM PYTHON SOURCE LINES 251-252 Build the distribution of the posterior by kernel smoothing. .. GENERATED FROM PYTHON SOURCE LINES 254-257 .. code-block:: default kernel = ot.KernelSmoothing() posterior = kernel.build(sample) .. GENERATED FROM PYTHON SOURCE LINES 258-259 Display prior vs posterior for each parameter. .. GENERATED FROM PYTHON SOURCE LINES 261-278 .. code-block:: default from openturns.viewer import View import pylab as pl fig = pl.figure(figsize=(12, 4)) for parameter_index in range(paramDim): graph = posterior.getMarginal(parameter_index).drawPDF() priorGraph = prior.getMarginal(parameter_index).drawPDF() priorGraph.setColors(['blue']) graph.add(priorGraph) graph.setLegends(['Posterior', 'Prior']) ax = fig.add_subplot(1, paramDim, parameter_index+1) _ = ot.viewer.View(graph, figure=fig, axes=[ax]) _ = fig.suptitle("Bayesian calibration") plt.show() .. image:: /auto_calibration/bayesian_calibration/images/sphx_glr_plot_bayesian_calibration_002.png :alt: Bayesian calibration :class: sphx-glr-single-img .. rst-class:: sphx-glr-timing **Total running time of the script:** ( 0 minutes 1.542 seconds) .. _sphx_glr_download_auto_calibration_bayesian_calibration_plot_bayesian_calibration.py: .. only :: html .. container:: sphx-glr-footer :class: sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: plot_bayesian_calibration.py ` .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: plot_bayesian_calibration.ipynb ` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_