{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n# Posterior sampling using a PythonDistribution\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this example we are going to show how to do Bayesian inference using the :class:~openturns.RandomWalkMetropolisHastings algorithm in a statistical model defined through a :class:~openturns.PythonDistribution.\n\nThis method is illustrated on a simple lifetime study test-case, which involves censored data, as described hereafter.\n\nIn the following, we assume that the lifetime $T_i$ of an industrial component follows the Weibull distribution $\\mathcal W(\\alpha, \\beta)$, with CDF given by $F(t|\\alpha,\\beta)= 1 - e^{-\\left( \\frac{t}{\\beta} \\right)^\\alpha}$.\n\nOur goal is to estimate the model parameters $\\alpha, \\beta$ based on a dataset of recorded failures $(t_1, \\ldots, t_n),$ some of which correspond to actual failures, and the remaining are right-censored. Let $(f_1, \\ldots, f_n) \\in \\{0,1\\}^n$ represent the nature of each datum, $f_i=1$ if $t_i$ corresponds to an actual failure, $f_i=0$ if it is right-censored.\n\nNote that the likelihood of each recorded failure is given by the Weibull density:\n\n\\begin{align}\\mathcal L(t_i | f_i=1, \\alpha, \\beta) = \\frac{\\alpha}{\\beta}\\left( \\frac{t_i}{\\beta} \\right)^{\\alpha-1} e^{-\\left( \\frac{t_i}{\\beta} \\right)^\\alpha}.\\end{align}\n\nOn the other hand, the likelihood of each right-censored observation is given by:\n\n\\begin{align}\\mathcal L(t_i | f_i=0, \\alpha, \\beta) = e^{-\\left( \\frac{t_i}{\\beta} \\right)^\\alpha}.\\end{align}\n\nFurthermore, assume that the prior information available on $\\alpha, \\beta$ is represented by independent prior laws, whose respective densities are denoted by $\\pi(\\alpha)$ and $\\pi(\\beta).$\n\nThe posterior distribution of $(\\alpha, \\beta)$ represents the update of the prior information on $(\\alpha, \\beta)$ given the dataset.\nIts PDF is known up to a multiplicative constant:\n\n\n\\begin{align}\\pi(\\alpha, \\beta | (t_1, f_1), \\ldots, (t_n, f_n) ) \\propto \\pi(\\alpha)\\pi(\\beta) \\left(\\frac{\\alpha}{\\beta}\\right)^{\\sum_i f_i} \\left(\\prod_{f_i = 1} \\frac{t_i}{\\beta}\\right)^{\\alpha-1} \\exp\\left[-\\sum_{i=1}^n\\left(\\frac{t_i}{\\beta}\\right)^\\alpha\\right].\\end{align}\n\nThe :class:~openturns.RandomWalkMetropolisHastings class can be used to sample from the posterior distribution. It relies on the following objects:\n\n- The conditional density $p(t_{1:n}|f_{1:n}, \\alpha, \\beta)$ will be defined as a :class:~openturns.PythonDistribution.\n- The prior probability density $\\pi(\\vect{\\theta})$ reflects beliefs about the possible values\n of $\\vect{\\theta} = (\\alpha, \\beta)$ before the experimental data are considered.\n- Initial values $\\vect{\\theta}_0$ for the calibration parameters.\n- Proposal distributions used to update each parameter sequentially.\n\n## Set up the PythonDistribution\n\n\nThe censured Weibuill likelihood is outside the usual catalog of probability distributions in OpenTURNS, hence we need to define it using the :class:~openturns.PythonDistribution class.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import numpy as np\nimport openturns as ot\nfrom openturns.viewer import View\not.Log.Show(ot.Log.NONE)\not.RandomGenerator.SetSeed(123)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following methods must be defined:\n\n- getRange: required for conversion to the :class:~openturns.Distribution format\n- computeLogPDF: used by :class:~openturns.RandomWalkMetropolisHastings to evaluate the posterior density\n- setParameter used by :class:~openturns.RandomWalkMetropolisHastings to test new parameter values\n\n

#### Note

We formally define a bivariate distribution on the $(t_i, f_i)$ couple, even though $f_i$ has no distribution (it is simply a covariate).\n This is not an issue, since the sole purpose of this :class:~openturns.PythonDistribution object is to pass the likelihood calculation over to :class:~openturns.RandomWalkMetropolisHastings.

\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "class CensoredWeibull(ot.PythonDistribution):\n \"\"\"\n Right-censored Weibull log-PDF calculation\n Each data point x is assumed 2D, with:\n x: observed functioning time\n x: nature of x:\n if x=0: x is a censoring time\n if x=1: x is a time-to failure\n \"\"\"\n def __init__(self, beta=5000.0, alpha=2.0):\n super(CensoredWeibull, self).__init__(2)\n self.beta = beta\n self.alpha = alpha\n\n def getRange(self):\n return ot.Interval([0, 0], [1, 1], [True]*2, [False, True])\n\n def computeLogPDF(self, x):\n if not (self.alpha>0.0 and self.beta>0.0):\n return -np.inf\n log_pdf = -( x / self.beta )**self.alpha\n log_pdf += ( self.alpha - 1 ) * np.log( x / self.beta ) * x\n log_pdf += np.log( self.alpha / self.beta ) * x\n return log_pdf\n\n def setParameter( self, parameter ):\n self.beta = parameter\n self.alpha = parameter\n\n def getParameter( self ):\n return [self.beta, self.alpha]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Convert to :class:~openturns.Distribution\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "conditional = ot.Distribution( CensoredWeibull() )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Observations, prior, initial point and proposal distributions\n\nDefine the observations\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "Tobs = np.array([4380, 1791, 1611, 1291, 6132, 5694, 5296, 4818, 4818, 4380])\nfail = np.array( [True]*4+[False]*6 )\nx = ot.Sample( np.vstack((Tobs, fail)).T )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define a uniform prior distribution for $\\alpha$ and a Gamma prior distribution for $\\beta$\n\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "alpha_min, alpha_max = 0.5, 3.8\na_beta, b_beta = 2, 2e-4\n\npriorCopula = ot.IndependentCopula(2)# prior independence\npriorMarginals = [] # prior marginals\npriorMarginals.append(ot.Gamma(a_beta, b_beta)) # Gamma prior for beta\npriorMarginals.append(ot.Uniform(alpha_min, alpha_max)) # uniform prior for alpha\nprior=ot.ComposedDistribution( priorMarginals, priorCopula )\nprior.setDescription(['beta','alpha'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We select prior means as the initial point of the Metropolis-Hastings algorithm.\n\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "initialState = ot.Point([a_beta / b_beta, 0.5*(alpha_max - alpha_min)])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For our random walk proposal distributions, we choose normal steps, with standard deviation equal to roughly $10\\%$ of the prior range (for the uniform prior) or standard deviation (for the normal prior).\n\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "proposal=[]\nproposal.append( ot.Normal(0., 0.1 * np.sqrt( a_beta / b_beta**2 ) ) )\nproposal.append( ot.Normal(0., 0.1 * ( alpha_max - alpha_min ) ) )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Sample from the posterior distribution\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "RWMHsampler = ot.RandomWalkMetropolisHastings(prior, conditional, x, initialState, proposal)\nstrategy = ot.CalibrationStrategyCollection(2)\nRWMHsampler.setCalibrationStrategyPerComponent(strategy)\nRWMHsampler.setVerbose(True)\nsampleSize = 10000\nsample = RWMHsampler.getSample(sampleSize)\n# compute acceptance rate\nprint(\"Acceptance rate: %s\"%(RWMHsampler.getAcceptanceRate()))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot prior to posterior marginal plots\n\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "kernel = ot.KernelSmoothing()\nposterior = kernel.build(sample)\ngrid = ot.GridLayout(1, 2)\ngrid.setTitle('Bayesian inference')\nfor parameter_index in range(2):\n graph = posterior.getMarginal(parameter_index).drawPDF()\n priorGraph = prior.getMarginal(parameter_index).drawPDF()\n graph.add(priorGraph)\n graph.setColors(ot.Drawable.BuildDefaultPalette(2))\n graph.setLegends(['Posterior', 'Prior'])\n grid.setGraph(0, parameter_index, graph)\n_ = View(grid)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.5" } }, "nbformat": 4, "nbformat_minor": 0 }