.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_probabilistic_modeling/distributions/plot_mixture_distribution.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_probabilistic_modeling_distributions_plot_mixture_distribution.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_probabilistic_modeling_distributions_plot_mixture_distribution.py:


Create a mixture of distributions
=================================

.. GENERATED FROM PYTHON SOURCE LINES 7-44

Introduction
------------

In this example, we show how to build a distribution which is a mixture of a
collection of distributions of the same dimension :math:`d \geq 1`.
We denote by :math:`\inputRV` the random vector with such a distribution.

Let :math:`(\cL_1, \dots, \cL_N)` be a collection of distributions and
:math:`(\alpha_1, \dots, \alpha_N)` a collection of positive weights such that
:math:`\sum_{i=1}^N \alpha_i = 1`. Let :math:`\inputRV_i` be a random vector
whose distribution is :math:`\cL_i`.

If all the :math:`\cL_i` have a probability density function :math:`\mu_i`,
then the mixture has a probability density function :math:`\inputMeasure`
defined by:

.. math::
    \inputMeasure(\vect{x}) = \sum_{i=1}^N \alpha_i \mu_i(\vect{x})

If all the :math:`\cL_i` are discrete, then the mixture is discrete and its
probability distribution function is defined by:

.. math::
    \Prob{\inputRV = \vect{x}} = \sum_{i=1}^N \alpha_i \Prob{\vect{X}_i = \vect{x}}

If some of the :math:`\cL_i` have a probability density function while the
others are discrete, then the mixture is neither continuous nor discrete and
does not have a probability density function. Its cumulative distribution
function is defined by:

.. math::
    F_\inputRV(\vect{x}) = \sum_{i=1}^N \alpha_i F_{\vect{X}_i}(\vect{x})

We illustrate the following particular cases:

- Case 1a: Mixture of continuous distributions,
- Case 1b: Mixture of copulas,
- Case 1c: Mixture of a Histogram and a Generalized Pareto Distribution,
- Case 2: Mixture of discrete distributions,
- Case 3: Mixture of discrete and continuous distributions.

.. GENERATED FROM PYTHON SOURCE LINES 47-51

.. code-block:: Python

    import openturns as ot
    import openturns.viewer as otv

.. GENERATED FROM PYTHON SOURCE LINES 52-61

Case 1a: Mixture of continuous distributions
--------------------------------------------

In this case, we build the mixture of the following continuous distributions:

- a :class:`~openturns.Triangular`,
- a :class:`~openturns.Normal`,
- a :class:`~openturns.Uniform`.

The weights are automatically normalized.

.. GENERATED FROM PYTHON SOURCE LINES 63-64

We define the collection of distributions and the associated weights.

.. GENERATED FROM PYTHON SOURCE LINES 64-71

.. code-block:: Python

    distributions = [
        ot.Triangular(1.0, 2.0, 4.0),
        ot.Normal(-1.0, 1.0),
        ot.Uniform(5.0, 6.0),
    ]
    weights = [0.4, 1.0, 0.2]

.. GENERATED FROM PYTHON SOURCE LINES 72-73

We create the mixture.

.. GENERATED FROM PYTHON SOURCE LINES 73-76

.. code-block:: Python

    distribution = ot.Mixture(distributions, weights)
    print(distribution)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Mixture((w = 0.25, d = Triangular(a = 1, m = 2, b = 4)), (w = 0.625, d = Normal(mu = -1, sigma = 1)), (w = 0.125, d = Uniform(a = 5, b = 6)))

.. GENERATED FROM PYTHON SOURCE LINES 77-78

We can draw the probability density function.

.. GENERATED FROM PYTHON SOURCE LINES 78-84

.. code-block:: Python

    graph = distribution.drawPDF()
    graph.setTitle("Mixture of Triangular, Normal, Uniform")
    graph.setXTitle("x")
    graph.setLegendPosition("")
    view = otv.View(graph)

.. image-sg:: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_001.png
   :alt: Mixture of Triangular, Normal, Uniform
   :srcset: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 85-86

We can draw the cumulative distribution function.

.. GENERATED FROM PYTHON SOURCE LINES 86-92

.. code-block:: Python

    graph = distribution.drawCDF()
    graph.setTitle("Mixture of Triangular, Normal, Uniform")
    graph.setXTitle("x")
    graph.setLegendPosition("")
    view = otv.View(graph)

.. image-sg:: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_002.png
   :alt: Mixture of Triangular, Normal, Uniform
   :srcset: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_002.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 93-99

Case 1b: Mixture of copulas
---------------------------

In this case, we build the mixture of the following copulas:

- a :class:`~openturns.GumbelCopula`,
- a :class:`~openturns.ClaytonCopula`.

.. GENERATED FROM PYTHON SOURCE LINES 101-102

We define the collection of copulas and the associated weights.

.. GENERATED FROM PYTHON SOURCE LINES 102-105

.. code-block:: Python

    copulas = [ot.GumbelCopula(4.5), ot.ClaytonCopula(2.3)]
    weights = [0.2, 0.8]

.. GENERATED FROM PYTHON SOURCE LINES 106-107

We create a mixture of copulas.

.. GENERATED FROM PYTHON SOURCE LINES 107-110

.. code-block:: Python

    distribution = ot.Mixture(copulas, weights)
    print(distribution)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Mixture((w = 0.2, d = GumbelCopula(theta = 4.5)), (w = 0.8, d = ClaytonCopula(theta = 2.3)))

.. GENERATED FROM PYTHON SOURCE LINES 111-112

We can draw the probability density function.

.. GENERATED FROM PYTHON SOURCE LINES 112-119

.. code-block:: Python

    graph = distribution.drawPDF()
    graph.setTitle("Mixture of Gumbel copula, Clayton copula")
    graph.setXTitle(r"$x_0$")
    graph.setYTitle(r"$x_1$")
    graph.setLegendPosition("")
    view = otv.View(graph)

.. image-sg:: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_003.png
   :alt: Mixture of Gumbel copula, Clayton copula
   :srcset: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_003.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 120-121

We can draw the cumulative distribution function.

.. GENERATED FROM PYTHON SOURCE LINES 121-128

.. code-block:: Python

    graph = distribution.drawCDF()
    graph.setTitle("Mixture of Gumbel copula, Clayton copula")
    graph.setXTitle(r"$x_0$")
    graph.setYTitle(r"$x_1$")
    view = otv.View(graph)

.. image-sg:: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_004.png
   :alt: Mixture of Gumbel copula, Clayton copula
   :srcset: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_004.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 129-149

Case 1c: Mixture of a Histogram and a Generalized Pareto Distribution
---------------------------------------------------------------------

We want to create the scalar distribution of :math:`X` such that:

.. math::
    X | X \leq x_0 & \sim \mathcal{L}_1 \\
    X | X \geq x_0 & \sim \mathcal{L}_2

where:

- :math:`\mathcal{L}_1` is a Histogram,
- :math:`\mathcal{L}_2` is a Generalized Pareto distribution (GPD),
- :math:`x_0` is a high-level quantile of :math:`X`.

Let us define:

.. math::
    w = \Prob{X \leq x_0}

We assume that we only have a sample from :math:`X`.

.. GENERATED FROM PYTHON SOURCE LINES 151-153

In this example, we consider a Normal distribution with zero mean
and unit variance. We generate a sample of size :math:`n`.

.. GENERATED FROM PYTHON SOURCE LINES 153-157

.. code-block:: Python

    n = 5000
    X_dist = ot.Normal()
    sample_X = X_dist.getSample(n)

.. GENERATED FROM PYTHON SOURCE LINES 158-159

We build the whole histogram from the sample.

.. GENERATED FROM PYTHON SOURCE LINES 159-167

.. code-block:: Python

    hist_dist = ot.HistogramFactory().build(sample_X)
    g_hist = hist_dist.drawPDF()
    g_hist.setTitle(r"Empirical distribution of $X$")
    g_hist.setXTitle("x")
    g_hist.setYTitle("pdf")
    g_hist.setLegends(["histogram"])
    view = otv.View(g_hist)

.. image-sg:: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_005.png
   :alt: Empirical distribution of $X$
   :srcset: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_005.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 168-169

We estimate the empirical quantile of level 0.95.

.. GENERATED FROM PYTHON SOURCE LINES 169-172

.. code-block:: Python

    w = 0.95
    x0 = hist_dist.computeQuantile(w)[0]

.. GENERATED FROM PYTHON SOURCE LINES 173-175

We start by truncating the initial histogram on the interval
:math:`]-\infty, x_0]` and we visualize the result.

.. GENERATED FROM PYTHON SOURCE LINES 175-183

.. code-block:: Python

    hist_trunc = ot.TruncatedDistribution(hist_dist, x0, ot.TruncatedDistribution.UPPER)
    g_hist_trunc = hist_trunc.drawPDF()
    g_hist_trunc.setTitle(r"Empirical distribution of $X|X \leq $" + "%.3g" % (x0))
    g_hist_trunc.setXTitle("x")
    g_hist_trunc.setYTitle("pdf")
    g_hist_trunc.setLegends(["truncated histogram"])
    view = otv.View(g_hist_trunc)

.. image-sg:: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_006.png
   :alt: Empirical distribution of $X|X \leq $1.65
   :srcset: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_006.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 184-187

Then we model :math:`X|X \geq x_0` by a Generalized Pareto distribution (GPD).
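
The reason the weights :math:`w` and :math:`(1-w)` will recombine the two
pieces further below is the law of total probability: truncating at
:math:`x_0` rescales each conditional density by the probability mass it
keeps, so the weighted mixture recovers the full density:

.. math::
    f_X(x) = w \, f_{X \mid X \leq x_0}(x) + (1 - w) \, f_{X \mid X > x_0}(x)
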
We start by extracting from the sample all the values greater than :math:`x_0`
to build the upper sample. We get about :math:`n(1-w)` points.

.. GENERATED FROM PYTHON SOURCE LINES 187-195

.. code-block:: Python

    sample_X_upper = ot.Sample(0, 1)
    for i in range(len(sample_X)):
        if sample_X[i, 0] > x0:
            sample_X_upper.add(sample_X[i])
    print("Excess number = ", sample_X_upper.getSize())
    print("n(1-w) = ", int(n * (1 - w)))

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Excess number =  247
    n(1-w) =  250

.. GENERATED FROM PYTHON SOURCE LINES 196-204

Then we fit a GPD parameterized by :math:`(\sigma, \xi, x_0)`: the threshold
is fixed to :math:`x_0`. We use the estimator that maximizes the likelihood.
To solve the optimization problem faster, we start by estimating the 3
parameters :math:`(\sigma, \xi, u)` from the upper sample. Then we fix the
threshold to :math:`u = x_0` and we estimate the remaining parameters
:math:`(\sigma, \xi)` using the previous values of :math:`(\sigma, \xi)` as a
starting point to the optimization problem.
We visualize the PDF of the GPD.

.. GENERATED FROM PYTHON SOURCE LINES 204-219

.. code-block:: Python

    gpd_first = ot.GeneralizedParetoFactory().build(sample_X_upper)
    mlFact = ot.MaximumLikelihoodFactory(gpd_first)
    # We fix the threshold to x0.
    mlFact.setKnownParameter([x0], [2])
    gpd_estimated = mlFact.build(sample_X_upper)
    print("estimated gpd = ", gpd_estimated)

    g_gpd = gpd_estimated.drawPDF()
    g_gpd.setTitle(r"Distribution of $X|X \geq $" + "%.3g" % (x0))
    g_gpd.setXTitle("x")
    g_gpd.setYTitle("pdf")
    g_gpd.setLegends(["GPD"])
    view = otv.View(g_gpd)

.. image-sg:: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_007.png
   :alt: Distribution of $X|X \geq $1.65
   :srcset: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_007.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    estimated gpd =  GeneralizedPareto(sigma = 0.440432, xi=-0.0537874, u=1.65325)

.. GENERATED FROM PYTHON SOURCE LINES 220-222

Then we can create the mixture using the truncated Histogram distribution
below :math:`x_0` and the GPD over :math:`x_0`, weighted by :math:`w` and
:math:`(1-w)`.

.. GENERATED FROM PYTHON SOURCE LINES 222-249

.. code-block:: Python

    mixt_dist = ot.Mixture([hist_trunc, gpd_estimated], [w, 1 - w])
    g_hist.add(mixt_dist.drawPDF())

    ord_Max = max(hist_dist.getImplementation().getHeight())
    line = ot.Curve([x0, x0], [0.0, ord_Max])
    line.setColor("red")
    line.setLineStyle("dashed")
    g_hist.add(line)

    draw_ref = X_dist.drawPDF().getDrawable(0)
    draw_ref.setLineStyle("dashed")
    g_hist.add(draw_ref)

    g_hist.setLegends(["histogram", "mixture", "", "exact dist"])
    g_hist.setTitle(r"Distribution of $X$: Mixture")
    view = otv.View(g_hist)

    # We draw here only the mixture distribution to make the comparison easier.
    g_mixt = mixt_dist.drawPDF()
    g_mixt.setTitle(r"Mixture distribution of $X$")
    g_mixt.setXTitle("x")
    g_mixt.setYTitle("pdf")
    g_mixt.setLegendPosition("")
    view = otv.View(g_mixt)

.. rst-class:: sphx-glr-horizontal


    *

      .. image-sg:: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_008.png
         :alt: Distribution of $X$: Mixture
         :srcset: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_008.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_009.png
         :alt: Mixture distribution of $X$
         :srcset: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_009.png
         :class: sphx-glr-multi-img

.. GENERATED FROM PYTHON SOURCE LINES 250-258

Case 2: Mixture of discrete distributions
-----------------------------------------

In this case, we build the mixture of the following distributions:

- a :class:`~openturns.Poisson`,
- a :class:`~openturns.Geometric`.
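
As elsewhere in this example, the weights need not sum to one. A quick
plain-Python sanity check (not using OpenTURNS) shows how the weights
:math:`[0.4, 1.0]` used below are rescaled to :math:`2/7 \approx 0.2857` and
:math:`5/7 \approx 0.7143`, matching the weights printed by the mixture:

.. code-block:: Python

    # Plain-Python illustration of the normalization performed by ot.Mixture:
    # each weight is divided by the sum of all the weights.
    weights = [0.4, 1.0]
    total = sum(weights)
    normalized = [weight / total for weight in weights]
    print(normalized)  # approximately [0.2857, 0.7143]
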
The weights are automatically normalized.

.. GENERATED FROM PYTHON SOURCE LINES 260-261

We define the collection of distributions and the associated weights.

.. GENERATED FROM PYTHON SOURCE LINES 261-264

.. code-block:: Python

    distributions = [ot.Poisson(1.2), ot.Geometric(0.7)]
    weights = [0.4, 1.0]

.. GENERATED FROM PYTHON SOURCE LINES 265-266

We create the mixture.

.. GENERATED FROM PYTHON SOURCE LINES 266-269

.. code-block:: Python

    distribution = ot.Mixture(distributions, weights)
    print(distribution)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Mixture((w = 0.285714, d = Poisson(lambda = 1.2)), (w = 0.714286, d = Geometric(p = 0.7)))

.. GENERATED FROM PYTHON SOURCE LINES 270-271

We can draw the probability distribution function.

.. GENERATED FROM PYTHON SOURCE LINES 271-277

.. code-block:: Python

    graph = distribution.drawPDF()
    graph.setTitle("Mixture of Poisson, Geometric")
    graph.setXTitle("x")
    graph.setLegendPosition("")
    view = otv.View(graph)

.. image-sg:: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_010.png
   :alt: Mixture of Poisson, Geometric
   :srcset: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_010.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 278-279

We can draw the cumulative distribution function.

.. GENERATED FROM PYTHON SOURCE LINES 279-286

.. code-block:: Python

    graph = distribution.drawCDF()
    graph.setTitle("Mixture of Poisson, Geometric")
    graph.setXTitle("x")
    graph.setLegendPosition("")
    view = otv.View(graph)

.. image-sg:: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_011.png
   :alt: Mixture of Poisson, Geometric
   :srcset: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_011.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 287-296

Case 3: Mixture of discrete and continuous distributions
---------------------------------------------------------

In this case, we build the mixture of the following distributions:

- a :class:`~openturns.Normal`,
- a :class:`~openturns.Poisson`.

The resulting distribution is neither continuous nor discrete, so it is not
possible to draw its PDF.

.. GENERATED FROM PYTHON SOURCE LINES 298-299

We define the collection of distributions and the associated weights.

.. GENERATED FROM PYTHON SOURCE LINES 299-302

.. code-block:: Python

    distributions = [ot.Normal(), ot.Poisson(0.7)]
    weights = [0.4, 1.0]

.. GENERATED FROM PYTHON SOURCE LINES 303-304

We create the mixture.

.. GENERATED FROM PYTHON SOURCE LINES 304-307

.. code-block:: Python

    distribution = ot.Mixture(distributions, weights)
    print(distribution)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Mixture((w = 0.285714, d = Normal(mu = 0, sigma = 1)), (w = 0.714286, d = Poisson(lambda = 0.7)))

.. GENERATED FROM PYTHON SOURCE LINES 308-310

We cannot draw the probability density function as it is not defined,
but we can draw the cumulative distribution function.

.. GENERATED FROM PYTHON SOURCE LINES 310-316

.. code-block:: Python

    graph = distribution.drawCDF()
    graph.setTitle("Mixture of Normal, Poisson")
    graph.setXTitle("x")
    graph.setLegendPosition("")
    view = otv.View(graph)

.. image-sg:: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_012.png
   :alt: Mixture of Normal, Poisson
   :srcset: /auto_probabilistic_modeling/distributions/images/sphx_glr_plot_mixture_distribution_012.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 317-318

Reset the ResourceMap.

.. GENERATED FROM PYTHON SOURCE LINES 318-320

.. code-block:: Python

    ot.ResourceMap.Reload()

.. GENERATED FROM PYTHON SOURCE LINES 321-322

Show all the graphs.

.. GENERATED FROM PYTHON SOURCE LINES 322-323

.. code-block:: Python

    view.ShowAll()


.. _sphx_glr_download_auto_probabilistic_modeling_distributions_plot_mixture_distribution.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_mixture_distribution.ipynb <plot_mixture_distribution.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_mixture_distribution.py <plot_mixture_distribution.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_mixture_distribution.zip <plot_mixture_distribution.zip>`