.. rst-class:: sphx-glr-example-title
.. _sphx_glr_auto_data_analysis_statistical_hypothesis_testing_plot_fitted_distribution_ranking.py:
Select fitted distributions
===========================
This example shows how to choose between several distributions fitted to a sample.
Several methods can be used:
- the ranking by the Kolmogorov p-values (for continuous distributions),
- the ranking by the ChiSquared p-values (for discrete distributions),
- the ranking by BIC, AIC or AICc values.
.. code-block:: default
from __future__ import print_function
import openturns as ot
import openturns.viewer as viewer
from matplotlib import pylab as plt
ot.Log.Show(ot.Log.NONE)
Create a sample from a continuous distribution
.. code-block:: default
distribution = ot.Beta(2.0, 2.0, 0.0, 1.)
sample = distribution.getSample(1000)
graph = ot.UserDefined(sample).drawCDF()
view = viewer.View(graph)
.. image:: /auto_data_analysis/statistical_hypothesis_testing/images/sphx_glr_plot_fitted_distribution_ranking_001.png
:alt: X0 CDF
:class: sphx-glr-single-img
**1. Specify the model only**
Create the list of distribution estimators
.. code-block:: default
factories = [ot.BetaFactory(), ot.TriangularFactory()]
Rank the continuous models by the Kolmogorov p-values:
.. code-block:: default
estimated_distribution, test_result = ot.FittingTest.BestModelKolmogorov(sample, factories)
test_result
.. raw:: html
class=TestResult name=Unnamed type=Lilliefors Beta binaryQualityMeasure=false p-value threshold=0.5 p-value=0.006 statistic=0.0327766 description=[Beta(alpha = 1.72649, beta = 1.66568, a = 0.00526109, b = 0.970313) vs sample Beta]
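The ``statistic`` field reported above is the Kolmogorov distance: the largest gap between the empirical CDF of the sample and the CDF of the candidate distribution. A minimal pure-Python sketch of that distance for a Beta(2, 2) candidate follows. The helpers ``beta22_cdf`` and ``kolmogorov_statistic`` are illustrative names, not part of OpenTURNS, and the p-value computation (reported as Lilliefors here, since the parameters are estimated from the sample) is not reproduced.

```python
import random


def beta22_cdf(x):
    """CDF of Beta(2, 2) on [0, 1]: F(x) = 3x^2 - 2x^3."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return 3.0 * x**2 - 2.0 * x**3


def kolmogorov_statistic(sample, cdf):
    """Largest distance between the empirical CDF of sample and cdf."""
    xs = sorted(sample)
    n = len(xs)
    distance = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i + 1)/n at x,
        # so both sides of the jump must be compared with cdf(x).
        distance = max(distance, (i + 1) / n - f, f - i / n)
    return distance


rng = random.Random(0)  # fixed seed, for reproducibility only
draws = [rng.betavariate(2.0, 2.0) for _ in range(1000)]
print(kolmogorov_statistic(draws, beta22_cdf))
```

A small distance means the candidate CDF follows the empirical CDF closely; the ranking then compares the p-values derived from this distance.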
Rank the continuous models with respect to the BIC criterion (no test result):
.. code-block:: default
ot.FittingTest.BestModelBIC(sample, factories)
.. rst-class:: sphx-glr-script-out
Out:
.. code-block:: none
[class=Beta name=Beta dimension=1 alpha=1.72649 beta=1.66568 a=0.00526109 b=0.970313, -0.19254944819710879]
Rank the continuous models with respect to the AIC criterion (no test result):
.. code-block:: default
ot.FittingTest.BestModelAIC(sample, factories)
.. rst-class:: sphx-glr-script-out
Out:
.. code-block:: none
[class=Beta name=Beta dimension=1 alpha=1.72649 beta=1.66568 a=0.00526109 b=0.970313, -0.21218046931303733]
Rank the continuous models with respect to the AICc criterion (no test result):
.. code-block:: default
ot.FittingTest.BestModelAICC(sample, factories)
.. rst-class:: sphx-glr-script-out
Out:
.. code-block:: none
[class=Beta name=Beta dimension=1 alpha=1.72649 beta=1.66568 a=0.00526109 b=0.970313, -0.2121402683080122]
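The three values printed above differ only in how they penalize the number of estimated parameters. The sketch below checks this with plain arithmetic. It assumes that the criteria are normalized by the sample size n, that is AIC = (-2 log L + 2k) / n, BIC = (-2 log L + k log n) / n and AICc = AIC + 2k(k + 1) / (n (n - k - 1)), with k = 4 estimated parameters for the Beta model. These formulas are an assumption inferred from the printed numbers, not quoted from the library.

```python
import math

n = 1000  # sample size used in this example
k = 4     # assumption: Beta has four estimated parameters (alpha, beta, a, b)

# Values copied from the outputs above
bic = -0.19254944819710879
aic = -0.21218046931303733
aicc = -0.2121402683080122

# Remove the AIC penalty to recover -2 log L / n
neg2_loglik_per_point = aic - 2.0 * k / n

# Rebuild the two other criteria from that common term
bic_rebuilt = neg2_loglik_per_point + k * math.log(n) / n
aicc_rebuilt = aic + 2.0 * k * (k + 1) / (n - k - 1) / n

print(abs(bic_rebuilt - bic), abs(aicc_rebuilt - aicc))
```

Both differences come out at floating-point noise level, so under these assumptions all three criteria rank models by the same likelihood term and differ only in the penalty.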
**2. Specify the model and its parameters**
Create a collection of the distributions to be tested
.. code-block:: default
distributions = [ot.Beta(2.0, 2.0, 0.0, 1.0), ot.Triangular(0.0, 0.5, 1.0)]
Rank the continuous models by the Kolmogorov p-values:
.. code-block:: default
estimated_distribution, test_result = ot.FittingTest.BestModelKolmogorov(sample, distributions)
test_result
.. raw:: html
class=TestResult name=Unnamed type=Kolmogorov Beta binaryQualityMeasure=true p-value threshold=0.05 p-value=0.127302 statistic=0.0369407 description=[Beta(alpha = 2, beta = 2, a = 0, b = 1) vs sample Beta]
Rank the continuous models with respect to the BIC criterion:
.. code-block:: default
ot.FittingTest.BestModelBIC(sample, distributions)
.. rst-class:: sphx-glr-script-out
Out:
.. code-block:: none
[class=Beta name=Beta dimension=1 alpha=2 beta=2 a=0 b=1, -0.21804827501286062]
Rank the continuous models with respect to the AIC criterion:
.. code-block:: default
ot.FittingTest.BestModelAIC(sample, distributions)
.. rst-class:: sphx-glr-script-out
Out:
.. code-block:: none
[class=Beta name=Beta dimension=1 alpha=2 beta=2 a=0 b=1, -0.21804827501286062]
Rank the continuous models with respect to the AICc criterion:
.. code-block:: default
ot.FittingTest.BestModelAICC(sample, distributions)
.. rst-class:: sphx-glr-script-out
Out:
.. code-block:: none
[class=Beta name=Beta dimension=1 alpha=2 beta=2 a=0 b=1, -0.21804827501286062]
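The three criteria return the same value here. This is consistent with no parameter being counted as estimated (k = 0) when the distributions are fully specified: every penalty term vanishes and each criterion reduces to -2 log L / n. A minimal sketch follows, assuming criteria normalized by the sample size and using a freshly drawn sample rather than the one above, so the number differs from the printed one. ``criteria`` is a hypothetical helper, not an OpenTURNS function.

```python
import math
import random


def criteria(log_likelihood, n, k):
    """Return (BIC, AIC, AICc), each divided by n (assumed convention)."""
    aic = (-2.0 * log_likelihood + 2.0 * k) / n
    bic = (-2.0 * log_likelihood + k * math.log(n)) / n
    aicc = aic + 2.0 * k * (k + 1) / (n - k - 1) / n
    return bic, aic, aicc


rng = random.Random(0)  # fixed seed, for reproducibility only
draws = [rng.betavariate(2.0, 2.0) for _ in range(1000)]

# Log-likelihood of Beta(2, 2), whose density is 6 x (1 - x) on [0, 1]
log_likelihood = sum(math.log(6.0 * x * (1.0 - x)) for x in draws)

bic, aic, aicc = criteria(log_likelihood, len(draws), k=0)
print(bic, aic, aicc)  # identical, since every penalty is zero when k = 0
```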
**Discrete distributions**
Create a sample from a discrete distribution
.. code-block:: default
distribution = ot.Poisson(2.0)
sample = distribution.getSample(1000)
graph = ot.UserDefined(sample).drawCDF()
view = viewer.View(graph)
.. image:: /auto_data_analysis/statistical_hypothesis_testing/images/sphx_glr_plot_fitted_distribution_ranking_002.png
:alt: X0 CDF
:class: sphx-glr-single-img
Create the collection of distributions to be tested
.. code-block:: default
distributions = [ot.Poisson(2.0), ot.Geometric(0.1)]
Rank the discrete models with respect to the ChiSquared p-values:
.. code-block:: default
estimated_distribution, test_result = ot.FittingTest.BestModelChiSquared(sample, distributions)
test_result
.. raw:: html
class=TestResult name=Unnamed type=ChiSquared Poisson binaryQualityMeasure=true p-value threshold=0.05 p-value=0.184085 statistic=8.81784 description=[Poisson(lambda = 2) vs sample Poisson]
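The ``statistic`` field above is a chi-squared distance between observed and expected counts. A minimal sketch for a Poisson(2) candidate follows, assuming one bin per value plus a single tail bin. The binning that OpenTURNS actually applies may differ, and the counts below are hypothetical, not taken from the sample above.

```python
import math


def poisson_pmf(value, lam):
    """Probability that a Poisson(lam) variable equals value."""
    return math.exp(-lam) * lam**value / math.factorial(value)


def chi_squared_statistic(counts, lam):
    """Sum of (observed - expected)^2 / expected over the bins.

    counts[i] is the number of observations equal to i, except the last
    entry, which gathers every observation >= len(counts) - 1.
    """
    n = sum(counts)
    last = len(counts) - 1
    probabilities = [poisson_pmf(i, lam) for i in range(last)]
    probabilities.append(1.0 - sum(probabilities))  # tail bin
    return sum(
        (observed - n * p) ** 2 / (n * p)
        for observed, p in zip(counts, probabilities)
    )


# Hypothetical counts for 1000 draws: values 0 to 5, then a ">= 6" bin
counts = [130, 275, 270, 180, 92, 36, 17]
print(chi_squared_statistic(counts, 2.0))
```

The closer the observed counts are to the expected ones, the smaller the statistic and the larger the resulting p-value.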
Rank the discrete models with respect to the BIC criterion:
.. code-block:: default
ot.FittingTest.BestModelBIC(sample, distributions)
plt.show()