.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_data_analysis/distribution_fitting/plot_advanced_mle_estimator.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_data_analysis_distribution_fitting_plot_advanced_mle_estimator.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_data_analysis_distribution_fitting_plot_advanced_mle_estimator.py:


Fitting a distribution with customized maximum likelihood
=========================================================

.. GENERATED FROM PYTHON SOURCE LINES 7-26

Introduction
------------

When we perform distribution fitting using the :class:`~openturns.MaximumLikelihoodFactory`
class, the default optimization solver sometimes fails to
maximize the likelihood.
This might be because the optimization solver is not appropriate in
this particular case, or because the bounds of the problem
are not properly set.
In this example, we configure the optimization solver by two different
methods.

- The first method sets a specific solver among the :class:`~openturns.NLopt` solvers.
  Moreover, we customize the distribution parameters bounds.
  This helps the solver to avoid the part of the domain where
  the distribution cannot be built.
- The second method sets the :class:`~openturns.ResourceMap` key
  that is used by the :class:`~openturns.MaximumLikelihoodFactory` in
  order to select the default solver.

.. GENERATED FROM PYTHON SOURCE LINES 26-29

.. code-block:: Python


    import openturns as ot


.. GENERATED FROM PYTHON SOURCE LINES 30-32

Simulating a sample
-------------------

.. GENERATED FROM PYTHON SOURCE LINES 34-36

In this example, we consider the :class:`~openturns.TruncatedNormal` distribution
with default parameters and generate a large sample from it.

.. GENERATED FROM PYTHON SOURCE LINES 36-41

.. code-block:: Python

    distribution = ot.TruncatedNormal()
    print("Distribution=", distribution)
    size = 10000
    sample = distribution.getSample(size)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Distribution= TruncatedNormal(mu = 0, sigma = 1, a = -1, b = 1)


.. GENERATED FROM PYTHON SOURCE LINES 42-43

We can use the :class:`~openturns.TruncatedNormalFactory` class.

.. GENERATED FROM PYTHON SOURCE LINES 43-47

.. code-block:: Python

    factory = ot.TruncatedNormalFactory()
    fittedDistribution = factory.build(sample)
    print("Fitted distribution=", fittedDistribution)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Fitted distribution= TruncatedNormal(mu = -0.00486846, sigma = 0.980626, a = -0.999871, b = 0.999978)


.. GENERATED FROM PYTHON SOURCE LINES 48-53

But this method sometimes fails for specific distributions, for example
when the sample size is very small or when the distribution has
specific properties (e.g., some parameter is known beforehand).
In these cases, general-purpose methods cannot be used, and the
methods presented below may be relevant.

.. GENERATED FROM PYTHON SOURCE LINES 55-57

Defining the estimation problem
-------------------------------

.. GENERATED FROM PYTHON SOURCE LINES 59-73

The truncated Normal distribution has 4 parameters:

- the mean and standard deviation parameters of the
  underlying Normal distribution,
- the minimum and maximum of the underlying truncated distribution.

Estimating the parameters will be easier if two conditions are met:

- the starting point (i.e., the initial values of the parameters)
  of the optimization is compatible with the sample,
- the bounds are compatible with the distribution.

First, we define the starting point of the optimization,
using statistics from the sample.

.. GENERATED FROM PYTHON SOURCE LINES 75-82

.. code-block:: Python

    sample_min = sample.getMin()[0]
    sample_max = sample.getMax()[0]
    sample_mean = sample.computeMean()[0]
    sigma_hat = sample.computeStandardDeviation()[0]
    startingPoint = [sample_mean, sigma_hat, sample_min, sample_max]
    print("startingPoint=", startingPoint)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    startingPoint= [-0.00142841023688803, 0.5380783048703683, -0.9997714958942324, 0.9998780942933398]


.. GENERATED FROM PYTHON SOURCE LINES 83-108

Secondly, we set the bounds of the parameters.
We assume that the minimum and maximum cannot be lower than -1
or greater than 1.
The parameters have the following bounds:

- :math:`\mu \in [-\infty, +\infty]` ;
- :math:`\sigma \in [0, +\infty]` ;
- :math:`a \in [-1, 1]` ;
- :math:`b \in [-1, 1]`.

But this may not give satisfactory results, because infinite parameters
are impossible and a zero standard deviation is infeasible.
Furthermore, we must have :math:`a < b`.
We assume that we know that the mean parameter cannot be greater than 10
and lower than -10.
Similarly, the standard deviation cannot be greater than 10.
Hence, we use the following bounds:

- :math:`\mu \in [-10, +10]` ;
- :math:`\sigma \in [\epsilon, +10]` ;
- :math:`a \in [-1 - \epsilon, \bar{x} - \epsilon]` ;
- :math:`b \in [\bar{x} + \epsilon, 1 + \epsilon]`

where :math:`\bar{x}` is the sample mean and :math:`\epsilon`
is a small parameter.

.. GENERATED FROM PYTHON SOURCE LINES 110-116

.. code-block:: Python

    epsilon = 1.0e-3
    bounds_lower = [-10.0, epsilon, -1.0 - epsilon, sample_mean + epsilon]
    bounds_upper = [10.0, 10.0, sample_mean - epsilon, 1.0 + epsilon]
    print("bounds_lower=", bounds_lower)
    print("bounds_upper=", bounds_upper)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    bounds_lower= [-10.0, 0.001, -1.001, -0.00042841023688802997]
    bounds_upper= [10.0, 10.0, -0.00242841023688803, 1.001]


.. GENERATED FROM PYTHON SOURCE LINES 117-118

The boolean `False` means unbounded and `True` means bounded.

.. GENERATED FROM PYTHON SOURCE LINES 120-124

.. code-block:: Python

    finiteLowerBound = [True] * 4
    finiteUpperBound = [True] * 4
    interval = ot.Interval(bounds_lower, bounds_upper, finiteLowerBound, finiteUpperBound)


.. GENERATED FROM PYTHON SOURCE LINES 125-127

Setting the solver and bounds
-----------------------------

.. GENERATED FROM PYTHON SOURCE LINES 129-136

The two methods that we suggest are based on the :class:`~openturns.MaximumLikelihoodFactory` class
which uses the maximum likelihood estimator to estimate
the parameters of the distribution depending on the sample.
The first method is to use the :meth:`~openturns.MaximumLikelihoodFactory.setOptimizationAlgorithm`
method.
This sets the optimization algorithm as a replacement for the
default optimization solver.

.. GENERATED FROM PYTHON SOURCE LINES 136-148

.. code-block:: Python


    def printTruncatedNormalParameters(truncatedNormalDistribution):
        mu, sigma, a, b = truncatedNormalDistribution.getParameter()
        print(" ", truncatedNormalDistribution.getImplementation().getClassName())
        print("  mu= %.3f" % (mu))
        print("  sigma= %.4f" % (sigma))
        print("  a= %.2f" % (a))
        print("  b= %.2f" % (b))
        return None


.. GENERATED FROM PYTHON SOURCE LINES 149-169

.. code-block:: Python

    factory = ot.MaximumLikelihoodFactory(distribution)
    factory.setOptimizationBounds(interval)
    distribution_MLE = factory.build(sample)
    print("Fitted distribution with default solver:")
    printTruncatedNormalParameters(distribution_MLE)
    default_solver = factory.getOptimizationAlgorithm()
    print("Default solver=", default_solver.getImplementation().getClassName())
    solver = ot.NLopt("LN_COBYLA")
    print("New solver=", solver.getClassName(), solver.getAlgorithmName())
    solver.setStartingPoint(startingPoint)
    factory.setOptimizationAlgorithm(solver)
    print(
        "Updated solver=",
        factory.getOptimizationAlgorithm().getImplementation().getClassName(),
    )
    distribution_MLE = factory.build(sample)
    print("Fitted distribution with new solver:")
    printTruncatedNormalParameters(distribution_MLE)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Fitted distribution with default solver:
      TruncatedNormal
      mu= -0.000
      sigma= 0.9999
      a= -1.00
      b= 1.00
    Default solver= TNC
    New solver= NLopt LN_COBYLA
    Updated solver= NLopt
    Fitted distribution with new solver:
      TruncatedNormal
      mu= -0.001
      sigma= 0.9409
      a= -1.00
      b= 1.00


.. GENERATED FROM PYTHON SOURCE LINES 170-172

Using ResourceMap to set the solver
-----------------------------------

.. GENERATED FROM PYTHON SOURCE LINES 174-177

Another method is to use a special key of the :class:`~openturns.ResourceMap`,
which defines the default solver used by :class:`~openturns.MaximumLikelihoodFactory`.
In this case, we do not define the starting point.

.. GENERATED FROM PYTHON SOURCE LINES 179-195

.. code-block:: Python

    ot.ResourceMap.SetAsString(
        "MaximumLikelihoodFactory-DefaultOptimizationAlgorithm", "LN_COBYLA"
    )
    ot.ResourceMap.SetAsUnsignedInteger(
        "MaximumLikelihoodFactory-MaximumCallsNumber", 100000
    )
    factory = ot.MaximumLikelihoodFactory(distribution)
    factory.setOptimizationBounds(interval)
    print(
        "Default solver=",
        factory.getOptimizationAlgorithm().getImplementation().getClassName(),
    )
    distribution_MLE = factory.build(sample)
    print("Fitted distribution:")
    printTruncatedNormalParameters(distribution_MLE)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Default solver= NLopt
    Fitted distribution:
      TruncatedNormal
      mu= -0.005
      sigma= 0.9808
      a= -1.00
      b= 1.00


.. GENERATED FROM PYTHON SOURCE LINES 196-198

Fitting a LogNormal with zero location parameter
------------------------------------------------

.. GENERATED FROM PYTHON SOURCE LINES 200-207

We now consider the :class:`~openturns.LogNormal` distribution
and present two different topics.

- We show how to estimate the parameters of a LogNormal distribution
  which has a zero location parameter.
- We compute the list of solvers which are compatible with a specific
  MLE problem.

.. GENERATED FROM PYTHON SOURCE LINES 209-210

We generate a large sample from the standard LogNormal distribution.

.. GENERATED FROM PYTHON SOURCE LINES 212-215

.. code-block:: Python

    standardLogNormal = ot.LogNormal()
    sample = standardLogNormal.getSample(1000)


.. GENERATED FROM PYTHON SOURCE LINES 216-228

If we estimate the parameters using :class:`~openturns.LogNormalFactory`,
the algorithm does not know that the true value of the location parameter
:math:`\gamma` is zero.
Hence, the estimator may not necessarily lead to an estimated parameter
exactly equal to zero.
In this example, we assume that we know that this parameter is zero.
In this case, the simplest method is to use the
:meth:`~openturns.MaximumLikelihoodFactory.setKnownParameter` method.
Furthermore, we know that the absolute value of the mean of the underlying Normal
distribution cannot be greater than 5.
Finally, we know that the standard deviation of the underlying
Normal distribution cannot be lower than, say, :math:`\epsilon = 10^{-4}`.

.. GENERATED FROM PYTHON SOURCE LINES 230-241

.. code-block:: Python

    logNormalFactoryWithZeroLocation = ot.MaximumLikelihoodFactory(standardLogNormal)
    logNormalFactoryWithZeroLocation.setKnownParameter([2], [0.0])
    lowerBound = [-5.0, 1.0e-4]
    upperBound = [5.0, -1.0]
    finiteLowerBound = [True, True]
    finiteUpperBound = [True, False]
    bounds = ot.Interval(lowerBound, upperBound, finiteLowerBound, finiteUpperBound)
    logNormalFactoryWithZeroLocation.setOptimizationBounds(bounds)
    fittedDistribution = logNormalFactoryWithZeroLocation.build(sample)
    print("   fittedDistribution = ", fittedDistribution)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

       fittedDistribution =  LogNormal(muLog = 0.0165248, sigmaLog = 0.959205, gamma = 0)


.. GENERATED FROM PYTHON SOURCE LINES 242-244

See the list of solvers compatible with a given MLE problem
-----------------------------------------------------------

.. GENERATED FROM PYTHON SOURCE LINES 246-253

In the next script, we perform a loop over the algorithms
from the :class:`~openturns.NLopt` class and select the algorithms
which can solve the problem.
This is done using `try/except` statements.
Moreover, some solvers pretend to solve the problem, but do
not make any change to the parameters.
In this case, we do not select them.

.. GENERATED FROM PYTHON SOURCE LINES 255-277

.. code-block:: Python

    defaultParameter = standardLogNormal.getParameter()
    defaultSolver = logNormalFactoryWithZeroLocation.getOptimizationAlgorithm()
    startingPoint = defaultSolver.getStartingPoint()
    print("Default solver = ", defaultSolver.getImplementation().getClassName())
    availableSolverList = []
    differenceThreshold = 1.0e-8
    for algorithmName in ot.NLopt.GetAlgorithmNames():
        print("- ", algorithmName)
        solver = ot.NLopt(algorithmName)
        solver.setStartingPoint(startingPoint)
        logNormalFactoryWithZeroLocation.setOptimizationAlgorithm(solver)
        try:
            fittedDistribution = logNormalFactoryWithZeroLocation.build(sample)
            print("   fittedDistribution = ", fittedDistribution)
            parameter = fittedDistribution.getParameter()
            differenceOfParameters = parameter - defaultParameter
            if differenceOfParameters.norm() > differenceThreshold:
                # Consider only solvers which make a difference after optimization
                availableSolverList.append(algorithmName)
        except TypeError:
            print("   Fail : Impossible to build with ", algorithmName)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Default solver =  NLopt
    -  AUGLAG
       fittedDistribution =  LogNormal(muLog = 0.0171728, sigmaLog = 0.961145, gamma = 0)
    -  AUGLAG_EQ
       fittedDistribution =  LogNormal(muLog = 0.0171728, sigmaLog = 0.961145, gamma = 0)
    -  GD_MLSL
       Fail : Impossible to build with  GD_MLSL
    -  GD_MLSL_LDS
       Fail : Impossible to build with  GD_MLSL_LDS
    -  GN_CRS2_LM
       Fail : Impossible to build with  GN_CRS2_LM
    -  GN_DIRECT
       Fail : Impossible to build with  GN_DIRECT
    -  GN_DIRECT_L
       Fail : Impossible to build with  GN_DIRECT_L
    -  GN_DIRECT_L_NOSCAL
       Fail : Impossible to build with  GN_DIRECT_L_NOSCAL
    -  GN_DIRECT_L_RAND
       Fail : Impossible to build with  GN_DIRECT_L_RAND
    -  GN_DIRECT_L_RAND_NOSCAL
       Fail : Impossible to build with  GN_DIRECT_L_RAND_NOSCAL
    -  GN_DIRECT_NOSCAL
       Fail : Impossible to build with  GN_DIRECT_NOSCAL
    -  GN_ESCH
       Fail : Impossible to build with  GN_ESCH
    -  GN_ISRES
       Fail : Impossible to build with  GN_ISRES
    -  GN_MLSL
       Fail : Impossible to build with  GN_MLSL
    -  GN_MLSL_LDS
       Fail : Impossible to build with  GN_MLSL_LDS
    -  GN_ORIG_DIRECT
       Fail : Impossible to build with  GN_ORIG_DIRECT
    -  GN_ORIG_DIRECT_L
       Fail : Impossible to build with  GN_ORIG_DIRECT_L
    -  G_MLSL
       Fail : Impossible to build with  G_MLSL
    -  G_MLSL_LDS
       Fail : Impossible to build with  G_MLSL_LDS
    -  LD_AUGLAG
       fittedDistribution =  LogNormal(muLog = 0.0171728, sigmaLog = 0.961145, gamma = 0)
    -  LD_AUGLAG_EQ
       fittedDistribution =  LogNormal(muLog = 0.0171728, sigmaLog = 0.961145, gamma = 0)
    -  LD_CCSAQ
       fittedDistribution =  LogNormal(muLog = 0.0170128, sigmaLog = 0.960857, gamma = 0)
    -  LD_LBFGS
       fittedDistribution =  LogNormal(muLog = 0.0165133, sigmaLog = 0.959186, gamma = 0)
    -  LD_MMA
       fittedDistribution =  LogNormal(muLog = 0.0171728, sigmaLog = 0.961145, gamma = 0)
    -  LD_SLSQP
       fittedDistribution =  LogNormal(muLog = 0.0165261, sigmaLog = 0.959214, gamma = 0)
    -  LD_TNEWTON
       fittedDistribution =  LogNormal(muLog = 0.0165271, sigmaLog = 0.959208, gamma = 0)
    -  LD_TNEWTON_PRECOND
       fittedDistribution =  LogNormal(muLog = 0.016527, sigmaLog = 0.959208, gamma = 0)
    -  LD_TNEWTON_PRECOND_RESTART
       fittedDistribution =  LogNormal(muLog = 0.0165271, sigmaLog = 0.959208, gamma = 0)
    -  LD_TNEWTON_RESTART
       fittedDistribution =  LogNormal(muLog = 0.0165271, sigmaLog = 0.959208, gamma = 0)
    -  LD_VAR1
       fittedDistribution =  LogNormal(muLog = 0.0166006, sigmaLog = 0.959359, gamma = 0)
    -  LD_VAR2
       fittedDistribution =  LogNormal(muLog = 0.0166006, sigmaLog = 0.959359, gamma = 0)
    -  LN_AUGLAG
       fittedDistribution =  LogNormal(muLog = 0.0171728, sigmaLog = 0.961145, gamma = 0)
    -  LN_AUGLAG_EQ
       fittedDistribution =  LogNormal(muLog = 0.0171728, sigmaLog = 0.961145, gamma = 0)
    -  LN_BOBYQA
       fittedDistribution =  LogNormal(muLog = 0.0160768, sigmaLog = 0.958842, gamma = 0)
    -  LN_COBYLA
       fittedDistribution =  LogNormal(muLog = 0.016203, sigmaLog = 0.959755, gamma = 0)
    -  LN_NELDERMEAD
       fittedDistribution =  LogNormal(muLog = 0.0146306, sigmaLog = 0.960879, gamma = 0)
    -  LN_NEWUOA
       Fail : Impossible to build with  LN_NEWUOA
    -  LN_NEWUOA_BOUND
       fittedDistribution =  LogNormal(muLog = 0.0173999, sigmaLog = 0.960394, gamma = 0)
    -  LN_SBPLX
       fittedDistribution =  LogNormal(muLog = 0.0149536, sigmaLog = 0.959034, gamma = 0)


.. GENERATED FROM PYTHON SOURCE LINES 278-282

.. code-block:: Python

    print("Available solvers = ", len(availableSolverList))
    for name in availableSolverList:
        print("- ", name)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Available solvers =  21
    -  AUGLAG
    -  AUGLAG_EQ
    -  LD_AUGLAG
    -  LD_AUGLAG_EQ
    -  LD_CCSAQ
    -  LD_LBFGS
    -  LD_MMA
    -  LD_SLSQP
    -  LD_TNEWTON
    -  LD_TNEWTON_PRECOND
    -  LD_TNEWTON_PRECOND_RESTART
    -  LD_TNEWTON_RESTART
    -  LD_VAR1
    -  LD_VAR2
    -  LN_AUGLAG
    -  LN_AUGLAG_EQ
    -  LN_BOBYQA
    -  LN_COBYLA
    -  LN_NELDERMEAD
    -  LN_NEWUOA_BOUND
    -  LN_SBPLX


.. GENERATED FROM PYTHON SOURCE LINES 283-284

Pick one and try it.

.. GENERATED FROM PYTHON SOURCE LINES 286-293

.. code-block:: Python

    algorithmName = "LD_SLSQP"
    solver = ot.NLopt(algorithmName)
    print("   New solver = ", solver.getClassName(), solver.getAlgorithmName())
    solver.setStartingPoint(startingPoint)
    logNormalFactoryWithZeroLocation.setOptimizationAlgorithm(solver)
    fittedDistribution = logNormalFactoryWithZeroLocation.build(sample)
    print("   fittedDistribution = ", fittedDistribution)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

       New solver =  NLopt LD_SLSQP
       fittedDistribution =  LogNormal(muLog = 0.0165261, sigmaLog = 0.959214, gamma = 0)


.. _sphx_glr_download_auto_data_analysis_distribution_fitting_plot_advanced_mle_estimator.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_advanced_mle_estimator.ipynb <plot_advanced_mle_estimator.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_advanced_mle_estimator.py <plot_advanced_mle_estimator.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_advanced_mle_estimator.zip <plot_advanced_mle_estimator.zip>`