Kolmogorov-Smirnov: get the statistics distribution
===================================================

In this example, we draw the distribution of the Kolmogorov-Smirnov statistic for a sample of size 10. We want to test the hypothesis that this sample follows the `Uniform(0, 1)` distribution. The K.S. distribution is first plotted in the case where the parameters of the uniform distribution are known. Then we plot it in the case where the parameters of the uniform distribution are estimated from the sample.

*Reference*: Hovhannes Keutelian, "The Kolmogorov-Smirnov test when parameters are estimated from data", 30 April 1991, Fermilab

Note: There is a sign error in the paper; the equation: `D[i]=max(abs(S+step),D[i])` must be replaced with `D[i]=max(abs(S-step),D[i])`.

.. code-block:: default

   import openturns as ot
   import openturns.viewer as viewer
   from matplotlib import pylab as plt
   ot.Log.Show(ot.Log.NONE)

.. code-block:: default

   x = [0.9374, 0.7629, 0.4771, 0.5111, 0.8701, 0.0684, 0.7375, 0.5615, 0.2835, 0.2508]
   sample = ot.Sample([[xi] for xi in x])

.. code-block:: default

   samplesize = sample.getSize()
   samplesize

.. rst-class:: sphx-glr-script-out

Out:

.. code-block:: none

   10

Plot the empirical distribution function.

.. code-block:: default

   graph = ot.UserDefined(sample).drawCDF()
   graph.setLegends(["Sample"])
   curve = ot.Curve([0, 1], [0, 1])
   curve.setLegend("Uniform")
   graph.add(curve)
   graph.setXTitle("X")
   graph.setTitle("Cumulated distribution function")
   view = viewer.View(graph)

.. image-sg:: /auto_data_analysis/statistical_tests/images/sphx_glr_plot_kolmogorov_distribution_001.png
   :alt: Cumulated distribution function
   :srcset: /auto_data_analysis/statistical_tests/images/sphx_glr_plot_kolmogorov_distribution_001.png
   :class: sphx-glr-single-img

The `computeKSStatistics` function computes the Kolmogorov-Smirnov distance between the sample and the distribution. It is for teaching purposes only: use `FittingTest` for real applications.

.. code-block:: default

   def computeKSStatistics(sample, distribution):
       # Sort the sample so that sample[i] is the (i+1)-th order statistic.
       sample = sample.sort()
       n = sample.getSize()
       D = 0.0
       for i in range(n):
           F = distribution.computeCDF(sample[i])
           # Gap to the empirical CDF just before its jump at sample[i].
           Fminus = F - i / n
           # Gap to the empirical CDF just after its jump at sample[i].
           Fplus = (i + 1) / n - F
           D = max(Fminus, Fplus, D)
       return D
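
For a sorted sample :math:`x_{(1)} \leq \dots \leq x_{(n)}` and a candidate CDF :math:`F`, the loop above evaluates

.. math::

   D_n = \max_{1 \leq i \leq n} \max \left( F(x_{(i)}) - \frac{i - 1}{n}, \; \frac{i}{n} - F(x_{(i)}) \right).

The same computation can be sketched in plain Python as a cross-check on the example data. The names `ks_distance` and `uniform01_cdf` are illustrative and not part of OpenTURNS, and the Uniform(0, 1) CDF is written out by hand here:

.. code-block:: default

   def ks_distance(values, cdf):
       """Return the Kolmogorov-Smirnov distance between a sample and a CDF."""
       ordered = sorted(values)
       n = len(ordered)
       distance = 0.0
       for i, value in enumerate(ordered):
           f = cdf(value)
           # Gap just before the jump of the empirical CDF at this point.
           f_minus = f - i / n
           # Gap just after the jump of the empirical CDF at this point.
           f_plus = (i + 1) / n - f
           distance = max(distance, f_minus, f_plus)
       return distance


   def uniform01_cdf(x):
       """CDF of Uniform(0, 1), hand-written for this sketch."""
       return min(max(x, 0.0), 1.0)


   x = [0.9374, 0.7629, 0.4771, 0.5111, 0.8701, 0.0684, 0.7375, 0.5615, 0.2835, 0.2508]
   d = ks_distance(x, uniform01_cdf)
   print(round(d, 4))

For this sample the largest gap is reached at the fourth ordered value, 0.4771, where :math:`F(x_{(4)}) - 3/10 = 0.1771`.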

.. code-block:: default

   dist = ot.Uniform(0, 1)
   dist

.. raw:: html

   Uniform(a = 0, b = 1)
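
The two cases described in the introduction, known parameters and parameters estimated from the sample, can be sketched by simulation: draw many samples of size 10 under the hypothesis and record the distance for each. The following is a minimal sketch in plain Python, not the OpenTURNS implementation. It assumes that the bounds are re-estimated from each simulated sample by its minimum and maximum, which is an illustrative choice and may differ from the estimator a library would use. The helper names `ks_distance_uniform` and `simulate` are hypothetical.

.. code-block:: default

   import random


   def ks_distance_uniform(values, a, b):
       """KS distance between a sample and the Uniform(a, b) CDF."""
       ordered = sorted(values)
       n = len(ordered)
       distance = 0.0
       for i, value in enumerate(ordered):
           # Uniform(a, b) CDF, clipped to [0, 1].
           f = min(max((value - a) / (b - a), 0.0), 1.0)
           distance = max(distance, f - i / n, (i + 1) / n - f)
       return distance


   def simulate(n, repetitions, estimate, seed=0):
       """Simulate KS distances for samples drawn from Uniform(0, 1)."""
       rng = random.Random(seed)
       distances = []
       for _ in range(repetitions):
           values = [rng.random() for _ in range(n)]
           if estimate:
               # Assumption: bounds estimated by the sample extremes.
               a, b = min(values), max(values)
           else:
               # Known parameters of the hypothesised distribution.
               a, b = 0.0, 1.0
           distances.append(ks_distance_uniform(values, a, b))
       return distances


   known = simulate(10, 2000, estimate=False)
   estimated = simulate(10, 2000, estimate=True)
   print(sum(known) / len(known), sum(estimated) / len(estimated))

The empirical distributions of `known` and `estimated` are the two curves to compare. With this particular estimator the smallest ordered value always has CDF value 0, so every simulated distance in the estimated case is at least 1/n.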
