.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_data_analysis/statistical_tests/plot_test_independence.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_auto_data_analysis_statistical_tests_plot_test_independence.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_data_analysis_statistical_tests_plot_test_independence.py:


Test independence
=================

.. GENERATED FROM PYTHON SOURCE LINES 6-12

.. code-block:: default

    import openturns as ot
    import openturns.viewer as viewer
    from matplotlib import pylab as plt
    ot.Log.Show(ot.Log.NONE)


.. GENERATED FROM PYTHON SOURCE LINES 13-44

Sample independence test
------------------------

In this section we perform tests to assess whether two 1-d samples are independent or not.

The following tests are available:

- the ChiSquared test: it tests whether two scalar samples (discrete ones only) are independent.
  If :math:`n_{ij}` is the number of values of the sample :math:`i=(1,2)` in the modality :math:`1 \leq j \leq m`, :math:`\displaystyle n_{i.} = \sum_{j=1}^m n_{ij}`, :math:`\displaystyle n_{.j} = \sum_{i=1}^2 n_{ij}` and :math:`\displaystyle n = \sum_{i=1}^2 \sum_{j=1}^m n_{ij}`, the ChiSquared test evaluates the decision variable:

.. math::
   D^2 = \sum_{i=1}^2 \sum_{j=1}^m \frac{( n_{ij} - \frac{n_{i.} n_{.j}}{n} )^2}{\frac{n_{i.} n_{.j}}{n}}

which tends towards the :math:`\chi^2(m-1)` distribution under the hypothesis of independence. This hypothesis is rejected if :math:`D^2` is too large, that is, if the p-value falls below the chosen threshold.
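
As a rough illustration of the formula above, the decision variable can be computed by hand from a table of counts. The sketch below uses plain Python and made-up counts; the helper ``chi2_statistic`` and the two tables are illustrative names, not part of OpenTURNS, and no p-value is computed.

.. code-block:: python

    def chi2_statistic(counts):
        """Return D^2 for counts[i][j] (rows: samples, columns: modalities)."""
        n = sum(sum(row) for row in counts)               # total count n
        row_totals = [sum(row) for row in counts]         # n_{i.}
        col_totals = [sum(col) for col in zip(*counts)]   # n_{.j}
        d2 = 0.0
        for i, row in enumerate(counts):
            for j, n_ij in enumerate(row):
                # Count expected in cell (i, j) if the samples were independent.
                expected = row_totals[i] * col_totals[j] / n
                d2 += (n_ij - expected) ** 2 / expected
        return d2

    # Hypothetical counts: 2 samples, m = 3 modalities.
    proportional = [[10, 20, 30], [20, 40, 60]]
    skewed = [[30, 20, 10], [10, 20, 30]]
    print(chi2_statistic(proportional))
    print(chi2_statistic(skewed))

In the proportional table every count equals its expected value, so the statistic is 0; the skewed table gives 20.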

- the Pearson test: it tests whether there exists a linear relation between two scalar samples which form a Gaussian vector (which is equivalent to having a linear correlation coefficient not equal to zero).
  If both samples are :math:`\underline{x} = (x_i)_{1 \leq i \leq n}` and :math:`\underline{y} = (y_i)_{1 \leq i \leq n}`, and :math:`\bar{x} = \displaystyle \frac{1}{n}\sum_{i=1}^n x_i` and :math:`\bar{y} = \displaystyle \frac{1}{n}\sum_{i=1}^n y_i`, the Pearson test evaluates the decision variable:

.. math::
   D = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^n (x_i - \bar{x})^2\sum_{i=1}^n (y_i - \bar{y})^2}}

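Under the hypothesis of normality of both samples and of a zero linear correlation coefficient, the statistic :math:`D \sqrt{n-2} / \sqrt{1-D^2}` follows the Student distribution with :math:`n-2` degrees of freedom. The hypothesis of a linear correlation coefficient equal to 0 (which, for a Gaussian vector, is equivalent to the independence of the samples) is rejected if :math:`|D|` is too large (depending on the p-value threshold).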
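
The decision variable :math:`D` is the empirical linear correlation coefficient, and it can be evaluated directly from its definition. The following plain-Python sketch uses small made-up samples; the helper ``pearson_coefficient`` is an illustrative name and the code does not reproduce the p-value computation of the library.

.. code-block:: python

    import math

    def pearson_coefficient(x, y):
        """Return D, the empirical linear correlation of two equal-size samples."""
        n = len(x)
        x_bar = sum(x) / n
        y_bar = sum(y) / n
        num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
        den = math.sqrt(sum((xi - x_bar) ** 2 for xi in x)
                        * sum((yi - y_bar) ** 2 for yi in y))
        return num / den

    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y_linear = [2.0 * xi + 1.0 for xi in x]   # exact increasing linear relation
    y_mixed = [2.0, 1.0, 4.0, 3.0, 5.0]       # increasing trend with two swaps
    print(pearson_coefficient(x, y_linear))
    print(pearson_coefficient(x, y_mixed))

The exact linear relation gives a coefficient of 1, while the perturbed sample gives 0.8.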

- the Spearman test: it tests whether there exists a monotonic relation between two scalar samples.
  If both samples are :math:`\underline{x} = (x_i)_{1 \leq i \leq n}` and :math:`\underline{y}= (y_i)_{1 \leq i \leq n}`, the Spearman test evaluates the decision variable:

.. math::
     D = 1-\frac{6\sum_{i=1}^n (r_i - s_i)^2}{n(n^2-1)}

where :math:`r_i = \mathrm{rank}(x_i)` and :math:`s_i = \mathrm{rank}(y_i)`. Under the hypothesis of independence, :math:`\sqrt{n-1}D` tends towards the standard normal distribution.
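
The rank-based formula above can be checked on a small example. This is a minimal plain-Python sketch that assumes there are no ties in the samples; ``ranks`` and ``spearman_coefficient`` are illustrative helpers, not OpenTURNS functions.

.. code-block:: python

    import math

    def ranks(values):
        """Rank from 1 (smallest) to n (largest); assumes no ties."""
        order = sorted(range(len(values)), key=lambda i: values[i])
        result = [0] * len(values)
        for rank, index in enumerate(order, start=1):
            result[index] = rank
        return result

    def spearman_coefficient(x, y):
        """Return D = 1 - 6 * sum((r_i - s_i)^2) / (n * (n^2 - 1))."""
        n = len(x)
        r, s = ranks(x), ranks(y)
        squared_diffs = sum((ri - si) ** 2 for ri, si in zip(r, s))
        return 1.0 - 6.0 * squared_diffs / (n * (n ** 2 - 1))

    x = [0.1, 0.4, 0.2, 0.9, 0.7]
    y_monotonic = [xi ** 3 for xi in x]   # increasing but non-linear transform
    y_reversed = [-xi for xi in x]        # decreasing transform
    d = spearman_coefficient(x, y_monotonic)
    print(d, math.sqrt(len(x) - 1) * d)   # D and the normalized statistic
    print(spearman_coefficient(x, y_reversed))

Because only ranks matter, the non-linear increasing transform still gives :math:`D = 1`, and the decreasing transform gives :math:`D = -1`.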


.. GENERATED FROM PYTHON SOURCE LINES 46-50

The continuous case
^^^^^^^^^^^^^^^^^^^

We create two different continuous samples:

.. GENERATED FROM PYTHON SOURCE LINES 50-53

.. code-block:: default

    sample1 = ot.Normal().getSample(100)
    sample2 = ot.Normal().getSample(100)


.. GENERATED FROM PYTHON SOURCE LINES 54-55

We first use the Pearson test and store the result:

.. GENERATED FROM PYTHON SOURCE LINES 55-57

.. code-block:: default

    resultPearson = ot.HypothesisTest.Pearson(sample1, sample2, 0.10)


.. GENERATED FROM PYTHON SOURCE LINES 58-61

We can then display the result of the test as a yes/no answer with
the `getBinaryQualityMeasure` method. We can retrieve the p-value and the threshold
with the `getPValue` and `getThreshold` methods.

.. GENERATED FROM PYTHON SOURCE LINES 61-66

.. code-block:: default

    print('Samples are independent?', resultPearson.getBinaryQualityMeasure(),
          'p-value=%.6g' % resultPearson.getPValue(),
          'threshold=%.6g' % resultPearson.getThreshold())


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    Samples are independent? False p-value=0.0451584 threshold=0.1
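
The yes/no answer appears to follow from a comparison between the p-value and the threshold: the tested hypothesis is kept when the p-value exceeds the threshold. The sketch below states this rule in plain Python. It is an assumption inferred from the outputs shown in this example, not a description of the library internals, and ``null_hypothesis_accepted`` is a hypothetical helper.

.. code-block:: python

    def null_hypothesis_accepted(p_value, threshold):
        """Assumed decision rule: keep the hypothesis if p_value > threshold."""
        return p_value > threshold

    # p-value printed above, with the threshold 0.10 used in this example.
    print(null_hypothesis_accepted(0.0451584, 0.10))
    # Same p-value with a hypothetical, stricter threshold of 0.01.
    print(null_hypothesis_accepted(0.0451584, 0.01))

The same p-value leads to a rejection at the 0.10 threshold and to an acceptance at 0.01, which shows how much the conclusion depends on the chosen threshold.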


.. GENERATED FROM PYTHON SOURCE LINES 67-68

We can also use the Spearman test:

.. GENERATED FROM PYTHON SOURCE LINES 68-74

.. code-block:: default

    resultSpearman = ot.HypothesisTest.Spearman(sample1, sample2, 0.10)
    print('Samples are independent?', resultSpearman.getBinaryQualityMeasure(),
          'p-value=%.6g' % resultSpearman.getPValue(),
          'threshold=%.6g' % resultSpearman.getThreshold())


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    Samples are independent? False p-value=0.0603411 threshold=0.1


.. GENERATED FROM PYTHON SOURCE LINES 75-79

The discrete case
^^^^^^^^^^^^^^^^^

Testing is also possible for discrete distributions. Let us create two different discrete samples:

.. GENERATED FROM PYTHON SOURCE LINES 79-82

.. code-block:: default

    sample1 = ot.Poisson(0.2).getSample(100)
    sample2 = ot.Poisson(0.2).getSample(100)


.. GENERATED FROM PYTHON SOURCE LINES 83-84

We use the Chi2 test to check independence and store the result:

.. GENERATED FROM PYTHON SOURCE LINES 84-86

.. code-block:: default

    resultChi2 = ot.HypothesisTest.ChiSquared(sample1, sample2, 0.10)


.. GENERATED FROM PYTHON SOURCE LINES 87-88

We then display the results:

.. GENERATED FROM PYTHON SOURCE LINES 88-93

.. code-block:: default

    print('Samples are independent?', resultChi2.getBinaryQualityMeasure(),
          'p-value=%.6g' % resultChi2.getPValue(),
          'threshold=%.6g' % resultChi2.getThreshold())


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    Samples are independent? True p-value=0.20552 threshold=0.1


.. GENERATED FROM PYTHON SOURCE LINES 94-100

Test samples independence using regression
------------------------------------------

Independence testing with regression is also an option in OpenTURNS.
It consists in detecting a linear relation between two scalar samples.


.. GENERATED FROM PYTHON SOURCE LINES 102-103

We generate a sample of dimension 3 with component 0 correlated to component 2:

.. GENERATED FROM PYTHON SOURCE LINES 103-110

.. code-block:: default

    marginals = [ot.Normal()] * 3
    S = ot.CorrelationMatrix(3)
    S[0, 2] = 0.9
    copula = ot.NormalCopula(S)
    distribution = ot.ComposedDistribution(marginals, copula)
    sample = distribution.getSample(30)
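
To see what a correlation of 0.9 between two standard normal components means, a correlated pair can be built by hand from two independent standard normal draws. This is a plain-Python sketch of the idea only, not the algorithm used by the library; the seed, the sample size and the helper names are arbitrary choices.

.. code-block:: python

    import math
    import random

    def correlated_normal_pairs(rho, size, seed=0):
        """Draw `size` pairs of standard normal values with correlation `rho`."""
        rng = random.Random(seed)
        pairs = []
        for _ in range(size):
            z0 = rng.gauss(0.0, 1.0)
            z1 = rng.gauss(0.0, 1.0)
            # Mixing z0 with an independent z1 keeps a unit variance
            # and gives a covariance equal to rho.
            pairs.append((z0, rho * z0 + math.sqrt(1.0 - rho ** 2) * z1))
        return pairs

    def empirical_correlation(pairs):
        """Empirical linear correlation coefficient of a list of pairs."""
        n = len(pairs)
        mean_a = sum(a for a, _ in pairs) / n
        mean_b = sum(b for _, b in pairs) / n
        cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
        var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
        var_b = sum((b - mean_b) ** 2 for _, b in pairs)
        return cov / math.sqrt(var_a * var_b)

    pairs = correlated_normal_pairs(0.9, 10000)
    print(empirical_correlation(pairs))

With a large sample the empirical coefficient should be close to 0.9; with only 30 points, as in the example above, it fluctuates more from one draw to the next.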


.. GENERATED FROM PYTHON SOURCE LINES 111-112

Next, we split it into two samples: firstSample of dimension 2 and secondSample of dimension 1.

.. GENERATED FROM PYTHON SOURCE LINES 112-115

.. code-block:: default

    firstSample = sample[:, :2]
    secondSample = sample[:, 2]


.. GENERATED FROM PYTHON SOURCE LINES 116-117

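We test the independence of each component of firstSample against secondSample.
The test returns one result per coefficient of the linear regression. Since three results
are printed for a firstSample of dimension 2, the first one presumably corresponds to the intercept: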

.. GENERATED FROM PYTHON SOURCE LINES 117-122

.. code-block:: default

    test_results = ot.LinearModelTest.FullRegression(firstSample, secondSample)
    for i in range(len(test_results)):
        print('Component', i, 'is independent?', test_results[i].getBinaryQualityMeasure(),
              'p-value=%.6g' % test_results[i].getPValue(),
              'threshold=%.6g' % test_results[i].getThreshold())


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    Component 0 is independent? True p-value=0.646138 threshold=0.05
    Component 1 is independent? False p-value=1.30057e-10 threshold=0.05
    Component 2 is independent? True p-value=0.342379 threshold=0.05


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  0.006 seconds)


.. _sphx_glr_download_auto_data_analysis_statistical_tests_plot_test_independence.py:


.. only :: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example


  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: plot_test_independence.py <plot_test_independence.py>`


  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: plot_test_independence.ipynb <plot_test_independence.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_