.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_data_analysis/statistical_tests/plot_test_independence.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_data_analysis_statistical_tests_plot_test_independence.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_data_analysis_statistical_tests_plot_test_independence.py:


Test independence
=================

.. GENERATED FROM PYTHON SOURCE LINES 7-9

.. code-block:: Python

    import openturns as ot


.. GENERATED FROM PYTHON SOURCE LINES 10-32

Sample independence test
------------------------

In this example, we perform tests to assess whether two one-dimensional samples, generated
by two random variables :math:`X` and :math:`Y`, are independent.

The following tests are available:

- the ChiSquared test, which is only used for discrete variables. Refer to :ref:`chi2_independence_test` for
  more details.

- the Pearson test: this test checks if there exists a linear
  relationship between :math:`X` and :math:`Y`. It is equivalent to an independence test only
  if the random vector :math:`(X,Y)` is a Gaussian vector. Refer to :ref:`pearson_test` for
  more details.

- the Spearman test: this test checks if there exists a monotonic
  relationship between :math:`X` and :math:`Y`. Refer to :ref:`spearman_test` for
  more details.

- independence test using regression: this test checks if there exists a linear relation between
  :math:`X` and :math:`Y` using a linear model.
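
To illustrate why a zero Pearson correlation is not the same as independence outside the
Gaussian case, here is a minimal sketch in plain Python. It does not call OpenTURNS: the helper
``pearson`` is a hypothetical name introduced for this illustration, and the data are a
deterministic toy example in which :math:`Y = X^2` is fully determined by :math:`X`.

.. code-block:: Python

    import math


    def pearson(x, y):
        """Empirical Pearson correlation coefficient (illustrative helper)."""
        n = len(x)
        mean_x = sum(x) / n
        mean_y = sum(y) / n
        sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        sxx = sum((a - mean_x) ** 2 for a in x)
        syy = sum((b - mean_y) ** 2 for b in y)
        return sxy / math.sqrt(sxx * syy)


    # X is symmetric around 0 and Y = X^2 depends on X, but not linearly.
    x = [float(i) for i in range(-50, 51)]
    y = [v * v for v in x]
    r = pearson(x, y)
    print("Pearson correlation between X and X^2:", r)

The coefficient vanishes although the two variables are clearly dependent, which is why the
Pearson test should be read as an independence test only under the Gaussian assumption.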

.. GENERATED FROM PYTHON SOURCE LINES 34-38

Case 1: Pearson and Spearman tests
----------------------------------

We create a sample generated by a bivariate Gaussian vector :math:`(X,Y)` with independent components.

.. GENERATED FROM PYTHON SOURCE LINES 38-42

.. code-block:: Python

    sample_Biv = ot.Normal(2).getSample(1000)
    sample1 = sample_Biv.getMarginal(0)
    sample2 = sample_Biv.getMarginal(1)


.. GENERATED FROM PYTHON SOURCE LINES 43-48

To test the independence between both samples, we first use the Pearson test with
a Type I error equal to 0.1 (the probability of wrongly rejecting the null hypothesis).
The Pearson test checks if there is a linear correlation between both random variables.
The null hypothesis is: *There is no linear relation*.
As :math:`(X,Y)` is a Gaussian vector, this is equivalent to testing the independence of the components.
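
The decision rule behind such a test can be sketched in plain Python. The block below is an
illustration under stated assumptions, not the OpenTURNS implementation: it uses the textbook
statistic :math:`t = r \sqrt{(n-2)/(1-r^2)}`, approximates its distribution under the null
hypothesis by a standard normal (reasonable for a sample of size 1000), and keeps the null
hypothesis when the p-value exceeds the level. The names ``pearson`` and ``pearson_test`` are
hypothetical.

.. code-block:: Python

    import math
    import random
    from statistics import NormalDist


    def pearson(x, y):
        """Empirical Pearson correlation coefficient (illustrative helper)."""
        n = len(x)
        mean_x = sum(x) / n
        mean_y = sum(y) / n
        sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        sxx = sum((a - mean_x) ** 2 for a in x)
        syy = sum((b - mean_y) ** 2 for b in y)
        return sxy / math.sqrt(sxx * syy)


    def pearson_test(x, y, level):
        """Return (r, statistic, p-value, null hypothesis kept?)."""
        n = len(x)
        r = pearson(x, y)
        statistic = r * math.sqrt((n - 2) / (1.0 - r * r))
        # Two-sided p-value with a normal approximation (assumption: large n).
        p_value = 2.0 * (1.0 - NormalDist().cdf(abs(statistic)))
        return r, statistic, p_value, p_value > level


    random.seed(0)
    x = [random.gauss(0.0, 1.0) for _ in range(1000)]
    y = [random.gauss(0.0, 1.0) for _ in range(1000)]
    r, statistic, p_value, kept = pearson_test(x, y, 0.10)
    print("r=%.4g statistic=%.4g p-value=%.4g kept=%s" % (r, statistic, p_value, kept))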

.. GENERATED FROM PYTHON SOURCE LINES 48-50

.. code-block:: Python

    resultPearson = ot.HypothesisTest.Pearson(sample1, sample2, 0.10)


.. GENERATED FROM PYTHON SOURCE LINES 51-54

We can then display the result of the test as a yes/no answer with
the `getBinaryQualityMeasure` method. We can retrieve the p-value and the threshold with the `getPValue`
and `getThreshold` methods.

.. GENERATED FROM PYTHON SOURCE LINES 54-61

.. code-block:: Python

    print(
        "Is the Pearson correlation coefficient null?",
        resultPearson.getBinaryQualityMeasure(),
        "p-value=%.6g" % resultPearson.getPValue(),
        "threshold=%.6g" % resultPearson.getThreshold(),
    )


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Is the Pearson correlation coefficient null? True p-value=0.748637 threshold=0.1


.. GENERATED FROM PYTHON SOURCE LINES 62-66

**Conclusion**: The Pearson test finds no linear correlation between both samples:
the null hypothesis, which assumes that the Pearson correlation coefficient is null, is accepted
at the 0.1 level. Because :math:`(X,Y)` is a Gaussian vector here, this means that the components
are independent.
In the general case, the Gaussian vector hypothesis must be validated before drawing this conclusion.

.. GENERATED FROM PYTHON SOURCE LINES 68-72

We can also use the Spearman test with
a Type I error equal to 0.1 (the probability of wrongly rejecting the null hypothesis).
The Spearman test checks if there exists a monotonic relationship between :math:`X` and :math:`Y`.
The null hypothesis is: *There is no monotonic relation*.
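
The difference between a linear and a monotonic relationship can be seen on a small
deterministic example. The sketch below relies on the usual definition of the Spearman
coefficient as the Pearson coefficient of the ranks; it is plain Python with hypothetical
helper names (``pearson``, ``ranks``, ``spearman``), it assumes there are no ties, and it is
not the OpenTURNS implementation.

.. code-block:: Python

    import math


    def pearson(x, y):
        """Empirical Pearson correlation coefficient (illustrative helper)."""
        n = len(x)
        mean_x = sum(x) / n
        mean_y = sum(y) / n
        sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        sxx = sum((a - mean_x) ** 2 for a in x)
        syy = sum((b - mean_y) ** 2 for b in y)
        return sxy / math.sqrt(sxx * syy)


    def ranks(values):
        """Ranks from 1 to n (assumption: no ties in the data)."""
        order = sorted(range(len(values)), key=lambda i: values[i])
        result = [0.0] * len(values)
        for rank, index in enumerate(order, start=1):
            result[index] = float(rank)
        return result


    def spearman(x, y):
        """Spearman coefficient computed as the Pearson coefficient of the ranks."""
        return pearson(ranks(x), ranks(y))


    # Y = exp(X) is increasing in X but far from linear on [0, 5].
    x = [0.25 * i for i in range(21)]
    y = [math.exp(v) for v in x]
    r_pearson = pearson(x, y)
    r_spearman = spearman(x, y)
    print("Pearson=%.4g Spearman=%.4g" % (r_pearson, r_spearman))

The Spearman coefficient reaches 1 because the relationship is strictly increasing, while the
Pearson coefficient stays below 1 because the relationship is not linear.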

.. GENERATED FROM PYTHON SOURCE LINES 72-80

.. code-block:: Python

    resultSpearman = ot.HypothesisTest.Spearman(sample1, sample2, 0.10)
    print(
        "Is the Spearman correlation coefficient null?",
        resultSpearman.getBinaryQualityMeasure(),
        "p-value=%.6g" % resultSpearman.getPValue(),
        "threshold=%.6g" % resultSpearman.getThreshold(),
    )


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Is the Spearman correlation coefficient null? True p-value=0.839209 threshold=0.1


.. GENERATED FROM PYTHON SOURCE LINES 81-83

**Conclusion**: The Spearman test validates that there is no monotonic correlation between both samples:
the null hypothesis assuming that the Spearman correlation coefficient is null is accepted.

.. GENERATED FROM PYTHON SOURCE LINES 85-88

Here, we create a bivariate sample from a Gaussian vector whose components are correlated. We note
that the Pearson test and the Spearman test both detect a correlation, as both null hypotheses are
rejected.

.. GENERATED FROM PYTHON SOURCE LINES 88-98

.. code-block:: Python

    cor_Matrix = ot.CorrelationMatrix(2)
    cor_Matrix[0, 1] = 0.8
    sample_Biv = ot.Normal([0] * 2, [1] * 2, cor_Matrix).getSample(1000)
    sample1 = sample_Biv.getMarginal(0)
    sample2 = sample_Biv.getMarginal(1)
    resultPearson = ot.HypothesisTest.Pearson(sample1, sample2, 0.10)
    resultSpearman = ot.HypothesisTest.Spearman(sample1, sample2, 0.10)
    print('Pearson test : ', resultPearson)
    print('Spearman test : ', resultSpearman)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Pearson test :  class=TestResult name=Unnamed type=Pearson binaryQualityMeasure=false p-value threshold=0.1 p-value=1.993e-212 statistic=40.4302 description=[]
    Spearman test :  class=TestResult name=Unnamed type=Spearman binaryQualityMeasure=false p-value threshold=0.1 p-value=0 statistic=38.3542 description=[]


.. GENERATED FROM PYTHON SOURCE LINES 99-100

We now consider a discrete distribution. Let us create two independent samples.

.. GENERATED FROM PYTHON SOURCE LINES 100-103

.. code-block:: Python

    sample1 = ot.Poisson(0.2).getSample(100)
    sample2 = ot.Poisson(0.2).getSample(100)


.. GENERATED FROM PYTHON SOURCE LINES 104-105

We use the Chi2 test to check independence.
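
As an illustration of what a chi-squared independence test measures, the sketch below computes
the textbook statistic :math:`\sum_{i,j} (O_{ij} - E_{ij})^2 / E_{ij}` from the contingency
table of two discrete samples, where :math:`E_{ij}` is the count expected under independence.
This is plain Python with hypothetical helper names; OpenTURNS may group or process the
categories differently. The p-value helper is only valid for one degree of freedom.

.. code-block:: Python

    import math
    from collections import Counter


    def chi2_independence(x, y):
        """Return the chi-squared statistic and its degrees of freedom."""
        n = len(x)
        joint = Counter(zip(x, y))  # observed counts of each pair
        row = Counter(x)  # marginal counts of the first sample
        col = Counter(y)  # marginal counts of the second sample
        statistic = 0.0
        for a in row:
            for b in col:
                expected = row[a] * col[b] / n
                statistic += (joint[(a, b)] - expected) ** 2 / expected
        dof = (len(row) - 1) * (len(col) - 1)
        return statistic, dof


    def p_value_one_dof(statistic):
        """Upper tail of the chi-squared distribution with 1 degree of freedom."""
        return math.erfc(math.sqrt(statistic / 2.0))


    # Toy data whose joint counts factorize exactly: 64, 16, 16, 4.
    x_ind = [0] * 80 + [1] * 20
    y_ind = [0] * 64 + [1] * 16 + [0] * 16 + [1] * 4
    stat_ind, dof_ind = chi2_independence(x_ind, y_ind)

    # Toy data where the second sample is a copy of the first one.
    x_dep = [0] * 50 + [1] * 50
    y_dep = list(x_dep)
    stat_dep, dof_dep = chi2_independence(x_dep, y_dep)

    print("independent:", stat_ind, dof_ind, p_value_one_dof(stat_ind))
    print("dependent:", stat_dep, dof_dep, p_value_one_dof(stat_dep))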

.. GENERATED FROM PYTHON SOURCE LINES 105-107

.. code-block:: Python

    resultChi2 = ot.HypothesisTest.ChiSquared(sample1, sample2, 0.10)


.. GENERATED FROM PYTHON SOURCE LINES 108-109

We display the results.

.. GENERATED FROM PYTHON SOURCE LINES 109-116

.. code-block:: Python

    print(
        "Are the components independent?",
        resultChi2.getBinaryQualityMeasure(),
        "p-value=%.6g" % resultChi2.getPValue(),
        "threshold=%.6g" % resultChi2.getThreshold(),
    )


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Are the components independent? True p-value=0.531971 threshold=0.1


.. GENERATED FROM PYTHON SOURCE LINES 117-119

**Conclusion**: The Chi2 test finds no evidence against the independence of both samples:
the null hypothesis, which assumes independence, is accepted.

.. GENERATED FROM PYTHON SOURCE LINES 122-127

Case 2: Independence test using regression
------------------------------------------

This test consists in fitting a linear model between :math:`X` and :math:`Y` and analysing
whether the coefficients are significantly different from 0.

.. GENERATED FROM PYTHON SOURCE LINES 129-131

We create a sample generated by a Gaussian vector :math:`(X_1, X_2, X_3)` with zero mean and unit
variance, whose components :math:`X_1` and :math:`X_3` are correlated.

.. GENERATED FROM PYTHON SOURCE LINES 131-136

.. code-block:: Python

    corr_Matrix = ot.CorrelationMatrix(3)
    corr_Matrix[0, 2] = 0.9
    distribution = ot.Normal([0] * 3, [1] * 3, corr_Matrix)
    sample = distribution.getSample(100)


.. GENERATED FROM PYTHON SOURCE LINES 137-139

Next, we split the sample into two samples: the first one is associated with :math:`(X_1, X_2)` and the
second one is associated with :math:`X_3`.

.. GENERATED FROM PYTHON SOURCE LINES 139-142

.. code-block:: Python

    first_Sample = sample.getMarginal([0, 1])
    second_Sample = sample.getMarginal(2)


.. GENERATED FROM PYTHON SOURCE LINES 143-150

We fit a linear model of :math:`X_3` with respect to :math:`(X_1, X_2)`:
:math:`X_3 = a_0 + a_1X_1 + a_2X_2`.
Then, we test if each coefficient :math:`a_k` is significantly different from 0.
The null hypothesis is *The coefficient of the linear model is equal to zero*.
When the result is *True*, the null hypothesis is accepted, which means that
no linear dependence on the corresponding term is detected. When the result is *False*, the null
hypothesis is rejected, which means that there is a linear relationship between :math:`X_3` and
the corresponding component.
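
The idea of testing each regression coefficient can be sketched without OpenTURNS. The block
below is an assumption-laden illustration in plain Python: it fits the model by ordinary least
squares through the normal equations and forms the usual ratio of each coefficient to its
estimated standard error. The helper names, the synthetic data and the hard-coded critical
value 1.98 (approximately the two-sided 5% Student quantile for 97 degrees of freedom) are
choices made for this sketch only.

.. code-block:: Python

    import math
    import random


    def solve(a, b):
        """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
        n = len(b)
        m = [row[:] + [rhs] for row, rhs in zip(a, b)]
        for col in range(n):
            pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
            m[col], m[pivot] = m[pivot], m[col]
            for r in range(n):
                if r != col:
                    factor = m[r][col] / m[col][col]
                    m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
        return [m[i][n] / m[i][i] for i in range(n)]


    def ols_t_statistics(regressors, y):
        """Return the least squares coefficients and their t-statistics."""
        design = [[1.0] + list(row) for row in regressors]  # intercept column first
        n, p = len(design), len(design[0])
        xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
        xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
        coef = solve(xtx, xty)
        residuals = [t - sum(c * v for c, v in zip(coef, d)) for d, t in zip(design, y)]
        sigma2 = sum(e * e for e in residuals) / (n - p)  # residual variance
        t_stats = []
        for i in range(p):
            unit = [1.0 if j == i else 0.0 for j in range(p)]
            inv_diag = solve(xtx, unit)[i]  # i-th diagonal term of the inverse
            t_stats.append(coef[i] / math.sqrt(sigma2 * inv_diag))
        return coef, t_stats


    # Synthetic data: X3 is correlated with X1 (correlation 0.9) and unrelated to X2.
    random.seed(0)
    x1 = [random.gauss(0.0, 1.0) for _ in range(100)]
    x2 = [random.gauss(0.0, 1.0) for _ in range(100)]
    noise = [random.gauss(0.0, 1.0) for _ in range(100)]
    x3 = [0.9 * a + math.sqrt(1.0 - 0.9 ** 2) * e for a, e in zip(x1, noise)]

    coef, t_stats = ols_t_statistics(list(zip(x1, x2)), x3)
    for k, (c, t) in enumerate(zip(coef, t_stats)):
        print("a%d=%.4g t=%.4g kept=%s" % (k, c, t, abs(t) < 1.98))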

.. GENERATED FROM PYTHON SOURCE LINES 150-159

.. code-block:: Python

    test_results = ot.LinearModelTest.FullRegression(first_Sample, second_Sample)
    for i in range(len(test_results)):
        print(
            "Coefficient a" + str(i) + " is equal to 0?",
            test_results[i].getBinaryQualityMeasure(),
            "p-value=%.6g" % test_results[i].getPValue(),
            "threshold=%.6g" % test_results[i].getThreshold(),
        )


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Coefficient a0 is equal to 0? True p-value=0.951597 threshold=0.05
    Coefficient a1 is equal to 0? False p-value=6.20875e-41 threshold=0.05
    Coefficient a2 is equal to 0? True p-value=0.113865 threshold=0.05


.. GENERATED FROM PYTHON SOURCE LINES 160-162

**Conclusion**: The test detects the correlation between :math:`X_1` and :math:`X_3`
(:math:`a_1` is significantly different from 0) and the independence between :math:`X_2` and
:math:`X_3` (:math:`a_2` is not). It also finds that :math:`a_0` is not significantly different from 0.


.. _sphx_glr_download_auto_data_analysis_statistical_tests_plot_test_independence.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_test_independence.ipynb <plot_test_independence.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_test_independence.py <plot_test_independence.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_test_independence.zip <plot_test_independence.zip>`