# Test independence

```
import openturns as ot
import openturns.viewer as viewer
from matplotlib import pylab as plt
ot.Log.Show(ot.Log.NONE)
```

## Sample independence test

In this section we perform tests to assess whether two one-dimensional samples are independent.

The following tests are available:

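- the ChiSquared test: it tests whether two scalar samples (discrete ones only) are independent. If $n_{ij}$ is the number of values of sample $i \in \{1, 2\}$ in modality $j$, $1 \leq j \leq m$, with $n_{i.} = \sum_{j=1}^m n_{ij}$, $n_{.j} = \sum_{i=1}^2 n_{ij}$ and $n = \sum_{i=1}^2 \sum_{j=1}^m n_{ij}$, the ChiSquared test evaluates the decision variable:

  $$D^2 = \sum_{i=1}^2 \sum_{j=1}^m \frac{\left( n_{ij} - \frac{n_{i.} n_{.j}}{n} \right)^2}{\frac{n_{i.} n_{.j}}{n}}$$

  which tends towards the $\chi^2(m-1)$ distribution. The hypothesis of independence is rejected if $D^2$ is too high (depending on the p-value threshold).

- the Pearson test: it tests whether there exists a linear relation between two scalar samples that form a Gaussian vector (which is equivalent to a linear correlation coefficient not equal to zero). If both samples are $x = (x_i)_{1 \leq i \leq n}$ and $y = (y_i)_{1 \leq i \leq n}$, with $\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i$ and $\bar{y} = \frac{1}{n} \sum_{i=1}^n y_i$, the Pearson test evaluates the decision variable:

  $$D = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^n (x_i - \bar{x})^2 \sum_{i=1}^n (y_i - \bar{y})^2}}$$

  Under the hypothesis of normality of both samples and of a zero correlation coefficient, the rescaled variable $D \sqrt{(n-2)/(1-D^2)}$ follows a Student distribution with $n-2$ degrees of freedom. The hypothesis of a linear correlation coefficient equal to 0 (which, for a Gaussian vector, is equivalent to the independence of the samples) is rejected if $|D|$ is too high (depending on the p-value threshold).

- the Spearman test: it tests whether there exists a monotonic relation between two scalar samples. If both samples are $x = (x_i)_{1 \leq i \leq n}$ and $y = (y_i)_{1 \leq i \leq n}$, the Spearman test evaluates the decision variable:

  $$D = 1 - \frac{6 \sum_{i=1}^n (r_i - s_i)^2}{n (n^2 - 1)}$$

  where $r_i = \mathrm{rank}(x_i)$ and $s_i = \mathrm{rank}(y_i)$. $D$ is such that $\sqrt{n-1}\, D$ tends towards the standard normal distribution.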
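As a rough illustration of the Pearson and Spearman decision variables, the sketch below computes the linear correlation coefficient and the rank correlation coefficient of two small samples in plain Python. The helper names (`pearson_statistic`, `ranks`, `spearman_statistic`) and the data are assumptions made for this example and are not part of OpenTURNS; the rank helper assumes there are no ties, and the p-value uses the asymptotic normal approximation of $\sqrt{n-1}\, D$, which is crude for such a small sample.

```python
import math


def pearson_statistic(x, y):
    """Linear correlation coefficient of two equally sized samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Rank of each value (1 = smallest); assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_statistic(x, y):
    """Rank correlation: 1 - 6 * sum((r_i - s_i)^2) / (n * (n^2 - 1))."""
    n = len(x)
    r, s = ranks(x), ranks(y)
    d2 = sum((ri - si) ** 2 for ri, si in zip(r, s))
    return 1.0 - 6.0 * d2 / (n * (n ** 2 - 1))


# Illustrative data (hypothetical, not taken from the example)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
d_pearson = pearson_statistic(x, y)
d_spearman = spearman_statistic(x, y)

# Two-sided asymptotic p-value, using sqrt(n - 1) * D ~ N(0, 1)
z = math.sqrt(len(x) - 1) * d_spearman
p_spearman = math.erfc(abs(z) / math.sqrt(2.0))
print(d_pearson, d_spearman, p_spearman)
```

On this data both coefficients equal 0.8, because the values already coincide with their ranks.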

### The continuous case

We create two different continuous samples:

```
sample1 = ot.Normal().getSample(100)
sample2 = ot.Normal().getSample(100)
```

We first use the Pearson test and store the result:

```
resultPearson = ot.HypothesisTest.Pearson(sample1, sample2, 0.10)
```

We can then display the result of the test as a yes/no answer with the getBinaryQualityMeasure method. We can retrieve the p-value and the threshold with the getPValue and getThreshold methods.

```
print('Samples are independent?', resultPearson.getBinaryQualityMeasure(),
      'p-value=%.6g' % resultPearson.getPValue(),
      'threshold=%.6g' % resultPearson.getThreshold())
```

Out:

```
Samples are independent? False p-value=0.0451584 threshold=0.1
```
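The printed values on this page are consistent with a simple decision rule: the binary measure is True when the p-value exceeds the threshold, in which case the hypothesis of independence is not rejected. A minimal sketch of that rule follows; `is_not_rejected` is a hypothetical helper written for this illustration, not an OpenTURNS function.

```python
def is_not_rejected(p_value, threshold):
    """Return True when the null hypothesis is kept at the given threshold."""
    return p_value > threshold


# p-value and threshold copied from the Pearson output above
decision = is_not_rejected(0.0451584, 0.10)
print(decision)
```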

We can also use the Spearman test:

```
resultSpearman = ot.HypothesisTest.Spearman(sample1, sample2, 0.10)
print('Samples are independent?', resultSpearman.getBinaryQualityMeasure(),
      'p-value=%.6g' % resultSpearman.getPValue(),
      'threshold=%.6g' % resultSpearman.getThreshold())
```

Out:

```
Samples are independent? False p-value=0.0603411 threshold=0.1
```

### The discrete case

Testing is also possible for discrete distributions. Let us create two different discrete samples:

```
sample1 = ot.Poisson(0.2).getSample(100)
sample2 = ot.Poisson(0.2).getSample(100)
```

We use the ChiSquared test to check independence and store the result:

```
resultChi2 = ot.HypothesisTest.ChiSquared(sample1, sample2, 0.10)
```

We then display the results:

```
print('Samples are independent?', resultChi2.getBinaryQualityMeasure(),
      'p-value=%.6g' % resultChi2.getPValue(),
      'threshold=%.6g' % resultChi2.getThreshold())
```

Out:

```
Samples are independent? True p-value=0.20552 threshold=0.1
```
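To make the ChiSquared decision variable concrete, here is a plain-Python sketch that builds the table of counts of each modality in each of the two samples and sums the squared deviations from the expected counts. It follows the definition given at the top of this page; the function name `chi_squared_statistic` and the data are assumptions for illustration, and the computation performed internally by OpenTURNS may differ.

```python
from collections import Counter


def chi_squared_statistic(sample_a, sample_b):
    """Return (D^2, degrees of freedom) for the 2 x m table of counts."""
    counts = [Counter(sample_a), Counter(sample_b)]
    modalities = sorted(set(sample_a) | set(sample_b))
    n = len(sample_a) + len(sample_b)
    row_totals = [len(sample_a), len(sample_b)]
    col_totals = {v: counts[0][v] + counts[1][v] for v in modalities}
    d2 = 0.0
    for i in range(2):
        for v in modalities:
            # Expected count if the modality frequencies were the same in both samples
            expected = row_totals[i] * col_totals[v] / n
            d2 += (counts[i][v] - expected) ** 2 / expected
    return d2, len(modalities) - 1


# Illustrative discrete data (hypothetical)
d2, dof = chi_squared_statistic([0, 0, 0, 1], [0, 1, 1, 1])
print(d2, dof)
```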

## Test samples independence using regression

Independence testing with regression is also an option in OpenTURNS. It consists of detecting a linear relation between the components of one sample and a scalar sample.
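The idea behind a regression-based test can be sketched for a single regressor: fit a least-squares line and compare the slope to its standard error. Under the hypothesis of a zero slope (and Gaussian errors), the ratio follows a Student distribution with $n-2$ degrees of freedom, so a large absolute value points to a linear dependence. The function name `slope_t_statistic` and the data below are assumptions for illustration; this is not the OpenTURNS implementation.

```python
import math


def slope_t_statistic(x, y):
    """Least-squares fit y = a + b * x; return (a, b, t) with t = b / se(b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    # The unbiased residual variance uses n - 2 degrees of freedom
    std_error = math.sqrt(residual_ss / (n - 2) / sxx)
    return intercept, slope, slope / std_error


# Illustrative data with a clear linear trend (hypothetical)
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.1, 0.9, 2.2, 2.8, 4.0]
intercept, slope, t_value = slope_t_statistic(x, y)
print(intercept, slope, t_value)
```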

We generate a sample of dimension 3 in which component 0 is correlated with component 2:

```
marginals = [ot.Normal()] * 3
S = ot.CorrelationMatrix(3)
S[0, 2] = 0.9
copula = ot.NormalCopula(S)
distribution = ot.ComposedDistribution(marginals, copula)
sample = distribution.getSample(30)
```
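To give a feel for what a correlation of 0.9 between two standard normal components means, the sketch below builds such a pair directly from two independent Gaussian draws, using $z_2 = \rho z_0 + \sqrt{1 - \rho^2}\, \varepsilon$. This is an illustrative construction in plain Python with assumed helper names (`correlated_normal_pairs`, `empirical_correlation`), not the way OpenTURNS samples from a NormalCopula.

```python
import math
import random


def correlated_normal_pairs(rho, size, seed=0):
    """Draw pairs of standard normal values with linear correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(size):
        z0 = rng.gauss(0.0, 1.0)
        eps = rng.gauss(0.0, 1.0)
        # Mix z0 with independent noise so that corr(z0, z2) = rho
        z2 = rho * z0 + math.sqrt(1.0 - rho ** 2) * eps
        pairs.append((z0, z2))
    return pairs


def empirical_correlation(pairs):
    """Sample linear correlation coefficient of a list of pairs."""
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    return cov / math.sqrt(var_a * var_b)


pairs = correlated_normal_pairs(0.9, 5000)
r = empirical_correlation(pairs)
print(r)
```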

Next, we split it into two samples: firstSample of dimension 2 and secondSample of dimension 1.

```
firstSample = sample[:, :2]
secondSample = sample[:, 2]
```

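We test the independence of each component of firstSample against secondSample. The test returns one result per regression coefficient; since three results are printed for a two-component firstSample, index 0 appears to correspond to the intercept, index 1 to component 0 of firstSample and index 2 to component 1: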

```
test_results = ot.LinearModelTest.FullRegression(firstSample, secondSample)
for i in range(len(test_results)):
    print('Component', i, 'is independent?', test_results[i].getBinaryQualityMeasure(),
          'p-value=%.6g' % test_results[i].getPValue(),
          'threshold=%.6g' % test_results[i].getThreshold())
```

Out:

```
Component 0 is independent? True p-value=0.646138 threshold=0.05
Component 1 is independent? False p-value=1.30057e-10 threshold=0.05
Component 2 is independent? True p-value=0.342379 threshold=0.05
```
