Pearson correlation test

The Pearson test checks whether there is a linear relationship between two random variables $X$ and $Y$.

The Pearson test is based on the Pearson correlation coefficient defined in Pearson coefficient. It tests whether the Pearson correlation coefficient is significantly different from zero. When $(X, Y)$ is a Gaussian vector, this is equivalent to testing the independence of $X$ and $Y$.

The Pearson test compares the null hypothesis $\cH_0 = \left\{ \rho_P(X,Y) = 0 \right\}$ against the alternative hypothesis $\cH_1 = \left\{ \rho_P(X,Y) \neq 0 \right\}$ .

The Pearson coefficient $\rho_P(X,Y)$ is estimated from a sample of size $\sampleSize$ drawn from the bivariate random vector $(X,Y)$. The estimate, denoted by $\hat{\rho}_P(X,Y)$, is computed according to relation (1).

The test statistic $T(X,Y)$ is defined by:

$T(X,Y) = \hat{\rho}_P(X,Y) \sqrt{ \dfrac{\sampleSize-2}{1-(\hat{\rho}_P(X,Y))^2} }$
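As an illustration of the formula above, the following minimal sketch computes the estimate $\hat{\rho}_P$ and the realization $t$ of the statistic in plain Python. It is not the library's implementation: the function name `pearson_statistic` is hypothetical, and the usual sample estimator of the Pearson coefficient is assumed for relation (1).

```python
import math


def pearson_statistic(x, y):
    """Return (rho_hat, t) for two samples of equal size n >= 3.

    Hypothetical helper: assumes the usual sample estimator of the
    Pearson coefficient. The statistic is undefined when |rho_hat| == 1
    or when one of the samples is constant.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("samples must have the same size, at least 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Centered sums of products and squares
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    rho_hat = sxy / math.sqrt(sxx * syy)
    # T = rho_hat * sqrt((n - 2) / (1 - rho_hat^2))
    t = rho_hat * math.sqrt((n - 2) / (1.0 - rho_hat ** 2))
    return rho_hat, t


# Small illustrative sample (made-up values)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
rho_hat, t = pearson_statistic(x, y)
print(rho_hat, t)
```

Because $1-\hat{\rho}_P^2$ appears in the denominator, $|t|$ grows quickly as the estimate approaches $\pm 1$, and it grows with the sample size for a fixed estimate.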

Under the null hypothesis $\cH_0$, the statistic $T$ follows a Student distribution with $\sampleSize-2$ degrees of freedom when $(X,Y)$ is a Gaussian vector. In other cases, the Student distribution $T(\sampleSize-2)$ is asymptotically equivalent to the distribution of $T$. The library uses the Student distribution $T(\sampleSize-2)$ in all cases.

The p-value $p_v$ is the probability $p_v = \Prob{|T| \geq |t(X,Y)|}$, where $t(X,Y)$ is the realization of $T(X,Y)$ computed on the sample. The null hypothesis $\cH_0$ is rejected if $p_v < p_v^\ell$, where $p_v^\ell$ is a specified threshold (usually 0.1 or 0.05). The p-value limit $p_v^\ell$ is the probability of wrongly rejecting the null hypothesis $\cH_0$, that is, of committing a Type I error.
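The decision rule can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not how the library evaluates the Student distribution: the function name `two_sided_p_value` is hypothetical, and the probability is obtained by numerical integration (composite Simpson rule) of the Student density after the change of variable $u = \sqrt{\nu}\tan\theta$, with $\nu = \sampleSize - 2$.

```python
import math


def two_sided_p_value(t, n, steps=2000):
    """Approximate P(|T| >= |t|) for T ~ Student(n - 2).

    Hypothetical helper. With u = sqrt(nu) * tan(theta), the Student
    density integrates as
        P(|T| <= a) = 2 * c * integral_0^atan(a / sqrt(nu)) cos(theta)^(nu - 1) dtheta
    with c = Gamma((nu + 1) / 2) / (Gamma(nu / 2) * sqrt(pi)).
    `steps` must be even (composite Simpson rule).
    """
    nu = n - 2
    if nu < 1:
        raise ValueError("the sample size must be at least 3")
    if steps % 2:
        raise ValueError("steps must be even")
    theta_max = math.atan(abs(t) / math.sqrt(nu))
    const = math.exp(math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)) / math.sqrt(math.pi)
    h = theta_max / steps
    # End points of the Simpson rule: cos(0)^(nu - 1) = 1
    total = 1.0 + math.cos(theta_max) ** (nu - 1)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * math.cos(i * h) ** (nu - 1)
    prob_inside = 2.0 * const * total * h / 3.0
    # Clamp to [0, 1] to absorb rounding errors
    return min(1.0, max(0.0, 1.0 - prob_inside))


# Illustrative values: a realization t = 2.0 computed on a sample of size 20
p_value = two_sided_p_value(2.0, 20)
p_value_limit = 0.05  # chosen threshold
print("p-value:", p_value)
print("reject H0:", p_value < p_value_limit)
```

The threshold `p_value_limit` is fixed before looking at the data; a smaller threshold lowers the risk of a Type I error but makes a true linear relationship harder to detect.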

When the null hypothesis $\cH_0$ is rejected, the sample indicates a statistically significant linear relationship between $X$ and $Y$.
