Pearson correlation test

The Pearson test checks whether there is a linear relationship between two random variables X and Y.

The Pearson test is based on the Pearson correlation coefficient defined in Pearson coefficient. It tests whether the Pearson correlation coefficient is significantly different from zero. When (X, Y) is a Gaussian vector, this is equivalent to testing the independence of X and Y.

The Pearson test compares the null hypothesis \cH_0 = \left\{ \rho_P(X,Y) = 0 \right\} against the alternative hypothesis \cH_1 = \left\{ \rho_P(X,Y) \neq 0 \right\}.

The Pearson coefficient \rho_P(X,Y) is estimated from a sample of size \sampleSize generated by the bivariate random vector (X,Y). The estimate is denoted by \hat{\rho}_P(X,Y) and is computed according to the relation (1).

The statistic T(X,Y) used in the test is defined by:

T(X,Y) = \hat{\rho}_P(X,Y) \sqrt{ \dfrac{\sampleSize-2}{1-(\hat{\rho}_P(X,Y))^2} }
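As an illustration, the statistic can be computed by hand on a small sample. The following is a minimal sketch in plain Python, not the library's implementation: the function name `pearson_statistic` and the data are hypothetical, and the estimator is assumed to be the usual sample Pearson coefficient.

```python
import math

def pearson_statistic(x, y):
    """Return (rho_hat, t) for two samples of the same size n >= 3.

    Hypothetical helper written for illustration only.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two samples of the same size n >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Centered sums of squares and cross-products.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Assumed estimator: the usual sample Pearson coefficient.
    rho_hat = sxy / math.sqrt(sxx * syy)
    # T = rho_hat * sqrt((n - 2) / (1 - rho_hat^2)).
    # Undefined when |rho_hat| = 1 (division by zero).
    t = rho_hat * math.sqrt((n - 2) / (1.0 - rho_hat ** 2))
    return rho_hat, t

# Made-up data, chosen so that the numbers are easy to check by hand.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
rho_hat, t = pearson_statistic(x, y)
print(rho_hat, t)  # rho_hat is 0.8 and t is 4 / sqrt(3), about 2.309
```

For this sample the centered sums are 10, 10 and 8, which gives a coefficient of 0.8 and a statistic of 0.8 \sqrt{3 / 0.36} = 4 / \sqrt{3}.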

Under the null hypothesis \cH_0, the statistic T follows a Student distribution with \sampleSize-2 degrees of freedom when (X,Y) is a Gaussian vector. In other cases, the Student distribution T(\sampleSize-2) is equivalent to the asymptotic distribution of T. The library uses the Student distribution T(\sampleSize-2) in all cases.
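The Gaussian case can be checked empirically with a small Monte Carlo experiment. The sketch below is an illustration under stated assumptions, not part of the library: it draws independent standard Gaussian samples of size 12, so that \cH_0 holds, and counts how often |T| exceeds 2.228, the tabulated 97.5% quantile of the Student distribution with 10 degrees of freedom. The helper name, the seed, and the number of replicates are arbitrary choices.

```python
import math
import random

def t_statistic(x, y):
    """Pearson test statistic T for two samples of the same size (illustrative helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    rho_hat = sxy / math.sqrt(sxx * syy)
    return rho_hat * math.sqrt((n - 2) / (1.0 - rho_hat ** 2))

rng = random.Random(0)   # fixed seed so the run is reproducible
n = 12                   # sample size, hence n - 2 = 10 degrees of freedom
critical = 2.228         # tabulated 97.5% quantile of Student with 10 d.o.f.
replicates = 20000

exceed = 0
for _ in range(replicates):
    # X and Y are independent Gaussian variables, so H0 is true.
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    if abs(t_statistic(x, y)) >= critical:
        exceed += 1

rate = exceed / replicates
print(rate)  # expected to be close to 0.05, up to Monte Carlo noise
```

If T follows the Student distribution with 10 degrees of freedom, the exceedance rate should be close to 5%. Repeating the experiment with a non-Gaussian distribution for X and Y gives an idea of how good the asymptotic approximation is at a given sample size.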

The p-value p_v is the probability p_v = \Prob{|T| \geq |t(X,Y)|} where t(X,Y) is the realization of T(X,Y) computed on the sample. The null hypothesis \cH_0 is rejected if p_v < p_v^\ell where p_v^\ell is specified (usually 0.1 or 0.05). The p-value limit p_v^\ell is the probability of wrongly rejecting the null hypothesis \cH_0, that is, of committing a Type I error.
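The decision rule can be sketched as follows. This is a minimal illustration in plain Python and not the library's implementation: the two-sided p-value is obtained by integrating the Student density with Simpson's rule, whereas a statistics library would normally supply the Student distribution function directly. The function names, the realization t, and the threshold are assumptions made for the example.

```python
import math

def student_pdf(x, nu):
    """Density of the Student distribution with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(x * x / nu))

def two_sided_p_value(t, nu, steps=2000):
    """P(|T| >= |t|) for T ~ Student(nu), using Simpson's rule on [0, |t|].

    Illustrative helper; steps must be even.
    """
    a = abs(t)
    if a == 0.0:
        return 1.0
    h = a / steps
    total = student_pdf(0.0, nu) + student_pdf(a, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_pdf(i * h, nu)
    central_mass = 2.0 * total * h / 3.0  # P(|T| < |t|), by symmetry
    return max(0.0, 1.0 - central_mass)

# Hypothetical realization: rho_hat = 0.8 on a sample of size n = 5.
n = 5
t = 0.8 * math.sqrt((n - 2) / (1.0 - 0.8 ** 2))
p_value = two_sided_p_value(t, n - 2)
level = 0.05              # p-value limit chosen by the user
reject = p_value < level
print(p_value, reject)    # p_value is about 0.104, so H0 is not rejected
```

With only five points, a coefficient as large as 0.8 still gives a p-value of about 0.104, so \cH_0 is not rejected at the 0.05 level. At a limit of 0.1 the p-value also sits just above the threshold.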

Rejecting the null hypothesis \cH_0 means that the sample shows a significant linear relationship between X and Y.