Spearman correlation coefficient
The Spearman rank correlation coefficient measures how strongly two random variables with finite variance are correlated. Spearman’s correlation assesses monotonic relationships between both variables.
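Let $X$ and $Y$ be two random variables whose CDFs are denoted by $F_X$ and $F_Y$. Spearman’s rank correlation coefficient $\rho_S(X, Y)$ is defined by:
$$\rho_S(X, Y) = \frac{\mathrm{Cov}\left(F_X(X), F_Y(Y)\right)}{\sqrt{\mathrm{Var}\left(F_X(X)\right) \, \mathrm{Var}\left(F_Y(Y)\right)}}$$
where $\mathrm{Cov}$ is the covariance operator, $\mathrm{Var}$ is the variance operator, and $F_X$ and $F_Y$ are the respective CDFs of $X$ and $Y$.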
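The Spearman correlation between two variables is equal to the Pearson correlation coefficient $\rho_P$ between the rank values of the variables:
$$\rho_S(X, Y) = \rho_P\left(F_X(X), F_Y(Y)\right)$$
If $C$ is the CDF of the copula of the random vector $(X, Y)$, then, for continuous marginals, we get:
$$\rho_S(X, Y) = 12 \int_0^1 \int_0^1 C(u, v) \, du \, dv - 3$$
which shows that the Spearman correlation depends on the copula only, not on the marginal distributions.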
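As a quick sanity check, the copula expression $\rho_S = 12 \int_0^1 \int_0^1 C(u, v) \, du \, dv - 3$ can be evaluated by hand for three standard copulas. The derivation assumes continuous marginals; the symbols $C_\perp$, $M$ and $W$ (independent, comonotone and countermonotone copulas) are notation introduced here for illustration only.

$$C_\perp(u, v) = uv: \quad 12 \int_0^1 \int_0^1 uv \, du \, dv - 3 = 12 \cdot \frac{1}{4} - 3 = 0$$
$$M(u, v) = \min(u, v): \quad 12 \int_0^1 \int_0^1 \min(u, v) \, du \, dv - 3 = 12 \cdot \frac{1}{3} - 3 = 1$$
$$W(u, v) = \max(u + v - 1, 0): \quad 12 \int_0^1 \int_0^1 \max(u + v - 1, 0) \, du \, dv - 3 = 12 \cdot \frac{1}{6} - 3 = -1$$

These values correspond to independence, a perfectly increasing relationship and a perfectly decreasing relationship, respectively.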
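Let $((x_1, y_1), \dots, (x_n, y_n))$ be a sample generated by the bivariate random vector $(X, Y)$. We denote by $((r_1, s_1), \dots, (r_n, s_n))$ the rank sample, which means that $r_k$ is the rank of the value $x_k$ within the sample $(x_1, \dots, x_n)$ and $s_k$ is the rank of the value $y_k$ within the sample $(y_1, \dots, y_n)$. The estimator $\hat{\rho}_S(X, Y)$ is equal to the Pearson estimator $\hat{\rho}_P$ computed on the rank sample $((r_1, s_1), \dots, (r_n, s_n))$. It is estimated as follows:
$$\hat{\rho}_S(X, Y) = \frac{\sum_{k=1}^{n} (r_k - \overline{r})(s_k - \overline{s})}{\sqrt{\sum_{k=1}^{n} (r_k - \overline{r})^2 \, \sum_{k=1}^{n} (s_k - \overline{s})^2}} \tag{1}$$
where $\overline{r}$ and $\overline{s}$ are the empirical mean ranks of each sample.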
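The estimator (1) can be sketched in a few lines of plain Python. This is a minimal illustration, not the OpenTURNS implementation: the helper names `average_ranks` and `spearman` are hypothetical, and tied values are assumed to receive the average of the ranks they span, a common convention that the text above does not specify.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank (assumed convention)."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Estimator (1): Pearson correlation computed on the rank sample."""
    r, s = average_ranks(x), average_ranks(y)
    n = len(r)
    r_bar, s_bar = sum(r) / n, sum(s) / n
    num = sum((rk - r_bar) * (sk - s_bar) for rk, sk in zip(r, s))
    den = sqrt(sum((rk - r_bar) ** 2 for rk in r) * sum((sk - s_bar) ** 2 for sk in s))
    return num / den


x = [10.0, 20.0, 30.0, 40.0, 50.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0]
rho = spearman(x, y)  # rank samples are (1, 2, 3, 4, 5) and (1, 3, 2, 5, 4)
print(rho)
```

For this sample the numerator of (1) is 8 and the denominator is 10, so the estimate is 0.8. The sketch does not guard against a constant sample, for which the denominator is zero.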
We sum up some interesting features of the coefficient:
The Spearman correlation coefficient takes values between -1 and 1.
If $|\rho_S(X, Y)| = 1$, then there exists a monotonic function $\varphi$ such that $Y = \varphi(X)$.
The closer $|\rho_S(X, Y)|$ is to 1, the stronger the indication that a monotonic relationship exists between $X$ and $Y$. The sign of the Spearman coefficient indicates whether the two variables vary in the same direction (positive coefficient) or in opposite directions (negative coefficient).
If $X$ and $Y$ are independent, then $\rho_S(X, Y) = 0$.
If $\rho_S(X, Y) = 0$, this does not imply that $X$ and $Y$ are independent. It may only mean that the relation between the two variables is not monotonic.
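These features can be checked numerically on a small deterministic sample. The sketch below is illustrative only: `average_ranks` and `spearman` are hypothetical helpers written for this example (not OpenTURNS functions), and tied values are assumed to share their average rank.

```python
from math import exp, sqrt


def average_ranks(values):
    """Rank of u = (count of smaller values) + (count of equal values + 1) / 2."""
    return [
        sum(v < u for v in values) + (sum(v == u for v in values) + 1) / 2
        for u in values
    ]


def spearman(x, y):
    """Pearson correlation of the two rank samples."""
    r, s = average_ranks(x), average_ranks(y)
    n = len(r)
    r_bar, s_bar = sum(r) / n, sum(s) / n
    num = sum((a - r_bar) * (b - s_bar) for a, b in zip(r, s))
    den = sqrt(sum((a - r_bar) ** 2 for a in r) * sum((b - s_bar) ** 2 for b in s))
    return num / den


x = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Increasing nonlinear map: both rank samples coincide, so the coefficient is 1.
rho_increasing = spearman(x, [exp(v) for v in x])

# Decreasing nonlinear map: the ranks are reversed, so the coefficient is -1.
rho_decreasing = spearman(x, [-(v ** 3) for v in x])

# y = x^2 is fully determined by x but not monotonic: the coefficient is 0
# even though the two variables are clearly dependent.
rho_parabola = spearman(x, [v ** 2 for v in x])

print(rho_increasing, rho_decreasing, rho_parabola)
```

The last case illustrates why a zero coefficient cannot be read as evidence of independence.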
Spearman’s coefficient is often referred to as the rank correlation coefficient.
OpenTURNS