Uncertainty ranking: PCC and PRCC
Partial Correlation Coefficients (PCC) quantify the influence of the random vector $\mathbf{X} = (X_1, \dots, X_{n_X})$ on a random variable $Y$, which is being studied.
Here we attempt to measure the linear relationships that exist between $Y$ and the different components $X_i$.
The basic method of hierarchical ordering using Pearson's coefficients deals with the case where the variable $Y$ depends linearly on the $n_X$ variables $\{X_1, \dots, X_{n_X}\}$.
Partial Correlation Coefficients are also useful in this case but provide a different kind of information: the partial correlation coefficient $\mathrm{PCC}_{X_i, Y}$ between the variables $Y$ and $X_i$ measures the residual influence of $X_i$ on $Y$ once influences from all other variables $X_k$, $k \neq i$, have been eliminated.
In particular, if
and
are perfectly correlated,
then
.
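For any variable index $i \in \{1, \dots, n_X\}$, the estimation of the partial correlation coefficient $\mathrm{PCC}_{X_i, Y}$ uses a sample of size $n$ denoted by $\{(y^{(j)}, x_1^{(j)}, \dots, x_{n_X}^{(j)})\}_{j = 1, \dots, n}$ of the vector $(Y, X_1, \dots, X_{n_X})$. This requires the following three steps to be carried out: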
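1. Determine the effect of the other variables $\{X_k, k \neq i\}$ on $Y$ by linear regression; when the values of the variables $\{X_k, k \neq i\}$ are known, the average forecast for the value of $Y$ is then available in the form of the equation:

   $$\widehat{Y} = \widehat{a}_0 + \sum_{k \neq i} \widehat{a}_k X_k$$

2. Determine the effect of the other variables $\{X_k, k \neq i\}$ on $X_i$ by linear regression; when the values of the variables $\{X_k, k \neq i\}$ are known, the average forecast for the value of $X_i$ is then available in the form of the equation:

   $$\widehat{X}_i = \widehat{b}_0 + \sum_{k \neq i} \widehat{b}_k X_k$$

3. The partial correlation coefficient $\widehat{\mathrm{PCC}}_{X_i, Y}$ is then equal to the sample Pearson correlation coefficient $\widehat{\rho}_{Y - \widehat{Y},\, X_i - \widehat{X}_i}$ estimated for the residual variables $Y - \widehat{Y}$ and $X_i - \widehat{X}_i$.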
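The three steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the OpenTURNS implementation: the helper names (`solve_linear_system`, `residuals`, `pearson`, `pcc`) are hypothetical, the regressions include an intercept, and the synthetic model `y = 3*x1 - x2 + 0.1*x3 + noise` is invented for the demonstration.

```python
import math
import random


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    size = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]  # augmented matrix
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * size
    for r in range(size - 1, -1, -1):  # back substitution
        known = sum(aug[r][c] * solution[c] for c in range(r + 1, size))
        solution[r] = (aug[r][size] - known) / aug[r][r]
    return solution


def residuals(target, predictors):
    """Residuals of a least-squares fit of target on predictors (with intercept)."""
    n = len(target)
    columns = [[1.0] * n] + list(predictors)  # intercept column first
    # Normal equations: (A^T A) coef = A^T target
    gram = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns] for ci in columns]
    moment = [sum(u * t for u, t in zip(ci, target)) for ci in columns]
    coef = solve_linear_system(gram, moment)
    return [t - sum(c * col[j] for c, col in zip(coef, columns))
            for j, t in enumerate(target)]


def pearson(u, v):
    """Sample Pearson correlation coefficient of two equally sized lists."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def pcc(inputs, output, i):
    """PCC between input i and the output; inputs is a list of columns."""
    others = [col for k, col in enumerate(inputs) if k != i]
    output_residual = residuals(output, others)    # step 1
    input_residual = residuals(inputs[i], others)  # step 2
    return pearson(output_residual, input_residual)  # step 3


# Synthetic data (an assumption made for this illustration only).
random.seed(0)
n = 500
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [random.gauss(0, 1) for _ in range(n)]
y = [3 * a - b + 0.1 * c + random.gauss(0, 0.5) for a, b, c in zip(x1, x2, x3)]

inputs = [x1, x2, x3]
pcc_values = [pcc(inputs, y, i) for i in range(len(inputs))]
print(pcc_values)
```

Solving the normal equations directly keeps the sketch dependency-free; for ill-conditioned or nearly collinear inputs, a QR-based least-squares solver from a numerical library would be a more robust choice.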
One can then rank the variables $X_1, \dots, X_{n_X}$ according to the absolute value of the partial correlation coefficients: the higher the value of $|\widehat{\mathrm{PCC}}_{X_i, Y}|$, the greater the impact the variable $X_i$ has on $Y$.
In order to introduce the PRCC, we need to define the rank of an observation of a random variable.
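Let $X$ be a random variable.
Let $\{x^{(1)}, \dots, x^{(n)}\}$ be a sample of size $n$.
We can sort the sample in increasing order:

$$x_{(1)} \leq x_{(2)} \leq \dots \leq x_{(n)}.$$

For any $j \in \{1, \dots, n\}$, the index $r \in \{1, \dots, n\}$ is the rank of the $j$-th observation if $x^{(j)}$ is the $r$-th smallest observation in the sample.
In other words, the observation $x^{(j)}$ appears at the $r$-th index in the ordered sample $\{x_{(1)}, \dots, x_{(n)}\}$.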
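As a small illustration of this definition, the sketch below computes the rank of each observation from its position in the sorted sample. The function name `ranks` and the sample values are hypothetical, and the sketch assumes there are no ties (tied values would simply receive consecutive ranks in order of appearance).

```python
def ranks(sample):
    """Return the 1-based rank of each observation in the increasing-order sort."""
    # Indices of the observations, ordered by increasing observed value.
    order = sorted(range(len(sample)), key=lambda j: sample[j])
    rank = [0] * len(sample)
    for position, j in enumerate(order, start=1):
        rank[j] = position  # observation j sits at this position when sorted
    return rank


sample = [2.7, 0.4, 5.1, 1.9]
sample_ranks = ranks(sample)
print(sample_ranks)  # [3, 1, 4, 2]
```

Here the ordered sample is 0.4, 1.9, 2.7, 5.1, so the first observation (2.7) has rank 3 and the second (0.4) has rank 1.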
Now that the rank of an observation is defined, consider again the case of an input random vector $\mathbf{X}$ and the output random variable $Y$.
The Partial Rank Correlation Coefficients (PRCC) are the PCC coefficients computed on the ranks of the input variables $X_1, \dots, X_{n_X}$ and the ranks of the output variable $Y$.
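A minimal sketch of the PRCC idea follows, restricted to the special case of exactly two inputs. With only one "other" variable, the residual-based partial correlation coincides with the classical first-order formula $(r_{xy} - r_{xz} r_{yz}) / \sqrt{(1 - r_{xz}^2)(1 - r_{yz}^2)}$, which is used here to keep the example short. The helper names and the monotonic but nonlinear model `y = exp(x1) - x2` are assumptions made for the illustration, and ties are assumed absent.

```python
import math
import random


def ranks(sample):
    """1-based ranks of the observations (no ties assumed)."""
    order = sorted(range(len(sample)), key=lambda j: sample[j])
    rank = [0] * len(sample)
    for position, j in enumerate(order, start=1):
        rank[j] = position
    return rank


def pearson(u, v):
    """Sample Pearson correlation coefficient."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def partial_correlation(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Synthetic data (illustrative assumption): monotonic, nonlinear in x1.
random.seed(1)
n = 300
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
y = [math.exp(a) - b for a, b in zip(x1, x2)]

# PRCC: apply the partial correlation to the ranks instead of the raw values.
rank_x1, rank_x2, rank_y = ranks(x1), ranks(x2), ranks(y)
prcc_x1 = partial_correlation(rank_x1, rank_y, rank_x2)
prcc_x2 = partial_correlation(rank_x2, rank_y, rank_x1)
print(prcc_x1, prcc_x2)
```

Working on ranks makes the coefficient sensitive to monotonic relationships rather than strictly linear ones, which is why the exponential dependence on `x1` is still picked up. For more than two inputs, the regression-based three-step procedure applied to the ranks is needed instead of the closed-form formula.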