Uncertainty ranking: PCC and PRCC

Partial Correlation Coefficients (PCC) analyze the influence of the random vector \inputRV = \left( X_1,\ldots,X_{\inputDim} \right) on a random variable Y of interest. Here we attempt to measure the linear relationships between Y and the individual components X_i.

The basic method of hierarchical ordering using Pearson’s coefficients deals with the case where the variable Y depends linearly on the \inputDim variables \left\{ X_1,\ldots,X_{\inputDim} \right\}.

Partial Correlation Coefficients are also useful in this case but provide a different kind of information: the partial correlation coefficient \textrm{PCC}_{X_i,Y} between the variables Y and X_i measures the residual influence of X_i on Y once influences from all other variables X_j have been eliminated. In particular, if X_1 and X_2 are perfectly correlated, then \textrm{PCC}_{X_1,Y} = \textrm{PCC}_{X_2,Y} = 0.

For any variable index i \in \{1, ..., \inputDim\}, the estimation for each partial correlation coefficient \textrm{PCC}_{X_i,Y} uses a sample of size \sampleSize denoted by \left\{ \left(y^{(1)},x_1^{(1)},\ldots,x_{\inputDim}^{(1)} \right),\ldots, \left(y^{(\sampleSize)},x_1^{(\sampleSize)},\ldots,x_{\inputDim}^{(\sampleSize)} \right) \right\} of the vector (Y,X_1,\ldots,X_{\inputDim}). This requires the following three steps to be carried out:

  1. Determine the effect of other variables \left\{ X_j,\ j\neq i \right\} on Y by linear regression; when the values of the variables \left\{ X_j,\ j\neq i \right\} are known, the average forecast for the value of Y is then available in the form of the equation:

    \begin{aligned}
      \widehat{Y} = \widehat{a}_0 + \sum_{k \neq i,\ 1 \leq k \leq \inputDim} \widehat{a}_k X_k
    \end{aligned}

  2. Determine the effect of other variables \left\{ X_j,\ j\neq i \right\} on X_i by linear regression; when the values of the variables \left\{ X_j,\ j\neq i \right\} are known, the average forecast for the value of X_i is then available in the form of the equation:

    \begin{aligned}
      \widehat{X}_i = \widehat{b}_0 + \sum_{k \neq i,\ 1 \leq k \leq \inputDim} \widehat{b}_k X_k
    \end{aligned}

  3. The partial correlation coefficient \textrm{PCC}_{X_i,Y} is then equal to the sample Pearson correlation coefficient \widehat{\rho}_{Y-\widehat{Y},X_i-\widehat{X}_i} estimated for the variables Y-\widehat{Y} and X_i-\widehat{X}_i.

One can then order the \inputDim variables X_1,\ldots, X_{\inputDim} according to the absolute value of the partial correlation coefficients: the higher the value of \left| \textrm{PCC}_{X_i,Y} \right|, the greater the impact the variable X_i has on Y.
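The three-step estimation above can be sketched in Python (a minimal illustration on hypothetical synthetic data; NumPy's least-squares solver stands in for any linear-regression routine, and the function name `pcc` is ours):

```python
import numpy as np

def pcc(x, y, i):
    """Estimate PCC_{X_i, Y} from a sample x of shape (n, d) and y of shape (n,)."""
    n, d = x.shape
    others = np.delete(x, i, axis=1)           # columns X_j, j != i
    A = np.column_stack([np.ones(n), others])  # regression design with intercept
    # Step 1: regress Y on the other variables, keep the residual Y - Yhat.
    res_y = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
    # Step 2: regress X_i on the other variables, keep the residual X_i - Xihat.
    res_xi = x[:, i] - A @ np.linalg.lstsq(A, x[:, i], rcond=None)[0]
    # Step 3: Pearson correlation of the two residuals.
    return np.corrcoef(res_y, res_xi)[0, 1]

rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 3))
# Y depends linearly on X_0 and X_1 but not on X_2.
y = 2.0 * x[:, 0] + 0.5 * x[:, 1] + rng.normal(scale=0.1, size=1000)
print([round(pcc(x, y, i), 2) for i in range(3)])
```

On this sample, |PCC| is large for X_0 and X_1 and close to zero for X_2, reproducing the ranking one would expect from the model.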

In order to introduce the PRCC, we first define the rank of an observation of a random variable. Let X be a random variable and let \{x_1, ..., x_{\sampleSize}\} be a sample of size \sampleSize. We can sort the sample in increasing order:

\begin{aligned}
  x_{(1)} \leq x_{(2)} \leq \ldots \leq x_{(\sampleSize)}
\end{aligned}

For any i \in \{1, ..., \sampleSize\}, the index j:=\text{rank}(x_i) \in \{1, ..., \sampleSize\} is the rank of the i-th observation if x_i is the j-th smallest observation in the sample. In other words, the observation x_i appears at the j-th index in the ordered sample \{x_{(1)}, ..., x_{(\sampleSize)}\}.
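A small sketch of this rank transform (the helper name `ranks` is ours; ties are assumed absent):

```python
import numpy as np

def ranks(sample):
    """Return the 1-based rank of each observation, assuming distinct values."""
    order = np.argsort(sample)  # indices that sort the sample in increasing order
    r = np.empty(len(sample), dtype=int)
    r[order] = np.arange(1, len(sample) + 1)
    return r

print(ranks(np.array([0.3, 1.2, 0.7])))  # -> [1 3 2]
```

Here 0.3 is the smallest observation (rank 1), 1.2 the largest (rank 3), and 0.7 sits in between (rank 2).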

Now that the rank of a random variable is defined, consider again the case of an input random vector \inputRV = \left( X_1,\ldots,X_{\inputDim} \right) and the output random variable Y. The Partial Rank Correlation Coefficients (PRCC) are PCC coefficients computed on the rank of the input variables \text{rank}(\inputRV) = \left( \text{rank}(X_1),\ldots, \text{rank}(X_{\inputDim}) \right) and the rank of the output variable \text{rank}(Y).
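As a hedged, self-contained sketch (synthetic data; helper names are ours), the PRCC amounts to applying the same partial-correlation procedure to rank-transformed data, which captures monotone but possibly nonlinear dependence:

```python
import numpy as np

def to_ranks(a):
    """1-based ranks of a 1-D sample, assuming distinct values."""
    order = np.argsort(a)
    r = np.empty(len(a))
    r[order] = np.arange(1, len(a) + 1)
    return r

def prcc(x, y, i):
    """PRCC_{X_i, Y}: partial correlation computed on ranks."""
    n, d = x.shape
    rx = np.column_stack([to_ranks(x[:, j]) for j in range(d)])
    ry = to_ranks(y)
    A = np.column_stack([np.ones(n), np.delete(rx, i, axis=1)])
    res_y = ry - A @ np.linalg.lstsq(A, ry, rcond=None)[0]
    res_xi = rx[:, i] - A @ np.linalg.lstsq(A, rx[:, i], rcond=None)[0]
    return np.corrcoef(res_y, res_xi)[0, 1]

rng = np.random.default_rng(1)
x = rng.uniform(size=(500, 2))
# Y is a monotone but nonlinear function of X_0; X_1 is irrelevant.
y = np.exp(x[:, 0]) + 0.1 * rng.normal(size=500)
print(round(prcc(x, y, 0), 2))
```

Because exp is monotone, the rank-based coefficient for X_0 stays close to 1 even though the relationship is not linear, which is the motivation for preferring PRCC over PCC when monotone nonlinearity is suspected.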