Uncertainty ranking: PCC and PRCC

Partial Correlation Coefficients (PCC) quantify the influence that the components of the random vector \vect{X} = \left( X_1,\ldots,X_{n_X} \right) have on a random variable Y whose uncertainty is being studied. They measure the linear relationships that exist between Y and the individual components X_i.

The basic method of ranking with Pearson’s coefficients addresses the case where the variable Y depends linearly on the n_X variables \left\{ X_1,\ldots,X_{n_X} \right\}, but it can be misleading when statistical dependencies or interactions between the variables X_i (e.g. a crossed term X_i \times X_j) exist. In such a situation, partial correlation coefficients can be more useful for ranking the sources of uncertainty: the partial correlation coefficient \textrm{PCC}_{X_i,Y} between the variables Y and X_i measures the residual influence of X_i on Y once the influence of all the other variables X_j, j \neq i, has been removed.

Each partial correlation coefficient \textrm{PCC}_{X_i,Y} is estimated from a sample of N values \left\{ \left(y^{(1)},x_1^{(1)},\ldots,x_{n_X}^{(1)} \right),\ldots, \left(y^{(N)},x_1^{(N)},\ldots,x_{n_X}^{(N)} \right) \right\} of the vector (Y,X_1,\ldots,X_{n_X}). The estimation is carried out in three steps:

  1. Determine the effect of the other variables \left\{ X_j,\ j\neq i \right\} on Y by linear regression: when the values of the variables \left\{ X_j,\ j\neq i \right\} are known, the average prediction of Y is given by:

    \begin{aligned}
      \widehat{Y} = \sum_{k \neq i,\ 1 \leq k \leq n_X} \widehat{a}_k X_k
    \end{aligned}

  2. Determine the effect of the other variables \left\{ X_j,\ j\neq i \right\} on X_i by linear regression: when the values of the variables \left\{ X_j,\ j\neq i \right\} are known, the average prediction of X_i is given by:

    \begin{aligned}
      \widehat{X}_i = \sum_{k \neq i,\ 1 \leq k \leq n_X} \widehat{b}_k X_k
    \end{aligned}

  3. \textrm{PCC}_{X_i,Y} is then equal to the Pearson correlation coefficient \widehat{\rho}_{Y-\widehat{Y},X_i-\widehat{X}_i} estimated for the variables Y-\widehat{Y} and X_i-\widehat{X}_i on the N-sample of simulations.
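The three steps above can be sketched with NumPy as follows. This is only an illustrative sketch, not the implementation of any particular library: the function name `pcc` and the synthetic sample are hypothetical, and the sample is centered beforehand so that the two regressions can be written without a constant term, as in the formulas above (that centering is an assumption of this sketch).

```python
import numpy as np

def pcc(x, y):
    """Estimate PCC_{X_i,Y} for each column i of x (shape N x n_X)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Assumption of this sketch: center the sample so that the
    # regressions below need no constant term.
    xc = x - x.mean(axis=0)
    yc = y - y.mean()
    n_x = xc.shape[1]
    coefficients = np.empty(n_x)
    for i in range(n_x):
        others = np.delete(xc, i, axis=1)  # the X_j, j != i
        # Step 1: linear regression of Y on the other variables.
        a_hat, *_ = np.linalg.lstsq(others, yc, rcond=None)
        y_residual = yc - others @ a_hat
        # Step 2: linear regression of X_i on the other variables.
        b_hat, *_ = np.linalg.lstsq(others, xc[:, i], rcond=None)
        x_residual = xc[:, i] - others @ b_hat
        # Step 3: Pearson correlation between the two residuals.
        coefficients[i] = np.corrcoef(y_residual, x_residual)[0, 1]
    return coefficients

# Hypothetical sample: Y depends strongly on X_1, weakly on X_2, not on X_3.
rng = np.random.default_rng(0)
x_sample = rng.normal(size=(1000, 3))
y_sample = 3.0 * x_sample[:, 0] + 0.5 * x_sample[:, 1] + rng.normal(size=1000)
print(pcc(x_sample, y_sample))
```

On such a sample one would expect the first coefficient to be the largest in absolute value and the third to be close to zero.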

One can then rank the n_X variables X_1,\ldots, X_{n_X} according to the absolute value of the partial correlation coefficients: the higher the value of \left| \textrm{PCC}_{X_i,Y} \right|, the greater the impact the variable X_i has on Y.
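As a small illustration of this ranking, the sketch below sorts a set of coefficients by decreasing absolute value. The helper name `rank_by_abs` and the numerical values are hypothetical and chosen only for the example.

```python
import numpy as np

def rank_by_abs(coefficients, names):
    """Return (name, coefficient) pairs sorted by decreasing |coefficient|."""
    coefficients = np.asarray(coefficients, dtype=float)
    # A stable sort keeps the original order of inputs with equal |PCC|.
    order = np.argsort(-np.abs(coefficients), kind="stable")
    return [(names[k], float(coefficients[k])) for k in order]

# Hypothetical PCC values, chosen only for illustration.
ranking = rank_by_abs([0.12, -0.85, 0.40], ["X1", "X2", "X3"])
print(ranking)  # [('X2', -0.85), ('X3', 0.4), ('X1', 0.12)]
```

The sign of a coefficient is kept in the output because it indicates the direction of the linear effect, while only its magnitude drives the ranking.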

Partial Rank Correlation Coefficients (PRCC) are partial correlation coefficients (PCC) computed on the ranked input variables r\vect{X} = \left( rX_1,\ldots,rX_{n_X} \right) and the ranked output variable rY.
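A minimal sketch of the PRCC computation is given below: each variable is replaced by its ranks within the sample, and the PCC procedure is then applied to the ranks. The function names `ranks`, `partial_correlations` and `prcc` are hypothetical. The sketch assumes a sample without ties (ranks are obtained by a double `argsort`) and centers the data so that the regressions need no constant term; both are assumptions of this example.

```python
import numpy as np

def ranks(a):
    """Ranks 1..N along axis 0 (assumes no ties in the sample)."""
    a = np.asarray(a, dtype=float)
    return np.argsort(np.argsort(a, axis=0), axis=0) + 1.0

def partial_correlations(x, y):
    """PCC of y with each column of x, via the two-regression procedure."""
    xc = x - x.mean(axis=0)
    yc = y - y.mean()
    out = np.empty(xc.shape[1])
    for i in range(xc.shape[1]):
        others = np.delete(xc, i, axis=1)
        a_hat, *_ = np.linalg.lstsq(others, yc, rcond=None)
        b_hat, *_ = np.linalg.lstsq(others, xc[:, i], rcond=None)
        out[i] = np.corrcoef(yc - others @ a_hat,
                             xc[:, i] - others @ b_hat)[0, 1]
    return out

def prcc(x, y):
    """PRCC: the PCC procedure applied to the ranked sample (rX, rY)."""
    return partial_correlations(ranks(x), ranks(y))

# Hypothetical sample: Y is monotone but non-linear in X_1.
rng = np.random.default_rng(0)
x_sample = rng.normal(size=(1000, 3))
y_sample = np.exp(x_sample[:, 0]) + 0.3 * x_sample[:, 1]
print(prcc(x_sample, y_sample))
```

Working on ranks makes the coefficients sensitive to monotone relationships rather than strictly linear ones, which is why a monotone non-linear effect such as the exponential term in this example is still expected to be detected.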