Uncertainty ranking: PCC

This method analyzes the influence that the random vector \vect{X} = \left( X^1,\ldots,X^{n_X} \right) has on a random variable Y^j whose uncertainty is being studied. Here we aim to measure the linear relationships between Y^j and the individual components X^i.

The basic hierarchical ranking method based on Pearson’s coefficients addresses the case where the variable Y^j depends linearly on the n_X variables \left\{ X^1,\ldots,X^{n_X} \right\}, but it can be misleading when statistical dependencies or interactions between the variables X^i (e.g. a crossed term X^i \times X^k) exist. In such a situation, the partial correlation coefficients can be more useful for ranking the sources of uncertainty: the partial correlation coefficient \textrm{PCC}_{X^i,Y^j} between the variables Y^j and X^i measures the residual influence of X^i on Y^j once the linear influence of all other variables X^k, k \neq i, has been removed.
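The following minimal sketch illustrates why Pearson’s coefficient alone can mislead when the inputs are dependent. It is an illustration under stated assumptions, not part of the method itself: the synthetic model, the variable names (x1, x2, y) and the numerical settings are all hypothetical, and the partial correlation is computed here with the standard closed-form expression for a single controlling variable rather than with the regression-based procedure described in this section.

```python
import numpy as np

# Hypothetical synthetic data: x2 is strongly correlated with x1,
# but y depends on x1 only (plus a small noise term).
rng = np.random.default_rng(1)
n = 2000
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + np.sqrt(1.0 - 0.9**2) * rng.normal(size=n)
y = x1 + 0.1 * rng.normal(size=n)

# Pairwise Pearson correlations between (x2, y, x1); rows are variables.
r = np.corrcoef([x2, y, x1])
r_2y, r_21, r_y1 = r[0, 1], r[0, 2], r[1, 2]

# Plain Pearson coefficient between x2 and y: large, although x2 has
# no direct effect on y.
pearson_x2_y = r_2y

# First-order partial correlation of x2 and y given x1 (closed form,
# valid when there is exactly one controlling variable).
pcc_x2_y = (r_2y - r_21 * r_y1) / np.sqrt((1.0 - r_21**2) * (1.0 - r_y1**2))

print(f"Pearson(x2, y) = {pearson_x2_y:.3f}")
print(f"PCC(x2, y | x1) = {pcc_x2_y:.3f}")
```

In this sketch the Pearson coefficient between x2 and y comes out large because x2 carries information about x1, whereas the partial correlation is close to zero once the influence of x1 has been removed.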

Each partial correlation coefficient \textrm{PCC}_{X^i,Y^j} is estimated from a sample of N realizations \left\{ (y^j_1,x_1^1,\ldots,x_1^{n_X}),\ldots,(y^j_N,x_N^1,\ldots,x_N^{n_X}) \right\} of the vector (Y^j,X^1,\ldots,X^{n_X}). The estimation is carried out in three steps:

  1. Determine the effect of the other variables \left\{ X^k,\ k\neq i \right\} on Y^j by linear regression: when the values of the variables \left\{ X^k,\ k\neq i \right\} are known, the mean prediction of Y^j is given by:

    \begin{aligned}
      \widehat{Y^j} = \sum_{k \neq i,\ 1 \leq k \leq n_X} \widehat{a}_k X^k
    \end{aligned}

  2. Determine the effect of the other variables \left\{ X^k,\ k\neq i \right\} on X^i by linear regression: when the values of the variables \left\{ X^k,\ k\neq i \right\} are known, the mean prediction of X^i is given by:

    \begin{aligned}
      \widehat{X}^i = \sum_{k \neq i,\ 1 \leq k \leq n_X} \widehat{b}_k X^k
    \end{aligned}

  3. \textrm{PCC}_{X^i,Y^j} is then the Pearson correlation coefficient \widehat{\rho}_{Y^j-\widehat{Y^j},X^i-\widehat{X}^i} estimated between the residuals Y^j-\widehat{Y^j} and X^i-\widehat{X}^i on the N-sample of simulations.
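The three steps above can be sketched with NumPy as follows. This is a minimal illustration, not a reference implementation: the function name partial_correlation, the synthetic sample and its coefficients are hypothetical, and the regressions include a constant column, which is an assumption of this sketch (the regression equations above are written without a constant term, which is equivalent when the data are centered).

```python
import numpy as np


def partial_correlation(x, y, i):
    """Estimate the PCC between column i of x and y (hypothetical helper).

    x is an (N, n_X) array of input samples, y an (N,) array of outputs.
    """
    n = x.shape[0]
    # Regressors: every input except X^i, plus a constant column
    # (the constant is an assumption of this sketch).
    others = np.column_stack([np.ones(n), np.delete(x, i, axis=1)])

    # Step 1: regress Y on the other inputs and keep the residual.
    a_hat, *_ = np.linalg.lstsq(others, y, rcond=None)
    res_y = y - others @ a_hat

    # Step 2: regress X^i on the other inputs and keep the residual.
    b_hat, *_ = np.linalg.lstsq(others, x[:, i], rcond=None)
    res_x = x[:, i] - others @ b_hat

    # Step 3: Pearson correlation between the two residuals.
    return np.corrcoef(res_y, res_x)[0, 1]


# Hypothetical sample: Y depends on the first two inputs only.
rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 3))
y = 2.0 * x[:, 0] - 1.0 * x[:, 1] + 0.1 * rng.normal(size=1000)

pcc = [partial_correlation(x, y, i) for i in range(x.shape[1])]
print(pcc)
```

With this synthetic model, the first two coefficients are expected to be close to +1 and -1 respectively, and the third close to zero, since the third input does not enter the model.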

One can then rank the n_X variables X^1,\ldots, X^{n_X} according to the absolute value of their partial correlation coefficients: the higher the value of \left| \textrm{PCC}_{X^i,Y^j} \right|, the greater the impact the variable X^i has on Y^j.
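As a small illustration of this ranking rule, the sketch below sorts input variables by decreasing absolute value of their coefficient. The variable names and the PCC values are made up for the example and do not come from any actual study.

```python
# Hypothetical PCC estimates, for illustration only.
pcc = {"X1": 0.12, "X2": -0.85, "X3": 0.47}

# Rank by decreasing |PCC|: the sign gives the direction of the effect,
# the absolute value gives its strength.
ranking = sorted(pcc, key=lambda name: abs(pcc[name]), reverse=True)

for rank, name in enumerate(ranking, start=1):
    print(f"{rank}. {name}: PCC = {pcc[name]:+.2f}")
```

Here X2 comes first despite its negative coefficient, because only the magnitude matters for the ranking.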

References:

  • Saltelli, A., Chan, K., Scott, M. (2000). “Sensitivity Analysis”, John Wiley & Sons publishers, Probability and Statistics series

  • J.C. Helton, F.J. Davis (2003). “Latin Hypercube sampling and the propagation of uncertainty in analyses of complex systems”. Reliability Engineering and System Safety 81, p.23-69

  • J.P.C. Kleijnen, J.C. Helton (1999). “Statistical analyses of scatterplots to identify factors in large-scale simulations, part 1: review and comparison of techniques”. Reliability Engineering and System Safety 65, p.147-185