Sensitivity analysis using Sobol’ indices¶
Consider the input random vector $\mathbf{X} = (X_1, \ldots, X_d)$ and let $\mathbf{Y} = (Y_1, \ldots, Y_p)$ be the output of the physical model:
$$\mathbf{Y} = g(\mathbf{X}).$$
We consider the output $Y_k$ for any index $k \in \{1, \ldots, p\}$. Sobol’ indices measure the influence of the input $\mathbf{X}$ on the output $Y_k$. The method considers the part of the variance of the output $Y_k$ produced by the different inputs $X_i$.
In the first part of this document, we introduce the Sobol’ indices of a scalar output $Y$. Hence, the model is simplified to:
$$Y = g(\mathbf{X}).$$
In the second part of the document, we consider the general case where the output is multivariate. In this case, aggregated Sobol’ indices can be used [gamboa2013].
The Sobol’ decomposition is described more easily when the domain of the input is the unit cube $[0,1]^d$. It can be easily extended to any input domain using expectations, variances and variances of conditional expectations.
We assume that the input marginal variables are independent. This restrictive hypothesis implies that the only copula of the input random vector for which the Sobol’ indices are easy to interpret is the independent copula. If the input variables are dependent, then the Sobol’ indices can be defined, but some of their properties are lost.
Partition of the input¶
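For any $i \in \{1, \ldots, d\}$, let $\mathbf{X}_{\overline{\{i\}}} \in \mathbb{R}^{d-1}$ be the vector made of the components of $\mathbf{X}$ whose indices are different from $i$. Hence, if $\mathbf{x} \in \mathbb{R}^d$, then:
$$\mathbf{x}_{\overline{\{i\}}} = (x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_d)^T.$$
Consider the function $\tilde{g}$ defined by the equation:
$$\tilde{g}(x_i, \mathbf{x}_{\overline{\{i\}}}) = g(\mathbf{x})$$
where $\mathbf{x} \in \mathbb{R}^d$. With this notation, we can partition the input of $g$:
$$Y = \tilde{g}(X_i, \mathbf{X}_{\overline{\{i\}}}).$$
The goal of sensitivity analysis is to measure the sensitivity of the variance of the output $Y$ to the variable $X_i$. This may take into account the dependence of the output on the interactions of $X_i$ and $\mathbf{X}_{\overline{\{i\}}}$ through the function $g$.
More generally, let $u \subseteq \{1, \ldots, d\}$ be a group of variables and let $\overline{u}$ be its complementary group. Then:
$$Y = \tilde{g}(\mathbf{X}_u, \mathbf{X}_{\overline{u}}).$$
The goal of sensitivity analysis is to measure the sensitivity of the variance of the output $Y$ to the group of variables $\mathbf{X}_u$. This may take into account the dependence of the output on the interactions of $\mathbf{X}_u$ and $\mathbf{X}_{\overline{u}}$ through the function $g$.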
Sobol’ decomposition¶
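In this section, we introduce the Sobol’-Hoeffding decomposition [sobol1993]. If $g$ can be integrated in $[0,1]^d$, then there is a unique decomposition:
$$g(\mathbf{x}) = h_0 + \sum_{i=1}^d h_{\{i\}}(x_i) + \sum_{1 \leq i < j \leq d} h_{\{i,j\}}(x_i, x_j) + \cdots + h_{\{1, \ldots, d\}}(x_1, \ldots, x_d)$$
where $h_0$ is a constant and the functions of the decomposition satisfy the equalities:
$$\int_0^1 h_{\{i_1, \ldots, i_s\}}(x_{i_1}, \ldots, x_{i_s}) \, dx_{i_k} = 0$$
for any $k = 1, \ldots, s$ and any indices $1 \leq i_1 < i_2 < \cdots < i_s \leq d$ and $s = 1, \ldots, d$.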
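As an illustration (our own example, not taken from the source), consider the product function $g(x_1, x_2) = x_1 x_2$ on $[0,1]^2$. Assuming this function, the terms of the decomposition can be computed by hand, and each non-constant term integrates to zero with respect to any of its own variables:

```latex
\begin{aligned}
h_0 &= \int_0^1 \int_0^1 x_1 x_2 \, dx_1 \, dx_2 = \frac{1}{4}, \\
h_{\{1\}}(x_1) &= \int_0^1 x_1 x_2 \, dx_2 - h_0 = \frac{x_1}{2} - \frac{1}{4}, \\
h_{\{2\}}(x_2) &= \int_0^1 x_1 x_2 \, dx_1 - h_0 = \frac{x_2}{2} - \frac{1}{4}, \\
h_{\{1,2\}}(x_1, x_2) &= x_1 x_2 - h_0 - h_{\{1\}}(x_1) - h_{\{2\}}(x_2)
  = \left(x_1 - \frac{1}{2}\right)\left(x_2 - \frac{1}{2}\right).
\end{aligned}
```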
Extension to any input distribution with independent marginals¶
In this section, we extend the previous definitions to an input random vector $\mathbf{X}$ that is not necessarily defined on the unit cube $[0,1]^d$. To do this, we define the functions $h_u$ using conditional expectations.
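The functions $h_u$ satisfy the equality:
$$\mathbb{E}[Y \mid \mathbf{X}_u = \mathbf{x}_u] = \sum_{v \subseteq u} h_v(\mathbf{x}_v)$$
for any group of variables $u \subseteq \{1, \ldots, d\}$ with size $|u|$ lower than or equal to $d$, where $|u|$ is the cardinal of the subset $u$. The functions can be defined recursively, using groups of variables of lower dimensionality:
$$h_\emptyset = \mathbb{E}[Y], \qquad h_u(\mathbf{x}_u) = \mathbb{E}[Y \mid \mathbf{X}_u = \mathbf{x}_u] - \sum_{v \subset u} h_v(\mathbf{x}_v)$$
where $v \subset u$ denotes a proper subset. Let $\mathbf{x} \in \mathbb{R}^d$ be a point and let $u \subseteq \{1, \ldots, d\}$ be a group of variables. The Möbius inversion formula [rota1964] (also called the inclusion-exclusion principle) implies (see [daveiga2022] Theorem 3.3 page 49):
$$h_u(\mathbf{x}_u) = \sum_{v \subseteq u} (-1)^{|u| - |v|} \, \mathbb{E}[Y \mid \mathbf{X}_v = \mathbf{x}_v].$$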
Decomposition of the variance¶
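The variance of the function $g$ can be decomposed into:
$$\operatorname{Var}(Y) = \sum_{i=1}^d V_{\{i\}} + \sum_{1 \leq i < j \leq d} V_{\{i,j\}} + \cdots + V_{\{1, \ldots, d\}}$$
where the interaction variances are:
$$V_{\{i\}} = \operatorname{Var}(h_{\{i\}}(X_i)), \qquad V_{\{i,j\}} = \operatorname{Var}(h_{\{i,j\}}(X_i, X_j)), \qquad \ldots$$
More generally, the interaction variance of a group of variables $u$ is:
$$V_u = \operatorname{Var}(h_u(\mathbf{X}_u))$$
for any $u \subseteq \{1, \ldots, d\}$. Using the Hoeffding decomposition, we get:
$$\operatorname{Var}(\mathbb{E}[Y \mid \mathbf{X}_u]) = \sum_{v \subseteq u} V_v.$$
The Möbius inversion formula implies (see [daveiga2022] corollary 3.5 page 52):
$$V_u = \sum_{v \subseteq u} (-1)^{|u| - |v|} \operatorname{Var}(\mathbb{E}[Y \mid \mathbf{X}_v])$$
where $|u|$ is the cardinal of the group $u$.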
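The Möbius inversion step can be sketched in a few lines of Python. This is a minimal illustration, not the library's implementation: the helper names `subsets` and `interaction_variances` are ours, and the input values are the variances of conditional expectations of the hypothetical model $Y = X_1 X_2$ with independent uniform inputs on $[0,1]$.

```python
from fractions import Fraction
from itertools import combinations


def subsets(u):
    """Yield every subset of the tuple u (empty set included) as a frozenset."""
    for size in range(len(u) + 1):
        for v in combinations(u, size):
            yield frozenset(v)


def interaction_variances(closed):
    """Moebius inversion: V_u = sum over v in u of (-1)^(|u|-|v|) * closed[v]."""
    result = {}
    for u in closed:
        total = Fraction(0)
        for v in subsets(tuple(sorted(u))):
            total += (-1) ** (len(u) - len(v)) * closed[v]
        result[u] = total
    return result


# Assumed input: Var(E[Y | X_v]) for the hypothetical model Y = X1 * X2,
# with X1 and X2 independent and uniform on [0, 1].
closed = {
    frozenset(): Fraction(0),
    frozenset({1}): Fraction(1, 48),
    frozenset({2}): Fraction(1, 48),
    frozenset({1, 2}): Fraction(7, 144),
}
V = interaction_variances(closed)
```

With these inputs, the interaction variances add up to the variance of the output, which is the entry of `closed` for the full group.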
Interaction sensitivity index of a variable¶
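The first order interaction sensitivity indices are equal to:
$$S_{\{i\}} = \frac{V_{\{i\}}}{\operatorname{Var}(Y)}, \qquad S_{\{i,j\}} = \frac{V_{\{i,j\}}}{\operatorname{Var}(Y)}, \qquad \ldots$$
The first order Sobol’ index $S_{\{i\}}$ measures the part of the variance of $Y$ explained by $X_i$ alone. The second order Sobol’ index $S_{\{i,j\}}$ measures the part of the variance of $Y$ explained by the interaction of $X_i$ and $X_j$.
More generally, the first order interaction Sobol’ index of a group of variables $u$ is:
$$S_u = \frac{V_u}{\operatorname{Var}(Y)} = \frac{\operatorname{Var}(h_u(\mathbf{X}_u))}{\operatorname{Var}(Y)}$$
where $h_u$ is the function of the input variables in the group $u$ of the functional Sobol’-Hoeffding ANOVA decomposition of the physical model. This index measures the part of the variance of the output explained by interactions within the group.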
Total interaction sensitivity index of a group of variables¶
The total interaction sensitivity index of the group $u$ is (see [liu2006] eq. 8 page 714 where it is named “superset importance”):
$$S_u^{T,i} = \sum_{v \supseteq u} S_v.$$
This index measures the part of the variance of the output explained by interactions within the group $u$ and within groups of variables containing it.
First order Sobol’ sensitivity index of a variable¶
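The first order Sobol’ sensitivity index is equal to the corresponding interaction index of the group $u = \{i\}$:
$$S_i = S_{\{i\}} = \frac{V_{\{i\}}}{\operatorname{Var}(Y)}$$
for $i = 1, \ldots, d$. The first order Sobol’ index measures the part of the output variance explained by the effect of $X_i$ alone. We can alternatively define the first order Sobol’ sensitivity index using the variance of a conditional expectation. The first order Sobol’ sensitivity index satisfies the equation:
$$S_i = \frac{V_i}{\operatorname{Var}(Y)}, \qquad V_i = \operatorname{Var}(\mathbb{E}[Y \mid X_i])$$
for $i = 1, \ldots, d$.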
Total sensitivity index of a variable¶
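The total Sobol’ sensitivity index is:
$$S_i^T = \sum_{u \ni i} S_u$$
for $i = 1, \ldots, d$. The total Sobol’ sensitivity index can be equivalently defined in terms of the variance of a conditional expectation. The total Sobol’ sensitivity index satisfies the equation:
$$S_i^T = 1 - \frac{\operatorname{Var}(\mathbb{E}[Y \mid \mathbf{X}_{\overline{\{i\}}}])}{\operatorname{Var}(Y)}$$
for $i = 1, \ldots, d$. For any $i \in \{1, \ldots, d\}$, let us define
$$V_i^T = \mathbb{E}[\operatorname{Var}(Y \mid \mathbf{X}_{\overline{\{i\}}})] = \operatorname{Var}(Y) - \operatorname{Var}(\mathbb{E}[Y \mid \mathbf{X}_{\overline{\{i\}}}]).$$
Total Sobol’ indices satisfy the equality:
$$S_i^T = \frac{V_i^T}{\operatorname{Var}(Y)}$$
for $i = 1, \ldots, d$.
The total Sobol’ index $S_i^T$ measures the part of the variance of $Y$ explained by $X_i$ and its interactions with other input variables. It can also be viewed as the part of the variance of $Y$ that cannot be explained without $X_i$.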
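As a small worked illustration (our own example, not from the source), take $Y = X_1 X_2$ with $X_1$ and $X_2$ independent and uniform on $[0,1]$, so that the complement of $\{1\}$ is $\{2\}$. Under these assumptions, the first order and total indices of $X_1$ follow from the conditional-expectation formulas:

```latex
\begin{aligned}
\operatorname{Var}(Y) &= \mathbb{E}[X_1^2]\,\mathbb{E}[X_2^2] - \left(\mathbb{E}[X_1]\,\mathbb{E}[X_2]\right)^2
  = \frac{1}{9} - \frac{1}{16} = \frac{7}{144}, \\
\operatorname{Var}(\mathbb{E}[Y \mid X_1]) &= \operatorname{Var}\!\left(\frac{X_1}{2}\right) = \frac{1}{48}
  = \operatorname{Var}(\mathbb{E}[Y \mid X_2]), \\
S_1 &= \frac{1/48}{7/144} = \frac{3}{7}, \\
S_1^T &= 1 - \frac{\operatorname{Var}(\mathbb{E}[Y \mid X_2])}{\operatorname{Var}(Y)} = 1 - \frac{3}{7} = \frac{4}{7}.
\end{aligned}
```

The gap $S_1^T - S_1 = 1/7$ is the share of the variance carried by the interaction of $X_1$ and $X_2$.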
First order closed sensitivity index of a group of variables¶
Let $u \subseteq \{1, \ldots, d\}$ be a group of input variables. The first order (closed) Sobol’ index of a group of input variables $u$ is:
$$S_u^{\mathrm{cl}} = \frac{\operatorname{Var}(\mathbb{E}[Y \mid \mathbf{X}_u])}{\operatorname{Var}(Y)} = \sum_{v \subseteq u} S_v.$$
The first order closed Sobol’ index of a group of input variables measures the part of the variance of $Y$ explained by the variables within the group. This index is useful when the group contains random variables parameterizing a single uncertainty source (see [knio2010] page 139).
Total sensitivity index of a group of variables¶
The total Sobol’ index of a group of variables $u$ is:
$$S_u^T = \sum_{v \cap u \neq \emptyset} S_v = \frac{1}{\operatorname{Var}(Y)} \sum_{v \cap u \neq \emptyset} \operatorname{Var}(h_v(\mathbf{X}_v))$$
where $h_v$ is the function of the variables in the group $v$ of the functional Sobol’-Hoeffding ANOVA decomposition of the physical model. The total Sobol’ index of a group of input variables measures the part of the variance of $Y$ explained by the variables within the group and by any group of variables containing any variable in the group. It can also be viewed as the part of the variance of $Y$ that cannot be explained without $\mathbf{X}_u$.
For any group of variables $u$, the total and first order (closed) Sobol’ indices are related by the equation:
$$S_u^T = 1 - S_{\overline{u}}^{\mathrm{cl}}$$
where $\overline{u}$ is the complementary group of $u$.
Summary of Sobol’ indices¶
The next table presents a summary of the 6 different Sobol’ indices that we have presented.
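Single variable or group | Sensitivity index | Formula |
---|---|---|
One single variable $i$ | First order | $S_i = \operatorname{Var}(\mathbb{E}[Y \mid X_i]) / \operatorname{Var}(Y)$ |
 | Total | $S_i^T = 1 - \operatorname{Var}(\mathbb{E}[Y \mid \mathbf{X}_{\overline{\{i\}}}]) / \operatorname{Var}(Y)$ |
Interaction of a group $u$ | First order | $S_u = \operatorname{Var}(h_u(\mathbf{X}_u)) / \operatorname{Var}(Y)$ |
 | Total interaction | $S_u^{T,i} = \sum_{v \supseteq u} S_v$ |
Group (closed) $u$ | First order closed | $S_u^{\mathrm{cl}} = \operatorname{Var}(\mathbb{E}[Y \mid \mathbf{X}_u]) / \operatorname{Var}(Y)$ |
 | Total | $S_u^T = 1 - \operatorname{Var}(\mathbb{E}[Y \mid \mathbf{X}_{\overline{u}}]) / \operatorname{Var}(Y)$ |
Table 1. First order and total Sobol’ indices of a single variable $i$ or a group $u$.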
Let us summarize the properties of the Sobol’ indices.
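All these indices are in the $[0,1]$ interval.
The sum of interaction first order Sobol’ indices is equal to 1:
$$\sum_{u \subseteq \{1, \ldots, d\},\, u \neq \emptyset} S_u = 1.$$
Each first order index is lower than or equal to its total counterpart:
$$S_i \leq S_i^T, \qquad S_u \leq S_u^{T,i}, \qquad S_u^{\mathrm{cl}} \leq S_u^T.$$
If $S_i < S_i^T$, there are interactions between the variable $X_i$ and other variables.
If $S_i = S_i^T$ for $i = 1, \ldots, d$, then the function $g$ is additive, i.e. the function is the sum of functions of input dimension 1:
$$g(\mathbf{x}) = h_0 + \sum_{i=1}^d h_{\{i\}}(x_i).$$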
Example¶
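Let us consider a function $g$ which has $d = 3$ inputs $X_1, X_2, X_3$. The full set of interaction indices is:
$$S_{\{1\}}, \; S_{\{2\}}, \; S_{\{3\}}, \; S_{\{1,2\}}, \; S_{\{1,3\}}, \; S_{\{2,3\}}, \; S_{\{1,2,3\}}.$$
Each Sobol’ index combines a subset of the previous interaction indices. For example, the first order and total Sobol’ indices are presented in the next table.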
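Variable | First order | Total |
---|---|---|
$X_1$ | $S_{\{1\}}$ | $S_{\{1\}} + S_{\{1,2\}} + S_{\{1,3\}} + S_{\{1,2,3\}}$ |
$X_2$ | $S_{\{2\}}$ | $S_{\{2\}} + S_{\{1,2\}} + S_{\{2,3\}} + S_{\{1,2,3\}}$ |
$X_3$ | $S_{\{3\}}$ | $S_{\{3\}} + S_{\{1,3\}} + S_{\{2,3\}} + S_{\{1,2,3\}}$ |
Table 2. First order and total Sobol’ indices of the variables $X_1$, $X_2$ and $X_3$.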
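The list of possible groups is $\{1,2\}$, $\{1,3\}$, $\{2,3\}$ and $\{1,2,3\}$. The next table presents the Sobol’ indices of the group $u = \{1,2\}$.
Sobol’ index of group $u = \{1,2\}$ | Value |
---|---|
Group interaction | $S_{\{1,2\}}$ |
Group total interaction | $S_{\{1,2\}} + S_{\{1,2,3\}}$ |
Group first order (closed) | $S_{\{1\}} + S_{\{2\}} + S_{\{1,2\}}$ |
Group total | $S_{\{1\}} + S_{\{2\}} + S_{\{1,2\}} + S_{\{1,3\}} + S_{\{2,3\}} + S_{\{1,2,3\}}$ |
Table 3. Sobol’ indices of the group $u = \{1,2\}$.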
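The way each index combines interaction indices can be sketched in Python. This is an illustrative sketch under stated assumptions: the numerical values of the interaction indices are hypothetical (chosen only so that they sum to 1), and the function names are ours, not part of any library.

```python
from fractions import Fraction

# Hypothetical interaction indices S_u for d = 3 inputs (they sum to 1).
S = {
    frozenset({1}): Fraction(30, 100),
    frozenset({2}): Fraction(20, 100),
    frozenset({3}): Fraction(10, 100),
    frozenset({1, 2}): Fraction(15, 100),
    frozenset({1, 3}): Fraction(5, 100),
    frozenset({2, 3}): Fraction(10, 100),
    frozenset({1, 2, 3}): Fraction(10, 100),
}


def interaction(u):
    """Interaction index of the group u."""
    return S[frozenset(u)]


def total_interaction(u):
    """Sum of the interaction indices of all groups containing u."""
    u = frozenset(u)
    return sum(s for v, s in S.items() if v >= u)


def closed(u):
    """Sum of the interaction indices of all groups included in u."""
    u = frozenset(u)
    return sum(s for v, s in S.items() if v <= u)


def total(u):
    """Sum of the interaction indices of all groups sharing a variable with u."""
    u = frozenset(u)
    return sum(s for v, s in S.items() if v & u)


group = {1, 2}
indices = {
    "interaction": interaction(group),
    "total interaction": total_interaction(group),
    "first order (closed)": closed(group),
    "total": total(group),
}
```

For a single variable, `closed({1})` gives the first order index and `total({1})` the total index; the total index of a group and the closed index of its complement add up to 1.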
Aggregated Sobol’ indices¶
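For multivariate outputs, i.e. when $p > 1$, the Sobol’ indices can be aggregated [gamboa2013]. Let $V_i^{(k)}$ be the (first order) variance of the conditional expectation of the k-th output $Y_k$:
$$V_i^{(k)} = \operatorname{Var}(\mathbb{E}[Y_k \mid X_i])$$
for $i = 1, \ldots, d$ and $k = 1, \ldots, p$. Similarly, let $V_i^{T,(k)} = \operatorname{Var}(Y_k) - \operatorname{Var}(\mathbb{E}[Y_k \mid \mathbf{X}_{\overline{\{i\}}}])$ be the total variance associated with $X_i$ for the output $Y_k$, for $i = 1, \ldots, d$ and $k = 1, \ldots, p$.
The indices can be aggregated with the following formulas:
$$S_i = \frac{\sum_{k=1}^p V_i^{(k)}}{\sum_{k=1}^p \operatorname{Var}(Y_k)}, \qquad S_i^T = \frac{\sum_{k=1}^p V_i^{T,(k)}}{\sum_{k=1}^p \operatorname{Var}(Y_k)}$$
for $i = 1, \ldots, d$.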
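The aggregation is a ratio of sums over the outputs, which can be sketched as follows. This is a minimal illustration: the function name `aggregate` and the numerical values are hypothetical, and the same function applies to first order or total variances.

```python
from fractions import Fraction


def aggregate(partial_variances, output_variances):
    """Aggregated index of each input.

    partial_variances[k][i] is the variance attributed to input i for output k;
    output_variances[k] is the variance of output k.
    """
    total_variance = sum(output_variances)
    dim = len(partial_variances[0])
    return [
        Fraction(sum(row[i] for row in partial_variances), total_variance)
        for i in range(dim)
    ]


# Hypothetical values for two outputs and two inputs.
output_variances = [4, 1]
first_order_variances = [[2, 1], [0, 1]]
aggregated = aggregate(first_order_variances, output_variances)
```

In this example the per-output first order indices of the first input are 1/2 and 0, and the aggregated index 2/5 is their average weighted by the output variances.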
Estimators¶
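To estimate these quantities, Sobol’ proposes to use numerical methods that rely on two independent realizations of the random vector $\mathbf{X}$. This is known as the pick-freeze estimator.
Let $n$ be the size of each sample. Let $A$ and $B$ be two independent samples of size $n$ of $\mathbf{X}$:
$$A = \begin{pmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,d} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n,1} & a_{n,2} & \cdots & a_{n,d} \end{pmatrix}, \qquad B = \begin{pmatrix} b_{1,1} & b_{1,2} & \cdots & b_{1,d} \\ b_{2,1} & b_{2,2} & \cdots & b_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ b_{n,1} & b_{n,2} & \cdots & b_{n,d} \end{pmatrix}.$$
Each line is a realization of the random vector $\mathbf{X}$.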
We are now going to mix these two samples to get an estimate of the sensitivity indices.
Several estimators of $V_i$, $V_i^T$ and $V_{-i}$ (the variance of the conditional expectation of $Y$ given all inputs except $X_i$) are provided by the SobolIndicesAlgorithm implementations:
SaltelliSensitivityAlgorithm based on [saltelli2002],
JansenSensitivityAlgorithm based on [jansen1999],
MartinezSensitivityAlgorithm based on [martinez2011].
Specific formulas for $\widehat{V}_i$, $\widehat{V}_i^T$ and $\widehat{V}_{-i}$ are given in the corresponding documentation pages.
The estimator $\widehat{V}_{i,j}$ of the second order conditional variance $V_{i,j}$ has the same expression for all these classes.
Notice that the value of the second order conditional variance $\widehat{V}_{i,j}$ depends on the estimators $\widehat{V}_i$ and $\widehat{V}_j$, which are method-dependent. This implies that the value of the second order indices may depend on the specific Sobol’ estimator we use.
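The mixing of the two samples can be sketched in pure Python. This is a minimal sketch under stated assumptions, not the formulas implemented by the classes listed above: it uses one common pair of pick-freeze formulas from the literature (a product-based first order estimator and a Jansen-type squared-difference total estimator), a hypothetical test model `g` with independent standard normal inputs, and output centered with the mean of the combined samples. The function name `pick_freeze` is ours.

```python
import random


def g(x):
    """Hypothetical test model with an interaction between x[1] and x[2]."""
    return x[0] + x[1] * x[2]


def pick_freeze(model, dim, n, seed=0):
    """Estimate first order and total Sobol' indices with two samples A and B."""
    rng = random.Random(seed)
    # Two independent samples: each row is a realization of the input vector.
    A = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    B = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    yA = [model(a) for a in A]
    yB = [model(b) for b in B]
    # Center the outputs with the mean of the combined samples.
    mean = (sum(yA) + sum(yB)) / (2 * n)
    yA = [y - mean for y in yA]
    yB = [y - mean for y in yB]
    # Unbiased sample variance of the centered output on A.
    mA = sum(yA) / n
    var = sum((y - mA) ** 2 for y in yA) / (n - 1)
    first, total = [], []
    for i in range(dim):
        # Mixed sample: rows of A whose i-th component is taken from B.
        yE = [model(a[:i] + [b[i]] + a[i + 1:]) - mean for a, b in zip(A, B)]
        # First order: yB and yE share only the i-th input.
        v_i = sum(yb * (ye - ya) for yb, ye, ya in zip(yB, yE, yA)) / n
        # Total: yA and yE differ only in the i-th input.
        vt_i = sum((ya - ye) ** 2 for ya, ye in zip(yA, yE)) / (2 * n)
        first.append(v_i / var)
        total.append(vt_i / var)
    return first, total


first, total = pick_freeze(g, dim=3, n=50000)
```

For this model the exact values are $S_1 = 1/2$, $S_2 = S_3 = 0$ and $S_1^T = S_2^T = S_3^T = 1/2$, so the estimates should be close to these values up to sampling error.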
Centering the output¶
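For the sake of stability, computations are performed with centered output. Let $\overline{g}$ be the mean of the combined samples $g(A)$ and $g(B)$. Let $g_c$ be the empirically centered function defined, for any $\mathbf{x} \in \mathbb{R}^d$, by:
$$g_c(\mathbf{x}) = g(\mathbf{x}) - \overline{g}.$$
To estimate the total variance $\operatorname{Var}(Y)$, we use the computeVariance() method of the Sample $g_c(A)$.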