Functional Chaos Expansion¶
Accounting for the joint probability density function (PDF) $\mu_{\boldsymbol{X}}$
of the input random vector $\boldsymbol{X}$, one seeks the joint PDF of the output random vector
$\boldsymbol{Y} = g(\boldsymbol{X})$. This may be achieved using
Monte Carlo (MC) simulation (see Monte Carlo simulation). However, the MC
method may require a large number of model evaluations, i.e. a great
computational cost, in order to obtain accurate results.
A possible solution to overcome this problem is to project the model $g$
onto a suitable functional space, such as
the Hilbert space $L^2(\mu_{\boldsymbol{X}})$
of square-integrable functions with
respect to $\mu_{\boldsymbol{X}}$.
This projection is called a functional chaos expansion of
the model $g$.
The principles for building a functional chaos expansion are described below.
Methodology: principles¶
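We consider the output random vector:
$$\boldsymbol{Y} = g(\boldsymbol{X})$$
where $g: \mathbb{R}^{n_X} \rightarrow \mathbb{R}^{n_Y}$ is the model and
$\boldsymbol{X}$ is the input random vector whose distribution is
$\mu_{\boldsymbol{X}}$.
We assume that $g$ has finite variance, i.e.
$g \in L^2(\mu_{\boldsymbol{X}})$.
When $n_Y > 1$, the functional chaos algorithm is used on each marginal
of $\boldsymbol{Y}$, using the same multivariate orthonormal basis for
all the marginals.
Thus, the method is detailed here for a scalar output $Y$
and $g: \mathbb{R}^{n_X} \rightarrow \mathbb{R}$.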
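The functional space $L^2(\mu_{\boldsymbol{X}})$
is a Hilbert space equipped with an inner
product defined by:
$$\langle g, h \rangle_{L^2(\mu_{\boldsymbol{X}})} = \mathbb{E}\left[g(\boldsymbol{X}) \, h(\boldsymbol{X})\right]$$
for any $g, h \in L^2(\mu_{\boldsymbol{X}})$.
For a continuous random variable, the inner product is defined by:
$$\langle g, h \rangle_{L^2(\mu_{\boldsymbol{X}})} = \int g(\boldsymbol{x}) \, h(\boldsymbol{x}) \, \mu_{\boldsymbol{X}}(\boldsymbol{x}) \, d\boldsymbol{x}$$
For a discrete random variable, the inner product is:
$$\langle g, h \rangle_{L^2(\mu_{\boldsymbol{X}})} = \sum_{\boldsymbol{x}} g(\boldsymbol{x}) \, h(\boldsymbol{x}) \, \mu_{\boldsymbol{X}}(\boldsymbol{x})$$
where the sum runs over the support of $\boldsymbol{X}$.
The associated norm is defined by:
$$\|g\|^2_{L^2(\mu_{\boldsymbol{X}})} = \langle g, g \rangle_{L^2(\mu_{\boldsymbol{X}})} = \mathbb{E}\left[g(\boldsymbol{X})^2\right]$$
for any $g \in L^2(\mu_{\boldsymbol{X}})$.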
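The two formulas above can be sketched in a few lines of Python. This is only an illustration under assumed settings: a fair six-sided die for the discrete case, the uniform distribution on $[-1, 1]$ for the continuous case, and a 3-point Gauss-Legendre rule (exact for polynomial integrands up to degree 5) to evaluate the integral. The helper names are hypothetical and are not part of any library.

```python
from fractions import Fraction
from math import sqrt


def inner_product_discrete(g, h, pmf):
    """<g, h> = sum over the support of g(x) * h(x) * P(X = x)."""
    return sum(g(x) * h(x) * p for x, p in pmf.items())


# 3-point Gauss-Legendre rule on [-1, 1] (exact up to degree 5).
NODES = (-sqrt(3 / 5), 0.0, sqrt(3 / 5))
WEIGHTS = (5 / 9, 8 / 9, 5 / 9)


def inner_product_uniform(g, h):
    """<g, h> for X ~ U(-1, 1), whose density is 1/2 on [-1, 1]."""
    return sum(0.5 * w * g(x) * h(x) for x, w in zip(NODES, WEIGHTS))


def identity(x):
    return x


def one(x):
    return 1


# Assumed discrete example: a fair die, P(X = x) = 1/6 for x = 1..6.
die = {x: Fraction(1, 6) for x in range(1, 7)}
mean_die = inner_product_discrete(identity, one, die)               # E[X]
squared_norm_die = inner_product_discrete(identity, identity, die)  # E[X^2]

# Assumed continuous example: X ~ U(-1, 1).
squared_norm_uniform = inner_product_uniform(identity, identity)    # E[X^2]

print(mean_die, squared_norm_die, squared_norm_uniform)
```

The squared norm of the identity function is simply the second moment of the variable, which is why finite variance of the model is equivalent to membership in the Hilbert space.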
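The Hilbert space $L^2(\mu_{\boldsymbol{X}})$ admits an orthonormal basis denoted by
$(\Psi_k)_{k \geq 0}$ with
$\langle \Psi_k, \Psi_\ell \rangle_{L^2(\mu_{\boldsymbol{X}})} = \delta_{k\ell}$. Therefore, any
$g \in L^2(\mu_{\boldsymbol{X}})$
can be written as a functional chaos expansion
(see [lemaitre2010] page 39):
$$g = \sum_{k=0}^{\infty} a_k \Psi_k$$
where $(a_k)_{k \geq 0}$ is a sequence of coefficients.
The meta model is the truncation of the expansion to a finite subset
of coefficients:
$$\tilde{g} = \sum_{k \in \mathcal{I}} a_k \Psi_k$$
where $\mathcal{I} \subset \mathbb{N}$ is a finite index set.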
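As a worked illustration (an assumed example, not taken from the source), consider a scalar input uniformly distributed on $[-1, 1]$ and the model $g(x) = x^3$. The orthonormal basis is then made of the normalized Legendre polynomials, and the expansion is finite:

```latex
\Psi_0(x) = 1, \quad
\Psi_1(x) = \sqrt{3}\, x, \quad
\Psi_2(x) = \frac{\sqrt{5}}{2} (3x^2 - 1), \quad
\Psi_3(x) = \frac{\sqrt{7}}{2} (5x^3 - 3x)

a_1 = \mathbb{E}[X^3 \Psi_1(X)] = \sqrt{3}\, \mathbb{E}[X^4] = \frac{\sqrt{3}}{5},
\qquad
a_3 = \mathbb{E}[X^3 \Psi_3(X)] = \frac{\sqrt{7}}{2} \left( \frac{5}{7} - \frac{3}{5} \right) = \frac{2\sqrt{7}}{35},
\qquad
a_0 = a_2 = 0

x^3 = \frac{\sqrt{3}}{5} \Psi_1(x) + \frac{2\sqrt{7}}{35} \Psi_3(x)
```

Truncating to the index set $\{0, 1\}$ gives the meta model $\tilde{g}(x) = \frac{3}{5} x$, which is the best linear approximation of $x^3$ in the mean-square sense for this distribution.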
Methodology: step by step¶
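The construction of a meta model as a truncated functional expansion consists in choosing:
Choice 1: an approximation space $\mathcal{F} \subseteq L^2(\mu_{\boldsymbol{X}})$
such that:
$$\overline{\mathcal{F}} = L^2(\mu_{\boldsymbol{X}}) \tag{1}$$
where the closure is taken with respect to the norm $\|\cdot\|_{L^2(\mu_{\boldsymbol{X}})}$.
Choice 2: a sequence of nested approximation spaces
$(\mathcal{F}_m)_{m \geq 0}$ such that:
$$\mathcal{F}_0 \subseteq \mathcal{F}_1 \subseteq \cdots \subseteq \mathcal{F}, \qquad \bigcup_{m \geq 0} \mathcal{F}_m = \mathcal{F} \tag{2}$$
which implies that any function $g \in L^2(\mu_{\boldsymbol{X}})$ is the limit with
respect to $\|\cdot\|_{L^2(\mu_{\boldsymbol{X}})}$
of a sequence $(g_m)_{m \geq 0}$ such
that $g_m \in \mathcal{F}_m$:
$$\lim_{m \rightarrow \infty} \|g - g_m\|_{L^2(\mu_{\boldsymbol{X}})} = 0 \tag{3}$$
Choice 3: a basis
$(\Psi_k)_{k \in \mathcal{I}_m}$ of $\mathcal{F}_m$:
$$\mathcal{F}_m = \operatorname{span}\left\{\Psi_k, \; k \in \mathcal{I}_m\right\} \tag{4}$$
where $\mathcal{I}_m$ is a finite index set.
Choice 4: a specific space $\mathcal{F}_m$
in which $g$
is approximated by $\tilde{g}$
as follows:
$$\tilde{g} = \operatorname*{argmin}_{f \in \mathcal{F}_m} \|g - f\|_{L^2(\mu_{\boldsymbol{X}})} \tag{5}$$
which amounts to solving:
$$\boldsymbol{a} = \operatorname*{argmin}_{\boldsymbol{a} \in \mathbb{R}^{|\mathcal{I}_m|}} \left\| g - \sum_{k \in \mathcal{I}_m} a_k \Psi_k \right\|^2_{L^2(\mu_{\boldsymbol{X}})}, \qquad \tilde{g} = \sum_{k \in \mathcal{I}_m} a_k \Psi_k \tag{6}$$
This problem is a linear least-squares problem.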
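In practice the expectation in the least-squares problem is replaced by an average over a finite design of experiments, which leads to the normal equations. The sketch below illustrates this under assumed settings: a scalar input uniform on $[-1, 1]$, the orthonormal Legendre basis up to degree 3, a regular grid standing in for the sample, and the model $g(x) = x^3$. The helpers `psi`, `solve` and `least_squares_coefficients` are hypothetical names written for this example, not library functions.

```python
from math import sqrt


def psi(k, x):
    """Orthonormal Legendre polynomials for X ~ U(-1, 1), degrees 0 to 3."""
    return (
        1.0,
        sqrt(3.0) * x,
        sqrt(5.0) * (3.0 * x**2 - 1.0) / 2.0,
        sqrt(7.0) * (5.0 * x**3 - 3.0 * x) / 2.0,
    )[k]


def solve(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def least_squares_coefficients(g, xs, degree):
    """Solve the normal equations (Psi^T Psi) a = Psi^T y on the design xs."""
    size = degree + 1
    design = [[psi(k, x) for k in range(size)] for x in xs]
    ys = [g(x) for x in xs]
    gram = [[sum(row[i] * row[j] for row in design) for j in range(size)]
            for i in range(size)]
    moment = [sum(row[i] * y for row, y in zip(design, ys)) for i in range(size)]
    return solve(gram, moment)


# Assumed design of experiments: 20 midpoints of a regular grid on [-1, 1].
xs = [-1.0 + 2.0 * (j + 0.5) / 20 for j in range(20)]
coefficients = least_squares_coefficients(lambda x: x**3, xs, 3)
print(coefficients)
```

Because $x^3$ belongs to the span of the four basis functions, the least-squares solution reproduces the model exactly, whatever the design, as long as it contains at least four distinct points.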
Which approximation space and sequence of nested approximation spaces?¶
Many choices are possible for $\mathcal{F}$ and the associated sequence
$(\mathcal{F}_m)_{m \geq 0}$.
For instance, one may choose $\mathcal{F}$ to be the set of multivariate polynomials,
provided that $\mathcal{F}$
verifies the condition (1) (see also [sullivan2015],
page 139, [dahlquist2008], theorem 4.5.16 page 456 and
[rudin1987], section 4.24 page 85).
We then consider a sequence of nested spaces defined as follows: we
construct a complete family using graded polynomials by introducing a bijection
$\tau$ from $\mathbb{N}$ onto $\mathbb{N}^{n_X}$. The mapping
$k \mapsto \tau(k) = (\tau_1(k), \ldots, \tau_{n_X}(k))$
specifies the
multi-index of marginal degrees (this bijection ensures that all polynomials are covered and
that any finite family is linearly independent). Then,
$\mathcal{F}_m$ is the space spanned by the
polynomials with marginal degrees
$\tau(0), \ldots, \tau(m)$. Depending on the choice of
$\tau$, and for suitable values of $m$,
$\mathcal{F}_m$ may correspond to the set of polynomials of total degree at most
a given integer $p$. See Tensorized multivariate basis enumeration functions and Multivariate indices enumeration functions for more details on this topic.
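To make the idea of a graded enumeration concrete, here is a minimal sketch that lists multi-indices by increasing total degree. The function `graded_multi_indices` and the ordering used inside each degree layer are assumptions made for this illustration; the enumeration functions of the library may order the indices differently.

```python
from itertools import product
from math import comb


def graded_multi_indices(dim, max_degree):
    """List the multi-indices of N^dim by increasing total degree.

    Inside a given total degree, indices are sorted in reverse
    lexicographic order (an arbitrary but deterministic choice).
    """
    indices = []
    for degree in range(max_degree + 1):
        layer = [alpha for alpha in product(range(degree + 1), repeat=dim)
                 if sum(alpha) == degree]
        layer.sort(reverse=True)
        indices.extend(layer)
    return indices


indices_2d = graded_multi_indices(2, 2)
indices_3d = graded_multi_indices(3, 2)
print(indices_2d)
print(len(indices_3d), comb(3 + 2, 2))
```

With such an enumeration, the first $\binom{n_X + p}{p}$ indices span exactly the polynomials of total degree at most $p$, which is the count checked on the last line.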
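Note that, depending on the distribution $\mu_{\boldsymbol{X}}$, the space
$\mathcal{F}$ of
multivariate polynomials does not always satisfy
$\mathcal{F} \subseteq L^2(\mu_{\boldsymbol{X}})$, and even when this inclusion holds, the
density condition (1) must also be satisfied:
First, for all polynomial spaces
$\mathcal{F}_m$ to be subspaces of
$L^2(\mu_{\boldsymbol{X}})$, it is required that
$\mathbb{E}\left[|X|^k\right]$ for $k \in \mathbb{N}$
be finite in the scalar case, and
$\mathbb{E}\left[|\boldsymbol{X}^{\boldsymbol{\alpha}}|\right]$, where $\boldsymbol{X}^{\boldsymbol{\alpha}} = X_1^{\alpha_1} \cdots X_{n_X}^{\alpha_{n_X}}$,
with $\boldsymbol{\alpha} \in \mathbb{N}^{n_X}$
in dimension $n_X$. This implies that the measure
$\mu_{\boldsymbol{X}}$
must have finite moments of all orders. This is the case, for instance, when the support of
$\mu_{\boldsymbol{X}}$ is bounded. However, this is not the case, for instance, when
$\mu_{\boldsymbol{X}}$
is the Cauchy distribution: for this distribution, the only polynomial space contained in
$L^2(\mu_{\boldsymbol{X}})$
is the space of constant polynomials, since the second moment of the Cauchy distribution is infinite (its expectation is not even defined).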
Moreover, among probability measures
$\mu_{\boldsymbol{X}}$
that admit moments of all orders, the closure property (1) is not always guaranteed. A counterexample is the log-normal distribution (see [ernst2012]). In this case, the expansion may not converge to the function. Nevertheless, even without any guarantee, the meta model built using the basis
$(\Psi_k)_{k \in \mathcal{I}_m}$ may still be a good approximation of
$g$.
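The two situations above can be made explicit with the standard moment formulas of the distributions involved (stated here for the standard Cauchy distribution and for a log-normal variable $X = e^{Z}$ with $Z \sim \mathcal{N}(\mu, \sigma^2)$, which are assumed parametrizations):

```latex
% Cauchy: the integrand tends to 1/\pi at infinity, so the integral diverges
% and the polynomial x \mapsto x is not square-integrable.
\mathbb{E}[X^2] = \int_{-\infty}^{+\infty} \frac{x^2}{\pi (1 + x^2)} \, dx = +\infty

% Log-normal: every moment is finite, so all polynomials are square-integrable.
\mathbb{E}[X^k] = \exp\left( k \mu + \frac{k^2 \sigma^2}{2} \right) < +\infty
\quad \text{for all } k \in \mathbb{N}
```

The log-normal case therefore passes the first requirement (finite moments of all orders) and fails only the second one, the density of polynomials, as reported in [ernst2012].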
One may also consider non-polynomial approximation spaces $\mathcal{F}$, for instance the space
spanned by trigonometric polynomials (the so-called Fourier space) or the space spanned by the Haar
wavelets.
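If the approximation space $\mathcal{F}$ does not satisfy condition (1), i.e., if
$\overline{\mathcal{F}} \neq L^2(\mu_{\boldsymbol{X}})$, then one may use an
isoprobabilistic transformation $T$
such that $\boldsymbol{Z} = T(\boldsymbol{X})$
is a random
vector distributed according to a density $\mu_{\boldsymbol{Z}}$
satisfying $\overline{\mathcal{F}} = L^2(\mu_{\boldsymbol{Z}})$.
The model is then transformed into
$h = g \circ T^{-1}$.
One seeks $\tilde{h}$, the orthogonal projection of $h$
with respect to $\langle \cdot, \cdot \rangle_{L^2(\mu_{\boldsymbol{Z}})}$
onto the polynomial space $\mathcal{F}_m$.
The metamodel of $g$
is then given by:
$$\tilde{g} = \tilde{h} \circ T$$
Projecting $h$ onto the basis
$(\Psi_k)_{k \in \mathcal{I}_m}$ of $\mathcal{F}_m$
orthogonally
with respect to $\langle \cdot, \cdot \rangle_{L^2(\mu_{\boldsymbol{Z}})}$
is equivalent to projecting $g$
onto the
space spanned by $(\Psi_k \circ T)_{k \in \mathcal{I}_m}$
orthogonally with respect to
$\langle \cdot, \cdot \rangle_{L^2(\mu_{\boldsymbol{X}})}$.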
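The transformation step can be sketched on a one-dimensional case. The example below is an assumption chosen for illustration: the input follows the exponential distribution with unit rate, the transformation $T(x) = 2 F_X(x) - 1$ maps it to the uniform distribution on $[-1, 1]$, and the model is $g(x) = e^{-x}$, so that the transformed model is the polynomial $h(z) = (1 - z)/2$. The names `T`, `T_inverse`, `psi` and `g_tilde` are hypothetical.

```python
from math import exp, log, sqrt

# 3-point Gauss-Legendre rule on [-1, 1], exact for polynomials up to degree 5.
NODES = (-sqrt(3 / 5), 0.0, sqrt(3 / 5))
WEIGHTS = (5 / 9, 8 / 9, 5 / 9)


def psi(k, z):
    """Orthonormal Legendre polynomials for Z ~ U(-1, 1), degrees 0 to 2."""
    return (1.0, sqrt(3.0) * z, sqrt(5.0) * (3.0 * z**2 - 1.0) / 2.0)[k]


def T(x):
    """Map X ~ Exponential(1) to Z ~ U(-1, 1): Z = 2 F_X(X) - 1."""
    return 1.0 - 2.0 * exp(-x)


def T_inverse(z):
    """Inverse transformation, from Z back to X."""
    return -log((1.0 - z) / 2.0)


def g(x):
    """Assumed model in the physical space."""
    return exp(-x)


def h(z):
    """Transformed model h = g o T^{-1}, here equal to (1 - z) / 2."""
    return g(T_inverse(z))


# Coefficients of h on the basis orthonormal with respect to the law of Z.
coefficients = [
    sum(0.5 * w * h(z) * psi(k, z) for z, w in zip(NODES, WEIGHTS))
    for k in range(3)
]


def g_tilde(x):
    """Metamodel of g: expansion of h evaluated at T(x)."""
    return sum(a * psi(k, T(x)) for k, a in enumerate(coefficients))


print(coefficients, g_tilde(0.7), g(0.7))
```

The model $e^{-x}$ is not a polynomial in $x$, yet it is represented exactly by two terms once expressed in the transformed variable, which illustrates that the expansion is effectively carried by the functions $\Psi_k \circ T$.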
Which basis of the approximation space?¶
The sub-spaces $\mathcal{F}_m$ are nested, which means that the basis of
$\mathcal{F}_{m+1}$ is constructed from that of
$\mathcal{F}_m$ by enrichment. Several choices
of basis are possible.
We can choose a basis which is orthonormal with respect to
$\mu_{\boldsymbol{X}}$ (see
Multivariate Orthonormal basis for more details):
If $\boldsymbol{X}$
has independent components, this multivariate orthonormal basis can be built as the tensor product of univariate bases, each one orthonormal with respect to the marginal distribution of the corresponding component of
$\boldsymbol{X}$.
If $\boldsymbol{X}$
has dependent components, we can use an isoprobabilistic transformation $T$
that maps the measure $\mu_{\boldsymbol{X}}$
into a measure $\mu_{\boldsymbol{Z}}$
with independent marginals. We can also use the Soize-Ghanem basis but this usage is not recommended.
We can also choose a basis which is not orthonormal with respect to
$\mu_{\boldsymbol{X}}$ but
which is orthonormal with respect to an instrumental measure
$\nu$.
This instrumental measure is such that:
the support of $\nu$
contains the support of $\mu_{\boldsymbol{X}}$,
$\nu$ has independent components.
This choice is made in the domination strategy.
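The tensor-product construction mentioned above for independent components can be sketched as follows. The setting is an assumption made for illustration: two independent inputs, each uniform on $[-1, 1]$, univariate orthonormal Legendre polynomials up to degree 2, and a tensorized 3-point Gauss-Legendre rule to check orthonormality. The function names are hypothetical.

```python
from math import sqrt

# 3-point Gauss-Legendre rule on [-1, 1], exact for polynomials up to degree 5.
NODES = (-sqrt(3 / 5), 0.0, sqrt(3 / 5))
WEIGHTS = (5 / 9, 8 / 9, 5 / 9)


def psi_1d(degree, x):
    """Univariate orthonormal Legendre polynomial for U(-1, 1), degree <= 2."""
    return (1.0, sqrt(3.0) * x, sqrt(5.0) * (3.0 * x**2 - 1.0) / 2.0)[degree]


def psi(multi_index, point):
    """Multivariate basis function: product of the univariate polynomials."""
    value = 1.0
    for degree, x in zip(multi_index, point):
        value *= psi_1d(degree, x)
    return value


def inner_product(alpha, beta):
    """<Psi_alpha, Psi_beta> for two independent U(-1, 1) inputs.

    The joint density is 1/4 on the square, hence the 0.25 factor.
    """
    total = 0.0
    for x1, w1 in zip(NODES, WEIGHTS):
        for x2, w2 in zip(NODES, WEIGHTS):
            point = (x1, x2)
            total += 0.25 * w1 * w2 * psi(alpha, point) * psi(beta, point)
    return total


print(inner_product((1, 2), (1, 2)))  # squared norm of one basis function
print(inner_product((1, 0), (0, 1)))  # two distinct basis functions
```

Orthonormality of the product follows from independence: the expectation of a product of functions of distinct components factorizes into the product of univariate inner products.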
Once the basis has been chosen, the enumeration function helps to enumerate the elements of the basis in a specific order (see Tensorized multivariate basis enumeration functions and Multivariate indices enumeration functions for more details on this topic).
Which sub-space of approximation?¶
If the dimension of $\mathcal{F}_m$ (that is to say, the number of coefficients to be
computed) is too large, this can lead to overfitting. This may happen for instance if the
total polynomial degree we choose is too large.
In order to limit this effect, one method is to define a strategy for exploring the basis (see
Sparse least squares meta model) as well as a strategy to select the coefficients which
best predict the output (see FixedStrategy and CleaningStrategy).
Estimate the coefficients¶
In this section, we give some elements to estimate the coefficients of the expansion.
The vector of coefficients is the solution of the linear least-squares problem defined in Functional Chaos Expansion by equation (6). Refer to Least squares meta models for more details on the resolution of the least-squares problem.
The choice of the basis of $\mathcal{F}_m$ has a major impact on the conditioning of the
least-squares problem.
Indeed, if the basis
$(\Psi_k)_{k \in \mathcal{I}_m}$ is
orthonormal with respect to
$\mu_{\boldsymbol{X}}$, then the design matrix of
the least-squares problem is well-conditioned.
Otherwise, the design matrix might be ill-conditioned, leading to numerical instabilities.
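One way to see the effect of the basis on conditioning is to look at the empirical Gram matrix $\frac{1}{N} \boldsymbol{\Psi}^T \boldsymbol{\Psi}$ of the design matrix. The sketch below compares two bases of the same polynomial space under assumed settings: a scalar input uniform on $[-1, 1]$, a regular grid of 1000 midpoints standing in for a large sample, the orthonormal Legendre basis, and the monomial basis. All names are hypothetical.

```python
from math import sqrt


def orthonormal_legendre(x):
    """Values of the orthonormal Legendre basis (degrees 0 to 3) at x."""
    return [
        1.0,
        sqrt(3.0) * x,
        sqrt(5.0) * (3.0 * x**2 - 1.0) / 2.0,
        sqrt(7.0) * (5.0 * x**3 - 3.0 * x) / 2.0,
    ]


def monomials(x):
    """Values of the monomial basis 1, x, x^2, x^3 at x."""
    return [1.0, x, x**2, x**3]


def empirical_gram(basis, xs):
    """Return (1/N) Psi^T Psi, where row j of Psi is basis(xs[j])."""
    rows = [basis(x) for x in xs]
    size = len(rows[0])
    n = len(rows)
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(size)]
            for i in range(size)]


# Assumed design: 1000 midpoints of a regular grid on [-1, 1].
xs = [-1.0 + 2.0 * (j + 0.5) / 1000 for j in range(1000)]
gram_orthonormal = empirical_gram(orthonormal_legendre, xs)
gram_monomial = empirical_gram(monomials, xs)

for row in gram_orthonormal:
    print([round(v, 4) for v in row])
for row in gram_monomial:
    print([round(v, 4) for v in row])
```

With the orthonormal basis the Gram matrix is close to the identity, so the normal equations are nearly diagonal. With monomials the entries approach the moments $\mathbb{E}[X^{i+j}]$, which produces strongly correlated columns as the degree grows.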
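If the chosen basis is orthonormal with respect to
$\mu_{\boldsymbol{X}}$, then the coefficients
$a_k$ can be computed
using the inner product (see [dahlquist2008] theorem 4.5.13 page 454) as follows:
$$a_k = \langle g, \Psi_k \rangle_{L^2(\mu_{\boldsymbol{X}})} = \mathbb{E}\left[g(\boldsymbol{X}) \, \Psi_k(\boldsymbol{X})\right] \tag{7}$$
for $k \in \mathcal{I}_m$.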
This integral can be computed using a quadrature rule (see the documentation
of the
IntegrationStrategy class).
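This is the case for instance when an isoprobabilistic transformation is used (see Isoprobabilistic transformations)
and the model $h = g \circ T^{-1}$
is expanded onto the basis $(\Psi_k)_{k \in \mathcal{I}_m}$
which is orthonormal with respect to
$\mu_{\boldsymbol{Z}}$. Thus, we get:
$$a_k = \langle h, \Psi_k \rangle_{L^2(\mu_{\boldsymbol{Z}})} = \mathbb{E}\left[h(\boldsymbol{Z}) \, \Psi_k(\boldsymbol{Z})\right] \tag{8}$$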
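Computing the coefficients as inner products evaluated by quadrature can be sketched as follows. The setting is assumed for illustration: a scalar input uniform on $[-1, 1]$, the model $g(x) = e^x$, the orthonormal Legendre basis up to degree 2, and a 5-point Gauss-Legendre rule written from its closed-form nodes and weights. The exact values used for comparison are $a_0 = \sinh(1)$ and $a_1 = \sqrt{3}/e$.

```python
from math import exp, sinh, sqrt

# 5-point Gauss-Legendre rule on [-1, 1], from its closed-form expression.
_inner = sqrt(5.0 - 2.0 * sqrt(10.0 / 7.0)) / 3.0
_outer = sqrt(5.0 + 2.0 * sqrt(10.0 / 7.0)) / 3.0
_w_inner = (322.0 + 13.0 * sqrt(70.0)) / 900.0
_w_outer = (322.0 - 13.0 * sqrt(70.0)) / 900.0
NODES = (-_outer, -_inner, 0.0, _inner, _outer)
WEIGHTS = (_w_outer, _w_inner, 128.0 / 225.0, _w_inner, _w_outer)


def psi(k, x):
    """Orthonormal Legendre polynomials for X ~ U(-1, 1), degrees 0 to 2."""
    return (1.0, sqrt(3.0) * x, sqrt(5.0) * (3.0 * x**2 - 1.0) / 2.0)[k]


def coefficient(g, k):
    """a_k = E[g(X) Psi_k(X)], with the density 1/2 of U(-1, 1)."""
    return sum(0.5 * w * g(x) * psi(k, x) for x, w in zip(NODES, WEIGHTS))


coefficients = [coefficient(exp, k) for k in range(3)]
print(coefficients)
print(sinh(1.0), sqrt(3.0) / exp(1.0))
```

Each coefficient costs one weighted sum over the quadrature nodes and no linear system has to be solved, which is the computational advantage of an orthonormal basis.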
Generally speaking, choosing a basis that is orthonormal with respect to
$\mu_{\boldsymbol{X}}$
simplifies the computation of the coefficients
$a_k$: solving the
least-squares problem then amounts to evaluating inner products, resulting in a significantly lower
computational cost.
However, the optimal estimator of the coefficient vector is the one obtained by solving the
least-squares problem. Therefore, for estimating the coefficients, it is preferable to solve
the least-squares problem even when the basis of the approximation space is orthonormal with
respect to the distribution $\mu_{\boldsymbol{X}}$.
In a probabilistic setting of independent and identically distributed samples, the two estimates (from the least-squares problem and from the inner product) are both consistent but have different asymptotic variances, and therefore different approximation qualities (see [lemaitre2010] eq. 3.48 and eq. 3.49 page 66).
Several algorithms are available to compute the coefficients
$(a_k)_{k \in \mathcal{I}_m}$:
see IntegrationExpansion for an algorithm based on quadrature,
see LeastSquaresExpansion for an algorithm based on the least squares problem,
see FunctionalChaosAlgorithm for an algorithm that can manage both methods.
Cross Validation of the functional chaos expansion¶
The cross-validation of a polynomial chaos expansion uses the theory presented in Validation and cross validation of metamodels. In [blatman2009] page 84, the author applies the LOO equation to polynomial chaos expansion (see appendix D page 203 for a proof). If the coefficients are estimated from integration, the same derivation cannot, in theory, be applied.
The fast methods presented in Validation and cross validation of metamodels can be applied:
the fast leave-one-out cross-validation,
the fast K-Fold cross-validation.
Refer to FunctionalChaosValidation.
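For a linear least-squares fit, the leave-one-out residuals can be obtained without refitting, through the classical identity $e_i^{LOO} = e_i / (1 - h_{ii})$, where $e_i$ is the ordinary residual and $h_{ii}$ the leverage of observation $i$. The sketch below checks this identity against brute-force refitting on an assumed toy case (a two-term basis $\{1, \sqrt{3} x\}$ and six hand-picked points); it is not the implementation of FunctionalChaosValidation, and all names are hypothetical.

```python
from math import sqrt


def basis(x):
    """Two-term orthonormal Legendre basis for U(-1, 1)."""
    return (1.0, sqrt(3.0) * x)


def gram_inverse(xs):
    """Inverse of the 2x2 matrix Psi^T Psi."""
    s00 = s01 = s11 = 0.0
    for x in xs:
        p0, p1 = basis(x)
        s00 += p0 * p0
        s01 += p0 * p1
        s11 += p1 * p1
    det = s00 * s11 - s01 * s01
    return ((s11 / det, -s01 / det), (-s01 / det, s00 / det))


def fit(xs, ys):
    """Least-squares coefficients from the normal equations."""
    inv = gram_inverse(xs)
    m0 = sum(basis(x)[0] * y for x, y in zip(xs, ys))
    m1 = sum(basis(x)[1] * y for x, y in zip(xs, ys))
    return (inv[0][0] * m0 + inv[0][1] * m1,
            inv[1][0] * m0 + inv[1][1] * m1)


def predict(coefficients, x):
    p0, p1 = basis(x)
    return coefficients[0] * p0 + coefficients[1] * p1


def leverage(inv, x):
    """h_ii = psi(x)^T (Psi^T Psi)^{-1} psi(x)."""
    p0, p1 = basis(x)
    return (p0 * (inv[0][0] * p0 + inv[0][1] * p1)
            + p1 * (inv[1][0] * p0 + inv[1][1] * p1))


# Assumed toy data: a quadratic model observed at six points.
xs = [-0.9, -0.5, -0.1, 0.2, 0.6, 0.8]
ys = [x**2 + 0.5 * x for x in xs]

# Fast leave-one-out: one fit on the full sample.
coefficients = fit(xs, ys)
inv = gram_inverse(xs)
fast_loo = [(y - predict(coefficients, x)) / (1.0 - leverage(inv, x))
            for x, y in zip(xs, ys)]

# Brute-force leave-one-out: one fit per removed point.
brute_loo = []
for i in range(len(xs)):
    xs_i = xs[:i] + xs[i + 1:]
    ys_i = ys[:i] + ys[i + 1:]
    brute_loo.append(ys[i] - predict(fit(xs_i, ys_i), xs[i]))

print(max(abs(a - b) for a, b in zip(fast_loo, brute_loo)))
```

The identity relies on the coefficients being the least-squares solution, which is consistent with the remark above that the same derivation does not carry over, in theory, to coefficients estimated by integration.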
Usual exploitation of the functional chaos expansion¶
There are many ways to use the functional chaos expansion. We present two usual exploitations:
using the expansion as a random vector generator,
performing the sensitivity analysis of the expansion.
The first usage is to create a random vector defined by:
$$\tilde{Y} = \tilde{g}(\boldsymbol{X})$$
This equation can be used to simulate independent random observations
from the functional chaos expansion: see the FunctionalChaosRandomVector
class for more details on this topic.
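Sampling from the expansion only requires sampling the input and evaluating the truncated sum. The sketch below is a plain-Python illustration under assumed settings (a scalar input uniform on $[-1, 1]$ and the coefficients of $x^3$ on the orthonormal Legendre basis); it does not use the FunctionalChaosRandomVector class, and the names `psi` and `metamodel` are hypothetical.

```python
import random
from math import sqrt

# Assumed coefficients: expansion of x^3 on the orthonormal Legendre basis.
COEFFICIENTS = (0.0, sqrt(3.0) / 5.0, 0.0, 2.0 * sqrt(7.0) / 35.0)


def psi(k, x):
    """Orthonormal Legendre polynomials for X ~ U(-1, 1), degrees 0 to 3."""
    return (
        1.0,
        sqrt(3.0) * x,
        sqrt(5.0) * (3.0 * x**2 - 1.0) / 2.0,
        sqrt(7.0) * (5.0 * x**3 - 3.0 * x) / 2.0,
    )[k]


def metamodel(x):
    """Truncated expansion: sum of a_k * Psi_k(x)."""
    return sum(a * psi(k, x) for k, a in enumerate(COEFFICIENTS))


# Independent observations of the output: sample X, then evaluate the metamodel.
rng = random.Random(0)
sample = [metamodel(rng.uniform(-1.0, 1.0)) for _ in range(5)]
print(sample)
```

Each observation costs a few polynomial evaluations, so large output samples can be generated without calling the original model again.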
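The second usage assumes that the input distribution $\mu_{\boldsymbol{X}}$ has
independent
marginals, that the basis
$(\Psi_k)_{k \geq 0}$ is orthonormal
with respect
to $\mu_{\boldsymbol{X}}$,
and that the first element is:
$$\Psi_0 = 1 \tag{9}$$
The orthogonality of the functions implies that:
$$\mathbb{E}\left[\Psi_k(\boldsymbol{X})\right] = \langle \Psi_k, \Psi_0 \rangle_{L^2(\mu_{\boldsymbol{X}})} = 0$$
for any non-zero $k$.
In that case, the Sobol’ indices can easily be deduced from the coefficients
$a_k$: see
FunctionalChaosSobolIndices for
more details on this topic.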
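Under these assumptions the mean of the expansion is the coefficient of the constant term, its variance is the sum of the squares of the other coefficients, and each Sobol' index is a ratio of partial sums of squared coefficients. The sketch below applies these standard formulas to an assumed two-input expansion; the function name and the example coefficients are hypothetical, and this is not the implementation of FunctionalChaosSobolIndices.

```python
def chaos_moments_and_sobol(multi_indices, coefficients):
    """Mean, variance, first-order and total Sobol' indices of an expansion.

    multi_indices[k] is the tuple of marginal degrees of Psi_k and
    coefficients[k] is a_k; the basis is assumed orthonormal with Psi_0 = 1.
    """
    dim = len(multi_indices[0])
    mean = 0.0
    variance = 0.0
    first = [0.0] * dim
    total = [0.0] * dim
    for alpha, a in zip(multi_indices, coefficients):
        if not any(alpha):
            mean = a  # constant term
            continue
        variance += a * a
        active = [i for i, degree in enumerate(alpha) if degree > 0]
        for i in active:
            total[i] += a * a  # every term involving input i
        if len(active) == 1:
            first[active[0]] += a * a  # terms involving input i only
    first = [s / variance for s in first]
    total = [s / variance for s in total]
    return mean, variance, first, total


# Assumed expansion: 1 + 2 Psi_(1,0) + 1 Psi_(0,1) + 2 Psi_(1,1).
multi_indices = [(0, 0), (1, 0), (0, 1), (1, 1)]
coefficients = [1.0, 2.0, 1.0, 2.0]
mean, variance, first_order, total_order = chaos_moments_and_sobol(
    multi_indices, coefficients
)
print(mean, variance, first_order, total_order)
```

In this example the interaction term contributes to both total indices but to neither first-order index, so the first-order indices sum to less than one.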