Pitfalls in polynomial chaos expansion due to the input distribution

In this example, we construct a polynomial chaos expansion (refer to Functional Chaos Expansion for more details) using a countable family of polynomials orthonormal with respect to an input distribution \mu such that the functional space generated by this orthonormal polynomial family is not dense in the Hilbert space L^2(\mu) of square-integrable functions with respect to \mu. In other words, this family is not complete.

Therefore, there exist functions in L^2(\mu) that are not equal to their orthogonal projection onto the closure of the polynomial space. Moreover, any approximation obtained by truncating this projection may lead to a low-quality surrogate model (see [feller1970]).

This situation occurs when \mu is not characterized by the infinite sequence of its moments. The LogNormal distribution is one such example.

We also show that numerical cancellation issues may arise when using a large number of polynomials in the surrogate model.

We consider the model \model:  \Rset^* \rightarrow \Rset defined by:

\model(x) = \dfrac{1}{x}.

Let the random variable X follow the LogNormal \mathcal{LN}(0,1) distribution, denoted by \mu. This distribution is not characterized by the infinite sequence of its moments.

This particular case is discussed in [ernst2012].
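The indeterminacy of the LogNormal distribution can be checked with Carleman's criterion, which states that if the series \sum_k m_{2k}^{-1/(2k)} of raw moments diverges, the distribution is determined by its moments. The sketch below is a plain-Python illustration (independent of OpenTURNS) using the closed-form raw moments of \mathcal{LN}(0,1), m_k = e^{k^2/2}: the Carleman terms equal e^{-k}, so the series converges and the criterion gives no determinacy guarantee; the LogNormal distribution is indeed a classical moment-indeterminate example.

```python
from math import exp

# Raw moments of LogNormal(0, 1): E[X^k] = exp(k^2 / 2) (closed form).
def raw_moment(k):
    return exp(k**2 / 2.0)

# Carleman's criterion: divergence of sum_k m_{2k}^(-1/(2k)) would imply that
# the distribution is determined by its moments. Here each term equals
# exp(-k), so the series converges geometrically: no determinacy guarantee.
carleman_terms = [raw_moment(2 * k) ** (-1.0 / (2 * k)) for k in range(1, 11)]
print(carleman_terms[:3])  # terms equal exp(-1), exp(-2), exp(-3), ...
```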

We proceed as follows:

  • Introduction: we build the family of polynomials orthonormal with respect to the LogNormal distribution up to high degrees,

  • Case 1: we create a polynomial chaos surrogate model of \model using the polynomials orthonormal with respect to the LogNormal distribution,

  • Case 2: we show how to obtain a higher-quality surrogate model using another polynomial family: the polynomials orthonormal with respect to the Normal distribution.

import openturns as ot
import openturns.viewer as otv
from math import sqrt

# sphinx_gallery_thumbnail_number = 8

Introduction: orthonormal polynomial family w.r.t. the LogNormal distribution

In this section, we build the family of polynomials orthonormal with respect to the LogNormal distribution using the AdaptiveStieltjesAlgorithm.

We first build the LogNormal distribution and we display its probability density function.

dist_X = ot.LogNormal()
g = dist_X.drawPDF()
view = otv.View(g)

We build the polynomials up to degree 20: they are defined by their three-term recurrence coefficients.

lower_bound = dist_X.computeQuantile(0.01, False)
upper_bound = dist_X.computeQuantile(0.01, True)
degreeMax = 20

as_algo = ot.AdaptiveStieltjesAlgorithm(dist_X)
sample_coeff = ot.Sample(0, 3)
for i in range(degreeMax):
    coeff = as_algo.getRecurrenceCoefficients(i)
    sample_coeff.add(coeff)
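To make explicit what these coefficients encode, here is a minimal plain-Python sketch of the three-term recurrence (assuming the convention p_{n+1}(x) = (a_n x + b_n) p_n(x) + c_n p_{n-1}(x) with p_0 = 1 and p_{-1} = 0, which is our understanding of the OpenTURNS convention), checked against the orthonormal Hermite family of the Normal distribution that appears in Case 2 below.

```python
from math import sqrt

def eval_orthonormal(coeffs, x):
    # coeffs: list of (a_n, b_n, c_n); recurrence assumed to be
    # p_{n+1}(x) = (a_n * x + b_n) * p_n(x) + c_n * p_{n-1}(x),
    # with p_0 = 1 and p_{-1} = 0.
    p_prev, p = 0.0, 1.0
    for a, b, c in coeffs:
        p_prev, p = p, (a * x + b) * p + c * p_prev
    return p

# Hermite (Normal) case: a_n = 1/sqrt(n+1), b_n = 0, c_n = -sqrt(n/(n+1)).
hermite_coeffs = [(1.0 / sqrt(n + 1), 0.0, -sqrt(n / (n + 1))) for n in range(3)]
# The degree-2 orthonormal Hermite polynomial is (x^2 - 1)/sqrt(2),
# whose value at x = 0 is -1/sqrt(2).
print(eval_orthonormal(hermite_coeffs[:2], 0.0))
```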

We build a set of polynomials with increasing degrees. For each polynomial, we display its value at x = 0.

We anticipate that the vicinity of x = 0 will cause difficulties for the surrogate model of \model, since \lim_{x \rightarrow 0} \model(x) = +\infty. If the polynomials take reasonable values in the vicinity of x = 0, then the polynomial surrogate model requires very large linear coefficients in order to reproduce high values near x = 0, which may lead to numerical cancellation issues.

We explore polynomial degrees up to 20.

deg_list = [5, 10, 20]
g = ot.Graph(
    "Polynomials orthonormal with respect to the LogNormal distribution", "x", "y", True
)
leg = ot.Description(0)
for deg in deg_list:
    coeff_poly_deg = sample_coeff[0:deg]
    poly_deg = ot.OrthogonalUniVariatePolynomial(coeff_poly_deg)
    g.add(poly_deg.draw(lower_bound[0], upper_bound[0], 1024))
    leg.add(r"$d = $" + str(deg))

g.setLegends(leg)
g.setLegendPosition("topleft")
view = otv.View(g)

Moreover, we print the values of the first polynomials at x = 0: their signs alternate.

for deg in range(6):
    coeff_poly_deg = sample_coeff[0:deg]
    poly_deg = ot.OrthogonalUniVariatePolynomial(coeff_poly_deg)
    print("degree = ", deg, "polynomial(0) = ", poly_deg(0))
degree =  0 polynomial(0) =  1.0
degree =  1 polynomial(0) =  -0.7628739831836646
degree =  2 polynomial(0) =  0.49765901615612673
degree =  3 polynomial(0) =  -0.3217752980797546
degree =  4 polynomial(0) =  0.2877253668417715
degree =  5 polynomial(0) =  -0.2969414432018124

We observe that, up to degree 20, the polynomials:

  • are computed in a stable manner, as indicated by their smooth graphs,

  • take reasonable values on the interval [0, 25],

  • alternate in sign at x = 0 as the degree increases. Hence the value of the surrogate model in the vicinity of x = 0, which has to be large, will be obtained as a sum of real values with opposite signs and large magnitudes.
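The cancellation mechanism anticipated here can be reproduced in isolation. The following plain-Python sketch (independent of the example) shows how an O(1) result assembled from huge opposite-sign terms loses every significant digit once the terms exceed the 2^{53} resolution of double precision:

```python
# Catastrophic cancellation in double precision: a float carries about 15-16
# significant decimal digits, so an O(1) answer assembled from terms of
# opposite signs and magnitude 1e16 loses all of its accuracy.
small_case = (1.0e14 + 1.0) - 1.0e14  # 1e14 + 1 is exactly representable: result 1.0
large_case = (1.0e16 + 1.0) - 1.0e16  # 1e16 + 1 rounds back to 1e16: result 0.0
print(small_case, large_case)
```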

Case 1: PCE using orthonormal polynomial family w.r.t. the LogNormal distribution

We define \model. The input distribution of the variable X has already been created in the previous lines.

g = ot.SymbolicFunction("x", "1/(x)")

We create a training sample. We check that all points lie in the interval [0, 25], which is the interval plotted previously.

sample_size = 1000
input_train = dist_X.getSample(sample_size)
output_train = g(input_train)
print('max sample = ', input_train.getMax()[0])
max sample =  23.560713541344256

We now construct the polynomial chaos surrogate model \metaModel using the polynomials up to degree 5. Note that the coefficients in the linear combination of the polynomials are large, as expected. They all have the same sign; since the polynomials alternate in sign at x = 0, the value of the surrogate model in the vicinity of x = 0 is obtained as a sum of terms with opposite signs and large magnitudes. As such, it is prone to cancellation.

basis_LN = ot.OrthogonalProductPolynomialFactory(
    [ot.StandardDistributionPolynomialFactory(ot.LogNormal())]
)
basis_size_LN = 6
chaos_algo_LN = ot.LeastSquaresExpansion(
    input_train, output_train, dist_X, basis_LN, basis_size_LN
)
chaos_algo_LN.run()
result_LN = chaos_algo_LN.getResult()
meta_model_LN_5 = result_LN.getMetaModel()
result_LN
FunctionalChaosResult
  • input dimension: 1
  • output dimension: 1
  • distribution dimension: 1
  • transformation: 1 -> 1
  • inverse transformation: 1 -> 1
  • orthogonal basis dimension: 1
  • indices size: 6
Index Multi-index Coeff.
0 [0] -23.67553
1 [1] -3651.218
2 [2] -111611.1
3 [3] -658095
4 [4] -798496.9
5 [5] -238352


We assess the quality of the surrogate model using the MetaModelValidation class. We compute the R^2 score, which is very low.

input_test = dist_X.getSample(sample_size)
output_test = g(input_test)
meta_model_predictions_LN = meta_model_LN_5(input_test)
valid_meta_model_LN = ot.MetaModelValidation(output_test, meta_model_predictions_LN)
r2_score_LN = valid_meta_model_LN.computeR2Score()[0]
r2_score_LN
-0.06976176960795577
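For reference, the R^2 score is 1 - SS_res / SS_tot; a negative value, as here, means the surrogate predicts worse than the constant mean of the test outputs. A minimal plain-Python sketch of the formula on toy (hypothetical) data:

```python
# R^2 = 1 - sum((y_i - yhat_i)^2) / sum((y_i - mean(y))^2).
# A negative score means the predictor is worse than the constant mean(y).
def r2_score(y, y_hat):
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Toy data (hypothetical): a predictor that overshoots badly gets R^2 < 0.
y = [1.0, 2.0, 3.0]
print(r2_score(y, [1.0, 2.0, 3.0]))     # perfect predictions: 1.0
print(r2_score(y, [10.0, 10.0, 10.0]))  # worse than the mean: negative
```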

In the following graph, we see that the points are far away from the diagonal: the surrogate model is not accurate, in particular in the central part of the LogNormal distribution. Let this central part be defined as the interval I = [q_{0.1}, q_{0.9}] \simeq [0.27, 3.6], where q_{0.1} and q_{0.9} are the quantiles of order 0.1 and 0.9 of the LogNormal distribution. Within this interval, the graph shows that the surrogate model and the model differ significantly, which explains why the R^2 score is so low. This region corresponds to model values within [1/3.6, 1/0.27] \simeq [0.28, 3.6].

We show the validation graph and its zoom for x \in I.

graph_valid_LN = valid_meta_model_LN.drawValidation().getGraph(0, 0)
graph_valid_LN.setTitle(
    r"$R^2=$%.2f%%, Polynomial surrogate model of degree $d = $" % (r2_score_LN * 100)
    + str(basis_size_LN - 1)
)

grid = ot.GridLayout(1, 2)
grid.setGraph(0, 0, graph_valid_LN)
bounding_box = ot.Interval([0.2, 0.2], [4.0, 4.0])
graph_valid_LN.setBoundingBox(bounding_box)
graph_valid_LN.setTitle(r'zoom for $x \in I$')
grid.setGraph(0, 1, graph_valid_LN)
view = otv.View(grid)

In the next figure, we plot the model and the surrogate model.

graph_LN = g.draw(lower_bound, upper_bound, [251])
graph_LN.add(meta_model_LN_5.draw(lower_bound, upper_bound, [251]))
graph_LN.setLegends(["model", "surrogate model"])
graph_LN.setLegendPosition("topright")
graph_LN.setTitle(
    r"Polynomial surrogate model of degree $d = $"
    + str(basis_size_LN - 1)
)
graph_LN.setXTitle("x")
graph_LN.setYTitle("y")
view = otv.View(graph_LN)

In conclusion, the surrogate model is a low-quality one.

We then increase the degree of the surrogate model up to 8 and 9. We observe that the surrogate model does not converge to the true model, even for high polynomial degrees. For each surrogate model, we display the coefficients of the linear combination. We see that the coefficients become very large.

Finally, we store the surrogate models in a list so that they can easily be reused later.

basis_size_list = [9, 10]
leg = graph_LN.getLegends()
leg[1] = r"$d = $" + str(basis_size_LN - 1)
meta_model_LN_list = list()
for basis_size in basis_size_list:
    chaos_algo_LN = ot.LeastSquaresExpansion(
        input_train, output_train, dist_X, basis_LN, basis_size
    )
    chaos_algo_LN.run()
    meta_model_LN = chaos_algo_LN.getResult().getMetaModel()
    meta_model_LN_list.append(meta_model_LN)
    graph_LN.add(meta_model_LN.draw(lower_bound, upper_bound, [128]))
    leg.add(r"$d = $" + str(basis_size - 1))
    result_LN = chaos_algo_LN.getResult()
    meta_model_LN = result_LN.getMetaModel()
    print("degree = ", basis_size - 1, "coefficients of the meta Model (LN) = ", result_LN.getCoefficients())
graph_LN.setTitle("Polynomial surrogate model of degree $d$")
graph_LN.setLegends(leg)
view = otv.View(graph_LN)
degree =  8 coefficients of the meta Model (LN) =  0 : [ 3.74947e+07 ]
1 : [ 2.12584e+10 ]
2 : [ 1.61419e+12 ]
3 : [ 1.70001e+13 ]
4 : [ 3.5334e+13  ]
5 : [ 2.83542e+13 ]
6 : [ 1.24457e+13 ]
7 : [ 3.02189e+12 ]
8 : [ 3.18442e+11 ]
degree =  9 coefficients of the meta Model (LN) =  0 : [ -6.93693e+07 ]
1 : [ -4.9564e+10  ]
2 : [ -4.29664e+12 ]
3 : [ -4.93729e+13 ]
4 : [ -1.11742e+14 ]
5 : [ -1.03185e+14 ]
6 : [ -5.71302e+13 ]
7 : [ -2.01392e+13 ]
8 : [ -4.1825e+12  ]
9 : [ -3.91781e+11 ]

We note that the graph of the surrogate model is not smooth from degree 9 onwards, which is due to massive cancellation: for the surrogate model of degree 9, all the coefficients have a large magnitude (up to the order of 10^{14}) and the same sign, but they multiply polynomials of alternating sign. As a result, the computation of the surrogate model is prone to numerical cancellation issues, as anticipated earlier.

Case 2: PCE using orthonormal polynomial family w.r.t. another distribution

We choose to modify the parametrization of the model in order to expand the composed model w.r.t. the family of polynomials orthonormal with respect to the Normal distribution. The Normal distribution is characterized by the infinite sequence of its moments. We use the transformation T which maps the standard Normal distribution into the LogNormal distribution. It is defined by:

T(z) = e^z.

Thus, the polynomial chaos expansion is constructed for h(z) = (g \circ T)(z) = e^{-z}. This function is analytic with an infinite radius of convergence, so its polynomial approximation converges very quickly and does not require many polynomials to reach spectral convergence.
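The transformation can be checked numerically without OpenTURNS: if X \sim \mathcal{LN}(0,1), then Z = \ln(X) is standard Normal, so the quantiles satisfy q_X(p) = e^{q_Z(p)}. A sketch using statistics.NormalDist from the standard library (Python >= 3.8 assumed); note that it recovers the interval I \simeq [0.27, 3.6] used earlier:

```python
from math import exp
from statistics import NormalDist

# If X ~ LogNormal(0, 1), then Z = ln(X) ~ Normal(0, 1), so the
# quantile functions satisfy q_X(p) = exp(q_Z(p)).
z = NormalDist(0.0, 1.0)
for p in (0.1, 0.5, 0.9):
    q_X = exp(z.inv_cdf(p))
    print(p, q_X)  # p = 0.1 and 0.9 recover I = [q_0.1, q_0.9] ~ [0.27, 3.6]
```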

We construct the surrogate model as before, using only the polynomials up to degree 5.

basis_N = ot.OrthogonalProductPolynomialFactory(
    [ot.StandardDistributionPolynomialFactory(ot.Normal())]
)
basis_size_N = 6
chaos_algo_N = ot.LeastSquaresExpansion(
    input_train, output_train, dist_X, basis_N, basis_size_N
)

chaos_algo_N.run()
meta_model_N = chaos_algo_N.getResult().getMetaModel()

We assess the quality of the surrogate model on the training sample: the R^2 score is very high.

meta_model_predictions_N = meta_model_N(input_train)
valid_meta_model_N = ot.MetaModelValidation(output_train, meta_model_predictions_N)
r2_score_N = valid_meta_model_N.computeR2Score()[0]
r2_score_N

graph_valid_N = valid_meta_model_N.drawValidation()
graph_valid_N.setTitle(
    r"$R^2=$%.2f%%, Normal polyn. family, $d \leq $" % (r2_score_N * 100)
    + str(basis_size_N - 1)
)
view = otv.View(graph_valid_N)
$R^2=$99.99%, Normal polyn. family, $d \leq $5

Here, we plot the model and the surrogate model. No differences are observed between the model and the surrogate model across the entire interval [0, 25].

graph_N = g.draw(lower_bound, upper_bound, [251])
graph_N.add(meta_model_N.draw(lower_bound, upper_bound, [251]))
graph_N.setLegends(["model", "surrogate model"])
graph_N.setLegendPosition("topright")
graph_N.setTitle(
    r"Chaos expansion with $T$, $d =$"
    + str(basis_size_N - 1)
)
graph_N.setXTitle("x")
graph_N.setYTitle("y")
view = otv.View(graph_N)

We compute the L^2-error between the surrogate model and the true model, for the surrogate models with and without the transformation T. We see that with the transformation, the surrogate model converges very quickly to the true model. Without the transformation, it does not seem to converge.

This is an example where:

  • the use of a polynomial approximation space does not lead to a convergent sequence of surrogate models,

  • the use of the isoprobabilistic transformation is beneficial.

size = 100000
sample_in = dist_X.getSample(size)
sample_model = g(sample_in)
sample_predic_N = meta_model_N(sample_in)
sample_predic_LN = meta_model_LN_5(sample_in)
square_L2_N_norm = (sample_predic_N - sample_model).computeRawMoment(2)[0]
square_L2_LN_norm = (sample_predic_LN - sample_model).computeRawMoment(2)[0]
print("surrogate model with    T and degree =", basis_size_N - 1, "L2 norm = ", sqrt(square_L2_N_norm))
print("surrogate model without T and degree =", basis_size_N - 1, "L2 norm = ", sqrt(square_L2_LN_norm))

# Here, we facilitate the comparison between the surrogate model constructed from the polynomial family orthonormal
# with respect to the LogNormal distribution up to degree 20,
# and the surrogate model constructed from the polynomial family orthonormal
# with respect to the Normal distribution up to degree 5.
graph_comp = g.draw(lower_bound, upper_bound, [251])
graph_comp.add(meta_model_N.draw(lower_bound, upper_bound, [251]))
graph_comp.add(meta_model_LN_5.draw(lower_bound, upper_bound, [251]))
graph_comp.setLegends(
    [
        "model",
        r"$d = 5$, with $T$",
        r"$d = 5$, without $T$",
    ]
)
graph_comp.setLegendPosition("topright")
graph_comp.setTitle(
    r"Chaos expansion of $f: x \rightarrow 1/x$,  $X \sim LogNormal$"
)
graph_comp.setXTitle("x")
graph_comp.setYTitle("y")
view = otv.View(graph_comp)
surrogate model with    T and degree = 5 L2 norm =  0.04933647163951184
surrogate model without T and degree = 5 L2 norm =  691.9412593844918

We display all figures.

otv.View.ShowAll()