Pitfalls in polynomial chaos expansion due to the input distribution¶
In this example, we construct a polynomial chaos expansion (refer to Functional Chaos Expansion for more details) using a countable family of polynomials orthonormal with respect to an input distribution $\mu$ such that the functional space generated by this orthonormal polynomial family is not dense in the Hilbert space $L^2_\mu(\mathbb{R})$ of functions square-integrable with respect to $\mu$. In other words, this family is not complete.
Therefore, there exist functions in $L^2_\mu(\mathbb{R})$ that are not equal to their orthogonal projection onto the closure of the polynomial space. Moreover, any approximation obtained by truncating this projection may lead to a low-quality surrogate model (see [feller1970]).
This situation occurs when $\mu$ is not characterized by the infinite sequence of its moments. The LogNormal distribution is one such example.
We also show that numerical cancellation issues may arise when using a large number of polynomials in the surrogate model.
We consider the model defined by:

$$g(x) = \frac{1}{x}, \quad x > 0.$$

Let the random variable $X$ follow a LogNormal distribution whose PDF is denoted by $\mu$.
This distribution is not characterized by the infinite sequence of its moments.
This particular case is discussed in [ernst2012].
We proceed as follows:

- Introduction: we build the family of polynomials orthonormal with respect to the LogNormal distribution up to high degrees,
- Case 1: we create a polynomial chaos surrogate model of $g$ using the polynomials orthonormal with respect to the LogNormal distribution,
- Case 2: we show how to obtain a higher-quality surrogate model using another polynomial family: the polynomials orthonormal with respect to the Normal distribution.
import openturns as ot
import openturns.viewer as otv
from math import sqrt
# sphinx_gallery_thumbnail_number = 8
Introduction: orthonormal polynomial family w.r.t. the LogNormal distribution¶
In this section, we build the family of polynomials orthonormal with respect to the LogNormal distribution
using the AdaptiveStieltjesAlgorithm.
We first build the LogNormal distribution and we display its probability density function.
dist_X = ot.LogNormal()
g = dist_X.drawPDF()
view = otv.View(g)
We build the polynomials up to degree 20: they are defined by their three-term recurrence coefficients.
lower_bound = dist_X.computeQuantile(0.01, False)
upper_bound = dist_X.computeQuantile(0.01, True)
degreeMax = 20
as_algo = ot.AdaptiveStieltjesAlgorithm(dist_X)
sample_coeff = ot.Sample(0, 3)
for i in range(degreeMax):
coeff = as_algo.getRecurrenceCoefficients(i)
sample_coeff.add(coeff)
We build a set of polynomials with increasing degrees.
For each polynomial, we display its value at $x = 0$.
We anticipate that the vicinity of $x = 0$ will cause difficulties for the surrogate model of $g$, since $g(x) \rightarrow +\infty$ when $x \rightarrow 0^+$.
Given that the polynomials take reasonable values in the vicinity of $x = 0$, we expect the polynomial surrogate model to require very large linear coefficients in order to reproduce high values near $x = 0$, which may lead to numerical cancellation issues.
We explore polynomial degrees up to 20.
deg_list = [5, 10, 20]
g = ot.Graph(
"Polynomials orthonormal with respect to the LogNormal distribution", "x", "y", True
)
leg = ot.Description(0)
for deg in deg_list:
coeff_poly_deg = sample_coeff[0:deg]
poly_deg = ot.OrthogonalUniVariatePolynomial(coeff_poly_deg)
g.add(poly_deg.draw(lower_bound[0], upper_bound[0], 1024))
leg.add(r"$d = $" + str(deg))
g.setLegends(leg)
g.setLegendPosition("topleft")
view = otv.View(g)
Moreover, we print the values of the first polynomials at $x = 0$: their signs alternate.
for deg in range(6):
coeff_poly_deg = sample_coeff[0:deg]
poly_deg = ot.OrthogonalUniVariatePolynomial(coeff_poly_deg)
print("degree = ", deg, "polynomial(0) = ", poly_deg(0))
degree = 0 polynomial(0) = 1.0
degree = 1 polynomial(0) = -0.7628739831836646
degree = 2 polynomial(0) = 0.49765901615612673
degree = 3 polynomial(0) = -0.3217752980797546
degree = 4 polynomial(0) = 0.2877253668417715
degree = 5 polynomial(0) = -0.2969414432018124
We observe that, for the polynomials up to degree 20:

- they are computed in a stable manner, as indicated by their smooth graphs,
- they take reasonable values on the interval $[q_{0.01}, q_{0.99}]$ plotted above,
- their sign at $x = 0$ alternates with the degree.

Hence the value of the surrogate model in the vicinity of $x = 0$, which has to be large, will be obtained as the sum of real values with opposite signs and large magnitudes.
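The danger of summing large terms with alternating signs can be demonstrated on a classic example unrelated to this model: evaluating $e^{-30}$ by its Taylor series, whose terms reach about $10^{12}$ while the result is about $10^{-13}$, so the naive alternating sum returns pure rounding noise:

```python
from math import exp

# Evaluate exp(x) by its Taylor series, summing terms iteratively.
def exp_taylor(x, n_terms=200):
    total, term = 0.0, 1.0
    for k in range(n_terms):
        total += term
        term *= x / (k + 1)
    return total

naive = exp_taylor(-30.0)        # alternating signs: catastrophic cancellation
stable = 1.0 / exp_taylor(30.0)  # same-sign sum, then invert: accurate
exact = exp(-30.0)
print("naive  =", naive)
print("stable =", stable)
print("exact  =", exact)
```

The naive sum is wrong by many orders of magnitude, while the reformulated computation is accurate to machine precision. The high-degree chaos expansion in Case 1 below suffers from exactly this kind of cancellation.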
Case 1: PCE using orthonormal polynomial family w.r.t. the LogNormal distribution¶
We define $g(x) = 1/x$. The input distribution of the variable $X$ has already been created in the previous lines.
g = ot.SymbolicFunction("x", "1/(x)")
We create a training sample. We print the maximum of the sample and compare it with the interval $[q_{0.01}, q_{0.99}]$ plotted previously.
sample_size = 1000
input_train = dist_X.getSample(sample_size)
output_train = g(input_train)
print('max sample = ', input_train.getMax()[0])
max sample = 23.560713541344256
We now construct the polynomial chaos surrogate model using the polynomials up to degree 5. Note that the coefficients in the linear combination of the polynomials are large, as expected. They all have the same sign; since the values of the polynomials at $x = 0$ alternate in sign, this confirms that the value of the surrogate model in the vicinity of $x = 0$ will be obtained as the sum of real values with opposite signs and large magnitudes. As such, it is prone to cancellation.
basis_LN = ot.OrthogonalProductPolynomialFactory(
[ot.StandardDistributionPolynomialFactory(ot.LogNormal())]
)
basis_size_LN = 6
chaos_algo_LN = ot.LeastSquaresExpansion(
input_train, output_train, dist_X, basis_LN, basis_size_LN
)
chaos_algo_LN.run()
result_LN = chaos_algo_LN.getResult()
meta_model_LN_5 = result_LN.getMetaModel()
result_LN
We assess the quality of the surrogate model using the MetaModelValidation class.
We compute the $R^2$ score, which is very low.
input_test = dist_X.getSample(sample_size)
output_test = g(input_test)
meta_model_predictions_LN = meta_model_LN_5(input_test)
valid_meta_model_LN = ot.MetaModelValidation(output_test, meta_model_predictions_LN)
r2_score_LN = valid_meta_model_LN.computeR2Score()[0]
r2_score_LN
-0.06976176960795577
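A score this low, and even negative, can be read off the usual definition of $R^2$ (a sketch assuming `MetaModelValidation` implements the standard coefficient of determination, $1 - SS_{res}/SS_{tot}$):

```python
# Coefficient of determination: 1 - SS_res / SS_tot.  A score near 1 means
# the predictions explain the output variance; a score near 0 or negative,
# as obtained above, means the surrogate model does no better than simply
# predicting the output mean.
def r2_score(observed, predicted):
    mean = sum(observed) / len(observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    ss_tot = sum((y - mean) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot

observed = [1.0, 2.0, 3.0, 4.0]
print(r2_score(observed, observed))              # exact predictions: 1.0
print(r2_score(observed, [2.5] * 4))             # constant mean prediction: 0.0
print(r2_score(observed, [4.0, 3.0, 2.0, 1.0]))  # worse than the mean: negative
```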
In the following graph, we see that the points are far away from the diagonal: the surrogate model is not accurate, in particular in the central part of the LogNormal distribution.
Let the central part of the LogNormal distribution be defined as the interval $I = [q_{0.1}, q_{0.9}]$, where $q_{0.1}$ is the quantile of order 0.1 of the LogNormal distribution and $q_{0.9}$ is the quantile of order 0.9.
Within this interval, we see on the graph that the surrogate model and the model differ significantly, which explains why the $R^2$ score is so bad. This region corresponds to the model values within $g(I) = [1/q_{0.9}, 1/q_{0.1}]$.
We show the validation graph and a zoom for $x \in I$.
graph_valid_LN = valid_meta_model_LN.drawValidation().getGraph(0, 0)
graph_valid_LN.setTitle(
r"$R^2=$%.2f%%,Polynomial surrogate model of degree $d = $" % (r2_score_LN * 100)
+ str(basis_size_LN - 1)
)
grid = ot.GridLayout(1, 2)
grid.setGraph(0, 0, graph_valid_LN)
bounding_box = ot.Interval([0.2, 0.2], [4.0, 4.0])
graph_valid_LN.setBoundingBox(bounding_box)
graph_valid_LN.setTitle(r'zoom for $x \in I$')
grid.setGraph(0, 1, graph_valid_LN)
view = otv.View(grid)
In the next figure, we plot the model and the surrogate model.
graph_LN = g.draw(lower_bound, upper_bound, [251])
graph_LN.add(meta_model_LN_5.draw(lower_bound, upper_bound, [251]))
graph_LN.setLegends(["model", "surrogate model"])
graph_LN.setLegendPosition("topright")
graph_LN.setTitle(
r"Polynomial surrogate model of degree $d = $"
+ str(basis_size_LN - 1)
)
graph_LN.setXTitle("x")
graph_LN.setYTitle("y")
view = otv.View(graph_LN)
In conclusion, the surrogate model is a low-quality one.
We then increase the degree of the surrogate model to 8 and then 9. We observe that the surrogate model does not converge to the true model, even for these higher polynomial degrees. For each surrogate model, we display the coefficients of the linear combination: they become very large.
Finally, we store each surrogate model in a list so that it can easily be reused later.
basis_size_list = [9, 10]
leg = graph_LN.getLegends()
leg[1] = r"$d = $" + str(basis_size_LN - 1)
meta_model_LN_list = list()
for basis_size in basis_size_list:
chaos_algo_LN = ot.LeastSquaresExpansion(
input_train, output_train, dist_X, basis_LN, basis_size
)
chaos_algo_LN.run()
meta_model_LN = chaos_algo_LN.getResult().getMetaModel()
meta_model_LN_list.append(meta_model_LN)
graph_LN.add(meta_model_LN.draw(lower_bound, upper_bound, [128]))
leg.add(r"$d = $" + str(basis_size - 1))
result_LN = chaos_algo_LN.getResult()
meta_model_LN = result_LN.getMetaModel()
print("degree = ", basis_size - 1, "coefficients of the meta Model (LN) = ", result_LN.getCoefficients())
graph_LN.setTitle("Polynomial surrogate model of degree $d$")
graph_LN.setLegends(leg)
view = otv.View(graph_LN)
degree = 8 coefficients of the meta Model (LN) = 0 : [ 3.74947e+07 ]
1 : [ 2.12584e+10 ]
2 : [ 1.61419e+12 ]
3 : [ 1.70001e+13 ]
4 : [ 3.5334e+13 ]
5 : [ 2.83542e+13 ]
6 : [ 1.24457e+13 ]
7 : [ 3.02189e+12 ]
8 : [ 3.18442e+11 ]
degree = 9 coefficients of the meta Model (LN) = 0 : [ -6.93693e+07 ]
1 : [ -4.9564e+10 ]
2 : [ -4.29664e+12 ]
3 : [ -4.93729e+13 ]
4 : [ -1.11742e+14 ]
5 : [ -1.03185e+14 ]
6 : [ -5.71302e+13 ]
7 : [ -2.01392e+13 ]
8 : [ -4.1825e+12 ]
9 : [ -3.91781e+11 ]
We note that the graph of the surrogate model is not smooth from degree 9 onwards, which is due to massive cancellation: for the surrogate model of degree 9, all the coefficients have a large magnitude (up to the order of $10^{14}$) and the same sign, but they multiply polynomials whose values at $x = 0$ alternate in sign. As a result, the computation of the surrogate model is prone to numerical cancellation issues, as anticipated earlier.
Case 2: PCE using orthonormal polynomial family w.r.t. another distribution¶
We choose to modify the parametrisation of the model in order to expand the composed model $g \circ T$ with respect to the family of polynomials orthonormal with respect to the Normal distribution. The Normal distribution is characterized by its infinite sequence of moments. We use the isoprobabilistic transformation $T$, which maps the Normal distribution into the LogNormal distribution. It is defined by:

$$T(z) = e^z.$$

Thus, the polynomial chaos expansion is constructed for $(g \circ T)(z) = e^{-z}$.
This function is analytic with an infinite radius of convergence, so its polynomial approximation converges very quickly and does not require many polynomials to reach spectral convergence.
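The spectral decay can be made explicit with a short sketch that is not part of the example's code: the identity $\mathbb{E}[e^{tZ} He_k(Z)] = t^k e^{t^2/2}$ for a standard Normal $Z$ gives the Hermite chaos coefficients of $e^{-z}$ in closed form, $c_k = (-1)^k e^{1/2}/\sqrt{k!}$:

```python
from math import exp, sqrt, factorial

# Closed-form Hermite chaos coefficients of h(z) = exp(-z) for Z ~ Normal(0, 1):
# c_k = (-1)^k * exp(1/2) / sqrt(k!), which decays faster than any geometric rate.
degree = 20
coeffs = [(-1) ** k * exp(0.5) / sqrt(factorial(k)) for k in range(degree + 1)]

def expansion(z):
    # Evaluate sum_k c_k He_k(z) / sqrt(k!) with the probabilists' Hermite
    # recurrence He_{k+1}(z) = z He_k(z) - k He_{k-1}(z).
    total = 0.0
    he_prev, he_curr = 0.0, 1.0  # He_{-1} (unused at k = 0), He_0
    for k in range(degree + 1):
        total += coeffs[k] * he_curr / sqrt(factorial(k))
        he_prev, he_curr = he_curr, z * he_curr - k * he_prev
    return total

print("max |c_k| for k >= 10:", max(abs(c) for c in coeffs[10:]))
print("error at z = 1:", abs(expansion(1.0) - exp(-1.0)))
```

With only 21 coefficients, the truncated expansion matches $e^{-z}$ to roughly machine precision, in sharp contrast with the divergent behaviour observed in Case 1.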
We construct the surrogate model as before, using only the polynomials up to degree 5.
basis_N = ot.OrthogonalProductPolynomialFactory(
[ot.StandardDistributionPolynomialFactory(ot.Normal())]
)
basis_size_N = 6
chaos_algo_N = ot.LeastSquaresExpansion(
input_train, output_train, dist_X, basis_N, basis_size_N
)
chaos_algo_N.run()
meta_model_N = chaos_algo_N.getResult().getMetaModel()
We assess the quality of the surrogate model and compute the $R^2$ score, which is very high.
meta_model_predictions_N = meta_model_N(input_train)
valid_meta_model_N = ot.MetaModelValidation(output_train, meta_model_predictions_N)
r2_score_N = valid_meta_model_N.computeR2Score()[0]
r2_score_N
graph_valid_N = valid_meta_model_N.drawValidation()
graph_valid_N.setTitle(
r"$R^2=$%.2f%%, Normal polyn. family, $d \leq $" % (r2_score_N * 100)
+ str(basis_size_N - 1)
)
view = otv.View(graph_valid_N)
Here, we plot the model and the surrogate model.
No differences are observed between the model and the surrogate model across the entire interval $[q_{0.01}, q_{0.99}]$.
graph_N = g.draw(lower_bound, upper_bound, [251])
graph_N.add(meta_model_N.draw(lower_bound, upper_bound, [251]))
graph_N.setLegends(["model", "surrogate model"])
graph_N.setLegendPosition("topright")
graph_N.setTitle(
r"Chaos expansion with $T$, $d =$"
+ str(basis_size_N - 1)
)
graph_N.setXTitle("x")
graph_N.setYTitle("y")
view = otv.View(graph_N)
We compute the $L^2$-error between the surrogate model and the true model, for the surrogate models with and without the transformation $T$.
We see that with the transformation, the surrogate model converges very quickly to the true model. Without the transformation, it does not seem to converge.
This is an example where:

- the use of a polynomial approximation space does not lead to a convergent sequence of surrogate models,
- the use of the isoprobabilistic transformation is beneficial.
size = 100000
sample_in = dist_X.getSample(size)
sample_model = g(sample_in)
sample_predic_N = meta_model_N(sample_in)
sample_predic_LN = meta_model_LN_5(sample_in)
square_L2_N_norm = (sample_predic_N - sample_model).computeRawMoment(2)[0]
square_L2_LN_norm = (sample_predic_LN - sample_model).computeRawMoment(2)[0]
print("surrogate model with T and degree =", basis_size_N - 1, "L2 norm = ", sqrt(square_L2_N_norm))
print("surrogate model without T and degree =", basis_size_N - 1, "L2 norm = ", sqrt(square_L2_LN_norm))
# Here, we compare the surrogate model constructed from the polynomial family orthonormal
# with respect to the LogNormal distribution up to degree 5,
# with the surrogate model constructed from the polynomial family orthonormal
# with respect to the Normal distribution up to degree 5.
graph_comp = g.draw(lower_bound, upper_bound, [251])
graph_comp.add(meta_model_N.draw(lower_bound, upper_bound, [251]))
graph_comp.add(meta_model_LN_5.draw(lower_bound, upper_bound, [251]))
graph_comp.setLegends(
[
"model",
r"$d = 5$, with $T$",
r"$d = 5$, without $T$",
]
)
graph_comp.setLegendPosition("topright")
graph_comp.setTitle(
r"Chaos expansion of $f: x \rightarrow 1/x$, $X \sim LogNormal$"
)
graph_comp.setXTitle("x")
graph_comp.setYTitle("y")
view = otv.View(graph_comp)
surrogate model with T and degree = 5 L2 norm = 0.04933647163951184
surrogate model without T and degree = 5 L2 norm = 691.9412593844918
We display all figures.
otv.View.ShowAll()