
# Conditional expectation of a polynomial chaos expansion


In this example, we compute the conditional expectation of a polynomial
chaos expansion of the `Ishigami function <use-case-ishigami>` using
the :meth:`~openturns.FunctionalChaosResult.getConditionalExpectation`
method.



## Introduction
Let $\inputDim \in \Nset$
be the dimension of the input random vector.
Let $\Expect{\inputRV} \in \Rset^\inputDim$
be the mean of the input random vector $\inputRV$.
Let $\model$ be the physical model:

\begin{align}\model : \Rset^\inputDim \rightarrow \Rset.\end{align}

Given $\vect{u} \subseteq \{1, ..., \inputDim\}$ a group
of input variables, we want to create a new function $\widehat{\model}$:

\begin{align}\widehat{\model}: \Rset^{|\vect{u}|} \rightarrow \Rset\end{align}

where $|\vect{u}| = \operatorname{card}(\vect{u})$ is the
number of variables in the group.

In this example, we experiment with two different ways to reduce the
input dimension of a polynomial chaos expansion:

- the parametric function,
- the conditional expectation.

The goal of this page is to see how we can create these
functions and the difference between them.



## Parametric function

The simplest method to reduce the dimension of the input
is to set some input variables to constant values.
In this example, all marginal inputs, except those in
the conditioning indices, are set to the mean of the input random vector.

Let $\overline{\vect{u}}$ be the complementary set of input
marginal indices such that $\vect{u}$ and $\overline{\vect{u}}$
form a disjoint partition of the full set of variable indices:

\begin{align}\vect{u} \; \dot{\cup} \; \overline{\vect{u}} = \{1, ..., \inputDim\}.\end{align}

The parametric function with reduced dimension is:

\begin{align}\widehat{\model}(\inputReal_{\vect{u}})
  = \model\left(\inputReal_{\vect{u}},
           \inputReal_{\overline{\vect{u}}}
           = \Expect{\inputRV_{\overline{\vect{u}}}}\right)\end{align}

for any $\inputReal_{\vect{u}} \in \Rset^{|\vect{u}|}$.
The previous function is a parametric function based on the function $\model$
where the parameter is $\Expect{\inputRV_{\overline{\vect{u}}}}$.
Assuming that the input random vector has an independent copula,
computing $\Expect{\inputRV_{\overline{\vect{u}}}}$
can be done by selecting the corresponding indices in $\Expect{\inputRV}$.
This function can be created using the :class:`~openturns.ParametricFunction`
class.
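
Independently of the library, the construction can be sketched in plain Python.
The helper `make_parametric` below is hypothetical (it is not part of the library),
and the Ishigami function is written with its usual constants $a = 7$ and $b = 0.1$,
which are assumed here.
Since each input is uniform on $[-\pi, \pi]$, the mean of the input random vector is zero.

```python
import math


def ishigami(x, a=7.0, b=0.1):
    """Ishigami function with the usual constants (assumed here)."""
    x1, x2, x3 = x
    return math.sin(x1) + a * math.sin(x2) ** 2 + b * x3**4 * math.sin(x1)


def make_parametric(model, dimension, fixed_indices, fixed_values):
    """Return a function of the free inputs only (hypothetical helper)."""
    fixed = dict(zip(fixed_indices, fixed_values))
    free_indices = [i for i in range(dimension) if i not in fixed]

    def parametric(x_free):
        x = [0.0] * dimension
        # Frozen inputs take their constant values
        for i, value in fixed.items():
            x[i] = value
        # Free inputs take the values given by the caller
        for i, value in zip(free_indices, x_free):
            x[i] = value
        return model(x)

    return parametric


# Each input is uniform on [-pi, pi], so the mean of the input vector is zero.
x_mean = [0.0, 0.0, 0.0]
fixed_indices = [1, 2]  # 0-based: X2 and X3 are frozen, X1 stays free
g = make_parametric(ishigami, 3, fixed_indices, [x_mean[i] for i in fixed_indices])
print(g([math.pi / 2]))
```

With $X_2$ and $X_3$ frozen at zero, the reduced function is simply $\sin(x_1)$ in this sketch.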



## Parametric PCE

If the physical model is a PCE, then the associated parametric model is also a
PCE.
Its coefficients and the associated functional basis can be computed from
the original PCE.
A significant fact, however, is that the coefficients of the parametric
PCE are *not* the ones of the original PCE: the coefficients of the original
PCE have to be multiplied by factors which depend on the
value of the discarded basis functions at the parameter vector.
This feature is not currently available in the library.
However, we present the derivation below because it helps
to understand why the conditional expectation may behave
differently from the corresponding parametric PCE.



Let $\cJ^P \subseteq \Nset^{\inputDim}$ be the set of
multi-indices corresponding to the truncated polynomial chaos expansion
up to the $P$-th coefficient.
Let $h$ be the PCE in the standard space:

\begin{align}h(\standardReal) = \sum_{\vect{\alpha} \in \cJ^P}
        a_{\vect{\alpha}} \psi_{\vect{\alpha}}(\standardReal).\end{align}

Let $\vect{u} \subseteq \{1, ..., \inputDim\}$ be a group of variables,
let $\overline{\vect{u}}$ be its complementary set such that

\begin{align}\vect{u} \; \dot{\cup} \; \overline{\vect{u}} = \{1, ..., \inputDim\}\end{align}

i.e. the groups $\vect{u}$ and $\overline{\vect{u}}$ create a disjoint partition
of the set $\{1, ..., \inputDim\}$.
Let $|\vect{u}| \in \Nset$ be the number of elements
in the group $\vect{u}$.
Hence, we have $|\vect{u}| + |\overline{\vect{u}}| = \inputDim$.



Let $\standardReal_{\vect{u}}^{(0)} \in \Rset^{|\vect{u}|}$
be a given point.
We are interested in the function:

\begin{align}\widehat{h}(\standardReal_{\overline{\vect{u}}})
    = h\left(\standardReal_{\overline{\vect{u}}},
    \standardReal_{\vect{u}}^{(0)}\right)\end{align}

for any $\standardReal_{\overline{\vect{u}}} \in \Rset^{|\overline{\vect{u}}|}$.
We assume that the polynomial basis functions are defined by the tensor product:

\begin{align}\psi_{\vect{\alpha}}\left(\standardReal\right)
    = \prod_{i = 1}^{\inputDim}
    \pi_{\alpha_i}^{(i)}\left(z_i\right)\end{align}

for any $\standardReal \in \standardInputSpace$
where $\pi_{\alpha_i}^{(i)}$ is the polynomial of degree
$\alpha_i$ of the $i$-th input standard variable and $z_i$ is the $i$-th component of $\standardReal$.

Let $\vect{u} = (u_i)_{i = 1, ..., |\vect{u}|}$ denote the components of the
group $\vect{u}$ where $|\vect{u}|$ is the number of elements in the group.
Similarly, let $\overline{\vect{u}} = (\overline{u}_i)_{i = 1, ..., |\overline{\vect{u}}|}$ denote the
components of the complementary group $\overline{\vect{u}}$.
The components of $\standardReal \in \Rset^{\inputDim}$
which are in the group $\vect{u}$ are $\left(z_{u_i}^{(0)}\right)_{i = 1, ..., |\vect{u}|}$
and the complementary components are
$\left(z_{\overline{u}_i}\right)_{i = 1, ..., |\overline{\vect{u}}|}$.



Let $\overline{\psi}_{\overline{\vect{\alpha}}}$ be the reduced polynomial:

\begin{align}:label: PCE_CE_1

    \overline{\psi}_{\overline{\vect{\alpha}}}(\standardReal_{\overline{\vect{u}}})
    = \prod_{i = 1}^{|\overline{\vect{u}}|}
       \pi_{\alpha_{\overline{u}_i}}^{(\overline{u}_i)}
       \left(z_{\overline{u}_i}\right)\end{align}

where $\overline{\vect{\alpha}} \in \Nset^{|\overline{\vect{u}}|}$ is the reduced multi-index
defined from the multi-index $\vect{\alpha}\in \Nset^{\inputDim}$
by the equation:

\begin{align}\overline{\alpha}_i = \alpha_{\overline{u}_i}\end{align}

for $i = 1, ..., |\overline{\vect{u}}|$.
In other words, the components of the reduced multi-index $\overline{\vect{\alpha}}$
are the components of the multi-index $\vect{\alpha}$ whose indices belong to
the complementary group $\overline{\vect{u}}$.



We must then gather the reduced multi-indices.
Let $\overline{\cJ}^P$ be the set of unique reduced multi-indices:

\begin{align}:label: PCE_CE_2

    \overline{\cJ}^P = \left\{\overline{\vect{\alpha}} \in \Nset^{|\overline{\vect{u}}|}
    \; | \; \vect{\alpha} \in \cJ^P\right\}.\end{align}

For any reduced multi-index $\overline{\vect{\alpha}} \in \overline{\cJ}^P$
of dimension $|\overline{\vect{u}}|$,
we note $\cJ_{\overline{\vect{\alpha}}}^P$
the set of corresponding (un-reduced) multi-indices of
dimension $\inputDim$:

\begin{align}:label: PCE_CE_3

    \cJ_{\overline{\vect{\alpha}}}^P
    = \left\{\vect{\alpha} \in \cJ^P \; |\; \overline{\alpha}_i
    = \alpha_{\overline{u}_i}, \; i = 1, ..., |\overline{\vect{u}}|\right\}.\end{align}

Each aggregated coefficient $\overline{a}_{\overline{\vect{\alpha}}} \in \Rset$
is defined by the equation:

\begin{align}:label: PCE_CE_5

    \overline{a}_{\overline{\vect{\alpha}}}
    = \sum_{\vect{\alpha} \in \cJ^P_{\overline{\vect{\alpha}}}}
    a_{\vect{\alpha}} \left(\prod_{i = 1}^{|\vect{u}|}
    \pi_{\alpha_{u_i}}^{(u_i)}\left(z_{u_i}^{(0)}\right) \right).\end{align}

Finally:

\begin{align}:label: PCE_CE_4

    \widehat{h}(\standardReal_{\overline{\vect{u}}})
    = \sum_{\overline{\vect{\alpha}} \in \overline{\cJ}^P}
    \overline{a}_{\overline{\vect{\alpha}}}
    \overline{\psi}_{\overline{\vect{\alpha}}}(\standardReal_{\overline{\vect{u}}})\end{align}

for any $\standardReal_{\overline{\vect{u}}} \in \Rset^{|\overline{\vect{u}}|}$.



The method is the following.

- Create the reduced polynomial basis from equation :eq:`PCE_CE_1`.
- Create the list of reduced multi-indices from the equation :eq:`PCE_CE_2`, and, for each
  reduced multi-index, the list of corresponding multi-indices from the equation :eq:`PCE_CE_3`.
- Aggregate the coefficients from the equation :eq:`PCE_CE_5`.
- The parametric PCE is defined by the equation :eq:`PCE_CE_4`.
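
The steps above can be sketched in plain Python on a toy example.
This is only an illustration under stated assumptions, not the library's implementation:
each marginal is assumed to be uniform on $[-1, 1]$ (hence an orthonormal Legendre basis),
the PCE is stored as a dictionary mapping each multi-index to its coefficient,
and the names `legendre`, `evaluate_pce` and `parametric_pce` as well as the coefficients
are made up for the example.

```python
import math


def legendre(n, z):
    """Orthonormal Legendre polynomial of degree n (uniform law on [-1, 1])."""
    if n == 0:
        return 1.0
    previous, current = 1.0, z
    for k in range(1, n):
        following = ((2 * k + 1) * z * current - k * previous) / (k + 1)
        previous, current = current, following
    return math.sqrt(2 * n + 1) * current


def evaluate_pce(coefficients, z):
    """Evaluate sum_alpha a_alpha * psi_alpha(z) with a tensorized basis."""
    return sum(
        a * math.prod(legendre(degree, zi) for degree, zi in zip(alpha, z))
        for alpha, a in coefficients.items()
    )


def parametric_pce(coefficients, fixed, z_fixed):
    """Aggregate the coefficients when the variables in `fixed` are frozen."""
    dimension = len(next(iter(coefficients)))
    free = [i for i in range(dimension) if i not in fixed]
    reduced = {}
    for alpha, a in coefficients.items():
        # Value of the discarded basis functions at the fixed point
        factor = math.prod(legendre(alpha[i], z0) for i, z0 in zip(fixed, z_fixed))
        # Reduced multi-index: keep the components of the free variables
        alpha_bar = tuple(alpha[i] for i in free)
        # Multi-indices sharing the same reduced multi-index are summed
        reduced[alpha_bar] = reduced.get(alpha_bar, 0.0) + a * factor
    return reduced


# Toy PCE in dimension 3 (made-up coefficients)
pce = {(0, 0, 0): 1.0, (1, 0, 0): 0.5, (0, 2, 0): -0.3, (1, 0, 1): 0.2, (1, 1, 0): 0.7}
fixed, z_fixed = [1, 2], [0.3, -0.4]  # 0-based indices of the group u
reduced = parametric_pce(pce, fixed, z_fixed)
z1 = 0.25
print(sorted(reduced))  # reduced multi-indices
print(evaluate_pce(reduced, [z1]), evaluate_pce(pce, [z1, 0.3, -0.4]))
```

The two printed values should agree up to rounding errors: the reduced expansion has fewer terms,
but its coefficients differ from the original ones because they absorb the fixed factors.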



## Conditional expectation

One method to reduce the input dimension of a function is to
consider its conditional expectation.
The conditional expectation function is:

\begin{align}\widehat{\model}(\inputReal_{\vect{u}})
  = \Expect{\model(\inputReal)
           \; | \; \inputRV_{\vect{u}}
           = \inputReal_{\vect{u}}}\end{align}

for any $\inputReal_{\vect{u}} \in \Rset^{|\vect{u}|}$.
In general, there is no dedicated method to create such a conditional expectation
in the library.
We can, however, efficiently compute the conditional expectation of a polynomial
chaos expansion.
In turn, this conditional expectation is a polynomial chaos expansion (PCE)
which can be computed using the :meth:`~openturns.FunctionalChaosResult.getConditionalExpectation`
method from the :class:`~openturns.FunctionalChaosResult` class.
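
To see why this computation is cheap, here is a short derivation in the standard space.
It is a sketch under the following assumptions: the components of the standard random vector,
denoted here by $\vect{Z}$, are independent, and each univariate family is orthonormal
with $\pi_0^{(i)} = 1$.
In this paragraph, $\vect{u}$ is the group of conditioning variables.
By independence, the tensor product structure gives:

\begin{align}\Expect{\psi_{\vect{\alpha}}(\vect{Z}) \; | \; \vect{Z}_{\vect{u}} = \standardReal_{\vect{u}}}
    = \prod_{i \in \vect{u}} \pi_{\alpha_i}^{(i)}\left(z_i\right)
    \prod_{i \in \overline{\vect{u}}} \Expect{\pi_{\alpha_i}^{(i)}\left(Z_i\right)}.\end{align}

By orthonormality, $\Expect{\pi_{\alpha_i}^{(i)}\left(Z_i\right)}$ is equal to 1 if $\alpha_i = 0$
and to 0 otherwise.
Hence, only the multi-indices which are zero on $\overline{\vect{u}}$ remain,
with unchanged coefficients:

\begin{align}\Expect{h(\vect{Z}) \; | \; \vect{Z}_{\vect{u}} = \standardReal_{\vect{u}}}
    = \sum_{\substack{\vect{\alpha} \in \cJ^P \\ \alpha_i = 0, \; i \in \overline{\vect{u}}}}
    a_{\vect{\alpha}} \prod_{i \in \vect{u}} \pi_{\alpha_i}^{(i)}\left(z_i\right).\end{align}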



## Create the PCE



In [None]:
import openturns as ot
import openturns.viewer as otv
from openturns.usecases import ishigami_function
import matplotlib.pyplot as plt

The next function creates a parametric PCE based on a
given PCE and a set of indices.



In [None]:
def meanParametricPCE(chaosResult, indices):
    """
    Return the parametric PCE of Y with given input marginals set to the mean.

    All marginal inputs, except those in the conditioning indices
    are set to the mean of the input random vector.

    The resulting function is:

    g(xu) = PCE(xu, xnotu = E[Xnotu])

    where xu is the input vector of conditioning indices,
    xnotu is the input vector of fixed indices and
    E[Xnotu] is the expectation of the random vector of the components
    not in u.

    Parameters
    ----------
    chaosResult: ot.FunctionalChaosResult(inputDimension)
        The polynomial chaos expansion.
    indices: ot.Indices()
        The indices of the input variables which are set to constant values.

    Returns
    -------
    parametricPCEFunction : ot.ParametricFunction(reducedInputDimension, outputDimension)
        The parametric PCE.
        The reducedInputDimension is equal to inputDimension - indices.getSize().
    """
    inputDistribution = chaosResult.getDistribution()
    if not inputDistribution.hasIndependentCopula():
        raise ValueError(
            "The input distribution has a copula which is not independent"
        )
    # Create the parametric function
    pceFunction = chaosResult.getMetaModel()
    xMean = inputDistribution.getMean()
    referencePoint = xMean[indices]
    parametricPCEFunction = ot.ParametricFunction(pceFunction, indices, referencePoint)
    return parametricPCEFunction

The next function creates a sparse PCE using least squares.



In [None]:
def computeSparseLeastSquaresFunctionalChaos(
    inputTrain,
    outputTrain,
    multivariateBasis,
    basisSize,
    distribution,
    sparse=True,
):
    """
    Create a sparse polynomial chaos based on least squares.

    * Uses the enumerate rule in multivariateBasis.
    * Uses the LeastSquaresStrategy to compute the coefficients based on
      least squares.
    * Uses LeastSquaresMetaModelSelectionFactory to use the LARS selection method.
    * Uses FixedStrategy in order to keep all the coefficients that the
      LARS method selected.

    Parameters
    ----------
    inputTrain : ot.Sample
        The input design of experiments.
    outputTrain : ot.Sample
        The output design of experiments.
    multivariateBasis : ot.Basis
        The multivariate chaos basis.
    basisSize : int
        The size of the function basis.
    distribution : ot.Distribution
        The distribution of the input variable.
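    sparse: bool
        If True, create a sparse PCE using the LARS selection method.
        Otherwise, use ot.PenalizedLeastSquaresAlgorithmFactory on the full basis.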

    Returns
    -------
    result : ot.FunctionalChaosResult
        The estimated polynomial chaos.
    """
    if sparse:
        selectionAlgorithm = ot.LeastSquaresMetaModelSelectionFactory()
    else:
        selectionAlgorithm = ot.PenalizedLeastSquaresAlgorithmFactory()
    projectionStrategy = ot.LeastSquaresStrategy(
        inputTrain, outputTrain, selectionAlgorithm
    )
    adaptiveStrategy = ot.FixedStrategy(multivariateBasis, basisSize)
    chaosAlgorithm = ot.FunctionalChaosAlgorithm(
        inputTrain, outputTrain, distribution, adaptiveStrategy, projectionStrategy
    )
    chaosAlgorithm.run()
    chaosResult = chaosAlgorithm.getResult()
    return chaosResult

In the next cell, we create a training sample from the
Ishigami test function.
We choose a sample size equal to 1000.



In [None]:
ot.Log.Show(ot.Log.NONE)
ot.RandomGenerator.SetSeed(0)
im = ishigami_function.IshigamiModel()
input_names = im.inputDistribution.getDescription()
sampleSize = 1000
inputSample = im.inputDistribution.getSample(sampleSize)
outputSample = im.model(inputSample)

We then create a sparse PCE of the Ishigami function using
a candidate basis up to the total degree equal to 12.
This leads to 455 candidate coefficients.
The coefficients are computed from least squares.
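
The number of candidate coefficients can be checked by hand.
The short sketch below assumes that the multi-indices are enumerated by total degree,
so that the basis contains all multi-indices of dimension 3 whose components sum to at most 12.
The variable names are chosen for the illustration only.

```python
import math

dimension = 3
total_degree = 12
# Number of multi-indices alpha in N^3 such that alpha_1 + alpha_2 + alpha_3 <= 12
basis_size = math.comb(dimension + total_degree, dimension)
print(basis_size)  # 455
```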



In [None]:
multivariateBasis = ot.OrthogonalProductPolynomialFactory([im.X1, im.X2, im.X3])
totalDegree = 12
enumerateFunction = multivariateBasis.getEnumerateFunction()
basisSize = enumerateFunction.getBasisSizeFromTotalDegree(totalDegree)
print("Basis size = ", basisSize)

Finally, we create the PCE.
Only 61 coefficients are selected by the :class:`~openturns.LARS`
algorithm.



In [None]:
chaosResult = computeSparseLeastSquaresFunctionalChaos(
    inputSample,
    outputSample,
    multivariateBasis,
    basisSize,
    im.inputDistribution,
)
print("Selected basis size = ", chaosResult.getIndices().getSize())
chaosResult

In order to see the structure of the data, we create a grid of
plots which shows all projections of $Y$ versus $X_i$
for $i = 1, 2, 3$.
We see that the Ishigami function is strongly nonlinear.



In [None]:
grid = ot.VisualTest.DrawPairsXY(inputSample, outputSample)
grid.setTitle(f"n = {sampleSize}")
view = otv.View(grid, figure_kw={"figsize": (8.0, 3.0)})
plt.subplots_adjust(wspace=0.4, bottom=0.25)

## Parametric function

We now create the parametric function where $X_i$ is free
and the other variables are set to their mean values.
We can show that a parametric PCE is, again, a PCE.
The library does not currently implement this feature.
In the next cells, we use instead the `meanParametricPCE` function
that we defined previously.



Create different parametric functions for the PCE.
In the next cell, we create the parametric PCE function
where $X_1$ is active while $X_2$ and $X_3$ are
set to their mean values.



In [None]:
indices = [1, 2]
parametricPCEFunction = meanParametricPCE(chaosResult, indices)
print(parametricPCEFunction.getInputDimension())

Now that we know how the `meanParametricPCE` function works, we loop over
the input marginal indices and consider the three functions
$\widehat{\model}_1(\inputReal_1)$,
$\widehat{\model}_2(\inputReal_2)$ and
$\widehat{\model}_3(\inputReal_3)$.
For each marginal index `i`, we plot the output $Y$
against the input marginal $X_i$ of the sample.
Then we plot the parametric function depending on $X_i$.



In [None]:
inputDimension = im.inputDistribution.getDimension()
npPoints = 100
inputRange = im.inputDistribution.getRange()
inputLowerBound = inputRange.getLowerBound()
inputUpperBound = inputRange.getUpperBound()
# Create the palette with transparency
palette = ot.Drawable().BuildDefaultPalette(2)
firstColor = palette[0]
r, g, b, a = ot.Drawable.ConvertToRGBA(firstColor)
newAlpha = 64
newColor = ot.Drawable.ConvertFromRGBA(r, g, b, newAlpha)
palette[0] = newColor
grid = ot.VisualTest.DrawPairsXY(inputSample, outputSample)
reducedBasisSize = chaosResult.getCoefficients().getSize()
grid.setTitle(
    f"n = {sampleSize}, total degree = {totalDegree}, "
    f"basis = {basisSize}, selected = {reducedBasisSize}"
)
for i in range(inputDimension):
    graph = grid.getGraph(0, i)
    graph.setLegends(["Data"])
    graph.setXTitle(f"$x_{1 + i}$")
    # Set all indices except i
    indices = list(range(inputDimension))
    indices.pop(i)
    parametricPCEFunction = meanParametricPCE(chaosResult, indices)
    xiMin = inputLowerBound[i]
    xiMax = inputUpperBound[i]
    curve = parametricPCEFunction.draw(xiMin, xiMax, npPoints).getDrawable(0)
    curve.setLineWidth(2.0)
    curve.setLegend(r"$PCE(x_i, x_{-i} = \mathbb{E}[X_{-i}])$")
    graph.add(curve)
    if i < inputDimension - 1:
        graph.setLegends([""])
    graph.setColors(palette)
    grid.setGraph(0, i, graph)

grid.setLegendPosition("topright")
view = otv.View(
    grid,
    figure_kw={"figsize": (8.0, 3.0)},
    legend_kw={"bbox_to_anchor": (1.0, 1.0), "loc": "upper left"},
)
plt.subplots_adjust(wspace=0.4, right=0.7, bottom=0.25)

We see that the parametric function is located within each cloud, but
sometimes lies near the upper or lower edge of the data rather than in its middle.
More precisely, the function represents well how $Y$ depends
on $X_2$, but does not seem to represent well how $Y$
depends on $X_1$ or $X_3$.
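
A possible way to interpret this observation is to look at the exact Ishigami function
instead of its PCE, since both reductions are then available in closed form.
The sketch below assumes the usual constants $a = 7$ and $b = 0.1$, independent inputs
which are uniform on $[-\pi, \pi]$, and uses hypothetical helper names.
It compares the function with the other inputs set to their mean (zero) to the
conditional mean given one input.

```python
import math

A, B = 7.0, 0.1  # usual Ishigami constants (assumed here)


def ishigami(x1, x2, x3):
    return math.sin(x1) + A * math.sin(x2) ** 2 + B * x3**4 * math.sin(x1)


# Moments of a uniform variable on [-pi, pi]
MEAN_SIN = 0.0  # E[sin(X)]
MEAN_SIN2 = 0.5  # E[sin(X)^2]
MEAN_X4 = math.pi**4 / 5.0  # E[X^4]


def conditional_mean(i, xi):
    """Closed-form E[g(X) | X_i = xi] for independent inputs (i is 0-based)."""
    if i == 0:
        return math.sin(xi) * (1.0 + B * MEAN_X4) + A * MEAN_SIN2
    if i == 1:
        return MEAN_SIN * (1.0 + B * MEAN_X4) + A * math.sin(xi) ** 2
    return MEAN_SIN * (1.0 + B * xi**4) + A * MEAN_SIN2


def parametric_at_mean(i, xi):
    """Ishigami function with all inputs but the i-th set to their mean, 0."""
    x = [0.0, 0.0, 0.0]
    x[i] = xi
    return ishigami(*x)


for i in range(3):
    xi = 1.0
    print(i, parametric_at_mean(i, xi), conditional_mean(i, xi))
```

In this sketch, the two functions coincide for $X_2$, while the parametric function
is identically zero for $X_3$ and misses both the offset $a / 2$ and the amplification
factor $1 + b \pi^4 / 5$ for $X_1$.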



## Conditional expectation



In the next cell, we create the conditional expectation function
$\Expect{\model(\inputReal) \; | \; \inputRV_1 = \inputReal_1}$.



In [None]:
conditionalPCE = chaosResult.getConditionalExpectation([0])
conditionalPCE

On output, we see that the result is, again, a PCE.
Moreover, only a subset of the previous coefficients is present in this
conditional expectation: only the multi-indices which involve
$X_1$ alone are kept (and the other marginal components are removed).
We observe that the values of the coefficients are unchanged with respect to the
previous PCE.
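
This selection can be mimicked in a few lines of plain Python.
The sketch below is not the library's implementation: it assumes independent inputs,
an orthonormal tensorized basis, and a PCE stored as a dictionary mapping each
multi-index to its coefficient.
The function name and the coefficients are made up for the example.

```python
def conditional_expectation_pce(coefficients, conditioning):
    """Keep the terms that only involve the conditioning variables."""
    dimension = len(next(iter(coefficients)))
    others = [i for i in range(dimension) if i not in conditioning]
    return {
        # Reduced multi-index: components of the conditioning variables
        tuple(alpha[i] for i in conditioning): a
        for alpha, a in coefficients.items()
        # A term survives only if its degree is zero in every other variable
        if all(alpha[i] == 0 for i in others)
    }


# Toy PCE in dimension 3 (made-up coefficients)
pce = {(0, 0, 0): 1.0, (1, 0, 0): 0.5, (0, 2, 0): -0.3, (1, 0, 1): 0.2, (1, 1, 0): 0.7}
print(conditional_expectation_pce(pce, [0]))
print(conditional_expectation_pce(pce, [1, 2]))
```

Unlike the parametric PCE, no coefficient is modified here: the terms are either kept as they are or dropped.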



In the next cell, we create the conditional expectation function
$\Expect{\model(\inputReal) \; | \; \inputRV_2 = \inputReal_2, \inputRV_3 = \inputReal_3}$.



In [None]:
conditionalPCE = chaosResult.getConditionalExpectation([1, 2])
conditionalPCE

We see that the conditional PCE has input dimension 2.



In the next cell, we compare the parametric PCE and the conditional
expectation of the PCE.



In [None]:
# sphinx_gallery_thumbnail_number = 3
inputDimension = im.inputDistribution.getDimension()
npPoints = 100
inputRange = im.inputDistribution.getRange()
inputLowerBound = inputRange.getLowerBound()
inputUpperBound = inputRange.getUpperBound()
# Create the palette with transparency
palette = ot.Drawable().BuildDefaultPalette(3)
firstColor = palette[0]
r, g, b, a = ot.Drawable.ConvertToRGBA(firstColor)
newAlpha = 64
newColor = ot.Drawable.ConvertFromRGBA(r, g, b, newAlpha)
palette[0] = newColor
grid = ot.VisualTest.DrawPairsXY(inputSample, outputSample)
grid.setTitle(f"n = {sampleSize}, total degree = {totalDegree}")
for i in range(inputDimension):
    graph = grid.getGraph(0, i)
    graph.setLegends(["Data"])
    graph.setXTitle(f"$x_{1 + i}$")
    xiMin = inputLowerBound[i]
    xiMax = inputUpperBound[i]
    # Set all indices except i to the mean
    indices = list(range(inputDimension))
    indices.pop(i)
    parametricPCEFunction = meanParametricPCE(chaosResult, indices)
    # Draw the parametric function
    curve = parametricPCEFunction.draw(xiMin, xiMax, npPoints).getDrawable(0)
    curve.setLineWidth(2.0)
    curve.setLineStyle("dashed")
    curve.setLegend(r"$PCE\left(x_i, x_{-i} = \mathbb{E}[X_{-i}]\right)$")
    graph.add(curve)
    # Compute conditional expectation given Xi
    conditionalPCE = chaosResult.getConditionalExpectation([i])
    print(f"i = {i}")
    print(conditionalPCE)
    conditionalPCEFunction = conditionalPCE.getMetaModel()
    curve = conditionalPCEFunction.draw(xiMin, xiMax, npPoints).getDrawable(0)
    curve.setLineWidth(2.0)
    curve.setLegend(r"$\mathbb{E}\left[PCE | X_i = x_i\right]$")
    graph.add(curve)
    if i < inputDimension - 1:
        graph.setLegends([""])
    graph.setColors(palette)
    # Set the graph into the grid
    grid.setGraph(0, i, graph)

grid.setLegendPosition("topright")
view = otv.View(
    grid,
    figure_kw={"figsize": (8.0, 3.0)},
    legend_kw={"bbox_to_anchor": (1.0, 1.0), "loc": "upper left"},
)
plt.subplots_adjust(wspace=0.4, right=0.7, bottom=0.25)

We see that the conditional expectation of the PCE is a better
approximation of the data set than the parametric PCE.



## Conclusion

In this example, we have seen how to compute the conditional
expectation of a PCE.
We have seen that this function is a good approximation of the Ishigami
function when we reduce the input dimension.
We have also seen that the parametric PCE might be a poor
approximation of the Ishigami function.
This is because the parametric PCE depends on the particular value
that we have chosen to create the parametric function.

The fact that the conditional expectation of the PCE is a
good approximation of the function when we reduce the input dimension
is a consequence of a theorem which states that the
conditional expectation is the best approximation of the
function in the least squares sense (see [girardin2018]_ page 79).
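
For reference, this property can be written as follows, assuming that the output
$\model(\inputRV)$ has a finite variance.
The notation $g$ for a candidate function of the conditioning variables is introduced
here only for the illustration:

\begin{align}\Expect{\model(\inputRV) \; | \; \inputRV_{\vect{u}}}
    = \operatorname*{argmin}_{g} \;
    \Expect{\left(\model(\inputRV) - g\left(\inputRV_{\vect{u}}\right)\right)^2}\end{align}

where the minimum is taken over the functions $g$ such that
$g\left(\inputRV_{\vect{u}}\right)$ has a finite variance.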



In [None]:
otv.View.ShowAll()