Least squares meta models

A least squares meta model provides an approximation of the model \model: \Rset^\inputDim \rightarrow \Rset^\outputDim which is valid over its whole domain of definition.

Let \model: \Rset^\inputDim \rightarrow \Rset^\outputDim. The objective is to create a meta model \metaModel defined by:

(1)\metaModel(\vect{x})  = \sum_{j=0}^N \vect{a}_j \Psi_j(\vect{x})

where, for 0 \leq j \leq N, \vect{a}_j \in \Rset^\outputDim and \psi_j: \Rset^\inputDim \rightarrow \Rset with \psi_0: \vect{x} \rightarrow 1.

The functional basis (\psi_j)_{0 \leq j \leq N} is specified and the objective it to determine the coefficients (\vect{a}_j)_{0 \leq j \leq N}.

Let \mat{A} \in \cM_{N+1, \outputDim} and \vect{\Psi(\vect{x})} \in \Rset^{N+1} be defined by:

(2)\mat{A} & = \Tr{(\vect{a}_0 | \dots | \vect{a}_N)}\\
\vect{\Psi(\vect{x})} & = \Tr{(\Psi_0(\vect{x}), \dots, \Psi_N(\vect{x})}

Then the meta model (1) can be written as:

(3)\vect{y}  =  \Tr{\mat{A}}\vect{\Psi(\vect{x})}

Let \cX be an experimental design of size \sampleSize, that is, a set of observations of the input vector defined by:

\cX = \left\{ \vect{x}^{(1)}, \dots, \vect{x}^{(\sampleSize)} \right\},

as well as the corresponding output vectors:

\cY = \left\{ \vect{y}^{(1)}, \dots, \vect{y}^{(\sampleSize)} \right\}.

where \vect{y}^{(k)} = \model( \vect{x}^{(k)}).

We define the objective function \cJ by:

(4)\cJ(\mat{A}) = \sum_{i=1}^\sampleSize \left \| \vect{y}^{(i)} - \Tr{\mat{A}} \vect{\Psi(\vect{x}^{(i)})}
\right \|_{L^2}^2

Then we search \widehat{\mat{A}} that minimizes the objective function (4):

(5)\widehat{\mat{A}} = \argmin \cJ(\mat{A})

Let the matrices \mat{\Psi} \in \cM_{\sampleSize, N+1} and \mat{Y}\in \cM_{\sampleSize, \outputDim} be defined by:

\mat{\Psi} & = (\psi_j(\vect{x}^{(i)}))_{i,j}\\
\mat{Y}    & = \Tr{(\vect{y}^{(1)} | \dots | \vect{y}^{(\sampleSize)})}

Then \widehat{\mat{A}} is solution of the linear system:

(6)\Tr{\mat{\Psi}}\mat{\Psi}\widehat{\mat{A}} = \Tr{\mat{\Psi}}\mat{Y}

The library relies on the method dgelsy of LAPACK: refer to its documentation to get information on the resolution of (6), in particular when the problem is underdetermined (which means when \sampleSize < N) or overdetermined (which means when \sampleSize > N).

The Gram matrix \Tr{\mat{\Psi}}\mat{\Psi} can be ill-conditionned. Hence, the best method is not necessarily to invert the Gram matrix, because the solution may be particularly sensitive to rounding errors. The least-squares problem is rather solved using more robust numerical methods such as the singular value decomposition (SVD) or the QR-decomposition.

Particular case 1: The functional basis is composed of polynomials with degree less or equal to 1

In this particular case, the functional basis is defined by N = 1+\inputDim linear functions:

\psi_0 & : \vect{x} \mapsto 1\\
\psi_j & : \vect{x} \mapsto x_j - b_j, \quad 1 \leq j \leq \inputDim

with \vect{b} \in \Rset^\inputDim is the empirical mean vector of the experimental design, defined by:

(7)\vect{b} = \dfrac{1}{\sampleSize} \sum_{i=1}^\sampleSize \vect{x}^{(i)}

The resulting meta model can be written as:

(8)\metaModel(\vect{x}) = \vect{c} + \Tr{\mat{L}}(\vect{x} - \vect{b})

where the matrix \mat{L} \in \cM_{\inputDim, \outputDim} is equal to the matrix \mat{A} defined in (2) except its first line. More precisely, we have:

(9)\mat{L} =  \Tr{(\vect{a}_1 | \dots | \vect{a}_\inputDim)}

The vector \vect{c} = \vect{a}_0 \in \Rset^\outputDim is the first line of \mat{A} defined in (2).

Particular case 2: The functional basis is composed of of polynomials with degree less or equal to 2

In this particular case, the functional basis is defined by N = 1+2\inputDim +\dfrac{d(d-1)}{2} polynomials functions of maximal degree 2:

\psi_0 & : \vect{x} \mapsto 1\\
\psi_k & : \vect{x} \mapsto (x_k - b_k)\quad for 1 \leq k \leq \inputDim\\
\psi_{ij} & : \vect{x} \mapsto (x_i - b_i)(x_j - b_j), \quad 1 \leq i < j \leq \inputDim

where \vect{b} is still defined by (7). The coefficients are denoted as follows:

(10)\metaModel(\vect{x})  = \sum_{j=0}^\inputDim \vect{a}_j \Psi_j(\vect{x}) + \sum_{1 \leq i < j \leq \inputDim}^\inputDim \vect{a}_{ij}\Psi_{ij}(\vect{x})

The resulting meta model can be written as:

(11)\metaModel(\vect{x})  = \vect{c} + \vect{x} \Tr{\mat{L}} ( \vect{x} - \vect{b} ) + \frac{1}{2} \Tr{( \vect{x} - \vect{b} )}\tens{M} ( \vect{x} - \vect{b} )

where:

  • \mat{L} \in \cM_{\inputDim, \outputDim} is equal to the matrix \mat{A} defined in (2) except its first line,

  • \vect{c}= \vect{a}_0 \in \Rset^\outputDim,

  • \tens{M} is a \Rset^\outputDim \times \Rset^\inputDim \times \Rset^\inputDim symmetric tensor: the sheet k is the matrix (2a_{ij}^k)_{1, i,j,\inputDim}, for 1 \leq k \leq \outputDim and a_{ij}^k is the component k of the vector \vect{a}_{ij}.