Covariance models

We consider X: \Omega \times\cD \mapsto \Rset^{\inputDim} a multivariate stochastic process of dimension d, where \omega \in \Omega is an outcome, \cD is a domain of \Rset^{\sampleSize}, \vect{t}\in \cD is a multivariate index and X(\omega, \vect{t}) \in \Rset^{\inputDim}.

We denote by X_{\vect{t}}: \Omega \rightarrow \Rset^{\inputDim} the random variable at index \vect{t} \in \cD, defined by X_{\vect{t}}(\omega)=X(\omega, \vect{t}), and by X(\omega): \cD  \mapsto \Rset^{\inputDim} a realization of the process X for a given \omega \in \Omega, defined by X(\omega)(\vect{t})=X(\omega, \vect{t}).

If the process is a second-order process, we denote by:

  • m : \cD \mapsto  \Rset^{\inputDim} its mean function, defined by m(\vect{t})=\Expect{X_{\vect{t}}},

  • \mat{C} : \cD \times \cD \mapsto  \cS_{\inputDim}^+(\Rset) its covariance function, defined by \mat{C}(\vect{s}, \vect{t})=\Expect{(X_{\vect{s}}-m(\vect{s}))\Tr{(X_{\vect{t}}-m(\vect{t}))}},

  • \mat{\rho} : \cD \times \cD \mapsto  \cS_{\inputDim}^+(\Rset) its correlation function, defined for all (\vect{s}, \vect{t}) and all (i,j) by \rho_{ij}(\vect{s}, \vect{t}) = C_{ij}(\vect{s}, \vect{t})/\sqrt{C_{ii}(\vect{s}, \vect{s})C_{jj}(\vect{t}, \vect{t})}.
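The definitions of the mean and covariance functions can be illustrated by estimating them from realizations. The sketch below is only an illustration and does not use the OpenTURNS API: the toy scalar process X_t = A t with A standard normal, the helper names and the sample size are all assumptions made for the example. For this process, m(t) = 0 and C(s, t) = s t.

```python
import random

def realization(a, times):
    """One trajectory t -> X(omega, t) of the toy process X_t = a * t (assumed example)."""
    return [a * t for t in times]

def empirical_mean(paths, k):
    """Estimate m(t_k) by averaging X_{t_k} over the realizations."""
    return sum(p[k] for p in paths) / len(paths)

def empirical_cov(paths, j, k):
    """Estimate C(t_j, t_k) = E[(X_{t_j} - m(t_j)) (X_{t_k} - m(t_k))]."""
    mj, mk = empirical_mean(paths, j), empirical_mean(paths, k)
    return sum((p[j] - mj) * (p[k] - mk) for p in paths) / len(paths)

random.seed(0)
times = [0.5, 1.0, 2.0]
# Each omega is represented by one draw of the random amplitude A.
paths = [realization(random.gauss(0.0, 1.0), times) for _ in range(20000)]

m1 = empirical_mean(paths, 1)      # estimate of m(1.0), whose exact value is 0
c12 = empirical_cov(paths, 1, 2)   # estimate of C(1.0, 2.0), whose exact value is 2
```

Each element of paths plays the role of a realization X(\omega), and each column of values at a fixed index plays the role of a sample of the random variable X_{\vect{t}}.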

In OpenTURNS, it is assumed that:

  • the spatial correlation \mat{R} \in \cS_{\inputDim}^+(\Rset) between the components of X_{\vect{t}} and the vector of marginal standard deviations \vect{\sigma} \in \Rset^{\inputDim} do not depend on \vect{t} \in \cD,

  • the correlation between X_{\vect{s}} and X_{\vect{t}}, which is given by \mat{\rho}(\vect{s}, \vect{t}), is such that X^i_{\vect{t}} depends only on the same component X^i_{\vect{s}} and this dependence is the same for every component i. In that case, \mat{\rho}(\vect{s}, \vect{t}) can be defined from a scalar function \rho(\vect{s}, \vect{t}) \in \Rset by \mat{\rho}(\vect{s}, \vect{t}) = \rho(\vect{s}, \vect{t})\, \mat{I}_{\inputDim}, with \rho(\vect{s}, \vect{s}) = 1.

Then, the covariance model is written as:

(1) \mat{C}(\vect{s}, \vect{t}) = \rho\left(\dfrac{\vect{s}}{\vect{\theta}},
                                  \dfrac{\vect{t}}{\vect{\theta}}\right)\,
                        \diag(\vect{\sigma}) \, \mat{R} \,
                        \diag(\vect{\sigma}), \quad
                        \forall (\vect{s}, \vect{t}) \in \cD \times \cD

or:

(2) \mat{C}(\vect{s}, \vect{t}) = \rho\left(\dfrac{\vect{s}}{\vect{\theta}},
                                  \dfrac{\vect{t}}{\vect{\theta}}
                            \right)\, \mat{C}^{spatial}, \quad
                        \forall (\vect{s}, \vect{t}) \in \cD \times \cD

where:

  • \vect{\theta} \in \Rset^{\sampleSize} is the scale parameter,

  • \vect{\sigma} \in \Rset^{\inputDim} is the amplitude parameter,

  • \mat{R} \in \cS_{\inputDim}^+(\Rset) is the spatial correlation matrix,

  • \mat{C}^{spatial} = \diag(\vect{\sigma}) \, \mat{R} \, \diag(\vect{\sigma}) is the spatial covariance matrix which does not depend on (\vect{s}, \vect{t}).
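Equations (1) and (2) can be sketched in a few lines of plain Python. This is a minimal illustration, not the OpenTURNS implementation: the squared exponential form of \rho, the function names and the numerical values are assumptions chosen for the example, with n = 1 and d = 2.

```python
import math

def rho_squared_exponential(s, t, theta):
    """Scalar correlation rho(s/theta, t/theta); the squared exponential form is assumed."""
    r2 = sum(((si - ti) / th) ** 2 for si, ti, th in zip(s, t, theta))
    return math.exp(-0.5 * r2)

def spatial_covariance(sigma, R):
    """C^spatial = diag(sigma) R diag(sigma): entry (i, j) is sigma_i * R_ij * sigma_j."""
    d = len(sigma)
    return [[sigma[i] * R[i][j] * sigma[j] for j in range(d)] for i in range(d)]

def covariance(s, t, theta, sigma, R):
    """C(s, t) = rho(s/theta, t/theta) * C^spatial, as in equation (2)."""
    r = rho_squared_exponential(s, t, theta)
    return [[r * c for c in row] for row in spatial_covariance(sigma, R)]

theta = [2.0]                  # scale parameter (n = 1)
sigma = [1.0, 3.0]             # amplitude parameter (d = 2)
R = [[1.0, 0.5], [0.5, 1.0]]   # spatial correlation matrix

C_ss = covariance([0.0], [0.0], theta, sigma, R)   # equals C^spatial because rho(s, s) = 1
C_st = covariance([0.0], [2.0], theta, sigma, R)   # C^spatial scaled by exp(-0.5)
```

The index pair (\vect{s}, \vect{t}) only enters through the scalar factor \rho, while the matrix structure between components is entirely carried by \mat{C}^{spatial}.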

It is possible to model a nugget effect, which represents noise observed in the output values of a process. This noise may be, for example, measurement noise coming from a sensor with finite precision. The nugget effect also has a useful side effect: it improves the condition number of the covariance matrix (see computeRegularizedCholesky()).

The nugget effect is taken into account by modifying the scalar correlation function \rho on the diagonal only, that is at every pair (\vect{s}, \vect{s}), by adding a constant term denoted \varepsilon_{nugget} which does not depend on \vect{s}:

(3) \rho_{nugget}(\vect{s}, \vect{t}) =
  \begin{cases}
      \rho(\vect{s}, \vect{t}) & \text{if } \vect{s} \neq \vect{t}, \\
      1 + \varepsilon_{nugget} & \text{otherwise.}
  \end{cases}

Then, the nugget effect transforms the covariance function \mat{C} into the covariance function \mat{C}_{nugget} as follows:

(4) \mat{C}_{nugget}(\vect{s}, \vect{t}) = \mat{C}(\vect{s}, \vect{t}) + \varepsilon_{nugget} \, \mat{C}^{spatial} \, 1_{\vect{s} = \vect{t}}

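In particular, since \rho(\vect{s}, \vect{s}) = 1 implies \mat{C}(\vect{s}, \vect{s}) = \mat{C}^{spatial}, we have: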

(5) \mat{C}_{nugget}(\vect{s}, \vect{s}) = (1+\varepsilon_{nugget}) \, \mat{C}^{spatial}

which shows how the nugget factor \varepsilon_{nugget} acts on the covariance function.
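The effect of the nugget factor, including its influence on the condition number, can be checked on a small case. The sketch below is an illustration under stated assumptions and not the OpenTURNS implementation: it takes a scalar process (d = 1, so that \mat{C}^{spatial} reduces to \sigma^2), an assumed squared exponential \rho, and two nearby indices for which the 2 x 2 discretized covariance matrix is nearly singular.

```python
import math

def rho(s, t, theta):
    """Assumed squared exponential correlation for a scalar index."""
    return math.exp(-0.5 * ((s - t) / theta) ** 2)

def rho_nugget(s, t, theta, eps):
    """Equation (3): add eps only when the two indices coincide."""
    return 1.0 + eps if s == t else rho(s, t, theta)

def cond_2x2_symmetric(a, b):
    """Condition number of [[a, b], [b, a]], whose eigenvalues are a + b and a - b."""
    lo, hi = sorted([abs(a - b), abs(a + b)])
    return hi / lo

theta, sigma2, eps = 1.0, 4.0, 1e-3   # sigma2 plays the role of C^spatial when d = 1
s, t = 0.0, 0.01                      # two nearby indices, strongly correlated

c_ss = sigma2 * rho_nugget(s, s, theta, eps)   # (1 + eps) * C^spatial, equation (5)
c_st = sigma2 * rho_nugget(s, t, theta, eps)   # unchanged off the diagonal

# Discretized covariance matrix over {s, t}, without and with the nugget factor.
cond_plain = cond_2x2_symmetric(sigma2, sigma2 * rho(s, t, theta))
cond_nugget = cond_2x2_symmetric(c_ss, c_st)
```

With these values the smallest eigenvalue grows from \sigma^2 (1 - \rho) to \sigma^2 (1 + \varepsilon_{nugget} - \rho), so cond_nugget is smaller than cond_plain.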