GeneralLinearModelAlgorithm

(Source code, png, hires.png, pdf)

../../../_images/GeneralLinearModelAlgorithm.png
class GeneralLinearModelAlgorithm(*args)

Algorithm for the evaluation of general linear models.

Available constructors:

GeneralLinearModelAlgorithm(inputSample, outputSample, covarianceModel, basis, keepCovariance=True)

GeneralLinearModelAlgorithm(inputSample, outputSample, covarianceModel, basisCollection, keepCovariance=True)

Parameters:
inputSample, outputSampleSample or 2d-array

The samples (\vect{x}_k)_{1 \leq k \leq N} \in \Rset^n and (\vect{y}_k)_{1 \leq k \leq N}\in \Rset^d.

basisBasis

Functional basis to estimate the trend: (\varphi_j)_{1 \leq j \leq n_1}: \Rset^n \rightarrow \Rset.

If d>1, the same basis is used for each marginal output.

basisCollectioncollection of Basis

Collection of d functional basis: one basis for each marginal output.

An empty collection means that no trend is estimated.

covarianceModelCovarianceModel

Covariance model of the Gaussian process. See notes for the details.

keepCovariancebool, optional

Indicates whether the covariance matrix has to be stored in the result structure GeneralLinearModelResult. Default value is set in resource map key GeneralLinearModelAlgorithm-KeepCovariance

Notes

We suppose we have a sample (\vect{x}_k, \vect{y}_k)_{1 \leq k \leq N} where \vect{y}_k = \cM(\vect{x}_k) for all k, with \cM:\Rset^n \mapsto \Rset^d a given function.

The objective is to build a metamodel \tilde{\cM}, using a general linear model: the sample (\vect{y}_k)_{1 \leq k \leq N} is considered as the restriction of a Gaussian process \vect{Y}(\omega, \vect{x}) on (\vect{x}_k)_{1 \leq k \leq N}. The Gaussian process \vect{Y}(\omega, \vect{x}) is defined by:

\vect{Y}(\omega, \vect{x}) = \vect{\mu}(\vect{x}) + \vect{W}(\omega, \vect{x})

where:

\vect{\mu}(\vect{x}) = \left(
  \begin{array}{l}
    \mu_1(\vect{x}) \\
    \dots  \\
    \mu_d(\vect{x})
   \end{array}
 \right)

with \mu_\ell(\vect{x}) = \sum_{j=1}^{n_\ell} \beta_j^\ell \varphi_j^\ell(\vect{x}) and \varphi_j^\ell: \Rset^n \rightarrow \Rset the trend functions.

\vect{W} is a Gaussian process of dimension d with zero mean and covariance function C = C(\vect{\theta}, \vect{\sigma}, \mat{R}, \vect{\lambda}) (see CovarianceModel for the notations).

We note:

\vect{\beta}^\ell = \left(
  \begin{array}{l}
    \beta_1^\ell \\
    \dots  \\
    \beta_{n_\ell}^\ell
   \end{array}
 \right) \in \Rset^{n_\ell}
 \quad \mbox{ and } \quad
 \vect{\beta} = \left(
  \begin{array}{l}
     \vect{\beta}^1\\
    \dots  \\
     \vect{\beta}^d
   \end{array}
 \right)\in \Rset^{\sum_{\ell=1}^p n_\ell}

The GeneralLinearModelAlgorithm class estimates the coefficients \beta_j^\ell and \vect{p} where \vect{p} is the vector of parameters of the covariance model (a subset of \vect{\theta}, \vect{\sigma}, \mat{R}, \vect{\lambda}) that has been declared as active (by default, the full vectors \vect{\theta} and \vect{\sigma}).

The estimation is done by maximizing the reduced log-likelihood of the model, see its expression below.

Estimation of the parameters \beta_j^\ell and \vect{p}

We note:

\vect{y} = \left(
     \begin{array}{l}
       \vect{y}_1 \\
       \dots  \\
       \vect{y}_N
      \end{array}
    \right) \in \Rset^{dN},
    \quad
    \vect{m}_{\vect{\beta}} = \left(
     \begin{array}{l}
       \vect{\mu}(\vect{x}_1) \\
       \dots  \\
       \vect{\mu}(\vect{x}_N)
      \end{array}
    \right) \in \Rset^{dN}

and

\mat{C}_{\vect{p}} = \left(
  \begin{array}{lcl}
    \mat{C}_{11} & \dots &  \mat{C}_{1N}\\
    \dots & \dots & \\
    \mat{C}_{N1} & \dots &  \mat{C}_{NN}
   \end{array}
 \right) \in \cS_{dN}^+(\Rset)

where \mat{C}_{ij} = C_{\vect{p}}(\vect{x}_i, \vect{x}_j).

The model likelihood writes:

\cL(\vect{\beta}, \vect{p};(\vect{x}_k, \vect{y}_k)_{1 \leq k \leq N}) = \dfrac{1}{(2\pi)^{dN/2} |\det \mat{C}_{\vect{p}}|^{1/2}} \exp\left[ -\dfrac{1}{2}\Tr{\left( \vect{y}-\vect{m} \right)} \mat{C}_{\vect{p}}^{-1}  \left( \vect{y}-\vect{m} \right)  \right]

If \mat{L} is the Cholesky factor of \mat{C}, ie the lower triangular matrix with positive diagonal such that \mat{L}\,\Tr{\mat{L}} = \mat{C}, then:

(1)\log \cL(\vect{\beta}, \vect{p};(\vect{x}_k, \vect{y}_k)_{1 \leq k \leq N}) = cste - \log \det \mat{L}_{\vect{p}} -\dfrac{1}{2}  \| \mat{L}_{\vect{p}}^{-1}(\vect{y}-\vect{m}_{\vect{\beta}}) \|^2

The maximization of (1) leads to the following optimality condition for \vect{\beta}:

\vect{\beta}^*(\vect{p}^*)=\argmin_{\vect{\beta}} \| \mat{L}_{\vect{p}^*}^{-1}(\vect{y}-\vect{m}_{\vect{\beta}}) \|^2

This expression of \vect{\beta}^* as a function of \vect{p}^* is taken as a general relation between \vect{\beta} and \vect{p} and is substituted into (1), leading to a reduced log-likelihood function depending solely on \vect{p}.

In the particular case where d=\dim(\vect{\sigma})=1 and \sigma is a part of \vect{p}, then a further reduction is possible. In this case, if \vect{q} is the vector \vect{p} in which \sigma has been substituted by 1, then:

\| \mat{L}_{\vect{p}}^{-1}(\vect{y}-\vect{m}_{\vect{\beta}}) \|^2=\| \mat{L}_{\vect{q}}^{-1}(\vect{y}-\vect{m}_{\vect{\beta}}) \|^2/\sigma^2

showing that \vect{\beta}^* is a function of \vect{q}^* only, and the optimality condition for \sigma reads:

\vect{\sigma}^*(\vect{q}^*)=\dfrac{1}{N}\| \mat{L}_{\vect{q}^*}^{-1}(\vect{y}-\vect{m}_{\vect{\beta}^*(\vect{q}^*)}) \|^2

which leads to a further reduction of the log-likelihood function where both \vect{\beta} and \sigma are replaced by their expression in terms of \vect{q}.

The default optimizer is TNC and can be changed thanks to the setOptimizationAlgorithm method. User could also change the default optimization solver by setting the GeneralLinearModelAlgorithm-DefaultOptimizationAlgorithm resource map key to one of the NLopt solver names.

It is also possible to proceed as follows:

  • ask for the reduced log-likelihood function of the GeneralLinearModelAlgorithm thanks to the getObjectiveFunction() method

  • optimize it with respect to the parameters \vect{\theta} and \vect{\sigma} using any optimization algorithms (that can take into account some additional constraints if needed)

  • set the optimal parameter value into the covariance model used in the GeneralLinearModelAlgorithm

  • tell the algorithm not to optimize the parameter using setOptimizeParameters

The behaviour of the reduction is controlled by the following keys in ResourceMap:
  • ResourceMap.SetAsBool(‘GeneralLinearModelAlgorithm-UseAnalyticalAmplitudeEstimate’, True) to use the reduction associated to \sigma. It has no effect if d>1 or if d=1 and \sigma is not part of \vect{p}

  • ResourceMap.SetAsBool(‘GeneralLinearModelAlgorithm-UnbiasedVariance’, True) allows one to use the unbiased estimate of \sigma where \dfrac{1}{N} is replaced by \dfrac{1}{N-p} in the optimality condition for \sigma.

With huge samples, the hierarchical matrix implementation could be used if OpenTURNS had been compiled with hmat-oss support.

This implementation, which is based on a compressed representation of an approximated covariance matrix (and its Cholesky factor), has a better complexity both in terms of memory requirements and floating point operations. To use it, the GeneralLinearModelAlgorithm-LinearAlgebra resource map key should be instancied to HMAT. Default value of the key is LAPACK.

A known centered gaussian observation noise \epsilon_k can be taken into account with setNoise():

\hat{\vect{y}}_k = \vect{y}_k + \epsilon_k, \epsilon_k \sim \mathcal{N}(0, \tau_k^2)

Examples

Create the model \cM: \Rset \mapsto \Rset and the samples:

>>> import openturns as ot
>>> f = ot.SymbolicFunction(['x'], ['x+x * sin(x)'])
>>> inputSample = ot.Sample([[1.0], [3.0], [5.0], [6.0], [7.0], [8.0]])
>>> outputSample = f(inputSample)

Create the algorithm:

>>> f1 = ot.SymbolicFunction(['x'], ['sin(x)'])
>>> f2 = ot.SymbolicFunction(['x'], ['x'])
>>> f3 = ot.SymbolicFunction(['x'], ['cos(x)'])
>>> basis = ot.Basis([f1,f2, f3])
>>> covarianceModel = ot.SquaredExponential([1.0])
>>> covarianceModel.setActiveParameter([])
>>> algo = ot.GeneralLinearModelAlgorithm(inputSample, outputSample, covarianceModel, basis)
>>> algo.run()

Get the resulting meta model:

>>> result = algo.getResult()
>>> metamodel = result.getMetaModel()

Methods

BuildDistribution(inputSample)

Recover the distribution, with metamodel performance in mind.

getClassName()

Accessor to the object's name.

getDistribution()

Accessor to the joint probability density function of the physical input vector.

getId()

Accessor to the object's id.

getInputSample()

Accessor to the input sample.

getName()

Accessor to the object's name.

getNoise()

Observation noise variance accessor.

getObjectiveFunction()

Accessor to the log-likelihood function that writes as argument of the covariance's model parameters.

getOptimizationAlgorithm()

Accessor to solver used to optimize the covariance model parameters.

getOptimizationBounds()

Optimization bounds accessor.

getOptimizeParameters()

Accessor to the covariance model parameters optimization flag.

getOutputSample()

Accessor to the output sample.

getResult()

Get the results of the metamodel computation.

getShadowedId()

Accessor to the object's shadowed id.

getVisibility()

Accessor to the object's visibility state.

hasName()

Test if the object is named.

hasVisibleName()

Test if the object has a distinguishable name.

run()

Compute the response surface.

setDistribution(distribution)

Accessor to the joint probability density function of the physical input vector.

setName(name)

Accessor to the object's name.

setNoise(noise)

Observation noise variance accessor.

setOptimizationAlgorithm(solver)

Accessor to the solver used to optimize the covariance model parameters.

setOptimizationBounds(optimizationBounds)

Optimization bounds accessor.

setOptimizeParameters(optimizeParameters)

Accessor to the covariance model parameters optimization flag.

setShadowedId(id)

Accessor to the object's shadowed id.

setVisibility(visible)

Accessor to the object's visibility state.

__init__(*args)
static BuildDistribution(inputSample)

Recover the distribution, with metamodel performance in mind.

For each marginal, find the best 1-d continuous parametric model else fallback to the use of a nonparametric one.

The selection is done as follow:

  • We start with a list of all parametric models (all factories)

  • For each model, we estimate its parameters if feasible.

  • We check then if model is valid, ie if its Kolmogorov score exceeds a threshold fixed in the MetaModelAlgorithm-PValueThreshold ResourceMap key. Default value is 5%

  • We sort all valid models and return the one with the optimal criterion.

For the last step, the criterion might be BIC, AIC or AICC. The specification of the criterion is done through the MetaModelAlgorithm-ModelSelectionCriterion ResourceMap key. Default value is fixed to BIC. Note that if there is no valid candidate, we estimate a non-parametric model (KernelSmoothing or Histogram). The MetaModelAlgorithm-NonParametricModel ResourceMap key allows selecting the preferred one. Default value is Histogram

One each marginal is estimated, we use the Spearman independence test on each component pair to decide whether an independent copula. In case of non independence, we rely on a NormalCopula.

Parameters:
sampleSample

Input sample.

Returns:
distributionDistribution

Input distribution.

getClassName()

Accessor to the object’s name.

Returns:
class_namestr

The object class name (object.__class__.__name__).

getDistribution()

Accessor to the joint probability density function of the physical input vector.

Returns:
distributionDistribution

Joint probability density function of the physical input vector.

getId()

Accessor to the object’s id.

Returns:
idint

Internal unique identifier.

getInputSample()

Accessor to the input sample.

Returns:
inputSampleSample

Input sample of a model evaluated apart.

getName()

Accessor to the object’s name.

Returns:
namestr

The name of the object.

getNoise()

Observation noise variance accessor.

Parameters:
noisesequence of positive float

The noise variance \tau_k^2 of each output value.

getObjectiveFunction()

Accessor to the log-likelihood function that writes as argument of the covariance’s model parameters.

Returns:
logLikelihoodFunction

The log-likelihood function degined in (1) as a function of (\vect{\theta}, \vect{\sigma}).

Notes

The log-likelihood function may be useful for some postprocessing: maximization using external optimizers for example.

Examples

Create the model \cM: \Rset \mapsto \Rset and the samples:

>>> import openturns as ot
>>> f = ot.SymbolicFunction(['x0'], ['x0 * sin(x0)'])
>>> inputSample = ot.Sample([[1.0], [3.0], [5.0], [6.0], [7.0], [8.0]])
>>> outputSample = f(inputSample)

Create the algorithm:

>>> basis = ot.ConstantBasisFactory().build()
>>> covarianceModel = ot.SquaredExponential(1)
>>> algo = ot.GeneralLinearModelAlgorithm(inputSample, outputSample, covarianceModel, basis)
>>> algo.run()

Get the log-likelihood function:

>>> likelihoodFunction = algo.getObjectiveFunction()
getOptimizationAlgorithm()

Accessor to solver used to optimize the covariance model parameters.

Returns:
algorithmOptimizationAlgorithm

Solver used to optimize the covariance model parameters. Default optimizer is TNC

getOptimizationBounds()

Optimization bounds accessor.

Returns:
boundsInterval

Bounds for covariance model parameter optimization.

getOptimizeParameters()

Accessor to the covariance model parameters optimization flag.

Returns:
optimizeParametersbool

Whether to optimize the covariance model parameters.

getOutputSample()

Accessor to the output sample.

Returns:
outputSampleSample

Output sample of a model evaluated apart.

getResult()

Get the results of the metamodel computation.

Returns:
resultGeneralLinearModelResult

Structure containing all the results obtained after computation and created by the method run().

getShadowedId()

Accessor to the object’s shadowed id.

Returns:
idint

Internal unique identifier.

getVisibility()

Accessor to the object’s visibility state.

Returns:
visiblebool

Visibility flag.

hasName()

Test if the object is named.

Returns:
hasNamebool

True if the name is not empty.

hasVisibleName()

Test if the object has a distinguishable name.

Returns:
hasVisibleNamebool

True if the name is not empty and not the default one.

run()

Compute the response surface.

Notes

It computes the response surface and creates a GeneralLinearModelResult structure containing all the results.

setDistribution(distribution)

Accessor to the joint probability density function of the physical input vector.

Parameters:
distributionDistribution

Joint probability density function of the physical input vector.

setName(name)

Accessor to the object’s name.

Parameters:
namestr

The name of the object.

setNoise(noise)

Observation noise variance accessor.

Parameters:
noisesequence of positive float

The noise variance \tau_k^2 of each output value.

setOptimizationAlgorithm(solver)

Accessor to the solver used to optimize the covariance model parameters.

Parameters:
algorithmOptimizationAlgorithm

Solver used to optimize the covariance model parameters.

setOptimizationBounds(optimizationBounds)

Optimization bounds accessor.

Parameters:
boundsInterval

Bounds for covariance model parameter optimization.

Notes

Parameters involved by this method are:

  • Scale parameters,

  • Amplitude parameters if output dimension is greater than one or analytical sigma disabled,

  • Additional parameters.

Lower & upper bounds are defined in resource map. Default lower upper bounds value for all parameters is 10^{-2} and defined thanks to the GeneralLinearModelAlgorithm-DefaultOptimizationLowerBound resource map key.

For scale parameters, default upper bounds are set as 2 times the difference between the max and min values of X for each coordinate, X being the (transformed) input sample. The value 2 is defined in resource map (GeneralLinearModelAlgorithm-DefaultOptimizationScaleFactor).

Finally for other parameters (amplitude,…), default upper bound is set to 100 (corresponding resource map key is GeneralLinearModelAlgorithm-DefaultOptimizationUpperBound)

setOptimizeParameters(optimizeParameters)

Accessor to the covariance model parameters optimization flag.

Parameters:
optimizeParametersbool

Whether to optimize the covariance model parameters.

setShadowedId(id)

Accessor to the object’s shadowed id.

Parameters:
idint

Internal unique identifier.

setVisibility(visible)

Accessor to the object’s visibility state.

Parameters:
visiblebool

Visibility flag.

Examples using the class

Create a general linear model metamodel

Create a general linear model metamodel