GaussianProcessRegressionCrossValidation

class GaussianProcessRegressionCrossValidation(*args)

Validate a Gaussian Process Regression surrogate model.

Warning

This class is experimental and likely to be modified in future releases. To use it, import the openturns.experimental submodule.

Parameters:
result : GaussianProcessRegressionResult

A Gaussian Process Regression result.

splitter : SplitterImplementation, optional

The cross-validation method. For now, only the default LeaveOneOutSplitter can be used.

Methods

computeMeanSquaredError()

Accessor to the mean squared error.

computeR2Score()

Compute the R2 score.

drawValidation()

Plot a model vs metamodel graph for visual validation.

getClassName()

Accessor to the object's name.

getGaussianProcessRegressionResult()

Result accessor.

getLeaveOneOutStandardDeviations()

Get the Leave-One-Out prediction standard deviations.

getMetamodelPredictions()

Accessor to the output predictions from the metamodel.

getName()

Accessor to the object's name.

getOutputSample()

Accessor to the output sample.

getResidualDistribution([smooth])

Compute the non-parametric distribution of the residual sample.

getResidualSample()

Compute the residual sample.

getSplitter()

Get the cross-validation method.

hasName()

Test if the object is named.

setName(name)

Accessor to the object's name.

Notes

A GaussianProcessRegressionCrossValidation object is used for the validation of a Gaussian Process Regression. It is based on the fast (analytical) leave-one-out cross-validation method presented in [ginsbourger2025] (Equation 23).

Note that this method relies on linear algebra and reuses the covariance model parameters fitted on the whole data set. It is therefore not strictly equivalent to the naive cross-validation method, which consists of re-fitting the Gaussian Process Regression model on each training subset.
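The naive method mentioned above can be sketched in plain Python. Here a toy least-squares line stands in for the Gaussian Process Regression model (the function names and the toy surrogate are illustrative assumptions, not part of the OpenTURNS API); the point is the re-fitting loop, which the fast analytical method avoids:

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y ~ a + b * x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def naive_loo_residuals(xs, ys):
    """Naive leave-one-out: re-fit the surrogate on each training subset."""
    residuals = []
    for j in range(len(xs)):
        xs_train = xs[:j] + xs[j + 1:]
        ys_train = ys[:j] + ys[j + 1:]
        a, b = fit_line(xs_train, ys_train)
        # Residual at the held-out point, predicted by the re-fitted model.
        residuals.append(ys[j] - (a + b * xs[j]))
    return residuals

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 2.1, 2.9, 4.2]
res = naive_loo_residuals(xs, ys)
```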

Examples

Create a Gaussian Process Regression surrogate for the Ishigami function.

>>> import openturns as ot
>>> from openturns.experimental import GaussianProcessRegressionCrossValidation
>>> from openturns.usecases import ishigami_function
>>> im = ishigami_function.IshigamiModel()
>>> sampleSize = 500 
>>> inputTrain = im.distribution.getSample(sampleSize)
>>> outputTrain = im.model(inputTrain)
>>> covariance_kernel = ot.SquaredExponential(inputTrain.getDimension())
>>> basis = ot.ConstantBasisFactory(inputTrain.getDimension()).build()
>>> gpf = ot.GaussianProcessFitter(
...     inputTrain, outputTrain, covariance_kernel, basis
... )
>>> gpf.run()
>>> gpf_result = gpf.getResult()
>>> gpr = ot.GaussianProcessRegression(gpf_result)
>>> gpr.run()
>>> gpr_result = gpr.getResult()

Validate the Gaussian Process Regression surrogate model using leave-one-out cross-validation.

>>> validation = GaussianProcessRegressionCrossValidation(gpr_result)
>>> r2Score = validation.computeR2Score()
>>> print('R2 = ', r2Score[0])
R2 =  0.99...

Draw the validation graph.

>>> graph = validation.drawValidation()
__init__(*args)
computeMeanSquaredError()

Accessor to the mean squared error.

Returns:
meanSquaredError : Point

The mean squared error of each marginal output dimension.

Notes

The sample mean squared error is:

\widehat{\operatorname{MSE}} 
= \frac{1}{n} \sum_{j=1}^{n} \left(y^{(j)} - \tilde{g}\left(\bdx^{(j)}\right)\right)^2

where n \in \Nset is the sample size, \tilde{g} is the metamodel, \{\bdx^{(j)} \in \Rset^{n_X}\}_{j = 1, ..., n} is the input experimental design and \{y^{(j)} \in \Rset\}_{j = 1, ..., n} is the output of the model.

If the output is multi-dimensional, the same calculations are repeated separately for each output marginal k for k = 1, ..., n_y where n_y \in \Nset is the output dimension.
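The sample MSE formula above can be sketched in plain Python for a multi-dimensional output (the function and sample names are hypothetical; `y` holds the model observations and `y_pred` the metamodel predictions, each as an n-by-n_y list of rows):

```python
def mean_squared_error(y, y_pred):
    """Sample MSE per output marginal: (1/n) * sum of squared residuals."""
    n = len(y)
    n_y = len(y[0])
    return [
        sum((y[j][k] - y_pred[j][k]) ** 2 for j in range(n)) / n
        for k in range(n_y)
    ]

# Hypothetical observations and predictions with n = 4 and n_y = 1.
y = [[1.0], [2.0], [3.0], [4.0]]
y_pred = [[1.1], [1.9], [3.2], [3.8]]
mse = mean_squared_error(y, y_pred)  # one value per output dimension
```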

computeR2Score()

Compute the R2 score.

Returns:
r2Score : Point

The coefficient of determination R2.

Notes

The coefficient of determination R^2 is the fraction of the variance of the output explained by the metamodel. It is defined as:

R^2 = 1 - \operatorname{FVU}

where \operatorname{FVU} is the fraction of unexplained variance:

\operatorname{FVU} = \frac{\operatorname{MSE}(\tilde{g}) }{\Var{Y}}

where Y = g(\bdX) is the output of the physical model g, \Var{Y} is the variance of the output and \operatorname{MSE} is the mean squared error of the metamodel:

\operatorname{MSE}(\tilde{g}) = \Expect{\left(g(\bdX) - \tilde{g}(\bdX) \right)^2}.

The sample R^2 is:

\hat{R}^2 = 1 - \frac{\frac{1}{n} \sum_{j=1}^{n} \left(y^{(j)} - \tilde{g}\left(\bdx^{(j)}\right)\right)^2}{\hat{\sigma}^2_Y}

where n \in \Nset is the sample size, \tilde{g} is the metamodel, \left\{\bdx^{(j)} \in \Rset^{n_X}\right\}_{j = 1, ..., n} is the input experimental design, \left\{y^{(j)} \in \Rset\right\}_{j = 1, ..., n} is the output of the model and \hat{\sigma}^2_Y is the sample variance of the output:

\hat{\sigma}^2_Y = \frac{1}{n - 1} \sum_{j=1}^{n} \left(y^{(j)} - \overline{y}\right)^2

where \overline{y} is the output sample mean:

\overline{y} = \frac{1}{n} \sum_{j=1}^{n} y^{(j)}.
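The sample R^2 formula can be checked in plain Python; note the 1/n factor in the MSE against the unbiased 1/(n - 1) factor in the sample variance (the function and sample names are illustrative, not part of the class API):

```python
def r2_score(y, y_pred):
    """Sample R2 = 1 - MSE / sample variance, following the formulas above."""
    n = len(y)
    # MSE with a 1/n factor.
    mse = sum((yj - pj) ** 2 for yj, pj in zip(y, y_pred)) / n
    # Unbiased sample variance with a 1/(n - 1) factor.
    y_bar = sum(y) / n
    var = sum((yj - y_bar) ** 2 for yj in y) / (n - 1)
    return 1.0 - mse / var

# Hypothetical observations and predictions for one output marginal.
y = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
r2 = r2_score(y, y_pred)
```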

drawValidation()

Plot a model vs metamodel graph for visual validation.

Returns:
graph : GridLayout

The visual validation graph.

Notes

The plot presents the metamodel predictions depending on the model observations. If the points are close to the diagonal line of the plot, then the metamodel validation is satisfactory. Points which are far away from the diagonal represent outputs for which the metamodel is not accurate.

If the output is multi-dimensional, the graph has 1 row and n_y \in \Nset columns, where n_y is the output dimension.

getClassName()

Accessor to the object’s name.

Returns:
class_name : str

The object class name (object.__class__.__name__).

getGaussianProcessRegressionResult()

Result accessor.

Returns:
result : GaussianProcessRegressionResult

The result provided.

getLeaveOneOutStandardDeviations()

Get the Leave-One-Out prediction standard deviations.

Returns:
leaveOneOutStandardDeviations : Point

The Leave-One-Out prediction standard deviation for every point left out of the output sample.

Notes

For every point in the training output sample, the Leave-One-Out prediction is assumed, under the underlying Gaussian Process model, to follow a normal distribution with zero mean. This method returns the standard deviation of this normal distribution at every point left out.
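One common use of these standard deviations is a calibration check (an illustrative convention, not part of the class API): standardize each Leave-One-Out residual by its predicted standard deviation and count how many fall within two standard deviations, where well-calibrated Gaussian predictions keep roughly 95% of points inside the band. A minimal sketch with hypothetical values:

```python
def coverage_within_two_sigma(residuals, std_devs):
    """Fraction of residuals within two predicted standard deviations."""
    inside = sum(1 for r, s in zip(residuals, std_devs) if abs(r) <= 2.0 * s)
    return inside / len(residuals)

# Hypothetical LOO residuals and their predicted standard deviations.
residuals = [0.05, -0.12, 0.30, -0.02, 0.08]
std_devs = [0.10, 0.10, 0.10, 0.10, 0.10]
coverage = coverage_within_two_sigma(residuals, std_devs)
```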

getMetamodelPredictions()

Accessor to the output predictions from the metamodel.

Returns:
outputMetamodelSample : Sample

Output sample of the metamodel.

getName()

Accessor to the object’s name.

Returns:
name : str

The name of the object.

getOutputSample()

Accessor to the output sample.

Returns:
outputSample : Sample

The output sample of the model, evaluated independently of the metamodel.

getResidualDistribution(smooth=True)

Compute the non-parametric distribution of the residual sample.

Parameters:
smooth : bool

If True (the default), the distribution is smoothed.

Returns:
residualDistribution : Distribution

The residual distribution.

Notes

If smooth is True, the residual distribution is built using KernelSmoothing. Otherwise, a histogram distribution is built using HistogramFactory.

getResidualSample()

Compute the residual sample.

Returns:
residual : Sample

The residual sample.

Notes

The residual sample is given by:

r^{(j)} = y^{(j)} - \tilde{g}\left(\vect{x}^{(j)}\right)

for j = 1, ..., n where n \in \Nset is the sample size, y^{(j)} is the model observation, \tilde{g} is the metamodel and \vect{x}^{(j)} is the j-th input observation.

If the output is multi-dimensional, the residual sample has dimension n_y \in \Nset, where n_y is the output dimension.
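The residual formula above can be sketched in plain Python for a multi-dimensional output (the function and sample names are hypothetical; `y` holds the model observations and `y_pred` the metamodel predictions, each as an n-by-n_y list of rows):

```python
def residual_sample(y, y_pred):
    """Residuals r_j = y_j - prediction_j, per output marginal."""
    return [
        [y[j][k] - y_pred[j][k] for k in range(len(y[j]))]
        for j in range(len(y))
    ]

# Hypothetical observations and predictions with n = 2 and n_y = 2.
y = [[1.0, 10.0], [2.0, 20.0]]
y_pred = [[0.9, 10.5], [2.2, 19.0]]
r = residual_sample(y, y_pred)
```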

getSplitter()

Get the cross-validation method.

Returns:
splitter : SplitterImplementation

The cross-validation method.

hasName()

Test if the object is named.

Returns:
hasName : bool

True if the name is not empty.

setName(name)

Accessor to the object’s name.

Parameters:
name : str

The name of the object.