LinearModelResult

class LinearModelResult(*args)

Result of a LinearModelAlgorithm.

Parameters:
inputSample : 2-d sequence of float

The input sample of the model.

basis : Basis

Functional basis to estimate the trend.

design : Matrix

The design matrix \mat{\Psi}.

outputSample : 2-d sequence of float

The output sample (Y_i)_{1 \leq i \leq \sampleSize}.

metaModel : Function

The metamodel.

coefficients : sequence of float

The estimated coefficients \hat{\vect{a}}.

formula : str

The formula description.

coefficientsNames : sequence of str

The names of the basis coefficients.

sampleResiduals : 2-d sequence of float

The residual errors (\varepsilon_i)_{1 \leq i \leq \sampleSize}.

standardizedSampleResiduals : 2-d sequence of float

The standardized residuals defined in (9).

diagonalGramInverse : sequence of float

The diagonal of the Gram matrix inverse.

leverages : sequence of float

The leverages (\ell_i)_{1 \leq i \leq \sampleSize} defined in (7).

cookDistances : sequence of float

Cook's distances defined in (2).

residualsVariance : float

The unbiased variance estimator of the residuals defined in (8).

Methods

buildMethod()

Accessor to the least squares method.

getAdjustedRSquared()

Accessor to the adjusted R-squared indicator.

getBasis()

Accessor to the basis.

getClassName()

Accessor to the object's class name.

getCoefficients()

Accessor to the coefficients of the linear model.

getCoefficientsNames()

Accessor to the coefficients names.

getCoefficientsStandardErrors()

Accessor to the standard errors of the coefficients.

getCookDistances()

Accessor to the Cook's distances.

getDegreesOfFreedom()

Accessor to the degrees of freedom.

getDesign()

Accessor to the design matrix.

getDiagonalGramInverse()

Accessor to the diagonal of the Gram matrix inverse.

getFittedSample()

Accessor to the fitted sample.

getFormula()

Accessor to the formula.

getInputSample()

Accessor to the input sample.

getLeverages()

Accessor to the leverages.

getMetaModel()

Accessor to the metamodel.

getName()

Accessor to the object's name.

getNoiseDistribution()

Accessor to the normal distribution of the residuals.

getOutputSample()

Accessor to the output sample.

getRSquared()

Accessor to the R-squared indicator.

getResidualsVariance()

Accessor to the unbiased sample variance of the residuals.

getSampleResiduals()

Accessor to the residuals.

getStandardizedResiduals()

Accessor to the standardized residuals.

hasIntercept()

Returns whether the basis contains an intercept.

hasName()

Test if the object is named.

involvesModelSelection()

Get the model selection flag.

setInputSample(sampleX)

Accessor to the input sample.

setInvolvesModelSelection(involvesModelSelection)

Set the model selection flag.

setMetaModel(metaModel)

Accessor to the metamodel.

setName(name)

Accessor to the object's name.

setOutputSample(sampleY)

Accessor to the output sample.

getRelativeErrors

getResiduals

setRelativeErrors

setResiduals

__init__(*args)

buildMethod()

Accessor to the least squares method.

Returns:
leastSquaresMethod: LeastSquaresMethod

The least squares method.

Notes

The least squares method used to estimate the coefficients is specified in the ResourceMap class, entry LinearModelAlgorithm-DecompositionMethod.

getAdjustedRSquared()

Accessor to the adjusted R-squared indicator.

Returns:
adjustedRSquared : float

The R_{ad}^2 indicator.

Notes

The R_{ad}^2 value quantifies the quality of the linear approximation. With respect to R^2, R_{ad}^2 takes into account the sample size and the number of coefficients in the model.

If the model is defined by (3) such that the basis does not contain any intercept (constant function), then R_{ad}^2 is defined by:

R_{ad}^2 = 1 - \left(\dfrac{\sampleSize}{dof}\right)(1 - R^2)

where dof is the degrees of freedom of the model defined in (4) and \sampleSize is the sample size.

Otherwise, when the model is defined by (1) or by (3) with an intercept, R_{ad}^2 is defined by:

R_{ad}^2 = 1 - \left(\dfrac{\sampleSize - 1}{dof}\right)(1 - R^2)

where dof is defined in (3) or (4).

If the number of degrees of freedom dof is zero, R_{ad}^2 is not defined.
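
The two formulas above can be checked numerically. The following is a minimal NumPy sketch on hypothetical data; it illustrates the formulas only, not the OpenTURNS implementation:

```python
import numpy as np

# Hypothetical data: 6 observations, a model with an intercept and one
# regressor, so dof = n - (p + 1) = 6 - 2 = 4.
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1, 6.0])
y_hat = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # fitted values
eps = y - y_hat                                   # residuals
n, dof = len(y), len(y) - 2

# R^2 for a model with an intercept: compare the residual sum of squares
# to the deviations of the output from its mean.
r2 = 1.0 - np.sum(eps**2) / np.sum((y - y.mean())**2)

# Adjusted R^2 for a model with an intercept.
r2_adj = 1.0 - (n - 1) / dof * (1.0 - r2)
assert r2_adj <= r2  # the adjustment always penalizes R^2
```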

getBasis()

Accessor to the basis.

Returns:
basisBasis

The basis of the regression model.

Notes

If a functional basis was provided to the constructor, then it is returned: (\phi_j)_{1 \leq j \leq p'}. Its size is p'.

Otherwise, the functional basis is composed of the projections \phi_k : \Rset^p \rightarrow \Rset such that \phi_k(\vect{x}) = x_k for 1 \leq k \leq p, completed with the constant function: \phi_0 : \vect{x} \rightarrow 1. Its size is p+1.

getClassName()

Accessor to the object's class name.

Returns:
class_name : str

The object class name (object.__class__.__name__).

getCoefficients()

Accessor to the coefficients of the linear model.

Returns:
coefficients : Point

The estimate of the coefficients \hat{\vect{a}}.

getCoefficientsNames()

Accessor to the coefficients names.

Returns:
coefficientsNames : Description

Notes

The name of the coefficient a_k is the name of the regressor X_k.

getCoefficientsStandardErrors()

Accessor to the standard errors of the coefficients.

Returns:
standardErrors : Point

Notes

The standard deviation \sigma(a_k) of the estimator \hat{a}_k is defined by:

(1)\sigma(a_k)^2 = \sigma^2 \left(\Tr{\mat{\Psi}}\mat{\Psi}\right)^{-1}_{k+1, k+1}

where:

  • the variance \sigma^2 of the residual \varepsilon is approximated by its unbiased estimator \hat{\sigma}^2 defined in (8),

  • the matrix \mat{\Psi} is the design matrix defined in (5) or (6).
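
Formula (1) can be sketched with NumPy on hypothetical data; the variable names below are illustrative and are not part of the OpenTURNS API:

```python
import numpy as np

# Hypothetical data: n = 5 observations, design matrix with an intercept
# column and one regressor (eq. (5)).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.9, 3.1, 4.9, 7.2, 8.9])
psi = np.column_stack([np.ones_like(x), x])      # design matrix
a_hat, *_ = np.linalg.lstsq(psi, y, rcond=None)  # estimated coefficients
eps = y - psi @ a_hat                            # residuals
dof = len(y) - psi.shape[1]                      # n - (p + 1)
sigma2 = np.sum(eps**2) / dof                    # unbiased variance, eq. (8)
gram_inv = np.linalg.inv(psi.T @ psi)            # (Psi' Psi)^-1
std_errors = np.sqrt(sigma2 * np.diag(gram_inv))  # eq. (1), one per coefficient
```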

getCookDistances()

Accessor to the Cook's distances.

Returns:
cookDistances : Point

The Cook's distance of each observation (CookD_i)_{1 \leq i \leq \sampleSize}.

Notes

The Cook's distance measures the impact of each observation on the linear regression. See [rawlings2001] (section 11.2.1, Cook's D page 362) for more details.

The Cook's distance of observation i is defined by:

(2)CookD_i = \left(\dfrac{1}{\sampleSize-dof}\right) \left( \dfrac{\ell_i}{1 - \ell_i} \right) (\varepsilon_i^{st})^2

where \varepsilon_i^{st} is the standardized residual defined in (9) and dof is the degrees of freedom defined in (3) or (4).
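
Formula (2) combines the leverages (7) and the standardized residuals (9). A minimal NumPy sketch on hypothetical data (illustrating the formulas, not the OpenTURNS implementation):

```python
import numpy as np

# Hypothetical data: the last observation is isolated in x, so it has
# the largest leverage and a potentially large Cook's distance.
x = np.array([0.0, 1.0, 2.0, 3.0, 10.0])
y = np.array([0.1, 1.1, 1.9, 3.2, 9.5])
psi = np.column_stack([np.ones_like(x), x])      # design matrix, eq. (5)
a_hat, *_ = np.linalg.lstsq(psi, y, rcond=None)
eps = y - psi @ a_hat                            # residuals
n, p1 = psi.shape                                # n observations, p + 1 columns
dof = n - p1                                     # eq. (3)
sigma2 = np.sum(eps**2) / dof                    # eq. (8)
hat = psi @ np.linalg.inv(psi.T @ psi) @ psi.T   # hat matrix H
lev = np.diag(hat)                               # leverages, eq. (7)
eps_st = eps / np.sqrt(sigma2 * (1.0 - lev))     # standardized residuals, eq. (9)
cook = (1.0 / (n - dof)) * (lev / (1.0 - lev)) * eps_st**2  # eq. (2)
```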

getDegreesOfFreedom()

Accessor to the degrees of freedom.

Returns:
dof : int, \geq 1

Number of degrees of freedom.

Notes

If the linear model is defined by (1), the degrees of freedom dof is:

(3)dof = \sampleSize - (p + 1)

where p is the number of regressors.

Otherwise, the linear model is defined by (3) and its dof is:

(4)dof = \sampleSize - p'

where p' is the number of functions in the provided basis.

getDesign()

Accessor to the design matrix.

Returns:
design: Matrix

The design matrix \mat{\Psi}.

Notes

If the linear model is defined by (1), the design matrix is:

(5)\mat{\Psi} = (\vect{1}, \vect{X}_1, \dots, \vect{X}_{p})

where \vect{1} = \Tr{(1, \dots, 1)} and \vect{X}_k = \Tr{(X_k^1, \dots, X_k^\sampleSize)} contains the values of the regressor k at the \sampleSize observations. Thus, \mat{\Psi} has \sampleSize rows and p+1 columns.

If the linear model is defined by (3), the design matrix is:

(6)\mat{\Psi} = (\vect{\phi}_1, \dots, \vect{\phi}_{p'})

where \vect{\phi}_j = \Tr{(\phi_j(\vect{X}^1), \dots, \phi_j(\vect{X}^\sampleSize))} contains the values of the function \phi_j at the \sampleSize observations. Thus, \mat{\Psi} has \sampleSize rows and p' columns.
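
The design matrix of eq. (5) can be sketched as follows with NumPy, on a hypothetical input sample (illustration only, not the OpenTURNS implementation):

```python
import numpy as np

# Hypothetical input sample: n = 4 observations of p = 2 regressors.
X = np.array([[0.0, 1.0],
              [1.0, 0.5],
              [2.0, 0.0],
              [3.0, 2.0]])
n, p = X.shape

# Eq. (5): a leading column of ones for the intercept, then one column
# per regressor, giving an (n, p + 1) design matrix.
psi = np.column_stack([np.ones(n), X])
assert psi.shape == (n, p + 1)
```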

getDiagonalGramInverse()

Accessor to the diagonal of the Gram matrix inverse.

Returns:
diagonalGramInverse : Point

The diagonal of the Gram inverse matrix.

Notes

The Gram matrix is \Tr{\mat{\Psi}}\mat{\Psi} where \mat{\Psi} is the design matrix defined in (5) or (6).

getFittedSample()

Accessor to the fitted sample.

Returns:
outputSample : Sample

Notes

The fitted sample is (\hat{Y}_1, \dots, \hat{Y}_\sampleSize) where \hat{Y}_i is defined in (2) or (4).

getFormula()

Accessor to the formula.

Returns:
condensedFormula : str

Notes

The formula describing the fitted linear model.

getInputSample()

Accessor to the input sample.

Returns:
inputSample : Sample

The input sample.

getLeverages()

Accessor to the leverages.

Returns:
leverages : Point

The leverages of the observations (\ell_i)_{1 \leq i \leq \sampleSize}.

Notes

We denote by \hat{\vect{Y}} = (\hat{Y}_1, \dots, \hat{Y}_\sampleSize) the fitted values of the \sampleSize observations. Then we have:

\hat{\vect{Y}} =  \mat{\Psi} \hat{\vect{a}}

where \mat{\Psi} is the design matrix defined in (5). It leads to:

\Var{\hat{\vect{Y}}} & = \mat{\Psi} \Var{\hat{\vect{a}}}\Tr{\mat{\Psi}} \\
                     & = \sigma^2 \mat{H}

where:

\mat{H} = \mat{\Psi} (\Tr{\mat{\Psi}}\mat{\Psi})^{-1} \Tr{\mat{\Psi}}

Thus, for observation i, we get:

(7)\Var{\hat{Y}_i} = \sigma^2 \ell_{ii}

where \ell_{ii} is the i-th element of the diagonal of \mat{H}: \ell_{ii} is the leverage \ell_i of observation i.
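
The derivation above can be checked numerically: the leverages are the diagonal of the hat matrix \mat{H}. A minimal NumPy sketch on hypothetical data (not the OpenTURNS implementation):

```python
import numpy as np

# Hypothetical design matrix: intercept plus one regressor, n = 4.
x = np.array([0.0, 1.0, 2.0, 3.0])
psi = np.column_stack([np.ones_like(x), x])     # design matrix, eq. (5)

# H = Psi (Psi' Psi)^-1 Psi'; its diagonal holds the leverages of eq. (7).
hat = psi @ np.linalg.inv(psi.T @ psi) @ psi.T
lev = np.diag(hat)

# For a full-rank design, the leverages lie in (0, 1) and sum to the
# number of coefficients (the trace of H equals its rank).
assert np.isclose(lev.sum(), psi.shape[1])
```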

getMetaModel()

Accessor to the metamodel.

Returns:
metaModel : Function

Metamodel.

getName()

Accessor to the object’s name.

Returns:
name : str

The name of the object.

getNoiseDistribution()

Accessor to the normal distribution of the residuals.

Returns:
noiseDistribution : Normal

The normal distribution estimated from the residuals.

Notes

The noise distribution is the distribution of the residuals. It is assumed to be Gaussian. The normal distribution has zero mean and its variance is estimated from the residuals sample (\varepsilon_i)_{1 \leq i \leq \sampleSize} defined in (5), using the unbiased estimator defined in (8).

If the residuals are not Gaussian, this distribution is not appropriate and should not be used.

getOutputSample()

Accessor to the output sample.

Returns:
outputSample : Sample

The output sample.

getRSquared()

Accessor to the R-squared indicator.

Returns:
rSquared : float

The indicator R^2.

Notes

The R^2 value quantifies the quality of the linear approximation.

If the model is defined by (3) such that the basis does not contain any intercept (constant function), then R^2 is defined by:

R^2 = 1- \dfrac{\sum_{i=1}^\sampleSize \varepsilon_i^2}{\sum_{i=1}^\sampleSize Y_i^2}

where the \varepsilon_i are the residuals defined in (5) and Y_i the output sample values.

Otherwise, when the model is defined by (1) or by (3) with an intercept, R^2 is defined by:

R^2 = 1- \dfrac{\sum_{i=1}^\sampleSize \varepsilon_i^2}{\sum_{i=1}^\sampleSize (Y_i-\bar{Y})^2}

where \bar{Y} = \dfrac{1}{\sampleSize} \sum_{i=1}^\sampleSize Y_i.
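
The two R^2 definitions can be sketched with NumPy on hypothetical residuals (illustration of the formulas only, not the OpenTURNS implementation):

```python
import numpy as np

# Hypothetical output sample and residuals.
y = np.array([1.0, 2.0, 3.0, 4.0])
eps = np.array([0.1, -0.1, 0.1, -0.1])

# Model without an intercept: the residual sum of squares is compared
# to the raw sum of squares of the output.
r2_no_intercept = 1.0 - np.sum(eps**2) / np.sum(y**2)

# Model with an intercept: the comparison uses the deviations of the
# output from its mean.
r2_intercept = 1.0 - np.sum(eps**2) / np.sum((y - y.mean())**2)
```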

getResidualsVariance()

Accessor to the unbiased sample variance of the residuals.

Returns:
residualsVariance : float

The residuals variance estimator.

Notes

The residual variance estimator is the unbiased empirical variance of the residuals:

(8)\hat{\sigma}^2 = \dfrac{1}{dof} \sum_{i=1}^\sampleSize  \varepsilon_i^2

where dof is the degrees of freedom of the model defined in (3) or (4).
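
Formula (8) in a one-line NumPy sketch, on hypothetical residuals (not the OpenTURNS implementation):

```python
import numpy as np

# Hypothetical residuals for n = 5 observations and a model with an
# intercept and one regressor, so dof = n - (p + 1) = 3.
eps = np.array([0.1, -0.2, 0.15, -0.05, 0.0])
dof = 3

# Eq. (8): divide by the degrees of freedom, not by n, to get an
# unbiased estimator of the residual variance.
sigma2_hat = np.sum(eps**2) / dof
```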

getSampleResiduals()

Accessor to the residuals.

Returns:
sampleResiduals : Sample

The sample of the residuals.

Notes

The residuals sample is (\varepsilon_i)_{1 \leq i \leq \sampleSize} defined in (5).

getStandardizedResiduals()

Accessor to the standardized residuals.

Returns:
standardizedResiduals : Sample

The standardized residuals (\varepsilon_i^{st})_{1 \leq i \leq \sampleSize}.

Notes

The standardized residuals are defined by:

(9)\varepsilon_i^{st} = \dfrac{\varepsilon_i}{\sqrt{\hat{\sigma}^2(1 - \ell_i)}}

where \hat{\sigma}^2 is the unbiased residuals variance defined in (8) and \ell_i is the leverage of observation i defined in (7).
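
Formula (9) scales each residual by its own estimated standard deviation, which depends on the leverage of the observation. A minimal NumPy sketch on hypothetical values (not the OpenTURNS implementation):

```python
import numpy as np

# Hypothetical residuals and leverages for n = 4 observations, with
# dof = n - (p + 1) = 2 for an intercept plus one regressor.
eps = np.array([0.2, -0.1, 0.3, -0.4])
lev = np.array([0.7, 0.3, 0.3, 0.7])
dof = 2

sigma2 = np.sum(eps**2) / dof                  # unbiased variance, eq. (8)
eps_st = eps / np.sqrt(sigma2 * (1.0 - lev))   # standardized residuals, eq. (9)
```

High-leverage observations get a smaller denominator, so the same raw residual is amplified where the fit is most constrained by a single point.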

hasIntercept()

Returns whether the basis contains an intercept.

Returns:
intercept : bool

Tells if the model has a constant regressor.

Notes

The returned value is True when:

  • the model is defined by (1),

  • the model is defined by (3) and the basis contains a constant function.

hasName()

Test if the object is named.

Returns:
hasName : bool

True if the name is not empty.

involvesModelSelection()

Get the model selection flag.

A model selection method can be used to select the coefficients that best predict the output. Model selection can lead to a sparse model.

Returns:
involvesModelSelection : bool

True if a model selection method was used.

setInputSample(sampleX)

Accessor to the input sample.

Parameters:
inputSample : Sample

The input sample.

setInvolvesModelSelection(involvesModelSelection)

Set the model selection flag.

A model selection method can be used to select the coefficients that best predict the output. Model selection can lead to a sparse model.

Parameters:
involvesModelSelection : bool

True if a model selection method was used.

setMetaModel(metaModel)

Accessor to the metamodel.

Parameters:
metaModel : Function

Metamodel.

setName(name)

Accessor to the object’s name.

Parameters:
name : str

The name of the object.

setOutputSample(sampleY)

Accessor to the output sample.

Parameters:
outputSample : Sample

The output sample.

Examples using the class

Build and validate a linear model

Create a linear model

Perform stepwise regression