LinearModelStepwiseAlgorithm

class LinearModelStepwiseAlgorithm(*args)

Stepwise linear model algorithm.

Parameters:
inputSample, outputSample : 2-d sequence of float

The input and output samples of a model.

basis : Basis

Functional basis to estimate the trend.

minimalIndices : sequence of int

The indices of the minimal model, i.e. the basis terms that are always kept.

direction : int, default=FORWARD

Search direction: BACKWARD, FORWARD or BOTH.

startIndices : sequence of int, default=[]

The indices of the starting model used for the stepwise regression method. Can only be specified in BOTH mode.

Methods

getClassName()

Accessor to the object's name.

getDirection()

Accessor to the direction.

getInputSample()

Accessor to the input sample.

getMaximumIterationNumber()

Accessor to the maximum iteration number.

getName()

Accessor to the object's name.

getOutputSample()

Accessor to the output sample.

getPenalty()

Accessor to the penalty.

getResult()

Accessor to the result.

hasName()

Test if the object is named.

run()

Run the algorithm.

setMaximumIterationNumber(maximumIteration)

Accessor to the maximum iteration number.

setName(name)

Accessor to the object's name.

setPenalty(penalty)

Accessor to the penalty.

Notes

The objective is to select the best linear regression model using the stepwise method. Starting from the basis and minimalIndices, the stepwise strategy consists in adding basis elements (FORWARD), dropping some (BACKWARD), or both adding and dropping elements (BOTH). Each step yields a candidate model, for which the corresponding penalized criterion (AIC or BIC) is computed. The process is repeated until the criterion can no longer be improved or the maximum number of iterations is reached. The resulting regression model between the scalar variable Y and the n-dimensional variable \vect{X} = (X_i)_{i \leq n} reads:

\tilde{Y} = a_0 + \sum_{i \in I} a_i \phi_i(\vect{X}) + \epsilon

where I is the set of selected basis indices, \epsilon is the residual, assumed to follow the standard Normal distribution, and \phi_i is the i-th element of the basis.
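
The forward variant of the strategy described above can be sketched in plain Python. This is only an illustration, not the library's implementation: the helper names (`solve_least_squares`, `criterion`, `forward_stepwise`) are hypothetical, and the criterion n log(RSS/n) + penalty * k is an assumed, commonly used form of AIC/BIC for Gaussian residuals.

```python
import math
import random


def solve_least_squares(columns, y):
    """Solve the normal equations (A^T A) a = A^T y by Gaussian elimination."""
    k = len(columns)
    # Augmented matrix [A^T A | A^T y]
    m = [
        [sum(u * v for u, v in zip(columns[i], columns[j])) for j in range(k)]
        + [sum(u * v for u, v in zip(columns[i], y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting for numerical stability
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            factor = m[r][col] / m[col][col]
            for c in range(col, k + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    coeffs = [0.0] * k
    for i in reversed(range(k)):
        s = m[i][k] - sum(m[i][j] * coeffs[j] for j in range(i + 1, k))
        coeffs[i] = s / m[i][i]
    return coeffs


def criterion(columns, y, penalty):
    """Assumed criterion: n * log(RSS / n) + penalty * (number of terms)."""
    n = len(y)
    a = solve_least_squares(columns, y)
    rss = sum(
        (yi - sum(aj * col[i] for aj, col in zip(a, columns))) ** 2
        for i, yi in enumerate(y)
    )
    return n * math.log(rss / n) + penalty * len(columns)


def forward_stepwise(basis_columns, y, minimal_indices, penalty=2.0, max_iter=100):
    """Add one basis term per step while the criterion keeps decreasing."""
    selected = list(minimal_indices)
    best = criterion([basis_columns[i] for i in selected], y, penalty)
    for _ in range(max_iter):
        candidates = [i for i in range(len(basis_columns)) if i not in selected]
        if not candidates:
            break
        score, index = min(
            (criterion([basis_columns[j] for j in selected + [i]], y, penalty), i)
            for i in candidates
        )
        if score >= best:  # no improvement: stop
            break
        selected.append(index)
        best = score
    return sorted(selected), best


# Synthetic data: y depends on x1 and x2 only; x3 is irrelevant.
random.seed(0)
n = 50
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [random.gauss(0, 1) for _ in range(n)]
y = [1 + 2 * a - 3 * b + 0.1 * random.gauss(0, 1) for a, b in zip(x1, x2)]
basis_columns = [[1.0] * n, x1, x2, x3]  # constant term, then linear terms

selected, best = forward_stepwise(basis_columns, y, minimal_indices=[0])
print(selected, best)
```

The constant term (index 0) plays the role of minimalIndices here: it is in the model from the start and is never a candidate for removal.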

Examples

Definition of the data set

>>> import openturns as ot
>>> ot.RandomGenerator.SetSeed(0)
>>> func = ot.SymbolicFunction(['x1','x2', 'x3'], ['x1 + x2 + sin(x2 * 2 * pi_)/5 + 1e-3 * x3'])
>>> dimension = 3
>>> distribution = ot.JointDistribution([ot.Normal()]*dimension)
>>> input_sample = distribution.getSample(20)
>>> output_sample = func(input_sample)

Creation of a basis

>>> functions = []
>>> input_description = ['x1','x2', 'x3']
>>> functions.append(ot.SymbolicFunction(input_description, ['1.0'])) #Constant term
>>> for i in range(dimension): #Linear terms
...     functions.append(ot.SymbolicFunction(input_description, [input_description[i]])) 
>>> basis = ot.Basis(functions)

Stepwise regression

>>> minimalIndices = [0]
>>> direction = ot.LinearModelStepwiseAlgorithm.FORWARD
>>> penalty = 2.0 #Akaike Information Criterion, log(n) can be used for a BIC
>>> algo_forward = ot.LinearModelStepwiseAlgorithm(input_sample, output_sample, basis, minimalIndices, direction)
>>> algo_forward.setPenalty(penalty)
>>> algo_forward.run()
>>> result_forward = algo_forward.getResult()

__init__(*args)

getClassName()

Accessor to the object’s name.

Returns:
class_name : str

The object class name (object.__class__.__name__).

getDirection()

Accessor to the direction.

Returns:
direction : int

Direction.

getInputSample()

Accessor to the input sample.

Returns:
input_sample : Sample

Input sample.

getMaximumIterationNumber()

Accessor to the maximum iteration number.

Returns:
maximum_iteration : int

Maximum number of iterations.

getName()

Accessor to the object’s name.

Returns:
name : str

The name of the object.

getOutputSample()

Accessor to the output sample.

Returns:
output_sample : Sample

Output sample.

getPenalty()

Accessor to the penalty.

Returns:
penalty : float

Penalty.

getResult()

Accessor to the result.

Returns:
result : LinearModelResult

The result.

hasName()

Test if the object is named.

Returns:
hasName : bool

True if the name is not empty.

run()

Run the algorithm.

setMaximumIterationNumber(maximumIteration)

Accessor to the maximum iteration number.

Parameters:
maximumIteration : int

The maximum number of iterations of the stepwise regression method.

setName(name)

Accessor to the object’s name.

Parameters:
name : str

The name of the object.

setPenalty(penalty)

Accessor to the penalty.

Parameters:
penalty : positive float

The multiple of the degrees of freedom used for the penalty of the stepwise regression method:

  • 2: Akaike information criterion (AIC) (default)

  • log(n): Bayesian information criterion (BIC), where n is the sample size
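
As a small illustration of the two penalty values listed above, the sketch below computes them for a sample of size 20. The helper `stepwise_penalty` is hypothetical and is not part of the library; its return value is what one would pass to setPenalty.

```python
import math


def stepwise_penalty(criterion, sample_size):
    """Return the penalty multiplier for 'AIC' or 'BIC' (hypothetical helper)."""
    if criterion == "AIC":
        return 2.0
    if criterion == "BIC":
        return math.log(sample_size)
    raise ValueError("criterion must be 'AIC' or 'BIC'")


n = 20  # sample size
aic_penalty = stepwise_penalty("AIC", n)
bic_penalty = stepwise_penalty("BIC", n)
print(aic_penalty, bic_penalty)
```

Since log(n) exceeds 2 as soon as n is at least 8, the BIC penalizes each additional term more strongly than the AIC for all but very small samples, and so tends to select smaller models.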

Examples using the class

Perform stepwise regression
