PartialRegression

PartialRegression(firstSample, secondSample, selection, level=0.05)

Test a linear correlation between two samples.

Available usages:

LinearModelTest.PartialRegression(firstSample, secondSample, selection)

LinearModelTest.PartialRegression(firstSample, secondSample, selection, level)

Parameters:
firstSample : 2-d sequence of float

First tested sample, of dimension p \geq 1.

secondSample : 2-d sequence of float

Second tested sample, of dimension 1.

selection : sequence of int, maximum integer value < p

List of indices selecting the components of the first sample that are tested against the second sample through the regression test.

level : positive float < 1

Threshold p-value of the test (Type I error).

Default value is 0.05.

Returns:
testResults : Collection of TestResult

Results for each coefficient of the linear model, including the intercept.

Notes

The PartialRegression method fits a linear model of Y with respect to the selected components of (X_1, \dots, X_p), using a sample of (X_1, \dots, X_p) (called firstSample) and a sample of Y (called secondSample):

Y = a_0 + \sum_{k \in I} a_k X_k

where I is the set of selected indices.

The method returns a collection of TestResult, one for each coefficient of the linear relation, (a_k)_{k \in \{0\} \cup I}. Each one tests the null hypothesis: the coefficient is null. The statistic used is detailed in LinearModelAnalysis and is based on the evaluation of the t_{score}: see getCoefficientsTScores() and getCoefficientsPValues().

If the result of the test is True, the null hypothesis is not rejected and the coefficient is assumed to be null: there is no linear relation between the tested components. If the result of the test is False, the coefficient is assumed to be significantly different from 0: there is a linear relation between the tested components.

The Partial Regression Test is used to assess the linearity between a subset of components of firstSample and secondSample. The selection parameter makes it possible to choose which components of firstSample are tested.

Examples

We create a sample generated by a Gaussian vector (X_1, X_2, X_3) with zero mean and unit variance, whose components (X_1, X_3) are correlated.

We fit the linear model: X_3 = a_0 + a_1X_1 and we test if each coefficient is significantly different from 0.

>>> import openturns as ot
>>> ot.RandomGenerator.SetSeed(0)
>>> S = ot.CorrelationMatrix(3)
>>> distribution = ot.Normal([0]*3, [1]*3, S)
>>> sample = distribution.getSample(30)
>>> firstSample = sample.getMarginal([0,1])
>>> secondSample = sample.getMarginal(2)
>>> selection = [1]
>>> test_result = ot.LinearModelTest.PartialRegression(firstSample, secondSample, selection)
>>> print(test_result[1])
class=TestResult name=Unnamed type=Regression binaryQualityMeasure=true p-value threshold=0.05 p-value=0.172847 statistic=1.39882 description=[]