Sample manipulation

This example describes the main statistical functionalities available on data through the Sample object. Here the Sample holds realizations of an output variable of interest.

import openturns as ot

A typical example

A recurring task in uncertainty quantification is to analyze an output variable of interest $Y$ obtained through a model $f$ from input parameters $X$. Here the input parameters $X=(X_1, X_2)$ are two independent standard Normal random variables. We therefore use an IndependentCopula to describe the dependence structure between the two marginals.

# input parameters
inputDist = ot.JointDistribution([ot.Normal()] * 2, ot.IndependentCopula(2))
inputDist.setDescription(["X1", "X2"])

We create a random vector from the 2-d distribution defined above:

inputVector = ot.RandomVector(inputDist)

Suppose our model $f$ is known and reads:

$f(x) = \begin{pmatrix} x_1^2 + x_2 \\ x_1 + x_2^2 \end{pmatrix}$

We define our model $f$ with a SymbolicFunction:

f = ot.SymbolicFunction(["x1", "x2"], ["x1^2+x2", "x2^2+x1"])

Our output vector is $Y = f(X)$, the image of inputVector by the model:

outputVector = ot.CompositeRandomVector(f, inputVector)

We can now draw a sample of $Y$, that is, realizations (here 1000) of the random vector outputVector:

size = 1000
sample = outputVector.getSample(size)
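To make the sampling step concrete, here is a minimal plain-Python sketch of what drawing realizations of $Y = f(X)$ amounts to. It does not use OpenTURNS: the names `model`, `draw_output_sample` and `plain_sample` are hypothetical, and because it relies on Python's own `random` module it will not reproduce the values printed by the library.

```python
import random


def model(x1, x2):
    # Same model as in the text: f(x) = (x1^2 + x2, x1 + x2^2)
    return (x1**2 + x2, x1 + x2**2)


def draw_output_sample(size, seed=0):
    # Hypothetical helper: draw `size` realizations of Y = f(X1, X2),
    # with X1 and X2 independent standard Normal variables.
    rng = random.Random(seed)
    return [model(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(size)]


plain_sample = draw_output_sample(1000)
# Number of points and dimension of each point
print(len(plain_sample), len(plain_sample[0]))
```

Each element of `plain_sample` plays the role of one row of the Sample, with two components per row.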

The sample may be seen as a matrix of size $1000 \times 2$. We print the first 5 points (out of 1000):

sample[:5]

	y0	y1
0	0.5519547	0.6413422
1	2.069226	-1.048988
2	-2.211924	5.340953
3	0.7773735	1.66219
4	4.325861	-1.994259

Basic operations on samples

We have access to basic information about a sample, such as:

minimum and maximum per component

sample.getMin(), sample.getMax()

(class=Point name=Unnamed dimension=2 values=[-3.24513,-2.98342], class=Point name=Unnamed dimension=2 values=[10.6987,10.5037])

the range per component (max-min)

sample.computeRange()

class=Point name=Unnamed dimension=2 values=[13.9438,13.4871]
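As an illustration of what "per component" means here, the following plain-Python sketch computes the minimum, maximum and range column by column on a tiny hand-written data set. The function name `component_min_max_range` and the `toy` data are hypothetical and chosen only for this example.

```python
def component_min_max_range(points):
    # Transpose the list of points so that each tuple is one component (column)
    columns = list(zip(*points))
    mins = [min(c) for c in columns]
    maxs = [max(c) for c in columns]
    # Range of a component is max - min for that component
    ranges = [hi - lo for lo, hi in zip(mins, maxs)]
    return mins, maxs, ranges


toy = [(0.5, 0.6), (2.0, -1.0), (-2.0, 5.0)]
mins, maxs, ranges = component_min_max_range(toy)
print(mins, maxs, ranges)
```

The statistics are computed independently for each component, so the minimum of the first component and the minimum of the second generally come from different points.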

More elaborate functionalities are also available:

get the median per component

sample.computeMedian()

class=Point name=Unnamed dimension=2 values=[0.631033,0.715266]

compute the covariance matrix

sample.computeCovariance()

[[ 2.77895 0.0913307 ]
[ 0.0913307 3.30989 ]]

get the empirical 0.95 quantile per component

sample.computeQuantilePerComponent(0.95)

class=Point name=Unnamed dimension=2 values=[3.96521,4.45879]

get the value of the empirical CDF at a point

point = [1.1, 2.2]
sample.computeEmpiricalCDF(point)

0.569
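The covariance and empirical CDF steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's implementation: it assumes the covariance uses the unbiased normalization (division by n - 1) and that the multivariate empirical CDF at a point is the fraction of sample points whose components are all less than or equal to those of the point. The names `sample_covariance`, `empirical_cdf` and `toy` are hypothetical.

```python
def sample_covariance(points):
    # Sample covariance matrix with division by n - 1
    # (assumed normalization; check the library documentation).
    n = len(points)
    dim = len(points[0])
    means = [sum(p[j] for p in points) / n for j in range(dim)]
    return [
        [
            sum((p[i] - means[i]) * (p[j] - means[j]) for p in points) / (n - 1)
            for j in range(dim)
        ]
        for i in range(dim)
    ]


def empirical_cdf(points, x):
    # Fraction of points p such that p[k] <= x[k] for every component k
    hits = sum(1 for p in points if all(pi <= xi for pi, xi in zip(p, x)))
    return hits / len(points)


toy = [(1.0, 2.0), (2.0, 4.0), (3.0, 9.0)]
print(sample_covariance(toy))
print(empirical_cdf(toy, (2.5, 4.0)))
```

With this definition the empirical CDF is a joint quantity: a point counts only if both of its components fall below the threshold.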

Estimate the statistical moments

We often need the first moments of the output data. They can be estimated from the output sample:

estimate the moment of order 1: the mean

sample.computeMean()

class=Point name=Unnamed dimension=2 values=[0.928558,1.01799]

estimate the standard deviation for each component

sample.computeStandardDeviation()

class=Point name=Unnamed dimension=2 values=[1.66702,1.81931]

estimate the centered moment of order 2: the variance

sample.computeVariance()

class=Point name=Unnamed dimension=2 values=[2.77895,3.30989]

estimate the standardized moment of order 3: the skewness

sample.computeSkewness()

class=Point name=Unnamed dimension=2 values=[1.40965,1.73437]

estimate the standardized moment of order 4: the kurtosis

sample.computeKurtosis()

class=Point name=Unnamed dimension=2 values=[6.84373,7.96431]
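For this particular model the exact moments are known, which gives a reference for the estimates above. Since $Y_1 = X_1^2 + X_2$ is the sum of an independent chi-square variable with one degree of freedom and a standard Normal variable, its mean is 1, its variance is 3, its skewness is $8/3^{3/2} \approx 1.54$ and its (non-excess) kurtosis is $75/9 \approx 8.33$; by symmetry the same holds for $Y_2$. The sketch below compares these values with simple estimators in plain Python. The helper `moments` is hypothetical: it uses the unbiased variance and plug-in standardized moments for skewness and kurtosis, which may differ from the library's estimators by small-sample bias corrections.

```python
import random


def moments(values):
    n = len(values)
    mean = sum(values) / n
    # Central moments with division by n (plug-in estimators)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "variance": m2 * n / (n - 1),  # unbiased variance
        "skewness": m3 / m2**1.5,      # standardized moment of order 3
        "kurtosis": m4 / m2**2,        # standardized moment of order 4 (3 for a Normal)
    }


# Small deterministic check
toy_moments = moments([0.0, 0.0, 0.0, 4.0])

# Exact values for Y1 = X1^2 + X2 with X1, X2 independent standard Normal
theory = {
    "mean": 1.0,
    "variance": 3.0,
    "skewness": 8.0 / 3.0**1.5,
    "kurtosis": 75.0 / 9.0,
}

rng = random.Random(0)
y1 = [rng.gauss(0.0, 1.0) ** 2 + rng.gauss(0.0, 1.0) for _ in range(20000)]
estimates = moments(y1)
for name in theory:
    print(name, round(estimates[name], 3), round(theory[name], 3))
```

Higher-order moments converge slowly for this skewed, heavy-tailed output, so sizeable deviations of the skewness and kurtosis estimates from the exact values are expected at a sample size of 1000.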

Test the correlation

Several measures of correlation can be estimated from the sample:

get the sample linear (Pearson) correlation matrix:

sample.computeLinearCorrelation()

[[ 1 0.0301141 ]
[ 0.0301141 1 ]]

get the sample Kendall correlation matrix:

sample.computeKendallTau()

[[ 1 -0.010026 ]
[ -0.010026 1 ]]

get the sample Spearman correlation matrix:

sample.computeSpearmanCorrelation()

[[ 1 0.00483728 ]
[ 0.00483728 1 ]]
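The near-zero off-diagonal values are consistent with the model: for independent standard Normal inputs, $\mathrm{Cov}(Y_1, Y_2) = \mathrm{Cov}(X_1^2, X_1) + \mathrm{Cov}(X_1^2, X_2^2) + \mathrm{Cov}(X_2, X_1) + \mathrm{Cov}(X_2, X_2^2) = 0$, so the exact linear correlation is zero. To show what the three coefficients measure, here is a minimal plain-Python sketch for a pair of components. The function names are hypothetical, ties are assumed absent, and the quadratic-time Kendall loop is meant for small inputs only; it is not the library's implementation.

```python
import math


def pearson(x, y):
    # Linear correlation: covariance divided by the product of standard deviations
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    # Rank of each value, 1 for the smallest (assumes no ties)
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    # Spearman correlation is the Pearson correlation of the ranks
    return pearson(ranks(x), ranks(y))


def kendall_tau(x, y):
    # (concordant pairs - discordant pairs) / total number of pairs
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)


x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]
print(pearson(x, y), spearman(x, y), kendall_tau(x, y))
```

Pearson works on the values themselves, whereas Spearman and Kendall depend only on the ordering of the data, which makes them insensitive to monotone transformations of each component.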
