Sample manipulation

This example describes the main statistical functionalities available on data through the Sample object. Here the sample collects realizations of an output variable of interest.

from __future__ import print_function
import openturns as ot
ot.Log.Show(ot.Log.NONE)

A typical example

A recurring issue in uncertainty quantification is to analyze an output variable of interest Y obtained through a model f from input parameters X. Here we consider an input random vector $X=(X_1, X_2)$ whose two components are independent and follow the standard normal distribution. We therefore use an IndependentCopula to describe the dependence structure between the two marginals.

# input parameters
inputDist = ot.ComposedDistribution([ot.Normal()] * 2, ot.IndependentCopula(2))
inputDist.setDescription(['X1', 'X2'])

We create a random vector from the 2D distribution defined above:

inputVector = ot.RandomVector(inputDist)

Suppose our model f is known and reads:

$f(X) = \begin{pmatrix} x_1^2 + x_2 \\ x_1 + x_2^2 \end{pmatrix}$

We define our model f with a SymbolicFunction:

f = ot.SymbolicFunction(["x1", "x2"], ["x1^2+x2", "x2^2+x1"])

Our output vector is Y=f(X), the image of inputVector under the model:

outputVector = ot.CompositeRandomVector(f, inputVector)

We can now draw a sample of Y, that is, a set of realizations (here 1000) of the random vector outputVector:

size = 1000
sample = outputVector.getSample(size)

The sample may be seen as a matrix of size $1000 \times 2$. We print the first 5 realizations (out of 1000):

sample[:5]

	y0	y1
0	-0.5815072	0.7240122
1	3.26726	-0.5563772
2	-0.3683326	-0.08640049
3	-1.139952	1.854578
4	5.692328	-1.219674
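To make the sampling step concrete, the sketch below reproduces it in plain Python, without OpenTURNS. The names `model`, `draw_output_sample` and `plain_sample`, and the seed value, are illustrative assumptions and not part of the original example. The random generator differs from the one OpenTURNS uses, so the drawn values will not match the table above.

```python
import random


def model(x1, x2):
    """The model f from the text: f(x1, x2) = (x1^2 + x2, x1 + x2^2)."""
    return (x1 ** 2 + x2, x1 + x2 ** 2)


def draw_output_sample(size, seed=0):
    """Draw `size` realizations of Y = f(X) with X1, X2 independent N(0, 1)."""
    rng = random.Random(seed)  # arbitrary seed, only for reproducibility
    return [model(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(size)]


plain_sample = draw_output_sample(1000)

# One row per realization, one column per output component.
print(len(plain_sample), len(plain_sample[0]))
```

Each row plays the role of one line of the OpenTURNS sample: a pair (y0, y1) computed from one draw of (X1, X2).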

Basic operations on samples

We have access to basic information about a sample, such as:

minimum and maximum per component

sample.getMin(), sample.getMax()

Out:

(class=Point name=Unnamed dimension=2 values=[-2.56587,-2.84726], class=Point name=Unnamed dimension=2 values=[9.93535,12.1777])

the range per component (max-min)

sample.computeRange()

[12.5012,15.025]
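The per-component operations above work column by column. The following plain-Python sketch illustrates the idea on the first five rows printed earlier, rounded to two decimals. The helper names are hypothetical and are not OpenTURNS functions.

```python
def component_min(rows):
    """Minimum of each column."""
    return [min(column) for column in zip(*rows)]


def component_max(rows):
    """Maximum of each column."""
    return [max(column) for column in zip(*rows)]


def component_range(rows):
    """Range (max - min) of each column."""
    return [hi - lo for lo, hi in zip(component_min(rows), component_max(rows))]


# First five realizations from the printed table, rounded to two decimals.
first_rows = [(-0.58, 0.72), (3.27, -0.56), (-0.37, -0.09), (-1.14, 1.85), (5.69, -1.22)]

print(component_min(first_rows))
print(component_max(first_rows))
print(component_range(first_rows))
```

On the full sample of 1000 rows the same logic yields the wider bounds reported by getMin, getMax and computeRange.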

More elaborate functionalities are also available:

get the median per component

sample.computeMedian()

[0.680688,0.874763]

compute the covariance

sample.computeCovariance()

[[ 2.59234 -0.0758625 ]
[ -0.0758625 3.30636 ]]

get the empirical 0.95 quantile per component

sample.computeQuantilePerComponent(0.95)

[3.67518,4.13131]

get the value of the empirical CDF at a point

point = [1.1, 2.2]
sample.computeEmpiricalCDF(point)

Out:

0.518
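As a rough illustration of two of these estimators, the sketch below computes a joint empirical CDF and a covariance matrix in plain Python. It assumes the empirical CDF counts rows whose every component is less than or equal to the corresponding component of the point, and that the covariance uses the unbiased n-1 normalization. Both conventions are assumptions here and should be checked against the OpenTURNS documentation. The helper names and the toy data are hypothetical.

```python
def empirical_cdf(rows, point):
    """Fraction of rows lying component-wise at or below `point`."""
    hits = sum(1 for row in rows if all(v <= p for v, p in zip(row, point)))
    return hits / len(rows)


def covariance_matrix(rows):
    """Sample covariance matrix with the n - 1 (unbiased) normalization."""
    n = len(rows)
    dim = len(rows[0])
    means = [sum(column) / n for column in zip(*rows)]
    return [
        [
            sum((row[i] - means[i]) * (row[j] - means[j]) for row in rows) / (n - 1)
            for j in range(dim)
        ]
        for i in range(dim)
    ]


# Toy data, chosen only so the results are easy to check by hand.
toy_rows = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]

print(empirical_cdf(toy_rows, (1.5, 2.5)))  # two of the four rows qualify
print(covariance_matrix(toy_rows))
```

The value 0.518 obtained above therefore means that about 52% of the 1000 realizations have y0 at most 1.1 and y1 at most 2.2 simultaneously.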

Estimate the statistical moments

We often need to estimate the first moments of the output data. They can be estimated directly from the output sample:

estimate the moment of order 1: the mean

sample.computeMean()

[0.903872,1.15217]

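estimate the Cholesky factor of the covariance matrix (in the OpenTURNS version used here, computeStandardDeviation returns this lower-triangular matrix rather than per-component standard deviations)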

sample.computeStandardDeviation()

[[ 1.61007 0 ]
[ -0.0471174 1.81773 ]]

estimate the standard deviation for each component

sample.computeStandardDeviationPerComponent()

[1.61007,1.81834]

estimate the centered moment of order 2: the variance

sample.computeVariance()

[2.59234,3.30636]

estimate the standardized moment of order 3: the skewness

sample.computeSkewness()

[1.28241,1.80582]

estimate the standardized moment of order 4: the kurtosis

sample.computeKurtosis()

[6.40216,9.59074]
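The two standard deviation outputs above differ slightly in the second component (1.81773 versus 1.81834). The sketch below, which is a hand-rolled 2x2 Cholesky factorization and not the OpenTURNS implementation, shows where each number comes from, starting from the covariance matrix printed earlier. The function name `cholesky_2x2` is hypothetical.

```python
import math


def cholesky_2x2(cov):
    """Lower-triangular L such that L * L^T equals the 2x2 matrix `cov`."""
    (a, b), (_, c) = cov
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(c - l21 ** 2)
    return [[l11, 0.0], [l21, l22]]


# Covariance matrix as printed by sample.computeCovariance() above.
cov = [[2.59234, -0.0758625], [-0.0758625, 3.30636]]

factor = cholesky_2x2(cov)
per_component = [math.sqrt(cov[i][i]) for i in range(2)]

print(factor)         # compare with the computeStandardDeviation() output
print(per_component)  # compare with computeStandardDeviationPerComponent()
```

The per-component values are the square roots of the diagonal of the covariance matrix, whereas the second diagonal entry of the Cholesky factor is reduced by the contribution of the off-diagonal term.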

Test the correlation

Several sample estimates of the correlation between components are available:

get the sample Pearson correlation matrix:

sample.computePearsonCorrelation()

[[ 1 -0.0259123 ]
[ -0.0259123 1 ]]

get the sample Kendall correlation matrix:

sample.computeKendallTau()

[[ 1 0.0183584 ]
[ 0.0183584 1 ]]

get the sample Spearman correlation matrix:

sample.computeSpearmanCorrelation()

[[ 1 0.0200394 ]
[ 0.0200394 1 ]]
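For intuition about the three coefficients, here is a plain-Python sketch of their textbook definitions. It assumes data without ties and uses the simple tau-a form of Kendall's coefficient. OpenTURNS may handle ties and normalization differently, and all helper names are hypothetical. The last lines recover the Pearson coefficient printed above from the covariance matrix printed earlier.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Linear correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks from 1 to n; assumes all values are distinct."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Spearman coefficient: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


def kendall_tau(xs, ys):
    """Kendall tau-a: (concordant - discordant pairs) / total pairs, no ties."""
    score = 0
    for i, j in combinations(range(len(xs)), 2):
        product = (xs[i] - xs[j]) * (ys[i] - ys[j])
        score += (product > 0) - (product < 0)
    n = len(xs)
    return score / (n * (n - 1) / 2)


# Toy data chosen so the coefficients are easy to verify by hand.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 2.0, 4.0]
print(pearson(xs, ys), spearman(xs, ys), kendall_tau(xs, ys))

# Pearson coefficient recovered from the covariance matrix printed earlier.
cov = [[2.59234, -0.0758625], [-0.0758625, 3.30636]]
pearson_from_cov = cov[0][1] / math.sqrt(cov[0][0] * cov[1][1])
print(pearson_from_cov)
```

All three coefficients are close to zero on the example sample, which indicates no strong linear or monotonic relationship between y0 and y1. This does not by itself establish that the two components are independent.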
