# Sample manipulation

This example describes the main statistical functionalities available on data through the Sample object. Here the Sample holds realizations of an output variable of interest.

```python
from __future__ import print_function
import openturns as ot
ot.Log.Show(ot.Log.NONE)
```


## A typical example

A recurring task in uncertainty quantification is to analyze an output variable of interest Y obtained through a model f and input parameters X. Here we take the input parameters to be two independent standard normal variables $X = (X_1, X_2)$. We therefore use an IndependentCopula to describe the dependence between the two marginals.

```python
# input parameters
inputDist = ot.ComposedDistribution([ot.Normal()] * 2, ot.IndependentCopula(2))
inputDist.setDescription(['X1', 'X2'])
```
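As a short worked equation, the independent copula means the joint distribution of the inputs factorizes into the product of its marginals. Writing $\Phi$ for the standard normal CDF and $F_X$ for the joint CDF of the two inputs (both symbols are notation introduced here, not taken from the example):

$$
C(u_1, u_2) = u_1 u_2
\quad\Longrightarrow\quad
F_X(x_1, x_2) = C\bigl(\Phi(x_1), \Phi(x_2)\bigr) = \Phi(x_1)\,\Phi(x_2).
$$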


We create a random vector from the 2D distribution defined above:

```python
inputVector = ot.RandomVector(inputDist)
```


Suppose our model f is known and reads $f(X) = (x_1^2 + x_2,\; x_2^2 + x_1)$. We define it with a SymbolicFunction:

```python
f = ot.SymbolicFunction(["x1", "x2"], ["x1^2+x2", "x2^2+x1"])
```


Our output vector is Y = f(X), the image of inputVector by the model:

```python
outputVector = ot.CompositeRandomVector(f, inputVector)
```


We can now draw a sample of Y, that is, realizations (here 1000) of the random outputVector:

```python
size = 1000
sample = outputVector.getSample(size)
```


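The sample may be seen as a matrix of size 1000 × 2, with one row per realization and one column per output component. We print the first 5 realizations (out of 1000):

```python
sample[:5]
```

| y0 | y1 |
|---|---|
| -0.5815072 | 0.7240122 |
| 3.26726 | -0.5563772 |
| -0.3683326 | -0.08640049 |
| -1.139952 | 1.854578 |
| 5.692328 | -1.219674 |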
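To make the sampling step concrete, here is a minimal plain-Python sketch of what drawing realizations of Y = f(X) amounts to. It uses only the standard library `random` module rather than OpenTURNS, so it illustrates the idea and is not a description of the library's internals. The names `model` and `draw_output_sample` and the seed value are illustrative assumptions.

```python
import random

def model(x1, x2):
    """Same two-output model as the SymbolicFunction defined earlier."""
    return (x1**2 + x2, x2**2 + x1)

def draw_output_sample(size, seed=0):
    """Draw `size` realizations of Y = f(X) with X1, X2 independent N(0, 1)."""
    rng = random.Random(seed)  # arbitrary seed, for reproducibility only
    rows = []
    for _ in range(size):
        x1 = rng.gauss(0.0, 1.0)
        x2 = rng.gauss(0.0, 1.0)
        rows.append(model(x1, x2))
    return rows

output_rows = draw_output_sample(1000)
# Number of rows (realizations) and number of columns (output components).
print(len(output_rows), len(output_rows[0]))  # 1000 2
```

Each row of `output_rows` plays the role of one line of the table above. The values differ from those printed on this page because the random generator is different.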

## Basic operations on samples

• minimum and maximum per component

```python
sample.getMin(), sample.getMax()
```


Out:

(class=Point name=Unnamed dimension=2 values=[-2.56587,-2.84726], class=Point name=Unnamed dimension=2 values=[9.93535,12.1777])

• the range per component (max-min)

```python
sample.computeRange()
```


[12.5012,15.025]
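The range is simply the per-component maximum minus the per-component minimum. The printed values agree: 9.93535 - (-2.56587) ≈ 12.5012 and 12.1777 - (-2.84726) ≈ 15.025. The following plain-Python sketch shows the same computation on a small hand-made sample; the function name and data are illustrative assumptions, not part of OpenTURNS.

```python
def component_min_max_range(rows):
    """Return per-component minimum, maximum and range (max - min)."""
    columns = list(zip(*rows))  # transpose: one tuple per component
    mins = [min(col) for col in columns]
    maxs = [max(col) for col in columns]
    ranges = [hi - lo for lo, hi in zip(mins, maxs)]
    return mins, maxs, ranges

# Small hand-made sample with two components (illustrative values).
rows = [(1.0, 5.0), (-2.0, 7.5), (4.0, 6.0)]
mins, maxs, ranges = component_min_max_range(rows)
print(mins, maxs, ranges)  # [-2.0, 5.0] [4.0, 7.5] [6.0, 2.5]
```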

More elaborate functionalities are also available:

• get the median per component

```python
sample.computeMedian()
```


[0.680688,0.874763]

• compute the covariance matrix

```python
sample.computeCovariance()
```


[[ 2.59234 -0.0758625 ]
[ -0.0758625 3.30636 ]]
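For readers who want to see what a sample covariance matrix contains, here is a minimal plain-Python sketch. It assumes the usual unbiased estimator with normalization by n - 1; the function name and data are illustrative, and the sketch is not OpenTURNS's implementation.

```python
def sample_covariance(rows):
    """Unbiased sample covariance matrix (normalization by n - 1)."""
    n = len(rows)
    dim = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for row in rows:
        centered = [row[j] - means[j] for j in range(dim)]
        for a in range(dim):
            for b in range(dim):
                cov[a][b] += centered[a] * centered[b] / (n - 1)
    return cov

# Three hand-made realizations with two components (illustrative values).
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 9.0)]
cov = sample_covariance(rows)
print(cov)  # diagonal: per-component variances; off-diagonal: covariances
```

The diagonal entries are the per-component variances, which is why the diagonal of the matrix printed above reappears in the variance output later on this page.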

• get the empirical 0.95 quantile per component

```python
sample.computeQuantilePerComponent(0.95)
```


[3.67518,4.13131]
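An empirical quantile can be defined in several ways. The sketch below uses one common convention, the inverse of the empirical CDF, applied to a single component. This convention is an assumption made for illustration; OpenTURNS may use a different rule (for example an interpolation), so its values can differ slightly.

```python
import math

def empirical_quantile(values, prob):
    """Inverse empirical CDF: smallest order statistic x_(k) with k/n >= prob."""
    ordered = sorted(values)
    n = len(ordered)
    k = max(1, math.ceil(n * prob))  # 1-based rank of the order statistic
    return ordered[k - 1]

# One component of a hand-made sample (illustrative values).
values = [5.0, 1.0, 4.0, 2.0, 3.0]
print(empirical_quantile(values, 0.5))   # 3.0
print(empirical_quantile(values, 0.95))  # 5.0
```

A per-component quantile is obtained by applying the same function to each column of the sample in turn.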

• get the value of the empirical CDF at a point

```python
point = [1.1, 2.2]
sample.computeEmpiricalCDF(point)
```


Out:

0.518
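In the multivariate case, the empirical CDF at a point is usually taken to be the proportion of realizations that lie at or below the point in every component at once. The sketch below implements that definition in plain Python, under the assumption that this is the convention in use; the function name and data are illustrative.

```python
def empirical_cdf(rows, point):
    """Fraction of rows whose every component is <= the matching component of `point`."""
    count = sum(
        1
        for row in rows
        if all(value <= bound for value, bound in zip(row, point))
    )
    return count / len(rows)

# Four hand-made realizations with two components (illustrative values).
rows = [(0.0, 0.0), (1.0, 3.0), (2.0, 1.0), (0.5, 2.0)]
print(empirical_cdf(rows, [1.1, 2.2]))  # 0.5
```

Read this way, the value 0.518 printed above means that 518 of the 1000 realizations satisfy both y0 ≤ 1.1 and y1 ≤ 2.2.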


## Estimate the statistical moments

We often need the first moments of the output data. They can be estimated from the output sample:

• estimate the moment of order 1: the mean

```python
sample.computeMean()
```


[0.903872,1.15217]

• estimate the standard deviation matrix (returns the Cholesky factor of the covariance matrix)

```python
sample.computeStandardDeviation()
```


[[ 1.61007 0 ]
[ -0.0471174 1.81773 ]]
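The link between this lower-triangular matrix and the covariance matrix can be checked by hand. The sketch below computes the Cholesky factor of a 2×2 matrix in plain Python and applies it to the covariance values printed earlier on this page. The function name is an illustrative assumption, and the closed-form 2×2 formula is used here only for brevity.

```python
import math

def cholesky_2x2(cov):
    """Lower-triangular L with L * L^T == cov, for a 2x2 symmetric positive-definite matrix."""
    l11 = math.sqrt(cov[0][0])
    l21 = cov[1][0] / l11
    l22 = math.sqrt(cov[1][1] - l21**2)
    return [[l11, 0.0], [l21, l22]]

# Covariance matrix as printed on this page (rounded to 6 significant digits).
cov = [[2.59234, -0.0758625], [-0.0758625, 3.30636]]
factor = cholesky_2x2(cov)
print(factor)  # close to [[1.61007, 0], [-0.0471174, 1.81773]] up to rounding
```

Note that the lower-right entry of the factor (about 1.81773) is not the standard deviation of the second component; that value is the square root of the second diagonal entry of the covariance, about 1.81834.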

• estimate the standard deviation for each component

```python
sample.computeStandardDeviationPerComponent()
```


[1.61007,1.81834]

• estimate the centered moment of order 2: the variance

```python
sample.computeVariance()
```


[2.59234,3.30636]

• estimate the standardized moment of order 3: the skewness

```python
sample.computeSkewness()
```


[1.28241,1.80582]

• estimate the standardized moment of order 4: the kurtosis

```python
sample.computeKurtosis()
```


[6.40216,9.59074]
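Skewness and kurtosis are standardized central moments: the central moment of order 3 or 4 divided by the matching power of the standard deviation. The sketch below uses the plain (biased) estimator for one component, which is an assumption made for simplicity. OpenTURNS may apply bias corrections, so its values can differ, especially on small samples. The kurtosis here is not the excess kurtosis, so a normal distribution has kurtosis 3.

```python
def standardized_moment(values, order):
    """Plain (biased) standardized central moment: m_k / m_2**(k/2)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    mk = sum((v - mean) ** order for v in values) / n
    return mk / m2 ** (order / 2)

# Symmetric hand-made data (illustrative values).
values = [-1.0, 0.0, 1.0]
skewness = standardized_moment(values, 3)
kurtosis = standardized_moment(values, 4)
print(skewness, kurtosis)  # approximately 0.0 and 1.5
```

The positive skewness values printed above therefore indicate that both output components have a longer right tail, and kurtosis values well above 3 indicate tails heavier than those of a normal distribution.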

## Test the correlation

Several sample correlation estimates are available:

• get the sample Pearson correlation matrix:

```python
sample.computePearsonCorrelation()
```


[[ 1 -0.0259123 ]
[ -0.0259123 1 ]]

• get the sample Kendall correlation matrix:

```python
sample.computeKendallTau()
```


[[ 1 0.0183584 ]
[ 0.0183584 1 ]]

• get the sample Spearman correlation matrix:

```python
sample.computeSpearmanCorrelation()
```


[[ 1 0.0200394 ]
[ 0.0200394 1 ]]
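The three coefficients measure different things: Pearson captures linear dependence, while Spearman and Kendall are rank-based and capture monotone dependence. The plain-Python sketch below computes each for a pair of components, assuming no tied values to keep it short. The function names and data are illustrative, and this is not how OpenTURNS computes them internally.

```python
import math

def pearson(x, y):
    """Pearson linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks 1..n of the values; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def kendall_tau(x, y):
    """Kendall tau without ties: (concordant - discordant) / number of pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            if product > 0:
                score += 1  # concordant pair
            elif product < 0:
                score -= 1  # discordant pair
    return score / (n * (n - 1) / 2)

x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 4.0, 9.0, 16.0]  # monotone but nonlinear in x
print(pearson(x, y), spearman(x, y), kendall_tau(x, y))
```

On this toy data the rank-based coefficients equal 1 because y increases with x, while the Pearson coefficient is slightly below 1 because the relation is not linear. In the matrices printed above, all three off-diagonal values are close to 0.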
