Sample manipulation
This example describes the main statistical functionalities available on data through the Sample object. Here the Sample holds realizations of an output variable of interest.
from __future__ import print_function
import openturns as ot
ot.Log.Show(ot.Log.NONE)
A typical example
A recurring task in uncertainty quantification is to analyze an output variable of interest Y obtained through a model f and input parameters X. Here we consider two input parameters X = (X1, X2) that follow independent standard normal distributions. We therefore use an IndependentCopula to describe the dependence structure between the two marginals.
# input parameters
inputDist = ot.ComposedDistribution([ot.Normal()] * 2, ot.IndependentCopula(2))
inputDist.setDescription(['X1', 'X2'])
We create a random vector from the 2D distribution defined above:
inputVector = ot.RandomVector(inputDist)
Suppose our model f is known and reads f(x1, x2) = (x1^2 + x2, x2^2 + x1).
We define this model with a SymbolicFunction:
f = ot.SymbolicFunction(["x1", "x2"], ["x1^2+x2", "x2^2+x1"])
Our output vector is Y = f(X), the image of inputVector by the model:
outputVector = ot.CompositeRandomVector(f, inputVector)
We can now draw a sample of Y, that is, a set of realizations (here 1000) of the random vector outputVector:
size = 1000
sample = outputVector.getSample(size)
The sample may be seen as a matrix of size 1000 × 2, with one row per realization and one column per output component. We print the first 5 realizations (out of 1000):
sample[:5]
index | y0 | y1 |
---|---|---|
0 | -0.5815072 | 0.7240122 |
1 | 3.26726 | -0.5563772 |
2 | -0.3683326 | -0.08640049 |
3 | -1.139952 | 1.854578 |
4 | 5.692328 | -1.219674 |
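To make the sampling step concrete, here is a minimal plain-Python sketch of what drawing realizations of Y = f(X) amounts to. It uses only the standard library instead of OpenTURNS, and the helper names `model` and `draw_output_sample` are illustrative, not part of any API. The random stream differs from the one used by OpenTURNS, so the values will not match the table above.

```python
import random

def model(x1, x2):
    """Same expressions as the SymbolicFunction: (x1^2 + x2, x2^2 + x1)."""
    return (x1 ** 2 + x2, x2 ** 2 + x1)

def draw_output_sample(size, seed=0):
    """Draw `size` realizations of Y = f(X) with X1, X2 independent N(0, 1)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    rows = []
    for _ in range(size):
        x1 = rng.gauss(0.0, 1.0)
        x2 = rng.gauss(0.0, 1.0)
        rows.append(model(x1, x2))
    return rows

rows = draw_output_sample(1000)
print(len(rows), len(rows[0]))  # number of realizations, output dimension
print(rows[:5])                 # first five realizations
```

Each row plays the role of one line of the Sample, and each tuple position corresponds to one output component.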
Basic operations on samples
We have access to basic information about a sample, such as:
the minimum and maximum per component
sample.getMin(), sample.getMax()
Out:
(class=Point name=Unnamed dimension=2 values=[-2.56587,-2.84726], class=Point name=Unnamed dimension=2 values=[9.93535,12.1777])
the range per component (max-min)
sample.computeRange()
[12.5012,15.025]
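As a rough illustration of what "per component" means here, the sketch below computes the minimum, maximum and range column by column in plain Python. The three rows are rounded copies of the first realizations printed earlier and serve only as toy data, so the results differ from the full-sample outputs above.

```python
# Toy 2D sample: rounded copies of the first three realizations printed earlier
rows = [(-0.58, 0.72), (3.27, -0.56), (-0.37, -0.09)]

# Transpose so that each entry of `columns` holds one component
columns = list(zip(*rows))

mins = [min(c) for c in columns]
maxs = [max(c) for c in columns]
ranges = [hi - lo for lo, hi in zip(mins, maxs)]  # max - min per component

print(mins, maxs, ranges)
```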
More elaborate functionalities are also available:
get the median per component
sample.computeMedian()
[0.680688,0.874763]
compute the covariance matrix
sample.computeCovariance()
[[ 2.59234 -0.0758625 ]
[ -0.0758625 3.30636 ]]
get the empirical 0.95 quantile per component
sample.computeQuantilePerComponent(0.95)
[3.67518,4.13131]
get the value of the empirical CDF at a point
point = [1.1, 2.2]
sample.computeEmpiricalCDF(point)
Out:
0.518
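The empirical CDF of a multivariate sample can be read as the proportion of realizations that lie below the point in every component at once. The following sketch assumes that usual joint definition; `empirical_cdf` is an illustrative helper and the data are hand-made, not the tutorial's sample.

```python
def empirical_cdf(rows, point):
    """Fraction of rows whose every component is <= the matching component of `point`."""
    hits = sum(1 for row in rows if all(v <= p for v, p in zip(row, point)))
    return hits / len(rows)

# Tiny hand-made 2D sample (illustrative values only)
rows = [(0.0, 0.0), (1.0, 3.0), (2.0, 1.0), (0.5, 2.0)]
value = empirical_cdf(rows, (1.1, 2.2))
print(value)  # 2 of the 4 rows satisfy both inequalities -> 0.5
```

Under this reading, the value 0.518 above means that 518 of the 1000 realizations satisfy both y0 ≤ 1.1 and y1 ≤ 2.2.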
Estimate the statistical moments
We often need the first few moments of the output data. They can be estimated directly from the output sample:
estimate the moment of order 1 : mean
sample.computeMean()
[0.903872,1.15217]
estimate the standard deviation (here the call returns the Cholesky factor of the covariance matrix)
sample.computeStandardDeviation()
[[ 1.61007 0 ]
[ -0.0471174 1.81773 ]]
estimate the standard deviation for each component
sample.computeStandardDeviationPerComponent()
[1.61007,1.81834]
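The link between the two outputs above can be checked by hand from the covariance matrix printed earlier. The sketch below, in plain Python, writes out the 2×2 Cholesky factorization explicitly and compares it with the square roots of the diagonal; it is a worked check on the printed numbers, not the OpenTURNS implementation.

```python
import math

# Covariance matrix printed by sample.computeCovariance() in this example
cov = [[2.59234, -0.0758625],
       [-0.0758625, 3.30636]]

# Lower-triangular Cholesky factor L with L * L^T = cov, written out for the 2x2 case
l00 = math.sqrt(cov[0][0])
l10 = cov[1][0] / l00
l11 = math.sqrt(cov[1][1] - l10 ** 2)
print([[l00, 0.0], [l10, l11]])

# Per-component standard deviations: square roots of the diagonal of cov
std = [math.sqrt(cov[0][0]), math.sqrt(cov[1][1])]
print(std)

# The Euclidean norm of each row of L recovers the per-component value
print(math.hypot(l10, l11))
```

Up to rounding, the factor matches the matrix returned by computeStandardDeviation() and `std` matches computeStandardDeviationPerComponent(). The second diagonal entry of the factor (1.81773) is slightly smaller than the second standard deviation (1.81834) because part of that variance is carried by the off-diagonal term.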
estimate the centered moment of order 2: variance
sample.computeVariance()
[2.59234,3.30636]
estimate the standardized moment of order 3: skewness
sample.computeSkewness()
[1.28241,1.80582]
estimate the standardized moment of order 4: kurtosis
sample.computeKurtosis()
[6.40216,9.59074]
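For reference, the four quantities above can be sketched for a single component in plain Python. This version uses the (n - 1) normalization for the variance and simple moment ratios for skewness and kurtosis; OpenTURNS may apply additional bias corrections, so small differences on finite samples are to be expected. The helper name `moments` is illustrative.

```python
def moments(values):
    """Mean, unbiased variance, and simple moment-based skewness and kurtosis."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    m2 = sum(c ** 2 for c in centered) / n
    m3 = sum(c ** 3 for c in centered) / n
    m4 = sum(c ** 4 for c in centered) / n
    variance = m2 * n / (n - 1)  # unbiased (n - 1) normalization
    skewness = m3 / m2 ** 1.5    # third standardized moment
    kurtosis = m4 / m2 ** 2      # fourth standardized moment (3 for a Gaussian)
    return mean, variance, skewness, kurtosis

# Symmetric toy data: the skewness is zero by construction
mean, variance, skewness, kurtosis = moments([1.0, 2.0, 3.0, 4.0, 5.0])
print(mean, variance, skewness, kurtosis)
```

With this convention a Gaussian variable has kurtosis 3, so the values 6.4 and 9.6 reported above point to tails noticeably heavier than Gaussian, and the positive skewness reflects the squared terms in the model.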
Test the correlation
Several estimators of the correlation between components are available:
get the sample Pearson correlation matrix:
sample.computePearsonCorrelation()
[[ 1 -0.0259123 ]
[ -0.0259123 1 ]]
get the sample Kendall correlation matrix:
sample.computeKendallTau()
[[ 1 0.0183584 ]
[ 0.0183584 1 ]]
get the sample Spearman correlation matrix:
sample.computeSpearmanCorrelation()
[[ 1 0.0200394 ]
[ 0.0200394 1 ]]
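The three coefficients measure different things: Pearson captures linear association, while Spearman and Kendall depend only on ranks. The plain-Python sketch below illustrates the textbook definitions for two components, assuming no ties in the data; the function names are illustrative and this is not how OpenTURNS computes them internally.

```python
import math

def pearson(xs, ys):
    """Linear correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """0-based rank of each value, assuming no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result

def spearman(xs, ys):
    """Pearson correlation applied to the ranks."""
    return pearson(ranks(xs), ranks(ys))

def kendall_tau(xs, ys):
    """(concordant - discordant) / number of pairs, assuming no ties."""
    n = len(xs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (xs[i] - xs[j]) * (ys[i] - ys[j])
            score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 8.0, 27.0, 64.0, 125.0]  # monotone but nonlinear in xs
print(pearson(xs, ys), spearman(xs, ys), kendall_tau(xs, ys))
```

On this toy data the rank-based coefficients equal 1 because the relation is monotone, while the Pearson coefficient stays below 1 because it is not linear. In the tutorial's sample all three are close to zero.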