` to download the full example code
.. rst-class:: sphx-glr-example-title
.. _sphx_glr_auto_data_analysis_manage_data_and_samples_plot_sample_manipulation.py:
Sample manipulation
===================
This example will describe the main statistical functionalities on data through the Sample object. The Sample is an output variable of interest.
.. code-block:: default
from __future__ import print_function
import openturns as ot
ot.Log.Show(ot.Log.NONE)
A typical example
-----------------
A recurring issue in uncertainty quantification is to perform analysis on an output variable of interest Y obtained through a model `f` and input parameters `X`.
Here we shall consider the input parameters as two independent standard normal distributions :math:`X=(X_1, X_2)`. We therefore use an `IndependentCopula` to describe the link between the two marginals.
.. code-block:: default
# input parameters
inputDist = ot.ComposedDistribution([ot.Normal()] * 2, ot.IndependentCopula(2))
inputDist.setDescription(['X1', 'X2'])
We create a vector from the 2D-distribution created before :
.. code-block:: default
inputVector = ot.RandomVector(inputDist)
Suppose our model `f` is known and reads as :
.. math::
f(X) = \begin{pmatrix}
x_1^2 + x_2 \\
x_1 + x_2^2
\end{pmatrix}
We define our model `f` with a `SymbolicFunction`
.. code-block:: default
f = ot.SymbolicFunction(["x1", "x2"], ["x1^2+x2", "x2^2+x1"])
Our output vector is Y=f(X), the image of the inputVector by the model
.. code-block:: default
outputVector = ot.CompositeRandomVector(f, inputVector)
We can now get a sample out of Y, that is realizations (here 1000) of the random outputVector
.. code-block:: default
size = 1000
sample = outputVector.getSample(size)
The `sample` may be seen as a matrix of size :math:`1000 \times 2`. We print the 5 first samples (out of 1000) :
.. code-block:: default
sample[:5]
.. raw:: html
| y0 | y1 |
0 | -0.5815072 | 0.7240122 |
1 | 3.26726 | -0.5563772 |
2 | -0.3683326 | -0.08640049 |
3 | -1.139952 | 1.854578 |
4 | 5.692328 | -1.219674 |
Basic operations on samples
---------------------------
We have access to basic information about a sample such as
- minimum and maximum per component
.. code-block:: default
sample.getMin(), sample.getMax()
.. rst-class:: sphx-glr-script-out
Out:
.. code-block:: none
(class=Point name=Unnamed dimension=2 values=[-2.56587,-2.84726], class=Point name=Unnamed dimension=2 values=[9.93535,12.1777])
- the range per component (max-min)
.. code-block:: default
sample.computeRange()
.. raw:: html
[12.5012,15.025]
More elaborate functionalities are also available :
- get the median per component
.. code-block:: default
sample.computeMedian()
.. raw:: html
[0.680688,0.874763]
- compute the covariance
.. code-block:: default
sample.computeCovariance()
.. raw:: html
[[ 2.59234 -0.0758625 ]
[ -0.0758625 3.30636 ]]
- get the empirical 0.95 quantile per component
.. code-block:: default
sample.computeQuantilePerComponent(0.95)
.. raw:: html
[3.67518,4.13131]
- get the value of the empirical CDF at a point
.. code-block:: default
point = [1.1, 2.2]
sample.computeEmpiricalCDF(point)
.. rst-class:: sphx-glr-script-out
Out:
.. code-block:: none
0.518
Estimate the statistical moments
--------------------------------
Oftentimes, we need to estimate the first moments of the output data. We can then estimate statistical moments from the output sample :
- estimate the moment of order 1 : mean
.. code-block:: default
sample.computeMean()
.. raw:: html
[0.903872,1.15217]
- estimate the standard deviation (returns the Cholesky factor)
.. code-block:: default
sample.computeStandardDeviation()
.. raw:: html
[[ 1.61007 0 ]
[ -0.0471174 1.81773 ]]
- estimate the standard deviation for each component
.. code-block:: default
sample.computeStandardDeviationPerComponent()
.. raw:: html
[1.61007,1.81834]
- estimate the moment of order 2 : variance
.. code-block:: default
sample.computeVariance()
.. raw:: html
[2.59234,3.30636]
- estimate the moment of order 3 : skewness
.. code-block:: default
sample.computeSkewness()
.. raw:: html
[1.28241,1.80582]
- estimate the moment of order 4 : kurtosis
.. code-block:: default
sample.computeKurtosis()
.. raw:: html
[6.40216,9.59074]
Test the correlation
--------------------
Some statistical test for correlation are available :
- get the sample Pearson correlation matrix :
.. code-block:: default
sample.computePearsonCorrelation()
.. raw:: html
[[ 1 -0.0259123 ]
[ -0.0259123 1 ]]
- get the sample Kendall correlation matrix :
.. code-block:: default
sample.computeKendallTau()
.. raw:: html
[[ 1 0.0183584 ]
[ 0.0183584 1 ]]
- get the sample Spearman correlation matrix :
.. code-block:: default
sample.computeSpearmanCorrelation()
.. raw:: html
[[ 1 0.0200394 ]
[ 0.0200394 1 ]]
.. rst-class:: sphx-glr-timing
**Total running time of the script:** ( 0 minutes 0.008 seconds)
.. _sphx_glr_download_auto_data_analysis_manage_data_and_samples_plot_sample_manipulation.py:
.. only :: html
.. container:: sphx-glr-footer
:class: sphx-glr-footer-example
.. container:: sphx-glr-download sphx-glr-download-python
:download:`Download Python source code: plot_sample_manipulation.py `
.. container:: sphx-glr-download sphx-glr-download-jupyter
:download:`Download Jupyter notebook: plot_sample_manipulation.ipynb `
.. only:: html
.. rst-class:: sphx-glr-signature
`Gallery generated by Sphinx-Gallery `_