Estimating moments with Monte Carlo

Let us denote \vect{Y} = h\left( \vect{X},\vect{d} \right) = \left( Y^1,\ldots,Y^{n_Y} \right), where \vect{X}= \left( X^1,\ldots,X^{n_X} \right) is a random vector and \vect{d} a deterministic vector. We seek here to evaluate the characteristics of the central part (central tendency and spread, i.e. mean and variance) of the probability distribution of a variable Y^i, using the probability distribution of the random vector \vect{X}.

The Monte Carlo method is a numerical integration method using sampling, which can be used, for example, to determine the mean and standard deviation of a random variable Y^i (if these quantities exist, which is not the case for all probability distributions):

\begin{aligned}
    m_{Y^i} = \int u \, f_{Y^i}(u) \, du,\ \sigma_{Y^i} = \sqrt{\int \left( u-m_{Y^i} \right)^2 \, f_{Y^i}(u) \, du}
  \end{aligned}

where f_{Y^i} represents the probability density function of Y^i.

Suppose now that we have a sample \left\{ y^i_1,\ldots,y^i_N \right\} of N values randomly and independently drawn from the probability distribution f_{Y^i}; this sample can be obtained by drawing a sample \left\{ \vect{x}_1,\ldots,\vect{x}_N \right\} of size N from the random vector \vect{X} (whose distribution is known) and by computing \vect{y}_j = h \left( \vect{x}_j,\vect{d} \right) \ \forall 1 \leq j \leq N. The Monte Carlo estimates of the mean and standard deviation are then the empirical mean and standard deviation of this sample:

\begin{aligned}
    \widehat{m}_{Y^i} = \frac{1}{N} \sum_{j=1}^N y^i_j,\ \widehat{\sigma}_{Y^i} = \sqrt{\frac{1}{N} \sum_{j=1}^N \left( y^i_j - \widehat{m}_{Y^i} \right)^2}
  \end{aligned}
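The estimators above can be sketched in a few lines of NumPy. This is an illustrative example only, not taken from the source: the model h and the deterministic vector d below are hypothetical placeholders.

```python
# Hedged sketch: crude Monte Carlo estimation of the mean and standard
# deviation of Y = h(X, d), for a hypothetical scalar model h.
import numpy as np

rng = np.random.default_rng(0)

def h(x, d):
    # Hypothetical model chosen for illustration: Y = d0 * X1 + X2**2.
    return d[0] * x[..., 0] + x[..., 1] ** 2

d = np.array([2.0, 0.5])   # deterministic vector (illustrative values)
N = 10_000                 # sample size

# Draw N independent realisations of X = (X1, X2), here standard normal.
x = rng.standard_normal((N, 2))
y = h(x, d)                # y_j = h(x_j, d) for j = 1, ..., N

# Empirical mean and empirical standard deviation (the 1/N formulas above).
m_hat = y.mean()
sigma_hat = np.sqrt(((y - m_hat) ** 2).mean())

print(f"mean estimate: {m_hat:.4f}")
print(f"std estimate:  {sigma_hat:.4f}")
```

For this particular h and standard normal inputs, the true mean is 1 and the true standard deviation is \sqrt{6} \approx 2.45, so the printed estimates should land close to those values.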

These are only estimates, but by the law of large numbers their convergence to the true values m_{Y^i} and \sigma_{Y^i} is assured as the sample size N tends to infinity. The Central Limit Theorem makes it possible to control the difference between the estimated value and the sought value by means of a confidence interval (provided N is sufficiently large, typically more than a few dozen, even though there is no way to be sure that the asymptotic regime has been reached). For a probability \alpha strictly between 0 and 1 chosen by the user, one can, for example, assert with confidence \alpha that the true value of m_{Y^i} lies between \widehat{m}_{i,\inf} and \widehat{m}_{i,\sup}, computed analytically from simple formulae. To illustrate, for \alpha = 0.95:

\begin{aligned}
    \widehat{m}_{i,\inf} = \widehat{m}_{Y^i} - 1.96 \frac{\displaystyle \widehat{\sigma}_{Y^i}}{\displaystyle \sqrt{N}},\ \widehat{m}_{i,\sup} = \widehat{m}_{Y^i} + 1.96 \frac{\widehat{\sigma}_{Y^i}}{\sqrt{N}},\ \textrm{that is to say}\ \textrm{Pr} \left(  \widehat{m}_{i,\inf} \leq m_{Y^i} \leq \widehat{m}_{i,\sup} \right) = 0.95
  \end{aligned}

The length of the confidence interval, which measures the uncertainty of this mean estimate, decreases as N increases, but only at rate 1/\sqrt{N}: multiplying N by 100 reduces the length of the confidence interval \left| \widehat{m}_{i,\sup}-\widehat{m}_{i,\inf} \right| by a factor 10.
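The 95% confidence interval formula and its 1/\sqrt{N} shrinkage can be checked numerically. The sketch below is illustrative and not from the source; the distribution of Y (normal with mean 3 and standard deviation 2) is an arbitrary choice.

```python
# Hedged sketch: 95% confidence interval for the Monte Carlo mean estimate,
# and the factor-10 shrinkage of its length when N is multiplied by 100.
import numpy as np

rng = np.random.default_rng(42)

def ci_95(y):
    # (lower, upper) bounds for the mean, per the 1.96-sigma formula above.
    n = len(y)
    m_hat = y.mean()
    sigma_hat = y.std()  # empirical standard deviation (1/N version)
    half = 1.96 * sigma_hat / np.sqrt(n)
    return m_hat - half, m_hat + half

# Two samples of an illustrative Y with true mean 3 and true std 2.
y_small = rng.normal(3.0, 2.0, size=1_000)
y_large = rng.normal(3.0, 2.0, size=100_000)  # N multiplied by 100

lo_s, up_s = ci_95(y_small)
lo_l, up_l = ci_95(y_large)

# The interval length scales like 1/sqrt(N): the ratio is close to 10.
ratio = (up_s - lo_s) / (up_l - lo_l)
print(f"length at N=1e3: {up_s - lo_s:.4f}")
print(f"length at N=1e5: {up_l - lo_l:.4f}")
print(f"length ratio:    {ratio:.1f}")
```

Note that `y.std()` defaults to the 1/N normalization, matching the empirical standard deviation defined above.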

This method is also referred to as direct sampling, the crude Monte Carlo method, or classical Monte Carlo integration.

(Figure: ../../_images/monte_carlo_moments-1.png)