Cramer-Von Mises test

The Cramer-Von Mises test is a goodness-of-fit test: it assesses whether a given sample of data is drawn from a given continuous probability distribution of dimension 1.

We denote by $\left\{ x_1,\ldots, x_{\sampleSize} \right\}$ the sample of dimension 1 and size $\sampleSize$. Let $F$ be the (unknown) cumulative distribution function of the continuous distribution from which the sample is drawn.

We want to test whether the sample is drawn from the distribution whose cumulative distribution function is $G$.

The test statistic is the integrated squared distance between the empirical cumulative distribution function $\widehat{F}$ built from the sample and $G$, weighted by the probability density function $p$ of $G$ and scaled by the sample size $\sampleSize$. Letting $X_1, \ldots , X_\sampleSize$ be i.i.d. random variables following the distribution with CDF $F$, from which $\widehat{F}$ is built, the test statistic is defined by:

$\begin{aligned} D_{\sampleSize} = \sampleSize \int^{\infty}_{-\infty} \left[G\left(x\right) - \widehat{F}\left(x\right)\right]^2 \, p\left(x\right) dx \end{aligned}$

The empirical value of the test statistic, evaluated from the sample sorted in increasing order $x_{(1)} \leq \ldots \leq x_{(\sampleSize)}$, is:

$\begin{aligned} d_{\sampleSize} = \frac{1}{12 \sampleSize} + \sum_{i=1}^{\sampleSize}\left[\frac{2i-1}{2\sampleSize} - G\left(x_{(i)}\right)\right]^2 \end{aligned}$
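The empirical value above can be sketched in a few lines of plain Python. This is only an illustration, not the library's implementation: the function names `normal_cdf` and `cramer_von_mises_statistic` are hypothetical, the candidate CDF $G$ is taken to be the standard normal CDF purely as an example, and the sum assumes that the observations are taken in increasing order, so the sample is sorted first.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, used here as an example candidate G."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def cramer_von_mises_statistic(sample, cdf):
    """Return d_n = 1/(12 n) + sum_i [(2 i - 1)/(2 n) - G(x_(i))]^2.

    `sample` is a sequence of floats and `cdf` is the candidate CDF G.
    The observations are sorted because the index i refers to the
    i-th smallest observation.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("the sample must contain at least one point")
    ordered = sorted(sample)
    total = 1.0 / (12.0 * n)
    for i, x in enumerate(ordered, start=1):
        total += ((2.0 * i - 1.0) / (2.0 * n) - cdf(x)) ** 2
    return total


# Illustrative data (made up for this sketch).
sample = [0.3, -1.2, 0.8, 1.5, -0.4]
d_n = cramer_von_mises_statistic(sample, normal_cdf)
print("empirical statistic d_n =", d_n)
```

The statistic reaches its smallest possible value, $1/(12\sampleSize)$, when each $G(x_{(i)})$ falls exactly on $(2i-1)/(2\sampleSize)$, which gives a convenient sanity check for any implementation.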

Under the null hypothesis $\mathcal{H}_0 = \{ G = F\}$, the distribution of the test statistic $D_{\sampleSize}$ is known asymptotically, i.e. when $\sampleSize \rightarrow +\infty$. If $\sampleSize$ is sufficiently large, we can use this asymptotic distribution to apply the test as follows. We fix a risk $\alpha$ (type I error) and we evaluate the associated critical value $d_\alpha$, which is the quantile of order $1-\alpha$ of the asymptotic distribution of $D_{\sampleSize}$.

Then a decision is made, either by comparing the empirical value $d_{\sampleSize}$ of the test statistic to the theoretical threshold $d_\alpha$, or equivalently by evaluating the p-value of the sample, defined as $\Prob{D_{\sampleSize} > d_{\sampleSize}}$, and comparing it to $\alpha$:

if $d_{\sampleSize} > d_{\alpha}$ (or equivalently $\Prob{D_{\sampleSize} > d_{\sampleSize}} < \alpha$), then we reject $G$,
if $d_{\sampleSize} \leq d_{\alpha}$ (or equivalently $\Prob{D_{\sampleSize} > d_{\sampleSize}} \geq \alpha$), then $G$ is considered acceptable.
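The decision rule can be illustrated with a small, self-contained Python sketch. Note that it does not use the asymptotic distribution described above: as a plainly different technique, it estimates the p-value by Monte Carlo simulation. This relies on the fact that, under the null hypothesis and for a continuous $G$, the values $G(X_i)$ are uniform on $[0, 1]$, so the distribution of the statistic does not depend on $G$. All function names, the number of replications and the seed are assumptions made for this example.

```python
import random


def statistic_from_probabilities(u_values):
    """d_n computed from the values u_i = G(x_i), sorted internally."""
    n = len(u_values)
    ordered = sorted(u_values)
    return 1.0 / (12.0 * n) + sum(
        ((2.0 * i - 1.0) / (2.0 * n) - u) ** 2
        for i, u in enumerate(ordered, start=1)
    )


def monte_carlo_p_value(d_observed, n, replications=10000, seed=0):
    """Estimate P(D_n > d_observed) under H0 by simulation.

    Under H0 the values G(X_i) are i.i.d. uniform on [0, 1], so each
    replication draws n uniform numbers and evaluates the statistic.
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(replications):
        simulated = statistic_from_probabilities(
            [rng.random() for _ in range(n)]
        )
        if simulated > d_observed:
            exceed += 1
    return exceed / replications


def decide(p_value, alpha=0.05):
    """Apply the rule: reject G when the p-value is below alpha."""
    return "reject G" if p_value < alpha else "G is acceptable"


# Illustrative values u_i = G(x_i) for a made-up sample of size 5.
u_values = [0.62, 0.12, 0.79, 0.93, 0.34]
d_obs = statistic_from_probabilities(u_values)
p_value = monte_carlo_p_value(d_obs, n=len(u_values))
print(d_obs, p_value, decide(p_value))
```

A simulated p-value of this kind is a reasonable fallback when $\sampleSize$ is too small for the asymptotic distribution to be trusted; its accuracy is limited by the number of replications.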
