Estimation of a quantile upper bound by Wilks’ method

We consider a random variable $X$ of dimension 1 and the unknown quantile $x_{\alpha}$ of level $\alpha$ of its distribution ($\alpha \in [0, 1]$). We seek an upper bound of $x_{\alpha}$ that holds with a confidence greater than or equal to $\beta$, using a given order statistic.

Let $(X_1, \dots, X_\sampleSize)$ be independent copies of $X$. Let $X_{(k)}$ be the $k$-th order statistic of $(X_1, \dots, X_\sampleSize)$, which means that $X_{(k)}$ is the $k$-th smallest value of $(X_1, \dots, X_\sampleSize)$ for $1 \leq k \leq \sampleSize$. For example, $X_{(1)} = \min (X_1, \dots, X_\sampleSize)$ is the minimum and $X_{(\sampleSize)} = \max (X_1, \dots, X_\sampleSize)$ is the maximum. We have:

$X_{(1)} \leq X_{(2)} \leq \dots \leq X_{(\sampleSize)}$
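As a minimal illustration, the order statistics of a realized sample are obtained by sorting it in ascending order. The sketch below uses plain Python; the helper name `order_statistic` and the sample values are hypothetical and only serve the example.

```python
def order_statistic(sample, k):
    """Return the k-th order statistic (1-based rank) of a sample."""
    n = len(sample)
    if not 1 <= k <= n:
        raise ValueError("rank k must satisfy 1 <= k <= n")
    # Ascending sort puts the minimum at rank 1 and the maximum at rank n.
    return sorted(sample)[k - 1]


sample = [2.7, -0.4, 1.3, 5.1, 0.8]  # illustrative values only
ordered = [order_statistic(sample, k) for k in range(1, len(sample) + 1)]
```

Here `ordered` is the realized counterpart of $(X_{(1)}, \dots, X_{(\sampleSize)})$: its first entry is the sample minimum and its last entry the sample maximum.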

Smallest rank for an upper bound to the quantile

Let $(x_1, \dots, x_\sampleSize)$ be an i.i.d. sample of size $\sampleSize$ of the random variable $X$ . Given a quantile level $\alpha \in [0,1]$ , a confidence level $\beta \in [0,1]$ , and a sample size $\sampleSize$ , we seek the smallest rank $k \in \llbracket 1, \sampleSize \rrbracket$ such that:

(1) $\Prob{x_{\alpha} \leq X_{(k)}} \geq \beta$

The cumulative distribution function and the probability density function of the order statistic $X_{(k)}$ are given below, where $F$ and $p$ denote the cumulative distribution function and the probability density function of $X$:

(2) $\begin{aligned} F_{X_{(k)}}(x) & = \sum_{i=k}^{\sampleSize} \binom{\sampleSize}{i}\left(F(x) \right)^i \left(1-F(x) \right)^{\sampleSize-i} \\ p_{X_{(k)}}(x) & = (\sampleSize-k+1)\binom{\sampleSize}{k-1}\left(F(x)\right)^{k-1} \left(1-F(x) \right)^{\sampleSize-k} p(x) \end{aligned}$

We notice that $F_{X_{(k)}}(x) = \overline{F}_{(\sampleSize,F(x))}(k-1)$ where $F_{(\sampleSize,F(x))}$ is the cumulative distribution function of the Binomial distribution $\cB(\sampleSize,F(x))$ and $\overline{F}_{(\sampleSize,F(x))}(k) = 1 - F_{(\sampleSize,F(x))}(k)$ is the complementary cumulative distribution function (also called the survival function in dimension 1). Therefore:

$F_{X_{(k)}}(x_{\alpha}) = \sum_{i=k}^{\sampleSize} \binom{\sampleSize}{i} \alpha^i (1-\alpha)^{\sampleSize-i} = \overline{F}_{(\sampleSize,\alpha)}(k-1)$
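The identity above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the values of `n`, `k` and `alpha` are arbitrary illustrative choices.

```python
from math import comb


def binomial_pmf(n, p, i):
    """P(B = i) for B ~ Binomial(n, p)."""
    return comb(n, i) * p**i * (1.0 - p) ** (n - i)


def binomial_cdf(n, p, k):
    """P(B <= k) for B ~ Binomial(n, p)."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    return sum(binomial_pmf(n, p, i) for i in range(k + 1))


def order_statistic_cdf_at_quantile(n, k, alpha):
    """F_{X_(k)}(x_alpha): sum of the Binomial(n, alpha) masses from k to n."""
    return sum(binomial_pmf(n, alpha, i) for i in range(k, n + 1))


n, k, alpha = 10, 8, 0.5  # illustrative values only
direct = order_statistic_cdf_at_quantile(n, k, alpha)
# Same quantity written as the Binomial survival function evaluated at k - 1.
survival = 1.0 - binomial_cdf(n, alpha, k - 1)
```

Both `direct` and `survival` evaluate the same probability, namely that at least $k$ of the $\sampleSize$ observations are less than or equal to $x_{\alpha}$.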

and equation (1) implies:

(3) $1-F_{X_{(k)}}(x_{\alpha})\geq \beta$

This implies:

$F_{\sampleSize, \alpha}(k-1)\geq \beta$
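The chain of equalities behind this step can be spelled out as follows. This sketch assumes that $F$ is continuous at $x_{\alpha}$, so that $F(x_{\alpha}) = \alpha$ and $\Prob{X_{(k)} = x_{\alpha}} = 0$; this assumption is used implicitly in the text:

$\begin{aligned} \Prob{x_{\alpha} \leq X_{(k)}} & = 1 - \Prob{X_{(k)} < x_{\alpha}} \\ & = 1 - F_{X_{(k)}}(x_{\alpha}) \\ & = 1 - \overline{F}_{(\sampleSize,\alpha)}(k-1) \\ & = F_{\sampleSize, \alpha}(k-1) \end{aligned}$

In other words, equation (1) requires that a Binomial variable with parameters $(\sampleSize, \alpha)$ takes a value of at most $k-1$ with probability at least $\beta$, which is the condition $F_{\sampleSize, \alpha}(k-1)\geq \beta$.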

The smallest rank $k_{sol}$ such that the previous equation is satisfied is:

$\begin{aligned} k_{sol} & = \min \{ k \in \llbracket 1, \sampleSize \rrbracket \, | \, F_{\sampleSize, \alpha}(k-1)\geq \beta \}\\ & = 1 + \min \{ k \in \llbracket 0, \sampleSize - 1 \rrbracket \, | \, F_{\sampleSize, \alpha}(k)\geq \beta \} \end{aligned}$

An upper bound of $x_{\alpha}$ is estimated by the value of $X_{(k_{sol})}$ on the sample $(x_1, \dots, x_\sampleSize)$ .
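The search for $k_{sol}$ can be sketched in a few lines of plain Python. This is an illustrative implementation under stated assumptions, not the library's own routine: the function name `smallest_rank` is hypothetical, and it returns `None` when no rank satisfies the condition, which happens exactly when $F_{\sampleSize, \alpha}(\sampleSize - 1) = 1 - \alpha^{\sampleSize} < \beta$.

```python
from math import comb


def smallest_rank(n, alpha, beta):
    """Smallest k in [1, n] with F_{n,alpha}(k - 1) >= beta, or None if none exists."""
    cdf = 0.0
    for k in range(1, n + 1):
        # Add P(B = k - 1) so that cdf equals F_{n,alpha}(k - 1).
        cdf += comb(n, k - 1) * alpha ** (k - 1) * (1.0 - alpha) ** (n - k + 1)
        if cdf >= beta:
            return k
    return None


k_sol = smallest_rank(100, 0.95, 0.95)
```

With $\sampleSize = 100$ and $\alpha = \beta = 0.95$, the search stops at rank 99, so the second largest observation is the upper bound. With $\sampleSize = 10$ the same levels admit no rank, since $1 - 0.95^{10} < 0.95$.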

Minimum sample size for an upper bound to the quantile

Given $\alpha$, $\beta$, and an index $i \geq 0$ counted down from the maximum, we seek the smallest sample size $\sampleSize$ such that equation (1) is satisfied with the rank $k = \sampleSize - i$. In order to do so, we solve equation (3) with respect to the sample size $\sampleSize$.

Once the smallest size $\sampleSize$ has been determined, a sample of size $\sampleSize$ can be generated from $X$ and an upper bound of $x_{\alpha}$ is estimated using $x_{(\sampleSize-i)}$, i.e. the $(\sampleSize - i)$-th observation in the ordered sample $(x_{(1)}, \dots, x_{(\sampleSize)})$.
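The minimum sample size can be found by a direct search. The sketch below is illustrative only: it takes the rank as $k = \sampleSize - i$, as the use of $x_{(\sampleSize-i)}$ suggests, and the function name `minimum_sample_size` and the bound `n_max` are hypothetical. It relies on the fact that the condition $F_{\sampleSize, \alpha}(\sampleSize - i - 1) \geq \beta$ can only switch from false to true as $\sampleSize$ grows, and it is meant for small values of $i$.

```python
from math import ceil, comb, log


def minimum_sample_size(alpha, beta, i=0, n_max=10_000):
    """Smallest n such that F_{n,alpha}(n - i - 1) >= beta, i.e. rank k = n - i."""
    if not (0.0 < alpha < 1.0 and 0.0 < beta < 1.0):
        raise ValueError("alpha and beta must lie strictly between 0 and 1")
    if i < 0:
        raise ValueError("i must be non-negative")
    for n in range(i + 1, n_max + 1):
        # Upper tail P(B >= n - i) of Binomial(n, alpha): only i + 1 terms.
        tail = sum(
            comb(n, j) * alpha**j * (1.0 - alpha) ** (n - j)
            for j in range(n - i, n + 1)
        )
        # F_{n,alpha}(n - i - 1) = 1 - P(B >= n - i).
        if 1.0 - tail >= beta:
            return n
    raise RuntimeError("no sample size found up to n_max")


n_max_stat = minimum_sample_size(0.95, 0.95, i=0)     # bound given by the maximum
n_second = minimum_sample_size(0.95, 0.95, i=1)       # bound given by the second largest value
# For i = 0 the condition reduces to 1 - alpha**n >= beta, which has a closed form.
n_closed_form = ceil(log(1.0 - 0.95) / log(0.95))
```

For $\alpha = \beta = 0.95$, the search gives 59 observations when the maximum is used ($i = 0$) and 93 when the second largest value is used ($i = 1$). The case $i = 0$ agrees with the closed form $\sampleSize \geq \ln(1-\beta) / \ln(\alpha)$.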

