Estimation of a quantile bound¶

We consider a random variable $X$ of dimension 1 and the unknown quantile $x_{\alpha}$ of level $\alpha$ of its distribution ($\alpha \in [0, 1]$). We seek an upper bound of $x_{\alpha}$ with a confidence greater than or equal to $\beta$, using order statistics.

Let $(X_1, \dots, X_\sampleSize)$ be independent copies of $X$. Let $X_{(k)}$ be the $k$-th order statistic of $(X_1, \dots, X_\sampleSize)$, which means that $X_{(k)}$ is the $k$-th smallest value of $(X_1, \dots, X_\sampleSize)$ for $1 \leq k \leq \sampleSize$. For example, $X_{(1)} = \min (X_1, \dots, X_\sampleSize)$ is the minimum and $X_{(\sampleSize)} = \max (X_1, \dots, X_\sampleSize)$ is the maximum. We have:

$X_{(1)} \leq X_{(2)} \leq \dots \leq X_{(\sampleSize)}$
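
On a realized sample, the order statistics are simply the sorted values. The short sketch below illustrates this with plain Python; the function name `order_statistic` and the sample values are hypothetical and chosen only for illustration.

```python
def order_statistic(sample, k):
    """Return x_(k), the k-th smallest value of the sample (rank k is 1-based)."""
    if not 1 <= k <= len(sample):
        raise ValueError("k must satisfy 1 <= k <= len(sample)")
    return sorted(sample)[k - 1]


# Illustrative sample (arbitrary values)
sample = [2.7, -1.3, 0.4, 5.1, 0.9]

# x_(1) is the minimum and x_(n) is the maximum
assert order_statistic(sample, 1) == min(sample)
assert order_statistic(sample, len(sample)) == max(sample)

# The order statistics are non-decreasing in k
ordered = [order_statistic(sample, k) for k in range(1, len(sample) + 1)]
assert ordered == sorted(sample)
```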

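Let $F$ and $p$ denote the cumulative distribution function and the probability density function of $X$. The cumulative distribution function and the probability density function of the order statistic $X_{(k)}$ are: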

(1)¶ $\begin{aligned} F_{X_{(k)}}(x) & = \sum_{i=k}^{\sampleSize} \binom{\sampleSize}{i}\left(F(x) \right)^i \left(1-F(x) \right)^{\sampleSize-i} \\ p_{X_{(k)}}(x) & = (\sampleSize-k+1)\binom{\sampleSize}{k-1}\left(F(x)\right)^{k-1} \left(1-F(x) \right)^{\sampleSize-k} p(x) \end{aligned}$

We notice that $F_{X_{(k)}}(x) = \overline{F}_{(\sampleSize,F(x))}(k-1)$ where $F_{(\sampleSize,F(x))}$ is the cumulative distribution function of the Binomial distribution $\cB(\sampleSize,F(x))$ and $\overline{F}_{(\sampleSize,F(x))}(k) = 1 - F_{(\sampleSize,F(x))}(k)$ is the complementary cumulative distribution function (also named survival function in dimension 1). Therefore:

$F_{X_{(k)}}(x_{\alpha}) = \sum_{i=k}^{\sampleSize} \binom{\sampleSize}{i} \alpha^i (1-\alpha)^{\sampleSize-i} = \overline{F}_{(\sampleSize,\alpha)}(k-1)$
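
The identity above can be checked numerically. The following sketch uses only the Python standard library; the helper names `binomial_cdf` and `order_statistic_cdf_at_quantile` are hypothetical and the values of $\sampleSize$ and $\alpha$ are arbitrary.

```python
from math import comb


def binomial_cdf(n, p, k):
    """P(B <= k) for B following the Binomial distribution B(n, p)."""
    if k < 0:
        return 0.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(min(k, n) + 1))


def order_statistic_cdf_at_quantile(n, alpha, k):
    """F_{X_(k)}(x_alpha), written as the sum from i = k to n."""
    return sum(comb(n, i) * alpha**i * (1 - alpha) ** (n - i) for i in range(k, n + 1))


n, alpha = 10, 0.5
for k in range(1, n + 1):
    direct = order_statistic_cdf_at_quantile(n, alpha, k)
    survival = 1.0 - binomial_cdf(n, alpha, k - 1)  # complementary CDF at k - 1
    assert abs(direct - survival) < 1e-12
```

For $k = \sampleSize$ the sum reduces to the single term $\alpha^{\sampleSize}$, which is the probability that the maximum of the sample lies below $x_{\alpha}$.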

Rank for an upper bound of the quantile¶

Let $(x_1, \dots, x_\sampleSize)$ be an i.i.d. sample of size $\sampleSize$ of the random variable $X$ . Given a quantile level $\alpha \in [0,1]$ , a confidence level $\beta \in [0,1]$ , and a sample size $\sampleSize$ , we seek the smallest rank $k \in \llbracket 1, \sampleSize \rrbracket$ such that:

(2)¶ $\Prob{x_{\alpha} \leq X_{(k)}} \geq \beta \qquad$

Equation (2) can be rewritten as:

(3)¶ $1-F_{X_{(k)}}(x_{\alpha})\geq \beta$

Since $F_{X_{(k)}}(x_{\alpha}) = \overline{F}_{(\sampleSize,\alpha)}(k-1)$, this is equivalent to:

$F_{\sampleSize, \alpha}(k-1)\geq \beta$

The smallest rank $k_{sol}$ such that the previous equation is satisfied is:

$\begin{aligned} k_{sol} & = \min \{ k \in \llbracket 1, \sampleSize \rrbracket \, | \, F_{\sampleSize, \alpha}(k-1)\geq \beta \}\\ & = 1 + \min \{ k \in \llbracket 0, \sampleSize - 1 \rrbracket \, | \, F_{\sampleSize, \alpha}(k)\geq \beta \} \end{aligned}$

An upper bound of $x_{\alpha}$ is estimated by the value of $X_{(k_{sol})}$ on the sample $(x_1, \dots, x_\sampleSize)$ .

Here is a recap of the existence of solutions for this case:

$K_{sol}$	$\beta=0$	$0 < \beta < 1$	$\beta=1$
$\alpha=0$	1	1	1
$0 < \alpha < 1$	1	see (4)	$\emptyset$
$\alpha=1$	1	$\emptyset$	$\emptyset$

With:

(4)¶ $k_{sol} = \begin{cases} 1+F_{n,\alpha}^{-1}(\beta) & \text{if } 1-\alpha^n \geq \beta \\ \emptyset & \text{otherwise} \end{cases}$
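
As an illustration, the smallest rank can be found by scanning $k$ from 1 upwards and stopping at the first $k$ such that $F_{n,\alpha}(k-1) \geq \beta$. This is a minimal sketch in plain Python, not the library implementation; the names `binomial_cdf` and `upper_bound_rank` are hypothetical, and `None` stands for $\emptyset$.

```python
from math import comb


def binomial_cdf(n, p, k):
    """P(B <= k) for B following the Binomial distribution B(n, p)."""
    if k < 0:
        return 0.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(min(k, n) + 1))


def upper_bound_rank(n, alpha, beta):
    """Smallest rank k in 1..n such that F_{n,alpha}(k - 1) >= beta, else None."""
    for k in range(1, n + 1):
        if binomial_cdf(n, alpha, k - 1) >= beta:
            return k
    return None  # no rank reaches the confidence level beta


# With alpha = beta = 0.95, a sample of size 58 is too small,
# while a sample of size 59 admits the maximum as an upper bound.
print(upper_bound_rank(58, 0.95, 0.95))
print(upper_bound_rank(59, 0.95, 0.95))
```

The linear scan costs one Binomial CDF evaluation per rank; a quantile function of the Binomial distribution, when available, gives the same result directly through equation (4).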

Rank for a lower bound of the quantile¶

Similarly, for the lower bound, we seek the greatest rank $k \in \llbracket 1, \sampleSize \rrbracket$ such that:

(5)¶ $\Prob{X_{(k)} \leq x_{\alpha}} \geq \beta \qquad$

Here is a recap of the existence of solutions for this case:

$K_{sol}$	$\beta=0$	$0 < \beta < 1$	$\beta=1$
$\alpha=0$	n	$\emptyset$	$\emptyset$
$0 < \alpha < 1$	n	see (6)	$\emptyset$
$\alpha=1$	n	n	n

With:

(6)¶ $k_{sol} = \begin{cases} \emptyset & \text{if } (1-\alpha)^n > 1 - \beta \\ 1+F_{n,\alpha}^{-1}(1-\beta) & \text{if there exists } k_0 \text{ such that } F_{n,\alpha}(k_0 - 1) = 1-\beta \\ F_{n,\alpha}^{-1}(1-\beta) & \text{otherwise} \end{cases}$
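
Since $\Prob{X_{(k)} \leq x_{\alpha}} = F_{X_{(k)}}(x_{\alpha}) = 1 - F_{n,\alpha}(k-1)$, condition (5) reads $F_{n,\alpha}(k-1) \leq 1-\beta$. The sketch below scans the ranks from $n$ downwards and returns the first one satisfying this inequality. It is only an illustration in plain Python; the names `binomial_cdf` and `lower_bound_rank` are hypothetical, and `None` stands for $\emptyset$.

```python
from math import comb


def binomial_cdf(n, p, k):
    """P(B <= k) for B following the Binomial distribution B(n, p)."""
    if k < 0:
        return 0.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(min(k, n) + 1))


def lower_bound_rank(n, alpha, beta):
    """Greatest rank k in 1..n such that F_{n,alpha}(k - 1) <= 1 - beta, else None."""
    for k in range(n, 0, -1):
        if binomial_cdf(n, alpha, k - 1) <= 1.0 - beta:
            return k
    return None  # even the minimum X_(1) does not reach the confidence level


# Lower bound of the 5% quantile with 95% confidence
print(lower_bound_rank(58, 0.05, 0.95))
print(lower_bound_rank(100, 0.05, 0.95))
```

The first case of equation (6) corresponds to the `None` branch: when $(1-\alpha)^n > 1-\beta$, the inequality already fails for $k = 1$.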

Ranks for bilateral bounds of the quantile¶

For bilateral bounds, we seek the ranks $(k_1, k_2) \in \llbracket 1, \sampleSize \rrbracket^2$ such that:

(7)¶ $\Prob{X_{(k_1)} \leq x_{\alpha} \leq X_{(k_2)}} \geq \beta \qquad$

with $k_2 - k_1$ as small as possible.

Here is a recap of the existence of solutions for this case:

$K_{sol}$	$\beta=0$	$0 < \beta < 1$	$\beta=1$
$\alpha=0$	$\Bigl\lfloor \frac{n}{2} \Bigr\rfloor$	$\emptyset$	$\emptyset$
$0 < \alpha < 1$	1	$\emptyset$ or 1	$\emptyset$
$\alpha=1$	$\Bigl\lfloor \frac{n}{2} \Bigr\rfloor$	$\emptyset$	$\emptyset$
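
A brute-force search for the pair of ranks can be sketched as follows. It assumes that the distribution of $X$ is continuous and that $k_1 < k_2$, so that $\Prob{X_{(k_1)} \leq x_{\alpha} \leq X_{(k_2)}} = F_{n,\alpha}(k_2-1) - F_{n,\alpha}(k_1-1)$; this expression is an assumption of the sketch, obtained from the Binomial representation of the order statistics. The function names are hypothetical, and ties between pairs of equal width are broken by taking the smallest $k_1$, which is an arbitrary choice.

```python
from math import comb


def binomial_cdf(n, p, k):
    """P(B <= k) for B following the Binomial distribution B(n, p)."""
    if k < 0:
        return 0.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(min(k, n) + 1))


def bilateral_ranks(n, alpha, beta):
    """Pair (k1, k2), k1 < k2, with coverage >= beta and minimal k2 - k1, else None."""
    # cdf[j] = F_{n,alpha}(j) for j = 0..n-1, computed once
    cdf = [binomial_cdf(n, alpha, j) for j in range(n)]
    for width in range(1, n):  # width = k2 - k1, explored in increasing order
        for k1 in range(1, n - width + 1):
            k2 = k1 + width
            coverage = cdf[k2 - 1] - cdf[k1 - 1]
            if coverage >= beta:
                return k1, k2
    return None


# Bilateral bounds of the median with 95% confidence, sample of size 100
print(bilateral_ranks(100, 0.5, 0.95))
```

The search is quadratic in $n$, which is acceptable for an illustration but would deserve a smarter strategy for large samples.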

Minimum sample size for an upper bound of the quantile¶

Given $\alpha$, $\beta$, and a rank $k$, we seek the smallest sample size $\sampleSize$ such that equation (2) is satisfied. In order to do so, we solve equation (3) with respect to the sample size $\sampleSize$.

Once the smallest size $\sampleSize$ has been computed, a sample of size $\sampleSize$ can be generated from $X$ and an upper bound of $x_{\alpha}$ is estimated using $x_{(k)}$, i.e. the $k$-th observation in the ordered sample $(x_{(1)}, \dots, x_{(\sampleSize)})$.

Here is a recap of the existence of solutions for this case:

	$\beta=0$	$0 < \beta < 1$	$\beta=1$
$0 \leq \alpha \leq 1$	$k$ if $1-\alpha^k \geq \beta$ else $\emptyset$
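
The table entry can be read as follows: for a fixed rank $k$, the probability $F_{\sampleSize,\alpha}(k-1)$ decreases when $\sampleSize$ grows, so $\sampleSize = k$ is the only candidate, and $F_{k,\alpha}(k-1) = 1-\alpha^k$. The sketch below implements this rule; the function name `min_size_upper_bound` is hypothetical and `None` stands for $\emptyset$.

```python
def min_size_upper_bound(alpha, beta, k):
    """Smallest sample size for which X_(k) is an upper bound of x_alpha
    with confidence at least beta, or None if no such size exists."""
    # Only n = k can work, and F_{k,alpha}(k - 1) = 1 - alpha**k
    return k if 1.0 - alpha**k >= beta else None


# Smallest rank k for which a solution exists when alpha = beta = 0.95
k = 1
while min_size_upper_bound(0.95, 0.95, k) is None:
    k += 1
print(k, min_size_upper_bound(0.95, 0.95, k))
```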

Minimum sample size for a lower bound of the quantile¶

Similarly, for the lower bound, we seek the smallest sample size $\sampleSize$ such that equation (5) is satisfied.

Here is a recap of the existence of solutions for this case:

	$\beta=0$	$0 < \beta < 1$	$\beta=1$
$\alpha=0$	$k$	$\emptyset$	$\emptyset$
$0 < \alpha < 1$	$\min \{n \geq k \, | \, F_{n,\alpha}(k-1) \leq 1-\beta \}$		$\emptyset$
$\alpha=1$	$k$	$k$	$k$
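
For $0 < \alpha < 1$ the minimum size has no closed form, but it can be found by increasing $n$ from $k$ until $F_{n,\alpha}(k-1) \leq 1-\beta$, which is condition (5) written with the Binomial distribution. The following sketch does this in plain Python; the function names and the safety cap `max_size` are hypothetical, and `None` stands for $\emptyset$.

```python
from math import comb


def binomial_cdf(n, p, k):
    """P(B <= k) for B following the Binomial distribution B(n, p)."""
    if k < 0:
        return 0.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(min(k, n) + 1))


def min_size_lower_bound(alpha, beta, k, max_size=10000):
    """Smallest n >= k such that F_{n,alpha}(k - 1) <= 1 - beta, else None."""
    if alpha == 0.0:
        return k if beta == 0.0 else None  # first row of the table
    if beta == 1.0 and alpha < 1.0:
        return None  # the probability stays strictly below 1
    for n in range(k, max_size + 1):
        if binomial_cdf(n, alpha, k - 1) <= 1.0 - beta:
            return n
    return None  # not found below the (arbitrary) cap max_size


# Lower bound of the 5% quantile with 95% confidence, using X_(1) then X_(2)
print(min_size_lower_bound(0.05, 0.95, 1))
print(min_size_lower_bound(0.05, 0.95, 2))
```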

Minimum sample size for bilateral bounds of the quantile¶

Similarly, for the bilateral bounds, we seek the smallest sample size $\sampleSize$ such that equation (7) is satisfied.

Here is a recap of the existence of solutions for this case:

	$\beta=0$	$0 < \beta < 1$	$\beta=1$
$\alpha=1$	$k_2$	$\emptyset$	$\emptyset$
$0 < \alpha < 1$	$k_2$ if $1-\alpha^{k_2} - F_{k_2,\alpha}(k_1-1) \geq \beta$ else $\emptyset$
$\alpha=0$	$\emptyset$ if $k_1 \neq 0$ and $\beta > 0$ else $k_2$
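
The rule of the row $0 < \alpha < 1$ can be sketched as follows, assuming $1 \leq k_1 < k_2$. Evaluating the same expression at $\alpha = 0$ or $\alpha = 1$ gives a coverage equal to 0, which agrees with the two other rows when $k_1 \geq 1$. The function names are hypothetical and `None` stands for $\emptyset$.

```python
from math import comb


def binomial_cdf(n, p, k):
    """P(B <= k) for B following the Binomial distribution B(n, p)."""
    if k < 0:
        return 0.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(min(k, n) + 1))


def min_size_bilateral(alpha, beta, k1, k2):
    """Return k2 if 1 - alpha**k2 - F_{k2,alpha}(k1 - 1) >= beta, else None."""
    if not 1 <= k1 < k2:
        raise ValueError("ranks must satisfy 1 <= k1 < k2")
    # Coverage obtained with the candidate size n = k2
    coverage = 1.0 - alpha**k2 - binomial_cdf(k2, alpha, k1 - 1)
    return k2 if coverage >= beta else None


# Bounds of the median given by the minimum and the maximum of two observations
print(min_size_bilateral(0.5, 0.5, 1, 2))
print(min_size_bilateral(0.5, 0.6, 1, 2))
```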

OpenTURNS

An Open source initiative for the Treatment of Uncertainties, Risks'N Statistics
