Asymptotic quantile confidence interval based on order statistics

We consider a scalar random variable X and its quantile x_{\alpha} of level \alpha (where \alpha \in [0, 1]). We want to build a confidence interval for x_{\alpha}, based on order statistics, whose confidence level is asymptotically \beta.

Let (X_1, \dots, X_\sampleSize) be independent copies of X. Let X_{(k)} be the k-th order statistic of (X_1, \dots, X_\sampleSize), so that:

X_{(1)} \leq X_{(2)} \leq \cdots \leq X_{(\sampleSize)}.

Empirical quantile estimator

We first introduce the empirical estimator of the quantile x_{\alpha}. We denote by \hat{F} the empirical cumulative distribution function defined by:

\hat{F}(x) = \dfrac{1}{\sampleSize} \sum_{i=1}^\sampleSize \mathbb{1}_{X_i \leq x}

Then, the empirical estimator of x_{\alpha} is defined by:

\hat{X}_{\alpha} = \inf \left\{ x, \hat{F}(x) \geq \alpha \right\} = X_{(\lceil \sampleSize\alpha \rceil)}

where \lceil x \rceil is the smallest integer value that is greater than or equal to x.
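
As an illustration, the estimator above can be sketched in a few lines of Python. This is a minimal sketch using only the standard library; the function name `empirical_quantile` and the sample values are hypothetical and chosen for the example. The level \alpha = 0 is rejected because the rank \lceil \sampleSize\alpha \rceil would then be 0, which does not correspond to any order statistic.

```python
import math


def empirical_quantile(sample, alpha):
    """Return the order statistic X_(ceil(n * alpha)) of the sample."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    ordered = sorted(sample)      # X_(1) <= X_(2) <= ... <= X_(n)
    n = len(ordered)
    k = math.ceil(n * alpha)      # 1-based rank of the order statistic
    return ordered[k - 1]         # Python lists are 0-based


# Hypothetical sample of size 10, used only for illustration.
sample = [2.3, 0.7, 1.9, 3.1, 0.2, 1.1, 2.8, 0.9, 1.5, 2.0]
print(empirical_quantile(sample, 0.25))  # rank ceil(2.5) = 3
print(empirical_quantile(sample, 0.5))   # rank ceil(5.0) = 5
```

Note that the product n * alpha is computed in floating point, so for some levels the ceiling may be off by one rank when n * alpha is mathematically an integer.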

The empirical estimator is asymptotically normal (see [delmas2006], [garnier2008]). The following convergence holds in distribution:

\sqrt{\sampleSize}( \hat{X}_{\alpha} - x_{\alpha}) \xrightarrow[\sampleSize \to +\infty]{} \cN(0, \sigma^2)
\quad \mbox{with}  \quad \sigma^2 = \dfrac{\alpha(1-\alpha)}{p^2(x_{\alpha})}

The empirical estimator has a bias and a variance of order 1/\sampleSize (see [david1981], [garnier2008], [Motoyama2025]). We get the following asymptotic results:

\Expect{\hat{X}_{\alpha}} & = x_{\alpha} - \dfrac{\alpha(1-\alpha)p'(x_{\alpha})}{2(\sampleSize+2)p^3(x_{\alpha})} + O\left(\dfrac{1}{\sampleSize^2}\right)\\
\Var{\hat{X}_{\alpha}} & = \dfrac{\alpha(1-\alpha)}{(\sampleSize+2)p^2(x_{\alpha})} + O\left(\dfrac{1}{\sampleSize^2}\right)

where p is the density of X, assumed to be continuously differentiable. These results are of limited use for building a confidence interval because the value p(x_{\alpha}) is unknown in practice.

Asymptotic quantile confidence interval

Here, we seek an asymptotic confidence interval of level \beta based on order statistics. This confidence interval is \left[ X_{(i_\sampleSize)}, X_{(j_\sampleSize)}\right] where the ranks i_\sampleSize and j_\sampleSize are defined by:

i_\sampleSize & = \left\lfloor \sampleSize \alpha - \sqrt{\sampleSize} \; z_{\frac{1+\beta}{2}} \; \sqrt{\alpha(1 - \alpha)} \right\rfloor\\
j_\sampleSize & = \left\lfloor \sampleSize \alpha + \sqrt{\sampleSize} \; z_{\frac{1+\beta}{2}} \;  \sqrt{\alpha(1 - \alpha)} \right\rfloor

where z_{\frac{1+\beta}{2}} is the quantile of level \frac{1+\beta}{2} of the standard normal distribution (see [delmas2006], Proposition 11.1.13).
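
The two rank formulas can be evaluated directly. The sketch below is a minimal illustration using Python's standard library, where `statistics.NormalDist().inv_cdf` provides the standard normal quantile; the function name `asymptotic_ranks` is hypothetical. For small sample sizes or extreme levels \alpha, the computed ranks may fall outside \{1, \dots, \sampleSize\}, in which case the interval is not defined; the sketch does not handle that case.

```python
import math
from statistics import NormalDist


def asymptotic_ranks(n, alpha, beta):
    """Return the ranks (i_n, j_n) of the asymptotic confidence interval."""
    # Quantile of level (1 + beta) / 2 of the standard normal distribution.
    z = NormalDist().inv_cdf((1.0 + beta) / 2.0)
    half_width = math.sqrt(n) * z * math.sqrt(alpha * (1.0 - alpha))
    i_n = math.floor(n * alpha - half_width)
    j_n = math.floor(n * alpha + half_width)
    return i_n, j_n


# Median (alpha = 0.5) with beta = 0.95 and a sample of size 100.
print(asymptotic_ranks(100, 0.5, 0.95))
```

With these values, the half-width is about 9.8, so the ranks are \lfloor 40.2 \rfloor = 40 and \lfloor 59.8 \rfloor = 59.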

Then, we have:

\lim\limits_{\sampleSize \rightarrow +\infty} \Prob{x_{\alpha} \in \left[ X_{(i_\sampleSize)}, X_{(j_\sampleSize)}\right]} = \beta
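
This limit can be checked empirically with a small Monte Carlo experiment. The sketch below is only an illustration under stated assumptions: X is taken to be uniform on [0, 1], so that x_{\alpha} = \alpha is known exactly, and the names `asymptotic_ranks` and `estimate_coverage`, the sample size, the number of repetitions and the seed are all hypothetical choices. For a finite sample size the estimated coverage is only expected to be close to \beta, not equal to it.

```python
import math
import random
from statistics import NormalDist


def asymptotic_ranks(n, alpha, beta):
    """Return the ranks (i_n, j_n) of the asymptotic confidence interval."""
    z = NormalDist().inv_cdf((1.0 + beta) / 2.0)
    half_width = math.sqrt(n) * z * math.sqrt(alpha * (1.0 - alpha))
    return math.floor(n * alpha - half_width), math.floor(n * alpha + half_width)


def estimate_coverage(n, alpha, beta, repetitions, seed=0):
    """Estimate P(x_alpha in [X_(i_n), X_(j_n)]) for X uniform on [0, 1]."""
    rng = random.Random(seed)
    i_n, j_n = asymptotic_ranks(n, alpha, beta)
    if not 1 <= i_n <= j_n <= n:
        raise ValueError("ranks fall outside 1..n; increase the sample size")
    true_quantile = alpha  # quantile of level alpha of Uniform(0, 1)
    hits = 0
    for _ in range(repetitions):
        ordered = sorted(rng.random() for _ in range(n))
        # Ranks are 1-based, list indices are 0-based.
        if ordered[i_n - 1] <= true_quantile <= ordered[j_n - 1]:
            hits += 1
    return hits / repetitions


coverage = estimate_coverage(n=200, alpha=0.5, beta=0.95, repetitions=2000)
print(f"estimated coverage: {coverage:.3f}")
```

Increasing the sample size n (and the number of repetitions, to reduce the Monte Carlo noise) should bring the estimate closer to \beta.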