Estimating a quantile by Wilks’ method
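Let us denote $\underline{Y} = h(\underline{X}, \underline{d}) = (Y^1, \ldots, Y^{n_Y})$, where $\underline{X} = (X^1, \ldots, X^{n_X})$ is a random vector and $\underline{d}$ is a deterministic vector. We seek to evaluate, using the probability distribution of the random vector $\underline{X}$, the $\alpha$-quantile $q_{Y^i}(\alpha)$ of $Y^i$, where $\alpha \in (0, 1)$:

$$\mathbb{P}\left( Y^i \leq q_{Y^i}(\alpha) \right) = \alpha$$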
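If we have a sample $\{\underline{x}_1, \ldots, \underline{x}_N\}$ of $N$ independent realizations of the random vector $\underline{X}$, $q_{Y^i}(\alpha)$ can be estimated as follows:

- the sample $\{\underline{x}_1, \ldots, \underline{x}_N\}$ of the vector $\underline{X}$ is first transformed into a sample $\{y^i_1, \ldots, y^i_N\}$ of the variable $Y^i$, using $\underline{y}_k = h(\underline{x}_k, \underline{d})$,
- the sample $\{y^i_1, \ldots, y^i_N\}$ is then sorted in ascending order, which gives the sample $\{y^{(1)}, \ldots, y^{(N)}\}$,
- the empirical estimate of the quantile is then calculated by the formula:

$$\widehat{q}_{Y^i}(\alpha) = y^{([N\alpha] + 1)}$$

where $[N\alpha]$ denotes the integer part of $N\alpha$.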
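For example, if $N = 100$ and $\alpha = 0.95$, $\widehat{q}_{Y^i}(0.95)$ is equal to $y^{(96)}$, which is the 5th largest value of the sample $\{y^i_1, \ldots, y^i_N\}$. Note that this estimate is meaningful only if $1/N \leq \alpha \leq 1 - 1/N$. For example, if $N = 100$, one can only consider values of $\alpha$ between 1% and 99%.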
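The three steps above (transform, sort, pick the order statistic of rank $[N\alpha] + 1$) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names `empirical_quantile_rank` and `empirical_quantile` are hypothetical and are not part of any library API. The sample is assumed to already contain the values of the output variable.

```python
import math


def empirical_quantile_rank(n, alpha):
    """Return the 1-based rank [n * alpha] + 1 of the empirical quantile."""
    # Rounding guards against floating-point products such as 94.99999999999999.
    return math.floor(round(n * alpha, 9)) + 1


def empirical_quantile(sample, alpha):
    """Return the order statistic of rank [N * alpha] + 1 of the sample."""
    ordered = sorted(sample)  # ascending order: y(1) <= ... <= y(N)
    rank = empirical_quantile_rank(len(ordered), alpha)
    if rank > len(ordered):
        raise ValueError("alpha is too close to 1 for this sample size")
    return ordered[rank - 1]  # convert the 1-based rank to a 0-based index


# Illustrative sample: the integers 1..100, given in descending order.
sample = list(range(100, 0, -1))
print(empirical_quantile_rank(100, 0.95))  # 96
print(empirical_quantile(sample, 0.95))    # 96, the 5th largest value
```

With $N = 100$ and $\alpha = 0.95$ the rank is 96, in agreement with the example above.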
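It is also possible to calculate an upper limit for the quantile with a confidence level $\beta$ chosen by the user; one can then be sure, with a confidence level $\beta$, that the true value of $q_{Y^i}(\alpha)$ is less than or equal to $\widehat{q}_{Y^i}(\alpha)_{\sup}$:

$$\mathbb{P}\left( q_{Y^i}(\alpha) \leq \widehat{q}_{Y^i}(\alpha)_{\sup} \right) = \beta$$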
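The most robust method for calculating this upper limit consists of taking $\widehat{q}_{Y^i}(\alpha)_{\sup} = y^{(j(\alpha, \beta, N))}$, where $j(\alpha, \beta, N)$ is an integer between 2 and $N$ found by solving the equation:

$$\sum_{k=0}^{j(\alpha, \beta, N) - 1} \binom{N}{k} \alpha^k (1 - \alpha)^{N-k} = \beta$$

A solution does not necessarily exist, i.e. there may be no integer value of $j(\alpha, \beta, N)$ satisfying this equality; in this case one can choose the smallest integer $j(\alpha, \beta, N)$ such that:

$$\sum_{k=0}^{j(\alpha, \beta, N) - 1} \binom{N}{k} \alpha^k (1 - \alpha)^{N-k} > \beta$$

which ensures that $\mathbb{P}\left( q_{Y^i}(\alpha) \leq \widehat{q}_{Y^i}(\alpha)_{\sup} \right) > \beta$; in other words, the confidence level of the quantile estimate is greater than that initially required.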
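The search for the rank of the upper bound can be sketched as follows. This is an illustrative, standard-library-only implementation under stated assumptions: the helper names `binomial_cdf` and `upper_bound_rank` are hypothetical, the equality and strict-inequality cases are merged into a single "greater than or equal to $\beta$" test, and the term-by-term sum is only intended for moderate sample sizes (a library binomial CDF would be preferable for large $N$).

```python
from math import comb


def binomial_cdf(m, n, p):
    """P(B <= m) for B ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(m + 1))


def upper_bound_rank(alpha, beta, n):
    """Smallest rank j in 1..n whose confidence level reaches beta.

    Returns None when even the sample maximum (j = n) does not provide
    the requested confidence level.
    """
    for j in range(1, n + 1):
        # Confidence that the alpha-quantile lies below the j-th order statistic.
        if binomial_cdf(j - 1, n, alpha) >= beta:
            return j
    return None


print(upper_bound_rank(0.95, 0.95, 59))  # 59: the sample maximum
print(upper_bound_rank(0.95, 0.95, 93))  # 92: the second largest value
print(upper_bound_rank(0.95, 0.95, 58))  # None: the sample is too small
```

The last call shows that with fewer than 59 simulations no order statistic bounds the 95% quantile with 95% confidence.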
This confidence-level formula can be used in two ways:

- either directly, to determine $j(\alpha, \beta, N)$ for the values $\alpha, \beta, N$ chosen by the user,
- or in reverse, to determine the number $N$ of simulations to be carried out for the values $\alpha, \beta$ and $j(\alpha, \beta, N)$ chosen by the user; this is known as Wilks’ formula.
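The reverse use can be sketched as a simple search for the smallest sample size. In this illustrative, standard-library-only version, the upper bound is specified by its position `r` counted from the top of the sorted sample (`r = 1` for the maximum, `r = 2` for the second largest value), so that the rank is $j = N - r + 1$. The names `confidence_level`, `minimum_sample_size`, `r` and `n_max` are assumptions made for this example.

```python
from math import comb


def confidence_level(n, alpha, r):
    """Confidence that the alpha-quantile is below the r-th largest of n values.

    Equal to P(B <= n - r) for B ~ Binomial(n, alpha), computed as one minus
    the upper tail, which only has r terms.
    """
    tail = sum(
        comb(n, k) * alpha**k * (1.0 - alpha) ** (n - k)
        for k in range(n - r + 1, n + 1)
    )
    return 1.0 - tail


def minimum_sample_size(alpha, beta, r=1, n_max=100_000):
    """Smallest n for which the r-th largest value reaches confidence beta."""
    for n in range(r, n_max + 1):
        if confidence_level(n, alpha, r) >= beta:
            return n
    return None  # not reached within n_max simulations


print(minimum_sample_size(0.95, 0.95, r=1))  # 59
print(minimum_sample_size(0.95, 0.95, r=2))  # 93
print(minimum_sample_size(0.95, 0.95, r=3))  # 124
```

Computing the upper tail keeps the sum short because only the `r` largest order statistics are involved.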
For example, for $\alpha = \beta = 95\%$, we take $j = 59$ with $N = 59$ simulations (that is, the maximum value out of 59 samples) or else $j = 92$ with $N = 93$ simulations (that is, the second largest result out of the 93 simulations). For values of $N$ between 59 and 92, the upper limit is the maximum value of the sample. The following table presents the full results for $N \leq 1000$, still for $\alpha = \beta = 95\%$.
$N$ | Rank of the upper bound of the quantile | Rank of the empirical quantile |
---|---|---|
59 | 59 | 57 |
93 | 92 | 89 |
124 | 122 | 118 |
153 | 150 | 146 |
181 | 177 | 172 |
208 | 203 | 198 |
234 | 228 | 223 |
260 | 253 | 248 |
286 | 278 | 272 |
311 | 302 | 296 |
336 | 326 | 320 |
361 | 350 | 343 |
386 | 374 | 367 |
410 | 397 | 390 |
434 | 420 | 413 |
458 | 443 | 436 |
482 | 466 | 458 |
506 | 489 | 481 |
530 | 512 | 504 |
554 | 535 | 527 |
577 | 557 | 549 |
601 | 580 | 571 |
624 | 602 | 593 |
647 | 624 | 615 |
671 | 647 | 638 |
694 | 669 | 660 |
717 | 691 | 682 |
740 | 713 | 704 |
763 | 735 | 725 |
786 | 757 | 747 |
809 | 779 | 769 |
832 | 801 | 791 |
855 | 823 | 813 |
877 | 844 | 834 |
900 | 866 | 856 |
923 | 888 | 877 |
945 | 909 | 898 |
968 | 931 | 920 |
991 | 953 | 942 |
$\widehat{q}_{Y^i}(\alpha)$ is often called the “empirical $\alpha$-quantile” of the variable $Y^i$.