Estimating a quantile by Wilks’ method
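Let us denote $\underline{Y} = h(\underline{X}, \underline{d}) = (Y^1, \ldots, Y^{n_Y})$, where $\underline{X} = (X^1, \ldots, X^{n_X})$ is a random vector and $\underline{d}$ a deterministic vector. Using the probability distribution of the random vector $\underline{X}$, we seek to evaluate the $\alpha$-quantile $q_{Y^i}(\alpha)$ of $Y^i$, where $\alpha \in (0, 1)$:

$$\mathbb{P}\left( Y^i \leq q_{Y^i}(\alpha) \right) = \alpha$$

If we have a sample $\{\underline{x}_1, \ldots, \underline{x}_N\}$ of $N$ independent realizations of the random vector $\underline{X}$, $q_{Y^i}(\alpha)$ can be estimated as follows:
the sample $\{\underline{x}_1, \ldots, \underline{x}_N\}$ of vector $\underline{X}$ is first transformed to a sample $\{y^i_1, \ldots, y^i_N\}$ of the variable $Y^i$, using $\underline{y}_k = h(\underline{x}_k, \underline{d})$,
the sample $\{y^i_1, \ldots, y^i_N\}$ is then placed in ascending order, which gives the sample $\{y^{(1)}, \ldots, y^{(N)}\}$,
the empirical estimate of the quantile is then calculated by the formula:

$$\widehat{q}_{Y^i}(\alpha) = y^{([N\alpha]+1)}$$

where $[N\alpha]$ denotes the integer part of $N\alpha$.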
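For example, if $N = 100$ and $\alpha = 0.95$, $\widehat{q}_{Y^i}(0.95)$ is equal to $y^{(96)}$, which is the 5th largest value of the sample $\{y^i_1, \ldots, y^i_N\}$. We note that this estimate has no meaning unless $1/N \leq \alpha \leq 1 - 1/N$. For example, if $N = 100$, one can only consider values of $\alpha$ between 1% and 99%.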
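The estimation steps above can be sketched in plain Python. This is a minimal illustration, not a library implementation: the function name `empirical_quantile` is hypothetical, and the input is assumed to be the already-transformed sample of the scalar output variable.

```python
import math


def empirical_quantile(values, alpha):
    """Return the order statistic of rank [N*alpha] + 1 (hypothetical helper)."""
    n = len(values)
    if n == 0:
        raise ValueError("the sample must not be empty")
    # The estimate is only meaningful for 1/N <= alpha <= 1 - 1/N.
    if not (1.0 / n <= alpha <= 1.0 - 1.0 / n):
        raise ValueError("alpha must lie in [1/N, 1 - 1/N]")
    ordered = sorted(values)            # ascending order statistics
    rank = math.floor(n * alpha) + 1    # 1-based rank [N*alpha] + 1
    return ordered[rank - 1]            # convert to 0-based indexing


# Illustrative sample: the integers 1..100, deliberately unsorted.
sample = list(range(100, 0, -1))
print(empirical_quantile(sample, 0.95))
```

With this sample and a level of 0.95 the selected rank is 96, so the call returns the value 96. Because the product of the sample size and the level is evaluated in floating point, a product that should be an exact integer can occasionally round down, which shifts the selected rank by one.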
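It is also possible to calculate an upper limit for the quantile with a confidence level $\beta$ chosen by the user; one can then state with confidence level $\beta$ that the true value of $q_{Y^i}(\alpha)$ is less than or equal to $\widehat{q}_{Y^i}(\alpha)_{\sup}$:

$$\mathbb{P}\left( q_{Y^i}(\alpha) \leq \widehat{q}_{Y^i}(\alpha)_{\sup} \right) = \beta$$

The most robust method for calculating this upper limit consists of taking $\widehat{q}_{Y^i}(\alpha)_{\sup} = y^{(j(\alpha,\beta,N))}$, where $j(\alpha,\beta,N)$ is an integer between 2 and $N$ found by solving the equation:

$$\sum_{k=0}^{j(\alpha,\beta,N)-1} \binom{N}{k} \alpha^k (1-\alpha)^{N-k} = \beta$$

A solution does not necessarily exist, i.e. there may be no integer value of $j(\alpha,\beta,N)$ satisfying this equality; in that case one can choose the smallest integer such that:

$$\sum_{k=0}^{j(\alpha,\beta,N)-1} \binom{N}{k} \alpha^k (1-\alpha)^{N-k} > \beta$$

which ensures that $\mathbb{P}\left( q_{Y^i}(\alpha) \leq \widehat{q}_{Y^i}(\alpha)_{\sup} \right) > \beta$; in other words, the level of confidence of the quantile estimation is greater than that initially required.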
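As a sketch of the direct use of this criterion, the following Python function accumulates the binomial sum term by term and returns the smallest rank for which it reaches the required confidence level. The name `upper_bound_rank` is hypothetical, and returning `None` when the sample is too small is a choice made for this illustration only.

```python
from math import comb


def upper_bound_rank(alpha, beta, n):
    """Smallest 1-based rank j such that
    sum_{k=0}^{j-1} C(n, k) alpha^k (1 - alpha)^(n - k) >= beta,
    or None when even j = n does not reach beta (hypothetical helper).
    """
    if not (0.0 < alpha < 1.0 and 0.0 < beta < 1.0):
        raise ValueError("alpha and beta must lie strictly between 0 and 1")
    cumulative = 0.0
    for j in range(1, n + 1):
        k = j - 1
        # Add the k-th binomial term; plain floats are adequate for
        # moderate n (up to about a thousand).
        cumulative += comb(n, k) * alpha**k * (1.0 - alpha) ** (n - k)
        if cumulative >= beta:
            return j
    return None


print(upper_bound_rank(0.95, 0.95, 59))  # rank of the upper bound for N = 59
print(upper_bound_rank(0.95, 0.95, 58))  # sample too small: None
```

For a quantile level and confidence level both equal to 95%, a sample of 59 values gives rank 59 (the sample maximum), while a sample of 58 values is too small for any rank to reach the required confidence.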
This confidence bound formula can be used in two ways:
either directly, to determine $j(\alpha,\beta,N)$ for the values $\alpha$, $\beta$ and $N$ chosen by the user,
or in reverse, to determine the number $N$ of simulations to be carried out for the values $\alpha$, $\beta$ and $j(\alpha,\beta,N)$ chosen by the user; this is known as Wilks’ formula.
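The reverse use can be sketched as a simple search over the sample size. In this illustration the rank is specified through a hypothetical parameter `order`, the number of sample values allowed to exceed the bound (0 for the maximum, 1 for the second largest value, and so on); the function name `wilks_sample_size` is likewise an assumption of this sketch.

```python
from math import comb


def wilks_sample_size(alpha, beta, order=0):
    """Smallest N such that the order statistic of rank N - order bounds the
    alpha-quantile from above with confidence at least beta (hypothetical helper).
    """
    if not (0.0 < alpha < 1.0 and 0.0 < beta < 1.0):
        raise ValueError("alpha and beta must lie strictly between 0 and 1")
    if order < 0:
        raise ValueError("order must be non-negative")
    n = order + 1
    while True:
        j = n - order  # 1-based rank of the candidate upper bound
        confidence = sum(
            comb(n, k) * alpha**k * (1.0 - alpha) ** (n - k) for k in range(j)
        )
        # For a fixed order the confidence increases with n, so the first
        # n that reaches beta is the smallest one.
        if confidence >= beta:
            return n
        n += 1


print(wilks_sample_size(0.95, 0.95, order=0))
print(wilks_sample_size(0.95, 0.95, order=1))
```

For quantile and confidence levels of 95%, the search returns 59 simulations when the maximum is used and 93 when the second largest value is used.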
For example, for $\alpha = \beta = 95\%$, we take $j = 59$ with $N = 59$ simulations (that is, the maximum value out of 59 samples) or else $j = 92$ with $N = 93$ simulations (that is, the second largest result out of the 93 samples). For values of $N$ between 59 and 92, the upper limit is the maximum value of the sample. The following table presents the full results for $N \leq 1000$, still for $\alpha = \beta = 95\%$.
$N$ | Rank of the upper bound of the quantile | Rank of the empirical quantile |
---|---|---|
59 | 59 | 57 |
93 | 92 | 89 |
124 | 122 | 118 |
153 | 150 | 146 |
181 | 177 | 172 |
208 | 203 | 198 |
234 | 228 | 223 |
260 | 253 | 248 |
286 | 278 | 272 |
311 | 302 | 296 |
336 | 326 | 320 |
361 | 350 | 343 |
386 | 374 | 367 |
410 | 397 | 390 |
434 | 420 | 413 |
458 | 443 | 436 |
482 | 466 | 458 |
506 | 489 | 481 |
530 | 512 | 504 |
554 | 535 | 527 |
577 | 557 | 549 |
601 | 580 | 571 |
624 | 602 | 593 |
647 | 624 | 615 |
671 | 647 | 638 |
694 | 669 | 660 |
717 | 691 | 682 |
740 | 713 | 704 |
763 | 735 | 725 |
786 | 757 | 747 |
809 | 779 | 769 |
832 | 801 | 791 |
855 | 823 | 813 |
877 | 844 | 834 |
900 | 866 | 856 |
923 | 888 | 877 |
945 | 909 | 898 |
968 | 931 | 920 |
991 | 953 | 942 |
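
The first row of the table can be checked by simulation. The sketch below is an illustration under stated assumptions, not part of the method itself: it draws repeated samples from the uniform distribution on (0, 1), whose quantile at any level equals that level, and counts how often the order statistic of a given rank lies at or above the true quantile. The function name `coverage_of_rank`, the seed and the number of replications are arbitrary choices.

```python
import random


def coverage_of_rank(alpha, n, j, replications, seed=0):
    """Monte Carlo estimate of P(q(alpha) <= y^(j)) for samples of size n
    drawn from U(0, 1), where the true alpha-quantile is alpha itself
    (hypothetical helper).
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    hits = 0
    for _ in range(replications):
        ordered = sorted(rng.random() for _ in range(n))
        if ordered[j - 1] >= alpha:  # j is a 1-based rank
            hits += 1
    return hits / replications


estimate = coverage_of_rank(0.95, 59, 59, 20000)
exact = 1.0 - 0.95**59  # binomial sum for j = N = 59
print(estimate, exact)
```

For a sample of 59 values and rank 59, the estimated coverage should be close to the exact value of about 0.9515, slightly above the requested 95%.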
$\widehat{q}_{Y^i}(\alpha)$ is often called the “empirical $\alpha$-quantile” for the variable $Y^i$.