Estimate quantile confidence intervalsΒΆ

In this example, we introduce two methods to estimate a confidence interval of the \alpha level quantile (\alpha \in [0,1]) of the distribution of a random variable X of dimension 1. Both methods use the order statistics to estimate:

  • an asymptotic confidence interval with confidence level \beta \in [0,1],

  • an exact upper bounded confidence interval with confidence level \beta \in [0,1].

In this example, we consider the quantile of level \alpha = 95\%, with a confidence level of \beta = 90\%.

See Exact quantile confidence interval based on order statistics and Asymptotic quantile confidence interval based on order statistics to get details on the signification of these confidence interval.

import openturns as ot
import openturns.experimental as otexp
import math as m

We consider a random vector which is the output of a model and an input distribution.

model = ot.SymbolicFunction(["x1", "x2"], ["x1^2+x2"])
R = ot.CorrelationMatrix(2)
R[0, 1] = -0.6
inputDist = ot.Normal([0.0, 0.0], R)
inputDist.setDescription(["X1", "X2"])
inputVector = ot.RandomVector(inputDist)

Create the output random vector

output = ot.CompositeRandomVector(model, inputVector)

We define the level \alpha of the quantile and the confidence level \beta.

alpha = 0.95
beta = 0.90

We generate a sample of the variable.

n = 10**4
sample = output.getSample(n)

We get the empirical estimator of the \alpha level quantile which is the \lfloor \sampleSize \alpha \rfloor -th order statistics evaluated on the sample.

empiricalQuantile = sample.computeQuantile(alpha)
print(empiricalQuantile)
[4.31123]

The asymptotic confidence interval of level \beta is \left[ X_{(i_n)}, X_{(j_n)}\right] such that:

i_\sampleSize & = \left\lfloor \sampleSize \alpha - \sqrt{\sampleSize} \; z_{\frac{1+\beta}{2}} \; \sqrt{\alpha(1 - \alpha)} \right\rfloor\\
j_\sampleSize & = \left\lfloor \sampleSize \alpha + \sqrt{\sampleSize} \; z_{\frac{1+\beta}{2}} \;  \sqrt{\alpha(1 - \alpha)} \right\rfloor

where z_{\frac{1+\beta}{2}} is the \frac{1+\beta}{2} level quantile of the standard normal distribution (see [delmas2006] proposition 11.1.13).

Then we have:

\lim\limits_{\sampleSize \rightarrow +\infty} \Prob{x_{\alpha} \in \left[ X_{(i_\sampleSize,\sampleSize)}, X_{(j_\sampleSize,\sampleSize)}\right]} = \beta

a_beta = ot.Normal(1).computeQuantile((1.0 + beta) / 2.0)[0]
i_n = int(n * alpha - a_beta * m.sqrt(n * alpha * (1.0 - alpha)))
j_n = int(n * alpha + a_beta * m.sqrt(n * alpha * (1.0 - alpha)))
print(i_n, j_n)
9464 9535

Get the sorted sample

sortedSample = sample.sort()

Get the asymptotic confidence interval \left[ X_{(i_n)}, X_{(j_n)}\right].

Care: the index in the sorted sample is i_n-1 and j_n-1 because python numbering starts at 0.

infQuantile = sortedSample[i_n - 1]
supQuantile = sortedSample[j_n - 1]
print(infQuantile, empiricalQuantile, supQuantile)
[4.20242] [4.31123] [4.4304]

This can be done using QuantileConfidence when the ranks i_n and j_n are directly given in \llbracket 0, \sampleSize-1 \rrbracket.

algo = otexp.QuantileConfidence(alpha, beta)
i_n, j_n = algo.computeAsymptoticBilateralRank(n)
ci = algo.computeAsymptoticBilateralConfidenceInterval(sample)
print(f"asymptotic ranks={[i_n, j_n]} CI={ci}")
asymptotic ranks=[9463, 9534] CI=[4.20242, 4.4304]

The empirical quantile was estimated with the \lfloor \sampleSize\alpha \rfloor -th order statistics evaluated on the sample of size \sampleSize. We define i = \sampleSize-\lfloor \sampleSize\alpha \rfloor and we evaluate the minimum sample size \tilde{\sampleSize} that ensures that the (\tilde{\sampleSize}-i) order statistics is greater than x_{\alpha} with the confidence \beta.

i = n - int(n * alpha)
minSampleSize = algo.computeUnilateralMinimumSampleSize(i, True)
print(minSampleSize)
10583

Now we can evaluate the upper bounded confidence interval: we generate a sample with the previous minimum size and extract the empirical quantile of order (\tilde{\sampleSize}-i).

sample = output.getSample(minSampleSize)
upperBoundQuantile = sample.sort()[-i - 1]
print(upperBoundQuantile)
[4.44384]

We can also find this rank back from the size.

k = algo.computeUnilateralRank(minSampleSize, True)
print(k, minSampleSize - i - 1)
10024 10082

The quantile bound is given in the same manner from the sample and the rank.

ci = algo.computeUnilateralConfidenceInterval(sample, True)
print(ci)
[4.25546, (1.79769e+308) +inf[