Estimate a confidence interval of a quantileΒΆ

In this example, we introduce two methods to estimate a confidence interval of the \alpha level quantile (\alpha \in [0,1]) of the distribution of a scalar random variable X. Both methods use the order statistics to estimate:

  • an asymptotic confidence interval with confidence level \beta \in [0,1],

  • an exact upper bounded confidence interval with confidence level \beta \in [0,1].

In this example, we consider the quantile of level \alpha = 95\%, with a confidence level of \beta = 90\%.

import openturns as ot
import openturns.experimental as otexp
import math as m

We consider a random vector which is the output of a model and an input distribution.

model = ot.SymbolicFunction(["x1", "x2"], ["x1^2+x2"])
R = ot.CorrelationMatrix(2)
R[0, 1] = -0.6
inputDist = ot.Normal([0.0, 0.0], R)
inputDist.setDescription(["X1", "X2"])
inputVector = ot.RandomVector(inputDist)

Create the output random vector

output = ot.CompositeRandomVector(model, inputVector)

We define the level \alpha of the quantile and the confidence level \beta.

alpha = 0.95
beta = 0.90

We generate a sample of the variable.

n = 10**4
sample = output.getSample(n)

We get the empirical estimator of the \alpha level quantile which is the \lfloor \sampleSize \alpha \rfloor -th order statistics evaluated on the sample.

empiricalQuantile = sample.computeQuantile(alpha)
print(empiricalQuantile)
[4.31123]

The asymptotic confidence interval of level \beta is \left[ X_{(i_n)}, X_{(j_n)}\right] such that:

i_\sampleSize & = \left\lfloor \sampleSize \alpha - \sqrt{\sampleSize} \; z_{\frac{1+\beta}{2}} \; \sqrt{\alpha(1 - \alpha)} \right\rfloor\\
j_\sampleSize & = \left\lfloor \sampleSize \alpha + \sqrt{\sampleSize} \; z_{\frac{1+\beta}{2}} \;  \sqrt{\alpha(1 - \alpha)} \right\rfloor

where z_{\frac{1+\beta}{2}} is the \frac{1+\beta}{2} level quantile of the standard normal distribution (see [delmas2006] proposition 11.1.13).

Then we have:

\lim\limits_{\sampleSize \rightarrow +\infty} \Prob{x_{\alpha} \in \left[ X_{(i_\sampleSize,\sampleSize)}, X_{(j_\sampleSize,\sampleSize)}\right]} = \beta

a_beta = ot.Normal(1).computeQuantile((1.0 + beta) / 2.0)[0]
i_n = int(n * alpha - a_beta * m.sqrt(n * alpha * (1.0 - alpha)))
j_n = int(n * alpha + a_beta * m.sqrt(n * alpha * (1.0 - alpha)))
print(i_n, j_n)
9464 9535

Get the sorted sample

sortedSample = sample.sort()

Get the asymptotic confidence interval \left[ X_{(i_n)}, X_{(j_n)}\right] Care: the index in the sorted sample is i_n-1 and j_n-1

infQuantile = sortedSample[i_n - 1]
supQuantile = sortedSample[j_n - 1]
print(infQuantile, empiricalQuantile, supQuantile)
[4.20242] [4.31123] [4.4304]

This can be done using QuantileConfidence when the ranks i_n and j_n are directly given in [0, n-1]

algo = otexp.QuantileConfidence(alpha, beta)
i_n, j_n = algo.computeAsymptoticBilateralRank(n)
ci = algo.computeAsymptoticBilateralConfidenceInterval(sample)
print(f"asymptotic ranks={[i_n, j_n]} CI={ci}")
asymptotic ranks=[9463, 9534] CI=[4.20242, 4.4304]

The empirical quantile was estimated with the \lfloor \sampleSize\alpha \rfloor -th order statistics evaluated on the sample of size \sampleSize. We define i = \sampleSize-\lfloor \sampleSize\alpha \rfloor and we evaluate the minimum sample size \tilde{\sampleSize} that ensures that the (\tilde{\sampleSize}-i) order statistics is greater than x_{\alpha} with the confidence \beta.

i = n - int(n * alpha)
minSampleSize = algo.computeUnilateralMinimumSampleSize(i, True)
print(minSampleSize)
10583

Now we can evaluate the upper bounded confidence interval: we generate a sample with the previous minimum size and extract the empirical quantile of order (\tilde{\sampleSize}-i).

sample = output.getSample(minSampleSize)
upperBoundQuantile = sample.sort()[-i - 1]
print(upperBoundQuantile)
[4.44384]

We can also find this rank back from the size

k = algo.computeUnilateralRank(minSampleSize, True)
print(k, minSampleSize - i - 1)
10024 10082

The quantile bound is given in the same manner from the sample and the rank

ci = algo.computeUnilateralConfidenceInterval(sample, True)
print(ci)
[4.25546, (1.79769e+308) +inf[