QQ-plot

The Quantile-Quantile plot (QQ-plot) is a graphical tool for checking whether two given samples are drawn from the same continuous distribution of dimension 1.

We denote by \left\{ x_1,\ldots,x_{\sampleSize} \right\} and \left\{ y_1,\ldots,y_{\sampleSize} \right\} two given samples of dimension 1.

A QQ-plot is based on the comparison of empirical quantiles. Let q_{X}(\alpha) be the quantile of order \alpha of the distribution with cumulative distribution function F, with \alpha \in (0, 1). It is defined by:

\begin{aligned}
    q_{X}(\alpha) = \inf \{ x \in \Rset \, |\, F(x) \geq \alpha \}
  \end{aligned}
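
As a worked illustration (the exponential distribution is chosen here only as an example and is not part of the original text), take the unit-rate exponential distribution, for which F(x) = 1 - e^{-x} for x \geq 0. Since F is continuous and strictly increasing on [0, +\infty), the infimum is reached where F(x) = \alpha, which gives:

\begin{aligned}
    q_{X}(\alpha) = -\ln(1 - \alpha)
\end{aligned}

In particular, the median is q_{X}(1/2) = \ln 2 \approx 0.693.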

The empirical quantile of order \alpha built on the sample is defined by:

\begin{aligned}
        \widehat{q}_{X}(\alpha) = x_{([\sampleSize \alpha]+1)}
\end{aligned}

where [\sampleSize\alpha] denotes the integer part of \sampleSize \alpha and \left\{ x_{(1)},\ldots,x_{(\sampleSize)} \right\} is the sample sorted in ascending order:

x_{(1)} \leq \dots \leq x_{(\sampleSize)}

Thus, the j^\textrm{th} smallest value of the sample, x_{(j)}, is an estimate \widehat{q}_{X}(\alpha) of the \alpha-quantile where \alpha = (j-1)/\sampleSize, for 1 < j \leq \sampleSize.
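
The empirical quantile defined above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the function name `empirical_quantile` and the sample values are hypothetical, and the code is not the implementation used by any particular library.

```python
import math


def empirical_quantile(sample, alpha):
    """Return x_([n * alpha] + 1), the empirical quantile of order alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    ordered = sorted(sample)  # x_(1) <= ... <= x_(n)
    n = len(ordered)
    # 1-based rank [n * alpha] + 1; min() guards against floating-point
    # round-off pushing the rank past n when alpha is very close to 1.
    rank = min(math.floor(n * alpha) + 1, n)
    return ordered[rank - 1]  # convert the 1-based rank to a 0-based index


# Hypothetical sample of size n = 5
sample = [2.5, 0.3, 1.7, 4.1, 3.2]
print(empirical_quantile(sample, 0.5))  # 2.5, since [5 * 0.5] + 1 = 3
```

Note that when \alpha is computed as (j-1)/\sampleSize in floating-point arithmetic, the product \sampleSize \alpha may fall slightly below j-1, so exact ranks are more safely obtained by indexing the sorted sample directly.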

The QQ-plot displays the pairs of empirical quantiles of the same order from both samples: (x_{(j)}, y_{(j)})_{1 < j \leq \sampleSize}. If both samples follow the same distribution, then the points should be close to the diagonal.
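
The points of the QQ-plot can be built by sorting both samples and pairing the values of equal rank. The sketch below assumes two samples of the same size, as in the notation above; the function name `qq_pairs` and the data are hypothetical and serve only as an illustration.

```python
def qq_pairs(x, y):
    """Return the pairs (x_(j), y_(j)) for 1 < j <= n, as plain tuples."""
    if len(x) != len(y):
        raise ValueError("both samples must have the same size")
    pairs = list(zip(sorted(x), sorted(y)))  # pair values of equal rank
    return pairs[1:]  # drop j = 1, which corresponds to alpha = 0


# Hypothetical samples of size n = 3
x = [3.0, 1.0, 2.0]
y = [2.9, 1.2, 2.1]
for xj, yj in qq_pairs(x, y):
    print(xj, yj)
```

The resulting pairs can then be passed to any plotting library, together with the diagonal as a visual reference.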

The following figure illustrates a QQ-plot with two samples of size \sampleSize=50. In this example, the points remain close to the diagonal, so the plot does not contradict the hypothesis “Both samples are drawn from the same distribution”. A more quantitative analysis should nevertheless be carried out to confirm this.


../../_images/qqplot_graph-1.png


../../_images/qqplot_graph-2.png

In this second example, the two samples clearly arise from two different distributions.