Anderson-Darling goodness-of-fit test
This method deals with the modelling of the probability distribution of a random vector $\underline{X} = (X^1, \ldots, X^{n_X})$. It seeks to verify the compatibility between a sample of data $\{\underline{x}_1, \ldots, \underline{x}_N\}$ and a previously chosen candidate probability distribution. The Anderson-Darling goodness-of-fit test answers this question in the one-dimensional case $n_X = 1$, for a continuous distribution. The current version is limited to the case of the Normal distribution.
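Let us limit the case to $n_X = 1$. Thus we denote $\underline{X} = X^1 = X$. This goodness-of-fit test is based on the distance between the empirical cumulative distribution function $\widehat{F}_N$ of the sample and the cumulative distribution function of the candidate distribution, denoted $F$. This distance is of quadratic type, as in the Cramér-von Mises test, but gives more weight to deviations in the tails of the distribution:
$$D = \int_{-\infty}^{+\infty} \frac{\left[F(x) - \widehat{F}_N(x)\right]^2}{F(x)\left(1 - F(x)\right)} \, dF(x)$$
With a sample $\{x_1, \ldots, x_N\}$, the distance is estimated, up to the scaling factor $N$, by:
$$\widehat{D}_N = -N - \sum_{i=1}^{N} \frac{2i-1}{N} \left[\ln F(x_{(i)}) + \ln\left(1 - F(x_{(N+1-i)})\right)\right]$$
where $\{x_{(1)}, \ldots, x_{(N)}\}$ describes the sample placed in increasing order.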
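As an illustration, the sorted-sample form of the Anderson-Darling statistic can be sketched in plain Python. This is a minimal sketch and not the library's implementation: the function name `anderson_darling_statistic`, the synthetic sample and the use of `statistics.NormalDist` as a fully specified Normal candidate are assumptions made for the example.

```python
import math
import random
from statistics import NormalDist


def anderson_darling_statistic(sample, cdf):
    """Anderson-Darling statistic of `sample` against a candidate CDF.

    Computes -N - (1/N) * sum_{i=1..N} (2i - 1) *
    [ln F(x_(i)) + ln(1 - F(x_(N+1-i)))] on the sorted sample.
    Assumes 0 < F(x) < 1 at every sample point (otherwise log fails).
    """
    x = sorted(sample)  # order statistics x_(1) <= ... <= x_(N)
    n = len(x)
    total = 0.0
    for i in range(1, n + 1):
        f_low = cdf(x[i - 1])   # F(x_(i)), 1-based index i
        f_high = cdf(x[n - i])  # F(x_(N+1-i))
        total += (2 * i - 1) * (math.log(f_low) + math.log(1.0 - f_high))
    return -n - total / n


# Hypothetical data: 200 draws from a standard Normal, fixed seed.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(200)]

# Candidate distribution with known parameters (an assumption of this sketch).
candidate = NormalDist(mu=0.0, sigma=1.0)
statistic = anderson_darling_statistic(sample, candidate.cdf)
print("Anderson-Darling statistic:", statistic)
```

The logarithms put the weight on the tails: a sample point that the candidate considers very unlikely drives $\ln F$ or $\ln(1 - F)$ towards large negative values and inflates the statistic.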
The probability distribution of the distance $\widehat{D}_N$ is asymptotically known (i.e. as the size $N$ of the sample tends to infinity). If $N$ is sufficiently large, this means that for a probability $\alpha$ and a candidate distribution type, one can calculate the threshold (critical value) $d_\alpha$ such that:
if $\widehat{D}_N > d_\alpha$, we reject the candidate distribution with a risk of error $\alpha$,
if $\widehat{D}_N \leq d_\alpha$, the candidate distribution is considered acceptable.
Note that $d_\alpha$ depends on the candidate distribution being tested; the current version is limited to the case of the Normal distribution.
An important notion is the so-called “$p$-value” of the test. This quantity is the limit error probability $\alpha_{\mathrm{lim}}$: the candidate distribution is rejected for any risk of error $\alpha$ above it. Thus, the candidate distribution will be accepted if and only if $\alpha_{\mathrm{lim}}$ is greater than the value $\alpha$ desired by the user. Note that the larger the margin $\alpha_{\mathrm{lim}} - \alpha$, the more robust the decision to accept.
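The decision rule based on the p-value can be sketched as follows. This is an illustration under stated assumptions, not the library's method: the helper names `ad_normal_p_value` and `is_accepted` are hypothetical, the Normal parameters are estimated from the sample, and the p-value comes from a widely quoted piecewise approximation (usually attributed to D'Agostino and Stephens) applied to the small-sample corrected statistic. The library may compute its p-value differently.

```python
import math
from statistics import NormalDist, fmean, stdev


def ad_normal_p_value(sample):
    """Return (statistic, approximate p-value) for a Normal candidate.

    Assumption: mean and standard deviation are estimated from the sample,
    and the p-value uses a published piecewise approximation.
    """
    x = sorted(sample)
    n = len(x)
    fitted = NormalDist(fmean(x), stdev(x))
    total = sum(
        (2 * i - 1)
        * (math.log(fitted.cdf(x[i - 1])) + math.log(1.0 - fitted.cdf(x[n - i])))
        for i in range(1, n + 1)
    )
    a2 = -n - total / n
    # Small-sample correction used with estimated parameters.
    a_star = a2 * (1.0 + 0.75 / n + 2.25 / n**2)
    if a_star >= 0.6:
        p = math.exp(1.2937 - 5.709 * a_star + 0.0186 * a_star**2)
    elif a_star >= 0.34:
        p = math.exp(0.9177 - 4.279 * a_star - 1.38 * a_star**2)
    elif a_star >= 0.2:
        p = 1.0 - math.exp(-8.318 + 42.796 * a_star - 59.938 * a_star**2)
    else:
        p = 1.0 - math.exp(-13.436 + 101.14 * a_star - 223.73 * a_star**2)
    return a2, min(1.0, max(0.0, p))  # clamp the approximation to [0, 1]


def is_accepted(p_value, alpha=0.05):
    """Accept the candidate if and only if the p-value exceeds alpha."""
    return p_value > alpha


# Two hypothetical, deterministic samples built from quantiles:
# one close to Normal, one strongly skewed (exponential-like).
normal_like = [NormalDist().inv_cdf((i - 0.5) / 100) for i in range(1, 101)]
skewed = [-math.log(1.0 - (i - 0.5) / 100) for i in range(1, 101)]

for name, data in (("normal-like", normal_like), ("skewed", skewed)):
    stat, p_value = ad_normal_p_value(data)
    print(name, "statistic:", stat, "p-value:", p_value,
          "accepted at 5%:", is_accepted(p_value, 0.05))
```

Comparing the p-value with $\alpha$ gives the same decision as comparing the statistic with the critical value, but the p-value also shows how far the sample is from the decision boundary.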
Examples: