Anderson-Darling goodness-of-fit test
This method deals with the modelling of the probability distribution of a
random vector $X = (X^1, \ldots, X^{n_X})$. It
seeks to verify the compatibility between a sample of data
$\{x_1, \ldots, x_N\}$ and a
candidate probability distribution previously chosen. The Anderson-Darling
goodness-of-fit test allows one to answer this
question in the one-dimensional case $n_X = 1$, and with a
continuous distribution. The current version is limited to the case of
the Normal distribution.
Let us limit the case to $n_X = 1$, and denote $X = X^1$.
This goodness-of-fit test is based on the
distance between the empirical cumulative distribution function
$\widehat{F}_N$ of the sample $\{x_1, \ldots, x_N\}$
and the cumulative distribution function $F$ of the
candidate distribution. This distance is of quadratic
type, as in the Cramer-von Mises test,
but gives more weight to deviations on extreme values:

$$D = \int_{-\infty}^{+\infty} \frac{\left[ \widehat{F}_N(x) - F(x) \right]^2}{F(x)\left( 1 - F(x) \right)} \, dF(x)$$
With a sample $\{x_1, \ldots, x_N\}$, the distance
is estimated by:

$$\widehat{D}_N = -N - \sum_{i=1}^{N} \frac{2i-1}{N} \left[ \ln F\left(x_{(i)}\right) + \ln\left( 1 - F\left(x_{(N+1-i)}\right) \right) \right]$$

where $\{x_{(1)}, \ldots, x_{(N)}\}$ denotes the
sample sorted in increasing order.
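To make the estimator concrete, the sketch below evaluates $\widehat{D}_N$ directly from this formula for a fully specified Normal candidate. The function name ad_statistic and the simulated sample are illustrative assumptions made for this sketch, not part of the method itself.

```python
import numpy as np
from scipy.stats import norm

def ad_statistic(sample, cdf):
    """Estimate the Anderson-Darling distance D_N for a candidate CDF."""
    x = np.sort(np.asarray(sample))   # x_(1) <= ... <= x_(N)
    n = len(x)
    i = np.arange(1, n + 1)
    f = cdf(x)                        # F(x_(i))
    # D_N = -N - sum_i (2i-1)/N * [ln F(x_(i)) + ln(1 - F(x_(N+1-i)))]
    return -n - np.sum((2 * i - 1) / n * (np.log(f) + np.log(1 - f[::-1])))

# Example: test a sample against a fully specified Normal(0, 1) candidate
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=100)
d_n = ad_statistic(sample, norm(loc=0.0, scale=1.0).cdf)
print(f"Anderson-Darling statistic D_N = {d_n:.4f}")
```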
The probability distribution of the distance $\widehat{D}_N$ is
asymptotically known (i.e. as the size of the sample tends to infinity).
If $N$
is sufficiently large, this means that for a given probability $\alpha$
and a given candidate distribution type, one can calculate the
threshold / critical value $d_\alpha$
such that:
if $\widehat{D}_N > d_\alpha$,
we reject the candidate distribution with a risk of error
$\alpha$,
if $\widehat{D}_N \leq d_\alpha$,
the candidate distribution is considered acceptable.
Note that $d_\alpha$ depends on the candidate distribution
being tested; the current version is limited to
the case of the Normal distribution.
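For the Normal candidate, tabulated critical values can be used in place of the asymptotic distribution. A possible sketch relies on scipy.stats.anderson, which estimates the Normal parameters from the sample and adjusts the critical values accordingly, so its statistic is not strictly identical to the plain known-parameter formula above.

```python
import numpy as np
from scipy.stats import anderson

rng = np.random.default_rng(0)
sample = rng.normal(loc=2.0, scale=3.0, size=200)

result = anderson(sample, dist='norm')
print(f"statistic D_N = {result.statistic:.4f}")
# Compare the statistic to the tabulated critical value d_alpha
# at each available risk level alpha
for crit, level in zip(result.critical_values, result.significance_level):
    decision = "reject" if result.statistic > crit else "accept"
    print(f"alpha = {level / 100:.3f}: d_alpha = {crit:.3f} -> {decision}")
```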
An important notion is the so-called "p-value" of the test. This
quantity is equal to the limit error probability
$\alpha_\text{lim}$ under which the candidate distribution is
rejected. Thus, the candidate distribution will be accepted if and only
if $\alpha_\text{lim}$
is greater than the value $\alpha$
desired by the user. Note that the larger
$\alpha_\text{lim} - \alpha$, the more robust the decision.
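As an illustration of the p-value decision rule, the sketch below approximates $\alpha_\text{lim}$ by Monte Carlo. It assumes SciPy 1.10 or later, whose scipy.stats.goodness_of_fit simulates the null distribution of the Anderson-Darling statistic for a Normal candidate fitted to the data; the sample and the chosen risk level are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=2.0, scale=3.0, size=200)

# Monte Carlo approximation of the p-value (alpha_lim) of the
# Anderson-Darling test for a Normal candidate fitted to the sample.
res = stats.goodness_of_fit(stats.norm, sample, statistic='ad',
                            n_mc_samples=2000, random_state=123)
alpha = 0.05  # risk of error chosen by the user
print(f"p-value (alpha_lim) = {res.pvalue:.3f}")
print("candidate accepted" if res.pvalue > alpha else "candidate rejected")
```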