Chi-squared test
The $\chi^2$ test is a statistical test of whether a given sample of data is drawn
from a given discrete distribution. The library only provides the $\chi^2$ test for
distributions of dimension 1.
We denote by $\{x_1, \dots, x_n\}$ a sample of dimension 1.
Let $F$ be the (unknown) cumulative distribution function of the discrete distribution.
We want to test whether the sample is drawn from the discrete distribution characterized by the
probabilities $\{p_i(\theta)\}_{1 \leq i \leq k}$, where $\theta$ is the set of parameters of the
distribution and $E = \{e_i, 1 \leq i \leq k\}$ is its support. Let $G$ be the cumulative
distribution function of this candidate distribution.
This test involves the calculation of the test statistic, which is
the distance between the empirical number of values equal to $e_i$ in the sample and the
theoretical mean one evaluated from the discrete distribution.
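Let $X_1, \dots, X_n$ be i.i.d. random variables following the distribution with CDF $F$.
According to the tested distribution $G$, the theoretical mean number of values equal to $e_i$
is $n p_i(\theta)$, whereas the number evaluated from $X_1, \dots, X_n$
is $N_i = \sum_{j=1}^{n} 1_{\{X_j = e_i\}}$.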
Then the test statistic is defined by:
$$D_n = \sum_{i=1}^{k} \frac{\left(n p_i(\theta) - N_i\right)^2}{N_i}$$
If some values $e_i$ have not been observed in the sample, we have to gather values in
classes so that they contain at least 5 data points (empirical rule). Then the theoretical
probabilities of all the values in the class are added to get the
theoretical probability of the class.
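
The gathering step can be sketched as follows. This is only an illustration: the helper name `group_classes` is hypothetical, and merging adjacent support values from left to right is an assumption, since the text does not say how the classes are chosen. The only rules taken from the text are the threshold of 5 data points and the summation of the probabilities inside a class.

```python
def group_classes(counts, probs, min_count=5):
    """Merge adjacent support values until each class holds >= min_count points.

    counts[i] is the observed number of values equal to e_i and probs[i] the
    theoretical probability p_i; both lists follow the order of the support.
    """
    grouped_counts, grouped_probs = [], []
    acc_count, acc_prob = 0, 0.0
    for c, p in zip(counts, probs):
        acc_count += c
        acc_prob += p  # probabilities of the merged values are added
        if acc_count >= min_count:
            grouped_counts.append(acc_count)
            grouped_probs.append(acc_prob)
            acc_count, acc_prob = 0, 0.0
    # A leftover tail that is still too small is merged into the last class.
    if acc_count > 0 or acc_prob > 0.0:
        if grouped_counts:
            grouped_counts[-1] += acc_count
            grouped_probs[-1] += acc_prob
        else:
            grouped_counts.append(acc_count)
            grouped_probs.append(acc_prob)
    return grouped_counts, grouped_probs


# Made-up counts: the last three support values are too rarely observed.
counts = [7, 12, 6, 3, 0, 2]
probs = [0.20, 0.35, 0.25, 0.10, 0.06, 0.04]
new_counts, new_probs = group_classes(counts, probs)
```

With these made-up numbers, the last three values end up in a single class whose probability is the sum of their three probabilities.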
Let $d_n$ be the realization of the test statistic $D_n$ on the sample $\{x_1, \dots, x_n\}$.
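
As a minimal sketch of how such a realization might be computed by hand, the snippet below counts the occurrences of each support value and sums the normalized squared gaps. It assumes the statistic divides each squared gap by the observed count $N_i$; the more common Pearson variant divides by the expected count $n p_i$ instead. The function name `chi2_statistic` and the data are made up for illustration and are not part of the library.

```python
from collections import Counter


def chi2_statistic(sample, support, probs):
    """Realization d_n of the statistic, assuming the form sum (n p_i - N_i)^2 / N_i."""
    n = len(sample)
    occurrences = Counter(sample)
    d_n = 0.0
    for e_i, p_i in zip(support, probs):
        n_i = occurrences[e_i]  # observed number of values equal to e_i
        if n_i == 0:
            # With this form of the statistic an unobserved value breaks the
            # division, so values must be gathered into classes beforehand.
            raise ValueError(f"value {e_i!r} was not observed; group classes first")
        d_n += (n * p_i - n_i) ** 2 / n_i
    return d_n


# Hypothetical sample of size 40 on the support {0, 1, 2}.
sample = [0] * 12 + [1] * 18 + [2] * 10
support = [0, 1, 2]
probs = [0.25, 0.50, 0.25]
d_n = chi2_statistic(sample, support, probs)
```

Here the expected counts are 10, 20 and 10, so only the first two values contribute to the sum.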
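Under the null hypothesis $\mathcal{H}_0 = \{G = F\}$, the distribution of the test statistic
$D_n$ is asymptotically known: this is the $\chi^2(k-1)$ distribution, where $k$ is the number
of distinct values in the support of $G$ (or the number of classes when values have been gathered).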
We apply the test as follows.
We fix a risk $\alpha$ (type I error) and we evaluate the associated critical value $d_\alpha$,
which is the quantile of order $1-\alpha$ of $D_n$.
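Then a decision is made, either by comparing the test statistic to the theoretical threshold $d_\alpha$
(or equivalently by evaluating the p-value of the sample, defined as $\mathbb{P}(D_n > d_n)$, and by comparing it to $\alpha$):
if $d_n > d_\alpha$ (or equivalently $\mathbb{P}(D_n > d_n) < \alpha$), then we reject $G$,
if $d_n \leq d_\alpha$ (or equivalently $\mathbb{P}(D_n > d_n) \geq \alpha$), then $G$ is considered acceptable.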
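
The decision rule above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the helpers `chi2_cdf`, `chi2_quantile` and `chi2_decision` are hypothetical names, the chi-squared CDF is evaluated with the standard series of the regularized lower incomplete gamma function, the quantile is found by bisection, and the degrees of freedom are taken as the number of classes minus one.

```python
import math


def chi2_cdf(x, dof):
    """Chi-squared CDF via the series of the regularized lower incomplete gamma."""
    if x <= 0.0:
        return 0.0
    a, z = dof / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def chi2_quantile(q, dof):
    """Quantile of order q, found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, dof) < q:  # enlarge the bracket until it contains the quantile
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, dof) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def chi2_decision(d_n, n_classes, alpha=0.05):
    """Return the critical value, the p-value and whether the candidate is rejected."""
    dof = n_classes - 1  # assumed degrees of freedom
    d_alpha = chi2_quantile(1.0 - alpha, dof)
    p_value = 1.0 - chi2_cdf(d_n, dof)  # loses accuracy for very large d_n
    return d_alpha, p_value, d_n > d_alpha


# Hypothetical realization d_n = 5/9 obtained with 3 classes, at risk alpha = 0.05.
d_alpha, p_value, reject = chi2_decision(d_n=5.0 / 9.0, n_classes=3, alpha=0.05)
```

Both criteria give the same answer by construction: `reject` is true exactly when `p_value` falls below `alpha`, up to rounding at the threshold.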