Kolmogorov-Smirnov two samples test
Let $X$ be a scalar uncertain variable modeled as a random
variable. This method deals with the construction of a dataset prior to
the choice of a probability distribution for $X$. This statistical
test is used to compare two samples $\{x_1, \ldots, x_N\}$
and $\{x'_1, \ldots, x'_M\}$; the goal is to determine
whether these two samples come from the same probability distribution or
not. If this is the case, the two samples should be aggregated in order
to increase the robustness of further statistical analysis.
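The test relies on the maximum distance between the empirical cumulative
distribution functions $\hat{F}_N$ and $\hat{F}'_M$ of the samples
$\{x_1, \ldots, x_N\}$ and $\{x'_1, \ldots, x'_M\}$.
This distance is expressed as follows:

$$\hat{D}_{M,N} = \sup_x \left| \hat{F}_N(x) - \hat{F}'_M(x) \right|$$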
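The distance is the largest vertical gap between the two empirical
cumulative distribution functions. The sketch below shows one way to
compute it in plain Python. The function name `ks_two_sample_statistic`
and the sample values are illustrative assumptions, not part of the
OpenTURNS API.

```python
from bisect import bisect_right


def ks_two_sample_statistic(sample_x, sample_y):
    """Return sup_x |F_N(x) - F'_M(x)| for two one-dimensional samples."""
    xs = sorted(sample_x)
    ys = sorted(sample_y)
    n, m = len(xs), len(ys)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    d_max = 0.0
    # Both empirical CDFs are right-continuous step functions, so the
    # supremum is reached at one of the observed points.
    for t in xs + ys:
        f_n = bisect_right(xs, t) / n  # fraction of the first sample <= t
        f_m = bisect_right(ys, t) / m  # fraction of the second sample <= t
        d_max = max(d_max, abs(f_n - f_m))
    return d_max


# Hypothetical data, chosen only to illustrate the computation.
sample_x = [0.1, 0.4, 0.7, 1.2]
sample_y = [0.3, 0.9, 1.5, 2.0, 2.4]
print(ks_two_sample_statistic(sample_x, sample_y))
```

For these values the largest gap occurs at $x = 1.2$, where the first
empirical CDF equals 1 and the second equals 2/5.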
The probability distribution of the distance $\hat{D}_{M,N}$
is asymptotically known (i.e. as the size of the samples tends to
infinity). If $M$ and $N$ are sufficiently large, this means
that for a probability $\alpha$, one can calculate the threshold /
critical value $d_\alpha$ such that:
- if $\hat{D}_{M,N} > d_\alpha$, we conclude that the two samples are not identically distributed, with a risk of error $\alpha$,
- if $\hat{D}_{M,N} \leq d_\alpha$, it is reasonable to say that both samples arise from the same distribution.
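The decision rule can be sketched as follows. The closed form used for
the critical value is an assumption: it is the common large-sample
approximation $d_\alpha \approx \sqrt{-\tfrac{1}{2}\ln(\alpha/2)}\,\sqrt{(N+M)/(NM)}$,
which is not necessarily the computation performed by the library. The
function names are hypothetical.

```python
import math


def ks_critical_value(alpha, n, m):
    """Approximate large-sample critical value d_alpha (assumed formula)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c_alpha * math.sqrt((n + m) / (n * m))


def same_distribution_accepted(d, alpha, n, m):
    """True when the observed distance d does not exceed d_alpha."""
    return d <= ks_critical_value(alpha, n, m)


# Two samples of size 100 each and a risk of error of 5 %.
print(ks_critical_value(0.05, 100, 100))
print(same_distribution_accepted(0.12, 0.05, 100, 100))
print(same_distribution_accepted(0.25, 0.05, 100, 100))
```

With these sizes the threshold is roughly 0.19, so an observed distance
of 0.12 keeps the hypothesis while 0.25 rejects it.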
An important notion is the so-called “$p$-value” of the test. This
quantity is the limit error probability $\alpha_{\mathrm{lim}}$, i.e.
the smallest risk $\alpha$ at which the “identically-distributed”
hypothesis is rejected. Thus, the two samples will be supposed
identically distributed if and only if $\alpha_{\mathrm{lim}}$ is
greater than the value $\alpha$ desired by the user. Note that the
higher $\alpha_{\mathrm{lim}} - \alpha$, the more robust the
decision.
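As an illustration, the limit error probability can be approximated
from the asymptotic Kolmogorov distribution,
$\alpha_{\mathrm{lim}} \approx 2 \sum_{k \geq 1} (-1)^{k-1} e^{-2 k^2 \lambda^2}$
with $\lambda = \hat{D}_{M,N} \sqrt{NM/(N+M)}$. This is a minimal sketch
that assumes large samples; the function name and the cut-off for small
$\lambda$ are choices made here for the example, not library behavior.

```python
import math


def ks_asymptotic_p_value(d, n, m, max_terms=100):
    """Approximate p-value of the two-sample test for large n and m."""
    lam = d * math.sqrt(n * m / (n + m))
    # The alternating series converges slowly near zero, where the
    # p-value is indistinguishable from 1 anyway.
    if lam < 0.2:
        return 1.0
    total = 0.0
    for k in range(1, max_terms + 1):
        term = (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        total += term
        if abs(term) < 1e-16:
            break
    # Clamp to [0, 1] to absorb rounding errors.
    return min(1.0, max(0.0, 2.0 * total))


alpha = 0.05
p_value = ks_asymptotic_p_value(0.12, 100, 100)
# The samples are supposed identically distributed iff p_value > alpha.
print(p_value, p_value > alpha)
```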
This test is also referred to as the Kolmogorov-Smirnov test for two samples.