Optimal LHS design¶
Let  be a random vector of input parameters.
Latin Hypercube Sample (LHS) is a way to distribute N sample points: each
parameter range is divided into N equal intervals, and sample points are
chosen such that any hyperplane in that dimension contains one and only one
sample point.
The goal of this module is to improve standard LHS techniques by minimizing a space filling criterion.
Principles¶
We may notice two types of LHS designs:
- Centered design is obtained by choosing for each point the center of the corresponding cell 
- Randomized LHS is obtained by adding random perturbations inside each cell 
Let the input vector  whose marginals are independent and associated probabilistic measure 
In practice, we look for a design in the space  and we use an inverse
iso-probabilistic transformation to get the result in the original domain.
Let  be a a space filling criterion, which is a measure of accuracy of an optimal LHS design.
Most of these criteria focus on discrepancy, which measures how far a given distribution of points deviates
from a perfectly uniform one.
We denote by 
 the experiment design with 
points where the 
 -th point is 
.`
Two space filling criteria are implemented:
- The centered - -discrepancy, called - and given by: 
This discrepancy is to be minimized to get an optimal design.
- The mindist criterion (minimal distance between two points in the design): 
This criterion is to be maximized.
- In practice, the - criterion is used instead of mindist and writes: 
This is supposed to be more robust. When p tends to infinity, optimizing a design with 
is equivalent to optimizing a design with mindist.
This criterion is to be minimized to get an optimal design.
The objective is to generate an LHS design 
that minimizes a space filling criterion 
 (or maximizes mindist).
For that purpose, two techniques are implemented and presented
hereafter.
Monte Carlo¶
This problem can be approximated by a Monte Carlo algorithm: a fixed number of designs are generated, and
the optimal one is kept.
This algorithm is trivial and available in MonteCarloLHS.
One of the major drawbacks of Monte Carlo sampling is the CPU time consumption, because the number of
generated designs must be high.
Simulated Annealing¶
An alternate solution is to use an adapted simulated annealing method, available in
SimulatedAnnealingLHS, which we will now describe.
Starting from an LHS design, a new design is obtained by permuting one random coordinate of two
randomly chosen elements; by construction, this design is also an LHS design.
If the new design is better than the previous one, it is kept.
If it is worse, it may anyway be kept with some probability, which depends on how these designs compare,
but also on a temperature profile T which decreases over time.
This means that jumping away from local extrema becomes less probable over time.
It is important to highlight here that this specific permutation has been chosen in this algorithm
because it allows highly efficient computations of the criterion during the simulated annealing process.
The naive criterion evaluation, as is done in Monte Carlo algorithm, has a complexity of
 for 
 and 
 criteria.
Let us first illustrate with the  criterion. We set 
, equation rewrites:
with:
(1)¶
We set  the elements of a new design
 obtained by
permuting a coordinate of sample points 
 and 
.
We can see that
(2)¶
and thus,  becomes:
Updating  criterion can be performed by a 
algorithm, which has a much better complexity than a naive computation.
The same trick can also be applied on  criterion, because we can write
with
These  coefficients satisfy the same conditions, so the same computations give:
In practice, a marginal transformation is performed to map the initial multivariate distribution into .
Optimization is performed in 
 and the inverse transformation maps
the design into the initial space.
 OpenTURNS
      OpenTURNS