LOLAVoronoi

class LOLAVoronoi(*args)

LOLA-Voronoi sequential design algorithm.

Warning

This class is experimental and likely to be modified in future releases. To use it, import the openturns.experimental submodule.

This class implements the sequential design of experiments algorithm described in [crombecq2011], which generates new input samples according to the input measure and the model output. It combines an exploration criterion, based on a Voronoi tessellation that identifies undersampled regions of the input space, with an exploitation criterion based on a measure of the nonlinearity of the model.

Let us define a random vector \vect{X} of dimension d with measure \inputMeasure(\vect{x}), an initial design of \sampleSize points \vect{p}_i, i \in \{1, ..., \sampleSize\}, and a model \model.

The exploration criterion score of the point \vect{p}_i is equal to the probability of its Voronoi cell, which can be written as:

v(\vect{p}_i)
&= \int_{\vect{x} \in \mathcal{V}_i} \inputMeasure(\vect{x}) d\vect{x} \\
&= \Expect{1_{\mathcal{V}_i}(\vect{X})}

for i \in\{1, ..., \sampleSize\} where \sampleSize is the sample size and 1_{\mathcal{V}_i} is the indicator function of the i-th Voronoi cell:

1_{\mathcal{V}_i}(\vect{x}) = 
\begin{cases}
1 & \textrm{ if } \vect{x} \in \mathcal{V}_i \\
0 & \textrm{otherwise.}
\end{cases}

The Voronoi score is approximated using a Monte Carlo experiment.
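As an illustration, this Monte Carlo estimate can be sketched with plain NumPy: draws from the input measure are assigned to the Voronoi cell of their nearest design point, and the cell probabilities are the observed membership frequencies. The design points and the uniform measure on [-1, 1]^2 below are hypothetical, not part of the class API.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design: 10 points p_i in [-1, 1]^2, uniform input measure.
points = rng.uniform(-1.0, 1.0, size=(10, 2))

# Each Monte Carlo draw belongs to the Voronoi cell of its nearest design
# point, so v(p_i) is estimated by the fraction of draws assigned to p_i.
mc = rng.uniform(-1.0, 1.0, size=(10000, 2))
nearest = np.linalg.norm(mc[:, None, :] - points[None, :, :], axis=2).argmin(axis=1)
v = np.bincount(nearest, minlength=len(points)) / len(mc)
print(v)  # estimated cell probabilities, summing to 1
```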

For the nonlinearity criterion, the algorithm only uses the available data: it chooses a set of neighbouring samples N(\vect{p}_i) = \{\vect{p}_{i1}, \hdots, \vect{p}_{im}\}, with m = 2 d, to build an approximation \mat{J}_i of the gradient of the model \model at the point \vect{p}_i.

The LOLA local nonlinearity score of the point \vect{p}_r is the cumulated error between the model and the linear approximation built from its neighbourhood. For each component \model_k of the model, it is given by:

e_k(\vect{p}_r) = \sum_{i=1}^m | \model_k(\vect{p}_{ri}) - (\model_k(\vect{p}_r) + \mat{J}_r(\vect{p}_{ri} - \vect{p}_r)) |

for r \in \{1, ..., \sampleSize\}, where \model(\vect{p}_{r}) + \mat{J}_r (\vect{p}_{ri} - \vect{p}_{r}) is the linear approximation of the model in the neighbourhood of the point \vect{p}_{r}.
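The score can be sketched for a single output with NumPy: fit the gradient by least squares on the neighbourhood offsets, then cumulate the absolute deviations from the resulting linear approximation. The function name, the reference point, and both test models below are illustrative, not part of the class API.

```python
import numpy as np

def lola_score(p_r, y_r, neighbours, y_neighbours):
    # Fit the gradient at p_r by least squares on the neighbourhood, then
    # accumulate the absolute deviations from the linear approximation.
    A = neighbours - p_r                 # m x d offsets p_ri - p_r
    b = y_neighbours - y_r               # model increments
    grad, *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(np.abs(b - A @ grad).sum())

p_r = np.array([0.0, 0.0])
nbs = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])  # m = 2d = 4

f_lin = lambda x: 2.0 * x[..., 0] - x[..., 1]       # linear model
f_quad = lambda x: x[..., 0] ** 2 + x[..., 1] ** 2  # nonlinear model

print(lola_score(p_r, f_lin(p_r), nbs, f_lin(nbs)))    # ~0: linear model is reproduced exactly
print(lola_score(p_r, f_quad(p_r), nbs, f_quad(nbs)))  # 4.0: curvature is detected
```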

In order to aggregate the scores e_k(\vect{p}_r) over the output components k \in \{1, ..., d_Y\}, two different norms can be used:

e(\vect{p}_r) = \frac{1}{d_Y} \sum_{k=1}^{d_Y} e_k(\vect{p}_r)

or

e(\vect{p}_r) = \max_{k=1, \hdots, d_Y} e_k(\vect{p}_r)

Now the two metrics are combined into the hybrid score given by:

h(\vect{p}_i) = \lambda v(\vect{p}_i) + (1 - \lambda) \frac{e(\vect{p}_i)}{\sum_{j=1}^{\sampleSize} e(\vect{p}_j)}

for i \in \{1, ..., \sampleSize\}, where \lambda is a tradeoff coefficient.
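The combination itself is a one-liner; the scores below are hypothetical numbers, chosen only to show how exploitation can outweigh exploration:

```python
import numpy as np

lam = 0.5  # tradeoff coefficient lambda

# Hypothetical scores for 5 design points.
v = np.array([0.30, 0.25, 0.20, 0.15, 0.10])  # Voronoi cell probabilities
e = np.array([0.5, 2.0, 0.1, 4.0, 1.0])       # LOLA nonlinearity scores

# Hybrid score: weighted sum of the exploration score and the normalized
# exploitation score.
h = lam * v + (1.0 - lam) * e / e.sum()
print(h.argmax())  # index of the reference point refined first -> 3
```

Here the fourth point wins despite having a small Voronoi cell, because its nonlinearity score dominates the normalized exploitation term.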

To generate a new point, candidate points are sampled by Monte Carlo close to the reference point \vect{p}_i with the highest hybrid score. The new point is selected among the candidates lying inside the Voronoi cell of \vect{p}_i, as the one farthest away from \vect{p}_i and its neighbours. To generate a batch of n_{new} points, this procedure is repeated on the n_{new} reference points with the highest hybrid scores, taking the Voronoi cells of the new candidates into account.

Parameters:
x : 2-d sequence of float

Initial input sample

y : 2-d sequence of float

Initial output sample

distribution : Distribution

Distribution to generate new input samples

Methods

generate(size)

Generate a new sample.

getClassName()

Accessor to the object's name.

getGenerationIndices()

Accessor to the generation indices.

getHybridScore()

Hybrid score accessor.

getInputSample()

Accessor to the input sample.

getLOLAScore()

LOLA score accessor.

getName()

Accessor to the object's name.

getNeighbourhoodCandidatesNumber()

Neighbourhood candidates number accessor.

getOutputSample()

Accessor to the output sample.

getVoronoiSamplingSize()

Voronoi sampling size accessor.

getVoronoiScore()

Voronoi score accessor.

hasName()

Test if the object is named.

setName(name)

Accessor to the object's name.

setNeighbourhoodCandidatesNumber(...)

Neighbourhood candidates number accessor.

setVoronoiSamplingSize(voronoiSamplingSize)

Voronoi sampling size accessor.

update(x, y)

Update the current sample.

Notes

Various ResourceMap entries allow for a more fine-grained control over the algorithm:

  • The entry LOLAVoronoi-DefaultNeighbourhoodCandidatesNumber defines the number of extra closest neighbours around each reference point to consider when selecting the neighbourhood with the greatest combined score for the LOLA criterion.

  • The entry LOLAVoronoi-DefaultVoronoiSamplingSize controls the sampling size used to estimate the Voronoi cells relative volume or generate new candidates.

  • The entry LOLAVoronoi-MaximumCombinationsNumber controls the maximum number of existing point subsets that will be tested to find the best neighbourhood for the exploitation criterion.

  • The entry LOLAVoronoi-UseTruncatedDistribution controls the way the distribution \inputMeasure(\vect{x}) is truncated to the bounding box B of the Voronoi cell being sampled. If the flag is set to False, the sampling is done by rejection with respect to \inputMeasure(\vect{x}) and B. The advantage is to avoid the expensive computation of the normalization factor of the truncation of \inputMeasure(\vect{x}); the drawback is that the acceptance rate may be low. If the flag is set to True, the truncation of \inputMeasure(\vect{x}) to B is explicitly built and sampled. The advantage is the possibility of using the CDF inversion method to sample this truncated distribution, in which case all the points are accepted. However, the CDF inversion method is only available if the copula of \inputMeasure(\vect{x}) is the independent one; otherwise a rejection method is used, the advantage of this option is lost, and the normalization factor of the truncation has been computed for nothing. As a result, the default value is False.

  • The string entry LOLAVoronoi-NonLinearityAggregationMethod controls how the nonlinearity score is computed across multiple outputs from the gradient vectors: either Maximum (default), which singles out the most nonlinear regions, or Average, which smooths the score but can be a good alternative when there are only a few outputs.

  • The string entry LOLAVoronoi-DecompositionMethod selects the least-squares method to compute the gradient.

  • The float entry LOLAVoronoi-HybridScoreTradeoff is the value of the tradeoff coefficient \lambda.
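These entries can be tuned through ResourceMap before building the algorithm; the keys are the ones documented above, while the values below are purely illustrative:

```python
import openturns as ot

# Illustrative tuning of the documented ResourceMap entries; values are
# examples, not recommendations.
ot.ResourceMap.SetAsUnsignedInteger('LOLAVoronoi-DefaultVoronoiSamplingSize', 2000)
ot.ResourceMap.SetAsString('LOLAVoronoi-NonLinearityAggregationMethod', 'Average')
ot.ResourceMap.SetAsScalar('LOLAVoronoi-HybridScoreTradeoff', 0.5)
ot.ResourceMap.SetAsBool('LOLAVoronoi-UseTruncatedDistribution', False)
```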

Examples

>>> import openturns as ot
>>> import openturns.experimental as otexp
>>> formula = '-4 * exp((-25 / 8) * (a0^2 + a1^2)) + 7 * exp((-125 / 4) * (a0^2 + a1^2))'
>>> f1 = ot.SymbolicFunction(['a0', 'a1'], [formula])
>>> distribution = ot.JointDistribution([ot.Uniform(-1.0, 1.0)] * 2)
>>> x0 = ot.LowDiscrepancyExperiment(ot.HaltonSequence(), distribution, 10).generate()
>>> y0 = f1(x0)
>>> algo = otexp.LOLAVoronoi(x0, y0, distribution)

Now add 2 blocks of 3 points:

>>> for i in range(2):
...     x = algo.generate(3)
...     y = f1(x)
...     algo.update(x, y)
__init__(*args)
generate(size)

Generate a new sample.

Parameters:
size : int

Size of sample to generate

Returns:
x : Sample

New input sample.

getClassName()

Accessor to the object’s name.

Returns:
class_name : str

The object class name (object.__class__.__name__).

getGenerationIndices()

Accessor to the generation indices.

Returns:
generationIndices : Indices

Indices of last element for each generation. This is updated each time the update() method is evaluated.

getHybridScore()

Hybrid score accessor.

Returns:
hybridScore : Sample

The hybrid score h(p_i) for the current sample.

getInputSample()

Accessor to the input sample.

Returns:
x : Sample

Input sample

getLOLAScore()

LOLA score accessor.

Returns:
lolaScore : Sample

The exploitation score e(p_i) for the current sample. Note that unlike v(p_i) this is scaled in [0, 1].

getName()

Accessor to the object’s name.

Returns:
name : str

The name of the object.

getNeighbourhoodCandidatesNumber()

Neighbourhood candidates number accessor.

Returns:
neighbourhoodCandidatesNumber : int

The number of extra closest neighbours around each reference point to consider when selecting the neighbourhood with the greatest combined score for the LOLA criterion. The default value is given by the LOLAVoronoi-DefaultNeighbourhoodCandidatesNumber entry of ResourceMap.

getOutputSample()

Accessor to the output sample.

Returns:
y : Sample

Output sample

getVoronoiSamplingSize()

Voronoi sampling size accessor.

Returns:
voronoiSamplingSize : int

The sampling size used to estimate the Voronoi cells relative volume or generate new candidates. The default value is set to the LOLAVoronoi-DefaultVoronoiSamplingSize entry from ResourceMap.

getVoronoiScore()

Voronoi score accessor.

Returns:
voronoiScore : Sample

The exploration score v(p_i) for the current sample.

hasName()

Test if the object is named.

Returns:
hasName : bool

True if the name is not empty.

setName(name)

Accessor to the object’s name.

Parameters:
name : str

The name of the object.

setNeighbourhoodCandidatesNumber(neighbourhoodCandidatesNumber)

Neighbourhood candidates number accessor.

Parameters:
neighbourhoodCandidatesNumber : int

The number of extra closest neighbours around each reference point to consider when selecting the neighbourhood with the greatest combined score for the LOLA criterion. The default value is given by the LOLAVoronoi-DefaultNeighbourhoodCandidatesNumber entry of ResourceMap.

setVoronoiSamplingSize(voronoiSamplingSize)

Voronoi sampling size accessor.

Parameters:
voronoiSamplingSize : int

The sampling size used to estimate the Voronoi cells relative volume or generate new candidates. The default value is set to the LOLAVoronoi-DefaultVoronoiSamplingSize entry from ResourceMap.

update(x, y)

Update the current sample.

Adds a new incremental sample (input and output) to the existing sample.

Parameters:
x, y : Sample

New incremental sample.

Examples using the class

LOLA-Voronoi sequential design of experiment