KernelSmoothing¶

class
KernelSmoothing
(*args)¶ Non parametric continuous distribution estimation by kernel smoothing.
Refer to Kernel smoothing.
Parameters:  kernel :
Distribution
, optional Univariate distribution of the kernel that will be used. By default, the standard Normal distribution is used.
 bined : bool, optional
Activates bining mecanism only in the univariate or bivariate cases. It allows to speed up the manipulation of the density function of the resulting distribution. By default, the mecanism is activated.
 binNumber : int, , optional
Indicates the number of bins used by the bining mecanism. By default, OpenTURNS uses the values stored in the ResourceMap.
 boundaryCorrection : bool, optional
Activates the boundary correction using the mirroring technique. By default, the correction is not provided.
Notes
The binning mecanism creates a regular grid of binNumber intervals in each dimension, then the unit weight of each point is linearly affected to the vertices of the bin containing the point. See [Wand1994] for the details.
The boundary correction is available only in one dimension, and it is done using the mirroring technique. See the notes of the setBoundingOption method for the details.
When applied to multivariate samples, the kernel is the kernel product of the univariate distribution specified in the constructor.
Examples
Fit a distribution on data thanks to the kernel smoothing technique:
>>> import openturns as ot >>> sample = ot.Gamma(6.0, 1.0).getSample(100) >>> kernel = ot.KernelSmoothing() >>> fittedDist = kernel.build(sample)
Compare the PDFs:
>>> graph = fittedDist.drawPDF() >>> graph.add( ot.Gamma(6.0, 1.0).drawPDF()) >>> graph.setColors(['blue', 'red']) >>> graph.setLegends(['KS dist', 'Gamma'])
Attributes: thisown
The membership flag
Methods
build
(*args)Fit a kernel smoothing distribution on data. buildAsKernelMixture
(sample, bandwidth)Fit a kernel smoothing distribution on data. buildAsMixture
(sample, bandwidth)Fit a kernel smoothing distribution on data. buildAsTruncatedDistribution
(sample, bandwidth)Fit a kernel smoothing distribution on data. buildEstimator
(*args)Build the distribution and the parameter distribution. computeMixedBandwidth
(sample)Compute the bandwith according to a mixed rule. computePluginBandwidth
(sample)Compute the bandwith according to the plugin rule. computeSilvermanBandwidth
(sample)Compute the bandwith according to the Silverman rule. getBandwidth
()Accessor to the bandwith used in the kernel smoothing. getBootstrapSize
()Accessor to the bootstrap size. getClassName
()Accessor to the object’s name. getId
()Accessor to the object’s id. getKernel
()Accessor to kernel used in the kernel smoothing. getName
()Accessor to the object’s name. getShadowedId
()Accessor to the object’s shadowed id. getVisibility
()Accessor to the object’s visibility state. hasName
()Test if the object is named. hasVisibleName
()Test if the object has a distinguishable name. setAutomaticLowerBound
(automaticLowerBound)Accessor to the flag for an automatic selection of lower bound. setAutomaticUpperBound
(automaticUpperBound)Accessor to the flag for an automatic selection of upper bound. setBootstrapSize
(bootstrapSize)Accessor to the bootstrap size. setBoundaryCorrection
(boundaryCorrection)Accessor to the boundary correction flag. setBoundingOption
(boundingOption)Accessor to the boundary correction option. setLowerBound
(lowerBound)Accessor to the lower bound for boundary correction. setName
(name)Accessor to the object’s name. setShadowedId
(id)Accessor to the object’s shadowed id. setUpperBound
(upperBound)Accessor to the upper bound for boundary correction. setVisibility
(visible)Accessor to the object’s visibility state. 
__init__
(*args)¶ Initialize self. See help(type(self)) for accurate signature.

build
(*args)¶ Fit a kernel smoothing distribution on data.
Parameters:  sample : 2d sequence of float
Data on which the distribution is fitted. Any dimension.
 bandwidth :
Point
, optional Contains the bandwith in each direction. If not specified, the bandwith is calculated using the mixed rule from data.
Returns:  fittdDist :
Distribution
The fitted distribution.
Notes
According to the dimension of the data and the specified treatments, the resulting distribution differs.
If the sample is constant, a
Dirac
distribution is built.If dimension > 2 or if no treatment has been asked for, a
KernelMixture
is built by calling buildAsKernelMixture.If dimension = 1 and a boundary treatment has been asked for, a
TruncatedDistribution
is built by calling buildAsTruncatedDistributionIf dimension = 1 or 2 and no boundary treatment has been asked for, but a binning treatment has been asked for,
 If the sample size is greater than the bin number, then a
Mixture
is built by calling buildAsMixture  Otherwise a
KernelMixture
is built by calling buildAsKernelMixture
 If the sample size is greater than the bin number, then a
Examples
See the effect of the boundary correction:
>>> import openturns as ot >>> sample = ot.Exponential(1.0).getSample(1000) >>> smoother = ot.KernelSmoothing() >>> fittedDistNoCorr = smoother.build(sample) >>> smoother.setBoundaryCorrection(True) >>> fittedDistWithCorr = smoother.build(sample)
Compare the PDFs:
>>> graph = ot.Exponential(1.0).drawPDF() >>> graph.add(fittedDistNoCorr.drawPDF()) >>> graph.add(fittedDistWithCorr.drawPDF()) >>> graph.setColors(['black', 'blue', 'red']) >>> graph.setLegends(['Exp dist', 'No boundary corr', 'Boundary corr'])

buildAsKernelMixture
(sample, bandwidth)¶ Fit a kernel smoothing distribution on data.
Parameters:  sample : 2d sequence of float
Data on which the distribution is fitted. Any dimension.
 bandwidth :
Point
, optional Contains the bandwith in each direction.
Returns:  fittdDist :
KernelMixture
The fitted distribution.
Notes
It builds a using the given data and bandwidth regardless of the binning or boundary treatment flags.
Examples
>>> import openturns as ot >>> sample = ot.Exponential(1.0).getSample(1000) >>> smoother = ot.KernelSmoothing() >>> kernelMixture = smoother.buildAsKernelMixture(sample, [1.0])

buildAsMixture
(sample, bandwidth)¶ Fit a kernel smoothing distribution on data.
Parameters:  sample : 2d sequence of float
Data on which the distribution is fitted. Any dimension.
 bandwidth :
Point
, optional Contains the bandwith in each direction.
Returns:  fittdDist :
KernelMixture
The fitted distribution.
Notes
It builds a using the given bandwidth and a binning of the given data regardless of the bin number, the data size, the binning flag or boundary treatment flags. This method is available only for 1D or 2D samples.
Examples
>>> import openturns as ot >>> sample = ot.Exponential(1.0).getSample(1000) >>> smoother = ot.KernelSmoothing(ot.Normal(), True, 100, False) >>> mixture = smoother.buildAsMixture(sample, [1.0])

buildAsTruncatedDistribution
(sample, bandwidth)¶ Fit a kernel smoothing distribution on data.
Parameters:  sample : 2d sequence of float
Data on which the distribution is fitted. Any dimension.
 bandwidth :
Point
, optional Contains the bandwith in each direction.
Returns:  fittdDist :
KernelMixture
The fitted distribution.
Notes
It builds a using the given bandwidth and the given data regardless of the boundary treatment flags. A binning is done if asked for and if the sample size is greater than the bin number. Then, the corresponding
KernelMixture
if no binning,Mixture
otherwise, is truncated to the range of the sample.Examples
>>> import openturns as ot >>> sample = ot.Exponential(1.0).getSample(1000) >>> smoother = ot.KernelSmoothing(ot.Normal(), False, 0, True) >>> truncated = smoother.buildAsTruncatedDistribution(sample, [1.0])

buildEstimator
(*args)¶ Build the distribution and the parameter distribution.
Parameters:  sample : 2d sequence of float
Sample from which the distribution parameters are estimated.
 parameters :
DistributionParameters
Optional, the parametrization.
Returns:  resDist :
DistributionFactoryResult
The results.
Notes
According to the way the native parameters of the distribution are estimated, the parameters distribution differs:
 Moments method: the asymptotic parameters distribution is normal and estimated by Bootstrap on the initial data;
 Maximum likelihood method with a regular model: the asymptotic parameters distribution is normal and its covariance matrix is the inverse Fisher information matrix;
 Other methods: the asymptotic parameters distribution is estimated by Bootstrap on the initial data and kernel fitting (see
KernelSmoothing
).
If another set of parameters is specified, the native parameters distribution is first estimated and the new distribution is determined from it:
 if the native parameters distribution is normal and the transformation regular at the estimated parameters values: the asymptotic parameters distribution is normal and its covariance matrix determined from the inverse Fisher information matrix of the native parameters and the transformation;
 in the other cases, the asymptotic parameters distribution is estimated by Bootstrap on the initial data and kernel fitting.
Examples
Create a sample from a Beta distribution:
>>> import openturns as ot >>> sample = ot.Beta().getSample(10) >>> ot.ResourceMap.SetAsUnsignedInteger('DistributionFactoryDefaultBootstrapSize', 100)
Fit a Beta distribution in the native parameters and create a
DistributionFactory
:>>> fittedRes = ot.BetaFactory().buildEstimator(sample)
Fit a Beta distribution in the alternative parametrization :
>>> fittedRes2 = ot.BetaFactory().buildEstimator(sample, ot.BetaMuSigma())

computeMixedBandwidth
(sample)¶ Compute the bandwith according to a mixed rule.
Returns:  bandwidth :
Point
Bandwith wich components are evaluated according to a mixed rule.
Notes
Simply use the plugin rule for small sample, and estimate the ratio between the plugin rule and the Silverman rule on a small sample, then scale the Silverman bandwidth computed on the full sample with this ratio.
 bandwidth :

computePluginBandwidth
(sample)¶ Compute the bandwith according to the plugin rule.
Returns:  bandwidth :
Point
Bandwith wich components are evaluated according to the plugin rule.
Notes
Warning! It can take a lot of time for large samples, as the cost is quadratic with the sample size.
 bandwidth :

computeSilvermanBandwidth
(sample)¶ Compute the bandwith according to the Silverman rule.
Returns:  bandwidth :
Point
Bandwith wich components are evaluated according to the Silverman rule supposing a normal distribution. The bandwith is based on the evaluation of the interquartiles rather than the standard deviation of the distribution and the sample.
 bandwidth :

getBandwidth
()¶ Accessor to the bandwith used in the kernel smoothing.
Returns:  bandwidth :
Point
Bandwith used in each direction.
 bandwidth :

getBootstrapSize
()¶ Accessor to the bootstrap size.
Returns:  size : integer
Size of the bootstrap.

getClassName
()¶ Accessor to the object’s name.
Returns:  class_name : str
The object class name (object.__class__.__name__).

getId
()¶ Accessor to the object’s id.
Returns:  id : int
Internal unique identifier.

getKernel
()¶ Accessor to kernel used in the kernel smoothing.
Returns:  kernel :
Distribution
Univariate distribution used to build the kernel.
 kernel :

getName
()¶ Accessor to the object’s name.
Returns:  name : str
The name of the object.

getShadowedId
()¶ Accessor to the object’s shadowed id.
Returns:  id : int
Internal unique identifier.

getVisibility
()¶ Accessor to the object’s visibility state.
Returns:  visible : bool
Visibility flag.

hasName
()¶ Test if the object is named.
Returns:  hasName : bool
True if the name is not empty.

hasVisibleName
()¶ Test if the object has a distinguishable name.
Returns:  hasVisibleName : bool
True if the name is not empty and not the default one.

setAutomaticLowerBound
(automaticLowerBound)¶ Accessor to the flag for an automatic selection of lower bound.
Parameters:  automaticLowerBound : bool
Flag to tell if the userdefined lower bound has to be taken into account (value False) or if the minimum of the given sample has to be used (value True).

setAutomaticUpperBound
(automaticUpperBound)¶ Accessor to the flag for an automatic selection of upper bound.
Parameters:  automaticUpperBound : bool
Flag to tell if the userdefined upper bound has to be taken into account (value False) or if the maximum of the given sample has to be used (value True).

setBootstrapSize
(bootstrapSize)¶ Accessor to the bootstrap size.
Parameters:  size : integer
Size of the bootstrap.

setBoundaryCorrection
(boundaryCorrection)¶ Accessor to the boundary correction flag.
Parameters:  boundaryCorrection : bool
Activates the boundary correction using the mirroring technique.

setBoundingOption
(boundingOption)¶ Accessor to the boundary correction option.
Parameters:  boundingOption : int
Select the boundary correction option, see notes.
Notes
The possible values for the bounding option are:
 KernelSmoothing.NONE or 0: no boundary correction
 KernelSmoothing.LOWER or 1: apply the boundary correction to the lower bound
 KernelSmoothing.UPPER or 2: apply the boundary correction to the upper bound
 KernelSmoothing.BOTH or 3: apply the boundary correction to both bounds
It applies only to 1D samples. Each bound can be defined by the user or computed automatically from the sample, see setLowerBound, setUpperBound, setAutomaticLowerBound, setAutomaticUpperBound.

setLowerBound
(lowerBound)¶ Accessor to the lower bound for boundary correction.
Parameters:  lowerBound : float
A userdefined lower bound to take into account for boundary correction.
Notes
This method automatically sets the automaticLowerBound flag to False. The given value will be taken into account only if boundingOption is set to either 1 or 3. If the algorithm is applied to a sample with a minimum value less than the userdefined lower bound and the automaticLowerBound is set to False, then an exception it raised.

setName
(name)¶ Accessor to the object’s name.
Parameters:  name : str
The name of the object.

setShadowedId
(id)¶ Accessor to the object’s shadowed id.
Parameters:  id : int
Internal unique identifier.

setUpperBound
(upperBound)¶ Accessor to the upper bound for boundary correction.
Parameters:  upperBound : float
A userdefined lower bound to take into account for boundary correction.
Notes
This method automatically sets the automaticLowerBound flag to False. The given value will be taken into account only if boundingOption is set to either 1 or 3. If the algorithm is applied to a sample with a minimum value less than the userdefined lower bound and the automaticLowerBound is set to False, then an exception it raised.

setVisibility
(visible)¶ Accessor to the object’s visibility state.
Parameters:  visible : bool
Visibility flag.

thisown
¶ The membership flag
 kernel :