Compare frequentist and Bayesian estimation
In this example, we want to estimate the parameter $\mathbf{\theta}$ of the distribution of a random vector $\mathbf{X}$ from which we have some data.
We compare the frequentist and the Bayesian approaches to estimate $\mathbf{\theta}$.

Let $\mathbf{X} = (X_0, X_1)$ be a random vector following a bivariate normal distribution with zero mean, unit variance and independent components: $\mathbf{X} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}_2)$, where the parameter vector is $\mathbf{\theta} = (\mu_0, \sigma_0, \mu_1, \sigma_1, \rho) = (0, 1, 0, 1, 0)$.

We have a sample generated from $\mathbf{X}$, denoted by $(\mathbf{x}_1, \dots, \mathbf{x}_n)$, where $n = 25$.

The parameters $(\mu_0, \mu_1, \rho)$ are assumed to be known and we want to estimate the parameters $(\sigma_0, \sigma_1)$.
In the Bayesian approach, we assume that $\mathbf{\Theta} = (\mu_0, \sigma_0, \mu_1, \sigma_1, \rho)$ is a random vector and we define a link function $g : \mathbb{R}^2 \rightarrow \mathbb{R}^5$ such that:

$$
\mathbf{\Theta} = g(\mathbf{Y}),
$$

where $\mathbf{Y} = (Y_0, Y_1)$ follows the prior distribution denoted by $\pi_{\mathbf{Y}}$.

Note that the link function already contains the information that the two components of $\mathbf{X}$ are independent (as $\rho = 0$) and centered (as $\mu_0 = \mu_1 = 0$).
We consider two interesting link functions:

$$
g_1(y_0, y_1) = (0, y_0, 0, y_1, 0), \qquad
g_2(y_0, y_1) = (0, 0.5 + y_0^2, 0, 0.5 + y_1^2, 0),
$$

each one being associated to the respective prior distribution:

$$
\pi_{\mathbf{Y}}^1 = \mathcal{T}(0, 1, 2) \otimes \mathcal{T}(0, 1, 2), \qquad
\pi_{\mathbf{Y}}^2 = \mathcal{T}(-1, 0, 1) \otimes \mathcal{T}(-1, 0, 1),
$$

where $\mathcal{T}(a, m, b)$ denotes the triangular distribution with support $[a, b]$ and mode $m$.

The second case is such that the link function $g_2$ is not bijective on the range of $\pi_{\mathbf{Y}}^2$.
The Bayesian approach uses the PosteriorDistribution class, which estimates the posterior distribution of $\mathbf{Y}$, denoted by $\pi_{n, \mathbf{Y}}$, by weighting the likelihood of the conditioned model on the sample with the prior distribution $\pi_{\mathbf{Y}}$. From the $\pi_{n, \mathbf{Y}}$ distribution, we extract the vector of modes, denoted by $\mathbf{Y}_n^m$: this point maximizes $\pi_{n, \mathbf{Y}}$.

It is interesting to note that when $n \rightarrow +\infty$, the posterior distribution $\pi_{n, \mathbf{Y}}$ concentrates, so that the distribution of $g(\mathbf{Y})$ under $\pi_{n, \mathbf{Y}}$ tends to a Dirac distribution centered at the true value of $\mathbf{\theta}$.
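As a reminder of how this posterior is built, here is a sketch of the underlying Bayes relation, where $f_{\mathbf{X} \mid \mathbf{\Theta} = g(\mathbf{y})}$ denotes the density of the conditioned model (this notation is ours, not from the original text):

$$
\pi_{n, \mathbf{Y}}(\mathbf{y}) \;\propto\; \pi_{\mathbf{Y}}(\mathbf{y}) \, \prod_{i=1}^{n} f_{\mathbf{X} \mid \mathbf{\Theta} = g(\mathbf{y})}(\mathbf{x}_i),
\qquad
\mathbf{Y}_n^m \;=\; \operatorname*{argmax}_{\mathbf{y}} \, \pi_{n, \mathbf{Y}}(\mathbf{y}).
$$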
The frequentist approach estimates the parameter $\mathbf{\theta}$ by maximizing the likelihood of the normal model on the sample. To model the same level of information, we consider a centered bivariate normal model with independent components. We denote by $\mathbf{\theta}_n^{MV}$ the parameter obtained.
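In other words (a sketch with the same notation, where the constraint encodes the known means and correlation):

$$
\mathbf{\theta}_n^{MV} \;=\; \operatorname*{argmax}_{\mathbf{\theta} = (0, \sigma_0, 0, \sigma_1, 0)} \; \prod_{i=1}^{n} f_{\mathbf{X} \mid \mathbf{\theta}}(\mathbf{x}_i).
$$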
import openturns as ot
import openturns.viewer as otv
import openturns.experimental as otexp
ot.ResourceMap.SetAsUnsignedInteger(
"DeconditionedDistribution-MarginalIntegrationNodesNumber", 32
)
ot.ResourceMap.SetAsString(
"DeconditionedDistribution-ContinuousDiscretizationMethod", "GaussProduct"
)
Here, we define a function that computes the mode of a distribution, which is the point that maximizes its PDF.
def computeMode(distribution):
def obj_py(X):
return distribution.computeLogPDF(X) * (-1.0)
obj = ot.PythonFunction(distribution.getDimension(), 1, func_sample=obj_py)
pb = ot.OptimizationProblem(obj)
pb.setBounds(distribution.getRange())
algo = ot.Cobyla(pb)
algo.setStartingPoint(distribution.getMean())
algo.run()
return algo.getResult().getOptimalPoint()
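As a quick illustrative check of this helper (not part of the original study), we can apply it to a distribution whose mode is known analytically.

# Sanity check: the mode of a Triangular(0.0, 1.0, 2.0) distribution is its apex at 1.0.
print(computeMode(ot.Triangular(0.0, 1.0, 2.0)))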
Define the theoretical distribution, which is a bivariate normal distribution with zero mean, unit variance and independent components.
conditioned = ot.Normal(2)
conditioned.setDescription(["X0", "X1"])
We create the sample that we are going to use to estimate the parameters.
Nsample = 25
observations = conditioned.getSample(Nsample)
Case 1: Bijective link function
Here, $g = g_1$ and $\pi_{\mathbf{Y}} = \pi_{\mathbf{Y}}^1 = \mathcal{T}(0, 1, 2) \otimes \mathcal{T}(0, 1, 2)$.
We create the link function.
linkFunction = ot.SymbolicFunction(["u0", "u1"], ["0.0", "u0", "0.0", "u1", "0.0"])
We create the prior distribution of $\mathbf{Y}$.
conditioning = ot.JointDistribution([ot.Triangular(0.0, 1.0, 2.0)] * 2)
conditioning.setDescription(["Y0", "Y1"])
We have to decondition the distribution of $\mathbf{X} \mid \mathbf{\Theta} = g(\mathbf{Y})$ with respect to the prior distribution $\pi_{\mathbf{Y}}$ in order to get the final distribution of $\mathbf{X}$. To do that, we use the DeconditionedDistribution class.
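The density of the deconditioned distribution is obtained by integrating the conditioned density against the prior (a sketch, with the notation introduced above):

$$
f_{\mathbf{X}}(\mathbf{x}) \;=\; \int f_{\mathbf{X} \mid \mathbf{\Theta} = g(\mathbf{y})}(\mathbf{x}) \, \pi_{\mathbf{Y}}(\mathbf{y}) \, d\mathbf{y}.
$$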
deconditioned = ot.DeconditionedDistribution(conditioned, conditioning, linkFunction)
Then, we can create the posterior distribution of $\mathbf{Y}$ based on the deconditioned distribution of $\mathbf{X}$ and on the sample.
posterior_Y = otexp.PosteriorDistribution(deconditioned, observations)
From $\pi_{n, \mathbf{Y}}$, we get:

- the mode parameter $\mathbf{Y}_n^m$ that maximizes the PDF,
- the distribution of $\mathbf{X}$ parameterized by $g(\mathbf{Y}_n^m)$,
- a bilateral confidence interval of level $\alpha = 0.95$ of the posterior distribution of $\mathbf{Y}$,
- the volume of this confidence interval,
- the Kullback-Leibler distance between the Bayesian mode based distribution and the theoretical one, $KL(\text{model}_{Bay} \,\|\, \text{model}_{theo})$, estimated as sketched below.
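The Kullback-Leibler distance in the last item is estimated in the code below by crude Monte Carlo, sampling from the estimated model (this formula is our reading of the computation, not a statement from the original text):

$$
KL(\text{model}_{Bay} \,\|\, \text{model}_{theo})
= \mathbb{E}_{\mathbf{X} \sim \text{model}_{Bay}}\big[\log f_{Bay}(\mathbf{X}) - \log f_{theo}(\mathbf{X})\big]
\;\approx\; \frac{1}{N} \sum_{j=1}^{N} \big[\log f_{Bay}(\mathbf{x}_j) - \log f_{theo}(\mathbf{x}_j)\big],
$$

with $N = 10^6$ points $\mathbf{x}_j$ drawn from the Bayesian mode based model.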
theta_Bay = linkFunction(computeMode(posterior_Y))
print("Theta Bay =", theta_Bay)
model_Bay = ot.Distribution(conditioned)
model_Bay.setParameter(theta_Bay)
dist_estimateur_Bay = posterior_Y
alpha = 0.95
interval_Bay, beta = (
dist_estimateur_Bay.computeBilateralConfidenceIntervalWithMarginalProbability(alpha)
)
print("Beta =", beta)
print("Confidence interval Bay =\n", interval_Bay)
print("Volume =", interval_Bay.getVolume())
sample = model_Bay.getSample(1000000)
dist_Bay = (
model_Bay.computeLogPDF(sample) - conditioned.computeLogPDF(sample)
).computeMean()
print("Kullback-Leibler distance Bay =", dist_Bay[0])
Theta Bay = [0,0.987259,0,0.970747,0]
Beta = 0.9739388920386199
Confidence interval Bay =
[0.75347, 1.37423]
[0.741497, 1.35528]
Volume = 0.38101391892307496
Kullback-Leibler distance Bay = 0.001003194257281321
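The value Beta printed above is the common marginal probability $\beta$ of the bilateral interval: each marginal interval has probability $\beta$ under the corresponding marginal of the estimator distribution, and $\beta$ is adjusted so that the joint interval has probability $\alpha = 0.95$ (this is our reading of computeBilateralConfidenceIntervalWithMarginalProbability):

$$
I_\alpha = [a_0, b_0] \times [a_1, b_1], \qquad
\mathbb{P}(Y_k \in [a_k, b_k]) = \beta, \; k = 0, 1, \qquad
\mathbb{P}(\mathbf{Y} \in I_\alpha) = \alpha.
$$

For two nearly independent components one expects $\beta \approx \sqrt{\alpha} \approx 0.975$, which is consistent with the printed values.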
The frequentist approach estimates a normal distribution from the sample. We fix the known parameters $(\mu_0, \mu_1, \rho)$ to their true values $(0, 0, 0)$.
lh_factory = ot.NormalFactory()
lh_factory.setKnownParameter([0.0, 0.0, 0.0], [0, 2, 4])
lh_est = lh_factory.buildEstimator(observations)
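As a small side check (not part of the original script, and the exact labels may vary with the library version), we can print the parameter description of a bivariate normal distribution to see that the indices 0, 2 and 4 used in setKnownParameter correspond to the two means and the correlation, while the indices 1 and 3 are the standard deviations $\sigma_0$ and $\sigma_1$ that remain to be estimated.

# Display the parameter ordering of a bivariate Normal: (mu_0, sigma_0, mu_1, sigma_1, R).
print(ot.Normal(2).getParameterDescription())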
We extract from the likelihood estimate:

- the asymptotic distribution of the estimator,
- the parameter estimates $\mathbf{\theta}_n^{MV}$,
- a bilateral confidence interval of level $\alpha = 0.95$ of the asymptotic distribution of the estimator,
- the volume of this confidence interval,
- the Kullback-Leibler distance between the maximum likelihood based distribution and the theoretical one, $KL(\text{model}_{ML} \,\|\, \text{model}_{theo})$, estimated as above.
model_ML = lh_est.getDistribution()
theta_ML = model_ML.getParameter()
print("Theta ML =", theta_ML)
dist_estimator_ML = lh_est.getParameterDistribution().getMarginal([1, 3])
interval_ML, beta = (
dist_estimator_ML.computeBilateralConfidenceIntervalWithMarginalProbability(alpha)
)
print("Beta =", beta)
print("Confidence interval ML =\n", interval_ML)
print("Volume =", interval_ML.getVolume())
sample = model_ML.getSample(1000000)
dist_KL = (
model_ML.computeLogPDF(sample) - conditioned.computeLogPDF(sample)
).computeMean()
print("Kullback-Leibler distance ML =", dist_KL[0])
Theta ML = [0,0.987259,0,0.970747,0]
Beta = 0.9746523439535231
Confidence interval ML =
[0.612215, 1.26385]
[0.660547, 1.21927]
Volume = 0.3640815958194955
Kullback-Leibler distance ML = 0.0035275895390841725
In the following figure, we plot:

- the theoretical distribution of $\mathbf{X}$, $\mathcal{N}(\mathbf{0}, \mathbf{I}_2)$ (solid lines),
- its maximum likelihood based distribution $\text{model}_{ML}$ (dashed lines),
- its Bayesian mode based distribution $\text{model}_{Bay}$ (dotted lines).

We conclude that both approaches lead to the same results.
ot.ResourceMap.SetAsString("Contour-DefaultColorMapNorm", "rank")
g = conditioned.drawPDF()
levels = g.getDrawable(0).getLevels()
dr_ML = model_ML.drawPDF().getDrawable(0).getImplementation()
dr_ML.setLevels(levels)
dr_ML.setColorBarPosition("")
dr_ML.setLineStyle("dashed")
g.add(dr_ML)
dr_Bay = model_Bay.drawPDF().getDrawable(0).getImplementation()
dr_Bay.setLevels(levels)
dr_Bay.setColorBarPosition("")
dr_Bay.setLineStyle("dotted")
g.add(dr_Bay)
g.add(ot.Cloud(observations))
g.setLegends(["Theoretical dist", "ML dist", "Bay dist", "Observations"])
g.setXTitle(r"$X_0$")
g.setYTitle(r"$X_1$")
g.setTitle("Initial distribution, ML estimated dist and Bayesian estimated dist.")
view = otv.View(g, (800, 800), square_axes=True)
In the following figure, we consider the parameter $(\sigma_0, \sigma_1)$ and we plot:

- the asymptotic distribution of the maximum likelihood estimator $\mathbf{\theta}_n^{MV}$ (left),
- the posterior distribution of the Bayesian estimator, $\pi_{n, \mathbf{Y}}$ (right).

On each figure, we draw the bilateral confidence interval (resp. bilateral credibility interval) of level $\alpha = 0.95$ computed from the estimator distribution.
First the maximum likelihood estimator.
g_ML = dist_estimator_ML.drawPDF([0.5] * 2, [1.5] * 2)
c = ot.Cloud([theta_ML[[1, 3]]])
c.setColor("red")
c.setPointStyle("bullet")
g_ML.add(c)
c = ot.Cloud([theta_Bay[[1, 3]]])
c.setColor("red")
c.setPointStyle("+")
g_ML.add(c)
a = interval_ML.getLowerBound()
b = interval_ML.getUpperBound()
c = ot.Curve([a, [a[0], b[1]], b, [b[0], a[1]], a])
c.setColor("blue")
g_ML.add(c)
a = interval_Bay.getLowerBound()
b = interval_Bay.getUpperBound()
c = ot.Curve([a, [a[0], b[1]], b, [b[0], a[1]], a])
c.setColor("blue")
c.setLineStyle("dashed")
g_ML.add(c)
g_ML.setLegends(
[
r"dist of $\mathbf{\theta}_n^{MV}$",
r"$\mathbf{\theta}_n^{MV}$",
r"$g(\mathbf{Y}_n^m)$",
"ML CI(" + str(int(100 * alpha)) + "%)",
"Bay CI(" + str(int(100 * alpha)) + "%)",
]
)
g_ML.setXTitle(r"$\sigma_0$")
g_ML.setYTitle(r"$\sigma_1$")
g_ML.setTitle("ML Estimator")
Then the Bayesian estimator.
g_Bay = dist_estimateur_Bay.drawPDF([0.5] * 2, [1.5] * 2)
c = ot.Cloud([theta_Bay[[1, 3]]])
c.setColor("red")
c.setPointStyle("bullet")
g_Bay.add(c)
c = ot.Cloud([theta_ML[[1, 3]]])
c.setColor("red")
c.setPointStyle("+")
g_Bay.add(c)
a = interval_Bay.getLowerBound()
b = interval_Bay.getUpperBound()
c = ot.Curve([a, [a[0], b[1]], b, [b[0], a[1]], a])
c.setColor("blue")
g_Bay.add(c)
a = interval_ML.getLowerBound()
b = interval_ML.getUpperBound()
c = ot.Curve([a, [a[0], b[1]], b, [b[0], a[1]], a])
c.setColor("blue")
c.setLineStyle("dashed")
g_Bay.add(c)
g_Bay.setLegends(
[
r"dist of $g(\pi_{n, \mathbf{Y}})$",
r"$g(\mathbf{Y}_n^m)$",
r"$\mathbf{\theta}_n^{MV}$",
"Bay CI(" + str(int(100 * alpha)) + "%)",
"ML CI(" + str(int(100 * alpha)) + "%)",
]
)
g_Bay.setXTitle(r"$\sigma_0$")
g_Bay.setYTitle(r"$\sigma_1$")
g_Bay.setTitle("Bayesian Estimator")
# Gather both graphs in a grid.
grid = ot.GridLayout(1, 2)
grid.setGraph(0, 0, g_ML)
grid.setGraph(0, 1, g_Bay)
view = otv.View(grid)
Finally, the following tables sum up the previously computed quantities.

Approach | $\hat{\sigma}_0$ | $\hat{\sigma}_1$ | IC($\sigma_0$) | IC($\sigma_1$)
---|---|---|---|---
Frequentist | 0.967 | 0.951 | |
Bayesian | 0.950 | 0.934 | |

Approach | $\beta$ | Volume of the confidence interval | KL distance
---|---|---|---
Frequentist | 0.974 | 0.352 |
Bayesian | 0.974 | 0.381 |
Case 2: Non-bijective link function
Here, $g = g_2$ and $\pi_{\mathbf{Y}} = \pi_{\mathbf{Y}}^2 = \mathcal{T}(-1, 0, 1) \otimes \mathcal{T}(-1, 0, 1)$.

We note that the posterior distribution of $\mathbf{Y}$ is quadri-modal, as the link function $g_2$ is no longer bijective on the range of $\pi_{\mathbf{Y}}^2$: the sign of each component of $\mathbf{Y}$ has no effect on $g_2(\mathbf{Y})$.

We go through the same steps as described previously. The maximum likelihood estimator is not changed. But in order to get the distribution of the Bayesian estimator of $(\sigma_0, \sigma_1) = (0.5 + Y_0^2, 0.5 + Y_1^2)$, we now build a kernel smoothing of a sample of $g(\mathbf{Y})$ where $\mathbf{Y}$ follows the posterior distribution $\pi_{n, \mathbf{Y}}$.
linkFunction = ot.SymbolicFunction(
["u0", "u1"], ["0.0", "0.5+u0^2", "0.0", "0.5+u1^2", "0.0"]
)
conditioning = ot.JointDistribution([ot.Triangular(-1.0, 0.0, 1.0)] * 2)
conditioning.setDescription(["Y0", "Y1"])
deconditioned = ot.DeconditionedDistribution(conditioned, conditioning, linkFunction)
posterior_Y = otexp.PosteriorDistribution(deconditioned, observations)
sample_posterior = linkFunction(posterior_Y.getSample(100000)).getMarginal([1, 3])
dist_estimateur_Bay = ot.KernelSmoothing().build(sample_posterior)
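# Illustrative check (not part of the original example): the second link function maps
# all four sign combinations of (y0, y1) to the same parameter vector, which explains
# the four symmetric modes of the posterior distribution of Y. The values below are
# arbitrary points inside the support of the prior.
for y in [[0.7, 0.5], [-0.7, 0.5], [0.7, -0.5], [-0.7, -0.5]]:
    print(y, "->", linkFunction(y))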
theta_Bay = linkFunction(computeMode(posterior_Y))
print("Theta Bay =", theta_Bay)
model_Bay = ot.Distribution(conditioned)
model_Bay.setParameter(theta_Bay)
alpha = 0.95
interval_Bay, beta = (
dist_estimateur_Bay.computeBilateralConfidenceIntervalWithMarginalProbability(alpha)
)
print("Beta =", beta)
print("Confidence interval Bay =\n", interval_Bay)
print("Volume =", interval_Bay.getVolume())
sample = model_Bay.getSample(1000000)
dist_Bay = (
model_Bay.computeLogPDF(sample) - conditioned.computeLogPDF(sample)
).computeMean()
print("Kullback-Leibler distance Bay =", dist_Bay[0])
Theta Bay = [0,0.929774,0,0.915277,0]
Beta = 0.9746612121088225
Confidence interval Bay =
[0.706559, 1.26452]
[0.696458, 1.25376]
Volume = 0.3109536335363996
Kullback-Leibler distance Bay = 0.012475430552002089
g = conditioned.drawPDF()
levels = g.getDrawable(0).getLevels()
dr_ML = model_ML.drawPDF().getDrawable(0).getImplementation()
dr_ML.setLevels(levels)
dr_ML.setColorBarPosition("")
dr_ML.setLineStyle("dashed")
g.add(dr_ML)
dr_Bay = model_Bay.drawPDF().getDrawable(0).getImplementation()
dr_Bay.setLevels(levels)
dr_Bay.setColorBarPosition("")
dr_Bay.setLineStyle("dotted")
g.add(dr_Bay)
g.add(ot.Cloud(observations))
g.setLegends(["Theoretical dist", "ML dist", "Bay dist", "Observations"])
g.setXTitle(r"$X_0$")
g.setYTitle(r"$X_1$")
g.setTitle("Initial distribution, ML estimated dist and Bayesian estimated dist.")
view = otv.View(g, (800, 800), square_axes=True)
g_ML = dist_estimator_ML.drawPDF([0.5] * 2, [1.5] * 2)
c = ot.Cloud([theta_ML[[1, 3]]])
c.setColor("red")
c.setPointStyle("bullet")
g_ML.add(c)
c = ot.Cloud([theta_Bay[[1, 3]]])
c.setColor("red")
c.setPointStyle("+")
g_ML.add(c)
a = interval_ML.getLowerBound()
b = interval_ML.getUpperBound()
c = ot.Curve([a, [a[0], b[1]], b, [b[0], a[1]], a])
c.setColor("blue")
g_ML.add(c)
a = interval_Bay.getLowerBound()
b = interval_Bay.getUpperBound()
c = ot.Curve([a, [a[0], b[1]], b, [b[0], a[1]], a])
c.setColor("blue")
c.setLineStyle("dashed")
g_ML.add(c)
g_ML.setLegends(
[
r"dist of $\mathbf{\theta}_n^{MV}$",
r"$\mathbf{\theta}_n^{MV}$",
r"$g(\mathbf{Y}_n^m)$",
"ML CI(" + str(int(100 * alpha)) + "%)",
"Bay CI(" + str(int(100 * alpha)) + "%)",
]
)
g_ML.setXTitle(r"$\sigma_0$")
g_ML.setYTitle(r"$\sigma_1$")
g_ML.setTitle("ML Estimator")
g_Bay = dist_estimateur_Bay.drawPDF([0.5] * 2, [1.5] * 2)
c = ot.Cloud([theta_Bay[[1, 3]]])
c.setColor("red")
c.setPointStyle("bullet")
g_Bay.add(c)
c = ot.Cloud([theta_ML[[1, 3]]])
c.setColor("red")
c.setPointStyle("+")
g_Bay.add(c)
a = interval_Bay.getLowerBound()
b = interval_Bay.getUpperBound()
c = ot.Curve([a, [a[0], b[1]], b, [b[0], a[1]], a])
c.setColor("blue")
g_Bay.add(c)
a = interval_ML.getLowerBound()
b = interval_ML.getUpperBound()
c = ot.Curve([a, [a[0], b[1]], b, [b[0], a[1]], a])
c.setColor("blue")
c.setLineStyle("dashed")
g_Bay.add(c)
g_Bay.setLegends(
[
r"dist of $g(\pi_{n, \mathbf{Y}})$",
r"$g(\mathbf{Y}_n^m)$",
r"$\mathbf{\theta}_n^{MV}$",
"Bay CI(" + str(int(100 * alpha)) + "%)",
"ML CI(" + str(int(100 * alpha)) + "%)",
]
)
g_Bay.setXTitle(r"$\sigma_0$")
g_Bay.setYTitle(r"$\sigma_1$")
g_Bay.setTitle("Bayesian Estimator")
grid = ot.GridLayout(1, 2)
grid.setGraph(0, 0, g_ML)
grid.setGraph(0, 1, g_Bay)
view = otv.View(grid)
Finally, the following tables sum up the previously computed quantities.

Approach | $\hat{\sigma}_0$ | $\hat{\sigma}_1$ | IC($\sigma_0$) | IC($\sigma_1$)
---|---|---|---|---
Frequentist | 0.967 | 0.951 | |
Bayesian | 0.929 | 0.915 | |

Approach | $\beta$ | Volume of the confidence interval | KL distance
---|---|---|---
Frequentist | 0.974 | 0.352 |
Bayesian | 0.974 | 0.316 |
We also plot the PDF of the posterior distribution of $\mathbf{Y}$, which is quadri-modal, together with a sample of it.
sphinx_gallery_thumbnail_number = 5
g_pinY = posterior_Y.drawPDF()
g_pinY.setXTitle(r"$Y_0$")
g_pinY.setYTitle(r"$Y_1$")
g_pinY.setTitle(r"Posterior Bayesian Distribution of $\mathbf{Y}$")
g_pinY.add(ot.Cloud(posterior_Y.getSample(100)))
view = otv.View(g_pinY, (800, 800))
view.ShowAll()