Box Cox transformation
We present here:
- the estimation of the Box Cox transformation from a given field of the process $X$,
- the action of the Box Cox transformation on a field generated from $X$.
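Let $h$ denote the Box Cox transformation and $X_t$ the (positive, scalar) value of the process at the vertex $t$. The first-order Taylor expansion of $h$ around the mean $\mathbb{E}[X_t]$ reads:

$$h(X_t) \approx h(\mathbb{E}[X_t]) + (X_t - \mathbb{E}[X_t]) \, h'(\mathbb{E}[X_t])$$

which leads to:

$$\mathbb{E}[h(X_t)] \approx h(\mathbb{E}[X_t])$$

and then:

$$\mathrm{Var}[h(X_t)] \approx h'(\mathbb{E}[X_t])^2 \, \mathrm{Var}[X_t]$$

To have $\mathrm{Var}[h(X_t)]$ constant with respect to $t$ at the first order, we need:

$$h'(\mathbb{E}[X_t]) = k \, \left(\mathrm{Var}[X_t]\right)^{-1/2} \tag{1}$$

where $k$ is a constant.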
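Now, we make some additional hypotheses on the relation between $\mathbb{E}[X_t]$ and $\mathrm{Var}[X_t]$:
- If we suppose that $\mathrm{Var}[X_t] \propto \mathbb{E}[X_t]$, then (1) leads to the function $h(y) \propto \sqrt{y}$ and we take $h(y) = \sqrt{y}$, $y > 0$;
- If we suppose that $\mathrm{Var}[X_t] \propto (\mathbb{E}[X_t])^2$, then (1) leads to the function $h(y) \propto \log y$ and we take $h(y) = \log y$, $y > 0$;
- More generally, if we suppose that $\mathrm{Var}[X_t] \propto (\mathbb{E}[X_t])^{\gamma}$, then (1) leads to the function $h_\lambda$ parametrized by the scalar $\lambda = 1 - \gamma/2$:

$$h_\lambda(y) = \begin{cases} \dfrac{y^\lambda - 1}{\lambda} & \text{if } \lambda \neq 0 \\ \log(y) & \text{if } \lambda = 0 \end{cases} \tag{2}$$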
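The step from the variance condition (1) to the Box Cox family can be made explicit. The following is a short worked sketch, not the author's own derivation: it writes $m$ for the mean $\mathbb{E}[X_t]$ and $\gamma$ for the exponent of the assumed power law $\mathrm{Var}[X_t] \propto m^{\gamma}$ (both notations are introduced here for illustration). Integrating (1) gives:

$$h'(m) \propto m^{-\gamma/2} \quad \Longrightarrow \quad h(m) \propto \begin{cases} \dfrac{m^{1 - \gamma/2}}{1 - \gamma/2} & \text{if } \gamma \neq 2 \\ \log(m) & \text{if } \gamma = 2 \end{cases}$$

Setting $\lambda = 1 - \gamma/2$ and fixing the additive and multiplicative constants so that $h(1) = 0$ and $h'(1) = 1$ yields the usual Box Cox form $(m^\lambda - 1)/\lambda$. The case $\gamma = 1$ gives $\lambda = 1/2$ (the square-root transformation, up to an affine rescaling) and the case $\gamma = 2$ gives $\lambda = 0$ (the logarithm).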
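The inverse Box Cox transformation is defined by:

$$h_\lambda^{-1}(y) = \begin{cases} (\lambda y + 1)^{1/\lambda} & \text{if } \lambda \neq 0 \\ \exp(y) & \text{if } \lambda = 0 \end{cases} \tag{3}$$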
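As an illustration, the direct and inverse transformations can be sketched in plain Python. The function names `box_cox` and `inverse_box_cox` are hypothetical helpers chosen for this example and are not part of any library API; the sketch handles the scalar case only.

```python
import math


def box_cox(y, lam):
    """Box Cox transformation of a positive scalar y with parameter lam."""
    if y <= 0.0:
        raise ValueError("the Box Cox transformation requires y > 0")
    if lam == 0.0:
        return math.log(y)
    # expm1(lam * log(y)) equals y**lam - 1 and stays accurate when lam is close to 0
    return math.expm1(lam * math.log(y)) / lam


def inverse_box_cox(z, lam):
    """Inverse transformation; needs lam * z + 1 > 0 when lam is not zero."""
    if lam == 0.0:
        return math.exp(z)
    base = lam * z + 1.0
    if base <= 0.0:
        raise ValueError("lam * z + 1 must be positive")
    return base ** (1.0 / lam)


# Round trip: applying the inverse after the transformation recovers the input
round_trip_ok = all(
    math.isclose(inverse_box_cox(box_cox(3.7, lam), lam), 3.7)
    for lam in (-1.0, 0.0, 0.5, 2.0)
)
```

Treating `lam == 0.0` as a separate branch mirrors the two-case definition; the `expm1` form is a numerical choice made for this sketch so that values of `lam` near zero approach the logarithm smoothly.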
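The parameter $\lambda$ is estimated from a given field of the process, under the assumption that $h_\lambda(X_t) \sim \mathcal{N}(\beta, \sigma^2)$ at each vertex $t$, where $h_\lambda$ is the Box Cox transformation of parameter $\lambda$. Writing $\Phi_{\beta, \sigma}$ and $\phi_{\beta, \sigma}$ for the cumulative distribution function and the probability density function of the $\mathcal{N}(\beta, \sigma^2)$ distribution, we have for all vertices $t$:

$$\forall v > 0, \quad \mathbb{P}(X_t \leq v) = \mathbb{P}(h_\lambda(X_t) \leq h_\lambda(v)) = \Phi_{\beta, \sigma}(h_\lambda(v)) \tag{4}$$

from which we derive the probability density function $p$ of $X_t$ for all vertices $t$:

$$p(v) = h_\lambda'(v) \, \phi_{\beta, \sigma}(h_\lambda(v)) = v^{\lambda - 1} \, \phi_{\beta, \sigma}(h_\lambda(v)) \tag{5}$$

Using (5), the likelihood of the values $(x_0, \dots, x_{N-1})$ with respect to the model (4) reads:

$$L(\beta, \sigma, \lambda) = \frac{1}{(2\pi)^{N/2}} \times \frac{1}{(\sigma^2)^{N/2}} \times \exp\left[ -\frac{1}{2\sigma^2} \sum_{k=0}^{N-1} \left( h_\lambda(x_k) - \beta \right)^2 \right] \times \prod_{k=0}^{N-1} x_k^{\lambda - 1} \tag{6}$$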
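We notice that for each fixed $\lambda$, the likelihood equation is proportional to the likelihood equation which estimates $(\beta, \sigma^2)$. Thus, the maximum likelihood estimators of $(\beta(\lambda), \sigma^2(\lambda))$ for a given $\lambda$, computed from the observed values $(x_0, \dots, x_{N-1})$, are:

$$\hat{\beta}(\lambda) = \frac{1}{N} \sum_{k=0}^{N-1} h_\lambda(x_k), \qquad \hat{\sigma}^2(\lambda) = \frac{1}{N} \sum_{k=0}^{N-1} \left( h_\lambda(x_k) - \hat{\beta}(\lambda) \right)^2 \tag{7}$$

Substituting (7) into (6) and taking the logarithm, we obtain:

$$\ell(\lambda) = \log L(\hat{\beta}(\lambda), \hat{\sigma}(\lambda), \lambda) = C - \frac{N}{2} \log\left[ \hat{\sigma}^2(\lambda) \right] + (\lambda - 1) \sum_{k=0}^{N-1} \log(x_k) \tag{8}$$

where $C$ is a constant.
The parameter $\hat{\lambda}$ is the one maximizing $\ell(\lambda)$ defined in (8).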
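The estimation procedure above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the implementation of any particular library: the helper names (`box_cox`, `profile_log_likelihood`, `estimate_lambda`) are hypothetical, the search interval $[-3, 3]$ is an arbitrary choice, and the golden-section search assumes that the profile log-likelihood has a single maximum on that interval.

```python
import math
import random


def box_cox(y, lam):
    """Box Cox transformation of a positive scalar y."""
    if lam == 0.0:
        return math.log(y)
    return math.expm1(lam * math.log(y)) / lam


def profile_log_likelihood(values, lam):
    """Profile log-likelihood in lam, without the additive constant C."""
    n = len(values)
    z = [box_cox(v, lam) for v in values]
    beta_hat = sum(z) / n                                    # mean of transformed values
    sigma2_hat = sum((zi - beta_hat) ** 2 for zi in z) / n   # their (biased) variance
    sum_log = sum(math.log(v) for v in values)               # Jacobian contribution
    return -0.5 * n * math.log(sigma2_hat) + (lam - 1.0) * sum_log


def estimate_lambda(values, lower=-3.0, upper=3.0, tol=1e-6):
    """Golden-section search for the maximizer (assumes a single maximum)."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lower, upper
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if profile_log_likelihood(values, c) > profile_log_likelihood(values, d):
            b = d   # the maximum lies in [a, d]
        else:
            a = c   # the maximum lies in [c, b]
    return 0.5 * (a + b)


# Synthetic positive sample whose logarithm is Gaussian, so lam = 0 is the reference value
random.seed(0)
sample = [math.exp(random.gauss(1.0, 0.5)) for _ in range(2000)]
lam_hat = estimate_lambda(sample)
```

On such a log-normal sample the estimate is expected to fall close to zero, since the logarithm is the transformation that makes the data Gaussian.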
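In the frame of the general linear model, we consider a functional relation between some input and output values. Let us consider the following dataset: $(x_k, y_k)_{k=0,\dots,N-1}$.
The general linear model aims at assessing the following prior model:

$$Y(x) = F(x)^T \beta + Z(x)$$

where:
- $F(x)^T \beta$ is a general linear model based upon a functional basis $F = (f_1, \dots, f_p)$ and a vector of coefficients $\beta$,
- $Z$ is a zero-mean stationary Gaussian process whose covariance function reads:

$$\mathbb{E}[Z(x) Z(\tilde{x})] = \sigma^2 R(x - \tilde{x}, \theta)$$

where $\sigma^2$ is the variance and $R$ is the correlation function, which depends solely on the Manhattan distance between the input points $x$ and $\tilde{x}$ and on a vector of parameters $\theta$.
The optimal parameters of such a model are estimated by maximizing a log-likelihood function. Here we suppose a Gaussian prior on $h_\lambda(Y)$, where $h_\lambda$ is the Box Cox transformation of parameter $\lambda$. Thus, writing out our various hypotheses, we get the following log-likelihood function to be optimized:

$$\ell_{glm}(\beta, \theta, \sigma, \lambda) = C - \frac{1}{2} \log\left( \det \Sigma \right) - \frac{1}{2} (\tilde{y} - F\beta)^T \Sigma^{-1} (\tilde{y} - F\beta) + (\lambda - 1) \sum_{k=0}^{N-1} \log(y_k) \tag{9}$$

where $C$ is a constant,

$$\tilde{y} = \left( h_\lambda(y_0), \dots, h_\lambda(y_{N-1}) \right)^T, \qquad F = \left( f_j(x_i) \right)_{i,j}, \qquad \Sigma_{ij} = \sigma^2 R(x_i - x_j, \theta) \tag{10}$$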
Remarks:
- Equation (9) also applies if we replace the general linear model by a linear regression model. Indeed, a linear model is a special case of the general linear model in which the correlation model is a Dirac covariance model.
- Such an estimate can be costly because it requires a double-loop optimization: for each $\lambda$ value, we optimize the parameters of the underlying general linear model. Some practitioners freeze the parameters of the first general linear model and then perform a single-loop optimization that selects only the best $\lambda$ value.
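In the frame of linear models, we consider a functional relation between some input and output values. Let us consider the following dataset: $(x_k, y_k)_{k=0,\dots,N-1}$.
The linear model aims at assessing the following prior model:

$$Y(x) = F(x)^T \beta + \varepsilon$$

where:
- $F(x)^T \beta$ is a linear model based upon a functional basis $F = (f_1, \dots, f_p)$ and a vector of coefficients $\beta$,
- $\varepsilon$ is a zero-mean stationary white noise process.

The optimal parameters of such a model are estimated by maximizing a log-likelihood function. Here we suppose a Gaussian prior on $h_\lambda(Y)$, where $h_\lambda$ is the Box Cox transformation of parameter $\lambda$. Thus, writing out our various hypotheses, we get the following log-likelihood function to be optimized:

$$\ell_{lm}(\lambda) = C - \frac{N}{2} \log\left[ \hat{\sigma}^2(\lambda) \right] + (\lambda - 1) \sum_{k=0}^{N-1} \log(y_k) \tag{11}$$

where $C$ is a constant,

$$\hat{\beta}(\lambda) = \arg\min_{\beta} \left\| \tilde{y}(\lambda) - F\beta \right\|_2^2, \qquad \hat{\sigma}^2(\lambda) = \frac{1}{N} \left\| \tilde{y}(\lambda) - F\hat{\beta}(\lambda) \right\|_2^2 \tag{12}$$

with $\tilde{y}(\lambda) = \left( h_\lambda(y_0), \dots, h_\lambda(y_{N-1}) \right)^T$ and $F = \left( f_j(x_i) \right)_{i,j}$ the regression matrix.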
As a remark, the above case is a particular case of (9). Indeed, a linear model is a special case of the general linear model in which the correlation model is a Dirac covariance model (white noise model).
In terms of cost, a factorization (QR or SVD) of the regression matrix is done once, and the parameters defined in (12) are then easily obtained, for each new $\lambda$ value, by solving the corresponding linear systems.
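The "factorize once, solve for each value" idea can be illustrated with a deliberately small sketch. To stay within the Python standard library, it assumes a two-function basis (a constant and the identity), for which the least-squares solution has a closed form; the precomputed centered inputs play the role that the QR or SVD factorization plays for a general regression matrix. The class name `AffineBoxCoxLikelihood` and its methods are hypothetical names introduced for this example.

```python
import math
import random


def box_cox(y, lam):
    """Box Cox transformation of a positive scalar y."""
    if lam == 0.0:
        return math.log(y)
    return math.expm1(lam * math.log(y)) / lam


class AffineBoxCoxLikelihood:
    """Profile log-likelihood for the model h_lam(y) = b0 + b1 * x + white noise."""

    def __init__(self, x, y):
        self.x, self.y = list(x), list(y)
        self.n = len(self.x)
        # Quantities that do not depend on lam are computed once
        # (stand-in for the factorization of the regression matrix).
        self.x_mean = sum(self.x) / self.n
        self.x_centered = [xi - self.x_mean for xi in self.x]
        self.sxx = sum(c * c for c in self.x_centered)
        self.sum_log_y = sum(math.log(v) for v in self.y)

    def fit(self, lam):
        """Least-squares coefficients and residual variance for a given lam."""
        z = [box_cox(v, lam) for v in self.y]
        z_mean = sum(z) / self.n
        b1 = sum(c * zi for c, zi in zip(self.x_centered, z)) / self.sxx
        b0 = z_mean - b1 * self.x_mean
        sigma2 = sum((zi - b0 - b1 * xi) ** 2 for xi, zi in zip(self.x, z)) / self.n
        return b0, b1, sigma2

    def log_likelihood(self, lam):
        """Profile log-likelihood in lam, without the additive constant."""
        _, _, sigma2 = self.fit(lam)
        return -0.5 * self.n * math.log(sigma2) + (lam - 1.0) * self.sum_log_y


# Synthetic data: the square root of y is affine in x, so lam = 0.5 is the reference value
random.seed(1)
x = [10.0 * i / 199 for i in range(200)]
y = [(1.0 + 0.5 * xi + random.gauss(0.0, 0.1)) ** 2 for xi in x]
model = AffineBoxCoxLikelihood(x, y)
b0, b1, sigma2 = model.fit(0.5)
```

Each call to `fit` only costs a pass over the data, which is the point of separating the work that depends on the parameter from the work that does not. For a larger basis, a real implementation would reuse a QR or SVD factorization instead of the closed form used here.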
Sometimes, people perform a grid search, for example varying $\lambda$ from -3 to 3 using a small step. This allows one both to get the optimal $\lambda$ and to assess the robustness of the optimum.
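A grid search of this kind can be sketched as follows, here on a plain sample without regression. The helper names, the step of 0.05, and the use of a log-likelihood drop of 1.92 (half the 95% quantile of a chi-square distribution with one degree of freedom) as a robustness indicator are choices made for this illustration, not prescriptions from the text above.

```python
import math
import random


def box_cox(y, lam):
    """Box Cox transformation of a positive scalar y."""
    if lam == 0.0:
        return math.log(y)
    return math.expm1(lam * math.log(y)) / lam


def log_likelihood(values, lam):
    """Profile log-likelihood in lam, without the additive constant."""
    n = len(values)
    z = [box_cox(v, lam) for v in values]
    mean = sum(z) / n
    sigma2 = sum((zi - mean) ** 2 for zi in z) / n
    return -0.5 * n * math.log(sigma2) + (lam - 1.0) * sum(math.log(v) for v in values)


def grid_search(values, lower=-3.0, upper=3.0, step=0.05):
    """Evaluate the log-likelihood on a regular grid and return the best point."""
    count = int(round((upper - lower) / step)) + 1
    curve = [(lower + i * step, log_likelihood(values, lower + i * step))
             for i in range(count)]
    best_lam, best_ll = max(curve, key=lambda point: point[1])
    return best_lam, best_ll, curve


def near_optimal_interval(curve, best_ll, drop=1.92):
    """Range of grid values whose log-likelihood is within `drop` of the maximum."""
    kept = [lam for lam, ll in curve if ll >= best_ll - drop]
    return min(kept), max(kept)


# Synthetic sample whose square root is Gaussian, so lam = 0.5 is the reference value
random.seed(2)
values = [(5.0 + random.gauss(0.0, 1.0)) ** 2 for _ in range(1000)]
best_lam, best_ll, curve = grid_search(values)
low, high = near_optimal_interval(curve, best_ll)
```

A narrow interval `(low, high)` suggests a well-identified optimum, whereas a wide one indicates that many values of the parameter explain the data almost equally well.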