Estimation of a spectral density function
Let $X: \Omega \times \mathcal{D} \rightarrow \mathbb{R}^d$ be a multivariate
stationary normal process of dimension $d$. We only treat here
the case where the domain is of dimension 1: $\mathcal{D} \subset \mathbb{R}$.
If the process is continuous, then $\mathcal{D} = \mathbb{R}$. In the discrete
case, $\mathcal{D}$ is a lattice.
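$X$ is supposed to be a second order process with zero mean, and
we suppose that its spectral density function
$S: \mathbb{R} \rightarrow \mathcal{H}^+(d)$ defined in
(8) exists. $\mathcal{H}^+(d) \subset \mathcal{M}_d(\mathbb{C})$ is the set of
$d$-dimensional positive definite Hermitian matrices.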
Our goal is to estimate the spectral density
function $S$ from data, which can be a sample of time series or
one time series.
Depending on the available data, we proceed differently:
if the data correspond to several independent realizations of the process, a statistical estimate is performed using the statistical average of a realization-based estimator;
if the data correspond to one realization of the process at different time stamps (stored in a TimeSeries object), the process being observed during a long period of time, an ergodic estimate is performed using a time average of an ergodic-based estimator.
The estimation of the spectral density function from data may use
parametric or nonparametric methods.
The Welch method is a nonparametric estimation technique known
to perform well. We detail it in the case where the available data on
the process is a time series whose values $(x_{t_0}, \dots, x_{t_{N-1}})$ are
associated with the time grid $(t_0, \dots, t_{N-1})$,
which is a discretization of the domain $[t_0, t_{N-1}]$.
We assume that the process has a spectral density $S$
defined on $\mathbb{R}$.
The method is based on the segmentation of the time series into
$K$ segments of length $L$, possibly overlapping (size of
overlap $R$).
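Let $X_1(j), \; j = 0, 1, \dots, L-1$ be the first such
segment. Then:

$$X_1(j) = X(j), \quad j = 0, 1, \dots, L-1$$

Applying the same decomposition,

$$X_2(j) = X(j + (L - R)), \quad j = 0, 1, \dots, L-1$$

and finally:

$$X_K(j) = X(j + (K-1)(L - R)), \quad j = 0, 1, \dots, L-1$$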
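The segmentation above can be sketched in a few lines of plain Python. This is only an illustration, not the library's implementation: the function name `split_into_segments` and its arguments are hypothetical, the segment length and the overlap are assumed to be given as numbers of samples, and trailing values that do not fill a complete segment are simply dropped.

```python
def split_into_segments(values, segment_length, overlap):
    """Return the overlapping segments X_1, ..., X_K of a series.

    Segment k (counted from 0) starts at index k * (L - R), where
    L = segment_length and R = overlap.
    """
    if not 0 <= overlap < segment_length:
        raise ValueError("overlap must satisfy 0 <= overlap < segment_length")
    step = segment_length - overlap  # shift L - R between consecutive segments
    segments = []
    start = 0
    while start + segment_length <= len(values):
        segments.append(values[start:start + segment_length])
        start += step
    return segments


series = list(range(10))  # stand-in for N = 10 observed values
segments = split_into_segments(series, segment_length=4, overlap=2)
for k, segment in enumerate(segments, start=1):
    print(k, segment)
```

With $N = 10$, $L = 4$ and $R = 2$, consecutive segments are shifted by $L - R = 2$ values, which yields $K = 4$ segments.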
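The objective is to get a statistical estimator from these $K$
segments. We define the periodogram associated with the segment
$X_k$ by:

$$X_k(f_p, T) = \Delta t \sum_{n=0}^{L-1} X_k(n) \exp(-2 i \pi f_p n \Delta t), \quad p = 0, \dots, L-1$$

$$\hat{S}_k(f_p, T) = \frac{1}{T} X_k(f_p, T) \, X_k(f_p, T)^{*t}$$

with $\Delta t$ the time step of the grid, $T = L \Delta t$ the duration of a segment, $^{*t}$ the conjugate transpose, and
$f_p = \frac{p}{T} = \frac{p}{L} \frac{1}{\Delta t}$.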
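As an illustration, the periodogram of one segment can be computed with a direct discrete Fourier transform. The sketch below makes several assumptions that are not taken from the library: the process is scalar ($d = 1$), the periodogram is two-sided and normalised as $|X_k(f_p)|^2 / (L \Delta t)$, and the name `periodogram` is hypothetical.

```python
import cmath
import math


def periodogram(segment, dt):
    """Two-sided periodogram of a scalar segment at f_p = p / (L * dt)."""
    L = len(segment)
    duration = L * dt
    values = []
    for p in range(L):
        # Discrete Fourier transform scaled by the time step dt;
        # f_p * n * dt simplifies to p * n / L.
        transform = dt * sum(
            x * cmath.exp(-2j * math.pi * p * n / L)
            for n, x in enumerate(segment)
        )
        values.append(abs(transform) ** 2 / duration)
    return values


dt = 0.5
L = 16
# Cosine whose frequency is exactly the grid frequency f_3 = 3 / (L * dt).
segment = [math.cos(2.0 * math.pi * 3 * n / L) for n in range(L)]
estimate = periodogram(segment, dt)
peak = max(range(L // 2), key=lambda p: estimate[p])
print(peak, estimate[peak])
```

For this test signal the largest value on the first half of the frequency grid is found at $p = 3$, the frequency of the cosine; the other grid frequencies carry essentially no power because the cosine completes a whole number of periods over the segment.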
It has been proven that the periodogram has poor statistical properties. Indeed, two quantities summarize the properties of an estimator: its bias and its covariance. The bias is the expected error one makes on average when using only a finite number of time series of finite length, whereas the covariance measures the expected fluctuations of the estimator around its mean value. For the periodogram, we have:
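Bias

$$\mathbb{E}\left[\hat{S}(f_p, T)\right] - S(f_p) = \left[\left(\frac{1}{T} W_B(\cdot, T) - \delta_0\right) * S\right](f_p)$$

where $*$ denotes the convolution, $\delta_0$ is the Dirac distribution at 0 and $W_B(\cdot, T)$
is the squared modulus of the Fourier transform of the function $w_B$
(Bartlett window) defined by:

$$w_B(t, T) = \mathbf{1}_{[0, T]}(t)$$

This estimator is biased but this bias vanishes when $T \rightarrow \infty$
as $\lim_{T \rightarrow \infty} \frac{1}{T} W_B(\cdot, T) = \delta_0$.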
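Covariance

For each component $i$ and $f_p \neq 0$, $\mathrm{Var}\left[\hat{S}_{ii}(f_p, T)\right] \rightarrow S_{ii}(f_p)^2$
as $T \rightarrow \infty$, which means that the fluctuations of the periodogram are of the same order of magnitude as the quantity to be estimated and thus the estimator is not convergent.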
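The periodogram's lack of convergence may be easily fixed if we consider
the averaged periodogram over $K$ independent time series or
segments:

$$\hat{S}(f_p, T) = \frac{1}{K} \sum_{k=1}^{K} \hat{S}_k(f_p, T)$$

where $\hat{S}_k$ is the periodogram of the $k$-th time series or segment.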
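The effect of averaging can be checked numerically on simulated data. The sketch below is an illustration under stated assumptions, not the library's code: the process is a scalar Gaussian white noise with unit variance and $\Delta t = 1$ (so its two-sided spectral density is flat and equal to 1), the $K$ segments are independent, and the helper names `periodogram` and `averaged_periodogram` are hypothetical.

```python
import cmath
import math
import random
import statistics


def periodogram(segment, dt):
    """Two-sided periodogram |X(f_p)|^2 / (L * dt) of a scalar segment."""
    L = len(segment)
    out = []
    for p in range(L):
        transform = dt * sum(
            x * cmath.exp(-2j * math.pi * p * n / L)
            for n, x in enumerate(segment)
        )
        out.append(abs(transform) ** 2 / (L * dt))
    return out


def averaged_periodogram(segments, dt):
    """Average of the periodograms of K segments of equal length."""
    K = len(segments)
    total = [0.0] * len(segments[0])
    for segment in segments:
        for p, value in enumerate(periodogram(segment, dt)):
            total[p] += value / K
    return total


rng = random.Random(0)
dt, L, K = 1.0, 32, 64
segments = [[rng.gauss(0.0, 1.0) for _ in range(L)] for _ in range(K)]

bins = list(range(1, L // 2))  # skip the zero frequency
single = periodogram(segments[0], dt)
averaged = averaged_periodogram(segments, dt)
# Spread of each estimate across frequencies; the true density is flat.
spread_single = statistics.pstdev([single[p] for p in bins])
spread_averaged = statistics.pstdev([averaged[p] for p in bins])
print(spread_single, spread_averaged)
```

Since the true density is constant, the spread of each estimate across frequencies reflects its fluctuations: those of the single periodogram are comparable to the density itself, while averaging over $K$ independent segments divides the variance by roughly $K$.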
The averaging process has no effect on the significant bias of the periodogram.
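The use of a tapering window $w(t, T)$ may significantly reduce
it. The time series $x(t)$
is replaced by a tapered time
series $w(t, T) \, x(t)$
in the computation of $X(f_p, T)$. One gets:

$$\mathbb{E}\left[\hat{S}(f_p, T)\right] - S(f_p) = \left[\left(\frac{1}{T} W(\cdot, T) - \delta_0\right) * S\right](f_p)$$

where $W(f, T)$ is the squared modulus of the Fourier transform
of $w(\cdot, T)$ at the frequency $f$. A judicious choice of
tapering function such as the Hann window $w_H$
can dramatically reduce the bias of the estimate:

$$w_H(t, T) = \sqrt{\frac{8}{3}} \left(1 - \cos^2\left(\frac{\pi t}{T}\right)\right) \mathbf{1}_{[0, T]}(t) \tag{1}$$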
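The benefit of tapering can be illustrated on a sinusoid whose frequency falls between two grid frequencies, a case where the untapered periodogram leaks power far from the true frequency. The sketch below is not the library's implementation: it assumes a scalar series, a Hann window written as $\sqrt{8/3} \, (1 - \cos^2(\pi t / T))$ on $[0, T]$ (normalised so that its mean square is 1), and hypothetical helper names `hann` and `tapered_periodogram`.

```python
import cmath
import math


def hann(t, T):
    """Hann taper normalised so that the mean of w^2 over [0, T] is 1."""
    if not 0.0 <= t <= T:
        return 0.0
    return math.sqrt(8.0 / 3.0) * (1.0 - math.cos(math.pi * t / T) ** 2)


def tapered_periodogram(segment, dt, window=None):
    """Periodogram of a scalar segment, optionally multiplied by a taper."""
    L = len(segment)
    T = L * dt
    if window is not None:
        segment = [window(n * dt, T) * x for n, x in enumerate(segment)]
    out = []
    for p in range(L):
        transform = dt * sum(
            x * cmath.exp(-2j * math.pi * p * n / L)
            for n, x in enumerate(segment)
        )
        out.append(abs(transform) ** 2 / T)
    return out


dt, L = 1.0, 64
# Sinusoid whose frequency 4.5 / T lies between two grid frequencies f_p.
segment = [math.cos(2.0 * math.pi * 4.5 * n / L) for n in range(L)]
raw = tapered_periodogram(segment, dt)
tapered = tapered_periodogram(segment, dt, window=hann)
mean_square = sum(hann(n * dt, L * dt) ** 2 for n in range(L)) / L
print(mean_square, raw[20], tapered[20])
```

At $p = 20$, far from the frequency of the sinusoid, the tapered periodogram is several orders of magnitude smaller than the untapered one, which is the leakage (bias) reduction described above. The unit mean square keeps the overall level of the estimate unchanged.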
The library builds an estimation of the spectrum on a TimeSeries by fixing the number of segments, the overlap size parameter and a FilteringWindows. The available ones are:
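The Hamming window

$$w(t, T) = \frac{1}{\sqrt{\kappa}} \left(0.54 - 0.46 \cos\left(\frac{2 \pi t}{T}\right)\right) \mathbf{1}_{[0, T]}(t)$$

with $\kappa = 0.54^2 + \frac{1}{2} 0.46^2 = 0.3974$

The Hann window described in (1), which is generally considered the most useful.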
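Putting the pieces together, a Welch-type estimate can be sketched in plain Python from a number of segments, an overlap size and a window. This is an illustration under assumptions and does not reproduce the library's classes or conventions: the series is scalar, the overlap is given as a number of samples, the segment length is deduced from $N$, $K$ and $R$, the Hamming window is normalised to unit mean square, and the names `hamming`, `welch_estimate` and `HAMMING_KAPPA` are hypothetical.

```python
import cmath
import math
import random

HAMMING_KAPPA = 0.54 ** 2 + 0.5 * 0.46 ** 2  # mean square of the raw taper


def hamming(t, T):
    """Hamming taper on [0, T], normalised to unit mean square."""
    if not 0.0 <= t <= T:
        return 0.0
    raw = 0.54 - 0.46 * math.cos(2.0 * math.pi * t / T)
    return raw / math.sqrt(HAMMING_KAPPA)


def welch_estimate(values, dt, num_segments, overlap, window):
    """Average the tapered periodograms of overlapping segments."""
    if num_segments < 1:
        raise ValueError("num_segments must be at least 1")
    N = len(values)
    # Largest L such that L + (K - 1) * (L - R) <= N.
    L = (N + (num_segments - 1) * overlap) // num_segments
    if not 0 <= overlap < L:
        raise ValueError("overlap must satisfy 0 <= overlap < segment length")
    T = L * dt
    taper = [window(n * dt, T) for n in range(L)]
    estimate = [0.0] * L
    for k in range(num_segments):
        start = k * (L - overlap)
        segment = [w * x for w, x in zip(taper, values[start:start + L])]
        for p in range(L):
            transform = dt * sum(
                x * cmath.exp(-2j * math.pi * p * n / L)
                for n, x in enumerate(segment)
            )
            estimate[p] += abs(transform) ** 2 / (T * num_segments)
    frequencies = [p / T for p in range(L)]
    return frequencies, estimate


rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(1024)]  # unit-variance white noise
frequencies, estimate = welch_estimate(
    noise, dt=1.0, num_segments=31, overlap=16, window=hamming
)
print(len(frequencies), sum(estimate[1:24]) / 23)
```

With these settings the series is cut into 31 segments of 48 values, and the estimate stays close to the flat density of the simulated white noise.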