Estimation of a spectral density function

Let X: \Omega \times \cD \rightarrow \Rset^d be a multivariate stationary normal process of dimension d. We only treat here the case where the domain is of dimension 1: \cD \subset \Rset (n=1).
If the process is continuous, then \cD=\Rset. In the discrete case, \cD is a lattice.
X is assumed to be a second order process with zero mean, and we assume that its spectral density function S : \Rset \rightarrow \mathcal{H}^+(d) defined in (8) exists. \mathcal{H}^+(d) \subset \mathcal{M}_d(\Cset) is the set of d-dimensional positive definite Hermitian matrices.
The objective of this use case is to estimate the spectral density function S from data, which can be either a sample of time series or a single time series.

Depending on the available data, we proceed differently:

  • if the data correspond to several independent realizations of the process, a statistical estimate is computed as the statistical average of a realization-based estimator;

  • if the data correspond to one realization of the process at different time stamps (stored in a TimeSeries object), the process being observed during a long period of time, an ergodic estimate is performed using a time average of an ergodic-based estimator.

The spectral density function may be estimated from data with parametric or nonparametric methods.
The Welch method is a nonparametric estimation technique that is known to perform well. We detail it in the case where the available data on the process is a time series whose values (\vect{x}_0, \dots,\vect{x}_{N-1}) are associated with the time grid (t_0, \dots, t_{N-1}), which is a discretization of the domain [0,T].
We assume that the process has a spectral density S defined on | f | \leq \frac{1}{2\Delta t}, where \Delta t=\frac{T}{N} is the time step of the grid.
The method is based on the segmentation of the time series into K segments of length L, possibly overlapping (size of overlap R).
Let \vect{X}_{1}(j), \ j = 0, 1,...,L-1 be the first such segment. Then:

\vect{X}_{1}(j) = \vect{X}(j) , \ j = 0, 1,...,L-1

Applying the same decomposition,

\vect{X}_{2}(j) = \vect{X}(j + (L - R)) , \ j = 0, 1,...,L-1

and finally:

\vect{X}_{K}(j) = \vect{X}(j + (K-1)(L-R)) , \ j = 0, 1,...,L-1
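The segmentation above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `split_into_segments` and its arguments are hypothetical and are not part of the library, and the number of segments K is deduced from the series length rather than imposed.

```python
def split_into_segments(values, segment_length, overlap):
    """Return the overlapping segments X_1, ..., X_K of a sequence.

    Hypothetical helper: segment k (1-based) starts at index (k - 1) * (L - R).
    """
    if not 0 <= overlap < segment_length:
        raise ValueError("overlap must satisfy 0 <= overlap < segment_length")
    step = segment_length - overlap  # shift L - R between consecutive segments
    last_start = len(values) - segment_length
    # Keep only the segments that fit entirely inside the series
    return [values[start:start + segment_length]
            for start in range(0, last_start + 1, step)]


# N = 10 values, L = 4, R = 2: the segments start at indices 0, 2, 4 and 6
segments = split_into_segments(list(range(10)), segment_length=4, overlap=2)
```

With these values the sketch yields K = 4 segments, each sharing its first two points with the end of the previous one.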

The objective is to get a statistical estimator from these K segments. We define the periodogram associated with the segment \vect{X}_k by:

\begin{aligned}
  \vect{X}_{k}(f_p,T) &= \Delta t\sum_{n=0}^{L-1}\vect{x}(n\Delta t)\exp\left[\frac{-j2\pi pn}{N}\right], \quad p=0,\dots, L-1\\
  \hat{G}_{\vect{x}}(f_p,T) &= \frac{2}{T}\vect{X}_{k}(f_p,T)\,{\vect{X}_{k}(f_p,T)^*}^t, \quad p=0,\dots,L/2-1
\end{aligned}

with \Delta t=\frac{T}{N} and f_p=\frac{p}{T}=\frac{p}{N}\frac{1}{\Delta t}.
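As an illustration, the periodogram of a single scalar segment (d=1) may be computed with a direct discrete Fourier transform. The sketch below is not the library's implementation: the function name `periodogram` is hypothetical, and it assumes that the segment is the whole observation, so that the segment length plays the role of both L and N and T is the segment length times \Delta t.

```python
import cmath
import math


def periodogram(segment, dt):
    """One-sided periodogram of a scalar segment sampled with time step dt."""
    n_points = len(segment)
    duration = n_points * dt  # assumption: T = N * dt with N = len(segment)
    estimate = []
    for p in range(n_points // 2):
        # Discrete Fourier transform evaluated at the frequency f_p = p / T
        x_p = dt * sum(x * cmath.exp(-2j * cmath.pi * p * n / n_points)
                       for n, x in enumerate(segment))
        # Scalar version of (2 / T) * X X^*: twice the squared modulus over T
        estimate.append(2.0 / duration * abs(x_p) ** 2)
    return estimate


# A pure cosine with exactly 2 periods over 8 samples (dt = 1)
segment = [math.cos(2.0 * math.pi * 2 * n / 8) for n in range(8)]
estimate = periodogram(segment, dt=1.0)
```

For this input all the energy falls on the frequency f_2 = 2/8: the estimate is 4 at p=2 and zero, up to rounding, at the other frequencies.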

It has been proven that the periodogram has poor statistical properties. Two quantities summarize the properties of an estimator: its bias and its covariance. The bias is the expected error made on the average when using only a finite number of time series of finite length, whereas the covariance measures the expected fluctuations of the estimator around its mean value. For the periodogram, we have:
  • Bias=\mathbb{E}[\hat{G}_{\vect{x}}(f_p, T)-G_{\vect{X}}(f_p)]=(\frac{1}{T}W_B(f_p, T)-\delta_0)*G_{\vect{X}}(f_p) where W_B(f_p, T) = \left(\frac{\sin\pi f_pT}{\pi f_pT}\right)^2 is the squared modulus of the Fourier transform of the function w_B(t, T) (Bartlett window) defined by:

    w_B(t, T) = \mathbf{1}_{[0,T]}(t)

    This estimator is biased but this bias vanishes when T\rightarrow\infty as \lim_{T\rightarrow\infty} \frac{1}{T}W_B(f_p, T)=\delta_0.

  • Covariance=\frac{1}{T}W_B(f_p, T)*G_{\vect{X}}(f_p)\rightarrow G_{\vect{X}}(f_p) as T\rightarrow\infty, which means that the fluctuations of the periodogram are of the same order of magnitude as the quantity to be estimated and thus the estimator is not convergent.

The periodogram's lack of convergence can be fixed by averaging the periodograms of K independent time series or segments:

\hat{G}_{\vect{x}}(f_p,T)=\frac{2}{KT}\sum_{k=1}^{K}\vect{X}_{k}(f_p,T)\,{\vect{X}_{k}(f_p,T)^*}^t

The averaging process has no effect on the significant bias of the periodogram.

The use of a tapering window w(t, T) may significantly reduce it. The time series \vect{x}(t) is replaced by a tapered time series w(t, T)\vect{x}(t) in the computation of \vect{X}(f_p,T). One gets:

\mathbb{E}[\hat{G}_{\vect{x}}(f_p, T)-G_{\vect{X}}(f_p)]=(\frac{1}{T}W(f_p, T)-\delta_0)*G_{\vect{X}}(f_p)

where W(f_p, T) is the squared modulus of the Fourier transform of w(t, T) at the frequency f_p. A judicious choice of tapering function, such as the Hanning window w_H, can dramatically reduce the bias of the estimate:

w_H(t, T) = \sqrt{\frac{8}{3}}\left(1-\cos^2\left(\frac{\pi t}{T}\right)\right)\mathbf{1}_{[0,T]}(t) \qquad (1)
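The factor \sqrt{8/3} in (1) can be read as a normalization: it gives the window a unit mean square over [0,T], so that tapering does not change the overall power level of the series. The short check below illustrates this numerically; the function name `hanning_window` is a hypothetical helper, not a library call.

```python
import math


def hanning_window(t, duration):
    """Normalized Hanning window w_H(t, T) as written in equation (1)."""
    if not 0.0 <= t <= duration:
        return 0.0  # indicator function of [0, T]
    return math.sqrt(8.0 / 3.0) * (1.0 - math.cos(math.pi * t / duration) ** 2)


# Midpoint-rule approximation of (1 / T) * integral of w_H(t, T)^2 over [0, T]
duration, n_steps = 2.0, 1000
mean_square = sum(hanning_window((i + 0.5) * duration / n_steps, duration) ** 2
                  for i in range(n_steps)) / n_steps
```

The computed mean square is equal to 1 up to rounding errors, while the window itself vanishes at t=0 and reaches its maximum \sqrt{8/3} at t=T/2.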

The library builds an estimate of the spectrum from a TimeSeries, given the number of segments, the overlap size and a FilteringWindows. The available windows are:

  • The Hamming window

    w(t, T) = \sqrt{\frac{1}{K}}\left(0.54-0.46\cos\left(\frac{2 \pi t}{T}\right)\right)\mathbf{1}_{[0,T]}(t)

    with K = 0.54^2 + \frac{1}{2} 0.46^2

  • The Hanning window described in (1), which is generally regarded as the most useful.
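To show how the segmentation, the tapering and the averaging fit together, here is a minimal pure-Python sketch of a Welch estimate for a scalar time series. It is not the library's implementation. The function name `welch_estimate`, its arguments and the rule used to deduce the segment length L from the number of segments K and the overlap R (N = L + (K-1)(L-R)) are assumptions made for this illustration, and the Hanning window (1) is hard-coded.

```python
import cmath
import math


def welch_estimate(values, dt, n_segments, overlap):
    """Averaged, Hanning-tapered periodogram of a scalar time series."""
    # Assumed rule: N = L + (K - 1) * (L - R)  =>  L = (N + (K - 1) * R) / K
    length = (len(values) + (n_segments - 1) * overlap) // n_segments
    if not 0 <= overlap < length:
        raise ValueError("overlap must satisfy 0 <= overlap < segment length")
    step = length - overlap
    duration = length * dt  # duration T of one segment
    # Hanning window (1) sampled at t_n = n * dt; 1 - cos^2 is written as sin^2
    taper = [math.sqrt(8.0 / 3.0) * math.sin(math.pi * n / length) ** 2
             for n in range(length)]
    estimate = [0.0] * (length // 2)
    for k in range(n_segments):
        segment = values[k * step:k * step + length]
        tapered = [w * x for w, x in zip(taper, segment)]
        for p in range(length // 2):
            x_p = dt * sum(x * cmath.exp(-2j * cmath.pi * p * n / length)
                           for n, x in enumerate(tapered))
            # Accumulate the average of the K periodograms
            estimate[p] += 2.0 / duration * abs(x_p) ** 2 / n_segments
    frequencies = [p / duration for p in range(length // 2)]
    return frequencies, estimate


# Cosine of frequency 0.25 observed at 64 unit time steps, K = 7, R = 8
values = [math.cos(2.0 * math.pi * 0.25 * n) for n in range(64)]
frequencies, estimate = welch_estimate(values, dt=1.0, n_segments=7, overlap=8)
peak = max(range(len(estimate)), key=estimate.__getitem__)
```

In this example the segments have length L = 16 and the estimate peaks at the frequency 0.25. Part of the energy leaks to the two neighbouring frequencies, which is the price paid for the bias reduction brought by the taper.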