Affine combination of independent univariate random variables

Introduction

Let \vect{Y} be the random vector defined as the affine transform of n independent univariate random variables. More precisely, consider:

(1)\vect{Y}=\vect{y}_0+\mat{M}\,\vect{X}

where \vect{y}_0 \in \Rset^{\inputDim} is a deterministic vector with \inputDim \in \{1,2,3\}, \mat{M} \in \mathcal{M}_{\inputDim,n}(\Rset) is a deterministic matrix and \left(X_k\right)_{ 1 \leq k \leq n} are independent random variables. In this case, it is possible to directly evaluate the distribution of \vect{Y} and then to ask \vect{Y} any request compatible with a distribution: moments, probability and cumulative density functions, quantiles (in dimension 1 only)… In this document, we present a method using the Poisson summation formula to evaluate the distribution of \vect{Y}.

Evaluation of the probability density function

Since, by hypothesis, the univariate random variables X_i are independent, the characteristic function of \vect{Y}, denoted \phi_{\vect{Y}}, is easily defined from the characteristic function of X_k denoted \phi_{X_k} as follows:

(2)\phi_{\vect{Y}}(u_1,\hdots,u_{\inputDim})
 = \prod_{j=1}^{\inputDim} e^{iu_j{y_0}_j} \prod_{k=1}^n\phi_{X_k}\left(\left(M^t u\right)_k\right),

for any \vect{u} \in \Rset^{\inputDim}. Once \phi_{\vect{Y}} is evaluated, it is possible to evaluate the probability density function of \vect{Y}, denoted p_{\vect{Y}} : several techniques are possible, as the inversion of the Fourier transformation, but this method is not easy to implement. We can alternatively use the Poisson summation formula:

(3)& \sum_{j_1 \in \Zset}\hdots\sum_{j_{\inputDim} \in \Zset} p_{\vect{Y}}\left(y_1+\frac{2\pi j_1}{h_1},\hdots,y_{\inputDim}+\frac{2\pi j_{\inputDim}}{h_{\inputDim}}\right) \\
 & = \left(\prod_{j=1}^{\inputDim} \frac{h_j}{2 \pi} \right) \sum_{k_1 \in \Zset}\hdots\sum_{k_{\inputDim} \in \Zset}\phi\left(k_1h_1,\hdots,k_{\inputDim}h_{\inputDim}\right)e^{-\imath \left(\sum_{m=1}^{\inputDim}k_m h_m y_m\right)}

where h_1, \hdots, h_{\inputDim} \in \Rset and \imath is the complex imaginary number, i.e. \imath^2 = -1. If h_1,\hdots,h_{\inputDim} are close to zero, then:

\frac{2k\pi}{h_j} \approx + \in fty

and:

p_{\vect{Y}}\left(\hdots,\frac{2k\pi}{h_j},\hdots\right) \approx 0

because of the decreasing properties of p_{\vect{Y}}. Thus the nested sums of the left term of (3) are reduced to the central term j_1=\hdots=j_{\inputDim} = 0: the left term is approximately equal to p_{\vect{Y}}(y). Furthermore, the right term of (3) is a series which converges very fast: few terms of the series are enough to get machine-precision accuracy. Let us note that the factors \phi_{\vect{Y}}(k_1 h_1,\hdots,k_{\inputDim},h_{\inputDim}), which are expensive to evaluate, do not depend on y and are evaluated once only.

It is also possible to greatly improve the performance of the algorithm by noticing that the equation is linear between p_{\vect{Y}} and \phi_{\vect{Y}}. We denote by q_Y and \psi_Y respectively the density and the characteristic function of the multivariate normal distribution with the same mean \vect{\mu} and same covariance matrix \vect{C} as the affine combination. By applying this multivariate normal distribution to the equation, we obtain by subtraction:

(4)p_{\vect{Y}}\left(y\right)
 & = \sum_{j \in \Zset^{\inputDim}} q_Y\left(y_1+\frac{2\pi j_1}{h_1},\cdots,y_{\inputDim}+\frac{2\pi j_{\inputDim}}{h_{\inputDim}}\right) \\
 & \quad + \frac{H}{2^{\inputDim}\pi^{\inputDim}}\sum_{|k_1|\leq N}\cdots\sum_{|k_{\inputDim}|\leq N} \delta_Y\left(k_1h_1,\cdots,k_{\inputDim}h_{\inputDim}\right)e^{-\imath \left(\sum_{m=1}^{d}k_m h_m y_m\right)}

where H = h_1\times\cdots\times h_{\inputDim}, j=(j_1,\cdots,j_{\inputDim}) and \delta_Y:=\phi_{\vect{Y}} - \psi_Y. In the case where n \gg 1, using the limit central theorem, the law of \vect{Y} tends to the normal distribution density q, which will drastically reduce N. The sum on q will become the most CPU-intensive part, because in the general case we will have to keep more terms than the central one in this sum, since the parameters h_1, \dots  h_{\inputDim} were calibrated with respect to p and not q.

The parameters h_1, \dots  h_{\inputDim} are calibrated using the following formula:

h_\ell = \frac{2\pi}{(\beta+4\alpha)\sigma_\ell}

where \sigma_\ell=\sqrt{\Cov{\vect{Y}}_{\ell,\ell}} and \alpha, \beta are respectively the number of standard deviations covered by the marginal distribution (\alpha=5 by default) and \beta the number of marginal deviations beyond which the density is negligible (\beta=8.5 by default). The parameter N is dynamically calibrated: we start with N=8 then we double N value until the total contribution of the additional terms is negligible.

Evaluation of the moments

The relation (1) enables to evaluate all the moments of the affine combination, if mathematically defined. For example, we have:

\Expect{\vect{Y}} & = \vect{y_0} + \mat{M}\Expect{\vect{X}} \\
\Cov{\vect{Y}} & = \mat{M}\,\Cov{\vect{X}}\mat{M}^t

Computation on a regular grid

We want to compute the density function on a regular grid and to get an approximation quickly. The regular grid is:

\:y_{r,m}=\mu_r+b\left(\frac{2m+1}{M} - 1\right)\sigma_r

for all r \in \{1,\hdots,\inputDim\} and m \in \{0,\hdots,M-1\}. Denoting p_{m_1,\hdots,m_{\inputDim}}=p_{\vect{Y}}(y_{1,m_1},\hdots,y_{d,m_{\inputDim}}):

p_{m_1,\hdots,m_{\inputDim}}= Q_{m_1,\hdots,m_{\inputDim}}+S_{m_1,\hdots,m_{\inputDim}}

for which the term S_{m_1,\hdots,m_{\inputDim}} is the most CPU consuming. This term rewrites:

S_{m_1,\hdots,m_{\inputDim}}=&\frac{H}{2^{\inputDim}\pi^{\inputDim}}\sum_{k_1=-N}^{N}\hdots\sum_{k_{\inputDim}=-N}^{N}\delta\left(k_1h_1,\hdots,k_{\inputDim}h_{\inputDim}\right)
E_{m_1,\hdots,m_{\inputDim}}(k_1,\hdots,k_{\inputDim})

with:

\delta\left(k_1h_1,\hdots,k_{\inputDim}h_{\inputDim}\right) & = (\phi-\psi)\left(k_1h_1,\hdots,k_{\inputDim}h_{\inputDim}\right)\\
E_{m_1,\hdots,m_{\inputDim}}(k_1,\hdots,k_{\inputDim}) & = e^{-i\sum_{j=1}^{\inputDim} k_jh_j\left(\mu_j+b\left(\frac{2m_j+1}{M}-1\right)\sigma_j\right)}

The aim is to rewrite the previous expression as a d- discrete Fourier transform, in order to apply Fast Fourier Transform (FFT) for its evaluation. We set M=N and \forall j  \in \{1,\hdots,d\},\: h_j=\frac{\pi}{b\sigma_j} and \tau_j=\frac{\mu_j}{b\sigma_j}. For convenience, we introduce the functions:

f_j(k) = e^{-i\pi (k+1)\left(\tau_j-1+\frac{1}{N}\right)}

We use k+1 instead of k in this function to simplify expressions below. We obtain:

& E_{m_1,\hdots,m_{\inputDim}}(k_1,\hdots,k_{\inputDim}) \\
& = e^{-i\sum_{j=1}^{d} k_jh_jb\sigma_j\left(\frac{\mu_j}{b\sigma_j}+\frac{2m_j}{N}+\frac{1}{N}-1\right)}\\
& = e^{-2i\pi\left(\frac{\sum_{j=1}^{d}k_j m_j}{N}\right)}e^{-i\pi\sum_{j=1}^{d} k_j\left(\tau_j-1+\frac{1}{N}\right)} \\
& = e^{-2i\pi\left(\frac{\sum_{j=1}^{d}k_j m_j}{N}\right)} f_1(k_1-1) \times \hdots \times f_{\inputDim}(k_{\inputDim}-1)

For performance reasons, we want to use the discrete Fourier transform with the following convention in dimension 1:

A_m = \sum_{k=0}^{N-1} a_k e^{-2i\pi\frac{km}{N}}

which extension to dimensions 2 and 3 are respectively:

A_{m,n} & = \sum_{k=0}^{N-1}\sum_{l=0}^{N-1} a_{k,l} e^{-2i\pi\frac{km}{N}} e^{-2i\pi\frac{ln}{N}}\\
A_{m,n,p} & = \sum_{k=0}^{N-1}\sum_{l=0}^{N-1}\sum_{s=0}^{N-1} a_{k,l,s} e^{-2i\pi\frac{km}{N}} e^{-2i\pi\frac{ln}{N}} e^{-2i\pi\frac{sp}{N}}

We decompose sums of on the interval [-N,N] into three parts:

(5)& \sum_{k_j=-N}^{N}\delta\left(k_1h_1,\hdots,k_{\inputDim}h_{\inputDim}\right) E_{m_1,\hdots,m_{\inputDim}}(k_1,\hdots,k_{\inputDim}) \\
    & = \sum_{k_j=-N}^{-1} \delta\left(k_1h_1,\hdots,k_{\inputDim}h_{\inputDim}\right) E_{m_1,\hdots,m_{\inputDim}}(k_1,\hdots,k_{\inputDim}) \\
    & \quad + \delta\left(k_1h_1,\hdots,0,\hdots,k_{\inputDim}h_{\inputDim}\right) E_{m_1,\hdots,0,\hdots,m_{\inputDim}}(k_1,\hdots,0,\hdots,k_{\inputDim}) \\
    & \quad+ \sum_{k_j=1}^{N}\delta\left(k_1h_1,\hdots,k_{\inputDim}h_{\inputDim}\right) E_{m_1,\hdots,m_{\inputDim}}(k_1,\hdots,k_{\inputDim})

If we compute E for dimension d-1, then the middle term in this sum is trivial.

To compute the last sum, we apply a change of variable k_j' = k_j-1:

& \sum_{k_j=1}^{N}\delta\left(k_1h_1,\hdots,k_{\inputDim}h_{\inputDim}\right) E_{m_1,\hdots,m_{\inputDim}}(k_1,\hdots,k_{\inputDim}) \\
& = \sum_{k_j=0}^{N-1}\delta\left(k_1h_1,\hdots,(k_j+1)h_j,\hdots,k_{\inputDim}h_{\inputDim}\right) \times\\
& \quad E_{m_1,\hdots,m_{\inputDim}}(k_1,\hdots,k_j+1,\hdots,k_{\inputDim})

This implies:

& E_{m_1,\hdots,m_{\inputDim}}(k_1, \hdots, k_j+1, \hdots, k_{\inputDim}) \\
&= e^{-2i\pi\left(\frac{\sum_{l = 1}^{d}k_l m_l}{N} +\frac{m_j}{N}\right)}
    f_1(k_1 - 1)\times \hdots \times f_j(k_j) \times \hdots \times f_{\inputDim}(k_{\inputDim} - 1) \\
&= e^{-2i\pi\left(\frac{m_j}{N}\right)}
    e^{-2i\pi\left(\frac{\sum_{l = 1}^{d}k_l m_l}{N}\right)}
    f_1(k_1 - 1)\times \hdots \times f_j(k_j) \times \hdots \times f_{\inputDim}(k_{\inputDim} - 1)

Thus:

& \sum_{k_j=1}^{N}\delta\left(k_1h_1,\hdots,k_{\inputDim}h_{\inputDim}\right) E_{m_1,\hdots,m_{\inputDim}}(k_1,\hdots,k_{\inputDim}) \\
& = e^{-2i\pi\left(\frac{m_j}{N}\right)} \sum_{k_j=0}^{N-1}\delta\left(k_1h_1,\hdots,(k_j+1)h_j,\hdots,k_{\inputDim}h_{\inputDim}\right) \times \\
& \quad e^{-2i\pi\left(\frac{\sum_{l=1}^{d}k_l m_l}{N}\right)}
    f_1(k_1-1)\times \hdots \times f_j(k_j)\times \hdots \times f_{\inputDim}(k_{\inputDim}-1)

To compute the first sum of equation, we apply a change of variable k_j'=N+k_j:

& \sum_{k_j=-N}^{-1}\delta\left(k_1h_1,\hdots,k_{\inputDim}h_{\inputDim}\right) E_{m_1,\hdots,m_{\inputDim}}(k_1,\hdots,k_{\inputDim}) \\
&=  \sum_{k_j=0}^{N-1}\delta\left(k_1h_1,\hdots,(k_j-N)h_j,\hdots,k_{\inputDim}h_{\inputDim}\right) \times \\
  & \quad  E_{m_1,\hdots,m_{\inputDim}}(k_1,\hdots,k_j-N,\hdots,k_{\inputDim})

This implies:

& E_{m_1,\hdots,m_{\inputDim}}(k_1,\hdots,k_j-N,\hdots,k_{\inputDim}) \\
&= e^{-2i\pi\left(\frac{\sum_{l=1}^{d}k_l m_l}{N} -m_j\right)}
    f_1(k_1-1)\times \hdots \times f_j(k_j-1-N)\times \hdots \times f_{\inputDim}(k_{\inputDim}-1) \\
& = e^{-2i\pi\left(\frac{\sum_{l=1}^{d}k_l m_l}{N}\right)}
    f_1(k_1-1)\times \hdots \times \overline{f}_j(N-1-k_j)\times \hdots \times f_{\inputDim}(k_{\inputDim}-1)

Thus:

& \sum_{k_j=-N}^{-1}\delta\left(k_1h_1,\hdots,k_{\inputDim}h_{\inputDim}\right) E_{m_1,\hdots,m_{\inputDim}} (k_1,\hdots,k_{\inputDim}) \\
& = \sum_{k_j=0}^{N-1}\delta\left(k_1h_1,\hdots,(k_j-N)h_j,\hdots,k_{\inputDim}h_{\inputDim}\right) \times \\
& \quad e^{-2i\pi\left(\frac{\sum_{l=1}^{d}k_l m_l}{N}\right)}
    f_1(k_1-1)\times \hdots \times \overline{f}_j(N-1-k_j)\times \hdots \times f_{\inputDim}(k_{\inputDim}-1)

To summarize:

  1. In order to compute sum from k_1=1 to N, we multiply by e^{-2i\pi\left(\frac{m_1}{N}\right)} and consider \delta((k_1+1)h,\hdots)f_1(k_1)

  2. In order to compute sum from k_1=-N to -1, we consider \delta((k_1-N)h,\hdots)\overline{f}_1(N-1-k_1)