Quadratic form (statistics)

In multivariate statistics, if \epsilon is a vector of n random variables and \Lambda is an n \times n symmetric matrix, then the scalar quantity \epsilon^T\Lambda\epsilon is known as a quadratic form in \epsilon.

Expectation

It can be shown that[1]

\operatorname{E}\left[\epsilon^T\Lambda\epsilon\right]=\operatorname{tr}\left[\Lambda \Sigma\right] + \mu^T\Lambda\mu

where \mu and \Sigma are the expected value and variance-covariance matrix of \epsilon, respectively, and tr denotes the trace of a matrix. This result only depends on the existence of \mu and \Sigma; in particular, normality of \epsilon is not required.
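The claim that normality is not needed can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the three-point discrete distribution, the matrix `Lam`, and the helper name `quadratic_form_mean` are hypothetical choices made only for illustration. It computes the expectation once by direct enumeration and once from the formula.

```python
import numpy as np

def quadratic_form_mean(Lam, mu, Sigma):
    """Return tr(Lam Sigma) + mu^T Lam mu, the expectation of eps^T Lam eps."""
    return np.trace(Lam @ Sigma) + mu @ Lam @ mu

# Hypothetical, clearly non-normal distribution: three points in R^2.
points = np.array([[0.0, 0.0], [2.0, 1.0], [-1.0, 3.0]])
probs = np.array([0.5, 0.3, 0.2])

# Exact mean vector and variance-covariance matrix of this distribution.
mu = probs @ points
centered = points - mu
Sigma = (centered * probs[:, None]).T @ centered

# Hypothetical symmetric matrix defining the quadratic form.
Lam = np.array([[2.0, 1.0], [1.0, 3.0]])

# Expectation by direct enumeration over the support.
direct = sum(p * (x @ Lam @ x) for p, x in zip(probs, points))

# Expectation from the trace formula.
formula = quadratic_form_mean(Lam, mu, Sigma)

print(direct, formula)
```

The two printed values should agree up to floating-point rounding, even though the distribution is discrete.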

A book-length treatment of quadratic forms in random variables is given by Mathai and Provost.[2]

Proof

Since the quadratic form is a scalar quantity, it equals its own trace: \epsilon^T\Lambda\epsilon = \operatorname{tr}(\epsilon^T\Lambda\epsilon). Moreover, both \operatorname{E} and \operatorname{tr} are linear operators, so they commute:  \operatorname{E} \circ \operatorname{tr} = \operatorname{tr} \circ \operatorname{E} . It follows that

 \operatorname{E}\left[\epsilon^T\Lambda\epsilon\right] = \operatorname{E}[\operatorname{tr}(\epsilon^T\Lambda\epsilon)],

and that, by the cyclic property of the trace operator,

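 \operatorname{E}[\operatorname{tr}(\epsilon^T\Lambda\epsilon)] = \operatorname{E}[\operatorname{tr}(\Lambda\epsilon\epsilon^T)] = \operatorname{tr}(\Lambda\operatorname{E}[\epsilon\epsilon^T]) = \operatorname{tr} (\Lambda(\Sigma + \mu\mu^T)) = \operatorname{tr}(\Lambda\Sigma) + \mu^T\Lambda\mu,

where the second equality exchanges \operatorname{E} and \operatorname{tr} by linearity, the third uses \operatorname{E}[\epsilon\epsilon^T] = \Sigma + \mu\mu^T, and the last uses \operatorname{tr}(\Lambda\mu\mu^T) = \mu^T\Lambda\mu.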
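
As a quick sanity check on this result, consider the scalar case n = 1, where \Lambda reduces to a constant a and \epsilon has mean \mu and variance \sigma^2 (the symbols a and \sigma^2 are introduced here only for illustration). The formula then reduces to the familiar identity for a second moment:

```latex
\operatorname{E}\left[a\epsilon^2\right] = \operatorname{tr}\left[a\sigma^2\right] + \mu a \mu = a\left(\sigma^2 + \mu^2\right)
```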

Variance

In general, the variance of a quadratic form depends greatly on the distribution of \epsilon. However, if \epsilon does follow a multivariate normal distribution, the variance of the quadratic form becomes particularly tractable. Assume for the moment that \Lambda is a symmetric matrix. Then,

\operatorname{var}\left[\epsilon^T\Lambda\epsilon\right]=2\operatorname{tr}\left[\Lambda \Sigma\Lambda \Sigma\right] + 4\mu^T\Lambda\Sigma\Lambda\mu

In fact, this can be generalized to find the covariance between two quadratic forms on the same \epsilon (once again, \Lambda_1 and \Lambda_2 must both be symmetric):

\operatorname{cov}\left[\epsilon^T\Lambda_1\epsilon,\epsilon^T\Lambda_2\epsilon\right]=2\operatorname{tr}\left[\Lambda_1\Sigma\Lambda_2 \Sigma\right] + 4\mu^T\Lambda_1\Sigma\Lambda_2\mu
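
The Gaussian variance and covariance formulas can be compared against simulation. The following is a rough sketch, assuming NumPy; the helper names `quadratic_form_cov` and `quadratic_form_var`, the example values, the seed, and the sample size are all hypothetical choices for illustration.

```python
import numpy as np

def quadratic_form_cov(Lam1, Lam2, mu, Sigma):
    """Covariance of eps^T Lam1 eps and eps^T Lam2 eps for eps ~ N(mu, Sigma).

    Both Lam1 and Lam2 are assumed to be symmetric.
    """
    return (2.0 * np.trace(Lam1 @ Sigma @ Lam2 @ Sigma)
            + 4.0 * (mu @ Lam1 @ Sigma @ Lam2 @ mu))

def quadratic_form_var(Lam, mu, Sigma):
    """Variance of eps^T Lam eps, as the covariance of the form with itself."""
    return quadratic_form_cov(Lam, Lam, mu, Sigma)

# Hypothetical example values.
Lam = np.array([[2.0, 1.0], [1.0, 3.0]])
mu = np.array([1.0, -1.0])
Sigma = np.array([[1.0, 0.5], [0.5, 2.0]])

exact_var = quadratic_form_var(Lam, mu, Sigma)

# Monte Carlo estimate from Gaussian draws (fixed seed for repeatability).
rng = np.random.default_rng(0)
samples = rng.multivariate_normal(mu, Sigma, size=200_000)
q = np.einsum("ij,jk,ik->i", samples, Lam, samples)  # eps^T Lam eps per draw
mc_var = q.var()

print(exact_var, mc_var)
```

The simulated variance should be close to the closed-form value, with the gap shrinking as the number of draws grows.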

Computing the variance in the non-symmetric case

Some texts incorrectly state that the above variance or covariance results hold without requiring \Lambda to be symmetric. The case for general \Lambda can be derived by noting that

\epsilon^T\Lambda^T\epsilon=\epsilon^T\Lambda\epsilon

so

\epsilon^T\Lambda\epsilon=\epsilon^T\left(\Lambda+\Lambda^T\right)\epsilon/2

But this is a quadratic form in the symmetric matrix \tilde{\Lambda}=\left(\Lambda+\Lambda^T\right)/2, so the mean and variance expressions are the same, provided \Lambda is replaced by \tilde{\Lambda} therein.
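
The effect of skipping the symmetrization can be seen in a small example. The following is a minimal sketch, assuming NumPy and a standard normal \epsilon (zero mean, identity covariance); the matrix `A` and the helper name `symmetrize` are hypothetical. Here the quadratic form is \epsilon_1^2 + 4\epsilon_1\epsilon_2 + 2\epsilon_2^2, whose variance is 2 + 16 + 8 = 26 because the three terms are uncorrelated.

```python
import numpy as np

def symmetrize(Lam):
    """Return the symmetric part (Lam + Lam^T) / 2."""
    return 0.5 * (Lam + Lam.T)

# Hypothetical non-symmetric matrix.
A = np.array([[1.0, 4.0], [0.0, 2.0]])
A_sym = symmetrize(A)

# The quadratic form itself is unchanged by symmetrization.
x = np.array([0.3, -1.2])
same_form = np.isclose(x @ A @ x, x @ A_sym @ x)

# Variance formula with mu = 0 and Sigma = I.
Sigma = np.eye(2)
naive = 2.0 * np.trace(A @ Sigma @ A @ Sigma)            # misapplied to A directly
correct = 2.0 * np.trace(A_sym @ Sigma @ A_sym @ Sigma)  # applied to the symmetric part

print(same_form, naive, correct)
```

Applying the formula to `A` directly gives a different number from the one obtained with the symmetric part, and only the latter matches the hand calculation above.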

Examples of quadratic forms

Given a vector of observations y and an operator matrix H, the residual sum of squares can be written as a quadratic form in y:

\textrm{RSS}=y^T\left(I-H\right)^T\left(I-H\right)y.

For procedures where the matrix H is symmetric and idempotent, and the errors are Gaussian with covariance matrix \sigma^2I, \textrm{RSS}/\sigma^2 has a chi-squared distribution with k degrees of freedom and noncentrality parameter \lambda, where

k=\operatorname{tr}\left[\left(I-H\right)^T\left(I-H\right)\right]
\lambda=\mu^T\left(I-H\right)^T\left(I-H\right)\mu/\left(2\sigma^2\right)

may be found by matching the mean and variance of a noncentral chi-squared random variable to the expressions given in the first two sections. Here \mu is the expected value of y, and \lambda follows the convention in which the noncentrality parameter is half the sum of squared standardized means. If Hy estimates \mu with no bias, then \left(I-H\right)\mu=0, so the noncentrality \lambda is zero and \textrm{RSS}/\sigma^2 follows a central chi-squared distribution.
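
One common instance is ordinary least squares, where H is the hat matrix of a design matrix. The following is a minimal sketch under that assumption, using NumPy; the design matrix `X`, the coefficients `beta`, the sizes `n` and `p`, and the value of `sigma2` are hypothetical and are not taken from the text above. It checks that H is symmetric and idempotent, that the degrees of freedom equal n - p, and that the quadratic term in \mu vanishes when \mu lies in the column space of X.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 8, 3

# Hypothetical full-rank design matrix and its least-squares hat matrix.
X = rng.normal(size=(n, p))
H = X @ np.linalg.solve(X.T @ X, X.T)

# Matrix of the quadratic form RSS = y^T (I - H)^T (I - H) y.
R = np.eye(n) - H
M = R.T @ R

# Degrees of freedom: tr(M), which equals n - p for a projection of rank p.
k = np.trace(M)

# Hypothetical mean vector in the column space of X, so H mu = mu (no bias).
beta = np.array([1.0, -2.0, 0.5])
mu = X @ beta
sigma2 = 1.5

# Quadratic term mu^T M mu / sigma^2; it is zero here under any
# scaling convention for the noncentrality parameter.
quad_term = (mu @ M @ mu) / sigma2

print(k, quad_term)
```

In this setting the printed degrees of freedom should equal n - p = 5 up to rounding, and the quadratic term should be numerically zero, which corresponds to the central chi-squared case.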

References

  1. Bates, Douglas. "Quadratic Forms of Random Variables". STAT 849 lectures. Retrieved August 21, 2011.
  2. Mathai, A. M. and Provost, Serge B. (1992). Quadratic Forms in Random Variables. CRC Press. p. 424. ISBN 978-0824786915.
