Wishart distribution

From Wikipedia, the free encyclopedia

Wishart
Parameters  n > 0\! degrees of freedom (real)
            \mathbf{V} > 0\, scale matrix (p × p, positive definite)
Support     \mathbf{W}\! is positive definite
Probability density function (pdf) \frac{\left|\mathbf{W}\right|^\frac{n-p-1}{2}}{2^\frac{np}{2}\left|{\mathbf V}\right|^\frac{n}{2}\Gamma_p(\frac{n}{2})} \exp\left(-\frac{1}{2}{\rm Tr}({\mathbf V}^{-1}\mathbf{W})\right)
Mean        n \mathbf{V}
Mode        (n-p-1)\mathbf{V}\text{ for }n \geq p+1
Variance    \operatorname{Var}(w_{ij}) = n(v_{ij}^2+v_{ii}v_{jj})
Characteristic function \Theta \mapsto \left|{\mathbf I} - 2i\,{\mathbf\Theta}{\mathbf V}\right|^{-n/2}

In statistics, the Wishart distribution, named in honor of John Wishart, is a generalization of the gamma distribution to multiple dimensions. It is any of a family of probability distributions for nonnegative-definite matrix-valued random variables ("random matrices"). These distributions are of great importance in the estimation of covariance matrices in multivariate statistics.


Definition

Suppose X is an n × p matrix, each row of which is independently drawn from a p-variate normal distribution with zero mean:

X_{(i)} = (x_i^1,\dots,x_i^p)^T \sim N_p(0,\mathbf{V}).

Then the Wishart distribution is the probability distribution of the p×p random matrix

S=X^T X = \sum_{i = 1}^{n} X_{(i)} X_{(i)}^T, \,\!

known as the scatter matrix. One indicates that S has that probability distribution by writing

S\sim W_p(V,n).

The positive integer n is the number of degrees of freedom. Sometimes this is written W(V, p, n).

If p = 1 and V = 1 then this distribution is a chi-square distribution with n degrees of freedom.
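This reduction to the chi-square can be checked by simulation; a minimal sketch with NumPy (the sample count, degrees of freedom, and seed are illustrative choices, not from the source):

```python
import numpy as np

rng = np.random.default_rng(0)

def wishart_scatter(V, n, rng):
    """Draw S ~ W_p(V, n) as the scatter matrix X^T X of n i.i.d. N_p(0, V) rows."""
    p = V.shape[0]
    X = rng.multivariate_normal(np.zeros(p), V, size=n)  # n x p, rows ~ N_p(0, V)
    return X.T @ X

# p = 1, V = 1: the Wishart reduces to a chi-square with n degrees of freedom.
n = 5
draws = np.array([wishart_scatter(np.eye(1), n, rng)[0, 0] for _ in range(20000)])
print(draws.mean())  # should be close to n = 5   (chi-square mean)
print(draws.var())   # should be close to 2n = 10 (chi-square variance)
```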

Occurrence

The Wishart distribution arises frequently in likelihood-ratio tests in multivariate statistical analysis. It also arises in the spectral theory of random matrices.

Probability density function

The Wishart distribution can be characterized by its probability density function, as follows.

Let W be a p × p symmetric matrix of random variables that is positive definite. Let V be a (fixed) positive definite matrix of size p × p.

Then, if n ≥ p, the matrix W has a Wishart distribution with n degrees of freedom if it has a probability density function fW given by


f_{\mathbf W}(w)=
\frac{
  \left|w\right|^{(n-p-1)/2}
  \exp\left[ - {\rm trace}({\mathbf V}^{-1}w/2 )\right] 
}{
2^{np/2}\left|{\mathbf V}\right|^{n/2}\Gamma_p(n/2)
}

where Γp(·) is the multivariate gamma function defined as


\Gamma_p(n/2)=
\pi^{p(p-1)/4}\prod_{j=1}^p
\Gamma\left[ (n+1-j)/2\right].

In fact the above definition can be extended to any real n > p − 1.
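The multivariate gamma function is straightforward to implement in log space from this product formula; a sketch in Python (a minimal implementation, checked against two closed-form cases):

```python
import math

def log_multigamma(a, p):
    """log Gamma_p(a) = (p(p-1)/4) log(pi) + sum_{j=1}^{p} log Gamma(a + (1 - j)/2)."""
    result = p * (p - 1) / 4.0 * math.log(math.pi)
    for j in range(1, p + 1):
        result += math.lgamma(a + (1.0 - j) / 2.0)
    return result

# p = 1 reduces to the ordinary log-gamma function:
print(log_multigamma(2.5, 1), math.lgamma(2.5))

# Closed form for p = 2, a = 3/2: Gamma_2(3/2) = sqrt(pi) * Gamma(3/2) * Gamma(1) = pi/2
print(log_multigamma(1.5, 2), math.log(math.pi / 2))
```

Working with the logarithm avoids overflow, since Γ_p(n/2) grows very quickly in both n and p.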

Characteristic function

The characteristic function of the Wishart distribution is


\Theta \mapsto \left|{\mathbf I} - 2i\,{\mathbf\Theta}{\mathbf V}\right|^{-n/2}.

In other words,

\Theta \mapsto {\mathcal E}\left\{\mathrm{exp}\left[i\cdot\mathrm{trace}({\mathbf W}{\mathbf\Theta})\right]\right\}
=
\left|{\mathbf I} - 2i{\mathbf\Theta}{\mathbf V}\right|^{-n/2}

where {\mathcal E}(\cdot) denotes expectation.

(Here Θ and {\mathbf I} are matrices of the same size as {\mathbf V}, with {\mathbf I} the identity matrix, and i is the imaginary unit.)
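The closed form can be compared against a Monte Carlo estimate of E[exp(i·tr(WΘ))] from Wishart draws; a sketch with NumPy (the particular V, Θ, n, and sample count are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 4, 2
V = np.array([[1.0, 0.5], [0.5, 1.0]])
Theta = 0.05 * np.eye(p)  # a small symmetric argument

# Closed form: |I - 2i Theta V|^{-n/2}
closed = np.linalg.det(np.eye(p) - 2j * Theta @ V) ** (-n / 2)

# Monte Carlo: average exp(i tr(W Theta)) over Wishart draws W = X^T X
N = 100_000
X = rng.multivariate_normal(np.zeros(p), V, size=(N, n))  # N draws of an n x p matrix
W = np.einsum('kni,knj->kij', X, X)                       # N scatter matrices X^T X
mc = np.exp(1j * np.trace(W @ Theta, axis1=1, axis2=2)).mean()
print(closed, mc)  # the two agree up to Monte Carlo error
```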

Theorem

If \scriptstyle {\mathbf W} has a Wishart distribution with m degrees of freedom and variance matrix \scriptstyle {\mathbf V}—write \scriptstyle {\mathbf W}\sim{\mathbf W}_p({\mathbf V},m)—and \scriptstyle{\mathbf C} is a q × p matrix of rank q, then


{\mathbf C}{\mathbf W}{\mathbf C'}
\sim
{\mathbf W}_q\left({\mathbf C}{\mathbf V}{\mathbf C'},m\right).

Corollary 1

If {\mathbf z} is a nonzero p\times 1 constant vector, then {\mathbf z'}{\mathbf W}{\mathbf z}\sim\sigma_z^2\chi_m^2.

In this case, \chi_m^2 is the chi-square distribution and \sigma_z^2={\mathbf z'}{\mathbf V}{\mathbf z} (note that \sigma_z^2 is a constant; it is positive because {\mathbf V} is positive definite).
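Corollary 1 is easy to verify numerically: z'Wz/σ_z² should behave like a χ²_m variate. A sketch (the particular z, V, and m are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
m, p = 6, 3
V = np.array([[2.0, 0.3, 0.0],
              [0.3, 1.0, 0.2],
              [0.0, 0.2, 1.5]])
z = np.array([1.0, -2.0, 0.5])
sigma2 = z @ V @ z  # z'Vz, positive since V is positive definite

N = 20_000
X = rng.multivariate_normal(np.zeros(p), V, size=(N, m))  # data for N Wishart draws
W = np.einsum('kni,knj->kij', X, X)                       # N scatter matrices
q = np.einsum('i,kij,j->k', z, W, z) / sigma2             # z'Wz / sigma_z^2 per draw

print(q.mean())  # should be close to m    (chi-square mean)
print(q.var())   # should be close to 2m   (chi-square variance)
```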

Corollary 2

Consider the case where {\mathbf z'}=(0,\ldots,0,1,0,\ldots,0) (that is, the j-th element is one and all others zero). Then corollary 1 above shows that


w_{jj}\sim\sigma_{jj}\chi^2_m

gives the marginal distribution of each of the elements on the matrix's diagonal.

Noted statistician George Seber points out that the Wishart distribution is not called the "multivariate chi-square distribution" because the marginal distribution of the off-diagonal elements is not chi-square. Seber prefers to reserve the term multivariate for the case when all univariate marginals belong to the same family.

Estimator of the multivariate normal distribution

The Wishart distribution is the probability distribution of the maximum-likelihood estimator (MLE) of the covariance matrix of a multivariate normal distribution. The derivation of the MLE is perhaps surprisingly subtle and elegant: it uses the spectral theorem, and it illustrates why it can be better to view a scalar as the trace of a 1 × 1 matrix than as a mere scalar. See estimation of covariance matrices.
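A simulation sketch of this connection (sample sizes and V are illustrative; for data with an unknown mean, the scatter matrix about the sample mean, which is n times the MLE of the covariance, is Wishart with n − 1 degrees of freedom):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 50, 2
V = np.array([[1.0, 0.6], [0.6, 1.5]])

# MLE of the covariance: average outer product of centered observations
X = rng.multivariate_normal(np.zeros(p), V, size=n)
Xc = X - X.mean(axis=0)
Sigma_mle = Xc.T @ Xc / n
print(Sigma_mle)  # a single noisy estimate of V

# n * Sigma_mle is the scatter about the sample mean; with Gaussian data it
# follows W_p(V, n - 1), so its expectation is (n - 1) * V:
scatters = []
for _ in range(5000):
    Y = rng.multivariate_normal(np.zeros(p), V, size=n)
    Yc = Y - Y.mean(axis=0)
    scatters.append(Yc.T @ Yc)
print(np.mean(scatters, axis=0))  # should be close to (n - 1) * V
```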

Drawing values from the distribution

The following procedure is due to Smith & Hocking [1]. One can sample random p × p matrices from a p-variate Wishart distribution with scale matrix {\textbf V} and n degrees of freedom (for n \geq p) as follows:

  1. Generate a random p × p lower triangular matrix {\textbf A} such that:
     - a_{ii} = \sqrt{c_i}, where c_i \sim \chi^2_{n-i+1} (the i-th diagonal element is the square root of a chi-square variate with n − i + 1 degrees of freedom);
     - a_{ij} \sim N(0,1) independently for i > j (standard normal entries below the diagonal).
  2. Compute the Cholesky decomposition of {\textbf V} = {\textbf L}{\textbf L}^T.
  3. Compute the matrix {\textbf X} = {\textbf L}{\textbf A}{\textbf A}^T{\textbf L}^T. At this point, {\textbf X} is a sample from the Wishart distribution W_p({\textbf V},n).

Note that if {\textbf V}={\textbf I}, the identity matrix, then the sample can be obtained directly as {\textbf X} = {\textbf A}{\textbf A}^T, since the Cholesky factor of the identity matrix is {\textbf L} = {\textbf I}.
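The three steps above can be sketched directly in NumPy (this assumes the standard Bartlett construction for step 1: chi-distributed diagonal, standard-normal subdiagonal):

```python
import numpy as np

def sample_wishart(V, n, rng):
    """Draw one sample from W_p(V, n) via the Bartlett decomposition."""
    p = V.shape[0]
    # Step 1: lower triangular A with a_ii = sqrt(chi2_{n-i+1}), a_ij ~ N(0,1) for i > j
    A = np.zeros((p, p))
    for i in range(p):
        A[i, i] = np.sqrt(rng.chisquare(n - i))  # 0-based index: df = n - (i+1) + 1
        A[i, :i] = rng.standard_normal(i)
    # Step 2: Cholesky factor of the scale matrix, V = L L^T
    L = np.linalg.cholesky(V)
    # Step 3: X = L A A^T L^T is Wishart-distributed
    return L @ A @ A.T @ L.T

rng = np.random.default_rng(3)
V = np.array([[1.0, 0.4], [0.4, 2.0]])
n = 7
samples = np.stack([sample_wishart(V, n, rng) for _ in range(20000)])
print(samples.mean(axis=0))  # should be close to n * V, the Wishart mean
```

This costs only one p × p Cholesky factorization plus O(p²) scalar draws per sample, rather than generating an n × p data matrix for each draw.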
