Hotelling's T-squared distribution

In statistics, Hotelling's T-squared distribution is a multivariate distribution proportional to the F-distribution. It arises as the distribution of a set of statistics that are natural generalizations of the statistics underlying Student's t-distribution. In particular, it arises in multivariate statistics when testing differences between the (multivariate) means of different populations, where tests for univariate problems would use a t-test. The distribution is named for Harold Hotelling, who developed it^[1] as a generalization of Student's t-distribution.

Distribution

If the $p\times 1$ vector $\mathbf{d}$ has a multivariate Gaussian distribution with zero mean and unit covariance matrix, $N(\mathbf{0}_p,\mathbf{I}_p)$ , and $\mathbf{M}$ is a $p\times p$ random matrix, independent of $\mathbf{d}$ , with a Wishart distribution $W(\mathbf{I}_p,m)$ with unit scale matrix and m degrees of freedom, then the quadratic form $m\,\mathbf{d}'\mathbf{M}^{-1}\mathbf{d}$ has a Hotelling $T^{2}(p,m)$ distribution with dimensionality parameter p and m degrees of freedom.^[2]

If a random variable X has Hotelling's T-squared distribution, $X\sim T_{p,m}^{2}$ , then:^[1]

\frac{m-p+1}{pm} X\sim F_{p,m-p+1}

where $F_{p,m-p+1}$ is the F-distribution with parameters p and m−p+1.
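The scaling relation above is enough to compute tail probabilities and quantiles of the T-squared distribution from any F-distribution routine. The following is a minimal sketch, assuming SciPy is available; the function names `hotelling_t2_sf` and `hotelling_t2_ppf` are illustrative and not part of any library.

```python
from scipy import stats


def hotelling_t2_sf(x, p, m):
    """Upper-tail probability P(X > x) for X ~ T^2(p, m), via the F relation."""
    if m - p + 1 <= 0:
        raise ValueError("the F relation needs m - p + 1 > 0")
    # (m - p + 1) / (p m) * X follows F(p, m - p + 1).
    f_value = (m - p + 1) / (p * m) * x
    return stats.f.sf(f_value, p, m - p + 1)


def hotelling_t2_ppf(q, p, m):
    """Quantile of T^2(p, m): rescale the matching F quantile."""
    if m - p + 1 <= 0:
        raise ValueError("the F relation needs m - p + 1 > 0")
    return p * m / (m - p + 1) * stats.f.ppf(q, p, m - p + 1)


if __name__ == "__main__":
    crit = hotelling_t2_ppf(0.95, p=3, m=20)  # 5% critical value
    print(crit, hotelling_t2_sf(crit, p=3, m=20))
```

For p = 1 the scaling factor equals one, so the quantiles reduce to squared quantiles of Student's t-distribution with m degrees of freedom.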

Statistic

Hotelling's t-squared statistic is a generalization of Student's t statistic that is used in multivariate hypothesis testing.^[1] The definition follows after it is motivated using a simpler problem.

Motivation

Let $\mathcal{N}_p(\boldsymbol{\mu},{\mathbf \Sigma})$ denote a p-variate normal distribution with location ${\boldsymbol {\mu }}$ and known covariance ${\mathbf \Sigma}$ . Let

{\mathbf x}_1,\dots,{\mathbf x}_n\sim \mathcal{N}_p(\boldsymbol{\mu},{\mathbf \Sigma})

be n independent random variables, which may be represented as $p\times1$ column vectors of real numbers. Define

\overline{\mathbf x}=\frac{\mathbf{x}_1+\cdots+\mathbf{x}_n}{n}

to be the sample mean with covariance ${\mathbf {\Sigma } }_{\bar {\mathbf {x} }}={\mathbf {\Sigma } }/n$ . It can be shown that

({\bar {\mathbf {x} }}-{\boldsymbol {\mu }})'{\mathbf {\Sigma } }_{\bar {\mathbf {x} }}^{-1}({\bar {\mathbf {x} }}-{\boldsymbol {\mathbf {\mu } }})\sim \chi _{p}^{2},

where $\chi^2_p$ is the chi-squared distribution with p degrees of freedom.
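The chi-squared result can be checked empirically. The sketch below, assuming NumPy, simulates the quadratic form for many independent samples and compares its mean and variance with p and 2p, the mean and variance of a chi-squared variable with p degrees of freedom. The function name, the example mean and covariance, and the replication count are illustrative choices.

```python
import numpy as np


def simulate_quadratic_form(mu, sigma, n, reps, seed=0):
    """Simulate n (xbar - mu)' Sigma^{-1} (xbar - mu) for `reps` samples of size n."""
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # Shape (reps, n, p): `reps` independent samples of n p-variate normal vectors.
    x = rng.multivariate_normal(mu, sigma, size=(reps, n))
    diff = x.mean(axis=1) - mu  # (reps, p) deviations of the sample means
    sigma_inv = np.linalg.inv(sigma)
    # Sigma_xbar^{-1} = n Sigma^{-1}; one quadratic form per replication.
    return n * np.einsum("ij,jk,ik->i", diff, sigma_inv, diff)


if __name__ == "__main__":
    mu = [1.0, -2.0, 0.5]
    sigma = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.3], [0.0, 0.3, 1.5]]
    q = simulate_quadratic_form(mu, sigma, n=10, reps=20000)
    print(q.mean(), q.var())  # should be close to p = 3 and 2p = 6
```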

Proof

To show this, use the fact that ${\overline {\mathbf {x} }}\sim {\mathcal {N}}_{p}({\boldsymbol {\mu }},{\mathbf {\Sigma } }_{\bar {\mathbf {x} }})$ and derive the characteristic function of the random variable $\mathbf {y} =n({\bar {\mathbf {x} }}-{\boldsymbol {\mu }})'{\mathbf {\Sigma } }^{-1}({\bar {\mathbf {x} }}-{\boldsymbol {\mathbf {\mu } }})$ . This is done below:

{\begin{aligned}&\varphi _{\mathbf {y} }(\theta )=\operatorname {E} e^{i\theta \mathbf {y} },\\[5pt]={}&\operatorname {E} e^{i\theta n({\overline {\mathbf {x} }}-{\boldsymbol {\mu }})'{\mathbf {\Sigma } }^{-1}({\overline {\mathbf {x} }}-{\boldsymbol {\mathbf {\mu } }})}\\[5pt]={}&\int e^{i\theta n({\overline {\mathbf {x} }}-{\boldsymbol {\mu }})'{\mathbf {\Sigma } }^{-1}({\overline {\mathbf {x} }}-{\boldsymbol {\mathbf {\mu } }})}(2\pi )^{-p/2}|{\boldsymbol {\Sigma }}/n|^{-1/2}\,e^{-(1/2)n({\overline {\mathbf {x} }}-{\boldsymbol {\mu }})'{\boldsymbol {\Sigma }}^{-1}({\overline {\mathbf {x} }}-{\boldsymbol {\mu }})}\,dx_{1}\cdots dx_{p}\\[5pt]={}&\int (2\pi )^{-p/2}|{\boldsymbol {\Sigma }}/n|^{-1/2}\,e^{-(1/2)n({\overline {\mathbf {x} }}-{\boldsymbol {\mu }})'({\boldsymbol {\Sigma }}^{-1}-2i\theta {\boldsymbol {\Sigma }}^{-1})({\overline {\mathbf {x} }}-{\boldsymbol {\mu }})}\,dx_{1}\cdots dx_{p},\\[5pt]={}&|({\boldsymbol {\Sigma }}^{-1}-2i\theta {\boldsymbol {\Sigma }}^{-1})^{-1}/n|^{1/2}|{\boldsymbol {\Sigma }}/n|^{-1/2}\int (2\pi )^{-p/2}|({\boldsymbol {\Sigma }}^{-1}-2i\theta {\boldsymbol {\Sigma }}^{-1})^{-1}/n|^{-1/2}\,e^{-(1/2)n({\overline {\mathbf {x} }}-{\boldsymbol {\mu }})'({\boldsymbol {\Sigma }}^{-1}-2i\theta {\boldsymbol {\Sigma }}^{-1})({\overline {\mathbf {x} }}-{\boldsymbol {\mu }})}\,dx_{1}\cdots dx_{p},\end{aligned}}

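{\begin{aligned}&=|(\mathbf {I} _{p}-2i\theta \mathbf {I} _{p})|^{-1/2},\\[5pt]&=(1-2i\theta )^{-p/2}.\end{aligned}}

The remaining integral equals one because its integrand has the form of a normal density, and the final expression is the characteristic function of the chi-squared distribution with p degrees of freedom, so $\mathbf {y} \sim \chi _{p}^{2}$ . $\blacksquare$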

Definition

The covariance matrix ${\mathbf \Sigma}$ used above is often unknown. Here we use instead the sample covariance:

{\hat {\mathbf {\Sigma } }}={\frac {1}{n-1}}\sum _{i=1}^{n}(\mathbf {x} _{i}-{\overline {\mathbf {x} }})(\mathbf {x} _{i}-{\overline {\mathbf {x} }})'

where we denote transpose by an apostrophe. It can be shown that ${\hat {\mathbf {\Sigma } }}$ is a positive (semi) definite matrix and $(n-1){\hat {\mathbf {\Sigma } }}$ follows a p-variate Wishart distribution with n−1 degrees of freedom.^[3] The sample covariance matrix of the mean reads ${\hat {\mathbf {\Sigma } }}_{\overline {\mathbf {x} }}={\hat {\mathbf {\Sigma } }}/n$ .

Hotelling's t-squared statistic is then defined as:^[4]

t^{2}=({\overline {\mathbf {x} }}-{\boldsymbol {\mu }})'{\hat {\mathbf {\Sigma } }}_{\overline {\mathbf {x} }}^{-1}({\overline {\mathbf {x} }}-{\boldsymbol {\mathbf {\mu } }})

From the distribution above, with m = n − 1 degrees of freedom,

t^{2}\sim T_{p,n-1}^{2}={\frac {p(n-1)}{n-p}}F_{p,n-p},

where $F_{p,n-p}$ is the F-distribution with parameters p and n − p. In order to calculate a p-value (unrelated to the p variable here), divide the t² statistic by the above fraction and use the F-distribution.
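The one-sample procedure can be sketched in a few lines. This is an illustrative implementation, assuming NumPy and SciPy; the function name `hotelling_one_sample` and its return convention are hypothetical. It computes the statistic from the sample mean and the sample covariance (divisor n − 1), rescales it to an F statistic, and reads the p-value from the upper tail of the F-distribution with p and n − p degrees of freedom.

```python
import numpy as np
from scipy import stats


def hotelling_one_sample(x, mu0):
    """One-sample Hotelling test of H0: population mean equals mu0.

    x   : (n, p) array, one observation per row
    mu0 : length-p hypothesised mean vector
    Returns (t2, f_stat, p_value).
    """
    x = np.asarray(x, dtype=float)
    n, p = x.shape
    if n <= p:
        raise ValueError("need n > p for an invertible sample covariance")
    diff = x.mean(axis=0) - np.asarray(mu0, dtype=float)
    # Sample covariance with divisor n - 1; atleast_2d covers the p = 1 case.
    cov = np.atleast_2d(np.cov(x, rowvar=False, ddof=1))
    # t^2 = diff' (cov / n)^{-1} diff = n * diff' cov^{-1} diff
    t2 = n * diff @ np.linalg.solve(cov, diff)
    # Divide by p (n - 1) / (n - p) to obtain an F(p, n - p) variable.
    f_stat = (n - p) / (p * (n - 1)) * t2
    p_value = stats.f.sf(f_stat, p, n - p)
    return t2, f_stat, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.normal(size=(30, 3))  # synthetic data, true mean zero
    print(hotelling_one_sample(sample, [0.0, 0.0, 0.0]))
```

With p = 1 the statistic equals the square of the one-sample Student t statistic, and the p-value matches the two-sided t-test.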

Two-sample statistic

If ${\mathbf x}_1,\dots,{\mathbf x}_{n_x}\sim N_p(\boldsymbol{\mu},{\mathbf V})$ and ${\mathbf y}_1,\dots,{\mathbf y}_{n_y}\sim N_p(\boldsymbol{\mu},{\mathbf V})$ , with the samples independently drawn from two independent multivariate normal distributions with the same mean and covariance, and we define

\overline{\mathbf x}=\frac{1}{n_x}\sum_{i=1}^{n_x} \mathbf{x}_i \qquad \overline{\mathbf y}=\frac{1}{n_y}\sum_{i=1}^{n_y} \mathbf{y}_i

as the sample means, and

{\hat {\mathbf {\Sigma } }}_{\mathbf {x} }={\frac {1}{n_{x}-1}}\sum _{i=1}^{n_{x}}(\mathbf {x} _{i}-{\overline {\mathbf {x} }})(\mathbf {x} _{i}-{\overline {\mathbf {x} }})'

{\hat {\mathbf {\Sigma } }}_{\mathbf {y} }={\frac {1}{n_{y}-1}}\sum _{i=1}^{n_{y}}(\mathbf {y} _{i}-{\overline {\mathbf {y} }})(\mathbf {y} _{i}-{\overline {\mathbf {y} }})'

as the respective sample covariance matrices. Then

{\hat {\mathbf {\Sigma } }}={\frac {(n_{x}-1){\hat {\mathbf {\Sigma } }}_{\mathbf {x} }+(n_{y}-1){\hat {\mathbf {\Sigma } }}_{\mathbf {y} }}{n_{x}+n_{y}-2}}

is the unbiased pooled covariance matrix estimate (an extension of pooled variance).

Finally, Hotelling's two-sample t-squared statistic is

t^{2}={\frac {n_{x}n_{y}}{n_{x}+n_{y}}}({\overline {\mathbf {x} }}-{\overline {\mathbf {y} }})'{\hat {\mathbf {\Sigma } }}^{-1}({\overline {\mathbf {x} }}-{\overline {\mathbf {y} }})\sim T^{2}(p,n_{x}+n_{y}-2)

Related concepts

The two-sample statistic can be related to the F-distribution by^[3]

\frac{n_x+n_y-p-1}{(n_x+n_y-2)p}t^2 \sim F(p,n_x+n_y-1-p).
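The two-sample test can be sketched as follows, assuming NumPy and SciPy; the function name `hotelling_two_sample` is hypothetical. The pooled covariance weights each sample covariance by its degrees of freedom, n_x − 1 and n_y − 1, which is the usual unbiased pooled estimate, and the p-value comes from the F relation above.

```python
import numpy as np
from scipy import stats


def hotelling_two_sample(x, y):
    """Two-sample Hotelling test of equal means, assuming equal covariances.

    x : (n_x, p) array, y : (n_y, p) array, one observation per row.
    Returns (t2, f_stat, p_value).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    nx, p = x.shape
    ny, py = y.shape
    if p != py:
        raise ValueError("both samples need the same number of variables")
    if nx + ny - 1 - p <= 0:
        raise ValueError("need n_x + n_y - 1 > p")
    d = x.mean(axis=0) - y.mean(axis=0)
    sx = np.atleast_2d(np.cov(x, rowvar=False, ddof=1))
    sy = np.atleast_2d(np.cov(y, rowvar=False, ddof=1))
    # Pooled covariance: degrees-of-freedom weighted average of sx and sy.
    pooled = ((nx - 1) * sx + (ny - 1) * sy) / (nx + ny - 2)
    t2 = nx * ny / (nx + ny) * d @ np.linalg.solve(pooled, d)
    # Rescale to an F(p, n_x + n_y - 1 - p) variable.
    f_stat = (nx + ny - p - 1) / ((nx + ny - 2) * p) * t2
    p_value = stats.f.sf(f_stat, p, nx + ny - 1 - p)
    return t2, f_stat, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.normal(size=(25, 2))            # synthetic sample, mean zero
    b = rng.normal(size=(30, 2)) + 0.5      # synthetic sample, shifted mean
    print(hotelling_two_sample(a, b))
```

With p = 1 this reduces to the pooled-variance two-sample t-test: the statistic is the squared t statistic and the p-value is the two-sided one.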

The non-null distribution of this statistic is the noncentral F-distribution (the ratio of a noncentral chi-squared random variable and an independent central chi-squared random variable, each divided by its degrees of freedom):

\frac{n_x+n_y-p-1}{(n_x+n_y-2)p}t^2 \sim F(p,n_x+n_y-1-p;\delta),

with

\delta = \frac{n_x n_y}{n_x+n_y}\boldsymbol{\nu}'\mathbf{V}^{-1}\boldsymbol{\nu},

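where ${\boldsymbol {\nu }}={\boldsymbol {\mu }}_{\mathbf {x} }-{\boldsymbol {\mu }}_{\mathbf {y} }$ is the difference vector between the population means ${\boldsymbol {\mu }}_{\mathbf {x} }$ and ${\boldsymbol {\mu }}_{\mathbf {y} }$ of the two populations, which need not be equal in the non-null case.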
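The non-null distribution gives the power of the two-sample test for a given difference between the population means. The sketch below assumes SciPy's `stats.ncf` (noncentral F) and a known common covariance V; the function name `two_sample_power` and the default significance level are illustrative.

```python
import numpy as np
from scipy import stats


def two_sample_power(nu, v, nx, ny, alpha=0.05):
    """Power of the two-sample Hotelling test at level alpha.

    nu : length-p difference between the population means
    v  : (p, p) common population covariance matrix
    """
    nu = np.atleast_1d(np.asarray(nu, dtype=float))
    v = np.atleast_2d(np.asarray(v, dtype=float))
    p = nu.size
    dfn, dfd = p, nx + ny - 1 - p
    if dfd <= 0:
        raise ValueError("need n_x + n_y - 1 > p")
    # Noncentrality parameter delta = n_x n_y / (n_x + n_y) * nu' V^{-1} nu.
    delta = nx * ny / (nx + ny) * nu @ np.linalg.solve(v, nu)
    f_crit = stats.f.ppf(1.0 - alpha, dfn, dfd)  # rejection threshold under H0
    if delta == 0.0:
        return stats.f.sf(f_crit, dfn, dfd)      # central case: power equals alpha
    return stats.ncf.sf(f_crit, dfn, dfd, delta)


if __name__ == "__main__":
    for n in (20, 40, 80):
        print(n, two_sample_power([0.5, 0.5], np.eye(2), n, n))
```

The power grows with the sample sizes because the noncentrality parameter scales with n_x n_y / (n_x + n_y).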

In the two-variable case, the formula simplifies and shows how the correlation, $\rho$ , between the variables affects $t^{2}$ . If we define

d_{1}={\overline {x}}_{1}-{\overline {y}}_{1},\qquad d_{2}={\overline {x}}_{2}-{\overline {y}}_{2}

and

s_{1}={\sqrt {{\hat {\Sigma }}_{11}}}\qquad s_{2}={\sqrt {{\hat {\Sigma }}_{22}}}\qquad \rho ={\hat {\Sigma }}_{12}/(s_{1}s_{2})={\hat {\Sigma }}_{21}/(s_{1}s_{2}),

where ${\hat {\Sigma }}_{ij}$ denotes the (i, j) entry of the pooled covariance matrix ${\hat {\mathbf {\Sigma } }}$ , then

t^{2}={\frac {n_{x}n_{y}}{(n_{x}+n_{y})(1-\rho ^{2})}}\left[\left({\frac {d_{1}}{s_{1}}}\right)^{2}+\left({\frac {d_{2}}{s_{2}}}\right)^{2}-2\rho \left({\frac {d_{1}}{s_{1}}}\right)\left({\frac {d_{2}}{s_{2}}}\right)\right]

Thus, if the differences in the two components of the vector $(\overline {{\mathbf x}}-\overline {{\mathbf y}})$ are of the same sign, in general, $t^{2}$ becomes smaller as $\rho$ becomes more positive. If the differences are of opposite sign, $t^{2}$ becomes larger as $\rho$ becomes more positive.
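The two-variable formula can be checked against the general matrix form numerically. The sketch below, assuming NumPy, builds a 2 × 2 pooled covariance from s₁, s₂ and ρ and evaluates both expressions; the function names and the numerical values are illustrative.

```python
import numpy as np


def t2_bivariate(d1, d2, s1, s2, rho, nx, ny):
    """Two-variable closed form of the two-sample t-squared statistic."""
    z1, z2 = d1 / s1, d2 / s2  # standardised mean differences
    scale = nx * ny / ((nx + ny) * (1.0 - rho ** 2))
    return scale * (z1 ** 2 + z2 ** 2 - 2.0 * rho * z1 * z2)


def t2_matrix(d, pooled, nx, ny):
    """General form n_x n_y / (n_x + n_y) * d' pooled^{-1} d."""
    d = np.asarray(d, dtype=float)
    pooled = np.asarray(pooled, dtype=float)
    return nx * ny / (nx + ny) * d @ np.linalg.solve(pooled, d)


if __name__ == "__main__":
    s1, s2, nx, ny = 1.0, 1.0, 10, 10
    for d in ([1.0, 1.0], [1.0, -1.0]):
        for rho in (0.0, 0.5):
            pooled = [[s1 ** 2, rho * s1 * s2], [rho * s1 * s2, s2 ** 2]]
            print(d, rho,
                  t2_bivariate(d[0], d[1], s1, s2, rho, nx, ny),
                  t2_matrix(d, pooled, nx, ny))
```

With unit standard deviations and n_x = n_y = 10, equal differences d₁ = d₂ = 1 give t² = 10 at ρ = 0 and about 6.67 at ρ = 0.5, while opposite differences d₁ = 1, d₂ = −1 give 10 at ρ = 0 and 20 at ρ = 0.5. When the standardised differences are unequal, the same-sign trend can reverse for ρ close to 1, which is why the statement holds only in general.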

A univariate special case can be found in Welch's t-test.

More robust and powerful tests than Hotelling's two-sample test have been proposed in the literature; see, for example, the interpoint-distance-based tests, which can also be applied when the number of variables is comparable with, or even larger than, the number of subjects.^[5]^[6]

References

1. Hotelling, H. (1931). "The generalization of Student's ratio". Annals of Mathematical Statistics. 2 (3): 360–378. doi:10.1214/aoms/1177732979.
2. Eric W. Weisstein, MathWorld
3. Mardia, K. V.; Kent, J. T.; Bibby, J. M. (1979). Multivariate Analysis. Academic Press. ISBN 0-12-471250-9.
4.
5. Marozzi, M. (2014). "Multivariate tests based on interpoint distances with application to magnetic resonance imaging". Statistical Methods in Medical Research. doi:10.1177/0962280214529104.
6. Marozzi, M. (2015). "Multivariate multidistance tests for high-dimensional low sample size case-control studies". Statistics in Medicine. 34. doi:10.1002/sim.6418.

External links

Prokhorov, A.V. (2001) [1994], "Hotelling T²-distribution", in Hazewinkel, Michiel, Encyclopedia of Mathematics, Springer Science+Business Media B.V. / Kluwer Academic Publishers, ISBN 978-1-55608-010-4

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.