Cochran's theorem

In statistics, Cochran's theorem, devised by William G. Cochran, is a theorem used to justify results relating to the probability distributions of statistics that are used in the analysis of variance.

Overview

Suppose U1, ..., Un are independent standard normally distributed random variables, and an identity of the form


\sum_{i=1}^n U_i^2=Q_1+\cdots + Q_k

can be written, where each Qi is a sum of squares of linear combinations of the Us. Further suppose that


r_1+\cdots +r_k=n

where ri is the rank of Qi. Cochran's theorem states that the Qi are independent, and that each Qi has a chi-squared distribution with ri degrees of freedom.
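
Equivalently, the hypothesis can be phrased in matrix form: if U denotes the column vector of the Ui, each Qi can be written as a quadratic form

Q_i=U^\top B_i U

with Bi symmetric and positive semi-definite; the identity above then says that B_1+\cdots+B_k is the n × n identity matrix, and ri is the rank of Bi.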

Cochran's theorem is the converse of Fisher's theorem.

Example

If X1, ..., Xn are independent normally distributed random variables with mean μ and standard deviation σ, then

Ui = (Xi − μ) / σ

is standard normal for each i.

It is possible to write


\sum U_i^2=\sum\left(\frac{X_i-\overline{X}}{\sigma}\right)^2
+ n\left(\frac{\overline{X}-\mu}{\sigma}\right)^2

(here, summation is from 1 to n, that is, over the observations). To see this identity, multiply throughout by σ² and note that


\sum(X_i-\mu)^2=
\sum(X_i-\overline{X}+\overline{X}-\mu)^2

and expand to give


\sum(X_i-\overline{X})^2+\sum(\overline{X}-\mu)^2+
2\sum(X_i-\overline{X})(\overline{X}-\mu).

The third term is zero because it is equal to a constant times

\sum(X_i-\overline{X})=\sum X_i-n\overline{X}=0

(the sum vanishes by the definition of the sample mean), and the second term is just n identical terms added together.

Combining the above results (and dividing by σ²), we have:


\sum\left(\frac{X_i-\mu}{\sigma}\right)^2=
\sum\left(\frac{X_i-\overline{X}}{\sigma}\right)^2
+n\left(\frac{\overline{X}-\mu}{\sigma}\right)^2
=Q_1+Q_2.
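
where

Q_1=\sum\left(\frac{X_i-\overline{X}}{\sigma}\right)^2
\qquad\text{and}\qquad
Q_2=n\left(\frac{\overline{X}-\mu}{\sigma}\right)^2.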

Now the rank of Q2 is just 1 (it is the square of just one linear combination of the standard normal variables). The rank of Q1 can be shown to be n − 1 (its n squared terms are linear combinations that satisfy a single linear constraint, since they sum to zero), and thus the conditions for Cochran's theorem are met.

Cochran's theorem then states that Q1 and Q2 are independent, with chi-squared distributions with n − 1 and 1 degrees of freedom respectively.

This shows that the sample mean and sample variance are independent; also


(\overline{X}-\mu)^2\sim \frac{\sigma^2}{n}\chi^2_1.
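
These distributional claims can be checked numerically. The following is a minimal simulation sketch in Python (assuming NumPy and SciPy are available; the sample size and parameters below are illustrative only):

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, reps = 10, 100_000
mu, sigma = 3.0, 2.0

# Draw `reps` independent samples of size n from N(mu, sigma^2).
X = rng.normal(mu, sigma, size=(reps, n))
xbar = X.mean(axis=1)

# Q1 and Q2 from the decomposition above.
q1 = ((X - xbar[:, None]) ** 2).sum(axis=1) / sigma**2
q2 = n * (xbar - mu) ** 2 / sigma**2

# Near-zero sample correlation is consistent with (though weaker than)
# the independence asserted by the theorem.
print("corr(Q1, Q2):", np.corrcoef(q1, q2)[0, 1])

# Kolmogorov-Smirnov tests against the claimed chi-squared laws;
# large p-values indicate consistency with chi2(n-1) and chi2(1).
print("Q1 vs chi2(n-1):", stats.kstest(q1, "chi2", args=(n - 1,)).pvalue)
print("Q2 vs chi2(1):", stats.kstest(q2, "chi2", args=(1,)).pvalue)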

To estimate the variance σ², one estimator that is often used is


\widehat{\sigma}^2=
\frac{1}{n}\sum\left(
X_i-\overline{X}\right)^2.

Cochran's theorem shows that


\frac{n\widehat{\sigma}^2}{\sigma^2}\sim\chi^2_{n-1}

which shows that the expected value of \widehat{\sigma}^2 is σ²(n − 1)/n.
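
Explicitly, since a chi-squared variable with k degrees of freedom has expected value k,

\operatorname{E}\left[\widehat{\sigma}^2\right]
=\frac{\sigma^2}{n}\operatorname{E}\left[\frac{n\widehat{\sigma}^2}{\sigma^2}\right]
=\frac{\sigma^2}{n}(n-1)
=\frac{(n-1)\sigma^2}{n},

so this estimator of σ² is slightly biased downward.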

Both of these quantities have distributions proportional to the true but unknown variance σ²; thus their ratio does not depend on σ², and because they are statistically independent we have


\frac{n\left(\overline{X}-\mu\right)^2}
{\frac{1}{n-1}\sum\left(X_i-\overline{X}\right)^2}\sim
F_{1,n-1}

where F1,n − 1 is the F-distribution with 1 and n − 1 degrees of freedom (see also Student's t-distribution).
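
The left-hand side above is the square of the one-sample t statistic,

t=\frac{\overline{X}-\mu}{s/\sqrt{n}},
\qquad
s^2=\frac{1}{n-1}\sum\left(X_i-\overline{X}\right)^2,

reflecting the standard fact that the square of a variable with Student's t-distribution on n − 1 degrees of freedom has the F1,n−1 distribution.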
