True variance

From Wikipedia, the free encyclopedia

In statistics, the term true variance is often used to refer to the unobservable variance of a whole finite population, as distinguished from an observable statistic based on a sample. Suppose a number, such as a person's height or income or age or cholesterol level, is assigned to every member of a population of n individuals. Let x_i be the number assigned to the ith individual, for i = 1, ..., n. Then the variance is

$\sigma^2={1 \over n}\sum_{i=1}^n (x_i-\overline{x})^2,\quad\quad\quad(1)$

where

$\mu=\overline{x}={x_1+\cdots+x_n \over n}$

is the population mean. If the x_i are instead the members of a random sample rather than of the whole population, the same function (1) is sometimes used as an estimate of the "true variance" or "population variance" σ². Alternatively, one replaces the divisor n with n − 1 or n + 1, or otherwise alters expression (1), in order to estimate σ². In particular, dividing by n − 1 makes the estimator unbiased, and in some commonly considered settings, such as sampling from a normal distribution, dividing by n + 1 minimizes the mean squared error among estimators of this form.
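The effect of the three divisors can be checked numerically. The following Python sketch is illustrative only: the function names `variance_estimate` and `simulate`, the sample size, the seed, and the choice of a normal distribution with σ = 2 are all assumptions made for this example, not part of the sources cited below. It divides the sum of squared deviations by n − 1, n, or n + 1 and estimates the mean and mean squared error of each estimator by simulation.

```python
import random


def variance_estimate(xs, offset=0):
    """Sum of squared deviations from the sample mean, divided by n + offset.

    offset=0 gives expression (1); offset=-1 gives the unbiased estimator;
    offset=+1 gives the n + 1 variant mentioned in the text.
    """
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n + offset)


def simulate(trials=20000, n=5, sigma=2.0, seed=0):
    """Monte Carlo estimate of the mean and MSE of each estimator.

    Samples are drawn from a normal distribution with mean 0 and standard
    deviation sigma (an assumption of this sketch), so the true variance
    is sigma ** 2.
    """
    rng = random.Random(seed)
    true_var = sigma ** 2
    totals = {off: [0.0, 0.0] for off in (-1, 0, 1)}  # [sum of estimates, sum of squared errors]
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        for off in totals:
            est = variance_estimate(sample, off)
            totals[off][0] += est
            totals[off][1] += (est - true_var) ** 2
    return {off: {"mean": s / trials, "mse": q / trials}
            for off, (s, q) in totals.items()}


if __name__ == "__main__":
    data = [2, 4, 4, 4, 5, 5, 7, 9]
    print(variance_estimate(data, 0))  # expression (1): 32 / 8 = 4.0
    for off, stats in simulate().items():
        print(off, stats)
```

Under these assumptions, the average of the n − 1 estimator should come out close to σ² = 4, while the n + 1 estimator should show the smallest mean squared error of the three despite being biased downward.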

Statisticians do not normally use the Greek letters μ and σ for estimates based on samples, but only for (often unobservable) characteristics of whole populations. Because the "true" or "population" variance uses the divisor n rather than n − 1, those concerned with computation sometimes call expression (1), with the divisor n, the "true variance" regardless of whether it is computed from a whole population or from a random sample.
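As one illustration of how software separates the two divisors, Python's standard `statistics` module provides `pvariance` (divisor n, as in expression (1)) and `variance` (divisor n − 1). The data values below are made up for this example.

```python
import math
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical values
n = len(data)

pop_var = statistics.pvariance(data)    # divisor n, expression (1)
sample_var = statistics.variance(data)  # divisor n - 1

# The two results differ only by the factor (n - 1) / n.
assert math.isclose(pop_var, sample_var * (n - 1) / n)
print(pop_var, sample_var)
```

The same data therefore yields two different numbers depending on which convention is chosen, which is why it helps to state the divisor explicitly when reporting a variance.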

References

Andrae, von (1872). Über die Bestimmung des wahrscheinlichen Fehlers durch die gegebenen Differenzen vom gleich genauen Beobachtungen einer Unbekannten. Astronomische Nachrichten, vol. 84.

Helmert, F.R. (1876). Die Berechnung des wahrscheinlichen Beobachtungsfehlers aus den ersten Potenzen der Differenzen gleichgenauer directer Beobachtungen. Astronomische Nachrichten, vol. 88.

Kendall, M. (1943). The Advanced Theory of Statistics. In Stuart, A., & Ord, J.K. (1987). Kendall's Advanced Theory of Statistics, 5th Ed. London: Griffin.

Press, W.H., Teukolsky, S.A., Vetterling, W.T., & Flannery, B.P. (1992). Numerical Recipes, 2nd Ed. Cambridge, MA: Cambridge University Press.

Krus, D.J., & Ceurvorst, R.W. (1979). Dominance, information, and hierarchical scaling of variance space. Applied Psychological Measurement, 3, 515-527.