Heteroscedasticity

From Wikipedia, the free encyclopedia

Plot with random data showing heteroscedasticity.

In statistics, a sequence or vector of random variables is heteroscedastic if the random variables do not all have the same variance. The complementary concept is called homoscedasticity. (Note: There is no standard agreed-upon spelling for these words; they are sometimes spelled homo- or heteroskedastic or, incorrectly, -schedastic, depending on location and personal taste.)

Many statistical techniques, such as ordinary least squares (OLS), rely on a number of assumptions. One of these is that the error term has a constant variance. This holds if the observations of the error term are drawn from identical distributions. Heteroscedasticity is a violation of this assumption.
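
The assumption can be written compactly. The notation below is not from the original text and is only one common convention: \varepsilon_i is the error for observation i, x_i its regressors, and \sigma^2 a variance.

```latex
\text{Homoscedasticity: } \operatorname{Var}(\varepsilon_i \mid x_i) = \sigma^2 \quad \text{for all } i
\qquad
\text{Heteroscedasticity: } \operatorname{Var}(\varepsilon_i \mid x_i) = \sigma_i^2, \quad \sigma_i^2 \neq \sigma_j^2 \text{ for some } i \neq j
```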

For example, the variance of the error term could vary or increase with each observation, something that is often the case with cross-sectional or time series measurements. Heteroscedasticity is often studied as part of econometrics, which frequently deals with data exhibiting it. It comes in two forms, pure and impure. Because there are so many types of each, most textbooks limit themselves to dealing with heteroscedasticity in general, or to one or two examples.

With the advent of robust standard errors, which allow inference without specifying the conditional second moment of the error term, testing for conditional homoscedasticity is not as important as it used to be. In any case, the most popular test for conditional homoscedasticity is due to White (1980).
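
As a rough illustration of robust standard errors, the sketch below fits a one-regressor OLS line and computes both the conventional standard error of the slope and a heteroscedasticity-consistent one in the spirit of White (1980), in the simplest form often labelled HC0. The simulated data, the function name and the noise model (error standard deviation growing with x squared) are assumptions made for the example, not part of the original article.

```python
import math
import random


def ols_slope_with_standard_errors(x, y):
    """Fit y = a + b*x by OLS; return (b, conventional SE, HC0 robust SE)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]

    # Conventional SE: assumes a single error variance shared by all observations.
    s2 = sum(e * e for e in resid) / (n - 2)
    se_conventional = math.sqrt(s2 / sxx)

    # HC0 robust SE: each squared residual stands in for its own error variance.
    meat = sum(((xi - x_bar) ** 2) * e * e for xi, e in zip(x, resid))
    se_robust = math.sqrt(meat / sxx ** 2)
    return b, se_conventional, se_robust


# Hypothetical data: true slope 2.0, noise spread growing with x (an assumption).
rng = random.Random(0)
x = [rng.uniform(0, 10) for _ in range(5000)]
y = [1.0 + 2.0 * xi + rng.gauss(0, 0.1 * xi ** 2) for xi in x]

slope, se_conventional, se_robust = ols_slope_with_standard_errors(x, y)
print(f"slope={slope:.3f} conventional SE={se_conventional:.4f} robust SE={se_robust:.4f}")
```

In this particular setup the robust standard error comes out larger than the conventional one, because the noisiest observations are also the ones that most influence the slope. With a different variance pattern the ordering can reverse.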

The econometrician Robert Engle won the 2003 Nobel Memorial Prize in Economic Sciences for his studies on regression analysis in the presence of heteroscedasticity, which led to his formulation of the ARCH (AutoRegressive Conditional Heteroscedasticity) modeling technique.

Consequences

The consequences are similar to, but not quite the same as, those of serial correlation.

  1. When OLS is applied to heteroscedastic models, the usual estimator of the variance of the coefficient estimates is biased. That is, it either overestimates or underestimates the true variance, and in general it is not possible to determine the direction of the bias. The variances, and so the standard errors, may therefore be either understated or overstated, so the usual t-tests are no longer valid.
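
A small Monte Carlo experiment can make this bias visible. The sketch below redraws heteroscedastic errors many times for a fixed set of x values, then compares the actual spread of the OLS slope estimates with the average standard error that the conventional formula reports. All numbers, names and the noise model are hypothetical choices for illustration. This design happens to produce understated standard errors; other designs can produce overstated ones.

```python
import math
import random
import statistics


def slope_and_conventional_se(x, y):
    """Fit y = a + b*x by OLS; return (b, conventional SE of b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, math.sqrt(rss / (n - 2) / sxx)


rng = random.Random(1)
n_obs, n_reps = 200, 500
x = [rng.uniform(0, 10) for _ in range(n_obs)]  # regressors held fixed across replications

slopes, reported_ses = [], []
for _ in range(n_reps):
    # Assumed noise model: error standard deviation grows with x squared.
    y = [1.0 + 2.0 * xi + rng.gauss(0, 0.1 * xi ** 2) for xi in x]
    slope, se = slope_and_conventional_se(x, y)
    slopes.append(slope)
    reported_ses.append(se)

mean_slope = sum(slopes) / n_reps               # stays close to the true slope 2.0
actual_sd = statistics.stdev(slopes)            # true sampling spread of the slope
mean_reported_se = sum(reported_ses) / n_reps   # what the conventional formula claims
print(f"mean slope={mean_slope:.3f} actual sd={actual_sd:.4f} mean reported SE={mean_reported_se:.4f}")
```

A t-test built on the reported standard error would, in this setup, reject a true null hypothesis more often than its nominal level suggests.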

Examples

Heteroscedasticity often occurs when there are large differences among the sizes of the observations.

  1. The classic example of heteroscedasticity is that of income versus food consumption. As one's income increases, the variability of food consumption will increase. A poorer person will spend a rather constant amount by always eating fast food; a wealthier person may occasionally buy fast food and other times eat an expensive meal. Those with higher incomes display a greater variability of food consumption.
  2. Imagine you are watching a rocket take off nearby and measuring the distance it has travelled once each second. In the first couple of seconds your measurements may be accurate to the nearest centimetre, say. However, 5 minutes later as the rocket recedes into space, the accuracy of your measurements may only be good to 100 m, because of the increased distance, atmospheric distortion and a variety of other factors. The data you collect would exhibit heteroscedasticity.
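
The income and food consumption example can be mimicked with simulated data. The sketch below is only an illustration under assumed numbers: incomes in arbitrary units, spending that rises linearly with income, and noise whose standard deviation is proportional to income. It fits a straight line and compares the spread of the residuals for the lowest-income third and the highest-income third of the sample.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical incomes and food spending; the noise spread scales with income.
incomes = [rng.uniform(10, 100) for _ in range(3000)]
spending = [5.0 + 0.1 * inc + rng.gauss(0, 0.02 * inc) for inc in incomes]

# Fit spending = a + b * income by ordinary least squares.
n = len(incomes)
inc_bar = sum(incomes) / n
spend_bar = sum(spending) / n
sxx = sum((inc - inc_bar) ** 2 for inc in incomes)
b = sum((inc - inc_bar) * (s - spend_bar) for inc, s in zip(incomes, spending)) / sxx
a = spend_bar - b * inc_bar
residuals = [s - a - b * inc for inc, s in zip(incomes, spending)]

# Sort residuals by income and compare the spread in the bottom and top thirds.
by_income = sorted(zip(incomes, residuals))
third = n // 3
sd_low = statistics.pstdev([r for _, r in by_income[:third]])
sd_high = statistics.pstdev([r for _, r in by_income[-third:]])
print(f"residual sd, low incomes: {sd_low:.3f}; high incomes: {sd_high:.3f}")
```

Under homoscedasticity the two spreads would be roughly equal. Here the high-income group shows a clearly larger residual spread, which is the pattern the example describes.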

References

There are a great number of references, as most statistics textbooks include at least some material on heteroscedasticity. Some examples are:

  1. Studenmund, A.H. Using Econometrics 2nd Ed. ISBN 0-673-52125-7. (devotes a chapter to heteroscedasticity).
  2. Verbeek, Marno (2004): A Guide to Modern Econometrics, 2. ed., Chichester: John Wiley & Sons, 2004, pages
  3. Greene, W.H. (1993), Econometric Analysis, Prentice-Hall, ISBN 0-13-013297-7 (an introductory but thorough general text, considered the standard for a pre-doctorate university econometrics course).
  4. Hamilton, J.D. (1994), Time Series Analysis, Princeton University Press, ISBN 0-691-04289-6 (the reference text for time series analysis; it contains an introduction to ARCH models).

Special subjects:

  • White test: White, Halbert (1980): A Heteroscedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroscedasticity, in: Econometrica, Vol. 48, 1980, pages 817-838