Heteroscedasticity-consistent standard errors

The topic of heteroscedasticity-consistent (HC) standard errors arises in statistics and econometrics in the context of linear regression as well as time series analysis. They are also known as Huber–White, Eicker–White or Eicker–Huber–White standard errors.^[1]

In regression and time-series modelling, basic forms of models assume that the errors or disturbances $u_i$ have the same variance across all observation points. When this is not the case, the errors are said to be heteroscedastic, or to have heteroscedasticity, and this behaviour will be reflected in the residuals $\hat u_i$ estimated from a fitted model. Heteroscedasticity-consistent standard errors allow valid inference on the coefficients of a fitted model even when its errors are heteroscedastic. The first such approach was proposed by Huber (1967), and further improved procedures have been produced since for cross-sectional data, time-series data and GARCH estimation.

Definition

Assume that we are studying the linear regression model

Y = X' \beta + U, \,

where X is the k × 1 vector of explanatory variables, β is a k × 1 column vector of parameters to be estimated, and U is the error term.

The ordinary least squares (OLS) estimator is

\widehat \beta_{OLS} = (\mathbb{X}' \mathbb{X})^{-1} \mathbb{X}' \mathbb{Y}, \,

where $\mathbb{X}$ denotes the matrix of stacked $X_i'$ values observed in the data and $\mathbb{Y}$ the vector of stacked $Y_i$ values.

If the sample errors have equal variance σ² and are uncorrelated, then the least-squares estimate of β is BLUE (best linear unbiased estimator), and its variance is easily estimated with

v_{OLS}[\hat\beta_{OLS}] = s^2 (\mathbb{X}'\mathbb{X})^{-1}, \qquad s^2 = \frac{\sum_i \hat u_i^2}{n-k},

where $\hat u_i$ are regression residuals.
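The two formulas above can be sketched numerically. This is a minimal illustration rather than a reference implementation: the data are made up, and the variable names (`beta_hat`, `u_hat`, `s2`, `v_ols`, `se_ols`) are chosen here for readability and do not come from any particular package.

```python
import numpy as np

# Illustrative data (made up): n = 6 observations, an intercept and one regressor.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X = np.column_stack([np.ones_like(x), x])   # n x k matrix of stacked X_i'
y = np.array([1.1, 1.9, 3.2, 3.8, 5.3, 5.7])

n, k = X.shape

# OLS estimator: solve the normal equations (X'X) beta = X'y.
XtX = X.T @ X
beta_hat = np.linalg.solve(XtX, X.T @ y)

# Residuals and the degrees-of-freedom-corrected variance estimate s^2.
u_hat = y - X @ beta_hat
s2 = (u_hat @ u_hat) / (n - k)

# Classical variance estimate s^2 (X'X)^{-1} and the implied standard errors.
v_ols = s2 * np.linalg.inv(XtX)
se_ols = np.sqrt(np.diag(v_ols))
```

The standard errors in `se_ols` are only appropriate under the equal-variance, uncorrelated-errors assumption stated above.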

When the assumption $E[uu'] = \sigma^2 I_n$ is violated, the OLS estimator loses some of its desirable properties. Indeed,

V[\hat\beta_{OLS}] = V[ (\mathbb{X}'\mathbb{X})^{-1} \mathbb{X}'\mathbb{Y}] = (\mathbb{X}'\mathbb{X})^{-1} \mathbb{X}' \Sigma \mathbb{X} (\mathbb{X}'\mathbb{X})^{-1}

where $\Sigma = V[u]$.

While the OLS point estimator remains unbiased, it is no longer "best" in the sense of having minimum variance among linear unbiased estimators, and the OLS variance estimator $v_{OLS}[\hat\beta_{OLS}]$ does not provide a consistent estimate of the variance of the OLS estimates.
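To make the gap concrete, the following sketch compares the sandwich variance $(\mathbb{X}'\mathbb{X})^{-1} \mathbb{X}' \Sigma \mathbb{X} (\mathbb{X}'\mathbb{X})^{-1}$ with the expected value of the classical estimator for a known, diagonal Σ. Both the design and the variance pattern $\sigma_i^2 = x_i^2$ are assumptions made up for illustration, and the helper name `true_and_classical_variance` is hypothetical. The expected value of $s^2$ uses the identity $E[\hat u'\hat u] = \operatorname{tr}(M\Sigma)$ with $M = I - \mathbb{X}(\mathbb{X}'\mathbb{X})^{-1}\mathbb{X}'$, treating the regressors as fixed.

```python
import numpy as np

def true_and_classical_variance(X, sigma2):
    """Return the true variance of the OLS estimator under Sigma = diag(sigma2)
    and the expectation of the classical estimator s^2 (X'X)^{-1}."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    Sigma = np.diag(sigma2)

    # True variance: the sandwich formula with the known Sigma.
    v_true = XtX_inv @ X.T @ Sigma @ X @ XtX_inv

    # E[s^2] = tr(M Sigma) / (n - k), with M the residual-maker matrix.
    M = np.eye(n) - X @ XtX_inv @ X.T
    expected_s2 = np.trace(M @ Sigma) / (n - k)
    v_classical = expected_s2 * XtX_inv
    return v_true, v_classical

# Made-up design: intercept plus one regressor.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X = np.column_stack([np.ones_like(x), x])

# Homoscedastic case: the two matrices coincide.
v_true_hom, v_classical_hom = true_and_classical_variance(X, np.full(len(x), 2.0))

# Heteroscedastic case (assumed pattern sigma_i^2 = x_i^2): they differ.
v_true_het, v_classical_het = true_and_classical_variance(X, x ** 2)
```

In this particular example the true variance of the slope coefficient is larger than what the classical formula delivers on average, so classical standard errors would be too small; with other variance patterns the direction can reverse.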

White's heteroscedasticity-consistent estimator

If the regression errors $u_i$ are independent, but have distinct variances $\sigma_i^2$, then $\Sigma = \operatorname{diag}(\sigma_1^2, \ldots, \sigma_n^2)$, which can be estimated with $\hat\sigma_i^2 = \hat u_i^2$. This provides White's (1980) estimator, often referred to as HCE (heteroscedasticity-consistent estimator):

\begin{align} v_{HCE}[\hat\beta_{OLS}] &= \frac{1}{n} (\frac{1}{n} \sum_i X_i X_i' )^{-1} (\frac{1}{n} \sum_i X_i X_i' \hat{u}_i^2 ) (\frac{1}{n} \sum_i X_i X_i' )^{-1} \\ &= ( \mathbb{X}' \mathbb{X} )^{-1} ( \mathbb{X}' \operatorname{diag}(\hat u_1^2, \ldots, \hat u_n^2) \mathbb{X} ) ( \mathbb{X}' \mathbb{X})^{-1}, \end{align}

where as above $\mathbb{X}$ denotes the matrix of stacked $X_i'$ values from the data. The estimator can be derived in terms of the generalized method of moments (GMM).
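A minimal NumPy sketch of this estimator follows, assuming made-up data; the function name `hce_covariance` is hypothetical. It computes the matrix form and then checks it against the averaged form from the first line of the display.

```python
import numpy as np

def hce_covariance(X, y):
    """White's estimator (X'X)^{-1} X' diag(u_hat^2) X (X'X)^{-1}."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    u_hat = y - X @ beta_hat
    # X' diag(u_hat^2) X, computed without forming the n x n diagonal matrix.
    meat = (X * (u_hat ** 2)[:, None]).T @ X
    return beta_hat, u_hat, XtX_inv @ meat @ XtX_inv

# Illustrative data (made up).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X = np.column_stack([np.ones_like(x), x])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.3, 5.7])
n = X.shape[0]

beta_hat, u_hat, v_hce = hce_covariance(X, y)
se_hce = np.sqrt(np.diag(v_hce))   # heteroscedasticity-consistent standard errors

# The same quantity via the averaged form (1/n) S_xx^{-1} S_xu S_xx^{-1}.
S_xx = sum(np.outer(xi, xi) for xi in X) / n
S_xu = sum(ui ** 2 * np.outer(xi, xi) for xi, ui in zip(X, u_hat)) / n
v_avg = np.linalg.inv(S_xx) @ S_xu @ np.linalg.inv(S_xx) / n
```

Scaling each row of `X` by the squared residual avoids building the n × n diagonal matrix, which matters once n is large.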

Also often discussed in the literature (including in White's paper itself) is the estimator $\hat\Omega_n$ of the covariance matrix Ω of the limiting distribution of the $\sqrt{n}$-scaled estimator:

\sqrt{n}(\hat\beta_n - \beta) \xrightarrow{d} N(0,\Omega),

where,

\Omega = E[X X']^{-1} \operatorname{Var}[X u] E[X X']^{-1},

and

\begin{align} \hat\Omega_n &= (\frac{1}{n} \sum_i X_i X_i' )^{-1} (\frac{1}{n} \sum_i X_i X_i' \hat u_i^2 ) (\frac{1}{n} \sum_i X_i X_i' )^{-1} \\ &= n ( \mathbb{X}' \mathbb{X} )^{-1} ( \mathbb{X}' \operatorname{diag}(\hat u_1^2, \ldots, \hat u_n^2) \mathbb{X} ) ( \mathbb{X}' \mathbb{X})^{-1}. \end{align}

Thus,

\hat\Omega_n = n \cdot v_{HCE}[\hat\beta_{OLS}]

and

\widehat{\operatorname{Var}}[X u] = \frac{1}{n} \sum_i X_i X_i' \hat u_i^2 = \frac{1}{n} \mathbb{X}' \operatorname{diag}(\hat u_1^2, \ldots, \hat u_n^2) \mathbb{X}.

Which of these covariance matrices is relevant depends on context.
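The relationship between the two scalings can be checked numerically. The sketch below uses simulated data under an assumed design (error standard deviation proportional to the regressor, with a fixed seed); none of the specific numbers come from the literature.

```python
import numpy as np

# Simulated heteroscedastic data (assumed design, for illustration only).
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1.0, 5.0, size=n)
X = np.column_stack([np.ones(n), x])
u = rng.normal(size=n) * x                 # error s.d. proportional to x
y = X @ np.array([1.0, 2.0]) + u

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
u_hat = y - X @ beta_hat

# Estimated Var[X u]: (1/n) sum_i X_i X_i' u_hat_i^2.
var_xu_hat = (X * (u_hat ** 2)[:, None]).T @ X / n

# Omega_hat_n: estimate of the covariance of the sqrt(n)-scaled limit.
S_xx_inv = np.linalg.inv(X.T @ X / n)
omega_hat = S_xx_inv @ var_xu_hat @ S_xx_inv

# v_HCE is Omega_hat_n divided by n; standard errors come from v_HCE.
v_hce = omega_hat / n
se_hce = np.sqrt(np.diag(v_hce))
```

Reported standard errors are based on `v_hce`, while `omega_hat` is the quantity that stays stable as n grows.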

Alternative estimators have been proposed in MacKinnon & White (1985) that correct for unequal variances of regression residuals due to different leverage. Unlike White's asymptotic estimator, their estimators are unbiased when the data are homoscedastic.
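Leverage-based corrections of this kind are commonly labelled HC1, HC2 and HC3 in later literature and software, with White's original estimator called HC0. The sketch below uses the forms in which they are usually stated: a degrees-of-freedom rescaling by n/(n − k) for HC1, and squared residuals divided by $(1 - h_{ii})$ for HC2 or by $(1 - h_{ii})^2$ for HC3, where $h_{ii}$ is the i-th diagonal element of the hat matrix. That these exact forms match the cited paper term by term is an assumption here, and the function name `hc_covariances` is hypothetical.

```python
import numpy as np

def hc_covariances(X, y):
    """Return HC0-HC3 covariance estimates for the OLS coefficients."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    u2 = (y - X @ beta_hat) ** 2

    # Leverage h_ii = X_i' (X'X)^{-1} X_i, the diagonal of the hat matrix.
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)

    def sandwich(w):
        # (X'X)^{-1} X' diag(w) X (X'X)^{-1} for observation weights w.
        return XtX_inv @ ((X * w[:, None]).T @ X) @ XtX_inv

    return {
        "HC0": sandwich(u2),                    # White (1980)
        "HC1": sandwich(u2) * n / (n - k),      # degrees-of-freedom rescaling
        "HC2": sandwich(u2 / (1.0 - h)),        # leverage correction
        "HC3": sandwich(u2 / (1.0 - h) ** 2),   # stronger leverage correction
    }, h

# Illustrative data (made up).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X = np.column_stack([np.ones_like(x), x])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.3, 5.7])

covs, leverage = hc_covariances(X, y)
std_errors = {name: np.sqrt(np.diag(v)) for name, v in covs.items()}
```

Because $0 \le h_{ii} < 1$ here, the HC2 and HC3 weights are at least as large as the HC0 weights, so their standard errors are never smaller than those from HC0; the differences tend to shrink as the sample grows and individual leverages fall.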

Software

Stata: robust option applicable in many pseudo-likelihood based procedures. See online help for _robust option and regress command.
RATS: robusterrors option is available in many of the regression and optimization commands (linreg, nlls, etc.).
EViews: version 8 offers three different methods for robust least squares: M-estimation (Huber, 1973), S-estimation (Rousseeuw and Yohai, 1984), and MM-estimation (Yohai, 1987).

References

[1] Kleiber, C.; Zeileis, A. (2006), Applied Econometrics with R, UseR-2006 conference

Hayes, Andrew F.; Cai, Li (2007), "Using heteroscedasticity-consistent standard error estimators in OLS regression: An introduction and software implementation", Behavior Research Methods 39 (4): 709–722

Eicker, Friedhelm (1967), "Limit Theorems for Regression with Unequal and Dependent Errors", Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, pp. 59–82, MR 0214223, Zbl 0217.51201

MacKinnon, James G.; White, Halbert (1985), "Some Heteroskedasticity-Consistent Covariance Matrix Estimators with Improved Finite Sample Properties", Journal of Econometrics 29 (3): 305–325, doi:10.1016/0304-4076(85)90158-7

Huber, Peter J. (1967), "The behavior of maximum likelihood estimates under nonstandard conditions", Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, pp. 221–233, MR 0216620, Zbl 0212.21504

White, Halbert (1980), "A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity", Econometrica 48 (4): 817–838, doi:10.2307/1912934, JSTOR 1912934, MR 575027

Greene, William (1998), Econometric Analysis, Prentice Hall
