Omitted-variable bias

From Wikipedia, the free encyclopedia

Omitted-variable bias (OVB) is the bias that appears in estimates of parameters in a regression analysis when the assumed specification is incorrect, in that it omits an independent variable that should be in the model.

Omitted-variable bias in linear regression

Two conditions must hold for omitted-variable bias to exist in linear regression:

  • the omitted variable must be a determinant of the dependent variable (i.e., its true regression coefficient is not zero); and
  • the omitted variable must be correlated with one or more of the included independent variables.
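The two conditions can be illustrated with a small simulation. The sketch below is illustrative only: the data-generating process (a true slope of 2 on x, an omitted variable z = rho * x + noise, standard normal errors) and the helper name `short_regression_slope` are assumptions chosen for the example, not part of any reference.

```python
import numpy as np

def short_regression_slope(delta, rho, n=200_000, seed=0):
    """Slope on x when y is regressed on a constant and x only, omitting z.

    Hypothetical data-generating process (an assumption for illustration):
        z = rho * x + noise,   y = 1 + 2 * x + delta * z + u
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    z = rho * x + rng.normal(size=n)      # rho controls correlation with x
    u = rng.normal(size=n)
    y = 1.0 + 2.0 * x + delta * z + u     # delta is the true effect of z
    X = np.column_stack([np.ones(n), x])  # z is deliberately left out
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]

# Both conditions hold: the slope is pulled away from 2, towards 2 + delta * rho.
both = short_regression_slope(delta=3.0, rho=0.5)
# z has no effect on y (delta = 0): the slope stays close to 2.
no_effect = short_regression_slope(delta=0.0, rho=0.5)
# z is uncorrelated with x (rho = 0): the slope stays close to 2.
uncorrelated = short_regression_slope(delta=3.0, rho=0.0)

print(both, no_effect, uncorrelated)
```

Only the first case violates neither condition, and only there does the estimated slope move away from the true value of 2.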

As an example, consider a linear model of the form y_i = x_i\beta + z_i\delta + u_i, where x_i is a vector of included regressors and z_i is a scalar. For simplicity, suppose that E[u_i | x_i, z_i] = 0. Now consider what happens if one regresses y_i on x_i alone. Writing x, y, z and u for these quantities stacked over all observations, the usual least squares calculation gives the estimated parameter vector \hat{\beta} as:

\hat{\beta} = (x'x)^{-1}x'y.\,

Substituting for y based on the assumed linear model,

\hat{\beta} = (x'x)^{-1}x'(x\beta+z\delta+u)=(x'x)^{-1}x'x\beta + (x'x)^{-1}x'z\delta + (x'x)^{-1}x'u.\,

Taking expectations conditional on x and z, the final term (x'x)^{-1}x'u drops out because E[u | x, z] = 0 under the assumption above. Simplifying the remaining terms:

E[ \hat{\beta} ] = \beta + \delta (x'x)^{-1}x'z.\,

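The second term is the omitted-variable bias in this case. The vector (x'x)^{-1}x'z is the coefficient vector from a least squares regression of z_i on x_i, so the bias equals \delta times the portion of z_i that is "explained" by x_i. It is zero when \delta = 0 or when x'z = 0, which matches the two conditions stated earlier.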
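The same decomposition holds exactly in any sample when the true \delta is replaced by its estimate from the full regression, because the residuals of the full regression are orthogonal to x. The sketch below checks this numerically; the sample size, coefficient values and the helper `ols` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n, k = 500, 2

# Hypothetical data: two included regressors x and one omitted scalar z.
x = rng.normal(size=(n, k))
z = x @ np.array([0.8, -0.3]) + rng.normal(size=n)   # z is correlated with x
beta, delta = np.array([1.0, 2.0]), 1.5              # assumed true parameters
y = x @ beta + delta * z + rng.normal(size=n)

def ols(X, target):
    """Least squares coefficients, i.e. (X'X)^{-1} X'target."""
    return np.linalg.lstsq(X, target, rcond=None)[0]

beta_short = ols(x, y)                        # regression that omits z
long_coef = ols(np.column_stack([x, z]), y)   # regression that includes z
beta_long, delta_long = long_coef[:k], long_coef[k]
gamma = ols(x, z)                             # (x'x)^{-1} x'z

# In-sample identity: short coefficients = long coefficients + delta_hat * gamma.
identity_holds = np.allclose(beta_short, beta_long + delta_long * gamma)

# Conditional bias of the short regression under the assumed true delta.
bias = delta * gamma
print(identity_holds, bias)
```

Here `gamma` plays the role of (x'x)^{-1}x'z in the derivation, and `bias` is the term \delta (x'x)^{-1}x'z evaluated for this particular sample.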

References

  • Greene, W. H. (1993). Econometric Analysis, 2nd ed. Macmillan, pp. 245-246.