Talk:Instrumental variable


This article is within the scope of WikiProject Statistics, which collaborates to improve Wikipedia's coverage of statistics. If you would like to participate, please visit the project page.

This article is within the scope of the Economics WikiProject, an effort to create, expand, organize, and improve economics-related articles.
This article has been rated as Start-Class on the assessment scale.
This article has been rated as Mid-importance on the importance scale.

?

I think it would be a good idea to combine the two pages, as they are on the same topic, although one is a simple algebraic derivation of the IV estimator and the other is a more discursive presentation of the purposes and assumptions underpinning IV estimation. The two pages are quite complementary.


I agree, merge them. SCB


There is an important subtlety that is missing from the discussion, and is often missing even in texts. OLS still provides an excellent predictor of the response under the current data generating process (DGP). In other words, if I know market price and I want to predict market quantity, OLS does a fine job. On the other hand, if I want to recover the structural equations, supply and demand, I need IV. That allows me to predict quantity given an exogenous shock to price, such as a tax. So, when we say "a consistent estimator", it's a little ambiguous. We ought to be saying what we want to estimate consistently.

Just to be clear, suppose we are trying to guess future earnings based on IQ. However, IQ is measured with an additive error. Suppose Jim's measured IQ is 100 and I want to guess his future earnings. I'll do just fine plugging his measured IQ into an OLS estimate, so long as the measurement error for him follows the same DGP as the data for the regression. On the other hand, if I want to know the impact of giving Jim a pill that increases his true IQ by 10 points, then I need an IV estimator. That's because the OLS slope coefficient is biased towards zero, which is how it adjusts for the measurement error. But I am adjusting the IQ by a known amount, and thus want to have the true slope coefficient for true (not measured) IQ.

This may be too fine a point for the article, but it's one I often see misunderstood in papers. The ambiguity being what we are trying to estimate: a structural model, or a good predictor under the current DGP. Derex @ 23:55, 8 October 2005 (UTC)
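The distinction above can be checked with a small simulation. This is only a rough sketch under assumptions that are not in the comment: the true slope of 2, the noise levels, and in particular the use of a second, independently noisy IQ test (`retest`) as the instrument are all made up for illustration.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def simulate(n=20000, true_slope=2.0, seed=0):
    rng = random.Random(seed)
    true_iq = [rng.gauss(100, 15) for _ in range(n)]
    # Observed regressor: true IQ plus additive measurement error.
    measured = [t + rng.gauss(0, 15) for t in true_iq]
    # Hypothetical instrument: a second test with independent error.
    retest = [t + rng.gauss(0, 15) for t in true_iq]
    earnings = [true_slope * t + rng.gauss(0, 20) for t in true_iq]

    ols = cov(measured, earnings) / cov(measured, measured)
    iv = cov(retest, earnings) / cov(retest, measured)

    def mse(slope):
        # In-sample error when predicting earnings from *measured* IQ.
        intercept = mean(earnings) - slope * mean(measured)
        return mean([(e - intercept - slope * m) ** 2
                     for e, m in zip(earnings, measured)])

    return {"ols": ols, "iv": iv, "mse_ols": mse(ols), "mse_iv": mse(iv)}

result = simulate()
print(result)
```

With these made-up numbers the error variance equals the variance of true IQ, so the OLS slope should land near half the structural slope while the IV slope lands near the structural slope. The OLS line still gives the smaller error when predicting earnings from measured IQ, which is the prediction-versus-structure point.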


> The slope estimator thus obtained is unbiased.

I think the slope estimator is consistent, but biased. —Preceding unsigned comment added by 125.2.48.91 (talk • contribs)

I take it you're referring to the text about the simple IV? Both are right, depending on the experimental assumptions. If X is viewed as fixed in repeated samples, then it is unbiased. If X is viewed as random, then IV is consistent but likely biased. The latter is the realistic case, though the former is often presented in introductory texts. Probably should change it though, as "consistent" is always correct. Derex 02:54, 1 April 2006 (UTC)
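A short sketch of why both statements can hold, in notation assumed here rather than taken from the article: y = Xβ + u, a just-identified instrument matrix Z with E[Z'u] = 0, and Z'X/n converging to a nonsingular limit Q_ZX.

```latex
\hat\beta_{IV} = (Z'X)^{-1} Z'y = \beta + (Z'X)^{-1} Z'u

% X and Z fixed in repeated samples: the factor (Z'X)^{-1} Z' is a constant.
E[\hat\beta_{IV}] = \beta + (Z'X)^{-1} Z' \, E[u] = \beta

% X random and correlated with u: the expectation no longer passes through,
% so E[(Z'X)^{-1} Z'u] \neq 0 in general, but the probability limit does:
\operatorname{plim} \hat\beta_{IV}
  = \beta + \left( \operatorname{plim} \tfrac{1}{n} Z'X \right)^{-1}
            \operatorname{plim} \tfrac{1}{n} Z'u
  = \beta + Q_{ZX}^{-1} \cdot 0 = \beta
```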

2 stage?

What are the corresponding regression equations for each stage? This is very unclear.

  • stage 1: X_i = Z b + residual, which gives Xhat_i = Z bhat = Z(Z'Z)^(-1)Z'X_i. Collecting the fitted columns gives Xhat = (Xhat_i ...).
  • stage 2: y = Xhat*beta + residual.

Jackzhp (talk) 04:30, 25 March 2008 (UTC)
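The two stages listed above can be sketched for the simplest case of one endogenous regressor and one instrument, each with a constant. The data generating process below (true slope 2, the error correlation, the helper `simple_ols`) is invented purely for illustration and is not from the article.

```python
import random

def mean(v):
    return sum(v) / len(v)

def simple_ols(x, y):
    """Return (intercept, slope) from regressing y on x with a constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

rng = random.Random(0)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]             # instrument
v = [rng.gauss(0, 1) for _ in range(n)]             # first-stage error
u = [0.8 * vi + 0.6 * rng.gauss(0, 1) for vi in v]  # correlated with v, so x is endogenous
x = [zi + vi for zi, vi in zip(z, v)]
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]

# Stage 1: regress x on z and keep the fitted values.
a1, b1 = simple_ols(z, x)
x_hat = [a1 + b1 * zi for zi in z]

# Stage 2: regress y on the fitted values from stage 1.
_, beta_2sls = simple_ols(x_hat, y)

# For comparison: plain OLS, and the ratio form cov(z, y) / cov(z, x).
_, beta_ols = simple_ols(x, y)
_, reduced_form = simple_ols(z, y)
beta_ratio = reduced_form / b1

print(beta_ols, beta_2sls, beta_ratio)
```

In this just-identified case the second-stage slope coincides with the ratio form up to rounding, while plain OLS is pulled away from 2 by the correlation between x and u. Note that the residuals from the literal second-stage regression are not the right ones for standard errors; those should be computed with the original x.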

Hypothesis testing

The discussion under "hypothesis testing" is simply wrong. The first moment of the simple IV estimator doesn't exist; therefore, the estimator is neither biased nor unbiased. The normality result is asymptotic; in small samples the coefficients are not normally distributed, and the t-ratio does not follow a Student distribution. Since this section is superfluous, it would be wise to delete it.

A minor problem is that the second sentence is incoherent: "endogeneity" means that the error term and regressors are correlated, so the piece should not claim that endogeneity is one reason the error term and the covariates may be correlated!

68.146.25.175 (talk) 22:28, 20 April 2008 (UTC)
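The small-sample behaviour described above can be illustrated, though not proved, by Monte Carlo. The sketch below assumes a single fairly weak instrument and made-up parameter values (`pi`, `rho`, the sample sizes); it only shows that the simple IV estimator throws occasional extreme values at n = 20 and settles down at n = 2000.

```python
import random

def iv_slope(n, rng, beta=1.0, pi=0.3, rho=0.8):
    """Draw one sample of size n and return the simple IV slope estimate."""
    z = [rng.gauss(0, 1) for _ in range(n)]
    v = [rng.gauss(0, 1) for _ in range(n)]
    # Structural error correlated with the first-stage error (endogeneity).
    u = [rho * vi + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1) for vi in v]
    x = [pi * zi + vi for zi, vi in zip(z, v)]
    y = [beta * xi + ui for xi, ui in zip(x, u)]
    mz = sum(z) / n
    szy = sum((zi - mz) * yi for zi, yi in zip(z, y))
    szx = sum((zi - mz) * xi for zi, xi in zip(z, x))
    return szy / szx  # the denominator can fall close to zero in small samples

def abs_errors(n, reps, seed, beta=1.0):
    """Sorted absolute estimation errors over `reps` simulated samples."""
    rng = random.Random(seed)
    return sorted(abs(iv_slope(n, rng, beta) - beta) for _ in range(reps))

small = abs_errors(n=20, reps=2000, seed=1)
large = abs_errors(n=2000, reps=200, seed=2)

median_small = small[len(small) // 2]
median_large = large[len(large) // 2]
print("n=20:   median |error| =", median_small, " max |error| =", small[-1])
print("n=2000: median |error| =", median_large, " max |error| =", large[-1])
```

The very large maximum error at n = 20 comes from draws where the sample covariance between instrument and regressor is near zero. That is the mechanism behind the heavy tails, and it is why small-sample t-ratios should not be treated as Student distributed.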