Freedman's paradox

In statistical analysis, Freedman's paradox,[1][2] named after David Freedman, describes a problem in model selection whereby predictor variables with no explanatory power can appear spuriously important. Freedman demonstrated, through simulation and asymptotic calculation, that this is a common occurrence when the number of variables is similar to the number of data points. Information-theoretic estimators have since been developed in an attempt to reduce this problem,[3] along with the accompanying issue of model selection bias,[4] whereby the estimated effects of predictor variables that have only a weak relationship with the response variable are biased.
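The effect can be reproduced with a small simulation. The following is a minimal sketch, not Freedman's own code: the function names `ols_t_stats` and `freedman_screening` are hypothetical, the sizes `n=100` and `p=50` are chosen for illustration, and the thresholds 1.15 and 1.96 are normal approximations to two-sided 25% and 5% critical values. The response is generated independently of every predictor, so any "significant" coefficient is spurious.

```python
import numpy as np


def ols_t_stats(X, y):
    """Fit OLS with an intercept; return slope t-statistics and R^2."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])          # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares fit
    resid = y - A @ beta
    dof = n - A.shape[1]                          # residual degrees of freedom
    sigma2 = resid @ resid / dof                  # residual variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)         # coefficient covariance
    t = beta / np.sqrt(np.diag(cov))
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return t[1:], r2                              # drop the intercept's t


def freedman_screening(n=100, p=50, screen_t=1.15, final_t=1.96, seed=0):
    """Screen pure-noise predictors, refit, and count apparent effects.

    screen_t and final_t are assumed cut-offs on |t| (normal
    approximations to the 25% and 5% two-sided levels).
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    y = rng.standard_normal(n)                    # independent of X by design

    # Pass 1: full regression on all p noise predictors.
    t_full, r2_full = ols_t_stats(X, y)
    keep = np.abs(t_full) > screen_t              # loose screening step

    # Pass 2: refit using only the predictors that survived screening.
    t_sel, r2_sel = ols_t_stats(X[:, keep], y)

    return {
        "r2_full": r2_full,
        "n_screened": int(keep.sum()),
        "r2_screened": r2_sel,
        "n_significant": int(np.sum(np.abs(t_sel) > final_t)),
    }


if __name__ == "__main__":
    for seed in range(5):
        print(seed, freedman_screening(seed=seed))
```

With all predictors being noise, the expected R² of the full regression is p/(n − 1), roughly 0.5 for these sizes, and the refitted model typically reports a larger share of "significant" coefficients than the nominal 5% level would suggest, because the screening step has already selected the predictors that happened to correlate with the response.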

References

  1. Freedman, D. A. (1983) "A note on screening regression equations." The American Statistician, 37, 152–155.
  2. Freedman, L. S. & Pee, D. (1989) "Return to a note on screening regression equations." The American Statistician, 43(4), 279–282. doi:10.2307/2685389
  3. Lukacs, P. M., Burnham, K. P. & Anderson, D. R. (2010) "Model selection bias and Freedman's paradox." Annals of the Institute of Statistical Mathematics, 62(1), 117–125. doi:10.1007/s10463-009-0234-4
  4. Burnham, K. P. & Anderson, D. R. (2002) Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, 2nd ed. Springer-Verlag.