Top-coded

In econometrics and statistics, a top-coded dataset is one in which values above a chosen upper bound are not reported exactly: they are recorded only as being at or above that bound, so their true values are unknown. Top-coding is often done to preserve the anonymity of survey participants. For example, if a survey recorded a person with wealth of $51 billion, that record would not be anonymous, because readers could infer that the person is Bill Gates.

Example: Top-coding of wealth

id  age  income  note
1   26   24778   exact value
2   32   26750   exact value
3   45   26780   exact value
4   32   30000+  top coded
5   45   30000+  top coded
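
The coding rule behind the table above can be sketched in a few lines of Python. This is only an illustrative sketch, not a method taken from any particular survey: the function names `top_code` and `format_value` are hypothetical, the threshold of 30000 is assumed to be inclusive (as the "30000+" notation suggests), and the "true" incomes for the last two rows are invented for the example, since the table by construction does not reveal them.

```python
from typing import List, Tuple


def top_code(values: List[int], cap: int) -> List[Tuple[int, bool]]:
    """Return (reported_value, is_top_coded) pairs.

    Assumption: any value at or above `cap` is reported as `cap`
    and flagged as top-coded; smaller values are reported exactly.
    """
    return [(min(v, cap), v >= cap) for v in values]


def format_value(reported: int, flagged: bool) -> str:
    """Render a reported value the way the table does, e.g. '30000+'."""
    return f"{reported}+" if flagged else str(reported)


# The first three incomes come from the table; the last two are
# hypothetical true values, which the published data would hide.
true_incomes = [24778, 26750, 26780, 41000, 250000]

coded = top_code(true_incomes, cap=30000)
labels = [format_value(reported, flagged) for reported, flagged in coded]
print(labels)

# Treating the cap as if it were the true value understates the mean.
true_mean = sum(true_incomes) / len(true_incomes)
reported_mean = sum(reported for reported, _ in coded) / len(coded)
print(true_mean, reported_mean)
```

Under these assumptions the two top-coded rows become indistinguishable from each other, and the mean computed from the reported values falls below the mean of the underlying values. This is one reason top-coded data need care in distributional analysis.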

Jacob S. Hacker and Paul Pierson argue that the practice of top-coding, or capping the reported maximum value on tax returns ostensibly to protect the earner's anonymity, complicates the analysis of the distribution of wealth in the United States.[1]

Implications for ordinary least squares

See also

References

  1. Hacker, Jacob S. and Paul Pierson (2010). Winner-Take-All Politics: How Washington Made the Rich Richer--And Turned Its Back on the Middle Class. Simon & Schuster. p. 13. ISBN 978-1-4165-8869-6.