Count data

In statistics, count data is data in which the dependent variable can take only the non-negative integer values {0, 1, 2, 3, ...}, and where these integers arise from counting rather than ranking.

Statistical methods such as least squares and analysis of variance are designed to deal with continuous dependent variables. These can be adapted to deal with count data by using data transformations such as the square root transformation, but such methods have several drawbacks: they are approximate at best, and they estimate parameters that are often hard to interpret.
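The usual motivation for the square root transformation is variance stabilization: the variance of a Poisson count equals its mean, whereas the variance of its square root is roughly 1/4 whatever the mean, provided the mean is not too small. The following is a minimal sketch of that effect, not a recommended analysis. It assumes Poisson-distributed counts; the helper names `sample_poisson` and `variance_before_and_after`, the sample size, and the seed are hypothetical choices made for illustration.

```python
import math
import random
import statistics


def sample_poisson(mean, rng):
    """Draw one Poisson variate using Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count = 0
    product = rng.random()
    # Multiply uniform draws until the running product falls below exp(-mean).
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def variance_before_and_after(mean, n=20000, seed=0):
    """Return (variance of raw counts, variance of square-rooted counts)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    counts = [sample_poisson(mean, rng) for _ in range(n)]
    roots = [math.sqrt(c) for c in counts]
    return statistics.variance(counts), statistics.variance(roots)


for mean in (10, 40):
    raw_var, root_var = variance_before_and_after(mean)
    print(f"mean={mean}: var(counts)={raw_var:.2f}, "
          f"var(sqrt(counts))={root_var:.3f}")
```

The raw variances track the means, while the transformed variances stay close to 0.25. The result also illustrates the drawback noted above: the stabilization is only approximate, and a model fitted on the square root scale describes the square root of the count rather than the count itself.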

The Poisson distribution now forms the basis for most analysis of count data. Poisson regression is often used, but it assumes that the variance of the counts equals their mean. Techniques such as negative binomial regression may be needed when this or other assumptions of the Poisson model are violated, in particular when overdispersion is present, that is, when the variance exceeds the mean.
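A quick, informal check for overdispersion is to compare the sample variance of the counts with their sample mean; under a Poisson model with no covariates the ratio should be close to 1. The sketch below uses only the Python standard library. The data values and the function name `dispersion_ratio` are invented for illustration, and the ratio is a rough diagnostic rather than a formal test.

```python
import statistics


def dispersion_ratio(counts):
    """Return sample variance divided by sample mean for a list of counts."""
    if any(c < 0 or c != int(c) for c in counts):
        raise ValueError("count data must be non-negative integers")
    mean = statistics.mean(counts)
    if mean == 0:
        raise ValueError("ratio is undefined when all counts are zero")
    return statistics.variance(counts) / mean


# Hypothetical counts, e.g. events observed in ten equal time intervals.
counts = [0, 1, 0, 2, 7, 0, 3, 0, 12, 1]
ratio = dispersion_ratio(counts)
print(f"mean={statistics.mean(counts)}, "
      f"variance={statistics.variance(counts)}, ratio={ratio:.1f}")
```

For these made-up values the mean is 2.6 and the sample variance is 15.6, giving a ratio of 6: the variance is far larger than the mean, which is the pattern that would point towards a negative binomial model or another alternative to plain Poisson regression.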

Count data is distinct from binary data, in which the dependent variable can take only two values, usually represented by 0 and 1.

Further reading

  • Cameron, A. C. and P. K. Trivedi (1998). Regression Analysis of Count Data. Cambridge University Press. ISBN 0-521-63201-3
  • Winkelmann, Rainer (2000). Econometric Analysis of Count Data. Springer. ISBN 3-540-40404-X