User:Marskell/Using statistics

From Wikipedia, the free encyclopedia

The main qualifier on the use of reliable sources in general also applies to statistics: we may not use primary sources whose information has not been made available by a credible publication. This is especially true of statistics, which, when not subject to the scrutiny of independent publication, often betray confirmation bias.

Primacy of government statistics

The main exception to avoiding primary-source statistics is the use of government data, particularly demographic figures from a census bureau or similar organization. Wikipedians should actively seek out numbers from government sources, present them first in articles, and provide caveats afterward where necessary. In infoboxes, census bureau statistics should generally be the only source used; where figures are not available from a government source, editors often rely on the CIA World Factbook. Note that the primacy of government statistics rests not on an assumption of their fundamental accuracy but on the fact that they can be sourced to a degree that no other statistic can. Thus:

  • Wikipedia editors should document government statistics even if they consider the figures to be wrong. A hypothetical website from the North Korean government, for instance, might state "the prison population of the North is 10 000 persons." A first thought upon encountering the number may well be "this is a lie and should not be included," yet it remains the official statistic provided by a national government and should not be dropped because it is believed to be erroneous. A proper paragraph would read something like the following: "The North Korean government reports x...organizations such as Amnesty International are unconvinced and believe y...the United Nations suggests z."
  • In the infobox, editors should continue to rely on census data even if they are sure it has become out of date. The population of the United Arab Emirates, for example, regularly increases by 10% or more per year, so the infobox statistics are likely to be too low at any given moment. Noting this in the article is encouraged, but the infobox itself should not contain an estimate or extrapolation; it should be changed only when a new census has been undertaken or an official update has been published.

Instant news statistics

Statistics differ from general statements or assertions of notability in that news coverage of them does not necessarily establish their veracity. Links to news stories do confirm "X topic has received significant attention" but do not confirm "X statistic." Instant news (as opposed to feature coverage) repeats numbers as they are presented rather than scrutinizing them.

Wikipedians may place greater trust in secondary reporting that has both professional and protracted editorial oversight. Statistics reported in the American Journal of Medicine or Foreign Affairs, for instance, should be considered more reliable than similar statistics on CNN or the BBC, as their compilation and confirmation is often a process of months rather than hours.

Academic sources

Pan-Wiki consistency

Whenever a statistic is changed on Wikipedia, consideration should be given to whether the change renders the page inconsistent with subsidiary or related articles. The Casualties of World War II page, for example, has numerous (often debatable) statistics, any one of which may be repeated in a dozen other articles. Rather than changing a figure unilaterally, editors should seek consensus on Talk and make sure that the change is repeated in related articles.