User talk:Opabinia regalis/Article statistics

Uncertainties

Your random article survey table is very interesting. I would just add an estimated uncertainty for the fraction for each type of article. As a rough estimate, especially for the less common topics, you can say that the standard deviation of a count is the square root of its mean (Poisson counting statistics). That means that for a type that "should" produce 4 hits, the standard deviation is about 2. So it is not too unlikely to actually observe zero hits for such an article type, as going two standard deviations away from the mean in a given direction is certainly not impossible (it happens about 2.5% of the time). For example, chemistry certainly suffered this fate, as there are more than 20 000 chemistry articles by estimates based on categories, which is more than 1%. Cheers, Itub 11:53, 3 May 2007 (UTC).
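
As a rough illustration of the estimate above, here is a minimal Python sketch, assuming the hit counts follow Poisson statistics. The function names and the sample size of 400 are hypothetical and chosen only for the example; they are not taken from the survey table. It also computes the exact Poisson probability of seeing zero hits, which for an expected count of 4 comes out near 1.8%, in the same range as the two-standard-deviation figure quoted above.

```python
import math


def count_std_dev(expected_hits):
    """Poisson rule of thumb: the standard deviation is sqrt(mean)."""
    return math.sqrt(expected_hits)


def prob_zero_hits(expected_hits):
    """Exact Poisson probability of observing no hits at all: exp(-mean)."""
    return math.exp(-expected_hits)


def fraction_with_error(hits, sample_size):
    """Observed fraction and its approximate one-sigma uncertainty.

    Uses sqrt(hits) as the uncertainty on the count, so the estimate is
    crude for very small counts and undefined in spirit for zero hits.
    """
    fraction = hits / sample_size
    error = math.sqrt(hits) / sample_size
    return fraction, error


if __name__ == "__main__":
    expected = 4  # a type that "should" produce 4 hits
    print("std dev:", count_std_dev(expected))
    print("P(zero hits):", round(prob_zero_hits(expected), 3))

    # Hypothetical sample of 400 random articles (assumed, not from the table).
    frac, err = fraction_with_error(4, 400)
    print(f"fraction: {frac:.3f} +/- {err:.3f}")
```

With these assumed numbers, 4 hits out of 400 would be reported as roughly 1.0% plus or minus 0.5%, which makes clear how loosely the rarer topics are pinned down.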