Binomial test

From Wikipedia, the free encyclopedia

In statistics, the binomial test is an exact test of the statistical significance of deviations from a theoretically expected distribution of observations into two categories.

For example, suppose we have a board game that depends on the roll of a dice, and special importance attaches to rolling a 6. In a particular game, the die is rolled 235 times, and 6 comes up 51 times. If the die is fair, we would expect 6 to come up 235/6 = 39.17 times. Is the proportion of 6s significantly higher than would be expected by chance, on the null hypothesis of a fair die?

To find an answer to this question using the binomial test, we consult the binomial distribution B(235,1/6) to find out what the probability is of finding exactly 51 6s in a sample of 235 if the true probability of a 6 on each trial is 1/6. We then find the probability of finding exactly 52, exactly 53, and so on up to 235, and add all these probabilities together. In this way, we obtain the probability of obtaining the observed result (51 6s) or a more extreme result (>51 6s) and in this example, the result is 0.02654425, which means we cannot reject the null hypothesis (two-tailed test) if we are working at the 5% significance level. The result calculated, 0.02654425 is significant for a one-tailed test of the observed number of 6s.

For large samples such as this example, the binomial distribution is well approximated by convenient continuous distributions, and these are used as the basis for alternative tests that are much quicker to compute, Pearson's chi-square test and the G-test. However, for small samples these approximations break down, and there is no alternative to the binomial test.

The most common use of the binomial test is in the case where the null hypothesis is that two categories are equally likely to occur. Tables are widely available to give the significance observed numbers of observations in the categories for this case. However, as the example above shows, the binomial test is not restricted to this case.

Where there are more than two categories, and an exact test is required, the multinomial test, based on the multinomial distribution, must be used instead of the binomial test.

[edit] References