Wald-Wolfowitz runs test

From Wikipedia, the free encyclopedia

The runs test (also called Wald-Wolfowitz test) is a non-parametric test that checks a randomness hypothesis for a two-valued data sequence. More precisely, it can be used to test the hypothesis that the elements of the sequence are mutually independent.

A "run" of a sequence is a maximal non-empty segment of the sequence consisting of adjacent equal elements. For example, the sequence "++++−−−+++−−++++++−−−−" consists of six runs, three of which consist of +s and the others of −s. If +s and −s alternate randomly, the number of runs in a sequence of length $N$ containing $N_+$ occurrences of + and $N_-$ occurrences of − (so $N = N_+ + N_-$) is a random variable whose conditional distribution, given the observed $N_+$ and $N_-$, has:

  * mean: $\mu = \frac{2 N_+ N_-}{N} + 1$
  * variance: $\sigma^2 = \frac{2 N_+ N_- (2 N_+ N_- - N)}{N^2 (N - 1)} = \frac{(\mu - 1)(\mu - 2)}{N - 1}$
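As an illustration, the run count and the two moments for the example sequence can be computed with a minimal Python sketch. The function names `count_runs` and `runs_moments` are hypothetical, the ASCII hyphen stands in for the minus sign, and the closed-form mean and variance used are the standard ones for the number of runs given the counts of each symbol.

```python
from itertools import groupby


def count_runs(seq):
    """Count maximal blocks of adjacent equal elements."""
    # groupby yields one group per block of consecutive equal items.
    return sum(1 for _ in groupby(seq))


def runs_moments(n_plus, n_minus):
    """Mean and variance of the run count, given the symbol counts."""
    n = n_plus + n_minus
    mean = 2 * n_plus * n_minus / n + 1
    variance = (mean - 1) * (mean - 2) / (n - 1)
    return mean, variance


# The example sequence from the text, written with ASCII '-'.
seq = "++++---+++--++++++----"
runs = count_runs(seq)
n_plus = seq.count("+")
n_minus = seq.count("-")
mean, variance = runs_moments(n_plus, n_minus)

print(runs, n_plus, n_minus)  # 6 13 9
print(mean, variance)
```

Here the observed six runs fall well below the expected value of about 11.6, which hints at clustering rather than random alternation.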

These parameters (the mean and variance of the number of runs) do not require the process generating the sequence to be "fair" in the sense that +s and −s have equal probabilities; they rely only on the assumption that the elements are independent and identically distributed. If the number of runs is significantly higher or lower than expected, the hypothesis of statistical independence of the elements may be rejected.
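One common way to turn this into a decision rule is a two-sided z-test. The sketch below assumes that the number of runs is approximately normally distributed, which is a large-sample approximation and not something stated above; for short sequences the exact distribution should be used instead. The function name `runs_test` and its `plus` parameter are hypothetical.

```python
import math
from itertools import groupby


def runs_test(seq, plus="+"):
    """Two-sided runs test using a normal approximation (an assumption).

    Returns (number of runs, z statistic, approximate p-value).
    """
    n = len(seq)
    n_plus = sum(1 for x in seq if x == plus)
    n_minus = n - n_plus
    if n_plus == 0 or n_minus == 0:
        raise ValueError("sequence must contain both symbols")

    runs = sum(1 for _ in groupby(seq))
    mean = 2 * n_plus * n_minus / n + 1
    variance = (mean - 1) * (mean - 2) / (n - 1)
    if variance <= 0:
        raise ValueError("sequence too short for the approximation")

    z = (runs - mean) / math.sqrt(variance)
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return runs, z, p_value


# Too few runs (clustering) gives a negative z; too many gives a positive z.
clustered = runs_test("+" * 20 + "-" * 20)
alternating = runs_test("+-" * 20)
example = runs_test("++++---+++--++++++----")
print(clustered, alternating, example)
```

A negative z indicates fewer runs than expected and a positive z indicates more; either extreme is evidence against independence.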

Runs tests can be used to test:

  1. the randomness of a distribution, by taking the data in the given order and marking with + the data greater than the median and with − the rest;
  2. whether a function fits a data set well, by marking the data exceeding the function value with + and the other data with −. For this use, the runs test, which takes into account the signs but not the distances, is complementary to the chi-squared test, which takes into account the distances but not the signs.
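Both uses reduce to converting numeric data into a two-valued sequence before counting runs. A minimal Python sketch of the two marking rules follows; the helper names `signs_about_median` and `signs_about_fit`, the sample data, and the candidate line `4x − 2` are all made up for illustration, and the ASCII hyphen stands in for the minus sign.

```python
from statistics import median


def signs_about_median(data):
    """Mark values above the median with '+' and the rest with '-'."""
    m = median(data)
    return "".join("+" if x > m else "-" for x in data)


def signs_about_fit(xs, ys, f):
    """Mark points lying above the fitted curve f with '+', the rest with '-'."""
    return "".join("+" if y > f(x) else "-" for x, y in zip(xs, ys))


# Use 1: dichotomize a sample about its median, keeping the original order.
data = [3, 8, 1, 9, 4, 7]
median_signs = signs_about_median(data)
print(median_signs)  # -+-+-+

# Use 2: signs of residuals when a straight line is fitted to quadratic data.
xs = [0, 1, 2, 3, 4]
ys = [x * x for x in xs]
fit_signs = signs_about_fit(xs, ys, lambda x: 4 * x - 2)
print(fit_signs)  # +---+
```

In the second case the residual signs cluster into only three runs, the kind of systematic pattern a runs test is meant to detect even when the distances between the data and the curve are small.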

Where it can be applied, the Kolmogorov-Smirnov test is more powerful than the runs test.

See also