Sign test

From Wikipedia, the free encyclopedia

In statistics, the sign test can be used to test the hypothesis that the median of the differences between two random variables X and Y with continuous distributions is zero, in the situation when we can draw paired samples from X and Y. It is a non-parametric test that makes very few assumptions about the nature of the distributions under test. This gives it very general applicability, but it may lack the statistical power of other tests such as the paired-samples t-test or the Wilcoxon signed-rank test.

Method

Let p = Pr(Y > X), and test the null hypothesis H0: p = 0.50. In other words, the null hypothesis states that, given a random pair of measurements (xi, yi), each of xi and yi is equally likely to be the larger of the two.

To test the null hypothesis, independent pairs of sample data {(x1, y1), (x2, y2), ..., (xn, yn)} are collected from the populations. Pairs for which there is no difference are omitted, which may leave a reduced sample of m ≤ n pairs.[1]

Then let W be the number of pairs for which yi − xi > 0. Assuming that H0 is true, W follows a binomial distribution, W ~ b(m, 0.5). The "W" is for Frank Wilcoxon, who developed the test and later the more powerful Wilcoxon signed-rank test.[2]
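The counting step can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name sign_test_counts and the sample data are made up for the example, and each pair is assumed to be given as (x, y).

```python
def sign_test_counts(pairs):
    """Return (w, m) for a list of (x, y) pairs.

    m is the number of pairs left after dropping ties (y == x);
    w is the number of those pairs with y - x > 0.
    """
    diffs = [y - x for x, y in pairs if y != x]  # omit pairs with no difference
    m = len(diffs)
    w = sum(1 for d in diffs if d > 0)
    return w, m


# Hypothetical data: seven pairs, two of which are ties.
pairs = [(5, 7), (3, 3), (8, 6), (4, 9), (6, 10), (7, 7), (2, 5)]
w, m = sign_test_counts(pairs)
print(w, m)  # prints: 4 5
```

Here n = 7 pairs reduce to m = 5 after the two ties are dropped, and y exceeds x in w = 4 of them.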

Assumptions

Let Zi = Yi − Xi for i = 1, ..., n.

  1. The differences Zi are assumed to be independent.
  2. Each Zi comes from the same continuous population.
  3. The values of Xi and Yi are ordered (measured on at least an ordinal scale), so the comparisons "greater than", "less than", and "equal to" are meaningful.

Significance testing

Since the test statistic follows a binomial distribution under H0, the standard binomial test is used to calculate significance. The normal approximation to the binomial distribution can be used for large sample sizes, m > 25.[1]
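As a rough illustration of the normal approximation, the sketch below compares an exact binomial tail with its normal counterpart. Under H0, W has mean m/2 and standard deviation √m/2. The function names, the values m = 30 and w = 10, and the use of a continuity correction (adding 0.5 to w) are assumptions made for this example, not part of the cited method.

```python
import math


def exact_left_tail(w, m):
    """Exact Pr(W <= w) for W ~ b(m, 0.5)."""
    return sum(math.comb(m, k) for k in range(w + 1)) / 2 ** m


def normal_left_tail(w, m):
    """Normal approximation to Pr(W <= w), with a continuity correction."""
    mean = m / 2           # E[W] under H0
    sd = math.sqrt(m) / 2  # sqrt(m * 0.5 * 0.5)
    z = (w + 0.5 - mean) / sd
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))  # standard normal CDF at z


m, w = 30, 10  # illustrative values only
print(exact_left_tail(w, m), normal_left_tail(w, m))
```

For these values the two results differ by less than 0.01, which is the kind of agreement that makes the approximation usable for larger m.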

The left-tail value is computed by Pr(W ≤ w), which is the p-value for the alternative H1: p < 0.50. This alternative means that the X measurements tend to be higher.

The right-tail value is computed by Pr(W ≥ w), which is the p-value for the alternative H1: p > 0.50. This alternative means that the Y measurements tend to be higher.

For a two-sided alternative H1: p ≠ 0.50, the p-value is twice the smaller tail value.
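The three p-values can be computed exactly from the binomial distribution b(m, 0.5). The following is a minimal sketch: the function name sign_test_p_values and the example values w = 4, m = 5 are hypothetical, and capping the two-sided value at 1 is an added convention for the case where doubling the smaller tail would exceed 1.

```python
import math


def sign_test_p_values(w, m):
    """Return (left, right, two_sided) p-values for W ~ b(m, 0.5)."""
    pmf = [math.comb(m, k) / 2 ** m for k in range(m + 1)]
    left = sum(pmf[: w + 1])  # Pr(W <= w), for the alternative p < 0.50
    right = sum(pmf[w:])      # Pr(W >= w), for the alternative p > 0.50
    two_sided = min(1.0, 2 * min(left, right))  # capped at 1 (assumption)
    return left, right, two_sided


left, right, two_sided = sign_test_p_values(4, 5)
print(left, right, two_sided)  # prints: 0.96875 0.1875 0.375
```

With w = 4 positive differences out of m = 5, none of the three p-values is small, so this hypothetical sample gives no grounds to reject H0.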

References

  1. Mendenhall, W.; Wackerly, D. D. and Scheaffer, R. L. (1989), "15: Nonparametric statistics", Mathematical statistics with applications (Fourth ed.), PWS-Kent, pp. 674–679, ISBN 0-534-92026-8
  2. Karas, J. & Savage, I.R. (1967) Publications of Frank Wilcoxon (1892–1965). Biometrics 23(1): 1–10
  • Gibbons, J.D. and Chakraborti, S. (1992). Nonparametric Statistical Inference. Marcel Dekker Inc., New York.
  • Kitchens, L.J. (2003). Basic Statistics and Data Analysis. Duxbury.
  • Conover, W. J. (1980). Practical Nonparametric Statistics, 2nd ed. Wiley, New York.
  • Lehmann, E. L. (1975). Nonparametrics: Statistical Methods Based on Ranks. Holden and Day, San Francisco.
This article is issued from Wikipedia. The text is available under the Creative Commons Attribution/Share Alike; additional terms may apply for the media files.