p-rep

From Wikipedia, the free encyclopedia

P-rep, or $p_{rep}$, is a statistical alternative to the classic p-value. Whereas a p-value gives the probability of obtaining a result at least as extreme as the one observed if the null hypothesis is true, p-rep estimates the probability of replicating an effect. The Association for Psychological Science now recommends that articles submitted to Psychological Science and its other journals report p-rep rather than the classic p-value.^[1]

1 Calculation
2 Criticism
3 External links
4 References

Calculation

The value of p-rep ($p_{rep}$) can be approximated from the p-value ($p$) using the following equation:

$p_{rep} = \left[ 1 + \left( \frac{p}{1-p} \right)^{\frac{2}{3}} \right]^{-1}$
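As an illustration, the approximation can be evaluated directly. The following is a minimal Python sketch; the function name `p_rep_from_p` is chosen here for illustration and does not come from any library. For p = .05 it gives a p-rep of roughly .877.

```python
def p_rep_from_p(p: float) -> float:
    """Approximate p-rep from a p-value using the equation above.

    The name and the input check are illustrative assumptions,
    not part of any published implementation.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    odds = p / (1.0 - p)                       # p / (1 - p)
    return 1.0 / (1.0 + odds ** (2.0 / 3.0))   # [1 + odds^(2/3)]^(-1)


if __name__ == "__main__":
    # Print p-rep for a few conventional significance thresholds.
    for p in (0.05, 0.01, 0.001):
        print(f"p = {p:<5} -> p_rep ~ {p_rep_from_p(p):.3f}")
```

Smaller p-values map to larger values of p-rep, and p = .5 maps to a p-rep of exactly .5.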

Criticism

Because p-rep has a one-to-one correspondence with the p-value, the new measure adds no information about the significance of the result of a given experiment. Killeen acknowledges this point but argues that the main advantage of p-rep is that it better captures the way experimenters naively think about and conceptualize p-values and null hypothesis statistical testing. Since one can never accept either the null or the alternative hypothesis, an estimate of the probability that one's results are replicable is more attractive to them.
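The one-to-one correspondence can be made concrete by inverting the approximation given in the Calculation section: solving it for p gives $p = r / (1 + r)$ with $r = \left( \frac{1 - p_{rep}}{p_{rep}} \right)^{\frac{3}{2}}$. The sketch below assumes only that approximation; both function names are illustrative.

```python
def p_rep_from_p(p: float) -> float:
    """Approximate p-rep from a p-value (equation in the Calculation section)."""
    odds = p / (1.0 - p)
    return 1.0 / (1.0 + odds ** (2.0 / 3.0))


def p_from_p_rep(p_rep: float) -> float:
    """Recover the p-value from p-rep by algebraically inverting the approximation."""
    # From 1/p_rep - 1 = (p / (1 - p))^(2/3), raise both sides to the power 3/2.
    ratio = ((1.0 - p_rep) / p_rep) ** 1.5     # equals p / (1 - p)
    return ratio / (1.0 + ratio)


if __name__ == "__main__":
    # Round trip: each p-value is recovered from its p-rep, up to rounding error.
    for p in (0.001, 0.01, 0.05, 0.2):
        recovered = p_from_p_rep(p_rep_from_p(p))
        print(f"p = {p} -> p_rep = {p_rep_from_p(p):.4f} -> p = {recovered:.6f}")
```

Since each value can be computed from the other, reporting p-rep conveys the same evidence as reporting p, expressed on a different scale.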

Among the criticisms of p-rep is that it does not take prior probabilities into account (Macdonald, R. R. Psychological Science, 2005, 16, 1006–1008).^[2] For example, if an experiment on some unlikely paranormal phenomenon produced a p-rep of .75, most people would not believe that the probability of a replication is .75. Instead, they would conclude that it is much closer to .50. Extraordinary claims require extraordinary evidence, and p-rep ignores this. This consideration undermines the argument that p-rep is easier to understand than a classical p-value. Because p-rep is valid only under particular assumptions about prior probabilities, its interpretation is complex. The classical p-value merely states the probability of an outcome (or a more extreme outcome) given a null hypothesis, and is therefore valid without regard to prior probabilities. Killeen argues that new results should be evaluated in their own right, without the burden of history, with flat priors: that is what p-rep yields. A more pragmatic estimate of replicability would include prior knowledge, which the logic of p-rep permits but which null hypothesis testing does not.

Critics have also pointed out mathematical errors in Killeen's original paper. For example, the formula relating the effect sizes from two replications of a given experiment erroneously uses one of these random variables as a parameter of the probability distribution of the other, although Killeen had earlier assumed the two variables to be independent.^[3] These criticisms were addressed in his rejoinder (Killeen, P. R., Psychological Science, 2005, 16, 1009–1012).^[4]