Berkson's paradox
From Wikipedia, the free encyclopedia
Berkson's paradox or Berkson's fallacy is a result in conditional probability and statistics which is counter-intuitive for some people, and so has been described as a paradox. It is a complicating factor arising in statistical tests of proportions. Specifically, it arises when there is an ascertainment bias inherent in a study design.
The result is that two independent events become conditionally dependent given that at least one of them occurs. Symbolically:
- if 0 < P(A) < 1 and 0 < P(B) < 1,
- and P(A|B) = P(A), i.e. they are independent,
- then P(A|B,C) < P(A|C) where C = A∪B (i.e. A or B).
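The inequality can be checked in a few lines. The following is a sketch of the argument using only the three conditions listed above; it relies on the fact that both A and B are subsets of C, so conditioning on B and C together is the same as conditioning on B alone.

```latex
\begin{align*}
P(A \mid B, C) &= P\bigl(A \mid B \cap (A \cup B)\bigr) = P(A \mid B) = P(A), \\
P(A \mid C) &= \frac{P\bigl(A \cap (A \cup B)\bigr)}{P(A \cup B)} = \frac{P(A)}{P(A \cup B)}, \\
P(A \cup B) &= 1 - \bigl(1 - P(A)\bigr)\bigl(1 - P(B)\bigr) < 1 .
\end{align*}
```

The last line uses independence together with P(A) < 1 and P(B) < 1. Because P(A) > 0, dividing P(A) by a quantity smaller than 1 gives P(A|C) > P(A) = P(A|B,C).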
It is often described in the fields of medical statistics or biostatistics, as in the original description of the problem by J. Berkson.
A classic illustration involves a retrospective study examining a risk factor for a disease in a statistical sample drawn from a hospital in-patient population. If a control group is also ascertained from the in-patient population, a difference in hospital admission rates for the case sample and control sample can result in a spurious association between the disease and the risk factor.
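The hospital scenario can be sketched with an exact calculation. The model below is an assumption made purely for illustration, not Berkson's original data: the disease, the risk factor (called `exposure` here) and one other condition are independent in the general population, and each condition a person has can independently trigger admission. All prevalences and admission probabilities are hypothetical.

```python
from itertools import product

# Hypothetical values chosen for illustration only (not from Berkson's paper).
PREVALENCE = {"disease": 0.10, "exposure": 0.20, "other": 0.30}
ADMIT_PROB = {"disease": 0.50, "exposure": 0.20, "other": 0.10}
CONDITIONS = ("disease", "exposure", "other")


def joint_table(admitted_only):
    """Return the (possibly unnormalised) joint weight of (disease, exposure).

    With admitted_only=True the weights are P(disease, exposure, admitted);
    the odds ratio does not depend on the normalising constant.
    """
    table = {(d, e): 0.0 for d in (0, 1) for e in (0, 1)}
    for status in product((0, 1), repeat=len(CONDITIONS)):
        has = dict(zip(CONDITIONS, status))
        # The three conditions are independent in the general population.
        p = 1.0
        for name in CONDITIONS:
            p *= PREVALENCE[name] if has[name] else 1 - PREVALENCE[name]
        if admitted_only:
            # Admitted if at least one condition that is present triggers admission.
            p_not_admitted = 1.0
            for name in CONDITIONS:
                if has[name]:
                    p_not_admitted *= 1 - ADMIT_PROB[name]
            p *= 1 - p_not_admitted
        table[(has["disease"], has["exposure"])] += p
    return table


def odds_ratio(table):
    """Odds ratio of exposure between people with and without the disease."""
    return (table[(1, 1)] * table[(0, 0)]) / (table[(1, 0)] * table[(0, 1)])


population_or = odds_ratio(joint_table(admitted_only=False))
hospital_or = odds_ratio(joint_table(admitted_only=True))
print(f"odds ratio, whole population: {population_or:.3f}")
print(f"odds ratio, in-patients only: {hospital_or:.3f}")
```

Under these assumed numbers the odds ratio equals 1 in the whole population and falls below 1 among in-patients, even though the disease and the risk factor were constructed to be independent.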
As another example, suppose I have 1000 postage stamps, of which 300 are pretty and 100 are rare, with 30 being both pretty and rare. 10% of all the stamps are rare and 10% of the pretty stamps are rare, so prettiness tells me nothing about rarity.
I put the 370 stamps which are pretty or rare on display. Just over 27% of the stamps on display are rare (100 of 370), but still only 10% of the pretty stamps on display are rare (30 of 300), whereas all 70 of the displayed stamps that are not pretty are rare. If I only consider stamps on display, I will observe a spurious negative relationship between prettiness and rarity as a result of my selection bias.
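The stamp arithmetic can be reproduced with exact fractions. This is a small sketch using the counts given in the example; the variable names are introduced here for illustration.

```python
from fractions import Fraction

# Counts taken from the stamp example.
total, pretty, rare, both = 1000, 300, 100, 30

# Stamps on display are those that are pretty or rare (inclusion-exclusion).
displayed = pretty + rare - both
plain_displayed = displayed - pretty  # displayed stamps that are not pretty

# Whole collection: prettiness carries no information about rarity.
p_rare = Fraction(rare, total)
p_rare_given_pretty = Fraction(both, pretty)

# Display only: rarity now depends strongly on prettiness.
p_rare_given_displayed = Fraction(rare, displayed)
p_rare_given_pretty_displayed = Fraction(both, pretty)
p_rare_given_plain_displayed = Fraction(rare - both, plain_displayed)

print("stamps on display:", displayed)
print("P(rare):", float(p_rare))
print("P(rare | pretty):", float(p_rare_given_pretty))
print("P(rare | displayed):", round(float(p_rare_given_displayed), 4))
print("P(rare | pretty, displayed):", float(p_rare_given_pretty_displayed))
print("P(rare | not pretty, displayed):", float(p_rare_given_plain_displayed))
```

Within the display, a pretty stamp is rare with probability 1/10 while a stamp that is not pretty is rare with certainty, which is the spurious negative relationship described above.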
References
- Berkson, J. (1946) "Limitations of the application of fourfold tables to hospital data". Biometrics Bulletin, 2(3), 47-53.
Note on References
The reference Berkson (1946) cited above is frequently cited incorrectly in the literature as Berkson, J. (1949) Biological Bulletin 2, 47-53.
Biological Bulletin, established in the 19th century, does not publish statistical papers. The correct reference is to the biostatistical journal Biometrics Bulletin, which was established in 1945 and became Biometrics in 1947.