False positive paradox
The false positive paradox is a statistical result in which false positive test results are more probable than true positive test results. It occurs when the overall population has a low incidence of a condition, specifically when the incidence is lower than the test's false positive rate. The probability of a positive test result is determined not only by the accuracy of the test but also by the characteristics of the sampled population.[1] When the incidence (the proportion of those who have a given condition) is lower than the test's false positive rate, even a test with a very low chance of giving a false positive in any individual case will give more false than true positives overall.[2] So, in a population with proportionately fewer infected people than the test's false positive rate, more people will incorrectly test positive without having the disease than will correctly test positive and have it. The result has surprised many.[3]
The paradox is especially counter-intuitive when interpreting a positive result from a test on a low-incidence population after having dealt with positive results drawn from a high-incidence population.[2] If the test's false positive rate is higher than the proportion of the new population with the condition, a test administrator whose experience comes from testing a high-incidence population may conclude that a positive result usually indicates a positive subject, when in fact a false positive is far more likely.
Failing to adjust to the scarcity of the condition in the new population, and concluding that a positive test result probably indicates a positive subject even though the population incidence is below the false positive rate, is an instance of the base rate fallacy.
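The effect follows directly from Bayes' theorem. The following is a minimal formulation, with symbols introduced here for illustration (p for incidence, s for the test's sensitivity, f for its false positive rate) rather than taken from the cited sources:

```latex
% Posterior probability that a positive result reflects a real case,
% for a test with sensitivity s, false positive rate f, and incidence p.
\[
  P(\text{condition} \mid \text{positive})
    = \frac{s\,p}{s\,p + f\,(1 - p)}
\]
% With no false negatives (s = 1), this falls below 1/2 exactly when
% p < f(1 - p), i.e. roughly whenever the incidence is below the
% false positive rate.
```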
Example
High-incidence population
Imagine running an HIV test on population A, of 1,000,000 persons, in which 200 out of every 10,000 people (2%) are infected. The test has a false positive rate of 0.0004 (0.04%) and no false negatives. The expected outcome of a million tests on population A would be:
- Unhealthy and test indicates disease (true positive)
- 1,000,000 × (200/10,000) = 20,000 people would receive a true positive
- Healthy and test indicates disease (false positive)
- 1,000,000 × (9,800/10,000) × 0.0004 = 392 people would receive a false positive
- The remaining 979,608 tests are correctly negative.
So, in population A, a person receiving a positive test could be over 98% confident (20,000/20,392) that it correctly indicates infection.
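As a sanity check, the arithmetic above can be reproduced in a few lines. This is a minimal sketch using the figures given in the text; the variable names are illustrative:

```python
# A minimal sketch reproducing the population A arithmetic above
# (figures taken from the text; variable names are illustrative).
population = 1_000_000
incidence = 200 / 10_000        # 2% of people are infected
false_positive_rate = 0.0004    # 0.04%; the test has no false negatives

true_positives = population * incidence                                # 20,000
false_positives = population * (1 - incidence) * false_positive_rate   # 392

ppv = true_positives / (true_positives + false_positives)
print(f"true positives:  {true_positives:,.0f}")
print(f"false positives: {false_positives:,.0f}")
print(f"chance a positive result is correct: {ppv:.2%}")               # ≈ 98.08%
```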
Low-incidence population
Now consider the same test applied to population B, in which only 1 person in 10,000 (0.01%) is infected. The expected outcome of a million tests on population B would be:
- Unhealthy and test indicates disease (true positive)
- 1,000,000 × (1/10,000) = 100 people would receive a true positive
- Healthy and test indicates disease (false positive)
- 1,000,000 × (9,999/10,000) × 0.0004 ≈ 400 people would receive a false positive
- The remaining 999,500 tests are correctly negative.
In population B, only 100 of the 500 people with a positive test result are actually infected. So, the probability of actually being infected after being told one is infected is only 20% (100/500), for a test that otherwise appears to be "over 99.95% accurate".
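The same sketch with population B's figures makes the drop in predictive value explicit (again, names are illustrative and the numbers come from the text):

```python
# The same sketch applied to population B (figures from the text;
# variable names are illustrative).
population = 1_000_000
incidence = 1 / 10_000          # 0.01% of people are infected
false_positive_rate = 0.0004    # the test itself is unchanged

true_positives = population * incidence                                # 100
false_positives = population * (1 - incidence) * false_positive_rate   # ≈ 400

ppv = true_positives / (true_positives + false_positives)
print(f"true positives:  {true_positives:,.0f}")
print(f"false positives: {false_positives:,.0f}")
print(f"chance a positive result is correct: {ppv:.2%}")               # ≈ 20%
```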
A tester with experience of group A might find it a paradox that in group B, a result that had almost always indicated infection is now usually a false positive. The confusion of the posterior probability of infection with the prior probability of receiving a false positive is a natural error after receiving a life-threatening test result.
Discussion
Cory Doctorow discusses this paradox in his book Little Brother.
If you ever decide to do something as stupid as build an automatic terrorism detector, here's a math lesson you need to learn first. It's called "the paradox of the false positive," and it's a doozy.

Say you have a new disease, called Super-AIDS. Only one in a million people gets Super-AIDS. You develop a test for Super-AIDS that's 99 percent accurate. I mean, 99 percent of the time, it gives the correct result -- true if the subject is infected, and false if the subject is healthy. You give the test to a million people.

One in a million people have Super-AIDS. One in a hundred people that you test will generate a "false positive" -- the test will say he has Super-AIDS even though he doesn't. That's what "99 percent accurate" means: one percent wrong.

What's one percent of one million? 1,000,000/100 = 10,000

One in a million people has Super-AIDS. If you test a million random people, you'll probably only find one case of real Super-AIDS. But your test won't identify one person as having Super-AIDS. It will identify 10,000 people as having it. Your 99 percent accurate test will perform with 99.99 percent inaccuracy.

That's the paradox of the false positive. When you try to find something really rare, your test's accuracy has to match the rarity of the thing you're looking for. If you're trying to point at a single pixel on your screen, a sharp pencil is a good pointer: the pencil-tip is a lot smaller (more accurate) than the pixels. But a pencil-tip is no good at pointing at a single atom in your screen. For that, you need a pointer -- a test -- that's one atom wide or less at the tip.

This is the paradox of the false positive, and here's how it applies to terrorism: Terrorists are really rare. In a city of twenty million like New York, there might be one or two terrorists. Maybe ten of them at the outside. 10/20,000,000 = 0.00005 percent. One twenty-thousandth of a percent. That's pretty rare all right.

Now, say you've got some software that can sift through all the bank-records, or toll-pass records, or public transit records, or phone-call records in the city and catch terrorists 99 percent of the time. In a pool of twenty million people, a 99 percent accurate test will identify two hundred thousand people as being terrorists. But only ten of them are terrorists. To catch ten bad guys, you have to haul in and investigate two hundred thousand innocent people.
Guess what? Terrorism tests aren't anywhere close to 99 percent accurate. More like 60 percent accurate. Even 40 percent accurate, sometimes.
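For readers who want to check the quoted arithmetic, here is a short sketch assuming, as the passage does, a "99 percent accurate" screen (a 1% false positive rate) applied to a city of twenty million with ten actual terrorists:

```python
# A quick check of the arithmetic in the quoted passage; the numbers
# are the passage's own, not independent estimates.
city_population = 20_000_000
actual_terrorists = 10
false_positive_rate = 0.01      # "99 percent accurate" means 1% wrong

innocent_flagged = (city_population - actual_terrorists) * false_positive_rate
print(f"innocent people flagged: {innocent_flagged:,.0f}")   # ≈ 200,000
print(f"actual terrorists:       {actual_terrorists}")
```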
See also
- Bayes' theorem
- List of paradoxes
- Prosecutor's fallacy, a mistake in reasoning that involves ignoring a low prior probability
- Simpson's paradox, another error in statistical reasoning dealing with comparing groups
References
- ↑ Rheinfurth, M. H.; Howell, L. W. (March 1998). Probability and Statistics in Aerospace Engineering (pdf). NASA. p. 16. "MESSAGE: False positive tests are more probable than true positive tests when the overall population has a low incidence of the disease. This is called the false-positive paradox."
- ↑ 2.0 2.1 Vacher, H. L. (May 2003). "Quantitative literacy - drug testing, cancer screening, and the identification of igneous rocks". Journal of Geoscience Education: 2. "At first glance, this seems perverse: the less the students as a whole use steroids, the more likely a student identified as a user will be a non-user. This has been called the False Positive Paradox" - Citing: Smith, W. (1993). The cartoon guide to statistics. New York: Harper Collins. p. 49.
- ↑ Madison, B. L. (August 2007). "Mathematical Proficiency for Citizenship". In Schoenfeld, A. H. Assessing Mathematical Proficiency. Mathematical Sciences Research Institute Publications (New ed.). Cambridge University Press. p. 122. ISBN 978-0-521-69766-8. "The correct [probability estimate...] is surprising to many; hence, the term paradox."