False positive paradox

The false positive paradox is a statistical result in which false positive tests are more probable than true positive tests. It arises when the condition being tested for has a low incidence in the overall population, lower than the test's false positive rate. The probability of a positive test result is determined not only by the accuracy of the test but also by the characteristics of the sampled population.^[1] When the incidence, the proportion of those who have a given condition, is lower than the test's false positive rate, even a test with a very low chance of giving a false positive in an individual case will give more false than true positives overall.^[2] So, in a society with very few infected people (proportionately fewer than the test gives false positives), more people will test positive incorrectly and not have the disease than will test positive correctly and have it. The result surprises many people, hence the term paradox.^[3]

It is especially counter-intuitive when interpreting a positive result from a test on a low-incidence population after having dealt with positive results drawn from a high-incidence population.^[2] A test administrator whose experience comes from testing a high-incidence population may conclude that a positive test result usually indicates a positive subject. If the false positive rate of the test is higher than the proportion of the new population with the condition, a false positive is in fact far more likely to have occurred.

Failing to adjust to the scarcity of the condition in the new population, and concluding that a positive test result probably indicates a positive subject even though population incidence is below the false positive rate, is a "base rate fallacy".
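The relationship described above can be written out with Bayes' theorem. The symbols here are assumptions introduced for illustration, not notation from the cited sources: p is the incidence, α the false positive rate, s the test's true positive rate (sensitivity), D the event of having the condition, and + a positive result.

```latex
\[
P(D \mid +) = \frac{s\,p}{s\,p + \alpha\,(1 - p)}
\]
\[
\underbrace{\alpha\,(1 - p)}_{\text{false positives}} > \underbrace{s\,p}_{\text{true positives}}
\quad\Longleftrightarrow\quad
p < \frac{\alpha}{s + \alpha}
\]
```

For a test with no false negatives (s = 1) the threshold is α/(1 + α), so "incidence below the false positive rate" is a close approximation whenever α is small. With α = 0.05, for instance, the threshold is about 0.048.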

Example

High-incidence population

Number of people	Infected	Uninfected	Total
Test positive	400 (true positive)	30 (false positive)	430
Test negative	0 (false negative)	570 (true negative)	570
Total	400	600	1000

Imagine running an HIV test on population A of 1000 persons, in which 40% are infected. The test has a false positive rate of 5% (0.05) and a false negative rate of zero. The expected outcome of the 1000 tests on population A would be:

Infected and test indicates disease (true positive)

1000 × 40/100 = 400 people would receive a true positive

Uninfected and test indicates disease (false positive)

1000 × (100 – 40)/100 × 0.05 = 30 people would receive a false positive

The remaining 570 tests are correctly negative.

So, in population A, a person receiving a positive test could be over 93% confident (400/(30 + 400)) that it correctly indicates infection.

Low-incidence population

Number of people	Infected	Uninfected	Total
Test positive	20 (true positive)	49 (false positive)	69
Test negative	0 (false negative)	931 (true negative)	931
Total	20	980	1000

Now consider the same test applied to population B, in which only 2% is infected. The expected outcome of 1000 tests on population B would be:

Infected and test indicates disease (true positive)

1000 × 2/100 = 20 people would receive a true positive

Uninfected and test indicates disease (false positive)

1000 × (100 – 2)/100 × 0.05 = 49 people would receive a false positive

The remaining 931 tests are correctly negative.

In population B, only 20 of the 69 people with a positive test result are actually infected. So, the probability of actually being infected after being told that one is infected is only about 29% (20/(20 + 49)), for a test that otherwise appears to be "95% accurate".
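The arithmetic for both populations can be reproduced with a short Python sketch. The function names (expected_counts, positive_predictive_value) are hypothetical helpers chosen for this illustration; the inputs are the figures from the example (1000 people, a 5% false positive rate, no false negatives).

```python
def expected_counts(population, prevalence, false_positive_rate, sensitivity=1.0):
    """Return expected (true_pos, false_pos, false_neg, true_neg) counts."""
    infected = population * prevalence
    uninfected = population - infected
    true_pos = infected * sensitivity            # infected and flagged
    false_pos = uninfected * false_positive_rate  # uninfected but flagged
    false_neg = infected - true_pos
    true_neg = uninfected - false_pos
    return true_pos, false_pos, false_neg, true_neg


def positive_predictive_value(true_pos, false_pos):
    """Share of positive results that belong to infected people."""
    return true_pos / (true_pos + false_pos)


# Population A (40% infected) and population B (2% infected), same test.
for name, prevalence in [("A", 0.40), ("B", 0.02)]:
    tp, fp, fn, tn = expected_counts(1000, prevalence, 0.05)
    ppv = positive_predictive_value(tp, fp)
    print(f"Population {name}: TP={tp:.0f} FP={fp:.0f} TN={tn:.0f} PPV={ppv:.1%}")
```

The same test yields a positive predictive value of roughly 93% in population A and roughly 29% in population B; only the prevalence argument changes between the two calls.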

A tester with experience of group A might find it a paradox that in group B, a result that had usually correctly indicated infection is now usually a false positive. The confusion of the posterior probability of infection with the prior probability of receiving a false positive is a natural error after receiving a life-threatening test result.
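To see how the meaning of a positive result drifts as the condition becomes rarer, the following sketch evaluates the same test at several prevalences. The function name ppv and the list of prevalences are illustrative assumptions; the 5% false positive rate and the absence of false negatives are taken from the example.

```python
def ppv(prevalence, false_positive_rate=0.05, sensitivity=1.0):
    """Probability of infection given a positive result (Bayes' theorem)."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * false_positive_rate
    return true_pos / (true_pos + false_pos)


# Same test, progressively rarer condition.
for prevalence in (0.40, 0.20, 0.10, 0.05, 0.02, 0.01):
    print(f"prevalence {prevalence:>4.0%} -> P(infected | positive) = {ppv(prevalence):.1%}")
```

The probability falls steadily with prevalence and drops below one half once the incidence is slightly under the false positive rate, which is the point at which a tester's experience from group A stops being a reliable guide.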

Discussion

Cory Doctorow discusses this paradox in his book Little Brother.

If you ever decide to do something as stupid as build an automatic terrorism detector, here's a math lesson you need to learn first. It's called "the paradox of the false positive," and it's a doozy.

Number (rounded)	Has Super-AIDS	Does not have Super-AIDS	Total
Test positive	1 (true positive)	10,000 (false positive)	10,001
Test negative	0 (false negative)	989,999 (true negative)	989,999
Total	1	999,999	1,000,000

Say you have a new disease, called Super-AIDS. Only one in a million people gets Super-AIDS. You develop a test for Super-AIDS that's 99 percent accurate. I mean, 99 percent of the time, it gives the correct result -- true if the subject is infected, and false if the subject is healthy. You give the test to a million people.
One in a million people have Super-AIDS. One in a hundred people that you test will generate a "false positive" -- the test will say he has Super-AIDS even though he doesn't. That's what "99 percent accurate" means: one percent wrong. What's one percent of one million?
1,000,000/100 = 10,000
One in a million people has Super-AIDS. If you test a million random people, you'll probably only find one case of real Super-AIDS. But your test won't identify one person as having Super-AIDS. It will identify 10,000 people as having it. Your 99 percent accurate test will perform with 99.99 percent inaccuracy.
That's the paradox of the false positive. When you try to find something really rare, your test's accuracy has to match the rarity of the thing you're looking for. If you're trying to point at a single pixel on your screen, a sharp pencil is a good pointer: the pencil-tip is a lot smaller (more accurate) than the pixels. But a pencil-tip is no good at pointing at a single atom in your screen. For that, you need a pointer -- a test -- that's one atom wide or less at the tip.

Number (rounded)	Is a terrorist	Is not a terrorist	Total
Test positive	10 (true positive)	200,000 (false positive)	200,010
Test negative	0 (false negative)	19,799,990 (true negative)	19,799,990
Total	10	19,999,990	20,000,000

Here is an application to terrorism:
Terrorists are really rare. In a city of twenty million like New York, there might be one or two terrorists, maybe up to ten. 10/20,000,000 = 0.00005 percent, one twenty-thousandth of a percent. That's pretty rare. Now, say you have software that can sift through all the bank-records, or toll-pass records, or public transit records, or phone-call records in the city and catch terrorists 99 percent of the time. In a pool of twenty million people, a 99 percent accurate test will identify two hundred thousand people as being terrorists. But only ten of them are terrorists. To catch ten bad guys, you have to investigate two hundred thousand innocent people.
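The figures in the two quoted scenarios can be checked with a small Python sketch. The helper screening_counts is hypothetical, and it assumes that "99 percent accurate" applies to both error types, as the passage describes; the tables above instead round the expected true positives up to the full number of real cases.

```python
def screening_counts(population, cases, accuracy):
    """Expected (true_pos, false_pos), assuming one accuracy figure
    covers both missed cases and false alarms (an assumption)."""
    healthy = population - cases
    true_pos = cases * accuracy             # real cases that are flagged
    false_pos = healthy * (1.0 - accuracy)  # healthy people wrongly flagged
    return true_pos, false_pos


scenarios = [
    ("Super-AIDS", 1_000_000, 1),
    ("terrorism", 20_000_000, 10),
]
for label, population, cases in scenarios:
    tp, fp = screening_counts(population, cases, 0.99)
    share = tp / (tp + fp)
    print(f"{label}: about {fp:,.0f} false positives, "
          f"{share:.4%} of positive results are real")
```

In both scenarios the false positives round to the quoted figures (10,000 and 200,000), and well under a hundredth of a percent of flagged people are real cases.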

References

1. Rheinfurth, M. H.; Howell, L. W. (March 1998). Probability and Statistics in Aerospace Engineering (PDF). NASA. p. 16. "MESSAGE: False positive tests are more probable than true positive tests when the overall population has a low incidence of the disease. This is called the false-positive paradox."
2. Vacher, H. L. (May 2003). "Quantitative literacy - drug testing, cancer screening, and the identification of igneous rocks". Journal of Geoscience Education: 2. "At first glance, this seems perverse: the less the students as a whole use steroids, the more likely a student identified as a user will be a non-user. This has been called the False Positive Paradox." Citing: Smith, W. (1993). The cartoon guide to statistics. New York: Harper Collins. p. 49.
3. Madison, B. L. (August 2007). "Mathematical Proficiency for Citizenship". In Schoenfeld, A. H. Assessing Mathematical Proficiency. Mathematical Sciences Research Institute Publications (New ed.). Cambridge University Press. p. 122. ISBN 978-0-521-69766-8. "The correct [probability estimate...] is surprising to many; hence, the term paradox."

External links

The false positive paradox explained visually (video)