False positives and false negatives

In medical statistics, false positives and false negatives are concepts analogous to type I and type II errors in statistical hypothesis testing, where a positive result corresponds to rejecting the null hypothesis, and a negative result corresponds to not rejecting the null hypothesis. The terms are often used interchangeably, but there are differences in detail and interpretation.

False positive error

A false positive error, or false positive for short, commonly called a "false alarm", is a result that indicates a given condition has been fulfilled when it has not; that is, a positive effect has been erroneously assumed. In the case of "crying wolf", the condition tested for was "is there a wolf near the herd?", and the true answer was that there was no wolf near the herd. The shepherd wrongly indicated there was one by calling "Wolf, wolf!"

A false positive error is a type I error in which the test checks a single condition and yields an affirmative or negative decision, usually designated as "true" or "false".[1]

False negative error

A false negative error, or false negative for short, is a test result that indicates a condition is absent when it is actually present; that is, the absence of an effect has been erroneously assumed. A common example is a guilty prisoner freed from jail. The condition "Is the prisoner guilty?" is true (yes, the prisoner is guilty), but the test (a court of law) failed to recognize this and wrongly decided the prisoner was not guilty.

A false negative error is a type II error occurring in a test where a single condition is checked for and the result can be either positive or negative.[2]

False positive and false negative rates

The false positive rate is the proportion of all negatives that still yield positive test outcomes, i.e., the conditional probability of a positive test result given that the condition being looked for is not present.

The false positive rate is equal to the significance level. The specificity of the test is equal to 1 minus the false positive rate.

In statistical hypothesis testing, this fraction is given the Greek letter α, and 1−α is defined as the specificity of the test. Increasing the specificity of the test lowers the probability of type I errors, but raises the probability of type II errors (false negatives that reject the alternative hypothesis when it is true).[lower-alpha 1]
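The trade-off between the two error types can be illustrated with a minimal sketch. It assumes a one-sided z-test whose statistic is standard normal under the null hypothesis and shifted by a hypothetical standardized effect of 2.0 under the alternative; the function name and the effect size are illustrative choices, not taken from the sources cited here.

```python
from statistics import NormalDist


def type_ii_error(alpha: float, effect: float) -> float:
    """Type II error probability of a one-sided z-test (illustrative model).

    Assumption: the test statistic is N(0, 1) under the null hypothesis and
    N(effect, 1) under the alternative. 'effect' is a hypothetical shift.
    """
    std_normal = NormalDist()
    # Reject the null when the statistic exceeds this critical value.
    critical = std_normal.inv_cdf(1.0 - alpha)
    # Probability of NOT rejecting although the alternative is true.
    return std_normal.cdf(critical - effect)


# Lowering alpha (raising specificity) raises the type II error probability.
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, effect=2.0)
    print(f"alpha={alpha:.2f}  specificity={1 - alpha:.2f}  beta={beta:.3f}")
```

Under these assumptions, each reduction in alpha moves the critical value further out, so more true effects fall below it and go undetected.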

Complementarily, the false negative rate is the proportion of positives that yield negative test outcomes, i.e., the conditional probability of a negative test result given that the condition being looked for is present.

In statistical hypothesis testing, this fraction is given the letter β. The "power" (or the "sensitivity") of the test is equal to 1−β.
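As a rough illustration of the definitions in this section, the rates can be computed from the four cells of a confusion matrix. The counts below are made up for the example, and the function name is hypothetical.

```python
def error_rates(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute the rates defined above from confusion-matrix counts.

    tp/fn count cases where the condition is present (test positive/negative);
    fp/tn count cases where the condition is absent (test positive/negative).
    """
    negatives = fp + tn  # condition actually absent
    positives = tp + fn  # condition actually present
    false_positive_rate = fp / negatives
    false_negative_rate = fn / positives
    return {
        "false_positive_rate": false_positive_rate,
        "specificity": 1.0 - false_positive_rate,
        "false_negative_rate": false_negative_rate,
        "sensitivity": 1.0 - false_negative_rate,
    }


# Made-up counts: 100 cases with the condition, 1000 without it.
rates = error_rates(tp=90, fp=50, tn=950, fn=10)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

Note that both rates are conditioned on the true state (condition present or absent), not on the test outcome.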

Ambiguity in the definition of false positive rate

The terms false discovery rate (FDR) and false positive risk are used synonymously by Colquhoun (2014)[3] and Colquhoun (2017)[4]. Both refer to the probability that a result declared positive is in fact a false positive, which is quite different from the type I error rate. Confusion of these two ideas, the error of the transposed conditional, has caused much mischief.[5] Because of the ambiguity of notation in this field, it is essential to look at the definition in every paper.

The hazards of reliance on p-values were emphasized in [4] by pointing out that even an observation of p = 0.001 is not necessarily strong evidence against the null hypothesis. Although the likelihood ratio in favour of the alternative hypothesis over the null is close to 100, if the hypothesis was implausible, with a prior probability of a real effect of 0.1, even the observation of p = 0.001 would have a false positive risk of 8 percent. That does not even reach the 5 percent level. As a consequence, it has been recommended[4] that every p-value should be accompanied by the prior probability of there being a real effect that it would be necessary to assume in order to achieve a false positive risk of 5%. For example, if we observe p = 0.05 in a single experiment, we would have to be 87% certain that there was a real effect before the experiment was done to achieve a false positive risk of 5%.
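The 8 percent figure can be checked with Bayes' rule in odds form. The sketch below assumes the likelihood ratio is simply given as an input (here 100, as quoted above); it does not reproduce how the cited papers derive a likelihood ratio from a p-value, and the function names are hypothetical.

```python
def false_positive_risk(likelihood_ratio: float, prior: float) -> float:
    """Probability that the null is true given the observation.

    likelihood_ratio: evidence for the alternative over the null (assumed given).
    prior: prior probability that there is a real effect.
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = likelihood_ratio * prior_odds  # odds of a real effect
    return 1.0 / (1.0 + posterior_odds)


def prior_needed(likelihood_ratio: float, target_risk: float) -> float:
    """Prior probability of a real effect needed to reach a target risk."""
    required_posterior_odds = (1.0 - target_risk) / target_risk
    prior_odds = required_posterior_odds / likelihood_ratio
    return prior_odds / (1.0 + prior_odds)


# Likelihood ratio of 100 with a prior of 0.1, as in the text above.
print(f"false positive risk: {false_positive_risk(100, 0.1):.3f}")
# Prior that would be needed for a 5% false positive risk at the same ratio.
print(f"prior needed for 5%: {prior_needed(100, 0.05):.3f}")
```

With a likelihood ratio of 100 and a prior of 0.1, the posterior odds of a real effect are about 11 to 1, which leaves a false positive risk of roughly 0.08.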

Receiver operating characteristic

The article "Receiver operating characteristic" discusses parameters in statistical signal processing based on ratios of errors of various types.

Consequences

In many legal traditions there is a presumption of innocence, as stated in Blackstone's formulation that:

"It is better that ten guilty persons escape than that one innocent suffer",

that is, that false negatives (a guilty person is acquitted and escapes) are far preferable to false positives (an innocent person is convicted and suffers). This is not universal, however, and some systems prefer to jail many innocent people rather than let a single guilty person escape; the tradeoff varies between legal traditions.

Notes

  1. When developing detection algorithms or tests, a balance must be chosen between risks of false negatives and false positives. Usually there is a threshold of how close a match to a given sample must be achieved before the algorithm reports a match. The higher this threshold, the more false negatives and the fewer false positives.
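The effect of the threshold described in this note can be sketched with a toy detector. The scores and labels below are invented for illustration, and the rule "report a match when the score is at least the threshold" is an assumed convention.

```python
# Hypothetical (score, is_true_match) pairs; a higher score is a closer match.
samples = [
    (0.95, True), (0.80, True), (0.65, False), (0.60, True),
    (0.40, False), (0.30, True), (0.20, False), (0.10, False),
]


def count_errors(samples, threshold):
    """Return (false_positives, false_negatives) at a given threshold.

    Assumed rule: a match is reported when score >= threshold.
    """
    false_positives = sum(
        1 for score, is_match in samples if score >= threshold and not is_match
    )
    false_negatives = sum(
        1 for score, is_match in samples if score < threshold and is_match
    )
    return false_positives, false_negatives


# Raising the threshold trades false positives for false negatives.
for threshold in (0.25, 0.50, 0.90):
    fp, fn = count_errors(samples, threshold)
    print(f"threshold={threshold:.2f}  false positives={fp}  false negatives={fn}")
```

On this toy data, the lowest threshold produces only false positives and the highest only false negatives, with a mix in between.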

References

  1. "False Positive". WhatIs.com. Retrieved 26 August 2016.
  2. Banerjee, A; Chitnis, UB; Jadhav, SL; Bhawalkar, JS; Chaudhury, S (2009). "Hypothesis testing, type I and type II errors". Ind Psychiatry J. 18: 127–31. PMC 2996198. PMID 21180491. doi:10.4103/0972-6748.62274.
  3. Colquhoun, David (2014). "An investigation of the false discovery rate and the misinterpretation of p-values". Royal Society Open Science. 1: 140216. doi:10.1098/rsos.140216.
  4. Colquhoun, David. "The Reproducibility Of Research And The Misinterpretation Of P Values". bioRxiv. Retrieved 5 June 2017.
  5. Colquhoun, David. "The problem with p-values". Aeon. Aeon Magazine. Retrieved 11 December 2016.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.