Accuracy paradox
From Wikipedia, the free encyclopedia
In predictive analytics, it is important to analyze the quality of a predictive model and to compare the quality of two or more alternative models. The accuracy paradox states that a predictive model with a given level of accuracy may have greater predictive power than a model with higher accuracy. Therefore, in predictive analytics, it may be best to avoid the accuracy metric in favor of other metrics such as precision and recall. This seemingly paradoxical fact is explained below and illustrated with a simple example.
Accuracy is often the starting point for analyzing the quality of a predictive model. Accuracy is also probably the first term that comes to mind when non-experts think about how to evaluate the quality of a prediction. As shown below, accuracy measures the ratio of correct predictions over the total number of cases evaluated.
What about the business relevance of accuracy? Surprisingly, this is a difficult question. It seems obvious that the ratio of correct predictions over all cases should be a key metric for determining the business impact of a predictive model. Yet the value of the accuracy metric is dubious. In fact, it is often trivially easy to create a predictive model with high accuracy, and such trivial models can be useless despite their high accuracy. Similarly, when comparing the business impact of two alternative predictive models, it may well be the less accurate model that is more beneficial to the user organization.
Let's review an example predictive model for an insurance fraud application. To prevent payment on fraudulent claims, all cases that the model predicts as high-risk will be investigated by fraud experts. The insurance company has devised a predictive model that predicts fraud with some degree of accuracy, and in order to evaluate the performance of the model it has created a sample data set of 10,000 claims. All 10,000 cases in the validation sample have been carefully checked, so it is known which cases are fraudulent. Now, to analyze the quality of the model, the insurance company uses the table of confusion. The definition of accuracy, the table of confusion for model M1Fraud, and the calculation of accuracy for model M1Fraud are shown below.
A(M) = (TN + TP) / (TN + FP + FN + TP)

where
  TN is the number of true negative cases
  FP is the number of false positive cases
  FN is the number of false negative cases
  TP is the number of true positive cases
Formula 1: Definition of accuracy
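Formula 1 translates directly into code. A minimal sketch in Python, applied to both confusion tables from this example (the helper name `accuracy` is ours, not part of the article):

```python
def accuracy(tn, fp, fn, tp):
    """Accuracy per Formula 1: correct predictions (TN + TP)
    over the total number of cases evaluated."""
    return (tn + tp) / (tn + fp + fn + tp)

# Model M1Fraud (Table 1): TN=9,700, FP=150, FN=50, TP=100
m1 = accuracy(9700, 150, 50, 100)   # 0.980

# Trivial "always predict negative" model M2Fraud (Table 2)
m2 = accuracy(9850, 0, 150, 0)      # 0.985
```

Note that the trivial model M2Fraud scores higher (`m2 > m1`) even though it flags no fraud at all, which is exactly the paradox the example develops below.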
                 Predicted Negative   Predicted Positive
  Negative Cases       9,700                  150
  Positive Cases          50                  100
Table 1: Table of Confusion for Fraud Model M1Fraud.
A(M) = (9,700 + 100) / (9,700 + 150 + 50 + 100) = 98.0%
Formula 2: Accuracy for model M1Fraud
With an accuracy of 98.0%, model M1Fraud appears to perform fairly well. However, the accuracy paradox lies in the fact that accuracy can easily be improved to 98.5% by always predicting "no fraud". The table of confusion and the accuracy for this trivial "always predict negative" model M2Fraud are shown below.
                 Predicted Negative   Predicted Positive
  Negative Cases       9,850                    0
  Positive Cases         150                    0
Table 2: Table of Confusion for Fraud Model M2Fraud.
A(M) = (9,850 + 0) / (9,850 + 0 + 150 + 0) = 98.5%
Formula 3: Accuracy for model M2Fraud
Model M2Fraud reduces the rate of inaccurate predictions from 2% to 1.5%, an apparent improvement of 25%. Yet although the new model M2Fraud shows fewer incorrect predictions and markedly improved accuracy compared to the original model M1Fraud, the new model is obviously useless: M2Fraud does not offer the insurance company any value for preventing fraud, because it never flags a single claim for investigation. Clearly, the less accurate model is more useful than the more accurate one. In general, then, high accuracy does not necessarily lead to desirable business outcomes, and model improvements should not be measured in terms of accuracy gains. It may be going too far to say that accuracy is irrelevant for assessing business benefits, but caution is advised when using accuracy to evaluate predictive models.
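The uselessness of M2Fraud is visible immediately in precision and recall, the metrics recommended earlier. A minimal sketch (the helper names `precision` and `recall` are ours; returning 0.0 for undefined precision is a convention we assume here):

```python
def precision(tp, fp):
    # Fraction of predicted-positive cases that are truly positive.
    # Undefined when no positives are predicted; we return 0.0 by convention.
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    # Fraction of truly positive cases the model actually catches.
    return tp / (tp + fn)

# Model M1Fraud: TP=100, FP=150, FN=50
#   precision = 100/250 = 0.40, recall = 100/150 ≈ 0.67
# Model M2Fraud: TP=0, FP=0, FN=150
#   precision = 0.0 (no positives predicted), recall = 0.0:
#   despite its higher accuracy, it catches no fraudulent claims at all.
```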
The inescapable conclusion is that high accuracy is not necessarily an indicator of high model quality, and therein lies the accuracy paradox of predictive analytics.