Youden's J statistic
Youden's J statistic (also called Youden's index) is a single statistic that captures the performance of a diagnostic test.
Definition
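- J = Sensitivity + Specificity − 1
In terms of the counts of true positives (TP), false negatives (FN), true negatives (TN) and false positives (FP), sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP), so that J = TP / (TP + FN) + TN / (TN + FP) − 1.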
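As an illustration of the definition, the following minimal Python sketch computes J from the four cells of a 2×2 contingency table. The function name `youden_j` and the example counts are assumptions made for illustration only and are not taken from any cited source.

```python
def youden_j(tp, fn, tn, fp):
    """Youden's J from the four cells of a 2x2 contingency table."""
    sensitivity = tp / (tp + fn)  # true positive rate among the diseased
    specificity = tn / (tn + fp)  # true negative rate among the non-diseased
    return sensitivity + specificity - 1.0


# Hypothetical counts: 100 diseased and 100 non-diseased subjects.
j = youden_j(tp=80, fn=20, tn=90, fp=10)
print(round(j, 3))
```

With these counts the sensitivity is 0.8 and the specificity is 0.9, so J is approximately 0.7.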
The index was suggested by W.J. Youden in 1950 [1] as a way of summarising the performance of a diagnostic test. Its value ranges from −1 to 1. It is zero when a diagnostic test gives the same proportion of positive results for groups with and without the disease, i.e. the test is useless. A value of 1 indicates that there are no false positives or false negatives, i.e. the test is perfect. Negative values arise only when the test is positive more often in the group without the disease than in the group with it; reversing the test's labels then gives a positive index of the same magnitude. The index gives equal weight to the false positive rate and the false negative rate, so all tests with the same value of the index have the same sum of these two rates and, when the two groups are of equal size, the same proportion of total misclassified results.
Youden's index is often used in conjunction with Receiver Operating Characteristic (ROC) analysis.[2] The index is defined for all points of an ROC curve, and the maximum value of the index may be used as a criterion for selecting the optimum cut-off point when a diagnostic test gives a numeric rather than a dichotomous result. Graphically, the index is the vertical height of an operating point above the chance line (the diagonal of the ROC plot). It is also a linear function of the Area under the Curve (AUC) subtended by a single operating point: that area equals (Sensitivity + Specificity) / 2, so J = 2 × AUC − 1.[3]
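The cut-off selection described above can be sketched as a simple search over candidate thresholds. This is a minimal illustration rather than a definitive implementation: the function name `best_cutoff`, the convention that a score at or above the cut-off counts as a positive result, the rule that ties go to the lowest cut-off, and the example data are all assumptions.

```python
def best_cutoff(scores, labels):
    """Return (cutoff, J) for the cut-off that maximises Youden's J.

    Assumes labels are 1 (diseased) or 0 (non-diseased) and that a
    subject tests positive when score >= cutoff.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one subject in each group")

    best_cut, best_j = None, float("-inf")
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
        # J = sensitivity + specificity - 1 = TPR - FPR,
        # i.e. the height of this ROC point above the chance line.
        j = tp / positives - fp / negatives
        if j > best_j:  # strict comparison keeps the lowest cut-off on ties
            best_cut, best_j = cutoff, j
    return best_cut, best_j


# Hypothetical test results for 4 non-diseased and 4 diseased subjects.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j = best_cutoff(scores, labels)
print(cutoff, j)
```

In this example both 0.4 and 0.7 give J = 0.75, and the sketch reports the lower of the two. For large data sets a single pass over the sorted scores would be more efficient than recounting at every candidate cut-off.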
Youden's index is also known as deltap' [4], and it generalizes from the dichotomous to the multiclass case as Informedness.[3]
An unrelated but more commonly used combination of basic statistics is the F-score, the harmonic mean of recall and precision. Recall is the same quantity as sensitivity (the true positive rate), but precision is a different quantity from specificity. The use of a single index is "not generally to be recommended",[5] but Informedness or Youden's index is the probability of an informed decision (as opposed to a random guess). Unlike the F-score, it takes into account all cells of the contingency table, including the true negatives, and is thus a better choice in general.
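The difference in how the two measures treat the contingency table can be seen in a small numerical sketch. The helper names `f1_score` and `informedness` and the counts below are hypothetical; the F-score shown is the balanced F1 variant.

```python
def f1_score(tp, fp, fn):
    """Balanced F-score: harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # same quantity as sensitivity
    return 2 * precision * recall / (precision + recall)


def informedness(tp, fn, tn, fp):
    """Youden's J: sensitivity + specificity - 1."""
    return tp / (tp + fn) + tn / (tn + fp) - 1.0


# Two hypothetical tables that differ only in the number of true negatives.
f1 = f1_score(tp=90, fp=10, fn=10)  # true negatives do not enter the formula
j_few_tn = informedness(tp=90, fn=10, tn=10, fp=10)
j_many_tn = informedness(tp=90, fn=10, tn=990, fp=10)
print(f1, j_few_tn, j_many_tn)
```

The F1 value is about 0.9 for both tables, whereas J rises from about 0.4 to about 0.89 as the number of true negatives grows from 10 to 990.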
The Matthews correlation coefficient is the geometric mean of the regression coefficient of the problem and that of its dual. These two component regression coefficients are Markedness (deltap) and Informedness (deltap').
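This relationship can be checked numerically for a 2×2 table. The sketch below assumes the usual dichotomous definitions, Markedness = PPV + NPV − 1 and the Matthews correlation coefficient as (TP·TN − FP·FN) divided by the square root of the product of the four marginal totals; the counts are hypothetical. For a table in which both coefficients are positive, the coefficient equals the square root of their product.

```python
import math

# Hypothetical contingency table.
tp, fn, tn, fp = 80, 20, 90, 10

# Informedness (Youden's J): sensitivity + specificity - 1.
informedness = tp / (tp + fn) + tn / (tn + fp) - 1.0

# Markedness: positive predictive value + negative predictive value - 1.
markedness = tp / (tp + fp) + tn / (tn + fn) - 1.0

# Matthews correlation coefficient computed directly from the counts.
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

# Geometric mean of the two regression coefficients.
geometric_mean = math.sqrt(informedness * markedness)
print(mcc, geometric_mean)
```

Informedness and Markedness share the numerator TP·TN − FP·FN, so they always have the same sign; when both are negative the Matthews correlation coefficient is the negative of the geometric mean of their magnitudes.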
References
- Youden, W.J. (1950). "Index for rating diagnostic tests". Cancer 3: 32–35. doi:10.1002/1097-0142(1950)3:1<32::aid-cncr2820030106>3.0.co;2-3.
- Schisterman, E.F.; Perkins, N.J.; Liu, A.; Bondell, H. (2005). "Optimal cut-point and its corresponding Youden Index to discriminate individuals using pooled blood samples". Epidemiology 16: 73–81. doi:10.1097/01.ede.0000147512.81966.ba.
- Powers, David M W (2011). "Evaluation: From Precision, Recall and F-Score to ROC, Informedness, Markedness & Correlation". Journal of Machine Learning Technologies 2 (1): 37–63.
- Perruchet, P.; Peereman, R. (2004). "The exploitation of distributional information in syllable processing". J. Neurolinguistics 17: 97–119. doi:10.1016/s0911-6044(03)00059-9.
- Everitt, B.S. (2002). The Cambridge Dictionary of Statistics. CUP. ISBN 0-521-81099-X.