Cohen's kappa

Cohen's kappa coefficient is a statistical measure of inter-rater agreement. It is generally thought to be a more robust measure than a simple percent-agreement calculation, since κ takes into account the agreement occurring by chance. Cohen's kappa measures the agreement between two raters who each classify N items into C mutually exclusive categories.

The equation for κ is:

\kappa = \frac{\Pr(a) - \Pr(e)}{1 - \Pr(e)}

where Pr(a) is the relative observed agreement among the raters, and Pr(e) is the hypothetical probability of chance agreement, calculated from each rater's observed frequencies for the various categories. If the raters are in complete agreement then κ = 1. If there is no agreement among the raters other than what would be expected by chance, then κ ≤ 0.
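
As a hypothetical worked example (the numbers are purely illustrative and not drawn from the references below), suppose two raters each classify the same 50 items as either "yes" or "no". They agree that 20 items are "yes" and that 15 items are "no"; overall the first rater says "yes" 25 times and the second rater 30 times. The observed agreement is

\Pr(a) = \frac{20 + 15}{50} = 0.7

The first rater says "yes" with relative frequency 0.5 and the second with relative frequency 0.6, so the probability that both would say "yes" by chance is 0.5 × 0.6 = 0.3 and the probability that both would say "no" by chance is 0.5 × 0.4 = 0.2, giving

\Pr(e) = 0.3 + 0.2 = 0.5

and therefore

\kappa = \frac{0.7 - 0.5}{1 - 0.5} = 0.4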

The seminal paper introducing kappa as a new technique was published by Jacob Cohen in the journal Educational and Psychological Measurement in 1960.

Note that Cohen's kappa measures agreement between two raters only. For a similar measure of agreement (Fleiss' kappa) used when there are more than two raters, see Fleiss (1981).

Significance

Landis and Koch[1] gave the following table for interpreting κ values. This table is, however, by no means universally accepted; Landis and Koch supplied no evidence to support it, basing it instead on personal opinion. It has been noted that these guidelines may be more harmful than helpful,[2] as the number of categories and subjects will affect the magnitude of the value; the kappa will be higher when there are fewer categories.[3]

κ            Interpretation
< 0          No agreement
0.00–0.20    Slight agreement
0.21–0.40    Fair agreement
0.41–0.60    Moderate agreement
0.61–0.80    Substantial agreement
0.81–1.00    Almost perfect agreement
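
The computation, together with the table above, can be illustrated with a minimal self-contained Python sketch. The functions cohen_kappa and landis_koch_label below are illustrative names rather than part of any particular library; the sketch assumes the two raters' judgements are given as parallel lists of category labels, and the data reproduce the hypothetical example given earlier.

    from collections import Counter

    def cohen_kappa(labels_a, labels_b):
        """Cohen's kappa for two raters who each label the same items."""
        n = len(labels_a)
        # Pr(a): relative observed agreement.
        pr_a = sum(a == b for a, b in zip(labels_a, labels_b)) / n
        # Pr(e): chance agreement from each rater's marginal category frequencies.
        freq_a, freq_b = Counter(labels_a), Counter(labels_b)
        categories = set(labels_a) | set(labels_b)
        pr_e = sum((freq_a[c] / n) * (freq_b[c] / n) for c in categories)
        return (pr_a - pr_e) / (1 - pr_e)

    def landis_koch_label(kappa):
        """Descriptive band from the Landis and Koch table above."""
        if kappa < 0:
            return "No agreement"
        bands = [(0.20, "Slight"), (0.40, "Fair"), (0.60, "Moderate"),
                 (0.80, "Substantial"), (1.00, "Almost perfect")]
        for upper, label in bands:
            if kappa <= upper:
                return label + " agreement"

    # Ratings matching the worked example: 20 yes/yes, 5 yes/no, 10 no/yes
    # and 15 no/no items, giving kappa = 0.4 up to floating-point rounding.
    rater1 = ["yes"] * 25 + ["no"] * 25
    rater2 = ["yes"] * 20 + ["no"] * 5 + ["yes"] * 10 + ["no"] * 15
    k = cohen_kappa(rater1, rater2)
    print(round(k, 3), landis_koch_label(k))  # 0.4 Fair agreement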

References

  • Cohen, J. (1960) "A coefficient of agreement for nominal scales" in Educational and Psychological Measurement, Vol. 20, No. 1, pp. 37–46.
  • Fleiss, J. L. (1981) Statistical methods for rates and proportions, 2nd ed. (New York: John Wiley) pp. 212–236 (Chapter 13: The measurement of interrater agreement).

Notes

  1. Landis, J. R. and Koch, G. G. (1977), pp. 159–174.
  2. Gwet, K. (2001).
  3. Sim, J. and Wright, C. C. (2005), pp. 257–268.

References

  • Fleiss, J. L. (1971) "Measuring nominal scale agreement among many raters" in Psychological Bulletin, Vol. 76, No. 5, pp. 378–382.
  • Fleiss, J. L. (1981) Statistical methods for rates and proportions, 2nd ed. (New York: John Wiley) pp. 38–46.
  • Fleiss, J. L. and Cohen, J. (1973) "The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability" in Educational and Psychological Measurement, Vol. 33, pp. 613–619.
  • Gwet, K. (2001) Statistical Tables for Inter-Rater Agreement. (Gaithersburg: StatAxis Publishing).
  • Landis, J. R. and Koch, G. G. (1977) "The measurement of observer agreement for categorical data" in Biometrics, Vol. 33, pp. 159–174.
  • Scott, W. (1955) "Reliability of content analysis: The case of nominal scale coding" in Public Opinion Quarterly, Vol. 17, pp. 321–325.
  • Sim, J. and Wright, C. C. (2005) "The Kappa Statistic in Reliability Studies: Use, Interpretation, and Sample Size Requirements" in Physical Therapy, Vol. 85, pp. 257–268.
