McNemar's test
From Wikipedia, the free encyclopedia
In statistics, McNemar's test is a non-parametric method used on nominal data to determine whether the row and column marginal frequencies are equal. It is named after Quinn McNemar, who introduced it in 1947. It is applied to 2 × 2 contingency tables with a dichotomous trait with matched pairs of subjects.
In the following example, a researcher attempts to determine whether a drug has an effect on a particular disease. Counts of individuals are given in the table, with the diagnosis (disease: +/−) before treatment given in the columns and the diagnosis after treatment given in the rows. The test requires the same subjects to be included in the before and after measurements (matched pairs).
| | Before: + | Before: − | Row total |
|---|---|---|---|
| After: + | 101 | 59 | 160 |
| After: − | 121 | 33 | 154 |
| Column total | 222 | 92 | 314 |
The cells of the table are represented by the letters a, b, c and d. The totals across rows and columns are the marginal totals, and the grand total is represented by n:

| | Before: + | Before: − | Row total |
|---|---|---|---|
| After: + | a | b | a + b |
| After: − | c | d | c + d |
| Column total | a + c | b + d | n |
Marginal homogeneity occurs when the row totals are equal to the corresponding column totals:

$$a + b = a + c$$

$$c + d = b + d$$

These equations are equivalent. Cancelling a from the first, or d from the second, leaves a single constraint on the contingency table, which is the null hypothesis:

$$b = c$$
In this example, "marginal homogeneity" would mean there was no effect of the treatment.
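The reduction of both marginal conditions to a comparison of b and c can be checked numerically on the example table. The following is a minimal Python sketch; the function name `marginal_totals` is illustrative only and is not taken from any library.

```python
def marginal_totals(a, b, c, d):
    """Return (row totals, column totals) of a 2 x 2 table with cells a, b, c, d."""
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    return row_totals, col_totals


# Counts from the drug example: a = 101, b = 59, c = 121, d = 33.
a, b, c, d = 101, 59, 121, 33
row_totals, col_totals = marginal_totals(a, b, c, d)

# Both marginal differences reduce to the difference of the off-diagonal cells.
first_gap = row_totals[0] - col_totals[0]   # (a + b) - (a + c) = b - c
second_gap = row_totals[1] - col_totals[1]  # (c + d) - (b + d) = c - b

print(row_totals, col_totals)
print(first_gap, second_gap)
```

The two gaps have equal size and opposite sign, so the margins agree exactly when b equals c.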
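The McNemar statistic is:

$$\chi^2 = \frac{(b - c)^2}{b + c}$$

Under the null hypothesis, and with a sufficiently large number of discordant pairs (b + c), χ² approximately follows a chi-squared distribution with 1 degree of freedom. The formula may be rewritten with a continuity correction; a commonly used form is:

$$\chi^2 = \frac{(|b - c| - 1)^2}{b + c}$$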
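As a worked illustration, the statistic can be evaluated for the drug example, where b = 59 and c = 121. The sketch below uses only the Python standard library. It assumes the continuity-corrected form (|b − c| − 1)² / (b + c), and it obtains the p-value from the identity P(χ² > x) = erfc(√(x/2)), which holds for 1 degree of freedom. The function name `mcnemar_chi2` is an illustrative choice, not a library API.

```python
import math


def mcnemar_chi2(b, c, correction=False):
    """Return (statistic, p_value) for McNemar's test from the discordant counts.

    With correction=True, the assumed continuity-corrected form
    (|b - c| - 1)^2 / (b + c) is used.
    """
    if b + c == 0:
        raise ValueError("the test is undefined when b + c == 0")
    diff = abs(b - c)
    if correction:
        diff = max(diff - 1.0, 0.0)
    stat = diff ** 2 / (b + c)
    # Upper tail of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Drug example: b = 59, c = 121.
stat, p = mcnemar_chi2(59, 121)
stat_cc, p_cc = mcnemar_chi2(59, 121, correction=True)
print(round(stat, 3), round(stat_cc, 3))  # 21.356 20.672
```

Both versions give a statistic far above 3.84, the 5% critical value for 1 degree of freedom, so the hypothesis of marginal homogeneity would be rejected for this table.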
The marginal frequencies are judged not to be homogeneous if the χ² result is significant (for example, p < 0.05). If b and/or c are small (b + c < 20), χ² is not well approximated by the chi-squared distribution, and a sign test (an exact binomial test on the discordant pairs) should be used instead.
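For small numbers of discordant pairs, the sign test can be carried out as an exact two-sided binomial test: under the null hypothesis, each discordant pair is equally likely to fall in cell b or cell c. The following Python sketch shows one way to compute it. The counts b = 3 and c = 9 are hypothetical values chosen only for illustration, and `mcnemar_exact` is an illustrative name.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact (sign test) p-value from the discordant counts b and c."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence against b = c
    k = min(b, c)
    # P(X <= k) for X ~ Binomial(n, 1/2), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical small-sample counts, not taken from the article's tables.
p_exact = mcnemar_exact(3, 9)
print(round(p_exact, 4))  # 0.146
```

Because the distribution under the null hypothesis is symmetric, doubling the smaller tail gives the two-sided p-value; the result is capped at 1 for the case b = c.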
An interesting observation when interpreting McNemar's test is that the elements of the main diagonal (a and d) contribute no information whatsoever to the decision as to whether (in the above example) the pre- or post-treatment condition is more favourable.
The statistical software SAS can be used to carry out this test. The following example uses data on oral contraceptives from a study reported by Sartwell et al. (1969).

The study was conducted in a number of hospitals in several large American cities. In those hospitals, all married women identified as suffering from idiopathic thromboembolism (blood clots) over a 3-year period were individually matched with a suitable control, a female patient discharged alive from the same hospital in the same 6-month time interval as the case. In addition, controls were individually matched to cases on age, marital status, race, etc. Patients and controls were then asked about their use of oral contraceptives. The following contingency table contains the data.
| | Control used | Control not used |
|---|---|---|
| Case used | 10 | 57 |
| Case not used | 13 | 95 |
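The appropriate SAS code to analyze these data is as follows.

    /* One record per cell: case use, control use, number of matched pairs */
    data the_pill;
      input caseuse $ contruse $ n @@;
      cards;
    Y Y 10  Y N 57  N Y 13  N N 95
    ;
    /* The AGREE option requests McNemar's test for the 2 x 2 table */
    proc freq data=the_pill order=data;
      tables caseuse*contruse / agree;
      weight n;
    run;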
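For readers without access to SAS, the same calculation can be sketched in Python using only the standard library. This is an illustration rather than a reproduction of the SAS output: it computes the uncorrected statistic (b − c)² / (b + c) from the two discordant cells of the oral contraceptive table, and the variable names are chosen here for readability.

```python
import math

# Same records as the SAS data step: (case use, control use, number of pairs).
records = [("Y", "Y", 10), ("Y", "N", 57), ("N", "Y", 13), ("N", "N", 95)]
counts = {(case, control): n for case, control, n in records}

# Discordant pairs: only one member of the matched pair used oral contraceptives.
b = counts[("Y", "N")]  # case used, control did not
c = counts[("N", "Y")]  # control used, case did not

statistic = (b - c) ** 2 / (b + c)
# Upper tail of the chi-squared distribution with 1 degree of freedom.
p_value = math.erfc(math.sqrt(statistic / 2.0))

print(round(statistic, 3))  # 27.657
```

The 10 and 95 concordant pairs do not enter the calculation; only the 57 and 13 discordant pairs do.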
A well-known application of the test in genetics is the transmission disequilibrium test for detecting genetic linkage.
Related tests
- Cochran's Q test is a generalization to more than two matched samples (treatments) with a dichotomous response.
- The Stuart–Maxwell test is a different generalization of the McNemar test, used for testing marginal homogeneity in a square table with more than two rows/columns.