Matching (statistics)

Matching is a statistical technique which is used to evaluate the effect of a treatment by comparing the treated and the non-treated units in an observational study or quasi-experiment (i.e. when the treatment is not randomly assigned). The goal of matching is, for every treated unit, to find one (or more) non-treated unit(s) with similar observable characteristics against whom the effect of the treatment can be assessed. By matching treated units to similar non-treated units, matching enables a comparison of outcomes among treated and non-treated units to estimate the effect of the treatment reducing bias due to confounding.^[1]^[2]^[3] Propensity score matching, an early matching technique, was developed as part of the Rubin causal model model.^[4]

Matching has been promoted by Donald Rubin.^[4] It was prominently criticized in economics by LaLonde (1986),^[5] who compared estimates of treatment effects from an experiment to comparable estimates produced with matching methods and showed that matching methods are biased. Dehejia and Wahba (1999) reevaluted LaLonde's critique and show that matching is a good solution.^[6] Similar critiques have been raised in political science^[7] and sociology^[8] journals.

Analysis

Matched samples of treated and non-treated units can often be analyzed with a paired difference test to estimate the average treatment effect. Matching can also be used to "pre-process" a sample before analysis via another technique, such as regression analysis.^[9]

Overmatching

Overmatching is matching for an apparent confounder that actually is a result of the exposure. True confounders are associated with both the exposure and the disease, but if the exposure itself leads to the confounder, or has equal status with it, then stratifying by that confounder will also partly stratify by the exposure, resulting in an obscured relation of the exposure to the disease.^[10] Overmatching thus causes statistical bias.^[10]

For example, matching the control group by gestation length and/or the number of multiple births when estimating perinatal mortality and birthweight after in vitro fertilization (IVF) is overmatching, since IVF itself increases the risk of premature birth and multiple birth.^[11]

It may be regarded as a sampling bias in decreasing the external validity of a study, because the controls become more similar to the cases in regard to exposure than the general population.

References

↑ Rubin, Donald B. (1973). "Matching to Remove Bias in Observational Studies". Biometrics 29 (1): 159–183. doi:10.2307/2529684. JSTOR 2529684.
↑ Anderson, Dallas W.; Kish, Leslie; Cornell, Richard G. (1980). "On Stratification, Grouping and Matching". Scandinavian Journal of Statistics 7 (2): 61–66. JSTOR 4615774.
↑ Kupper, Lawrence L.; Karon, John M.; Kleinbaum, David G.; Morgenstern, Hal; Lewis, Donald K. (1981). "Matching in Epidemiologic Studies: Validity and Efficiency Considerations". Biometrics 37 (2): 271–291. doi:10.2307/2530417. JSTOR 2530417. PMID 7272415.
↑ 4.0 4.1 Rosenbaum, Paul R.; Rubin, Donald B. (1983). "The Central Role of the Propensity Score in Observational Studies for Causal Effects". Biometrika 70 (1): 41–55. doi:10.1093/biomet/70.1.41.
↑ LaLonde, Robert J. (1986). "Evaluating the Econometric Evaluations of Training Programs with Experimental Data". American Economic Review 76 (4): 604–620. JSTOR 1806062.
↑ Dehejia, R. H.; Wahba, S. (1999). "Causal Effects in Nonexperimental Studies: Reevaluating the Evaluation of Training Programs". Journal of the American Statistical Association 94 (448): 1053–1062. doi:10.1080/01621459.1999.10473858.
↑ Arceneaux, Kevin; Gerber, Alan S.; Green, Donald P. (2006). "Comparing Experimental and Matching Methods Using a Large-Scale Field Experiment on Voter Mobilization". Political Analysis 14 (1): 37–62. doi:10.1093/pan/mpj001.
↑ Arceneaux, Kevin; Gerber, Alan S.; Green, Donald P. (2010). "A Cautionary Note on the Use of Matching to Estimate Causal Effects: An Empirical Example Comparing Matching Estimates to an Experimental Benchmark". Sociological Methods & Research 39 (2): 256–282. doi:10.1177/0049124110378098.
↑ Ho, Daniel E.; Imai, Kosuke; King, Gary; Stuart, Elizabeth A. (2007). "Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference". Political Analysis 15 (3): 199–236. doi:10.1093/pan/mpl013.
↑ 10.0 10.1 Marsh, J. L.; Hutton, J. L.; Binks, K. (2002). "Removal of radiation dose response effects: an example of over-matching". British Medical Journal 325 (7359): 327–330. doi:10.1136/bmj.325.7359.327. PMC 1123834. PMID 12169512.
↑ Gissler, M.; Hemminki, E. (1996). "The danger of overmatching in studies of the perinatal mortality and birthweight of infants born after assisted conception". Eur J Obstet Gynecol Reprod Biol 69 (2): 73–75. doi:10.1016/0301-2115(95)02517-0. PMID 8902436.

Study design	Effect size Standard error Statistical power Sample size determination

Survey methodology	Sampling stratified cluster Opinion poll Questionnaire

Controlled experiments	Design optimal Randomized Random assignment Replication Blocking Factorial experiment

Uncontrolled studies	Natural experiment Quasi-experiment Observational study

Statistical inference

Statistical theory

Frequentist inference

Confidence interval Testing hypotheses Power

Unbiased estimators	Mean unbiased minimum-variance Median unbiased

Biased estimators	Maximum likelihood Method of moments Minimum distance Density estimation

Parametric tests	Likelihood-ratio Wald Score

Specific tests

Z (normal) Student's t-test F Shapiro–Wilk Kolmogorov–Smirnov

Goodness of fit	Chi-squared G Sample source (Anderson–Darling) Sample normality (Shapiro–Wilk) Skewness / kurtosis normality (Jarque-Bera) Model comparison (Likelihood-ratio) Model quality (Akaike criterion)

Signed-rank	1-sample (Wilcoxon) 2-sample (Mann–Whitney U) 1-way anova (Kruskal–Wallis)

Bayesian inference

Correlation	Pearson product–moment Partial correlation Confounding variable Coefficient of determination

Regression analysis	Errors and residuals Regression model validation Mixed effects models Simultaneous equations models Multivariate adaptive regression splines (MARS)

Linear regression	Simple linear regression Ordinary least squares General linear model Bayesian regression

Non-standard predictors	Nonlinear regression Nonparametric Semiparametric Isotonic Robust Heteroscedasticity Homoscedasticity

Generalized linear model	Exponential families Logistic (Bernoulli) / Binomial / Poisson regressions

Partition of variance	Analysis of variance (ANOVA, anova) Analysis of covariance Multivariate ANOVA Degrees of freedom

Categorical / Multivariate / Time-series / Survival analysis

Categorical

Multivariate

Time-series

General	Decomposition Trend Stationarity Seasonal adjustment Exponential smoothing Cointegration Structural break Granger causality

Specific tests	Dickey–Fuller Johansen Q-statistic (Ljung–Box) Durbin–Watson Breusch–Godfrey

Time domain	Autocorrelation (ACF) partial (PACF) Cross-correlation (XCF) ARMA model ARIMA model (Box–Jenkins) Autoregressive conditional heteroskedasticity (ARCH) Vector autoregression (VAR)

Frequency domain	Spectral density estimation Fourier analysis Wavelet

Survival

Applications

Biostatistics	Bioinformatics Clinical trials / studies Epidemiology Medical statistics

Engineering statistics	Chemometrics Methods engineering Probabilistic design Process / quality control Reliability System identification

Social statistics	Actuarial science Census Crime statistics Demography Econometrics National accounts Official statistics Population Psychometrics

Spatial statistics	Cartography Environmental statistics Geographic information system Geostatistics Kriging

Category
Portal
Commons
WikiProject

Matching (statistics)

Analysis

Overmatching

See also

References

Further reading