Blocking (statistics)

From Wikipedia, the free encyclopedia

In the statistical theory of the design of experiments, blocking is the arranging of experimental units in groups (blocks) that are similar to one another. For example, an experiment is designed to test a new drug on patients. There are two levels of the treatment, drug, and placebo, administered to male and female patients in a double blind trial. The sex of the patient is a blocking factor accounting for treatment variability between males and females. This reduces sources of variability and thus leads to greater precision.

Reducing known variability is exactly what blocking does. Its principle lies in the fact that a variability that cannot be overcome (e.g. needing two batches of raw material to produce 1 container of a chemical) is confound or aliased with a(n) (higher/highest order) interaction to eliminate its influence on the end product. High order interactions are usually of the least importance (think of the fact that temperature of a reactor or the batch of raw materials is more important than the combination of the two - this is especially true when more (3, 4, ...) factors are present) thus it is preferable to confound this variability with the higher interaction.

Suppose a process is invented that intends to make the soles of shoes last longer, and a plan is formed to conduct a field trial. Given a group of n volunteers, one possible design would be to give n/2 of them shoes with the new soles and n/2 of them shoes with the ordinary soles, randomizing the assignment of the two kinds of soles. This type of experiment is a completely randomized design. Both groups are then asked to use their shoes for a period of time, and then measure the degree of wear of the soles. This is a workable experimental design, but purely from the point of view of statistical accuracy (ignoring any other factors), a better design would be to give each person one regular sole and one new sole, randomly assigning the two types to the left and right shoe of each volunteer. Such a design is called a randomized complete block design. This design will be more sensitive than the first, because each person is acting as their own control and thus the control group is more closely matched to the treatment group.

The theoretical basis of blocking is the following mathematical result. Given random variables, X and Y

$\operatorname {Var}(X-Y)=\operatorname {Var}(X)+\operatorname {Var}(Y)-2\operatorname {Cov}(X,Y).$

The difference between the treatment and the control can thus be given minimum variance (i.e. maximum precision) by maximising the covariance (or the correlation) between X and Y.

References

Addelman, Sidney (Oct 1969). "The Generalized Randomized Block Design". The American Statistician 23 (4): 35–36. doi:10.2307/2681737. JSTOR 2681737.

Addelman, Sidney (Sep 1970). "Variability of Treatments and Experimental Units in the Design and Analysis of Experiments". Journal of the American Statistical Association 65 (331): 1095–1108. doi:10.2307/2284277. JSTOR 2284277.

Bailey, R. A (2008). Design of Comparative Experiments. Cambridge University Press. ISBN 978-0-521-68357-9. Pre-publication chapters are available on-line.

Caliński, Tadeusz and Kageyama, Sanpei (2000). Block designs: A Randomization approach, Volume I: Analysis. Lecture Notes in Statistics 150. New York: Springer-Verlag. ISBN 0-387-98578-6.

Gates, Charles E. (Nov 1995). "What Really Is Experimental Error in Block Designs?". The American Statistician 49 (4): 362–363. doi:10.2307/2684574. JSTOR 2684574.

Kempthorne, Oscar (1979). The Design and Analysis of Experiments (Corrected reprint of (1952) Wiley ed.). Robert E. Krieger. ISBN 0-88275-105-0.
Hinkelmann, Klaus and Kempthorne, Oscar (2008). Design and Analysis of Experiments. I and II (Second ed.). Wiley. ISBN 978-0-470-38551-7.
- Hinkelmann, Klaus and Kempthorne, Oscar (2008). Design and Analysis of Experiments, Volume I: Introduction to Experimental Design (Second ed.). Wiley. ISBN 978-0-471-72756-9.
- Hinkelmann, Klaus and Kempthorne, Oscar (2005). Design and Analysis of Experiments, Volume 2: Advanced Experimental Design (First ed.). Wiley. ISBN 978-0-471-55177-5.

Lentner, Marvin; Thomas Bishop (1993). Experimental design and analysis (Second ed.). P.O. Box 884, Blacksburg, VA 24063: Valley Book Company. pp. 225–226. ISBN 0-9616255-2-X. Cite uses deprecated parameters (help)

Wilk, M. B. (June 1955). "The Randomization Analysis of a Generalized Randomized Block Design". Biometrika 42 (1–2): 70–79. JSTOR 2333423.

Zyskind, George (Dec 1963). "Some Consequences of randomization in a Generalization of the Balanced Incomplete Block Design". The Annals of Mathematical Statistics 34 (4): 1569–1581. doi:10.1214/aoms/1177703889. JSTOR 2238364.

Designing studies	Effect size Standard error Statistical power Sample size determination

Survey methodology	Sampling Stratified sampling Cluster sampling Opinion poll Questionnaire

Controlled experiment	Design of experiments Randomized experiment Random assignment Replication Blocking Factorial experiment Optimal design

Uncontrolled studies	Natural experiment Quasi-experiment Observational study

Statistical inference

Statistical theory	Sampling distribution Order statistic Scan statistic Record value Sufficiency Completeness Exponential family Permutation test (Randomization test) Empirical distribution Bootstrap U statistic Efficiency Asymptotics Robustness

Frequentist inference	Unbiased estimator (Mean unbiased minimum variance, Median unbiased) Biased estimators (Maximum likelihood, Method of moments, Minimum distance, Density estimation) Confidence interval Testing hypotheses Power Parametric tests (Likelihood-ratio, Wald, Score)

Specific tests	Z (normal) Student's t-test F Goodness of fit (Chi-squared, G, Sample source, sample normality, Skewness & kurtosis Normality, Model comparison, Model quality) Signed-rank (1-sample, 2-sample, 1-way anova) Shapiro–Wilk Kolmogorov–Smirnov

Bayesian inference	Bayesian probability Prior Posterior Credible interval Bayes factor Bayesian estimator Maximum posterior estimator

Correlation and regression analysis

Correlation	Pearson product–moment correlation Partial correlation Confounding variable Coefficient of determination

Regression analysis	Errors and residuals Regression model validation Mixed effects models Simultaneous equations models MARS

Linear regression	Simple linear regression Ordinary least squares General linear model Bayesian regression

Non-standard predictors	Nonlinear regression Nonparametric Semiparametric Isotonic Robust Heteroscedasticity Homoscedasticity

Generalized linear model	Exponential families Logistic (Bernoulli) Binomial Poisson

Partition of variance	Analysis of variance (ANOVA) Analysis of covariance Multivariate ANOVA Degrees of freedom

Categorical, multivariate, time-series, or survival analysis

Categorical data

Multivariate statistics

Time series analysis

General	Decomposition Trend Stationarity Seasonal adjustment Exponential smoothing Cointegration

Specific tests	Granger causality Q-Statistic Durbin–Watson

Time domain	ACF PACF XCF ARMA model ARIMA model ARCH Vector autoregression

Frequency domain	Spectral density estimation Fourier analysis

Survival analysis

Applications

Biostatistics	Bioinformatics Clinical trials & studies Epidemiology Medical statistics

Engineering statistics	Chemometrics Methods engineering Probabilistic design Process & Quality control Reliability System identification

Social statistics	Actuarial science Census Crime statistics Demography Econometrics National accounts Official statistics Population Psychometrics

Spatial statistics	Cartography Environmental statistics Geographic information system Geostatistics Kriging

Category
Portal
Outline
Index

v t e Design of experiments

Scientific Method	Scientific experiment Statistical design Control Internal & external validity Experimental unit Blinding Optimal design: Bayesian Random assignment Randomization Restricted randomization Replication versus subsampling Sample size

Treatment & Blocking	Treatment Effect size Contrast Interaction Confounding Orthogonality Blocking Covariate Nuisance variable

Models & Inference	Linear regression Ordinary least squares Bayesian Random effect Mixed model Hierarchical model: Bayesian Analysis of variance (Anova) Cochran's theorem Manova (multivariate) Ancova (covariance) Compare means Multiple comparison

Designs: Completely Randomized	Factorial Fractional factorial Plackett-Burman Taguchi Response surface methodology Polynomial & rational modeling Box-Behnken Central composite Block Generalized randomized block design (GRBD) Latin square Graeco-Latin square Orthogonal array Latin hypercube Repeated measures design Crossover study Randomized controlled trial Sequential analysis Sequential probability ratio test

Glossary Category Statistics portal Statistical outline Statistical topics

This article is issued from Wikipedia. The text is available under the Creative Commons Attribution/Share Alike; additional terms may apply for the media files.

Blocking (statistics)

References

See also