Sequential analysis

In statistics, sequential analysis or sequential hypothesis testing is statistical analysis where the sample size is not fixed in advance. Instead, data are evaluated as they are collected, and further sampling is stopped in accordance with a pre-defined stopping rule as soon as significant results are observed. Thus a conclusion may sometimes be reached at a much earlier stage than would be possible with classical fixed-sample hypothesis testing or estimation, and consequently at lower financial or human cost.
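
As an illustration of a pre-defined stopping rule, the following is a minimal sketch of Wald's sequential probability ratio test for a stream of 0/1 outcomes. The function name, the hypothesised success rates p0 and p1, and the default error rates are assumptions chosen for the example; the boundaries are Wald's usual approximations rather than exact values.

```python
import math
import random

def sprt_bernoulli(observations, p0, p1, alpha=0.05, beta=0.05):
    """Sequential probability ratio test of H0: p = p0 against H1: p = p1.

    Returns (decision, n_used). The decision is "accept H0", "accept H1",
    or "undecided" if the data run out before a boundary is crossed.
    """
    # Wald's approximate stopping boundaries on the log-likelihood ratio.
    upper = math.log((1 - beta) / alpha)
    lower = math.log(beta / (1 - alpha))
    llr = 0.0
    n = 0
    for n, x in enumerate(observations, start=1):
        # Add the log-likelihood-ratio contribution of this observation.
        if x:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        # Stop sampling as soon as either boundary is reached.
        if llr >= upper:
            return "accept H1", n
        if llr <= lower:
            return "accept H0", n
    return "undecided", n

# Hypothetical usage: a simulated stream whose true success rate is 0.7.
rng = random.Random(0)
stream = (rng.random() < 0.7 for _ in range(10_000))
decision, n_used = sprt_bernoulli(stream, p0=0.5, p1=0.7)
print(decision, n_used)
```

The number of observations consumed is itself random: the test stops early when the evidence is strong and keeps sampling when it is ambiguous.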

History

Sequential analysis was first developed by Abraham Wald^[1] with Jacob Wolfowitz, W. Allen Wallis, and Milton Friedman^[2] while at Columbia University's Statistical Research Group as a tool for more efficient industrial quality control during World War II. Its value to the war effort was immediately recognised, and led to its receiving a "restricted" classification. Another early contribution to the method was made by K.J. Arrow with D. Blackwell and M.A. Girshick.^[3]

A similar approach was independently developed at the same time by Alan Turing, as part of the Banburismus technique used at Bletchley Park, to test hypotheses about whether different messages coded by German Enigma machines should be connected and analysed together. This work remained secret until the early 1980s.^[4]

Applications of sequential analysis

Clinical trials

In a randomized trial with two treatment groups, group sequential testing may, for example, be conducted in the following manner. After n subjects in each group (a total of 2n subjects) are available, an interim analysis is conducted: a statistical test is performed to compare the two groups, and if the null hypothesis is rejected, the trial is terminated. Otherwise, the trial continues and another n subjects per group are recruited. The statistical test is performed again, now including all 4n subjects. If the null hypothesis is rejected, the trial is terminated. Otherwise, it continues with periodic evaluations until a maximum number of interim analyses has been performed. At this point, the last statistical test is conducted, and the trial is discontinued.^[5]
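
The procedure above can be sketched as follows. This is an illustrative sketch only: the two-sample z-test with a known common standard deviation and the Bonferroni split of the significance level across looks are assumptions made for simplicity, and the function names are hypothetical. Practical designs typically use less conservative boundaries.

```python
import math
import random
from statistics import NormalDist, mean

def two_sample_z_pvalue(a, b, sigma):
    """Two-sided p-value for equal means, assuming a known common sigma."""
    se = sigma * math.sqrt(1 / len(a) + 1 / len(b))
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def group_sequential(group_a, group_b, n, max_looks, sigma=1.0, alpha=0.05):
    """Test after every n new subjects per group, for at most max_looks looks.

    Returns (look, total_subjects_used, rejected).
    """
    if min(len(group_a), len(group_b)) < n * max_looks:
        raise ValueError("not enough subjects for the planned looks")
    # Assumption: Bonferroni split of alpha across looks (conservative).
    per_look_alpha = alpha / max_looks
    for look in range(1, max_looks + 1):
        # Each interim analysis includes all subjects recruited so far.
        a, b = group_a[: look * n], group_b[: look * n]
        if two_sample_z_pvalue(a, b, sigma) < per_look_alpha:
            return look, 2 * look * n, True   # null rejected, stop early
    return max_looks, 2 * max_looks * n, False  # maximum reached, stop

# Hypothetical usage with simulated outcomes (true mean difference 0.8).
rng = random.Random(1)
treated = [rng.gauss(0.8, 1.0) for _ in range(100)]
control = [rng.gauss(0.0, 1.0) for _ in range(100)]
print(group_sequential(treated, control, n=25, max_looks=4))
```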

Other applications

Sequential analysis also has a connection to the problem of gambler's ruin, which has been studied by, among others, Huygens in 1657.^[6]

Step detection is the process of finding abrupt changes in the mean level of a time series or signal. It is usually considered a special case of the statistical method known as change point detection. Often the step is small and the time series is corrupted by noise, which makes the problem challenging because the step may be hidden by the noise. Statistical or signal processing algorithms are therefore often required. When the algorithms are run online as the data come in, especially with the aim of producing an alert, this is an application of sequential analysis.
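
One common way to run such a detector online is a cumulative sum (CUSUM) scheme. The sketch below is a minimal one-sided version for upward shifts; the function name and the parameters target_mean, drift, and threshold are assumptions chosen for illustration, not a prescribed tuning.

```python
import random

def cusum_alarm(samples, target_mean, drift, threshold):
    """Return the 1-based index of the first alarm, or None if none occurs.

    One-sided CUSUM for an upward shift: accumulate deviations above
    target_mean + drift, never letting the sum fall below zero, and raise
    an alarm as soon as the sum exceeds threshold.
    """
    s = 0.0
    for i, x in enumerate(samples, start=1):
        s = max(0.0, s + (x - target_mean - drift))
        if s > threshold:
            return i
    return None

# Hypothetical usage: noisy signal whose mean steps from 0 to 1 at sample 201.
rng = random.Random(2)
signal = [rng.gauss(0.0, 1.0) for _ in range(200)]
signal += [rng.gauss(1.0, 1.0) for _ in range(200)]
print(cusum_alarm(signal, target_mean=0.0, drift=0.5, threshold=8.0))
```

A larger threshold reduces false alarms at the cost of a longer delay before a real step is flagged.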

Bias

The statistics of a trial that is stopped early after only n samples differ from those of a similar trial that is run for a predetermined number of samples, even if both end up collecting the same number of samples. If this is not accounted for when interpreting the sequential trial, the results will be biased. It is therefore important that proper methodology is followed in order to avoid false conclusions. See ^[7] for a discussion.
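
One consequence can be seen in a small simulation. The sketch below, under assumed settings (standard normal data with a true mean of zero, a two-sided z-test at the nominal 5% level, and a hypothetical function name), estimates how often a trial rejects a true null hypothesis when the unadjusted test is repeated at every interim look.

```python
import math
import random
from statistics import NormalDist

def false_positive_rate(looks, n_per_look, trials=2000, alpha=0.05, seed=0):
    """Fraction of null-true trials that reject at any of `looks` naive tests."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        total, count = 0.0, 0
        for _ in range(looks):
            # Collect the next batch; the null hypothesis (mean 0) is true.
            for _ in range(n_per_look):
                total += rng.gauss(0.0, 1.0)
            count += n_per_look
            # Unadjusted z-test on all data collected so far.
            z = total / math.sqrt(count)
            if abs(z) > z_crit:
                rejections += 1
                break  # the trial stops at the first "significant" look
    return rejections / trials

print("1 look: ", false_positive_rate(looks=1, n_per_look=20))
print("5 looks:", false_positive_rate(looks=5, n_per_look=20))
```

With a single look the rejection rate stays near the nominal level, while repeated unadjusted looks push it noticeably higher, which is why sequential designs use adjusted stopping boundaries.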

Notes

1. Wald, Abraham (June 1945). "Sequential Tests of Statistical Hypotheses". The Annals of Mathematical Statistics 16 (2): 117–186. doi:10.1214/aoms/1177731118. JSTOR 2235829.
2. Berger, James (2008). "Sequential Analysis". The New Palgrave Dictionary of Economics, 2nd Ed. doi:10.1057/9780230226203.1513.
3. Arrow, Kenneth J.; Blackwell, David; Girshick, M. A. (1949). "Bayes and minimax solutions of sequential decision problems". Econometrica 17 (3/4): 213–244. doi:10.2307/1905525. JSTOR 1905525.
4. Randell, Brian (1980). "The Colossus". A History of Computing in the Twentieth Century (PDF). p. 30. Retrieved 22 March 2011.
5. Korosteleva, Olga (2008). Clinical Statistics: Introducing Clinical Trials, Survival Analysis, and Longitudinal Data Analysis (First ed.). Jones and Bartlett Publishers. ISBN 0-7637-5850-7.
6. Ghosh, B. K.; Sen, P. K. (1991). Handbook of Sequential Analysis. New York: Marcel Dekker. ISBN 9780824784089.
7.

References

Wald, Abraham (1947). Sequential Analysis. New York: John Wiley and Sons.
Ghosh, Bhaskar Kumar (1970). Sequential Tests of Statistical Hypotheses. Reading: Addison-Wesley.
Chernoff, Herman (1972). Sequential Analysis and Optimal Design. SIAM.
Siegmund, David (1985). Sequential Analysis. Springer Series in Statistics. New York: Springer-Verlag. ISBN 0-387-96134-8.
Bakeman, R.; Gottman, J. M. (1997). Observing Interaction: An Introduction to Sequential Analysis. Cambridge: Cambridge University Press.
Jennison, C.; Turnbull, B. W. (2000). Group Sequential Methods with Applications to Clinical Trials. Chapman & Hall/CRC.
Whitehead, J. (1997). The Design and Analysis of Sequential Clinical Trials, 2nd Edition. John Wiley & Sons.
