Sequential analysis

In statistics, sequential analysis or sequential hypothesis testing is statistical analysis where the sample size is not fixed in advance. Instead, data are evaluated as they are collected, and further sampling is stopped in accordance with a pre-defined stopping rule as soon as significant results are observed. Thus a conclusion may sometimes be reached at a much earlier stage than would be possible with more classical hypothesis testing or estimation, at consequently lower financial and/or human cost.

History

The method of sequential analysis is first attributed to Abraham Wald^[1] with Jacob Wolfowitz, W. Allen Wallis, and Milton Friedman^[2] while at Columbia University's Statistical Research Group as a tool for more efficient industrial quality control during World War II. Its value to the war effort was immediately recognised, and led to its receiving a "restricted" classification.^[3] At the same time, George Barnard led a group working on optional stopping in Great Britain. Another early contribution to the method was made by K.J. Arrow with D. Blackwell and M.A. Girshick.^[4]

A similar approach was independently developed from first principles at about the same time by Alan Turing, as part of the Banburismus technique used at Bletchley Park, to test hypotheses about whether different messages coded by German Enigma machines should be connected and analysed together. This work remained secret until the early 1980s.^[5]

Peter Armitage introduced the use of sequential analysis in medical research, especially in the area of clinical trials. Sequential methods became increasingly popular in medicine following Stuart Pocock's work that provided clear recommendations on how to control Type 1 error rates in sequential designs.^[6]

Alpha spending functions

When researchers repeatedly analyze data as more observations are added, the probability of a Type 1 error increases. Therefore, it is important to adjust the alpha level at each interim analysis, such that the overall Type 1 error rate remains at the desired level. This is conceptually similar to using the Bonferroni correction, but because the repeated looks at the data are dependent, more efficient corrections for the alpha level can be used. Among the earliest proposals is the Pocock boundary. Alternative ways to control the Type 1 error rate exist, such as the Haybittle-Peto bounds, and additional work on determining the boundaries for interim analyses has been done by O’Brien & Fleming^[7] and Wang & Tsiatis.^[8]
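
The inflation described above can be illustrated with a small simulation. The following is a minimal sketch, not a reference implementation: it assumes normally distributed observations with known unit variance, a true mean of zero (so every rejection is a false positive), and an unadjusted two-sided z-test at each look. The function name and parameters are illustrative only.

```python
import random
from statistics import NormalDist

def false_positive_rate(n_looks, n_per_look=20, n_trials=2000, alpha=0.05, seed=0):
    """Estimate the chance of at least one 'significant' look when the null is true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # unadjusted two-sided critical value
    rejections = 0
    for _ in range(n_trials):
        total, n = 0.0, 0
        for _ in range(n_looks):
            # Add a new batch of observations; the true mean is 0.
            for _ in range(n_per_look):
                total += rng.gauss(0.0, 1.0)
            n += n_per_look
            z = total / n ** 0.5  # z statistic for known unit variance
            if abs(z) > z_crit:
                rejections += 1  # stop at the first "significant" look
                break
    return rejections / n_trials

one_look = false_positive_rate(n_looks=1)
five_looks = false_positive_rate(n_looks=5)
print(one_look, five_looks)
```

With a single look the estimated rate should be close to the nominal 0.05, while with five unadjusted looks it should be clearly higher, which is the reason the alpha level at each interim analysis needs to be adjusted.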

A limitation of corrections such as the Pocock boundary is that the number of looks at the data must be determined before the data are collected, and that the looks at the data should be equally spaced (e.g., after 50, 100, 150, and 200 patients). The alpha spending function approach developed by DeMets & Lan^[9] does not have these restrictions and, depending on the parameters chosen for the spending function, can be very similar to Pocock boundaries or the corrections proposed by O'Brien and Fleming.
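
As a rough illustration of what a spending function looks like, the sketch below evaluates two forms that are commonly used with this approach: an O'Brien-Fleming-type function and a Pocock-type function. Both map the information fraction t (between 0 and 1) to the cumulative alpha that may be spent by that point. The function names are illustrative, and the sketch only computes the cumulative alpha; turning it into critical values at each look requires further numerical work that is not shown here.

```python
from math import e, log, sqrt
from statistics import NormalDist

_norm = NormalDist()

def obf_type_spending(t, alpha=0.05):
    """O'Brien-Fleming-type spending: 2 - 2*Phi(z_{alpha/2} / sqrt(t)), for 0 < t <= 1."""
    z = _norm.inv_cdf(1 - alpha / 2)
    return 2 * (1 - _norm.cdf(z / sqrt(t)))

def pocock_type_spending(t, alpha=0.05):
    """Pocock-type spending: alpha * ln(1 + (e - 1) * t), for 0 <= t <= 1."""
    return alpha * log(1 + (e - 1) * t)

# Information fractions need not be equally spaced or fixed in advance.
looks = [0.2, 0.5, 0.65, 1.0]
for t in looks:
    print(t, round(obf_type_spending(t), 4), round(pocock_type_spending(t), 4))
```

Both functions spend the full alpha at t = 1. The O'Brien-Fleming-type function spends very little alpha at early looks, whereas the Pocock-type function spends it more evenly.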

Applications of sequential analysis

Clinical trials

In a randomized trial with two treatment groups, group sequential testing may, for example, be conducted in the following manner. After n subjects in each group are available, an interim analysis is conducted: a statistical test is performed to compare the two groups, and if the null hypothesis is rejected, the trial is terminated. Otherwise, the trial continues and another n subjects per group are recruited. The statistical test is performed again, including all subjects. If the null hypothesis is rejected, the trial is terminated. Otherwise, it continues with periodic evaluations until a maximum number of interim analyses have been performed. At this point, the last statistical test is conducted, and the trial is discontinued.^[10]
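
The procedure above can be sketched as a short simulation. This is a minimal illustration under stated assumptions, not a trial design tool: outcomes are simulated as normal with known unit variance, a two-sided z-test is used at every look, and the same critical value is applied at each look. The value 2.413 is the tabulated Pocock constant for five equally spaced looks at a two-sided alpha of 0.05; it is treated here as an input and should be checked against tables or software for any other design. All names are hypothetical.

```python
import random

def run_group_sequential_trial(effect, n_per_stage, max_looks, z_crit, rng):
    """Return (rejected, looks_used, estimated_difference) for a two-arm trial."""
    sum_a = sum_b = 0.0
    n = 0
    diff = 0.0
    for look in range(1, max_looks + 1):
        # Recruit another n_per_stage subjects per group.
        for _ in range(n_per_stage):
            sum_a += rng.gauss(effect, 1.0)  # treatment group
            sum_b += rng.gauss(0.0, 1.0)     # control group
        n += n_per_stage
        # Test on all subjects so far; the difference in means has variance 2/n.
        diff = sum_a / n - sum_b / n
        z = diff / (2.0 / n) ** 0.5
        if abs(z) > z_crit:
            return True, look, diff  # null rejected: terminate the trial
    return False, max_looks, diff    # maximum number of looks reached

rng = random.Random(1)
results = [run_group_sequential_trial(0.0, 20, 5, 2.413, rng) for _ in range(2000)]
null_rejection_rate = sum(r[0] for r in results) / len(results)
print(null_rejection_rate)
```

Because the simulated effect is zero, the printed rejection rate estimates the overall Type 1 error of the five-look design, which should stay near 0.05 with this critical value.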

Other applications

Sequential analysis also has a connection to the problem of gambler's ruin that has been studied by, among others, Huygens in 1657.^[11]

Step detection is the process of finding abrupt changes in the mean level of a time series or signal. It is usually considered a special case of the statistical method known as change point detection. Often, the step is small and the time series is corrupted by some kind of noise, which makes the problem challenging because the step may be hidden by the noise. Therefore, statistical and/or signal processing algorithms are often required. When the algorithms are run online as the data are coming in, especially with the aim of producing an alert, this is an application of sequential analysis.
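
One way to run such a detector online is a two-sided CUSUM (cumulative sum) scheme. CUSUM is not named in the text above; it is used here only as a familiar example of a sequential change detector. The sketch assumes the in-control mean is known, and the drift and threshold values are arbitrary choices for illustration.

```python
import random

def cusum_alarm(samples, target_mean, drift, threshold):
    """Return the index of the first alarm for a shift in mean, or None if none occurs."""
    pos = neg = 0.0
    for i, x in enumerate(samples):
        # Accumulate evidence for an upward and a downward shift separately;
        # each sum is reset to zero whenever it would become negative.
        pos = max(0.0, pos + (x - target_mean - drift))
        neg = max(0.0, neg + (target_mean - x - drift))
        if pos > threshold or neg > threshold:
            return i  # raise the alert as soon as either sum crosses the threshold
    return None

# Noisy signal whose mean steps from 0 to 2 at index 100 (simulated data).
rng = random.Random(0)
signal = [rng.gauss(0.0, 1.0) for _ in range(100)] + [rng.gauss(2.0, 1.0) for _ in range(100)]
alarm = cusum_alarm(signal, target_mean=0.0, drift=1.0, threshold=5.0)
print(alarm)
```

A larger threshold lowers the false alarm rate at the cost of a longer delay before a real step is flagged.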

Bias

Trials that are terminated early because they reject the null hypothesis typically overestimate the true effect size.^[12] This is because in small samples, only large effect size estimates will lead to a significant effect, and the subsequent termination of a trial. Methods to correct effect size estimates in single trials have been proposed.^[13] Note that this bias is mainly problematic when interpreting single studies. In meta-analyses, overestimated effect sizes due to early stopping are balanced by underestimation in trials that stop late, leading Schou & Marschner to conclude that "early stopping of clinical trials is not a substantive source of bias in meta-analyses".^[14]
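
The overestimation in early-stopped trials can be seen in a small simulation. The sketch below is illustrative only: it assumes a true group difference of 0.3 standard deviations, normally distributed outcomes with known unit variance, and a first interim look after 20 subjects per group, using 2.413 (the tabulated Pocock constant for five looks at a two-sided alpha of 0.05) as the critical value. It then averages the estimated difference over only those simulated trials that stop at the first look.

```python
import random

def mean_estimate_at_early_stop(true_effect=0.3, n_per_stage=20, z_crit=2.413,
                                n_trials=5000, seed=0):
    """Average estimated difference among trials that reject at the first look."""
    rng = random.Random(seed)
    se = (2.0 / n_per_stage) ** 0.5  # standard error of the difference in means
    estimates = []
    for _ in range(n_trials):
        mean_a = sum(rng.gauss(true_effect, 1.0) for _ in range(n_per_stage)) / n_per_stage
        mean_b = sum(rng.gauss(0.0, 1.0) for _ in range(n_per_stage)) / n_per_stage
        diff = mean_a - mean_b
        if abs(diff / se) > z_crit:
            estimates.append(diff)  # this trial would be terminated early
    if not estimates:
        return None, 0
    return sum(estimates) / len(estimates), len(estimates)

early_mean, n_early = mean_estimate_at_early_stop()
print(early_mean, n_early)
```

With these settings a trial can only stop at the first look if the observed difference exceeds roughly 0.76 in absolute value, so the average estimate among early-stopped trials is well above the true value of 0.3.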

The meaning of p-values also changes in sequential analyses. Because more than one analysis is performed, the usual definition of a p-value as the probability of data “at least as extreme” as those observed needs to be redefined. One solution is to order the outcomes of a series of sequential tests based on the time of stopping and how high the test statistic was at a given look, which is known as stagewise ordering,^[15] first proposed by Armitage.

Notes

↑ Wald, Abraham (June 1945). "Sequential Tests of Statistical Hypotheses". The Annals of Mathematical Statistics. 16 (2): 117–186. JSTOR 2235829. doi:10.1214/aoms/1177731118.
↑ Berger, James (2008). "Sequential Analysis". The New Palgrave Dictionary of Economics, 2nd Ed. doi:10.1057/9780230226203.1513.
↑
↑ Kenneth J. Arrow, David Blackwell and M.A. Girshick (1949). "Bayes and minimax solutions of sequential decision problems". Econometrica. 17 (3/4): 213–244. JSTOR 1905525. doi:10.2307/1905525.
↑ Randell, Brian (1980), "The Colossus", A History of Computing in the Twentieth Century, p. 30.
↑ Jennison, Christopher; Turnbull, Bruce W. (2000-01-01). Group sequential methods with applications to clinical trials. Chapman & Hall. ISBN 9780849303166. OCLC 900071609.
↑ O'Brien, Peter C.; Fleming, Thomas R. (1979-01-01). "A Multiple Testing Procedure for Clinical Trials". Biometrics. 35 (3): 549–556. doi:10.2307/2530245.
↑ Wang, Samuel K.; Tsiatis, Anastasios A. (1987-01-01). "Approximately Optimal One-Parameter Boundaries for Group Sequential Trials". Biometrics. 43 (1): 193–199. doi:10.2307/2531959.
↑ DeMets, David L.; Lan, K. K. Gordon (1994-07-15). "Interim analysis: The alpha spending function approach". Statistics in Medicine. 13 (13-14): 1341–1352. ISSN 1097-0258. doi:10.1002/sim.4780131308.
↑ Korosteleva, Olga (2008). Clinical Statistics: Introducing Clinical Trials, Survival Analysis, and Longitudinal Data Analysis (First ed.). Jones and Bartlett Publishers. ISBN 0-7637-5850-7.
↑ Ghosh, B. K.; Sen, P. K. (1991). Handbook of Sequential Analysis. New York: Marcel Dekker. ISBN 9780824784089.
↑ Proschan, Michael A.; Lan, K. K. Gordon; Wittes, Janet Turk (2007-01-01). Statistical monitoring of clinical trials: a unified approach. Springer. ISBN 9780387300597. OCLC 553888945.
↑ Liu, A.; Hall, W. J. (1999-03-01). "Unbiased estimation following a group sequential test". Biometrika. 86 (1): 71–78. ISSN 0006-3444. doi:10.1093/biomet/86.1.71.
↑ Schou, I. Manjula; Marschner, Ian C. (2013-12-10). "Meta-analysis of clinical trials with early stopping: an investigation of potential bias". Statistics in Medicine. 32 (28): 4859–4874. ISSN 1097-0258. doi:10.1002/sim.5893.
↑ Proschan, Michael A.; Lan, K. K. Gordon; Wittes, Janet Turk (2007-01-01). Statistical monitoring of clinical trials: a unified approach. Springer. ISBN 9780387300597. OCLC 553888945.

References

Wald, Abraham (1947). Sequential Analysis. New York: John Wiley and Sons.
Bartroff, J., Lai, T.L., and Shih, M.-C. (2013). Sequential Experimentation in Clinical Trials: Design and Analysis. Springer.
Ghosh, Bhaskar Kumar (1970). Sequential Tests of Statistical Hypotheses. Reading: Addison-Wesley.
Chernoff, Herman (1972). Sequential Analysis and Optimal Design. SIAM.
Siegmund, David (1985). Sequential Analysis. Springer Series in Statistics. New York: Springer-Verlag. ISBN 0-387-96134-8.
Bakeman, R. and Gottman, J.M. (1997). Observing Interaction: An Introduction to Sequential Analysis. Cambridge: Cambridge University Press.
Jennison, C. and Turnbull, B.W. (2000). Group Sequential Methods With Applications to Clinical Trials. Chapman & Hall/CRC.
Whitehead, J. (1997). The Design and Analysis of Sequential Clinical Trials, 2nd Edition. John Wiley & Sons.

External links

R Package: Wald's Sequential Probability Ratio Test by OnlineMarketr.com
Sequential Analysis: Design Methods & Applications Journal
Course given by Rebecca Betensky at Harvard University, lecture note slides
Software for conducting sequential analysis and applications of sequential analysis in the study of group interaction in computer-mediated communication by Dr. Allan Jeong at Florida State University

Commercial

PASS Sample Size Software includes features for the setup of group sequential designs.

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.