Basu's theorem

In statistics, Basu's theorem states that any boundedly complete sufficient statistic is independent of any ancillary statistic. This is a 1955 result of Debabrata Basu.^[1]

It is often used in statistics as a tool to prove independence of two statistics, by first demonstrating that one is complete sufficient and the other is ancillary, then appealing to the theorem.^[2] An example is the proof that the sample mean and sample variance of a sample from a normal distribution are independent statistics, given in the Example section below. This property (independence of sample mean and sample variance) characterizes normal distributions.

Statement

Let (P_θ; θ ∈ Θ) be a family of distributions on a measurable space (X, Σ). If T is a boundedly complete sufficient statistic for θ and A is ancillary to θ, then T is independent of A under every P_θ.

Proof

Let P_θ^T and P_θ^A be the marginal distributions of T and A respectively. For any measurable set B in the range of A:

P_\theta^A(B) = P_\theta (A^{-1} B) = \int_{T(X)} P_\theta(A^{-1}B | T=t) \ P_\theta^T (dt)

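The measure P_θ^A does not depend on θ because A is ancillary. Likewise, P_θ(·|T = t) does not depend on θ because T is sufficient. Dropping the subscript θ from both and moving the constant P^A(B) inside the integral (P_θ^T has total mass 1) gives:

\int_{T(X)} \big[ P(A^{-1}B | T=t) - P^A(B) \big] \ P_\theta^T (dt) = 0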

Note that the integrand (the function inside the integral) is a bounded function of t that does not depend on θ, and that the integral vanishes for every θ. Therefore, since T is boundedly complete:

P(A^{-1}B | T=t) = P^A(B) \quad \text{for almost all }t

Therefore, A is independent of T.
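
One way to spell out this last step is to integrate the constant conditional probability over an arbitrary measurable set C in the range of T (the symbol C is introduced here only for illustration and does not appear in the original argument):

P_\theta(A \in B,\ T \in C) = \int_{C} P(A^{-1}B | T=t) \ P_\theta^T (dt) = P^A(B) \int_{C} P_\theta^T (dt) = P^A(B) \ P_\theta^T(C)

The joint probability therefore factors into the product of the two marginal probabilities for every θ, which is the definition of independence.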

Example

Independence of sample mean and sample variance of a normal distribution

Let X₁, X₂, ..., X_n be independent, identically distributed normal random variables with mean μ and variance σ².

Then, with σ² held fixed and μ treated as the only unknown parameter, one can show that

\widehat{\mu}=\bar{X}=\frac{\sum X_i}{n},

the sample mean, is a complete sufficient statistic for μ (it captures all the information in the sample that is relevant to estimating μ, and no more), and

\widehat{\sigma}^2=\frac{\sum \left(X_i-\bar{X}\right)^2}{n-1},

the sample variance, is an ancillary statistic: its distribution does not depend on μ, because adding a constant to every X_i leaves each difference X_i - \bar{X} unchanged.

Therefore, by Basu's theorem, these two statistics are independent. Since the argument holds for every fixed value of σ², the independence holds for every normal distribution.

This independence result can also be proven by Cochran's theorem.

Further, this property (that the sample mean and sample variance of the normal distribution are independent) characterizes the normal distribution – no other distribution has this property.^[3]
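
The independence claim and its failure outside the normal family can be illustrated numerically. The following is a minimal simulation sketch, not part of the cited sources: it estimates the correlation between the sample mean and the sample variance over many repeated samples, once for normal data and once for exponential data. The function name `mean_variance_correlation`, the sample size, the number of repetitions, the seed and the distribution parameters are all arbitrary choices made for illustration. Zero correlation is only a necessary condition for independence, so the normal case is consistent with the theorem rather than a proof of it.

```python
import random


def mean_variance_correlation(draw, n=10, reps=20000):
    """Estimate corr(sample mean, sample variance) over `reps` samples of size n.

    `draw` is a zero-argument function returning one observation.
    """
    means, variances = [], []
    for _ in range(reps):
        sample = [draw() for _ in range(n)]
        m = sum(sample) / n
        # Unbiased sample variance (divides by n - 1), as in the article.
        v = sum((x - m) ** 2 for x in sample) / (n - 1)
        means.append(m)
        variances.append(v)

    m_bar = sum(means) / reps
    v_bar = sum(variances) / reps
    cov = sum((m - m_bar) * (v - v_bar) for m, v in zip(means, variances))
    ss_m = sum((m - m_bar) ** 2 for m in means)
    ss_v = sum((v - v_bar) ** 2 for v in variances)
    return cov / (ss_m * ss_v) ** 0.5


rng = random.Random(0)  # fixed seed so the run is reproducible

# Normal data: mean 3, standard deviation 2 (illustrative values).
corr_normal = mean_variance_correlation(lambda: rng.gauss(3.0, 2.0))

# Exponential data with rate 1: a skewed, non-normal distribution.
corr_exponential = mean_variance_correlation(lambda: rng.expovariate(1.0))

print(f"normal:      {corr_normal:+.3f}")
print(f"exponential: {corr_exponential:+.3f}")
```

Under these assumptions the estimated correlation for normal data should be close to zero, while for exponential data it should be clearly positive, since a skewed distribution makes large sample means go together with large sample variances.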

Notes

1. Basu (1955).
2. Ghosh, Malay; Mukhopadhyay, Nitis; Sen, Pranab Kumar (2011), Sequential Estimation, Wiley Series in Probability and Statistics 904, John Wiley & Sons, p. 80, ISBN 9781118165911: "The following theorem, due to Basu ... helps us in proving independence between certain types of statistics, without actually deriving the joint and marginal distributions of the statistics involved. This is a very powerful tool and it is often used ..."
3. Geary, R.C. (1936). "The Distribution of the "Student's" Ratio for the Non-Normal Samples". Supplement to the Journal of the Royal Statistical Society 3 (2): 178–184. doi:10.2307/2983669. JFM 63.1090.03. JSTOR 2983669.

References

Basu, D. (1955). "On Statistics Independent of a Complete Sufficient Statistic". Sankhyā 15 (4): 377–380. JSTOR 25048259. MR 74745. Zbl 0068.13401.
Mukhopadhyay, Nitis (2000). Probability and Statistical Inference. Statistics: A Series of Textbooks and Monographs. 162. Florida: CRC Press USA. ISBN 0-8247-0379-0.
Boos, Dennis D.; Hughes-Oliver, Jacqueline M. (Aug 1998). "Applications of Basu's Theorem". The American Statistician (Boston: American Statistical Association) 52 (3): 218–221. doi:10.2307/2685927. JSTOR 2685927. MR 1650407.
Ghosh, Malay (October 2002). "Basu's Theorem with Applications: A Personalistic Review". Sankhyā: the Indian Journal of Statistics, Series A 64 (3): 509–531. JSTOR 25051412. MR 1985397.
