Frequentist probability

Frequentist probability or frequentism is a standard interpretation of probability; it defines an event's probability as the limit of its relative frequency in a large number of trials. This interpretation supports the statistical needs of experimental scientists and pollsters; probabilities can be found (in principle) by a repeatable objective process (and are thus ideally devoid of opinion). It does not support all needs; gamblers typically require estimates of the odds without experiments.

The development of the frequentist account was motivated by the problems and paradoxes of the previously dominant viewpoint, the classical interpretation. In the classical interpretation, probability was defined in terms of the principle of indifference, based on the natural symmetry of a problem. For example, the probabilities of dice games arise from the natural symmetric six-sidedness of the cube. The classical interpretation stumbled at any statistical problem that offers no natural symmetry to reason from.

Definition

In the frequentist interpretation, probabilities are discussed only when dealing with well-defined random experiments (or random samples).^[1] The set of all possible outcomes of a random experiment is called the sample space of the experiment. An event is defined as a particular subset of the sample space to be considered. For any given event, only one of two possibilities may hold: it occurs or it does not. The relative frequency of occurrence of an event, observed in a number of repetitions of the experiment, is a measure of the probability of that event. This is the core conception of probability in the frequentist interpretation.

Thus, if $n_t$ is the total number of trials and $n_x$ is the number of trials where the event $x$ occurred, the probability $P(x)$ of the event occurring will be approximated by the relative frequency as follows:

$$P(x) \approx \frac{n_x}{n_t}.$$

As the number of trials increases, the relative frequency is expected to become a better approximation of the "true" probability.

A claim of the frequentist approach is that in the "long run," as the number of trials approaches infinity, the relative frequency will converge exactly to the true probability:^[2]

$$P(x) = \lim_{n_t\rightarrow \infty}\frac{n_x}{n_t}.$$
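The convergence claim can be illustrated numerically. The following is only a minimal sketch and does not come from the cited sources: it assumes a simulated fair six-sided die as the random experiment, and the names `SAMPLE_SPACE`, `EVENT_EVEN` and `relative_frequency` are hypothetical. A pseudo-random generator merely imitates a random experiment, and a finite run can suggest the limit but cannot establish it.

```python
import random

# Assumed random experiment: one roll of a fair six-sided die.
SAMPLE_SPACE = {1, 2, 3, 4, 5, 6}
# An event is a subset of the sample space; here, "the roll is even".
EVENT_EVEN = {2, 4, 6}


def relative_frequency(event, n_t, seed=0):
    """Return n_x / n_t for `event` over n_t simulated rolls."""
    rng = random.Random(seed)  # seeded so that a run is repeatable
    outcomes = sorted(SAMPLE_SPACE)
    # n_x counts the trials in which the event occurred.
    n_x = sum(1 for _ in range(n_t) if rng.choice(outcomes) in event)
    return n_x / n_t


if __name__ == "__main__":
    # Print the relative frequency for an increasing number of trials.
    for n_t in (10, 100, 10_000, 100_000):
        print(n_t, relative_frequency(EVENT_EVEN, n_t))
```

For this assumed experiment the printed ratios should tend toward 1/2 as `n_t` grows, although no particular finite run is guaranteed to be close.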

Scope

The frequentist interpretation is a philosophical approach to the definition and use of probabilities; it is one of several such approaches. It does not claim to capture all connotations of the concept 'probable' in colloquial speech of natural languages.

As an interpretation, it is not in conflict with the mathematical axiomatization of probability theory; rather, it provides guidance for how to apply mathematical probability theory to real-world situations. It offers distinct guidance in the construction and design of practical experiments, especially when contrasted with the Bayesian interpretation. Whether this guidance is useful, or is apt to be misinterpreted, has been a source of controversy, particularly when the frequency interpretation of probability is mistakenly assumed to be the only possible basis for frequentist inference. For example, a list of misinterpretations of the meaning of p-values accompanies the article on p-values; controversies are detailed in the article on statistical hypothesis testing. The Jeffreys–Lindley paradox shows how different interpretations, applied to the same data set, can lead to different conclusions about the 'statistical significance' of a result.

As William Feller noted:^[3]

There is no place in our system for speculations concerning the probability that the sun will rise tomorrow. Before speaking of it we should have to agree on an (idealized) model which would presumably run along the lines "out of infinitely many worlds one is selected at random..." Little imagination is required to construct such a model, but it appears both uninteresting and meaningless.

Feller's comment was a criticism of Laplace, who published a solution to the sunrise problem using an alternative probability interpretation. Despite Laplace's explicit and immediate disclaimer in the source, based on his expertise in astronomy as well as probability, two centuries of criticism have followed.

History

Main article: History of probability

The frequentist view may have been foreshadowed by Aristotle, in Rhetoric,^[4] when he wrote:

the probable is that which for the most part happens^[5]

Poisson clearly distinguished between objective and subjective probabilities in 1837.^[6] Soon thereafter a flurry of nearly simultaneous publications by Mill, Ellis ("On the Foundations of the Theory of Probabilities"^[7] and "Remarks on the Fundamental Principles of the Theory of Probabilities"^[8]), Cournot (Exposition de la théorie des chances et des probabilités)^[9] and Fries introduced the frequentist view. Venn provided a thorough exposition (The Logic of Chance: An Essay on the Foundations and Province of the Theory of Probability (published editions in 1866, 1876, 1888))^[10] two decades later. These were further supported by the publications of Boole and Bertrand. By the end of the 19th century the frequentist interpretation was well established and perhaps dominant in the sciences.^[6] The following generation established the tools of classical inferential statistics (significance testing, hypothesis testing and confidence intervals) all based on frequentist probability.

Alternatively,^[11] Jacob Bernoulli (also known as James or Jacques) understood the concept of frequentist probability and published a critical proof (the weak law of large numbers) posthumously in 1713. He is also credited with some appreciation for subjective probability (prior to and without Bayes' theorem).^[12]^[13] Gauss and Laplace used frequentist (and other) probability in derivations of the least squares method a century later, a generation before Poisson.^[14] Laplace considered the probabilities of testimonies, tables of mortality, judgments of tribunals, etc., which are unlikely candidates for classical probability. In this view, Poisson's contribution was his sharp criticism of the alternative "inverse" (subjective, Bayesian) probability interpretation. Any criticism by Gauss and Laplace was muted and implicit. (Their later derivations did not use inverse probability.)

Major contributors to "classical" statistics in the early 20th century included Fisher, Neyman and Pearson. Fisher contributed to most of statistics and made significance testing the core of experimental science; Neyman formulated confidence intervals and contributed heavily to sampling theory; Neyman and Pearson paired in the creation of hypothesis testing. All valued objectivity, so the best interpretation of probability available to them was frequentist. All were suspicious of "inverse probability" (the available alternative), with prior probabilities chosen using the principle of indifference. Fisher said, "...the theory of inverse probability is founded upon an error, [referring to Bayes' theorem] and must be wholly rejected." (from his Statistical Methods for Research Workers). While Neyman was a pure frequentist,^[1] Fisher's views of probability were unique; both had a nuanced view of probability. Von Mises offered a combination of mathematical and philosophical support for frequentism in the era.^[2]^[15]

Etymology

According to the Oxford English Dictionary, the term 'frequentist' was first used by M. G. Kendall in 1949, to contrast with Bayesians, whom he called "non-frequentists".^[16]^[17] He observed:

3....we may broadly distinguish two main attitudes. One takes probability as 'a degree of rational belief', or some similar idea...the second defines probability in terms of frequencies of occurrence of events, or by relative proportions in 'populations' or 'collectives'; (p. 101)

...

12. It might be thought that the differences between the frequentists and the non-frequentists (if I may call them such) are largely due to the differences of the domains which they purport to cover. (p. 104)

...

I assert that this is not so ... The essential distinction between the frequentists and the non-frequentists is, I think, that the former, in an effort to avoid anything savouring of matters of opinion, seek to define probability in terms of the objective properties of a population, real or hypothetical, whereas the latter do not. [emphasis in original]

"The Frequency Theory of Probability" was used a generation earlier as a chapter title in Keynes (1921).^[4]

The historical sequence was as follows: probability concepts were introduced and much of probability mathematics was derived (prior to the 20th century); classical statistical inference methods were then developed; finally, the mathematical foundations of probability were solidified and current terminology was introduced (all in the 20th century). The primary historical sources in probability and statistics did not use the current terminology of classical, subjective (Bayesian) and frequentist probability.

Alternative views

Main article: Probability interpretations

Probability theory is a branch of mathematics. While its roots reach centuries into the past, it reached maturity with the axioms of Andrey Kolmogorov in 1933. The theory focuses on the valid operations on probability values rather than on the initial assignment of values; the mathematics is largely independent of any interpretation of probability.

Applications and interpretations of probability are considered by philosophy, the sciences and statistics. All are interested in the extraction of knowledge from observations, that is, inductive reasoning. There are a variety of competing interpretations,^[18] and all have problems. Major interpretations include classical probability, subjective probability and frequency interpretations.

Classical probability assigns probabilities based on idealized physical symmetry (dice, coins, cards). The classical definition is at risk of circularity: probabilities are defined by assuming equality of probabilities.^[19] In the absence of symmetry the utility of the definition is limited.
Subjective probability (a family of competing interpretations) considers degrees of belief. All practical "subjective" probability interpretations are so constrained to rationality as to avoid most subjectivity. Real subjectivity is repellent to the sciences, which strive for results independent of the observer and analyst. The historical roots of this concept extended to such non-numeric applications as legal evidence.
Frequency interpretations are empirical: they are defined by a ratio from an infinite series of trials. This is a very natural interpretation for scientific experiments. Mathematicians are dubious of the convergence properties of the non-mathematical series.^[19]

The frequentist interpretation does resolve difficulties with the classical interpretation, such as any problem where the natural symmetry of outcomes is not known. It does not address other issues, such as the Dutch book. Propensity probability is an alternative physicalist approach.^[18]
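The contrast between the classical and the frequency approach when symmetry fails can be sketched in code. This is an illustrative assumption rather than an example from the cited sources: the loaded-die weights in `LOADED_WEIGHTS` and the function names are hypothetical. The classical rule counts equally likely outcomes and so returns 1/6 for any face, whereas the relative frequency is estimated from simulated trials and involves no symmetry assumption.

```python
import random
from fractions import Fraction


def classical_probability(event, sample_space):
    """Classical rule: favourable outcomes / equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))


def frequency_estimate(event, faces, weights, n_t, seed=1):
    """Frequentist estimate n_x / n_t from n_t simulated rolls."""
    rng = random.Random(seed)  # seeded so that a run is repeatable
    rolls = rng.choices(faces, weights=weights, k=n_t)
    n_x = sum(1 for roll in rolls if roll in event)
    return n_x / n_t


FACES = [1, 2, 3, 4, 5, 6]
# Hypothetical loaded die: face 6 is three times as likely as any other face,
# so the simulated probability of a six is 3/8 rather than 1/6.
LOADED_WEIGHTS = [1, 1, 1, 1, 1, 3]
SIX = {6}

if __name__ == "__main__":
    print("classical (assumes symmetry):",
          classical_probability(SIX, set(FACES)))
    print("relative frequency (loaded die):",
          frequency_estimate(SIX, FACES, LOADED_WEIGHTS, 100_000))
```

In this sketch the weights are known to the simulation but play no part in the estimate itself, which uses only the observed counts.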

Notes

1. Neyman, Jerzy (30 August 1937). "Outline of a Theory of Statistical Estimation Based on the Classical Theory of Probability". Phil. Trans. R. Soc. Lond. A 236: 333–380. doi:10.1098/rsta.1937.0005. Neyman's derivation of confidence intervals embraced the measure theoretic axioms of probability published by Kolmogorov a few years previously and referenced the subjective (Bayesian) probability definitions of Jeffreys published earlier in the decade. Neyman defined frequentist probability (under the name classical) and stated the need for randomness in the repeated samples or trials. He accepted in principle the possibility of multiple competing theories of probability while expressing several specific reservations about the existing alternative probability interpretation.
2. von Mises, Richard (1939). Probability, Statistics, and Truth (in German). English translation, 1981: Dover Publications, 2nd revised edition. ISBN 0486242145. p. 14.
3. Feller, William (1957). An Introduction to Probability Theory and Its Applications, Vol. 1. p. 4.
4. Keynes, John Maynard (1921). A Treatise on Probability. Chapter VIII, "The Frequency Theory of Probability".
5. Rhetoric Bk 1 Ch 2; discussed in J. Franklin, The Science of Conjecture: Evidence and Probability Before Pascal (2001), The Johns Hopkins University Press. ISBN 0801865697. p. 110.
6. Gigerenzer, Gerd; Swijtink, Porter, Daston, Beatty & Krüger (1989). The Empire of Chance: How Probability Changed Science and Everyday Life. Cambridge; New York: Cambridge University Press. pp. 35–6, 45. ISBN 0-521-39838-X.
7. Ellis, Robert Leslie (1843). "On the Foundations of the Theory of Probabilities". Transactions of the Cambridge Philosophical Society, vol. 8.
8. Ellis, Robert Leslie (1854). "Remarks on the Fundamental Principles of the Theory of Probabilities". Transactions of the Cambridge Philosophical Society, vol. 9.
9. Cournot, Antoine Augustin (1843). Exposition de la théorie des chances et des probabilités. L. Hachette, Paris. archive.org
10. Venn, John (1888). The Logic of Chance, 3rd edition. archive.org. Full title: The Logic of Chance: An essay on the foundations and province of the theory of probability, with especial reference to its logical bearings and its application to Moral and Social Science, and to Statistics. Macmillan & Co, London.
11. Hald, Anders (2004). A history of parametric statistical inference from Bernoulli to Fisher, 1713 to 1935. København: Anders Hald, Department of Applied Mathematics and Statistics, University of Copenhagen. pp. 11–12. ISBN 87-7834-628-2.
12. Fienberg, Stephen E. (1992). "A Brief History of Statistics in Three and One-half Chapters: A Review Essay". Statistical Science 7 (2): 208–225. doi:10.1214/ss/1177011360.
13. David, F. N. (1962). Games, Gods & Gambling. New York: Hafner. pp. 137–138. Bernoulli provided a classical example of drawing a large number of black and white pebbles from an urn (with replacement). The sample ratio allowed Bernoulli to infer the ratio in the urn, with tighter bounds as the number of samples increased. Historians can interpret the example as classical, frequentist or subjective probability. David says, "James has definitely started here the controversy on inverse probability..." Bernoulli wrote generations before Bayes, Laplace and Gauss. The controversy continues.
14. Hald, Anders (2004). A history of parametric statistical inference from Bernoulli to Fisher, 1713 to 1935. København: Anders Hald, Department of Applied Mathematics and Statistics, University of Copenhagen. pp. 1–5. ISBN 87-7834-628-2.
15. The Frequency theory, Chapter 5; discussed in Donald Gillies, Philosophical Theories of Probability (2000), Psychology Press. ISBN 9780415182751. p. 88.
16. Earliest Known Uses of Some of the Words of Probability & Statistics.
17. Kendall, Maurice George (1949). "On the Reconciliation of Theories of Probability". Biometrika (Biometrika Trust) 36 (1/2): 101–116. doi:10.1093/biomet/36.1-2.101. JSTOR 2332534.
18. Hájek, Alan. "Interpretations of Probability". In Zalta, Edward N. (ed.), The Stanford Encyclopedia of Philosophy.
19. Ash, Robert B. (1970). Basic Probability Theory. New York: Wiley. pp. 1–2.

References

P W Bridgman, The Logic of Modern Physics, 1927
Alonzo Church, The Concept of a Random Sequence, 1940
Harald Cramér, Mathematical Methods of Statistics, 1946
William Feller, An introduction to Probability Theory and its Applications, 1957
P Martin-Löf, On the Concept of a Random Sequence, 1966
Richard von Mises, Probability, Statistics, and Truth, 1939 (German original 1928)
Jerzy Neyman, First Course in Probability and Statistics, 1950
Hans Reichenbach, The Theory of Probability, 1949 (German original 1935)
Bertrand Russell, Human Knowledge, 1948
Friedman, C. (1999). "The Frequency Interpretation in Probability". Advances in Applied Mathematics 23 (3): 234–254. doi:10.1006/aama.1999.0653.

This article is issued from Wikipedia - version of the Wednesday, February 10, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.