Frequentist probability

From Wikipedia, the free encyclopedia

Frequentist probability or frequentism is a standard interpretation of probability; it defines an event's probability as the limit of its relative frequency in a large number of trials.

The development of the frequentist account was motivated by the problems and paradoxes of the previously dominant viewpoint, the classical interpretation. In the classical interpretation, probability was defined in terms of the principle of indifference, based on the natural symmetry of a problem; for example, the probabilities in dice games arise from the natural six-sided symmetry of the cube. The classical interpretation stumbled on any statistical problem that offers no natural symmetry to reason from.

The shift from the classical view to the frequentist view represents a paradigm shift in the progression of statistical thought.^{[citation needed]} This school is often associated with the names of Jerzy Neyman and Egon Pearson, who described the logic of statistical hypothesis testing.^{[citation needed]} Other influential figures of the frequentist school include John Venn, R.A. Fisher, and Richard von Mises.^[1]

Definition

In the frequentist interpretation, probabilities are discussed only when dealing with well-defined random experiments.^{[citation needed]} The set of all possible outcomes of a random experiment is called the sample space of the experiment. An event is defined as a particular subset of the sample space to be considered. For any given event, only one of two possibilities may hold: it occurs or it does not. The relative frequency of occurrence of an event, observed in a number of repetitions of the experiment, is a measure of the probability of that event. This is the core conception of probability in the frequentist interpretation.

Thus, if $n_{t}$ is the total number of trials and $n_{x}$ is the number of trials where the event $x$ occurred, the probability $P(x)$ of the event occurring will be approximated by the relative frequency as follows:

$P(x)\approx {\frac {n_{x}}{n_{t}}}.$

As the number of trials increases, one might expect the relative frequency to become a better approximation of the "true" probability.
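The approximation $P(x)\approx n_{x}/n_{t}$ can be illustrated with a small simulation. The following is a minimal sketch, assuming a fair six-sided die as the random experiment and "a six is rolled" as the event; the function names `relative_frequency`, `roll_die` and `is_six` are hypothetical and chosen only for this illustration.

```python
import random


def relative_frequency(event, experiment, n_trials, rng):
    """Return n_x / n_t: the fraction of n_trials in which the event occurred."""
    n_x = sum(1 for _ in range(n_trials) if event(experiment(rng)))
    return n_x / n_trials


def roll_die(rng):
    """One trial of the assumed experiment: roll a fair six-sided die."""
    return rng.randint(1, 6)


def is_six(outcome):
    """The assumed event x: the die shows a six."""
    return outcome == 6


# Print the relative frequency for increasing numbers of trials.
rng = random.Random(0)  # fixed seed so that repeated runs give the same numbers
for n_t in (10, 100, 1_000, 10_000, 100_000):
    print(n_t, relative_frequency(is_six, roll_die, n_t, rng))
```

For small $n_{t}$ the printed values typically scatter widely; for large $n_{t}$ they tend to settle near 1/6, although any single finite run can still deviate from it.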

A controversial^{[citation needed]} claim of the frequentist approach is that in the "long run," as the number of trials approaches infinity, the relative frequency will converge exactly to the true probability:^[2]

$P(x)=\lim _{{n_{t}\rightarrow \infty }}{\frac {n_{x}}{n_{t}}}.$

Such a limit can be evaluated only in theory (e.g. when counting the relative fraction of even numbers less than $n_{t}$, one may easily compute the limit as $n_{t}\to \infty$); no finite sequence of actual trials can establish it. This conflicts with the standard claim^{[citation needed]} that the frequency interpretation is somehow more "objective" than other theories of probability.
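The even-number example can be worked through exactly. The sketch below assumes that "numbers less than $n_{t}$" means the positive integers $1, 2, \ldots, n_{t}-1$; the function name `even_fraction` is hypothetical. Under that assumption the fraction of even numbers differs from 1/2 by at most $1/(2(n_{t}-1))$, so the limit as $n_{t}\to \infty$ is 1/2.

```python
from fractions import Fraction


def even_fraction(n_t):
    """Exact fraction of the positive integers below n_t that are even."""
    if n_t < 2:
        raise ValueError("n_t must be at least 2")
    total = n_t - 1         # the integers 1, 2, ..., n_t - 1
    evens = (n_t - 1) // 2  # the even integers 2, 4, ... below n_t
    return Fraction(evens, total)


# The gap to 1/2 shrinks as n_t grows (and is zero whenever n_t is odd).
for n_t in (10, 100, 1_000, 10_000):
    gap = abs(even_fraction(n_t) - Fraction(1, 2))
    print(n_t, even_fraction(n_t), float(gap))
```

Here the limit is obtained by calculation on a fully specified sequence, not by observation, which is the sense in which it is available "in theory".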

Scope

The frequentist interpretation is a philosophical approach to the definition and use of probabilities; it is one of several, and, historically, the earliest to challenge the classical interpretation.^{[citation needed]} It does not claim to capture all connotations of the concept 'probable' in colloquial speech of natural languages.

As an interpretation, it is not in conflict with the mathematical axiomatization of probability theory; rather, it provides guidance for how to apply mathematical probability theory to real-world situations. It offers distinct guidance in the construction and design of practical experiments, especially when contrasted with the Bayesian interpretation. Whether this guidance is useful, or is apt to misinterpretation, has been a source of controversy, particularly when the frequency interpretation of probability is mistakenly assumed to be the only possible basis for frequentist inference. For example, a list of misinterpretations of the meaning of p-values accompanies the article on p-values; controversies are detailed in the article on statistical hypothesis testing. The Jeffreys–Lindley paradox shows how different interpretations, applied to the same data set, can lead to different conclusions about the 'statistical significance' of a result.^{[citation needed]}
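The divergence described by the Jeffreys–Lindley paradox can be sketched numerically. The example below is an illustration under stated assumptions, not a definitive treatment: the counts (49,581 successes in 98,451 trials) are illustrative values assumed for this sketch, the null hypothesis is $\theta = 0.5$, the frequentist side uses a normal approximation to the binomial, and the Bayesian side assumes equal prior weight on both hypotheses with a uniform prior on $\theta$ under the alternative. The function names are hypothetical.

```python
import math


def two_sided_p_value(k, n, theta0=0.5):
    """Normal-approximation two-sided p-value for H0: theta = theta0."""
    mean = n * theta0
    sd = math.sqrt(n * theta0 * (1 - theta0))
    z = (k - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))


def posterior_h0(k, n, theta0=0.5, prior_h0=0.5):
    """Posterior probability of H0, with H1: theta uniform on [0, 1] (assumed prior)."""
    # Exact binomial likelihood under H0, computed in log space to avoid overflow.
    log_binom = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    like_h0 = math.exp(
        log_binom + k * math.log(theta0) + (n - k) * math.log(1 - theta0)
    )
    # Integrating the binomial likelihood over a uniform prior gives 1 / (n + 1).
    like_h1 = 1 / (n + 1)
    return prior_h0 * like_h0 / (prior_h0 * like_h0 + (1 - prior_h0) * like_h1)


k, n = 49_581, 98_451  # illustrative counts, assumed for this sketch
print("two-sided p-value:", two_sided_p_value(k, n))
print("posterior P(H0 | data):", posterior_h0(k, n))
```

With these assumed inputs the p-value falls below the conventional 0.05 threshold while the posterior probability of the null hypothesis stays above 0.9, so the two analyses point in opposite directions for the same data. The outcome depends on the prior chosen for the alternative, which is part of what makes the paradox contentious.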

As William Feller noted:^[3]

There is no place in our system for speculations concerning the probability that the sun will rise tomorrow. Before speaking of it we should have to agree on an (idealized) model which would presumably run along the lines "out of infinitely many worlds one is selected at random..." Little imagination is required to construct such a model, but it appears both uninteresting and meaningless.

History

Main article: History of probability

The frequentist view was arguably^{[citation needed]} foreshadowed by Aristotle, in Rhetoric,^[4] when he wrote:

the probable is that which for the most part happens^[5]

It was given explicit statement by Robert Leslie Ellis in "On the Foundations of the Theory of Probabilities"^[6] read on 14 February 1842,^[4] (and much later again in "Remarks on the Fundamental Principles of the Theory of Probabilities"^[7]). Antoine Augustin Cournot presented the same conception in 1843, in Exposition de la théorie des chances et des probabilités.^[8]

Perhaps the first elaborate and systematic exposition^{[citation needed]} was by John Venn,^[9] in The Logic of Chance: An Essay on the Foundations and Province of the Theory of Probability (published editions in 1866, 1876, 1888).

Etymology

According to the Oxford English Dictionary, the term 'frequentist' was first used by M. G. Kendall in 1949, to contrast with Bayesians, whom he called "non-frequentists".^[10]^[11] He observed:

3....we may broadly distinguish two main attitudes. One takes probability as 'a degree of rational belief', or some similar idea...the second defines probability in terms of frequencies of occurrence of events, or by relative proportions in 'populations' or 'collectives'; (p. 101)

...

12. It might be thought that the differences between the frequentists and the non-frequentists (if I may call them such) are largely due to the differences of the domains which they purport to cover. (p. 104)

...

I assert that this is not so ... The essential distinction between the frequentists and the non-frequentists is, I think, that the former, in an effort to avoid anything savouring of matters of opinion, seek to define probability in terms of the objective properties of a population, real or hypothetical, whereas the latter do not. [emphasis in original]

Alternative views

Main article: Probability interpretations

The frequentist interpretation does resolve difficulties with the classical interpretation, such as problems in which the natural symmetry of outcomes is not known. It does not address other issues, such as the Dutch book. Propensity probability is an alternative physicalist approach.^{[citation needed]}

Notes

↑ The Frequency theory Chapter 5; discussed in Donald Gillies, Philosophical theories of probability (2000), Psychology Press. ISBN 9780415182751, p. 88.
↑ von Mises, Richard (1939) Probability, Statistics, and Truth (in German) (English translation, 1981: Dover Publications; 2 Revised edition. ISBN 0486242145) (p.14)
↑ William Feller (1957), An Introduction to Probability Theory and Its Applications, Vol. 1, page 4
↑ 4.0 4.1 Keynes, John Maynard; A Treatise on Probability (1921), Chapter VIII “The Frequency Theory of Probability”.
↑ Rhetoric Bk 1 Ch 2; discussed in J. Franklin, The Science of Conjecture: Evidence and Probability Before Pascal (2001), The Johns Hopkins University Press. ISBN 0801865697, p. 110.
↑ Ellis, Robert Leslie (1843) “On the Foundations of the Theory of Probabilities”, Transactions of the Cambridge Philosophical Society vol 8
↑ Ellis, Robert Leslie (1854) “Remarks on the Fundamental Principles of the Theory of Probabilities”, Transactions of the Cambridge Philosophical Society vol 9
↑ Cournot, Antoine Augustin (1843) Exposition de la théorie des chances et des probabilités. L. Hachette, Paris. archive.org
↑ Venn, John (1888) The Logic of Chance, 3rd Edition archive.org. Full title: The Logic of Chance: An essay on the foundations and province of the theory of probability, with especial reference to its logical bearings and its application to Moral and Social Science, and to Statistics, Macmillan & Co, London
↑ Earliest Known Uses of Some of the Words of Probability & Statistics
↑ Kendall, Maurice George (1949). "On the Reconciliation of Theories of Probability". Biometrika (Biometrika Trust) 36 (1/2): 101–116. doi:10.1093/biomet/36.1-2.101. JSTOR 2332534.

References

P W Bridgman, The Logic of Modern Physics, 1927
Alonzo Church, The Concept of a Random Sequence, 1940
Harald Cramér, Mathematical Methods of Statistics, 1946
William Feller, An introduction to Probability Theory and its Applications, 1957
P Martin-Löf, On the Concept of a Random Sequence, 1966
Richard von Mises, Probability, Statistics, and Truth, 1939 (German original 1928)
Jerzy Neyman, First Course in Probability and Statistics, 1950
Hans Reichenbach, The Theory of Probability, 1949 (German original 1935)
Bertrand Russell, Human Knowledge, 1948
Friedman, C. (1999). "The Frequency Interpretation in Probability". Advances in Applied Mathematics 23 (3): 234–254. doi:10.1006/aama.1999.0653.

This article is issued from Wikipedia. The text is available under the Creative Commons Attribution/Share Alike; additional terms may apply for the media files.