Probability



Probability is ordinarily used to describe an attitude of mind towards some proposition of whose truth we are not certain.^[1] The proposition of interest is usually of the form "Will a specific event occur?" The attitude of mind is of the form "How certain are we that the event will occur?" The certainty we adopt can be described in terms of a numerical measure and this number, between 0 and 1, we call probability.^[2] The higher the probability of an event, the more certain we are that the event will occur. Thus, probability in an applied sense is a measure of the likeliness that a (random) event will occur.

The concept has been given an axiomatic mathematical derivation in probability theory, which is used widely in such areas of study as mathematics, statistics, finance, gambling, science, artificial intelligence/machine learning and philosophy to, for example, draw inferences about the likeliness of events. Probability is used to describe the underlying mechanics and regularities of complex systems.

1 Interpretations
2 Etymology
3 History
4 Theory
5 Applications
6 Mathematical treatment
7 Relation to randomness
8 See also
9 Notes
10 References
11 External links

Interpretations

Main article: Probability interpretations

The word probability does not have a single direct definition for practical application. In fact, there are several broad categories of probability interpretations, whose adherents hold different (and sometimes conflicting) views about the fundamental nature of probability. For example:

Frequentists talk about probabilities only when dealing with experiments that are random and well-defined. The probability of a random event denotes the relative frequency of occurrence of an experiment's outcome, when repeating the experiment. Frequentists consider probability to be the relative frequency "in the long run" of outcomes.^[3]
Subjectivists assign numbers per subjective probability, i.e., as a degree of belief.^[4]
Bayesians include expert knowledge as well as experimental data to produce probabilities. The expert knowledge is represented by a prior probability distribution. The data is incorporated in a likelihood function. The product of the prior and the likelihood, normalized, results in a posterior probability distribution that incorporates all the information known to date.^[5]
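The Bayesian update described above (prior times likelihood, then normalized) can be sketched in a few lines of Python. This is only an illustration: the two coin hypotheses, their prior weights, and the helper name `posterior` are assumptions made for the example, not part of any particular source.

```python
from fractions import Fraction

def posterior(prior, likelihood):
    """Multiply prior by likelihood for each hypothesis, then normalize."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: w / total for h, w in unnormalized.items()}

# Hypothetical setup: a coin is either fair or biased towards heads
# with P(heads) = 3/4, and both hypotheses start out equally plausible.
prior = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}

# Likelihood of observing one head under each hypothesis.
likelihood_heads = {"fair": Fraction(1, 2), "biased": Fraction(3, 4)}

post = posterior(prior, likelihood_heads)
print(post["fair"], post["biased"])  # 2/5 3/5
```

After one observed head the weight shifts towards the biased hypothesis. Feeding `post` back in as the next prior would incorporate further observations one at a time.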

Etymology

The word probability derives from the Latin probabilitas, which can also mean probity, a measure of the authority of a witness in a legal case in Europe, and often correlated with the witness's nobility. This differs considerably from the modern meaning of probability, which, in contrast, is a measure of the weight of empirical evidence, and is arrived at from inductive reasoning and statistical inference.^[6]^[7]

History

The scientific study of probability is a modern development. Gambling shows that there has been an interest in quantifying the ideas of probability for millennia, but exact mathematical descriptions arose much later. There are reasons, of course, for the slow development of the mathematics of probability. Whereas games of chance provided the impetus for the mathematical study of probability, fundamental issues are still obscured by the superstitions of gamblers.^[8]

According to Richard Jeffrey, "Before the middle of the seventeenth century, the term 'probable' (Latin probabilis) meant approvable, and was applied in that sense, univocally, to opinion and to action. A probable action or opinion was one such as sensible people would undertake or hold, in the circumstances."^[9] However, in legal contexts especially, 'probable' could also apply to propositions for which there was good evidence.^[10]

Aside from elementary work by Girolamo Cardano in the 16th century, the doctrine of probabilities dates to the correspondence of Pierre de Fermat and Blaise Pascal (1654). Christiaan Huygens (1657) gave the earliest known scientific treatment of the subject. Jakob Bernoulli's Ars Conjectandi (posthumous, 1713) and Abraham de Moivre's Doctrine of Chances (1718) treated the subject as a branch of mathematics.^[11] See Ian Hacking's The Emergence of Probability and James Franklin's The Science of Conjecture for histories of the early development of the very concept of mathematical probability.

The theory of errors may be traced back to Roger Cotes's Opera Miscellanea (posthumous, 1722), but a memoir prepared by Thomas Simpson in 1755 (printed 1756) first applied the theory to the discussion of errors of observation. The reprint (1757) of this memoir lays down the axioms that positive and negative errors are equally probable, and that certain assignable limits define the range of all errors. Simpson also discusses continuous errors and describes a probability curve.

Pierre-Simon Laplace (1774) first tried to deduce a rule for combining observations from the principles of the theory of probabilities. He represented the law of probability of errors by a curve $y = \phi(x)$, $x$ being any error and $y$ its probability, and laid down three properties of this curve:

It is symmetric as to the $y$-axis;
The $x$-axis is an asymptote, the probability of the error $\infty$ being 0;
The area enclosed is 1, it being certain that an error exists.

He also provided, in 1781, a formula for the law of facility of error (a term Lagrange used in 1774), but it led to unmanageable equations. Daniel Bernoulli (1778) introduced the principle of the maximum product of the probabilities of a system of concurrent errors.

Adrien-Marie Legendre (1805) developed the method of least squares, and introduced it in his Nouvelles méthodes pour la détermination des orbites des comètes (New Methods for Determining the Orbits of Comets). In ignorance of Legendre's contribution, an Irish-American writer, Robert Adrain, editor of "The Analyst" (1808), first deduced the law of facility of error,

$\phi(x) = ce^{-h^2 x^2},$

$h$ being a constant depending on precision of observation, and $c$ a scale factor ensuring that the area under the curve equals 1. He gave two proofs, the second being essentially the same as John Herschel's (1850). Gauss gave the first proof that seems to have been known in Europe (the third after Adrain's) in 1809. Further proofs were given by Laplace (1810, 1812), Gauss (1823), James Ivory (1825, 1826), Hagen (1837), Friedrich Bessel (1838), W. F. Donkin (1844, 1856), and Morgan Crofton (1870). Other contributors were Ellis (1844), De Morgan (1864), Glaisher (1872), and Giovanni Schiaparelli (1875). Peters's (1856) formula for $r$, the probable error of a single observation, is well known.
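As a numerical aside, the requirement that the area under $\phi(x) = ce^{-h^2 x^2}$ equals 1 fixes the scale factor. From the standard Gaussian integral one gets $c = h/\sqrt{\pi}$; this value is not stated in the text above and is used here as an assumption to be checked. The following Python sketch integrates the curve with a simple midpoint rule; the function names and the integration range are illustrative choices.

```python
import math

def phi(x, h):
    """Law of facility of error with the assumed scale factor c = h / sqrt(pi)."""
    c = h / math.sqrt(math.pi)
    return c * math.exp(-(h * x) ** 2)

def area(h, half_width=10.0, steps=20_000):
    """Midpoint-rule estimate of the area under phi over [-half_width/h, half_width/h]."""
    a = -half_width / h
    dx = (2 * half_width / h) / steps
    return sum(phi(a + (i + 0.5) * dx, h) for i in range(steps)) * dx

for h in (0.5, 1.0, 2.5):
    print(h, area(h))  # each area should be very close to 1
```

A larger $h$ (more precise observation) gives a narrower, taller curve, but the enclosed area stays the same.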

In the nineteenth century authors on the general theory included Laplace, Sylvestre Lacroix (1816), Littrow (1833), Adolphe Quetelet (1853), Richard Dedekind (1860), Helmert (1872), Hermann Laurent (1873), Liagre, Didion, and Karl Pearson. Augustus De Morgan and George Boole improved the exposition of the theory.

Andrey Markov introduced the notion of Markov chains (1906), which played an important role in the theory of stochastic processes and its applications. The modern theory of probability based on measure theory was developed by Andrey Kolmogorov (1933).

On the geometric side (see integral geometry) contributors to The Educational Times were influential (Miller, Crofton, McColl, Wolstenholme, Watson, and Artemas Martin).

Further information: History of probability

Further information: History of statistics

Theory

Main article: Probability theory

Like other theories, the theory of probability is a representation of probabilistic concepts in formal terms—that is, in terms that can be considered separately from their meaning. These formal terms are manipulated by the rules of mathematics and logic, and any results are interpreted or translated back into the problem domain.

There have been at least two successful attempts to formalize probability, namely the Kolmogorov formulation and the Cox formulation. In Kolmogorov's formulation (see probability space), sets are interpreted as events and probability itself as a measure on a class of sets. In Cox's theorem, probability is taken as a primitive (that is, not further analyzed) and the emphasis is on constructing a consistent assignment of probability values to propositions. In both cases, the laws of probability are the same, except for technical details.
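To make the Kolmogorov picture concrete, the sketch below builds a small finite probability space in Python: the sample space is a fair six-sided die (an assumption chosen for illustration), events are all subsets of it, and the measure `P` sums outcome weights. It then checks non-negativity, normalization, and additivity for disjoint events. On a finite space additivity over pairs is all that needs checking; the general theory requires countable additivity.

```python
from fractions import Fraction
from itertools import chain, combinations

# Illustrative sample space: the faces of a fair die, each with weight 1/6.
outcomes = (1, 2, 3, 4, 5, 6)
weight = {w: Fraction(1, 6) for w in outcomes}

def events(space):
    """All subsets of the sample space (the event class of a finite space)."""
    subsets = chain.from_iterable(
        combinations(space, r) for r in range(len(space) + 1))
    return [frozenset(s) for s in subsets]

def P(event):
    """Probability measure: the sum of the weights of the outcomes in the event."""
    return sum((weight[w] for w in event), Fraction(0))

all_events = events(outcomes)

non_negative = all(P(e) >= 0 for e in all_events)
normalized = P(frozenset(outcomes)) == 1
additive = all(P(a | b) == P(a) + P(b)
               for a in all_events for b in all_events if not (a & b))

print(non_negative, normalized, additive)  # True True True
```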

There are other methods for quantifying uncertainty, such as the Dempster-Shafer theory or possibility theory, but those are essentially different and not compatible with the laws of probability as usually understood.

Applications

Probability theory is applied in everyday life in risk assessment and in trade on commodity markets. Governments typically apply probabilistic methods in environmental regulation, where the approach is called pathway analysis. A good example is the effect of the perceived probability of any widespread Middle East conflict on oil prices, which have ripple effects in the economy as a whole. An assessment by a commodity trader that a war is more or less likely sends prices up or down, and signals that opinion to other traders. Accordingly, the probabilities are neither assessed independently nor necessarily very rationally. The theory of behavioral finance emerged to describe the effect of such groupthink on pricing, on policy, and on peace and conflict.^[12]

It can reasonably be said that the discovery of rigorous methods to assess and combine probability assessments has profoundly affected modern society. Accordingly, it may be of some importance to most citizens to understand how odds and probability assessments are made, and how they contribute to reputations and to decisions, especially in a democracy.

Another significant application of probability theory in everyday life is reliability. Many consumer products, such as automobiles and consumer electronics, use reliability theory in product design to reduce the probability of failure. Failure probability may influence a manufacturer's decisions on a product's warranty.^[13]

The cache language model and other statistical language models that are used in natural language processing are also examples of applications of probability theory.

Mathematical treatment

Independent probability

If two events, A and B, are independent, then the joint probability is

$P(A \mbox{ and }B) = P(A \cap B) = P(A) P(B),\,$

for example, if two coins are flipped the chance of both being heads is $\tfrac{1}{2}\times\tfrac{1}{2} = \tfrac{1}{4}.$ ^[17]
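The coin example can be checked by enumerating the sample space directly. The short Python sketch below assumes two fair coins, so that all four outcomes are equally likely; the helper `prob` and the event names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of flipping two fair coins.
space = list(product("HT", repeat=2))

def prob(event):
    """Fraction of outcomes in the sample space for which the event holds."""
    return Fraction(sum(1 for o in space if event(o)), len(space))

def first_heads(o):
    return o[0] == "H"

def second_heads(o):
    return o[1] == "H"

def both_heads(o):
    return first_heads(o) and second_heads(o)

print(prob(both_heads))                       # 1/4
print(prob(first_heads) * prob(second_heads))  # 1/4
```

The joint probability obtained by counting agrees with the product of the two individual probabilities, which is what independence asserts.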

Mutually exclusive

If either event A or event B or both events occur on a single performance of an experiment, this is called the union of the events A and B, denoted $A \cup B$, and its probability is written $P(A \cup B)$. If two events are mutually exclusive then the probability of either occurring is

$P(A\mbox{ or }B) = P(A \cup B)= P(A) + P(B).$

For example, the chance of rolling a 1 or 2 on a six-sided die is $P(1\mbox{ or }2) = P(1) + P(2) = \tfrac{1}{6} + \tfrac{1}{6} = \tfrac{1}{3}.$

Not mutually exclusive

If the events are not mutually exclusive then

$\mathrm{P}\left(A \hbox{ or } B\right)=\mathrm{P}\left(A\right)+\mathrm{P}\left(B\right)-\mathrm{P}\left(A \mbox{ and } B\right).$

For example, when drawing a single card at random from a regular deck of cards, the chance of getting a heart or a face card (J, Q, K), or one that is both, is $\tfrac{13}{52} + \tfrac{12}{52} - \tfrac{3}{52} = \tfrac{11}{26}$, because of the 52 cards in a deck 13 are hearts, 12 are face cards, and 3 are both. The 3 cards that are both are included in each of the "13 hearts" and the "12 face cards" but should be counted only once.
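Both addition rules can be verified by counting cards. The Python sketch below models a standard 52-card deck as (rank, suit) pairs, which is an illustrative representation, and compares the probability of the union computed directly with the value given by the formula.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(ranks, suits))  # 52 equally likely cards

def prob(event):
    """Probability of drawing a card that belongs to the given set."""
    return Fraction(len(event), len(deck))

hearts = {card for card in deck if card[1] == "hearts"}
faces = {card for card in deck if card[0] in {"J", "Q", "K"}}

# Not mutually exclusive: three cards are both hearts and face cards.
p_union = prob(hearts | faces)
p_formula = prob(hearts) + prob(faces) - prob(hearts & faces)
print(p_union, p_formula)  # 11/26 11/26

# Mutually exclusive: no card is both an ace and a king.
aces = {card for card in deck if card[0] == "A"}
kings = {card for card in deck if card[0] == "K"}
print(prob(aces | kings), prob(aces) + prob(kings))  # 2/13 2/13
```

When the two events share no outcomes the subtracted term is zero, so the general rule reduces to the simple sum used for mutually exclusive events.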

Conditional probability

Conditional probability is the probability of some event A, given the occurrence of some other event B. Conditional probability is written $\mathrm{P}(A \mid B)$, and is read "the probability of A, given B". It is defined by

$\mathrm{P}(A \mid B) = \frac{\mathrm{P}(A \cap B)}{\mathrm{P}(B)}.\,$ ^[18]

If $\mathrm{P}(B)=0$ then $\mathrm{P}(A \mid B)$ is undefined. Note that in this case A and B are independent, since $\mathrm{P}(A \cap B) = 0 = \mathrm{P}(A)\,\mathrm{P}(B)$.
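The definition can be tried out on a small example. The Python sketch below uses two fair dice, with A the event that the sum is 8 and B the event that the first die is even; these events and the helper names are assumptions chosen for illustration. The function returns `None` when $\mathrm{P}(B)=0$, mirroring the undefined case.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered results of rolling two fair dice (36 outcomes).
space = set(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a set of outcomes."""
    return Fraction(len(event), len(space))

def conditional(a, b):
    """Return P(a | b), or None when P(b) = 0 and the ratio is undefined."""
    if not b:
        return None
    return prob(a & b) / prob(b)

sum_is_8 = {o for o in space if sum(o) == 8}
first_even = {o for o in space if o[0] % 2 == 0}

print(prob(sum_is_8))                  # 5/36
print(conditional(sum_is_8, first_even))  # 1/6
print(conditional(sum_is_8, set()))       # None
```

Knowing that the first die is even raises the probability of a sum of 8 from 5/36 to 1/6, so these two events are not independent.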

Summary of probabilities

Event	Probability
A	$P(A)\in[0,1]$
not A	$P(A')=1-P(A)$
A or B	$\begin{aligned} P(A\cup B) & = P(A)+P(B)-P(A\cap B) \\ & = P(A)+P(B) \qquad\mbox{if A and B are mutually exclusive} \end{aligned}$
A and B	$\begin{aligned} P(A\cap B) & = P(A \mid B)P(B) = P(B \mid A)P(A) \\ & = P(A)P(B) \qquad\mbox{if A and B are independent} \end{aligned}$
A given B	$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$

Relation to randomness

Main article: Randomness

In a deterministic universe, based on Newtonian concepts, there would be no probability if all conditions were known (Laplace's demon). In the case of a roulette wheel, if the force of the hand and the period of that force are known, the number on which the ball will stop would be a certainty. Of course, this also assumes knowledge of the inertia and friction of the wheel, the weight, smoothness and roundness of the ball, variations in hand speed during the turning and so forth. A probabilistic description can thus be more useful than Newtonian mechanics for analyzing the pattern of outcomes of repeated spins of a roulette wheel. Physicists face the same situation in the kinetic theory of gases, where the system, while deterministic in principle, is so complex (with the number of molecules typically of the order of magnitude of the Avogadro constant, 6.02·10²³) that only a statistical description of its properties is feasible.
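The usefulness of a statistical description for repeated spins can be illustrated with a small simulation. The Python sketch below makes several assumptions for the sake of the example: a wheel with 37 equally likely pockets, a pseudo-random generator standing in for the physical wheel, and a fixed seed so the run is repeatable. It tracks how often one chosen pocket comes up.

```python
import random

def relative_frequency(pocket, spins, pockets=37, seed=0):
    """Fraction of simulated spins that land on the given pocket."""
    rng = random.Random(seed)  # seeded generator, so results are repeatable
    hits = sum(1 for _ in range(spins) if rng.randrange(pockets) == pocket)
    return hits / spins

# With more spins the relative frequency tends to settle near 1/37.
for spins in (100, 10_000, 100_000):
    print(spins, relative_frequency(17, spins), 1 / 37)
```

No individual spin is predicted here; only the long-run pattern is, which is the kind of statement a probabilistic description is suited to make.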

Probability theory is required to describe nature.^[19] A revolutionary discovery of early 20th century physics was the random character of all physical processes that occur at sub-atomic scales and are governed by the laws of quantum mechanics. The objective wave function evolves deterministically but, according to the Copenhagen interpretation, randomness is explained by a wave function collapse when an observation is made. However, the loss of determinism for the sake of instrumentalism did not meet with universal approval. Albert Einstein famously remarked in a letter to Max Born: "I am convinced that God does not play dice".^[20] Like Einstein, Erwin Schrödinger, who discovered the wave function, believed quantum mechanics is a statistical approximation of an underlying deterministic reality.^[21] In modern interpretations, quantum decoherence accounts for subjectively probabilistic behavior.

Notes

^ Kendall's Advanced Theory of Statistics, Volume 1: Distribution Theory, Alan Stuart and Keith Ord, 6th Ed 2009
^ An Introduction to Probability Theory and Its Applications, William Feller. 3rd Ed 1968
^ Hacking, Ian (1965). The Logic of Statistical Inference.
^ Finetti, Bruno de (1970). "Logical foundations and measurement of subjective probability". Acta Psychologica 34: 129–145. doi:10.1016/0001-6918(70)90012-0.
^ Hogg, Robert V.; Craig, Allen; McKean, Joseph W. (2004). Introduction to Mathematical Statistics (6th ed.). Upper Saddle River: Pearson. ISBN 0130085073.
^ The Emergence of Probability: A Philosophical Study of Early Ideas about Probability, Induction and Statistical Inference, Ian Hacking, Cambridge University Press, 2006, ISBN 0521685575, 9780521685573
^ The Cambridge History of Seventeenth-century Philosophy, Daniel Garber, 2003
^ Freund, John. “Introduction to Probability”. 1973, p. 1.
^ Jeffrey, R.C., Probability and the Art of Judgment, Cambridge University Press. (1992). pp. 54-55 . ISBN 0-521-39459-7
^ Franklin, J., The Science of Conjecture: Evidence and Probability Before Pascal, Johns Hopkins University Press. (2001). pp. 22, 113, 127
^ Ivancevic, Vladimir; Tijana Ivancevic. "Quantum Leap". 2008. p 16
^ Singh, Laurie. "Whither Efficient Markets? Efficient Market Theory and Behavioral Finance". The Finance Professionals' Post, 2010.
^ Gorman, Michael. "Management Insights". Management Science, 2011.
^ Ross, Sheldon. A First course in Probability, 8th Edition. Page 26-27.
^ Olofsson, Peter. (2005) Page 8.
^ Olofsson, page 9
^ Olofsson, page 35.
^ Olofsson, page 29.
^ Burgi, Mark. "Interpretations of Negative Probabilities". 2009, p. 1.
^ Jedenfalls bin ich überzeugt, daß der Alte nicht würfelt.
^ Moore, W.J. (1992). Schrödinger: Life and Thought. Cambridge University Press. p. 479. ISBN 0-521-43767-9.

References

Kallenberg, O. (2005) Probabilistic Symmetries and Invariance Principles. Springer-Verlag, New York. 510 pp. ISBN 0-387-25115-4
Kallenberg, O. (2002) Foundations of Modern Probability, 2nd ed. Springer Series in Statistics. 650 pp. ISBN 0-387-95313-2
Olofsson, Peter (2005) Probability, Statistics, and Stochastic Processes, Wiley-Interscience. 504 pp ISBN 0-471-67969-0.

External links

Virtual Laboratories in Probability and Statistics (Univ. of Ala.-Huntsville)
Probability on In Our Time at the BBC.
Probability and Statistics EBook
Edwin Thompson Jaynes. Probability Theory: The Logic of Science. Preprint: Washington University, (1996). — HTML index with links to PostScript files and PDF (first three chapters)
People from the History of Probability and Statistics (Univ. of Southampton)
Probability and Statistics on the Earliest Uses Pages (Univ. of Southampton)
Earliest Uses of Symbols in Probability and Statistics on Earliest Uses of Various Mathematical Symbols
Probability Homework Help, Definitions, Distribution Calculators and Study Guides
A tutorial on probability and Bayes’ theorem devised for first-year Oxford University students
pdf file of An Anthology of Chance Operations (1963) at UbuWeb
Probability Theory Guide for Non-Mathematicians
Understanding Risk and Probability with BBC raw
Introduction to Probability - eBook, by Charles Grinstead, Laurie Snell Source (GNU Free Documentation License)
