Notation in probability and statistics
From Wikipedia, the free encyclopedia
Probability theory and statistics have some commonly used conventions of their own, in addition to standard mathematical notation and mathematical symbols.
Probability theory
- Random variables (e.g. the height of students) are written in upper case.
- Particular values taken by a random variable are written in lower case (e.g. P(X = x) can be the probability that a student is of height x).
- P(A ∩ B) indicates the probability that events A and B both occur.
- P(A ∪ B) indicates the probability of either event A or event B occurring ("or" in this case means one or the other or both).
- σ-algebras are usually written with upper case calligraphic letters (e.g. ℱ for the set of sets on which we define the probability P).
- \binom{N}{k} (N-choose-k) is the number of ways in which one can select k objects from N objects. It is also called the binomial coefficient, and it counts combinations without repetition (see Combinations and permutations).
- Probability density functions (pdfs) and probability mass functions are denoted by lower case letters, e.g. f(x).
- Cumulative distribution functions (cdfs) are denoted by upper case letters, e.g. F(x).
- In particular, the pdf of the standard normal distribution is denoted by φ(z), and its cdf by Φ(z).
- Some common operators:
  - E(X): expected value of X
  - Var(X): variance of X
  - Cov(X,Y): covariance of X and Y
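As a minimal sketch of how this notation maps onto computation, the Python snippet below evaluates a binomial coefficient, φ(z) and Φ(z) at z = 0, and the operators E, Var and Cov for a small discrete joint distribution. It uses only the standard library. The joint distribution, the helper `expect` and all variable names are made up for illustration and are not part of the conventions themselves.

```python
from math import comb
from statistics import NormalDist

# N-choose-k: number of ways to select k objects from N objects.
n_choose_k = comb(5, 2)

# phi(z) and Phi(z): pdf and cdf of the standard normal distribution.
std_normal = NormalDist(mu=0.0, sigma=1.0)
phi_0 = std_normal.pdf(0.0)  # phi(0) = 1 / sqrt(2 * pi)
Phi_0 = std_normal.cdf(0.0)  # Phi(0) = 0.5 by symmetry

# Hypothetical joint pmf P(X = x, Y = y) of two random variables X and Y.
joint_pmf = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}


def expect(g):
    """E(g(X, Y)): sum of g(x, y) weighted by the joint pmf."""
    return sum(p * g(x, y) for (x, y), p in joint_pmf.items())


E_X = expect(lambda x, y: x)                         # expected value of X
E_Y = expect(lambda x, y: y)                         # expected value of Y
Var_X = expect(lambda x, y: (x - E_X) ** 2)          # variance of X
Cov_XY = expect(lambda x, y: (x - E_X) * (y - E_Y))  # covariance of X and Y
```

Upper case names (`E_X`, `Var_X`) refer to the random variables, while the lambda arguments `x` and `y` are particular values, mirroring the upper case and lower case convention in the list.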
Statistics
- Greek letters (e.g. θ, β) are commonly used to denote unknown parameters (population parameters).
- An estimate of a parameter is often denoted by placing a caret over the corresponding symbol, e.g. θ̂, pronounced "theta hat".
- The arithmetic mean of a set of numbers x₁, x₂, ..., xₙ is denoted by x̄, pronounced "x bar".
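In LaTeX, the caret and the bar are typically produced with `\hat` and `\bar`. The sketch below shows both symbols and spells out the usual definition of the arithmetic mean; the index i and the sample size n are the ones implied by the list of numbers above.

```latex
% Estimate of the parameter theta ("theta hat")
\[
  \hat{\theta}
\]

% Arithmetic mean of x_1, x_2, ..., x_n ("x bar")
\[
  \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
\]
```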
Critical values
The α-level upper critical value of a probability distribution is the value exceeded with probability α, that is, the value x_α such that F(x_α) = 1 − α, where F is the cumulative distribution function. There are standard notations for the upper critical values of some commonly used distributions in statistics:
- z_α or z(α) for the standard normal distribution
- t_{α,ν} or t(α,ν) for the t-distribution with ν degrees of freedom
- χ²_{α,ν} or χ²(α,ν) for the chi-square distribution with ν degrees of freedom
- F_{α,ν₁,ν₂} or F(α,ν₁,ν₂) for the F-distribution with ν₁ and ν₂ degrees of freedom
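The defining relation F(x) = 1 − α can be inverted numerically. As a minimal sketch, the snippet below computes the upper critical value z(α) of the standard normal distribution using `statistics.NormalDist.inv_cdf` from the Python standard library. The function name `z_upper` is a hypothetical helper chosen for this example.

```python
from statistics import NormalDist


def z_upper(alpha: float) -> float:
    """Upper critical value z(alpha): the point z with Phi(z) = 1 - alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # inv_cdf is the quantile function, i.e. the inverse of the cdf Phi.
    return NormalDist().inv_cdf(1.0 - alpha)


z_05 = z_upper(0.05)    # approximately 1.645
z_025 = z_upper(0.025)  # approximately 1.960
```

The standard library has no quantile functions for the t, chi-square or F distributions. A third-party package is needed for those; SciPy, for example, exposes them as `ppf` methods such as `scipy.stats.t.ppf`.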
Linear algebra
- Matrices are usually denoted by boldface capital letters, e.g. A.
- Column vectors are usually denoted by boldface lower case letters, e.g. x.
- The transpose operator is denoted by either a superscript T (e.g. Aᵀ) or a prime symbol (e.g. A′).
- A row vector is written as the transpose of a column vector, e.g. xᵀ or x′.
Note: Wikipedia articles usually use the superscript T to denote transpose. The prime symbol is more difficult to produce and is rather small in the default font.
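To make the transpose convention concrete, here is a small sketch in plain Python, with matrices stored as lists of rows. The example values and the helper name `transpose` are assumptions made for illustration; numerical work would normally use a dedicated array library instead.

```python
# A 2x3 matrix A stored as a list of rows (hypothetical example values).
A = [[1, 2, 3],
     [4, 5, 6]]


def transpose(matrix):
    """Return the transpose: entry (i, j) of the result is entry (j, i) of the input."""
    return [list(column) for column in zip(*matrix)]


A_T = transpose(A)   # the 3x2 matrix written with a superscript T or a prime
x = [[7], [8], [9]]  # column vector x, stored as a 3x1 matrix
x_T = transpose(x)   # row vector, stored as a 1x3 matrix
```

Applying `transpose` twice returns the original matrix.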
Abbreviations
Common abbreviations include:
- a.e.: almost everywhere
- cdf: cumulative distribution function
- df: degrees of freedom
- pdf: probability density function
- pmf: probability mass function
- r.v.: random variable
See also
- Glossary of probability and statistics
- Combinations and permutations
- Typographical conventions in mathematical formulae
References
Halperin, Max; Hartley, H. O. & Hoel, P. G. (1965), “Recommended Standards for Statistical Symbols and Notation. COPSS Committee on Symbols and Notation”, The American Statistician 19 (3): 12-14, <http://links.jstor.org/sici?sici=0003-1305%28196506%2919%3A3%3C12%3ARSFSSA%3E2.0.CO%3B2-I>
External links
- Earliest Uses of Symbols in Probability and Statistics, maintained by Jeff Miller.