Multinomial distribution

From Wikipedia, the free encyclopedia

In probability theory, the multinomial distribution is a generalization of the binomial distribution. The binomial distribution is the probability distribution of the number of "successes" in n independent Bernoulli trials, with the same probability of "success" on each trial. Instead of each trial resulting in "success" or "failure", suppose that each of n independent trials results in one of a fixed finite number k of possible outcomes, with probabilities p1, ..., pk. Let the random variable Xi be the number of times outcome number i was observed over the n trials. Then, the multinomial distribution is the distribution of the vector

(X_1,\dots,X_k)

The probabilities are given by

P(X_1=x_1,\dots,X_k=x_k)=\begin{cases}{n! \over x_1!\cdots x_k!}p_1^{x_1}\cdots p_k^{x_k} \quad & \mbox{when } \sum_{i=1}^k x_i=n \\ 0 & \mbox{otherwise.} \end{cases}

for non-negative integers x1, ..., xk.
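The probability mass function above can be computed directly; a minimal Python sketch (the function name multinomial_pmf is my own):

```python
from math import factorial, prod

def multinomial_pmf(x, p):
    """Joint pmf P(X_1=x_1, ..., X_k=x_k) for n = sum(x) independent trials."""
    n = sum(x)
    coef = factorial(n)            # n!
    for xi in x:
        coef //= factorial(xi)     # divided by x_1! ... x_k!
    return coef * prod(pi ** xi for pi, xi in zip(p, x))

# Probability that 6 rolls of a fair die show each face exactly once:
p_straight = multinomial_pmf([1] * 6, [1/6] * 6)   # equals 6!/6^6
```

The function takes n to be the sum of the counts, so the "otherwise 0" branch of the piecewise definition never arises here.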

Each of the k components separately has a binomial distribution with parameters n and pi, for the appropriate value of the subscript i, and, because of the constraint that the sum of the components is n, they are negatively correlated.
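The binomial marginals can be verified by brute-force enumeration for a small case; a self-contained sketch with illustrative numbers (k = 3, n = 5):

```python
from math import factorial, comb

n, p = 5, [0.2, 0.3, 0.5]

def multinomial_pmf(x):
    """Joint pmf for the fixed n and p above."""
    coef = factorial(n)
    for xi in x:
        coef //= factorial(xi)
    return coef * (p[0] ** x[0]) * (p[1] ** x[1]) * (p[2] ** x[2])

# Marginal of X_1: sum the joint pmf over all valid (x_2, x_3).
for x1 in range(n + 1):
    marginal = sum(multinomial_pmf((x1, x2, n - x1 - x2))
                   for x2 in range(n - x1 + 1))
    binomial = comb(n, x1) * p[0] ** x1 * (1 - p[0]) ** (n - x1)
    assert abs(marginal - binomial) < 1e-12   # X_1 ~ Binomial(n, p_1)
```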

The expected value is

\operatorname{E}(X_i) = n p_i.

The covariance matrix is as follows. Each diagonal entry is the variance of a binomially distributed random variable, and is therefore

\operatorname{var}(X_i)=np_i(1-p_i).

The off-diagonal entries are the covariances. These are

\operatorname{cov}(X_i,X_j)=-np_i p_j

for distinct i and j. This is a k × k nonnegative-definite matrix of rank k − 1 (assuming all pi are positive).

The off-diagonal entries of the corresponding correlation matrix are

\rho(X_i,X_j) = -\sqrt{\frac{p_i p_j}{ (1-p_i)(1-p_j)}}.

Note that the sample size drops out of this expression. All off-diagonal correlations are negative: for fixed n, an increase in one component of a multinomial vector requires a decrease in another component.
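The moments above can be assembled in one place. A sketch with illustrative numbers (the variable names are mine) that also checks why the rank is at most k − 1: each row of the covariance matrix sums to zero, because the components themselves sum to the constant n:

```python
from math import sqrt, isclose

n, p = 10, [0.2, 0.3, 0.5]
k = len(p)

mean = [n * pi for pi in p]        # E(X_i) = n p_i

# cov(X_i, X_j) = n p_i (delta_ij - p_j): the diagonal entries are
# n p_i (1 - p_i), the off-diagonal entries are -n p_i p_j.
cov = [[n * p[i] * ((i == j) - p[j]) for j in range(k)] for i in range(k)]

# Off-diagonal correlations; note they do not depend on n.
def rho(i, j):
    return -sqrt(p[i] * p[j] / ((1 - p[i]) * (1 - p[j])))

# Since X_1 + ... + X_k = n is constant, every row of the covariance
# matrix sums to zero, so the matrix is singular (rank at most k - 1).
assert all(isclose(sum(row), 0.0, abs_tol=1e-12) for row in cov)
```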

The Dirichlet distribution is the conjugate prior of the multinomial in Bayesian statistics.
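Conjugacy here means the posterior update is a simple addition: if p has a Dirichlet(α1, ..., αk) prior and counts x1, ..., xk are observed, the posterior over p is Dirichlet(α1 + x1, ..., αk + xk). A sketch with illustrative numbers:

```python
alpha = [1.0, 1.0, 1.0]     # symmetric Dirichlet prior (uniform on the simplex)
counts = [7, 2, 1]          # observed multinomial counts over n = 10 trials

# Conjugate update: posterior parameters are prior pseudo-counts plus data counts.
posterior = [a + x for a, x in zip(alpha, counts)]    # Dirichlet(8, 3, 2)
total = sum(posterior)
posterior_mean = [a / total for a in posterior]       # E[p_i | counts]
```

The posterior mean interpolates between the prior mean and the empirical frequencies, with the prior acting as pseudo-counts.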
