Yule-Simon distribution
From Wikipedia, the free encyclopedia
[Figure: Yule-Simon probability mass function (PMF) on a log-log scale. The function is defined only at integer values of k; the connecting lines do not indicate continuity.]
[Figure: Yule-Simon cumulative distribution function (CDF). The function is defined only at integer values of k; the connecting lines do not indicate continuity.]
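Parameters | $\rho > 0$ shape (real) |
---|---|
Support | $k \in \{1, 2, \dots\}$ |
Probability mass function (pmf) | $\rho\,\mathrm{B}(k, \rho+1)$ |
Cumulative distribution function (cdf) | $1 - k\,\mathrm{B}(k, \rho+1)$ |
Mean | $\frac{\rho}{\rho-1}$ for $\rho > 1$ |
Median | |
Mode | $1$ |
Variance | $\frac{\rho^2}{(\rho-1)^2(\rho-2)}$ for $\rho > 2$ |
Skewness | $\frac{(\rho+1)^2\sqrt{\rho-2}}{(\rho-3)\,\rho}$ for $\rho > 3$ |
Excess kurtosis | $\rho + 3 + \frac{11\rho^3 - 49\rho - 22}{(\rho-4)(\rho-3)\,\rho}$ for $\rho > 4$ |
Entropy | |
Moment-generating function (mgf) | $\frac{\rho}{\rho+1}\,{}_2F_1(1, 1; \rho+2; e^{t})\,e^{t}$ |
Characteristic function | $\frac{\rho}{\rho+1}\,{}_2F_1(1, 1; \rho+2; e^{it})\,e^{it}$ |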
In probability and statistics, the Yule-Simon distribution is a discrete probability distribution named after Udny Yule and Herbert Simon. Simon originally called it the Yule distribution.
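The probability mass function (pmf) of the Yule-Simon(ρ) distribution is
$$f(k;\rho) = \rho\,\mathrm{B}(k, \rho+1),$$
for integer k ≥ 1 and real ρ > 0, where B is the beta function. Equivalently, the pmf can be written in terms of the falling factorial as
$$f(k;\rho) = \frac{\rho\,\Gamma(\rho+1)}{(k+\rho)^{\underline{\rho+1}}},$$
where Γ is the gamma function. Thus, if ρ is an integer,
$$f(k;\rho) = \frac{\rho\,\rho!\,(k-1)!}{(k+\rho)!}.$$
The probability mass function f has the property that for sufficiently large k we have
$$f(k;\rho) \approx \frac{\rho\,\Gamma(\rho+1)}{k^{\rho+1}} \propto \frac{1}{k^{\rho+1}}.$$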
This means that the tail of the Yule-Simon distribution is a realization of Zipf's law: f(k;ρ) can be used to model, for example, the relative frequency of the kth most frequent word in a large collection of text, which according to Zipf's law is inversely proportional to a (typically small) power of k.
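As an illustration of the pmf and its power-law tail, the following minimal Python sketch evaluates ρ B(k, ρ+1) through the log-gamma function and compares it with the large-k approximation ρ Γ(ρ+1) / k^(ρ+1). The function names (`log_beta`, `yule_simon_pmf`, `yule_simon_pmf_integer_rho`, `tail_approximation`) are hypothetical helpers chosen for this example, not part of any library.

```python
import math


def log_beta(a: float, b: float) -> float:
    """Natural log of the beta function B(a, b), computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def yule_simon_pmf(k: int, rho: float) -> float:
    """f(k; rho) = rho * B(k, rho + 1) for integer k >= 1 and real rho > 0."""
    if k < 1 or rho <= 0:
        raise ValueError("requires integer k >= 1 and rho > 0")
    return rho * math.exp(log_beta(k, rho + 1.0))


def yule_simon_pmf_integer_rho(k: int, rho: int) -> float:
    """Factorial form rho * rho! * (k-1)! / (k+rho)!, valid for integer rho."""
    return rho * math.factorial(rho) * math.factorial(k - 1) / math.factorial(k + rho)


def tail_approximation(k: int, rho: float) -> float:
    """Large-k approximation rho * Gamma(rho + 1) / k**(rho + 1)."""
    return rho * math.gamma(rho + 1.0) / k ** (rho + 1.0)


if __name__ == "__main__":
    rho = 2.0
    for k in (1, 10, 100, 1000):
        exact = yule_simon_pmf(k, rho)
        approx = tail_approximation(k, rho)
        # The ratio exact / approx tends to 1 as k grows.
        print(k, exact, approx, exact / approx)
```

Working in log space avoids overflow of the gamma function for large k; the factorial form is included only as a cross-check for integer ρ.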
Occurrence
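The Yule-Simon distribution arises as a continuous mixture of geometric distributions. Specifically, assume that W follows an exponential distribution with scale 1/ρ, or equivalently rate ρ, so that W has density
$$h(w;\rho) = \rho\,e^{-\rho w}, \quad w \ge 0.$$
Then a Yule-Simon distributed variable K has the following geometric distribution conditional on W:
$$K \mid W \sim \operatorname{Geometric}\left(e^{-W}\right).$$
The pmf of a geometric distribution with success probability p is
$$g(k;p) = p\,(1-p)^{k-1}$$
for k ∈ {1, 2, …}. The Yule-Simon pmf is then the following exponential-geometric mixture distribution:
$$f(k;\rho) = \int_0^{\infty} g\left(k; e^{-w}\right)\,h(w;\rho)\,dw.$$
Substituting u = e^{−w} turns the integral into ρ ∫₀¹ u^ρ (1−u)^{k−1} du = ρ B(k, ρ+1), which is the beta-function form of the pmf.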
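The mixture construction above suggests a direct way to draw random variates: sample W from an exponential distribution with rate ρ, then sample K from a geometric distribution with success probability exp(−W). The sketch below is one possible implementation using only the Python standard library; the function name `sample_yule_simon` is a hypothetical choice for this example, and the geometric draw uses inversion of the survival function, which is an implementation choice rather than something prescribed by the construction.

```python
import math
import random


def sample_yule_simon(rho: float, rng: random.Random) -> int:
    """Draw one Yule-Simon(rho) variate via the exponential-geometric mixture."""
    if rho <= 0:
        raise ValueError("rho must be positive")
    w = rng.expovariate(rho)      # W ~ Exponential(rate = rho)
    p = math.exp(-w)              # success probability of the geometric stage
    if p >= 1.0:                  # W is (numerically) zero, so K = 1
        return 1
    log_q = math.log1p(-p)        # log(1 - p), accurate when p is small
    u = 1.0 - rng.random()        # uniform on (0, 1], keeps log(u) finite
    # Inversion for a geometric variable on {1, 2, ...}:
    # P(K > k) = (1 - p)**k, so K = floor(log(u) / log(1 - p)) + 1.
    # Note: for extremely small rho the draw can exceed floating-point
    # range; this sketch does not treat that case specially.
    return int(math.log(u) / log_q) + 1


if __name__ == "__main__":
    rng = random.Random(0)
    rho = 3.0
    draws = [sample_yule_simon(rho, rng) for _ in range(100_000)]
    share_of_ones = sum(1 for d in draws if d == 1) / len(draws)
    # Theoretical values: P(K = 1) = rho / (rho + 1), mean = rho / (rho - 1).
    print(share_of_ones, rho / (rho + 1.0))
    print(sum(draws) / len(draws), rho / (rho - 1.0))
```

Because the distribution is heavy-tailed, sample moments converge slowly (and the mean does not exist for ρ ≤ 1), so comparing the empirical share of K = 1 with ρ/(ρ+1) is usually a more stable sanity check than comparing means.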
Generalizations
The two-parameter generalization of the original Yule distribution replaces the beta function with an incomplete beta function. The probability mass function of the generalized Yule-Simon(ρ, α) distribution is defined as
$$f(k;\rho,\alpha) = \frac{\rho}{1-\alpha^{\rho}}\,\mathrm{B}_{1-\alpha}(k, \rho+1),$$
with 0 ≤ α < 1, where B_x(a, b) = ∫₀ˣ t^{a−1} (1−t)^{b−1} dt is the incomplete beta function. For α = 0 the ordinary Yule-Simon(ρ) distribution is obtained as a special case. The use of the incomplete beta function has the effect of introducing an exponential cutoff in the upper tail.
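To make the generalized pmf concrete, the sketch below evaluates (ρ / (1 − α^ρ)) · B_{1−α}(k, ρ+1) numerically. The Python standard library has no incomplete beta function, so this example approximates the integral ∫₀^{1−α} t^{k−1} (1−t)^ρ dt with composite Simpson's rule; that is a stand-in chosen for self-containment, and a dedicated library routine would normally be preferable. The names `incomplete_beta` and `generalized_yule_simon_pmf` are hypothetical.

```python
import math


def incomplete_beta(x: float, a: float, b: float, n: int = 2000) -> float:
    """Approximate B_x(a, b) = integral from 0 to x of t**(a-1) * (1-t)**(b-1) dt.

    Uses composite Simpson's rule with n subintervals (rounded up to even).
    Assumes a >= 1 and b >= 1 so that the integrand is bounded on [0, x].
    """
    if n % 2:
        n += 1
    h = x / n

    def integrand(t: float) -> float:
        return t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)

    total = integrand(0.0) + integrand(x)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def generalized_yule_simon_pmf(k: int, rho: float, alpha: float) -> float:
    """f(k; rho, alpha) = rho / (1 - alpha**rho) * B_{1-alpha}(k, rho + 1)."""
    if k < 1 or rho <= 0 or not (0.0 <= alpha < 1.0):
        raise ValueError("requires k >= 1, rho > 0 and 0 <= alpha < 1")
    return rho / (1.0 - alpha ** rho) * incomplete_beta(1.0 - alpha, k, rho + 1.0)


if __name__ == "__main__":
    rho, alpha = 2.0, 0.3
    # With alpha > 0 the tail decays roughly like (1 - alpha)**k,
    # so a few hundred terms capture essentially all of the mass.
    print(sum(generalized_yule_simon_pmf(k, rho, alpha) for k in range(1, 200)))
    # alpha = 0 reduces to the ordinary Yule-Simon pmf rho * B(k, rho + 1).
    print(generalized_yule_simon_pmf(1, rho, 0.0), rho / (rho + 1.0))
```

The quadrature is least accurate when α = 0 and ρ < 1, because the integrand then has an unbounded derivative at t = 1; increasing `n` or switching to a library implementation would be the natural remedy in that regime.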
References
- Herbert A. Simon, On a Class of Skew Distribution Functions, Biometrika 42(3/4): 425–440, December 1955.
- Colin Rose and Murray D. Smith, Mathematical Statistics with Mathematica. New York: Springer, 2002, ISBN 0-387-95234-9. (See page 107, where it is called the "Yule distribution".)