Heavy-tailed distribution

From Wikipedia, the free encyclopedia

In probability theory, heavy-tailed distributions are probability distributions whose tails are not exponentially bounded:[1] that is, they have heavier tails than the exponential distribution. In many applications it is the right tail of the distribution that is of interest, but a distribution may have a heavy left tail, or both tails may be heavy.

There are two important subclasses of heavy-tailed distributions: the long-tailed distributions and the subexponential distributions. In practice, all commonly used heavy-tailed distributions belong to the subexponential class.

Usage of the term heavy-tailed is not fully settled, and two other definitions are in use. Some authors use the term for distributions that do not have all their power moments finite; others use it for distributions that do not have a finite variance. The definition given in this article is the most general in use: it includes all distributions covered by the alternative definitions, as well as distributions such as the log-normal that possess all their power moments yet are generally acknowledged to be heavy-tailed.

Definition of heavy-tailed distribution

The distribution of a random variable X with distribution function F is said to have a heavy right tail if[2]


\lim_{x \to \infty} e^{\lambda x}\Pr[X>x] = \infty \quad \mbox{for all } \lambda>0.\,

This is also written in terms of the tail distribution function \overline{F}(x) \equiv \Pr(X>x) as


\lim_{x \to \infty} e^{\lambda x}\overline{F}(x) = \infty \quad \mbox{for all } \lambda>0.\,

This is equivalent to the statement that the moment generating function of F, M_F(t), is infinite for all t > 0.[3]
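As a rough numerical illustration of the definition, the following Python sketch evaluates e^{\lambda x}\overline{F}(x) for a Pareto tail and for an exponential tail. The function names and the parameter values (alpha = 2, rate = 1, lambda = 0.5) are assumptions chosen for illustration and are not taken from the cited references.

```python
import math

def pareto_tail(x, alpha=2.0, x_min=1.0):
    """Pr(X > x) for a Pareto distribution (illustrative parameters)."""
    return 1.0 if x < x_min else (x_min / x) ** alpha

def exponential_tail(x, rate=1.0):
    """Pr(X > x) for an exponential distribution (illustrative rate)."""
    return math.exp(-rate * x) if x > 0 else 1.0

def scaled_tail(tail, x, lam):
    """The quantity e^{lam * x} * Pr(X > x) from the definition."""
    return math.exp(lam * x) * tail(x)

lam = 0.5
xs = [10, 50, 100, 200]
pareto_vals = [scaled_tail(pareto_tail, x, lam) for x in xs]
expo_vals = [scaled_tail(exponential_tail, x, lam) for x in xs]

for x, p, e in zip(xs, pareto_vals, expo_vals):
    print(f"x={x:>4}  Pareto: {p:.3e}  exponential: {e:.3e}")
```

With these values the Pareto column grows without bound while the exponential column shrinks towards zero. For the exponential distribution with rate 1 the product does diverge once lambda exceeds 1, which is why the definition requires the limit to be infinite for every lambda > 0.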

The definitions of heavy-tailed for left-tailed or two-tailed distributions are similar.

Definition of long-tailed distribution

The distribution of a random variable X with distribution function F is said to have a long right tail[4] if for all t > 0


\lim_{x \to \infty} \Pr[X>x+t \mid X>x] =1,

or equivalently


\overline{F}(x+t) \sim \overline{F}(x) \quad \mbox{as } x \to \infty.

Intuitively, if a quantity with a long right tail is known to exceed some high level x, then the probability that it also exceeds the higher level x + t approaches 1 as x grows, for any fixed t: if you know the situation is bad, it is probably worse than you think.
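The conditional probability in the definition can be computed directly as the ratio \overline{F}(x+t)/\overline{F}(x). The sketch below does this for a Pareto tail, where the ratio tends to 1, and for an exponential tail, where it stays fixed at e^{-t}. The distributions, parameter values and function names are illustrative assumptions.

```python
import math

def pareto_tail(x, alpha=2.0, x_min=1.0):
    """Pr(X > x) for a Pareto distribution (illustrative parameters)."""
    return 1.0 if x < x_min else (x_min / x) ** alpha

def exponential_tail(x, rate=1.0):
    """Pr(X > x) for an exponential distribution (illustrative rate)."""
    return math.exp(-rate * x) if x > 0 else 1.0

def conditional_exceedance(tail, x, t):
    """Pr(X > x + t | X > x), written as a ratio of tail probabilities."""
    return tail(x + t) / tail(x)

t = 5.0
xs = [10.0, 100.0, 500.0]  # kept moderate so exp(-x) does not underflow
pareto_ratios = [conditional_exceedance(pareto_tail, x, t) for x in xs]
expo_ratios = [conditional_exceedance(exponential_tail, x, t) for x in xs]

for x, p, e in zip(xs, pareto_ratios, expo_ratios):
    print(f"x={x:>6}  Pareto: {p:.4f}  exponential: {e:.4f}")
```

The exponential ratio does not depend on x (the memoryless property), so the exponential distribution is not long-tailed.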

All long-tailed distributions are heavy-tailed, but the converse is false, and it is possible to construct heavy-tailed distributions that are not long-tailed.
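One possible construction, given here as an illustrative assumption rather than an example from the cited references, is a random variable X that equals 2^k with probability 2^{-k} for k = 1, 2, .... Its tail satisfies \overline{F}(x) \geq 1/x for x \geq 1, so e^{\lambda x}\overline{F}(x) still tends to infinity, but the tail halves at every power of two, so \overline{F}(x+t)/\overline{F}(x) keeps returning to 1/2 and cannot converge to 1. The Python sketch below checks both properties; the function names are hypothetical.

```python
import math

def dyadic_tail(x):
    """Pr(X > x) when X equals 2**k with probability 2**-k, k = 1, 2, ..."""
    if x < 2.0:
        return 1.0
    # frexp writes x = m * 2**e with 0.5 <= m < 1, so floor(log2(x)) = e - 1.
    k = math.frexp(x)[1] - 1
    return 2.0 ** -k

def conditional_exceedance(x, t):
    """Pr(X > x + t | X > x) as a ratio of tail probabilities."""
    return dyadic_tail(x + t) / dyadic_tail(x)

def log_scaled_tail(x, lam):
    """log of e^{lam * x} * Pr(X > x), kept in logs to avoid overflow."""
    return lam * x + math.log(dyadic_tail(x))

# Just below each power of two, a step of t = 1 crosses a jump of the tail.
ratios = [conditional_exceedance(2.0 ** k - 0.5, 1.0) for k in range(1, 31)]

# The exponentially scaled tail still grows without bound.
log_scaled = [log_scaled_tail(2.0 ** k, 0.1) for k in range(6, 21)]

print(ratios[:5])
print(log_scaled[:5])
```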

Subexponential distributions

Subexponentiality is defined in terms of convolutions of probability distributions. For two independent, identically distributed random variables X_1, X_2 with common distribution function F, the convolution of F with itself, F^{*2}, is defined, using Lebesgue-Stieltjes integration, by:


\Pr(X_1+X_2 \leq x) = F^{*2}(x) = \int_{- \infty}^{\infty} F(x-y)\,dF(y).

The n-fold convolution F^{*n} is defined in the same way. The tail distribution function \overline{F} is defined as \overline{F}(x) = 1-F(x).

A distribution F on the positive half-line is subexponential[5] if


\overline{F^{*2}}(x) \sim 2\overline{F}(x) \quad \mbox{as } x \to \infty.

This implies[6] that, for any n \geq 1,


\overline{F^{*n}}(x) \sim n\overline{F}(x) \quad \mbox{as } x \to \infty.

The probabilistic interpretation[7] of this is that, for a sum of n independent random variables X_1,\ldots,X_n with common distribution F,


\Pr(X_1+ \cdots + X_n>x) \sim \Pr(\max(X_1, \ldots,X_n)>x) \quad \mbox{as } x \to \infty.

This is often known as the principle of the single big jump.[8]
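The comparison between the sum and the maximum can be explored with a small Monte Carlo experiment. The sketch below draws Pareto-distributed samples with Python's standard `random.paretovariate` and counts how often the sum and the maximum of n = 3 terms exceed a threshold. The threshold, shape parameter, sample size and seed are arbitrary illustrative choices, and at a finite threshold the ratio of the two counts is only expected to be near 1, not equal to it.

```python
import random

def exceedance_counts(x, n_terms=3, alpha=1.5, n_samples=200_000, seed=0):
    """Count samples where the sum, and where the maximum, exceeds x."""
    rng = random.Random(seed)
    sum_exceeds = 0
    max_exceeds = 0
    for _ in range(n_samples):
        draws = [rng.paretovariate(alpha) for _ in range(n_terms)]
        sum_exceeds += sum(draws) > x
        max_exceeds += max(draws) > x
    return sum_exceeds, max_exceeds

sum_count, max_count = exceedance_counts(x=50.0)
ratio = sum_count / max_count
print(f"sum > x: {sum_count}, max > x: {max_count}, ratio: {ratio:.3f}")
```

Because every term is positive, the sum exceeds the threshold whenever the maximum does, so the ratio is never below 1; how close it is to 1 shows how much of the large-sum probability comes from a single large term.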

A distribution F on the whole real line is subexponential if the distribution F I([0,\infty)) is.[9] Here I([0,\infty)) is the indicator function of the positive half-line. Alternatively, a random variable X supported on the real line is subexponential if and only if X^+ = \max(0,X) is subexponential.

All subexponential distributions are long-tailed, but examples can be constructed of long-tailed distributions that are not subexponential.
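The defining relation \overline{F^{*2}}(x) \sim 2\overline{F}(x) can be checked numerically for a specific case. The sketch below assumes a Pareto distribution with \overline{F}(x) = x^{-\alpha} for x \geq 1 and evaluates the convolution tail with a simple midpoint rule, using \Pr(X_1+X_2>x) = \int_1^{x-1} f(y)\,\overline{F}(x-y)\,dy + \overline{F}(x-1) for x \geq 2, where f is the density. The choice of distribution, the value alpha = 2, the step count and all function names are illustrative assumptions.

```python
def pareto_tail(x, alpha=2.0):
    """Pr(X > x) for a Pareto distribution with minimum value 1."""
    return 1.0 if x < 1.0 else x ** -alpha

def pareto_density(y, alpha=2.0):
    """Density of the same Pareto distribution."""
    return alpha * y ** (-alpha - 1.0) if y >= 1.0 else 0.0

def convolution_tail(x, alpha=2.0, steps=100_000):
    """Pr(X1 + X2 > x) for x >= 2, by midpoint-rule integration."""
    lower, upper = 1.0, x - 1.0
    h = (upper - lower) / steps
    integral = 0.0
    for i in range(steps):
        y = lower + (i + 0.5) * h
        integral += pareto_density(y, alpha) * pareto_tail(x - y, alpha)
    # If X1 > x - 1 the sum exceeds x whatever X2 is, since X2 >= 1.
    return integral * h + pareto_tail(x - 1.0, alpha)

tail_ratios = {x: convolution_tail(x) / pareto_tail(x) for x in (10.0, 100.0, 1000.0)}

for x, r in tail_ratios.items():
    print(f"x={x:>7}  ratio: {r:.4f}")
```

The printed ratios should move towards 2 as x increases, in line with the definition.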

Common heavy-tailed distributions

All commonly used heavy-tailed distributions are subexponential.[10]

Those that are one-tailed include:

Those that are two-tailed include:

References

  1. Asmussen, Applied Probability and Queues, 2003
  2. Asmussen, Applied Probability and Queues, 2003
  3. Rolski, Schmidli, Schmidt, Teugels, Stochastic Processes for Insurance and Finance, 1999
  4. Asmussen, Applied Probability and Queues, 2003
  5. Asmussen, Applied Probability and Queues, 2003
  6. Embrechts, Klüppelberg, Mikosch, Modelling Extremal Events, 1997
  7. Embrechts, Klüppelberg, Mikosch, Modelling Extremal Events, 1997
  8. Foss, Konstantopoulos, Zachary, "Discrete and continuous time modulated random walks with heavy-tailed increments", Journal of Theoretical Probability, 20 (2007), No. 3, 581-612
  9. Willekens, E., Subexponentiality on the real line, Technical Report, K.U. Leuven, 1986
  10. Embrechts, Klüppelberg, Mikosch, Modelling Extremal Events, 1997
