Phase-type distribution


A phase-type distribution is a probability distribution that results from a system of one or more inter-related Poisson processes occurring in sequence, or phases. The sequence in which each of the phases occurs may itself be a stochastic process. The distribution can be represented by a random variable describing the time until absorption of a Markov chain with one absorbing state. Each of the states of the Markov chain represents one of the phases.

The following probability distributions are all considered special cases of a phase-type distribution (a sketch constructing a few of these representations follows the list):

  • Exponential distribution - 1 phase.
  • Erlang distribution - 2 or more identical phases in sequence.
  • Deterministic distribution (or constant) - The limiting case of an Erlang distribution, as the number of phases becomes infinite while the time in each state becomes zero.
  • Coxian distribution - 2 or more (not necessarily identical) phases in sequence, with a probability of transitioning to the terminating/absorbing state after each phase.
  • Hyperexponential distribution - 2 or more non-identical phases that each have a probability of occurring in a mutually exclusive, or parallel, manner. (Note: The exponential distribution is the degenerate case in which all of the parallel phases are identical.)
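
As an illustration, in a continuous phase-type representation each distribution above is characterized by an initial probability vector over the phases and a sub-generator matrix of transition rates between them (the full generator appends the absorbing state). Below is a minimal sketch, assuming NumPy; all rates and branch probabilities are arbitrary illustrative choices, not values from this article:

```python
import numpy as np

# Sub-generator matrices S of continuous phase-type representations for
# the special cases above. The full generator is [[S, s0], [0, 0]], where
# s0 = -S @ ones gives the rates into the absorbing state.

lam = 2.0  # illustrative rate

# Exponential(lam): a single phase.
S_exp = np.array([[-lam]])

# Erlang with 3 identical phases in sequence, each with rate lam.
S_erl = np.array([[-lam,  lam,  0.0],
                  [ 0.0, -lam,  lam],
                  [ 0.0,  0.0, -lam]])

# Hyperexponential: two parallel phases with rates lam1, lam2,
# entered with probabilities p and 1 - p respectively.
lam1, lam2, p = 1.0, 3.0, 0.4
S_hyp = np.array([[-lam1,  0.0],
                  [ 0.0, -lam2]])
alpha_hyp = np.array([p, 1.0 - p])

# Exit-rate vectors are non-negative, as required.
for S in (S_exp, S_erl, S_hyp):
    s0 = -S @ np.ones(S.shape[0])
    assert (s0 >= 0).all()
```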



Discrete phase-type distribution

Definition and theorems

Assume that for m\geq 1 the matrix \mathbf{P} is the transition matrix of an (m+1)-state discrete-time Markov chain with one absorbing state. Arranging the states so that the (m+1)-th state is the absorbing one, we have,

\mathbf{P}=\left[\begin{matrix}\mathbf{T}&\vec\mathbf{T}^0\\\vec\mathbf{0}&1\end{matrix}\right].

Here, \mathbf{T} is an m\times m matrix and \vec\mathbf{T}^0 is an m\times 1 vector. As \mathbf{P} is a stochastic matrix, we have,

\vec\mathbf{T}^0+\mathbf{T}\vec\mathbf{1}=\vec\mathbf{1}.

Here, \vec\mathbf{1} is an m\times 1 vector with all elements equal to one. Rearranging the previous equation, we have,

\vec\mathbf{T}^0=\left(\mathbf{I}-\mathbf{T}\right)\vec\mathbf{1}.

Hence, if \left(\mathbf{I}-\mathbf{T}\right) is non-singular, we have,

\left(\mathbf{I}-\mathbf{T}\right)^{-1}\vec\mathbf{T}^0=\vec\mathbf{1}.
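
As a concrete check of these identities, the following minimal sketch (assuming NumPy; the particular matrix \mathbf{T} is an arbitrary illustrative choice) builds \mathbf{P} for m = 2 and verifies both that \mathbf{P} is stochastic and that \left(\mathbf{I}-\mathbf{T}\right)^{-1}\vec\mathbf{T}^0=\vec\mathbf{1}:

```python
import numpy as np

# An arbitrary illustrative example with m = 2 transient states.
T = np.array([[0.4, 0.3],
              [0.2, 0.5]])
ones = np.ones(2)

# T0 = (I - T) @ 1: the one-step absorption probability from each state.
T0 = (np.eye(2) - T) @ ones

# Full (m+1)-state transition matrix P, with the absorbing state last.
P = np.block([[T, T0[:, None]],
              [np.zeros((1, 2)), np.ones((1, 1))]])

assert np.allclose(P.sum(axis=1), 1.0)                        # P is stochastic
assert np.allclose(np.linalg.solve(np.eye(2) - T, T0), ones)  # (I-T)^{-1} T0 = 1
```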

We now show that \mathbf{I}-\mathbf{T} is in fact always non-singular. Assume the contrary; then there exists a non-zero vector \vec\mathbf{x} for which,

\left(\mathbf{I}-\mathbf{T}\right)\vec\mathbf{x}=\vec\mathbf{0}\Rightarrow\mathbf{T}\vec\mathbf{x}=\vec\mathbf{x}.

Writing \vec\mathbf{x}=[x_1,\cdots,x_m]^T we have,

\forall i,\; x_i=\sum_{j=1}^m\mathbf{T}_{ij}x_j\Rightarrow \left|x_i\right|=\left|\sum_{j=1}^m\mathbf{T}_{ij}x_j\right|\leq\sum_{j=1}^m\left|\mathbf{T}_{ij}\right|\left|x_j\right|.

Note that equality holds only when all the x_i have the same sign (the entries \mathbf{T}_{ij} are non-negative). Let x_k be the element of \vec\mathbf{x} with the largest absolute value. Hence,

\left|x_k\right|\leq\sum_{j=1}^m\left|\mathbf{T}_{kj}\right|\left|x_j\right|\leq\sum_{j=1}^m\left|\mathbf{T}_{kj}\right|\left|x_k\right|=\left|x_k\right|\sum_{j=1}^m\left|\mathbf{T}_{kj}\right|\leq\left|x_k\right|.

Hence, all of these inequalities must hold with equality. This means every \left|x_i\right| equals \left|x_k\right|, and since all the x_i have the same sign, \vec\mathbf{x}=\lambda\vec\mathbf{1} for some non-zero scalar \lambda. Substituting into \mathbf{T}\vec\mathbf{x}=\vec\mathbf{x}, we have,

\lambda\mathbf{T}\vec\mathbf{1}=\lambda\vec\mathbf{1}\Rightarrow\mathbf{T}\vec\mathbf{1}=\vec\mathbf{1}.

As the elements of \mathbf{T}\vec\mathbf{1} are the row sums of \mathbf{T}, this would make \mathbf{T} stochastic, which in turn forces \vec\mathbf{T}^0=\vec\mathbf{0}; the absorbing state would then be unreachable from the other states, making the time until absorption infinite. From this argument we obtain a constraint on the matrix \mathbf{T}: at least one row of \mathbf{T} must sum to less than one. A similar argument yields another, more important, conclusion: \mathbf{T} cannot have any right eigenvalue with absolute value of one or more. This means that the matrix \mathbf{T}^k tends to zero as k goes to infinity.
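
This conclusion is easy to check numerically; a minimal sketch, assuming NumPy, with the same arbitrary sub-stochastic \mathbf{T} as above:

```python
import numpy as np

T = np.array([[0.4, 0.3],
              [0.2, 0.5]])

# Every eigenvalue of T lies strictly inside the unit circle...
assert max(abs(np.linalg.eigvals(T))) < 1.0

# ...so the powers T^k decay to the zero matrix.
assert np.allclose(np.linalg.matrix_power(T, 200), 0.0)
```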

The time until absorption of the Markov chain represented by the matrix \mathbf{P} given above is denoted PH(\vec\mathbf{\alpha},\mathbf{T}). Here, \vec\mathbf{\alpha} is the initial probability vector: a 1\times m vector with non-negative entries summing to one. The probability mass function of this random variable is,

f\left(k\right)=\vec\mathbf{\alpha}\mathbf{T}^{k-1}\vec\mathbf{T}^0, k>0.
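
The mass function can be evaluated directly from this formula. A minimal sketch (assuming NumPy; \vec\mathbf{\alpha} and \mathbf{T} are arbitrary illustrative values), which also checks that the masses sum to one:

```python
import numpy as np

# Arbitrary illustrative PH(alpha, T) with m = 2 phases.
alpha = np.array([0.6, 0.4])
T = np.array([[0.4, 0.3],
              [0.2, 0.5]])
T0 = (np.eye(2) - T) @ np.ones(2)

def pmf(k):
    """f(k) = alpha T^(k-1) T0 for integer k >= 1."""
    return alpha @ np.linalg.matrix_power(T, k - 1) @ T0

# The masses over k = 1, 2, ... sum to one (truncated sum; the tail is
# negligible because T^k decays geometrically).
assert np.isclose(sum(pmf(k) for k in range(1, 400)), 1.0)
```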

Hence, the cumulative distribution function is,

F\left(k\right)=\sum_{i=1}^k f\left(i\right)= \sum_{i=1}^k\vec\mathbf{\alpha}\mathbf{T}^{i-1}\vec\mathbf{T}^0= \vec\mathbf{\alpha}\sum_{i=1}^k\mathbf{T}^{i-1}\vec\mathbf{T}^0= \vec\mathbf{\alpha}\left(\mathbf{I}-\mathbf{T}^k\right)\left(\mathbf{I}-\mathbf{T}\right)^{-1}\vec\mathbf{T}^0.

Using \left(\mathbf{I}-\mathbf{T}\right)^{-1}\vec\mathbf{T}^0=\vec\mathbf{1} from above, we have,

F\left(k\right)= \vec\mathbf{\alpha}\left(\mathbf{I}-\mathbf{T}^k\right)\vec\mathbf{1}= \vec\mathbf{\alpha}\vec\mathbf{1}-\vec\mathbf{\alpha}\mathbf{T}^k\vec\mathbf{1}= 1-\vec\mathbf{\alpha}\mathbf{T}^k\vec\mathbf{1}.
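
The closed form can be checked against the direct sum of the mass function; a minimal sketch with the same illustrative parameters as before:

```python
import numpy as np

alpha = np.array([0.6, 0.4])
T = np.array([[0.4, 0.3],
              [0.2, 0.5]])
T0 = (np.eye(2) - T) @ np.ones(2)

def cdf_by_sum(k):
    """Direct sum of the mass function up to k."""
    return sum(alpha @ np.linalg.matrix_power(T, i - 1) @ T0
               for i in range(1, k + 1))

def cdf_closed_form(k):
    """F(k) = 1 - alpha T^k 1."""
    return 1.0 - alpha @ np.linalg.matrix_power(T, k) @ np.ones(2)

assert all(np.isclose(cdf_by_sum(k), cdf_closed_form(k)) for k in range(1, 30))
```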

The probability generating function is,

f(z)=\sum_{k=1}^\infty z^kf(k)= \sum_{k=1}^\infty z^k\vec\mathbf{\alpha}\mathbf{T}^{k-1}\vec\mathbf{T}^0= z\vec\mathbf{\alpha}\sum_{k=1}^\infty z^{k-1}\mathbf{T}^{k-1}\vec\mathbf{T}^0= z\vec\mathbf{\alpha}\left(\mathbf{I}-z\mathbf{T}\right)^{-1}\vec\mathbf{T}^0.
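
The geometric series in the last step converges for |z| < 1/\rho(\mathbf{T}), where \rho(\mathbf{T}) < 1 is the spectral radius of \mathbf{T}. A minimal sketch comparing the closed form against a truncated series (assuming NumPy; parameters are the same illustrative values as above):

```python
import numpy as np

alpha = np.array([0.6, 0.4])
T = np.array([[0.4, 0.3],
              [0.2, 0.5]])
T0 = (np.eye(2) - T) @ np.ones(2)

def pgf_closed_form(z):
    """f(z) = z alpha (I - zT)^(-1) T0."""
    return z * (alpha @ np.linalg.solve(np.eye(2) - z * T, T0))

def pgf_by_series(z, terms=500):
    """Truncated power series: sum over k of z^k f(k)."""
    return sum(z**k * (alpha @ np.linalg.matrix_power(T, k - 1) @ T0)
               for k in range(1, terms + 1))

z = 0.9  # within the radius of convergence, since rho(T) = 0.7 here
assert np.isclose(pgf_closed_form(z), pgf_by_series(z))
```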

