Generalized Dirichlet distribution

From Wikipedia, the free encyclopedia

In statistics, the generalized Dirichlet distribution (GD) is a generalization of the Dirichlet distribution with a more general covariance structure and almost twice the number of parameters. Random vectors with a GD distribution are completely neutral [1].

The density function of p_1,\ldots,p_{k-1} is


\left[
\prod_{i=1}^{k-1}B(a_i,b_i)\right]^{-1}
p_k^{b_{k-1}-1}
\prod_{i=1}^{k-1}\left[
p_i^{a_i-1}\left(\sum_{j=i}^kp_j\right)^{b_{i-1}-(a_i+b_i)}\right]

where we define p_k= 1- \sum_{i=1}^{k-1}p_i. Here B(x,y) denotes the Beta function. This reduces to the standard Dirichlet distribution if b_{i-1}=a_i+b_i for 2\leqslant i\leqslant k-1 (b_0 is arbitrary).
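
As a rough numerical illustration of the density above and of the stated reduction, the following sketch evaluates the log-density in the Connor and Mosimann parameterization and compares it with an ordinary Dirichlet log-density when b_{i-1}=a_i+b_i. The function names cm_log_density and dirichlet_log_density are hypothetical, introduced only for this example, and points on the boundary of the simplex are simply assigned a log-density of minus infinity.

```python
import math


def log_beta(x, y):
    """Logarithm of the Beta function B(x, y)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)


def cm_log_density(p, a, b):
    """Log-density of (p_1, ..., p_{k-1}) in the Connor-Mosimann form.

    p, a and b each hold k-1 values; p_k is defined as 1 - sum(p).
    Simplification for this sketch: boundary points return -inf.
    """
    m = len(p)  # m = k - 1
    if not (len(a) == len(b) == m):
        raise ValueError("p, a and b must have the same length")
    p_k = 1.0 - sum(p)
    if p_k <= 0 or any(v <= 0 for v in p):
        return -math.inf
    full = list(p) + [p_k]
    log_f = (b[m - 1] - 1.0) * math.log(p_k)
    for i in range(m):
        log_f += (a[i] - 1.0) * math.log(full[i]) - log_beta(a[i], b[i])
        if i > 0:
            # For i = 0 the tail sum is 1, so b_0 never matters.
            tail = sum(full[i:])
            log_f += (b[i - 1] - a[i] - b[i]) * math.log(tail)
    return log_f


def dirichlet_log_density(x, conc):
    """Log-density of a standard Dirichlet at x (all k components given)."""
    log_norm = math.lgamma(sum(conc)) - sum(math.lgamma(c) for c in conc)
    return log_norm + sum((c - 1.0) * math.log(v) for c, v in zip(conc, x))


if __name__ == "__main__":
    # b_1 = a_2 + b_2, so this should match Dirichlet(a_1, a_2, b_2).
    a, b = [2.0, 3.0], [4.5, 1.5]
    p = [0.2, 0.3]
    print(cm_log_density(p, a, b))
    print(dirichlet_log_density(p + [1.0 - sum(p)], [2.0, 3.0, 1.5]))
```

Under the reduction condition the two printed values should agree up to floating-point error, with the matching Dirichlet having parameters (a_1, ..., a_{k-1}, b_{k-1}).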

Wong [2] gives the slightly more concise form for x_1+\cdots +x_k\leqslant 1


\prod_{i=1}^k\frac{x_i^{\alpha_i-1}\left(1-x_1-\ldots-x_i\right)^{\gamma_i}}{B(\alpha_i,\beta_i)}

where \gamma_j=\beta_j-\alpha_{j+1}-\beta_{j+1} for 1\leqslant j\leqslant k-1 and \gamma_k=\beta_k-1. Note that Wong defines a distribution over a k-dimensional space (implicitly defining x_{k+1}=1-\sum_{i=1}^kx_i), while Connor and Mosimann use a (k-1)-dimensional space with x_k=1-\sum_{i=1}^{k-1}x_i. The remainder of this article uses Wong's notation.
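
The form given by Wong can be evaluated directly. The sketch below is one possible way to do so, working on the log scale for numerical stability; gd_log_density is a hypothetical name chosen for this example, and the exponents are taken as \gamma_j=\beta_j-\alpha_{j+1}-\beta_{j+1} for j<k and \gamma_k=\beta_k-1. As a simplification, points on the boundary of the support are assigned a log-density of minus infinity.

```python
import math


def log_beta(x, y):
    """Logarithm of the Beta function B(x, y)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)


def gd_log_density(x, alpha, beta):
    """Log-density of GD_k(alpha; beta) at x = (x_1, ..., x_k), Wong's form."""
    k = len(x)
    if not (len(alpha) == len(beta) == k):
        raise ValueError("x, alpha and beta must have the same length")
    log_f = 0.0
    remaining = 1.0  # holds 1 - x_1 - ... - x_j after step j
    for j in range(k):
        remaining -= x[j]
        if x[j] <= 0 or remaining <= 0:
            return -math.inf  # outside the open support
        if j < k - 1:
            gamma_j = beta[j] - alpha[j + 1] - beta[j + 1]
        else:
            gamma_j = beta[j] - 1.0
        log_f += ((alpha[j] - 1.0) * math.log(x[j])
                  + gamma_j * math.log(remaining)
                  - log_beta(alpha[j], beta[j]))
    return log_f


if __name__ == "__main__":
    # k = 1 is an ordinary Beta(2, 3) density.
    print(math.exp(gd_log_density([0.4], [2.0], [3.0])))
    # beta = (2, 1) with alpha = (1, 1) gives a uniform Dirichlet(1, 1, 1).
    print(math.exp(gd_log_density([0.2, 0.3], [1.0, 1.0], [2.0, 1.0])))
```

With k = 1 the function is just a Beta log-density, which gives a quick sanity check on the normalizing constant.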

General moment function

If X=\left(X_1,\ldots,X_k\right)\sim GD_k\left(\alpha_1,\ldots,\alpha_k;\beta_1,\ldots,\beta_k\right), then


E\left[X_1^{r_1}X_2^{r_2}\cdots X_k^{r_k}\right]=
\prod_{j=1}^k
\frac{
   \Gamma\left(\alpha_j+\beta_j\right)
   \Gamma\left(\alpha_j+r_j\right)
   \Gamma\left(\beta_j+\delta_j\right)
}{
   \Gamma\left(\alpha_j\right)
   \Gamma\left(\beta_j\right)
   \Gamma\left(\alpha_j+\beta_j+r_j+\delta_j\right)
}

where \delta_j=r_{j+1}+r_{j+2}+\cdots +r_k (so that \delta_k=0). In particular, taking r_j=1 and all other exponents equal to zero gives the mean


E\left(X_j\right)=\frac{\alpha_j}{\alpha_j+\beta_j}\prod_{m=1}^{j-1}\frac{\beta_m}{\alpha_m+\beta_m}
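
The moment formula is easy to evaluate with log-gamma functions. The following sketch, with the hypothetical helper names gd_log_moment and gd_mean, computes a general mixed moment and checks that first-order moments agree with the closed-form mean above.

```python
import math


def gd_log_moment(alpha, beta, r):
    """log E[X_1^{r_1} ... X_k^{r_k}] for X ~ GD_k(alpha; beta)."""
    k = len(alpha)
    if not (len(beta) == len(r) == k):
        raise ValueError("alpha, beta and r must have the same length")
    log_m = 0.0
    delta = 0.0  # delta_j = r_{j+1} + ... + r_k, accumulated from the right
    for j in reversed(range(k)):
        a, b = alpha[j], beta[j]
        log_m += (math.lgamma(a + b) + math.lgamma(a + r[j])
                  + math.lgamma(b + delta)
                  - math.lgamma(a) - math.lgamma(b)
                  - math.lgamma(a + b + r[j] + delta))
        delta += r[j]
    return log_m


def gd_mean(alpha, beta):
    """E(X_j) = alpha_j/(alpha_j+beta_j) * prod_{m<j} beta_m/(alpha_m+beta_m)."""
    means = []
    carry = 1.0  # running product over m < j
    for a, b in zip(alpha, beta):
        means.append(carry * a / (a + b))
        carry *= b / (a + b)
    return means


if __name__ == "__main__":
    alpha, beta = [2.0, 3.0, 1.5], [4.0, 2.0, 2.5]
    print(gd_mean(alpha, beta))
    # First moment of X_2 from the general formula, for comparison.
    print(math.exp(gd_log_moment(alpha, beta, [0, 1, 0])))
```

Higher-order and mixed moments, and hence variances and covariances, follow from the same function by choosing other exponent vectors r.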

Reduction to standard Dirichlet distribution

As stated above, in the notation of Connor and Mosimann, if b_{i-1}=a_i+b_i for 2\leqslant i\leqslant k-1 then the distribution reduces to a standard Dirichlet distribution. This condition differs from the usual pattern, in which setting the additional parameters to zero recovers the original distribution. For the GD, however, attempting to reparameterize in this way results in a very complicated density function.
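
The reduction can be checked by hand from the Connor and Mosimann density. The following is a sketch of the argument rather than a quotation from either reference: the exponents of the partial sums vanish, and the product of Beta functions telescopes.

```latex
% Assume b_{i-1} = a_i + b_i for 2 \leqslant i \leqslant k-1.
% Step 1: the exponents of the partial sums vanish
% (the i = 1 factor equals 1 in any case, since \sum_{j=1}^k p_j = 1).
b_{i-1}-(a_i+b_i)=0 \qquad (2\leqslant i\leqslant k-1)

% Step 2: the normalizing constant telescopes,
% because \Gamma(a_i+b_i)=\Gamma(b_{i-1}) for i \geqslant 2.
\prod_{i=1}^{k-1}B(a_i,b_i)
=\prod_{i=1}^{k-1}\frac{\Gamma(a_i)\Gamma(b_i)}{\Gamma(a_i+b_i)}
=\frac{\Gamma(b_{k-1})\prod_{i=1}^{k-1}\Gamma(a_i)}{\Gamma(a_1+b_1)}

% Step 3: the density becomes that of a Dirichlet distribution
% with parameters (a_1,\ldots,a_{k-1},b_{k-1}).
\frac{\Gamma(a_1+b_1)}{\Gamma(b_{k-1})\prod_{i=1}^{k-1}\Gamma(a_i)}\,
p_k^{b_{k-1}-1}\prod_{i=1}^{k-1}p_i^{a_i-1},
\qquad a_1+b_1=a_1+\cdots+a_{k-1}+b_{k-1}
```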

Bayesian analysis

Suppose X=\left(X_1,\ldots,X_k\right)\sim GD_k\left(\alpha_1,\ldots,\alpha_k;\beta_1,\ldots,\beta_k\right) is generalized Dirichlet, and that Y | X is multinomial with n trials (here Y=\left(Y_1,\ldots,Y_k\right)). Writing Y_j=y_j for 1\leqslant j\leqslant k and y_{k+1}=n-\sum_{i=1}^ky_i, the posterior distribution of X | Y is again generalized Dirichlet, with


X|Y\sim GD_k\left(
{\alpha'}_1,\ldots,{\alpha'}_k;
{\beta'}_1,\ldots,{\beta'}_k
\right)

where {\alpha'}_j=\alpha_j+y_j and {\beta'}_j=\beta_j+\sum_{i=j+1}^{k+1}y_i for 1\leqslant j\leqslant k.
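
Because the update is conjugate, the posterior parameters can be computed by simple counting. The sketch below illustrates this under the definitions above; gd_posterior is a hypothetical name chosen for this example.

```python
def gd_posterior(alpha, beta, y, n):
    """Posterior GD parameters after observing multinomial counts.

    alpha, beta: prior parameters, each of length k.
    y: observed counts y_1, ..., y_k; n: total number of trials,
    so that y_{k+1} = n - sum(y) is the count of the last category.
    """
    k = len(alpha)
    if not (len(beta) == len(y) == k):
        raise ValueError("alpha, beta and y must have the same length")
    y_last = n - sum(y)
    if y_last < 0 or any(c < 0 for c in y):
        raise ValueError("counts must be non-negative and sum to at most n")
    counts = list(y) + [y_last]
    alpha_post = [alpha[j] + counts[j] for j in range(k)]
    beta_post = []
    tail = n  # after subtracting y_j this equals y_{j+1} + ... + y_{k+1}
    for j in range(k):
        tail -= counts[j]
        beta_post.append(beta[j] + tail)
    return alpha_post, beta_post


if __name__ == "__main__":
    # Prior GD_2((1, 2); (3, 4)), counts (2, 1) out of n = 5 trials.
    print(gd_posterior([1, 2], [3, 4], [2, 1], 5))
```

Each \beta'_j absorbs the counts of all categories after the j-th, mirroring the tail sums 1-x_1-\cdots-x_j that appear in the density.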

References

  1. R. J. Connor and J. E. Mosimann (1969). Concepts of independence for proportions with a generalization of the Dirichlet distribution. Journal of the American Statistical Association, volume 64, pp. 194-206.
  2. T.-T. Wong (1998). Generalized Dirichlet distribution in Bayesian analysis. Applied Mathematics and Computation, volume 97, pp. 165-181.