Compositional data

In statistics, compositional data are quantitative descriptions of the parts of some whole, conveying exclusively relative information.

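This definition, given by John Aitchison (1986), has several consequences. In particular, because only relative information matters, a vector of positive parts can be rescaled so that its parts sum to a fixed constant without changing the information it conveys. This rescaling is called closure and is denoted by \mathcal{C}[\cdot]:

\mathcal{C}[x_1,x_2,\dots,x_D]=\left[\frac{x_1}{\sum_{i=1}^D x_i},\frac{x_2}{\sum_{i=1}^D x_i}, \dots,\frac{x_D}{\sum_{i=1}^D x_i}\right],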

where D is the number of parts (components) and [\cdot] denotes a row vector.
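The closure operation can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `closure`, the optional `kappa` argument, and the sample values are assumptions chosen for the example. With the default `kappa=1.0` it matches the formula above.

```python
def closure(x, kappa=1.0):
    """Rescale strictly positive parts so that they sum to kappa.

    The name and the kappa parameter are illustrative assumptions;
    kappa=1.0 reproduces the closure formula given in the text.
    """
    if any(xi <= 0 for xi in x):
        raise ValueError("all parts must be strictly positive")
    total = sum(x)
    return [kappa * xi / total for xi in x]


# Two hypothetical raw measurements that differ only by a scale factor.
raw_a = [2.0, 6.0, 12.0]
raw_b = [1.0, 3.0, 6.0]

closed_a = closure(raw_a)
closed_b = closure(raw_b)
```

Both `closed_a` and `closed_b` describe the same composition, which illustrates that closure discards the overall scale and keeps only the relative information.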

Vectors with strictly positive components and a constant sum \kappa form the simplex

\mathcal{S}^D=\left\{\mathbf{x}=[x_1,x_2,\dots,x_D]\in\mathbb{R}^D \left| x_i>0,i=1,2,\dots,D; \sum_{i=1}^D x_i=\kappa \right. \right\}.

This is the reason why \mathcal{S}^D is considered to be the sample space of compositional data. The positive constant \kappa is arbitrary. Frequent values for \kappa are 1 (per unit), 100 (percent, %), 1000 (parts per thousand), 10^6 (ppm), 10^9 (ppb), ...
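Because \kappa is arbitrary, the same composition can be expressed in any of these units. The following Python sketch checks membership in the simplex for a given \kappa and converts between values of \kappa. The helper names `in_simplex` and `rescale`, the tolerance, and the sample composition are assumptions made for this illustration.

```python
import math


def in_simplex(x, kappa=1.0, rel_tol=1e-9):
    """Return True if all parts are strictly positive and sum to kappa.

    The sum is compared up to a relative tolerance (an assumption of
    this sketch) to allow for floating-point rounding.
    """
    return all(xi > 0 for xi in x) and math.isclose(sum(x), kappa, rel_tol=rel_tol)


def rescale(x, kappa_from, kappa_to):
    """Re-express a composition that sums to kappa_from so it sums to kappa_to."""
    return [xi * kappa_to / kappa_from for xi in x]


# A hypothetical three-part composition given per unit (kappa = 1).
proportions = [0.1, 0.3, 0.6]

percent = rescale(proportions, 1.0, 100.0)  # kappa = 100
ppm = rescale(proportions, 1.0, 1e6)        # kappa = 10^6
```

The ratios between parts are the same in `proportions`, `percent`, and `ppm`; only the constant sum changes.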

Remarks on the definition of the simplex:

Examples

References