CP decomposition
In multilinear algebra, the canonical polyadic decomposition (CPD), historically known as PARAFAC (parallel factor analysis) and CANDECOMP (canonical decomposition), is a generalization of the matrix singular value decomposition (SVD) to tensors, with applications in statistics, signal processing, psychometrics, linguistics and chemometrics. The PARAFAC and CANDECOMP formulations originate from psychometrics,[1][2] although the underlying idea goes back to Hitchcock in 1927.[3]
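For orientation, the third-order case can be written out explicitly. The symbols below (the tensor $\mathcal{X}$, the number of components $R$, the weights $\lambda_r$ and the factor vectors $\mathbf{a}_r, \mathbf{b}_r, \mathbf{c}_r$) are notation chosen for this illustration, not taken from the sources cited above.

```latex
% CPD of a third-order tensor as a sum of R rank-one terms
% (\circ denotes the vector outer product)
\mathcal{X} \approx \sum_{r=1}^{R} \lambda_r \, \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r ,
\qquad
x_{ijk} \approx \sum_{r=1}^{R} \lambda_r \, a_{ir} \, b_{jr} \, c_{kr} .
```

For a matrix (a second-order tensor) the same expression reduces to a sum of rank-one matrices $\lambda_r \, \mathbf{a}_r \mathbf{b}_r^{\mathsf{T}}$, which is the form the SVD takes, although the CPD does not in general require the factor vectors to be orthogonal.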
Existence and uniqueness
Calculating the CPD
Alternating algorithms:
- alternating least squares (ALS)
- alternating slice-wise diagonalisation (ASD)
Algebraic algorithms:
- simultaneous diagonalization (SD)
- simultaneous generalized Schur decomposition (SGSD)
Optimization algorithms:
- Levenberg–Marquardt (LM)
- nonlinear conjugate gradient (NCG)
- limited memory BFGS (L-BFGS)
Direct methods:
- direct multilinear decomposition (DMLD)
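As an illustration of the first family, the following is a minimal sketch of alternating least squares for a three-way array, assuming the trilinear model x[i, j, k] ≈ Σ_r A[i, r] B[j, r] C[k, r]. The function names `cp_als` and `khatri_rao`, the fixed iteration count and the random initialization are choices made for this example only; practical implementations typically add column normalization, a convergence test and multiple restarts.

```python
import numpy as np

def khatri_rao(B, C):
    """Column-wise Kronecker product: (J x R) and (K x R) -> (J*K x R)."""
    R = B.shape[1]
    return (B[:, None, :] * C[None, :, :]).reshape(-1, R)

def cp_als(X, rank, n_iter=300, seed=0):
    """Fit a rank-`rank` CP model to a 3-way array X by plain ALS (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    # Unfoldings whose column order matches khatri_rao above (row-major).
    X1 = X.reshape(I, J * K)                     # rows i, columns (j, k)
    X2 = np.moveaxis(X, 1, 0).reshape(J, I * K)  # rows j, columns (i, k)
    X3 = np.moveaxis(X, 2, 0).reshape(K, I * J)  # rows k, columns (i, j)
    for _ in range(n_iter):
        # Each update is the least-squares solution for one factor matrix
        # with the other two held fixed.
        A = X1 @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = X2 @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = X3 @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

# Usage: recover a synthetic rank-2 tensor (sizes are arbitrary).
rng = np.random.default_rng(1)
A0, B0, C0 = (rng.standard_normal((n, 2)) for n in (4, 5, 6))
X = np.einsum("ir,jr,kr->ijk", A0, B0, C0)
A, B, C = cp_als(X, rank=2)
X_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
rel_err = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_err:.2e}")
```

Each of the three updates solves a linear least-squares problem exactly, so the fit never gets worse from one step to the next, but convergence to the global optimum is not guaranteed.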
Applications of the CPD
Chemometrics
Multi-way data are characterized by several sets of categorical variables that are measured in a crossed fashion. Chemical examples include fluorescence emission spectra measured at several excitation wavelengths for several samples, fluorescence lifetimes measured at several excitation and emission wavelengths, and any kind of spectrum measured chromatographically for several samples. Measurements of this kind give rise to three-way data; that is, the data can be arranged in a cube rather than in a matrix as in standard multivariate data sets.
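The arrangement into a cube can be sketched in a few lines. The array sizes and the random placeholder values below are assumptions made for illustration; real data would be one excitation-emission matrix per measured sample.

```python
import numpy as np

# Hypothetical sizes: 5 samples, 8 excitation and 20 emission wavelengths.
n_samples, n_excitation, n_emission = 5, 8, 20
rng = np.random.default_rng(0)

# One excitation-by-emission matrix per sample (random placeholders here).
eems = [rng.random((n_excitation, n_emission)) for _ in range(n_samples)]

# Stacking the matrices along a new first axis gives the three-way array.
cube = np.stack(eems, axis=0)
print(cube.shape)  # (5, 8, 20)
```

Each slice `cube[i]` is the matrix for sample `i`, so the three modes correspond to samples, excitation wavelengths and emission wavelengths.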
Other decompositions
PARAFAC is one of several decomposition methods for multi-way data. Its two main competitors are the Tucker3 method and the simple approach of unfolding the multi-way array into a matrix and then applying standard two-way methods such as principal component analysis (PCA). The Tucker3 method should rightfully be called three-mode principal component analysis (or N-mode principal component analysis), but the term Tucker3, or simply Tucker decomposition, is used here instead. PARAFAC, Tucker and two-way PCA are all multilinear or bilinear decomposition methods: they decompose the array into sets of "scores" and "loadings" that ideally describe the data in a more condensed form than the original array. Each method has advantages and disadvantages, and it is often necessary to try several of them to find the most appropriate one.
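The unfolding approach can be sketched as follows, assuming a samples × excitation × emission array filled with placeholder values and PCA computed through the SVD of the column-centered unfolded matrix. The variable names and the choice of two components are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((5, 8, 20))  # placeholder three-way array, samples first

# Unfold: every sample becomes one row of length 8 * 20 = 160.
X_unf = X.reshape(X.shape[0], -1)

# Two-way PCA via SVD of the column-centered matrix.
Xc = X_unf - X_unf.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

n_comp = 2  # assumed number of components
scores = U[:, :n_comp] * s[:n_comp]  # one row of scores per sample
loadings = Vt[:n_comp].T             # one column of loadings per component
print(scores.shape, loadings.shape)  # (5, 2) (160, 2)
```

Note that each loading vector here has 160 entries mixing both wavelength modes, whereas a PARAFAC component would be described by separate excitation and emission loadings.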
In the field of chemometrics, a number of diagnostic tools and techniques exist to help a PARAFAC user determine the best fitting model. These include the core consistency diagnostic (CORCONDIA),[4] split-half analyses,[5] examination of the loadings,[6] and residual analysis.[6]
See also
- Latent class analysis
- Multilinear subspace learning
- Singular value decomposition
- Tucker decomposition
References
- Carroll, J. D.; Chang, J. (1970). "Analysis of individual differences in multidimensional scaling via an n-way generalization of 'Eckart–Young' decomposition". Psychometrika 35: 283–319. doi:10.1007/BF02310791.
- Harshman, Richard A. (1970). "Foundations of the PARAFAC procedure: Models and conditions for an 'explanatory' multi-modal factor analysis". UCLA Working Papers in Phonetics (Ann Arbor: University Microfilms) 16: 84. No. 10,085.
- Hitchcock, F. L. (1927). "The expression of a tensor or a polyadic as a sum of products". Journal of Mathematics and Physics 6: 164–189.
- Bro, R.; Kiers, H. A. L. (2003). "A new efficient method for determining the number of components in PARAFAC models". Journal of Chemometrics 17 (5): 274–286.
- Bro, R. (1997). "PARAFAC. Tutorial and applications". Chemometrics and Intelligent Laboratory Systems 38 (2): 149–171.
- Stedmon, C. A.; Bro, R. (2008). "Characterizing dissolved organic matter fluorescence with parallel factor analysis: a tutorial". Limnology and Oceanography: Methods 6: 572–579.
Further reading
- Kolda, Tamara G.; Bader, Brett W. (2009). "Tensor Decompositions and Applications". SIAM Rev. 51: 455–500. doi:10.1137/07070111X. CiteSeerX: 10.1.1.153.2059.
External links
- PARAFAC Tutorial
- Parallel Factor Analysis (PARAFAC)
- FactoMineR (free exploratory multivariate data analysis software linked to R)