Generalized canonical correlation

From Wikipedia, the free encyclopedia

In statistics, generalized canonical correlation analysis (gCCA) is a way of making sense of the cross-correlation matrices between sets of random variables when there are more than two sets. Whereas conventional canonical correlation analysis (CCA) generalizes principal component analysis (PCA) to two sets of random variables, gCCA generalizes PCA to more than two sets. The canonical variables represent the common factors that can be found by one large PCA of all the transformed random variables, after each set has undergone its own PCA.
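The two-stage description above (a PCA within each set, followed by one large PCA over all the transformed variables) can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not a reference implementation: the function names whiten and gcca_sketch, the loadings, the noise level and the synthetic data are all hypothetical, and each per-set PCA is assumed to rescale its components to a common scale.

```python
import numpy as np

def whiten(X, tol=1e-10):
    """Per-set PCA: centre X and return its principal components on a common scale."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    keep = s > tol * s.max()          # drop numerically null directions
    return U[:, keep]                 # orthonormal columns spanning the set

def gcca_sketch(blocks, n_factors=1):
    """One large PCA (via SVD) over all transformed sets, stacked column-wise."""
    Z = np.hstack([whiten(X) for X in blocks])
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    # Leading left singular vectors play the role of the common factors;
    # the squared singular values measure how strongly the sets share them.
    return U[:, :n_factors], s[:n_factors] ** 2

# Synthetic example (assumed data): three sets driven by one shared factor.
rng = np.random.default_rng(0)
n = 500
common = rng.standard_normal(n)
loadings = np.array([1.0, -0.8, 0.6])          # hypothetical loadings
blocks = [np.outer(common, loadings) + 0.5 * rng.standard_normal((n, 3))
          for _ in range(3)]

factors, eigvals = gcca_sketch(blocks, n_factors=1)
corr = abs(np.corrcoef(factors[:, 0], common)[0, 1])
print("leading eigenvalue:", eigvals[0])
print("|correlation| with the true common factor:", corr)
```

With k sets, the leading eigenvalue in this sketch lies between 1 (no shared structure) and k (a factor shared perfectly by every set), which gives a rough sense of how strong a common factor is.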

Applications

The Helmert-Wolf blocking (HWB) method of estimating linear regression parameters can find an optimal solution only if all cross-correlations between the data blocks are zero. These cross-correlations can always be made to vanish by introducing a new regression parameter for each common factor. gCCA can be used to find the harmful common factors that create cross-correlation between the blocks. However, no optimal HWB solution exists if the random variables do not contain enough information on all of the new regression parameters.
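The argument can be made concrete with a simplified two-block model. The notation below is an assumption introduced for illustration only and is not taken from the HWB literature: y_k is the data of block k, A_k its design matrix, x_k its regression parameters, z a single common factor with loading vector c_k, and n_k noise that is independent between blocks and of z.

```latex
% Errors that share a common factor z are cross-correlated between blocks:
y_k = A_k x_k + e_k, \qquad e_k = c_k z + n_k, \qquad k = 1, 2,
\\
\operatorname{Cov}(e_1, e_2) = \sigma_z^2 \, c_1 c_2^{\mathsf T} \neq 0 .

% Treating z as a new regression parameter moves it into the design matrix:
y_k = \begin{bmatrix} A_k & c_k \end{bmatrix}
      \begin{bmatrix} x_k \\ z \end{bmatrix} + n_k ,
\qquad
\operatorname{Cov}(n_1, n_2) = 0 .
```

In this sketch the remaining errors n_1 and n_2 are uncorrelated between blocks, as the method requires. The new parameter z is estimable only if the augmented design matrices, taken together, have full column rank, which corresponds to the requirement that the data contain enough information on the new parameters.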

External links

  • FactoMineR (free exploratory multivariate data analysis software linked to R)