Total sum of squares

In statistical data analysis, the total sum of squares (TSS or SST) is a quantity that appears in the standard presentation of the results of such analyses. It is defined as the sum, over all observations, of the squared differences of each observation from the overall mean.[1]

In statistical linear models (particularly standard regression models), the TSS is the sum of the squared differences between each observed value of the dependent variable and its mean:

\mathrm{TSS}=\sum_{i=1}^{n}\left(y_{i}-\bar{y}\right)^2

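where y_{i} is the i-th observed value of the dependent variable, n is the number of observations, and \bar{y} is the mean of the observed values.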
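As a minimal sketch of the definition, the TSS can be computed directly from a list of observations. The function name total_sum_of_squares and the sample data are illustrative assumptions, not part of any particular library.

```python
def total_sum_of_squares(y):
    """Return the sum of squared deviations of the values in y from their mean."""
    n = len(y)
    if n == 0:
        raise ValueError("y must contain at least one observation")
    y_bar = sum(y) / n  # overall mean of the observations
    return sum((y_i - y_bar) ** 2 for y_i in y)


# Illustrative data (an assumption): the mean is 4.0,
# so the squared deviations are 4, 0, 0 and 4.
y = [2.0, 4.0, 4.0, 6.0]
tss = total_sum_of_squares(y)
print(tss)  # 8.0
```

Dividing this quantity by n - 1 would give the sample variance, which is one reason the TSS is often read as a measure of the total variation in the data.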

For wide classes of linear models, including ordinary least squares (OLS) regression with an intercept term, the total sum of squares equals the explained sum of squares plus the residual sum of squares. For a proof of this in the multivariate OLS case, see partitioning in the general OLS model.
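The decomposition can be checked numerically. The sketch below assumes the simplest case, a least-squares line with an intercept fitted to a single predictor; the helper simple_ols and the data are hypothetical and chosen only for illustration.

```python
def simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((x_i - x_bar) ** 2 for x_i in x)
    sxy = sum((x_i - x_bar) * (y_i - y_bar) for x_i, y_i in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Illustrative data (an assumption, not taken from the article).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = simple_ols(x, y)
fitted = [intercept + slope * x_i for x_i in x]
y_bar = sum(y) / len(y)

tss = sum((y_i - y_bar) ** 2 for y_i in y)                 # total
ess = sum((f_i - y_bar) ** 2 for f_i in fitted)            # explained
rss = sum((y_i - f_i) ** 2 for y_i, f_i in zip(y, fitted))  # residual

# The two sides agree up to floating-point rounding.
print(tss, ess + rss)
```

The identity relies on the fitted model containing an intercept; if the line is forced through the origin, the cross term between residuals and fitted deviations need not vanish and the two sides can differ.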

In analysis of variance (ANOVA), the total sum of squares is the sum of the so-called "within-samples" sum of squares and the "between-samples" sum of squares; this decomposition is known as partitioning of the sum of squares. In multivariate analysis of variance (MANOVA), the following equation applies:[2]

\mathbf{T} = \mathbf{W} + \mathbf{B},

where T is the total sum of squares and products (SSP) matrix, W is the within-samples SSP matrix and B is the between-samples SSP matrix. Similar terminology may also be used in linear discriminant analysis, where W and B are respectively referred to as the within-groups and between-groups SSP matrices.[2]
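The matrix identity can be illustrated with a small numerical sketch. It assumes the usual definitions: T is built from deviations about the grand mean, W from deviations about each group's own mean, and B from the group-size-weighted deviations of the group means about the grand mean. The helper names and the two-group data are hypothetical.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length observation vectors."""
    n = len(rows)
    p = len(rows[0])
    return [sum(r[j] for r in rows) / n for j in range(p)]


def ssp(rows, center):
    """Sum of outer products of (row - center) over all rows."""
    p = len(center)
    m = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [r[j] - center[j] for j in range(p)]
        for a in range(p):
            for b in range(p):
                m[a][b] += d[a] * d[b]
    return m


# Illustrative data (an assumption): two groups, two variables per observation.
groups = {
    "a": [[1.0, 2.0], [2.0, 3.0], [3.0, 5.0]],
    "b": [[6.0, 5.0], [7.0, 8.0], [8.0, 8.0], [9.0, 9.0]],
}

all_rows = [r for rows in groups.values() for r in rows]
grand_mean = mean_vector(all_rows)
p = len(grand_mean)

T = ssp(all_rows, grand_mean)  # total SSP matrix

W = [[0.0] * p for _ in range(p)]  # within-samples SSP matrix
B = [[0.0] * p for _ in range(p)]  # between-samples SSP matrix
for rows in groups.values():
    group_mean = mean_vector(rows)
    W_g = ssp(rows, group_mean)
    d = [group_mean[j] - grand_mean[j] for j in range(p)]
    for a in range(p):
        for b in range(p):
            W[a][b] += W_g[a][b]
            B[a][b] += len(rows) * d[a] * d[b]

# Each entry of T matches the corresponding entry of W + B up to rounding.
for a in range(p):
    print(T[a], [W[a][b] + B[a][b] for b in range(p)])
```

The diagonal entries reproduce the univariate ANOVA partition for each variable separately: T[0][0] is the total sum of squares of the first variable, split into its within-samples part W[0][0] and its between-samples part B[0][0].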

References

  1. Everitt, B.S. (2002) The Cambridge Dictionary of Statistics, CUP, ISBN 0-521-81099-X
  2. Mardia, K. V., Kent, J. T. and Bibby, J. M. (1979) Multivariate Analysis, Academic Press, ISBN 0-12-471252-5. Especially chapters 11 and 12.