Contrast (statistics)


In statistics, particularly analysis of variance and linear regression, a contrast is a linear combination of two or more factor level means (averages) whose coefficients add up to zero.^[1]^[2] Two contrasts are orthogonal when, in addition, the sum of the products of their corresponding coefficients is zero (assuming equal group sizes). Contrasts should be constructed "to answer specific research questions", and do not necessarily have to be orthogonal.^[3]

A contrast is defined as the sum of each group mean multiplied by a coefficient for each group (i.e., a signed number, c_j).^[4] In equation form, $L=c_{1}{\bar X}_{1}+c_{2}{\bar X}_{2}+\cdots+c_{k}{\bar X}_{k}=\sum_{j=1}^{k}c_{j}{\bar X}_{j}$, where L is the weighted sum of group means, the c_j coefficients represent the assigned weights of the means (these must sum to 0), and ${\bar X}_{j}$ represents the group means.^[5] Coefficients can be positive or negative, and fractions or whole numbers, depending on the comparison of interest. Linear contrasts can be used to test complex hypotheses in conjunction with ANOVA or multiple regression. In essence, each contrast defines and tests for a particular pattern of differences among the means.^[4]

Background

A simple contrast is the difference between two means. A more complex contrast can test the difference between several means (e.g., with four means, assign coefficients of -3, -1, +1, and +3), test the difference between a single mean and the combined mean of several groups (e.g., with four means, assign coefficients of -3, +1, +1, and +1), or test the difference between the combined mean of several groups and the combined mean of several other groups (e.g., with four means, assign coefficients of -1, -1, +1, and +1).^[5] The coefficients for the means to be combined (or averaged) must be the same in magnitude and direction; in other words, those means are weighted equally. When means are assigned different coefficients (in magnitude, in direction, or both), the contrast is testing for a difference between those means. The term contrast may refer to any of the following: the set of coefficients used to specify a comparison; the specific value of the linear combination obtained for a given study or experiment; or the random quantity defined by applying the linear combination to treatment effects when these are themselves considered as random variables. In the last context, the term contrast variable is sometimes used.
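The three four-group coefficient patterns described above can be evaluated with a few lines of Python. This is a minimal sketch: the function name `contrast_value` and the group means are hypothetical and chosen only for illustration.

```python
def contrast_value(coefficients, means):
    """Return L = sum of c_j * mean_j for one contrast."""
    if len(coefficients) != len(means):
        raise ValueError("need exactly one coefficient per group mean")
    return sum(c * m for c, m in zip(coefficients, means))


# Hypothetical means for four groups (illustrative values only).
group_means = [10.0, 12.0, 15.0, 19.0]

# The three coefficient patterns given in the text.
patterns = {
    "difference among several means": [-3, -1, 1, 3],
    "group 1 vs. combined groups 2-4": [-3, 1, 1, 1],
    "groups 1-2 vs. groups 3-4": [-1, -1, 1, 1],
}

values = {}
for name, coefficients in patterns.items():
    # Every pattern sums to zero, as a contrast requires.
    assert sum(coefficients) == 0
    values[name] = contrast_value(coefficients, group_means)
    print(f"{name}: L = {values[name]}")
```

A nonzero value of L only describes the pattern in the sample means; it is not by itself a significance test.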

Contrasts are sometimes used to compare mixed effects. A common example is the difference between two test scores, one at the beginning of the semester and one at its end. Here we are not interested in either score by itself, but only in the contrast (in this case, the difference). If the two scores are treated as independent, the variance of this linear combination is the sum of the variances weighted by the squared coefficients; in this case both weights are one. This "blending" of two variables into one can be useful in ANOVA, in regression, or as a descriptive statistic in its own right.

An example of a complex contrast would be comparing 5 standard treatments to a new treatment, giving each old treatment mean a weight of 1/5 and the new sixth treatment mean a weight of −1 (using the equation above). If the value of this linear combination is zero, the old treatments do not differ from the new treatment on average. If the value is positive, the combined mean of the 5 standard treatments is higher than the new treatment mean. If the value is negative, the combined mean of the 5 standard treatments is lower than the new treatment mean.^[4] However, the value of the linear combination is not a significance test; see testing significance (below) to learn how to determine whether a contrast is significant.
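As a worked illustration of this six-group comparison, the sketch below uses exact fractions so that the weights sum to exactly zero. The treatment means are hypothetical and serve only to show how the sign of the result is read.

```python
from fractions import Fraction

# Hypothetical treatment means: five standard treatments, then the new one.
standard_means = [20, 22, 19, 21, 23]
new_mean = 18

# Weight 1/5 for each standard treatment and -1 for the new treatment.
coefficients = [Fraction(1, 5)] * 5 + [Fraction(-1)]
means = standard_means + [new_mean]
assert sum(coefficients) == 0  # the weights form a valid contrast

L = sum(c * m for c, m in zip(coefficients, means))

if L > 0:
    direction = "standard treatments higher on average"
elif L < 0:
    direction = "standard treatments lower on average"
else:
    direction = "no difference on average"

print(f"L = {float(L)} ({direction})")
```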

The usual results for linear combinations of independent random variables mean that the variance of a contrast is equal to the sum of the variances weighted by the squared coefficients.^[1] If two contrasts are orthogonal, estimates created by using such contrasts will be uncorrelated. This helps to control the Type I error rate, the rate of falsely rejecting a true null hypothesis. Because orthogonal contrasts test different aspects of the data, they are independent: the results of one contrast have no effect on the results of the other contrasts. When contrasts are not orthogonal, they are not testing completely different aspects of the data, so the results of one contrast can influence the results of other contrasts. This can increase the chance of falsely rejecting a true null hypothesis.^[5]
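The variance and correlation statements above can be written out as a short derivation. It is a sketch under stated assumptions that go beyond the text: the group means are independent, every group has the same size n, and every observation has the same variance $\sigma^{2}$. The symbols $a_{j}$ and $b_{j}$ are introduced here for the coefficients of two contrasts $L_{a}$ and $L_{b}$.

$\operatorname{Var}(L)=\sum_{j}c_{j}^{2}\operatorname{Var}({\bar X}_{j})=\frac{\sigma^{2}}{n}\sum_{j}c_{j}^{2}$

$\operatorname{Cov}(L_{a},L_{b})=\sum_{j}a_{j}b_{j}\operatorname{Var}({\bar X}_{j})=\frac{\sigma^{2}}{n}\sum_{j}a_{j}b_{j}$

The covariance is zero exactly when $\sum_{j}a_{j}b_{j}=0$, which is the orthogonality condition, so under these assumptions orthogonal contrasts give uncorrelated estimates.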

If orthogonal contrasts are available, it is possible to summarize the results of a statistical analysis in the form of a simple analysis of variance table that contains the results for different test statistics relating to different contrasts, each of which is statistically independent. Linear contrasts can be easily converted into sums of squares: SS_contrast = $\frac{n\left(\sum c_{j}{\bar X}_{j}\right)^{2}}{\sum c_{j}^{2}}$, with 1 degree of freedom, where n represents the number of observations per group. If the contrasts form a complete set of k − 1 mutually orthogonal contrasts, the SS_contrast values sum to SS_treatment. Testing the significance of a contrast requires the computation of SS_contrast.^[5] A recent development in statistical analysis is the standardized mean of a contrast variable. This compares the size of the differences between groups, as measured by a contrast, with the accuracy with which that contrast can be measured by a given study or experiment.^[6]
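The sum-of-squares formula and its additivity over orthogonal contrasts can be checked numerically. The sketch below assumes equal group sizes; the function names, the group means, and n = 5 are hypothetical. The three coefficient sets are the usual linear, quadratic, and cubic weights for four equally spaced groups, which are mutually orthogonal.

```python
def ss_contrast(coefficients, means, n):
    """SS_contrast = n * (sum c_j * mean_j)^2 / sum c_j^2, equal group size n."""
    L = sum(c * m for c, m in zip(coefficients, means))
    return n * L ** 2 / sum(c ** 2 for c in coefficients)


def ss_treatment(means, n):
    """Between-groups sum of squares for equal group size n."""
    grand_mean = sum(means) / len(means)
    return n * sum((m - grand_mean) ** 2 for m in means)


# Hypothetical data summary: four group means, five observations per group.
group_means = [10.0, 12.0, 15.0, 19.0]
n = 5

# A complete set of k - 1 = 3 mutually orthogonal contrasts.
contrasts = [
    [-3, -1, 1, 3],   # linear
    [1, -1, -1, 1],   # quadratic
    [-1, 3, -3, 1],   # cubic
]

parts = [ss_contrast(c, group_means, n) for c in contrasts]
total = ss_treatment(group_means, n)

print("SS for each contrast:", parts)
print("sum of SS_contrast:", sum(parts), "SS_treatment:", total)
```

With these illustrative means almost all of the treatment sum of squares falls on the first contrast, which is how such a decomposition is typically read in an analysis of variance table.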

Types of contrast

Orthogonal contrasts are a set of contrasts in which, for any distinct pair, the sum of the cross-products of the coefficients is zero (assuming sample sizes are equal).^[7] Although there are potentially infinitely many sets of orthogonal contrasts, any given set contains at most k - 1 orthogonal contrasts (where k = the number of group means available).^[5]
Polynomial contrasts are a special set of orthogonal contrasts that test polynomial patterns in data with more than 2 means (e.g., linear, quadratic, cubic, quartic, etc.).^[8]
Orthonormal contrasts are orthogonal contrasts which satisfy the additional condition that, for each contrast, the sum of squares of the coefficients is one.^[7]
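The three types listed above can be checked mechanically. The sketch below is illustrative only: the helper names are hypothetical, equal sample sizes are assumed, and the coefficient sets are the usual polynomial weights for four equally spaced groups.

```python
import math
from itertools import combinations


def are_orthogonal(a, b, tol=1e-9):
    """True if the sum of cross-products of the coefficients is zero."""
    return abs(sum(x * y for x, y in zip(a, b))) < tol


def normalize(coefficients):
    """Rescale a contrast so that its squared coefficients sum to one."""
    norm = math.sqrt(sum(c * c for c in coefficients))
    return [c / norm for c in coefficients]


# Polynomial contrasts for k = 4 groups: k - 1 = 3 contrasts.
polynomial = {
    "linear": [-3, -1, 1, 3],
    "quadratic": [1, -1, -1, 1],
    "cubic": [-1, 3, -3, 1],
}

# Every distinct pair in the set must be orthogonal.
all_orthogonal = all(
    are_orthogonal(polynomial[p], polynomial[q])
    for p, q in combinations(polynomial, 2)
)

# Two valid contrasts that are not orthogonal to each other.
mixed_pair_orthogonal = are_orthogonal([-3, 1, 1, 1], [-1, -1, 1, 1])

# Orthonormal version of the same set.
orthonormal = {name: normalize(c) for name, c in polynomial.items()}

print("polynomial set orthogonal:", all_orthogonal)
print("mixed pair orthogonal:", mixed_pair_orthogonal)
```

Normalizing changes only the scale of each contrast, so the orthogonality of the set is preserved.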

Testing significance

SS_contrast is also a mean square, because every contrast has 1 degree of freedom. Dividing MS_contrast by MS_error produces an F-statistic with 1 and df_error degrees of freedom. The statistical significance of F_contrast can be determined by comparing the obtained F statistic with a critical value of F with the same degrees of freedom.^[5]
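The test can be sketched in a few lines. Everything numeric here is hypothetical: the group means, n = 5 observations per group, and MS_error = 9.0 are illustrative, and the critical value is assumed to come from a standard F table (about 4.49 for 1 and 16 degrees of freedom at the 0.05 level).

```python
def contrast_f(coefficients, means, n, ms_error):
    """F = MS_contrast / MS_error, where MS_contrast = SS_contrast (1 df)."""
    L = sum(c * m for c, m in zip(coefficients, means))
    ss = n * L ** 2 / sum(c ** 2 for c in coefficients)
    return ss / ms_error


# Hypothetical summary of a one-way design with k = 4 groups.
group_means = [10.0, 12.0, 15.0, 19.0]
n = 5
ms_error = 9.0                         # assumed error mean square
df_error = len(group_means) * (n - 1)  # k * (n - 1) = 16

F = contrast_f([-3, -1, 1, 3], group_means, n, ms_error)

# Tabled critical value for F(1, 16) at alpha = 0.05 (assumed from a table).
critical_value = 4.49
significant = F > critical_value

print(f"F(1, {df_error}) = {F}, significant at 0.05: {significant}")
```

Where SciPy is available, an exact p-value could be obtained from the F distribution instead of a tabled critical value.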

References

[1] NIST/SEMATECH e-Handbook of Statistical Methods.
[2] Everitt, B.S. (2002). The Cambridge Dictionary of Statistics. CUP. ISBN 0-521-81099-X (entry for "Contrast").
[3] Kuehl, Robert O. (2000). Design of experiments: statistical principles of research design and analysis (2nd ed.). Pacific Grove, CA: Duxbury/Thomson Learning. ISBN 0534368344.
[4] Clark, James M. (2007). Intermediate Data Analysis: Multiple Regression and Analysis of Variance. University of Winnipeg.
[5] Howell, David C. (2010). Statistical methods for psychology (7th ed.). Belmont, CA: Thomson Wadsworth. ISBN 978-0-495-59784-1.
[6] Zhang XHD (2011). Optimal High-Throughput Screening: Practical Experimental Design and Data Analysis for Genome-scale RNAi Research. Cambridge University Press. ISBN 978-0-521-73444-8.
[7] Everitt, B.S. (2002). The Cambridge Dictionary of Statistics. CUP. ISBN 0-521-81099-X (entry for "Orthogonal contrasts").
[8] Kim, Jong Sung. "Orthogonal Polynomial Contrasts". Retrieved 27 April 2012.

This article is issued from Wikipedia. The text is available under the Creative Commons Attribution/Share Alike; additional terms may apply for the media files.