Standardized coefficient
From Wikipedia, the free encyclopedia
In statistics, standardized coefficients, also called beta coefficients, are the estimates resulting from a regression analysis performed on variables that have been standardized so that their variances are 1. This is usually done to answer the question of which of the independent variables has a greater effect on the dependent variable in multiple regression analysis, when the variables are measured in different units of measurement (for example, income measured in dollars and family size measured in number of individuals).
Before fitting the multiple regression equation, all variables (independent and dependent) can be standardized by subtracting the mean and dividing by the standard deviation. A standardized regression coefficient then represents the change in the dependent variable, in standard deviations, that results from a change of one standard deviation in an independent variable. Some statistical software packages, like PSPP/SPSS, report them automatically, labeling them "Beta" while labeling the ordinary unstandardized coefficients "B". Others, like DAP/SAS, provide them as an option labeled "Standardized Coefficient". Sometimes the unstandardized coefficients are also labeled "b".
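The procedure above can be sketched in a few lines of Python. This is a minimal illustration with made-up data and a single predictor, not a reproduction of any example from the text; for one predictor, the standardized slope works out to the Pearson correlation between the two variables.

```python
# Sketch: standardize both variables, then fit an ordinary
# least-squares line on the z-scores. Data values are invented
# purely for illustration.
from statistics import mean, pstdev

def zscore(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def slope(x, y):
    """Ordinary least-squares slope of y on x (single predictor)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

income = [20_000, 35_000, 50_000, 65_000, 80_000]  # dollars (invented)
food = [3_000, 4_200, 5_500, 6_100, 7_400]         # dollars (invented)

b_unstd = slope(income, food)                   # change per one dollar
b_std = slope(zscore(income), zscore(food))     # change per one SD
```

Here `b_unstd` is in dollars of food consumption per dollar of income, while `b_std` is in standard deviations of food consumption per standard deviation of income.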
A regression run on the original, unstandardized variables produces unstandardized coefficients, while a regression run on standardized variables produces standardized coefficients. In practice, both types of coefficients can be estimated from the original variables.
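One way to see why both types can be obtained from the original variables: the standardized coefficient equals the unstandardized one rescaled by the ratio of the predictor's standard deviation to the dependent variable's, so no second regression is needed. A sketch with invented data:

```python
# Illustrates the rescaling identity beta_std = b_unstd * (s_x / s_y)
# for a single-predictor regression. Data values are invented.
from statistics import mean, pstdev

def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b = ols_slope(x, y)                  # unstandardized coefficient
beta = b * pstdev(x) / pstdev(y)     # standardized, by rescaling

# Fitting directly on the z-scores gives the same number:
zx = [(v - mean(x)) / pstdev(x) for v in x]
zy = [(v - mean(y)) / pstdev(y) for v in y]
assert abs(beta - ols_slope(zx, zy)) < 1e-9
```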
Advocates of standardized regression coefficients point out that the coefficients are the same regardless of an independent variable's underlying scale of units. They also suggest that this removes the problem of comparing, for example, years with kilograms, since each regression coefficient represents the change in response per standard unit (one SD) change in a predictor. However, critics of standardized regression coefficients argue that this is illusory: there is no reason why a change of one SD in one predictor should be equivalent to a change of one SD in another predictor. Some variables are easy to change (the amount of time spent watching television, for example); others, such as weight or cholesterol level, are more difficult; and still others, such as height or age, are impossible to change.
Example
In a hypothetical example, a family's income ranges from $10,000 to $100,000, while family size ranges from 1 to 9. The standard deviation of income can be expected to be several thousand dollars (for example, $6,382), while the standard deviation of family size will be about 2. Thus, using the standard deviation as the unit of measure takes into account that a one-person change in family size is relatively more important than a one-dollar change in income.
If the standardized coefficients for this example were, for instance, .535 for income and .386 for family size, then changing income by one standard deviation ($6,382) while holding family size constant would change the dependent variable (for example, food consumption) by .535 standard deviations. Changing family size by one standard deviation, holding income constant, would change food consumption by .386 standard deviations. Thus we can conclude that a change in income has a greater relative effect on food consumption than does a change in family size.
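The comparison above can be written out as simple arithmetic. The coefficients (.535 and .386) and the standard deviations ($6,382 for income, 2 for family size) are the figures given in the example:

```python
# Worked arithmetic for the hypothetical example in the text.
beta_income, beta_family = 0.535, 0.386
sd_income, sd_family = 6_382, 2

# A one-SD change in each predictor moves food consumption by its
# standardized coefficient, measured in SDs of food consumption:
effect_income = beta_income  # per $6,382 of extra income
effect_family = beta_family  # per 2 additional family members

# Relative importance on the common SD scale:
ratio = effect_income / effect_family  # about 1.39
```

On this scale, a one-SD change in income moves food consumption roughly 1.4 times as much as a one-SD change in family size, which is the basis for the conclusion in the example.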