Sobel test

In statistics, the Sobel test is a method of testing the significance of a mediation effect. The test is based on the work of Michael E. Sobel, a statistics professor at Columbia University in New York, NY.[1][2] In mediation, the relationship between the independent variable and the dependent variable is hypothesized to be an indirect effect that exists due to the influence of a third variable (the mediator). As a result, when the mediator is included in a regression model together with the independent variable, the effect of the independent variable is reduced while the effect of the mediator remains significant. The Sobel test is a specialized t test that determines whether the reduction in the effect of the independent variable, after the mediator is added to the model, is significant, and therefore whether the mediation effect is statistically significant.

Theoretical basis

When evaluating a mediation effect three different regression models are examined:[3]


Model 1: $Y_O = \gamma_1 + \tau X_I + \varepsilon_1$

Model 2: $X_M = \gamma_2 + \alpha X_I + \varepsilon_2$

Model 3: $Y_O = \gamma_3 + \tau' X_I + \beta X_M + \varepsilon_3$

In these models Y_O is the dependent variable, X_I is the independent variable, and X_M is the mediator. γ_1, γ_2, and γ_3 are the intercepts of the three models, and ε_1, ε_2, and ε_3 are the corresponding error terms. τ denotes the relationship between the independent variable and the dependent variable in model 1, while τ′ denotes that same relationship in model 3 after controlling for the effect of the mediator. The coefficients α and β represent the relationship between the independent variable and the mediator, and the relationship between the mediator and the dependent variable after controlling for the independent variable, respectively.
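As a minimal sketch of how the three models might be fitted, the following standard-library Python code estimates τ, α, τ′ and β by ordinary least squares. The data are simulated, and the function names, the sample size and the generating coefficients (0.5, 0.3, 0.4) are hypothetical choices made only for illustration.

```python
import random


def cross_sum(u, v):
    """Sum of cross-products of u and v about their means."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def simple_slope(x, y):
    """OLS slope of y regressed on x alone (intercept included)."""
    return cross_sum(x, y) / cross_sum(x, x)


def two_predictor_slopes(x, m, y):
    """OLS slopes of y regressed on x and m together (intercept included)."""
    sxx, smm, sxm = cross_sum(x, x), cross_sum(m, m), cross_sum(x, m)
    sxy, smy = cross_sum(x, y), cross_sum(m, y)
    det = sxx * smm - sxm ** 2
    slope_x = (smm * sxy - sxm * smy) / det
    slope_m = (sxx * smy - sxm * sxy) / det
    return slope_x, slope_m


def fit_mediation_models(x, m, y):
    """Return the slope estimates from Models 1, 2 and 3."""
    tau = simple_slope(x, y)                          # Model 1: Y on X
    alpha = simple_slope(x, m)                        # Model 2: M on X
    tau_prime, beta = two_predictor_slopes(x, m, y)   # Model 3: Y on X and M
    return {"tau": tau, "alpha": alpha, "tau_prime": tau_prime, "beta": beta}


# Hypothetical simulated data: X affects M, and both X and M affect Y.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(500)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.3 * xi + 0.4 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

coefs = fit_mediation_models(x, m, y)
for name, value in coefs.items():
    print(f"{name}: {value:.3f}")
```

In practice a regression routine from a statistics package would normally be used instead, since it also reports the standard errors that the Sobel test requires.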

Product of coefficients

From these models, the mediation effect is calculated as (τ − τ′).[4] This represents the change in the magnitude of the effect that the independent variable has on the dependent variable after controlling for the mediator. From examination of these equations it can be determined that (αβ) = (τ − τ′). The α term represents the magnitude of the relationship between the independent variable and the mediator. The β term represents the magnitude of the relationship between the mediator and the dependent variable after controlling for the effect of the independent variable. Therefore (αβ) is the product of these two terms. In essence, this is the amount of variance in the dependent variable that is accounted for by the independent variable through the mechanism of the mediator. This is the indirect effect, and the (αβ) term has been termed the product of coefficients.[5]
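The equality of the two expressions can be sketched by substituting Model 2 into Model 3. The derivation below assumes that the coefficient of X_I in Model 3 is τ′, as described in the text, and that all three models are fitted to the same cases.

```latex
\begin{align*}
Y_O &= \gamma_3 + \tau' X_I + \beta X_M + \varepsilon_3 \\
    &= \gamma_3 + \tau' X_I + \beta(\gamma_2 + \alpha X_I + \varepsilon_2) + \varepsilon_3 \\
    &= (\gamma_3 + \beta\gamma_2) + (\tau' + \alpha\beta) X_I + (\beta\varepsilon_2 + \varepsilon_3)
\end{align*}
% Matching the coefficient of X_I with Model 1:
\[ \tau = \tau' + \alpha\beta \quad\Longrightarrow\quad \alpha\beta = \tau - \tau' \]
```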

Venn diagram approach

Another way of thinking about the product of coefficients is to examine the figure below. Each circle represents the variance of one of the variables. Where two circles overlap, the variables share variance, and the overlap represents the effect of one variable on the other. For example, sections c + d together represent the effect of the independent variable on the dependent variable if the mediator is ignored, and correspond to τ. This total amount of variance in the dependent variable that is accounted for by the independent variable can then be broken down into areas c and d. Area c is the variance that the independent variable and the dependent variable have in common with the mediator, and this is the indirect effect. Area c corresponds to the product of coefficients (αβ) and to (τ − τ′). The Sobel test assesses how large area c is: if area c is sufficiently large, the Sobel test is significant and significant mediation is occurring.

Calculating the Sobel test

To determine the statistical significance of the indirect effect, a statistic based on the indirect effect must be compared to its null sampling distribution. The Sobel test divides the magnitude of the indirect effect by its estimated standard error to derive a t statistic:[1]

$t = \frac{\tau - \tau'}{SE}$   or   $t = \frac{\alpha\beta}{SE}$

where SE is the pooled standard error term, $SE = \sqrt{\alpha^2 \sigma^2_\beta + \beta^2 \sigma^2_\alpha}$, $\sigma^2_\beta$ is the variance of β, and $\sigma^2_\alpha$ is the variance of α.[1]
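The calculation can be sketched in a few lines of Python. The function name and the example inputs are hypothetical. The standard errors of α and β are assumed to come from the regression output of Models 2 and 3, and the pooled standard error is taken as the square root of the sum of the two variance terms.

```python
import math


def sobel_statistic(alpha, beta, se_alpha, se_beta):
    """Return (statistic, pooled_se) for the indirect effect alpha * beta.

    se_alpha and se_beta are the standard errors of alpha and beta, so their
    squares are the variances that appear in the pooled standard error.
    """
    pooled_se = math.sqrt(alpha ** 2 * se_beta ** 2 + beta ** 2 * se_alpha ** 2)
    if pooled_se == 0:
        raise ValueError("pooled standard error is zero; statistic is undefined")
    return (alpha * beta) / pooled_se, pooled_se


# Hypothetical estimates, for illustration only.
statistic, pooled_se = sobel_statistic(alpha=0.5, beta=0.4, se_alpha=0.1, se_beta=0.1)
print(f"SE = {pooled_se:.4f}, statistic = {statistic:.3f}")
```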

This t statistic can then be compared to the normal distribution to determine its significance. Alternative methods of calculating the Sobel test have been proposed that use either the z or t distributions to determine significance, and each estimates the standard error differently.[6]
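As a sketch of the comparison with the normal distribution, a two-sided p-value can be obtained from the complementary error function in Python's standard library. The function name is hypothetical and the input value is an arbitrary example.

```python
import math


def two_sided_normal_p(z):
    """Two-sided p-value of z under the standard normal distribution."""
    # P(|Z| > |z|) = erfc(|z| / sqrt(2)) for a standard normal Z.
    return math.erfc(abs(z) / math.sqrt(2))


# Arbitrary example value of the test statistic.
p_value = two_sided_normal_p(2.5)
print(f"p = {p_value:.4f}")
```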

Problems with the Sobel test

Distribution of the product term

The distribution of the product term αβ is only normal at large sample sizes,[5][6] which means that at smaller sample sizes the p-value derived from the formula will not be an accurate estimate of the true p-value. This occurs because both α and β are assumed to be normally distributed, and the distribution of the product of two normally distributed variables is skewed at smaller sample sizes.[5][7][8] If the sample is large enough this will not be a problem; however, determining when a sample is sufficiently large is somewhat subjective.[1][2]
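The skew can be illustrated with a small simulation. The sketch below assumes that the estimates of α and β are independent and normally distributed around hypothetical true values of 0.3. A larger standard error stands in for a smaller sample, and the number of draws is an arbitrary choice.

```python
import random


def skewness(values):
    """Sample skewness: third central moment over the variance to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def product_skewness(alpha, beta, se, draws=50000, seed=0):
    """Skewness of simulated products of two independent normal estimates."""
    rng = random.Random(seed)
    products = [rng.gauss(alpha, se) * rng.gauss(beta, se) for _ in range(draws)]
    return skewness(products)


# Larger standard errors mimic a small sample, smaller ones a large sample.
skew_small_sample = product_skewness(0.3, 0.3, se=0.15)
skew_large_sample = product_skewness(0.3, 0.3, se=0.03)
print(f"skewness with large standard errors: {skew_small_sample:.2f}")
print(f"skewness with small standard errors: {skew_large_sample:.2f}")
```

Under these assumptions the simulated product is noticeably more skewed when the standard errors are large, which is the situation in which the normal approximation is least trustworthy.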

Problems with the product of coefficients

In some situations it is possible that (τ − τ′) ≠ (αβ).[9] This occurs when the sample size is different in the models used to estimate the mediated effects. Suppose that the independent variable and the mediator are available from 200 cases, while the dependent variable is only available from 150 cases. This means that the α parameter is based on a regression model with 200 cases and the β parameter is based on a regression model with only 150 cases. Both τ and τ′ are based on regression models with 150 cases. Different sample sizes and different participants mean that (τ − τ′) ≠ (αβ). The only time (τ − τ′) = (αβ) is when exactly the same participants are used in each of the regression models.
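The 200-case and 150-case scenario can be reproduced with simulated data. In the sketch below, the helper functions, the generating coefficients and the choice of which 50 cases lack the dependent variable are all hypothetical, and ordinary least squares is assumed for every model.

```python
import random


def cross_sum(u, v):
    """Sum of cross-products of u and v about their means."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def simple_slope(x, y):
    """OLS slope of y regressed on x alone."""
    return cross_sum(x, y) / cross_sum(x, x)


def two_predictor_slopes(x, m, y):
    """OLS slopes of y regressed on x and m together."""
    sxx, smm, sxm = cross_sum(x, x), cross_sum(m, m), cross_sum(x, m)
    sxy, smy = cross_sum(x, y), cross_sum(m, y)
    det = sxx * smm - sxm ** 2
    return (smm * sxy - sxm * smy) / det, (sxx * smy - sxm * sxy) / det


# Simulated data: 200 cases for X and M, Y treated as observed for the first 150.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(200)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.3 * xi + 0.4 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]
x_c, m_c, y_c = x[:150], m[:150], y[:150]

alpha_all = simple_slope(x, m)            # Model 2 on all 200 cases
alpha_complete = simple_slope(x_c, m_c)   # Model 2 on the 150 complete cases
tau = simple_slope(x_c, y_c)              # Model 1 on 150 cases
tau_prime, beta = two_predictor_slopes(x_c, m_c, y_c)  # Model 3 on 150 cases

gap_mixed = (tau - tau_prime) - alpha_all * beta
gap_complete = (tau - tau_prime) - alpha_complete * beta
print(f"gap with alpha from 200 cases: {gap_mixed:.6f}")
print(f"gap with alpha from 150 cases: {gap_complete:.6f}")
```

When α is estimated from the same 150 cases as the other coefficients, the gap vanishes up to floating-point rounding.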

Alternatives to the Sobel test

Product of the coefficients distribution

One strategy to overcome the non-normality of the product of coefficients distribution is to compare the Sobel test statistic to the distribution of the product instead of to the normal distribution.[6][8] This approach bases the inference on a mathematical derivation of the product of two normally distributed variables which acknowledges the skew of the distribution instead of imposing normality.[5]
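The cited approach relies on an analytic derivation of the distribution of the product. As a rough stand-in, the sketch below instead approximates that distribution by Monte Carlo simulation. This is a different technique, used here only to show how asymmetric limits arise. The function name, the input estimates and the assumption that α and β are independent and normally distributed are all illustrative.

```python
import random


def monte_carlo_limits(alpha, beta, se_alpha, se_beta,
                       level=0.95, draws=50000, seed=0):
    """Approximate confidence limits for alpha * beta by simulation.

    Draws alpha and beta independently from normal distributions centred on
    the estimates, then takes percentiles of the simulated products.
    """
    rng = random.Random(seed)
    products = sorted(
        rng.gauss(alpha, se_alpha) * rng.gauss(beta, se_beta)
        for _ in range(draws)
    )
    tail = (1 - level) / 2
    lower = products[int(tail * draws)]
    upper = products[int((1 - tail) * draws) - 1]
    return lower, upper


# Hypothetical estimates and standard errors.
estimate = 0.3 * 0.3
lower, upper = monte_carlo_limits(0.3, 0.3, 0.15, 0.15)
print(f"estimate = {estimate:.3f}, limits = ({lower:.3f}, {upper:.3f})")
```

Unlike limits built from the normal distribution, these limits are not forced to be symmetric around the estimate.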

Bootstrapping

Another approach that is becoming more popular in the literature is bootstrapping.[5][8][10] Bootstrapping is a non-parametric resampling procedure that can build an empirical approximation of the sampling distribution of αβ by repeatedly sampling the dataset. Bootstrapping does not rely on the assumption of normality.
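A percentile bootstrap for the indirect effect can be sketched as follows. The data are simulated, and the function names, the number of resamples and the percentile method itself are assumptions made for illustration. The cited papers also discuss refinements such as bias-corrected intervals, which are not shown.

```python
import random


def cross_sum(u, v):
    """Sum of cross-products of u and v about their means."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def indirect_effect(x, m, y):
    """OLS estimate of alpha * beta from Models 2 and 3."""
    sxx, smm, sxm = cross_sum(x, x), cross_sum(m, m), cross_sum(x, m)
    sxy, smy = cross_sum(x, y), cross_sum(m, y)
    alpha = sxm / sxx
    beta = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)
    return alpha * beta


def bootstrap_limits(x, m, y, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence limits for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        # Resample whole cases with replacement so X, M and Y stay paired.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    tail = (1 - level) / 2
    return estimates[int(tail * n_boot)], estimates[int((1 - tail) * n_boot) - 1]


# Hypothetical simulated data with a true indirect effect of 0.5 * 0.4.
rng = random.Random(2)
x = [rng.gauss(0, 1) for _ in range(200)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.3 * xi + 0.4 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

point = indirect_effect(x, m, y)
lower, upper = bootstrap_limits(x, m, y)
print(f"indirect effect = {point:.3f}, 95% limits = ({lower:.3f}, {upper:.3f})")
```

An interval that excludes zero would be read as evidence of mediation without appealing to the normality of αβ.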

References

  1. Sobel, Michael E. (1982). "Asymptotic Confidence Intervals for Indirect Effects in Structural Equation Models". Sociological Methodology 13: 290–312. doi:10.2307/270723.
  2. Sobel, Michael E. (1986). "Some New Results on Indirect Effects and Their Standard Errors in Covariance Structure". Sociological Methodology 16: 159–186. doi:10.2307/270922.
  3. Baron, Reuben M.; Kenny, David A. (1986). "The Moderator-Mediator Variable Distinction in Social Psychological Research: Conceptual, Strategic, and Statistical Considerations". Journal of Personality and Social Psychology 51 (6): 1173–1182. doi:10.1037/0022-3514.51.6.1173. PMID 3806354.
  4. Judd, Charles M.; Kenny, David A. (1981). "Process Analysis: Estimating Mediation in Treatment Evaluations". Evaluation Review 5 (5): 602–619. doi:10.1177/0193841X8100500502.
  5. Preacher, Kristopher J.; Hayes, Andrew F. (2008). "Asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models". Behavior Research Methods 40 (3): 879–891. doi:10.3758/BRM.40.3.879. PMID 18697684.
  6. MacKinnon, David P.; Lockwood, Chondra M.; Hoffman, Jeanne M.; West, Stephen G.; Sheets, Virgil (2002). "A Comparison of Methods to Test Mediation and Other Intervening Variable Effects". Psychological Methods 7 (1): 83–104. doi:10.1037/1082-989x.7.1.83.
  7. Aroian, Leo A. (1947). "The Probability Function of the Product of Two Normally Distributed Variables". Annals of Mathematical Statistics 18 (2): 265–271. doi:10.1214/aoms/1177730442.
  8. MacKinnon, David P.; Lockwood, Chondra M.; Williams, Jason (2004). "Confidence Limits for the Indirect Effect: Distribution of the Product and Resampling Methods". Multivariate Behavioral Research 39 (1): 99–128. doi:10.1207/s15327906mbr3901_4.
  9. MacKinnon, David. "An Answer to Julie Maloy".
  10. Bollen, Kenneth A.; Stine, Robert (1990). "Direct and Indirect Effects: Classical and Bootstrap Estimates of Variability". Sociological Methodology 20: 115–140. doi:10.2307/271084.