Estimating causal direct and indirect effects in the presence of post-treatment confounders: a simulation study
In investigating mediating processes, researchers usually use randomized experiments and linear regression or structural equation modeling to determine if the treatment affects the hypothesized mediator and if the mediator affects the targeted outcome. However, randomizing the treatment will not yield accurate causal path estimates unless certain assumptions are satisfied. Since randomization of the mediator may not be plausible for most studies (i.e., the mediator status is not randomly assigned, but self-selected by participants), both the direct and indirect effects may be biased by confounding variables. The purpose of this dissertation is (1) to investigate the extent to which traditional mediation methods are affected by confounding variables and (2) to assess the statistical performance of several modern methods to address confounding variable effects in mediation analysis. This dissertation first reviewed the theoretical foundations of causal inference in statistical mediation analysis, modern statistical analysis for causal inference, and then described different methods to estimate causal direct and indirect effects in the presence of two post-treatment confounders. A large simulation study was designed to evaluate the extent to which ordinary regression and modern causal inference methods are able to obtain correct estimates of the direct and indirect effects when confounding variables that are present in the population are not included in the analysis. Five methods were compared in terms of bias, relative bias, mean square error, statistical power, Type I error rates, and confidence interval coverage to test how robust the methods are to the violation of the no unmeasured confounders assumption and confounder effect sizes. The methods explored were linear regression with adjustment, inverse propensity weighting, inverse propensity weighting with truncated weights, sequential g-estimation, and a doubly robust sequential g-estimation. Results showed that in estimating the direct and indirect effects, in general, sequential g-estimation performed the best in terms of bias, Type I error rates, power, and coverage across different confounder effect, direct effect, and sample sizes when all confounders were included in the estimation. When one of the two confounders were omitted from the estimation process, in general, none of the methods had acceptable relative bias in the simulation study. Omitting one of the confounders from estimation corresponded to the common case in mediation studies where no measure of a confounder is available but a confounder may affect the analysis. Failing to measure potential post-treatment confounder variables in a mediation model leads to biased estimates regardless of the analysis method used and emphasizes the importance of sensitivity analysis for causal mediation analysis.