Matching Items (3)

Propensity score estimation with random forests

Description

Random Forests is a statistical learning method that has been proposed for propensity score estimation models involving complex interactions, nonlinear relationships, or both among the covariates. In this dissertation I conducted a simulation study to examine the effects of three Random Forests model specifications in propensity score analysis. The results suggested that, depending on the nature of the data, optimal specification of (1) the decision rule used to select the covariate and its split value in a Classification Tree, (2) the number of covariates randomly sampled for selection, and (3) the method of estimating Random Forests propensity scores could potentially produce an unbiased average treatment effect estimate after the propensity score weighting-by-the-odds adjustment. Compared to the logistic regression estimation model using the true propensity score model, Random Forests had an additional advantage in producing unbiased estimated standard errors and correct statistical inference for the average treatment effect. The relationship between balance on the covariates' means and the bias of the average treatment effect estimate was examined both within and between conditions of the simulation. Within conditions, across repeated samples, there was no noticeable correlation between the covariates' mean differences and the magnitude of bias of the average treatment effect estimate for the covariates that were imbalanced before adjustment. Between conditions, small mean differences of covariates after propensity score adjustment were not sensitive enough to identify the optimal Random Forests model specification for propensity score analysis.
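The weighting-by-the-odds adjustment mentioned above can be sketched as follows. This is a minimal illustration, not the dissertation's code: it assumes the common definition in which treated units receive weight 1 and control units receive weight e/(1-e), where e is the estimated propensity score. The propensity scores here are hypothetical numbers typed in by hand (in the study they would come from a Random Forests model), and the function names and toy data are likewise illustrative assumptions.

```python
def odds_weights(treated, propensity):
    """Weighting by the odds: treated units get weight 1, controls get e / (1 - e)."""
    weights = []
    for t, e in zip(treated, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 if t == 1 else e / (1.0 - e))
    return weights


def weighted_effect(outcome, treated, weights):
    """Difference between the weighted outcome means of treated and control units."""
    def weighted_mean(group):
        num = sum(w * y for y, t, w in zip(outcome, treated, weights) if t == group)
        den = sum(w for t, w in zip(treated, weights) if t == group)
        return num / den
    return weighted_mean(1) - weighted_mean(0)


# Hypothetical toy data: 1 = treated, 0 = control.
treated = [1, 1, 0, 0, 0]
propensity = [0.8, 0.6, 0.5, 0.2, 0.5]   # assumed, not estimated here
outcome = [10.0, 8.0, 6.0, 4.0, 5.0]

weights = odds_weights(treated, propensity)
effect = weighted_effect(outcome, treated, weights)
```

Under this definition, control units that resemble treated units (high e) count more heavily, so the weighted control group is made to look like the treated group on the covariates.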

Date Created
2013

Multilevel mediation analysis: statistical assumptions and centering

Description

Mediation analysis is a statistical approach that examines the effect of a treatment (e.g., prevention program) on an outcome (e.g., substance use) achieved by targeting and changing one or more intervening variables (e.g., peer drug use norms). The increased use of prevention intervention programs with outcomes measured at multiple time points following the intervention requires multilevel modeling techniques to account for clustering in the data. Estimating multilevel mediation models in which all the variables are measured at the individual level (Level 1) poses several challenges to researchers. The first challenge is to conceptualize a multilevel mediation model by clarifying the underlying statistical assumptions and the implications of those assumptions for the cluster-level (Level-2) covariance structure. A second challenge is that variables measured at Level 1 potentially contain both between- and within-cluster variation, making interpretation of multilevel analysis difficult. As a result, multilevel mediation analyses may yield coefficient estimates that are composites of coefficient estimates at different levels if proper centering is not used. This dissertation addresses these two challenges. Study 1 discusses the concept of a correctly specified multilevel mediation model by examining the underlying statistical assumptions and the implications of those assumptions for the Level-2 covariance structure. Further, Study 1 presents analytical results showing algebraic relationships between the population parameters in a correctly specified multilevel mediation model. Study 2 extends previous work on centering in multilevel mediation analysis. First, different centering methods in multilevel analysis, including centering within cluster with the cluster mean as a Level-2 predictor of the intercept (CWC2), are discussed. Next, application of the CWC2 strategy to accommodate multilevel mediation models is explained. It is shown that the CWC2 centering strategy separates the between- and within-cluster mediated effects. Next, Study 2 discusses assumptions underlying a correctly specified CWC2 multilevel mediation model and defines between- and within-cluster mediated effects. In addition, analytical results for the algebraic relationships between the population parameters in a CWC2 multilevel mediation model are presented. Finally, Study 2 shows results of a simulation study conducted to verify the derived algebraic relationships empirically.
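The data-preparation step behind CWC2 can be sketched as below. This is only an illustration of the centering itself, under the assumption that each Level-1 score is split into its cluster mean (later entered as a Level-2 predictor of the intercept) and its deviation from that mean (entered at Level 1). The function name, variable names, and toy data are hypothetical, and fitting the multilevel mediation model is not shown.

```python
from collections import defaultdict


def cwc2_center(values, clusters):
    """Split each Level-1 score into a cluster mean and a within-cluster deviation."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for v, c in zip(values, clusters):
        totals[c] += v
        counts[c] += 1
    means = {c: totals[c] / counts[c] for c in totals}
    between = [means[c] for c in clusters]                      # Level-2 predictor
    within = [v - means[c] for v, c in zip(values, clusters)]   # Level-1 predictor
    return between, within


# Hypothetical mediator scores for five individuals in two clusters.
mediator = [2.0, 4.0, 6.0, 10.0, 12.0]
cluster_id = ["a", "a", "a", "b", "b"]

between, within = cwc2_center(mediator, cluster_id)
```

Because the within-cluster deviations sum to zero in every cluster, they carry no between-cluster variation, which is what allows the two mediated effects to be estimated separately.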

Date Created
2010

Performance of contextual multilevel models for comparing between-person and within-person effects

Description

The comparison of between- versus within-person relations addresses a central issue in psychological research regarding whether group-level relations among variables generalize to individual group members. Between- and within-person effects may differ in magnitude as well as direction, and contextual multilevel models can accommodate this difference. Contextual multilevel models have been explicated mostly for cross-sectional data, but they can also be applied to longitudinal data where level-1 effects represent within-person relations and level-2 effects represent between-person relations. With longitudinal data, estimating the contextual effect allows direct evaluation of whether between-person and within-person effects differ. Furthermore, these models, unlike single-level models, permit individual differences by allowing within-person slopes to vary across individuals. This study examined the statistical performance of the contextual model with a random slope for longitudinal within-person fluctuation data.

A Monte Carlo simulation was used to generate data based on the contextual multilevel model, where sample size, effect size, and intraclass correlation (ICC) of the predictor variable were varied. The effects of simulation factors on parameter bias, parameter variability, and standard error accuracy were assessed. Parameter estimates were in general unbiased. Power to detect the slope variance and contextual effect was over 80% for most conditions, except some of the smaller sample size conditions. Type I error rates for the contextual effect were also high for some of the smaller sample size conditions. Conclusions and future directions are discussed.
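One common way to write a contextual multilevel model with a random slope for longitudinal data is given below. The abstract does not state the parameterization actually used, so the person-mean-centered form and all symbols here are assumptions chosen for illustration: $y_{ti}$ and $x_{ti}$ are the outcome and predictor for person $i$ at occasion $t$, and $\bar{x}_i$ is person $i$'s mean on the predictor.

```latex
\begin{align*}
\text{Level 1:}\quad & y_{ti} = \beta_{0i} + \beta_{1i}\,(x_{ti} - \bar{x}_i) + e_{ti} \\
\text{Level 2:}\quad & \beta_{0i} = \gamma_{00} + \gamma_{01}\,\bar{x}_i + u_{0i} \\
                     & \beta_{1i} = \gamma_{10} + u_{1i} \\
\text{Contextual effect:}\quad & \gamma_{01} - \gamma_{10}
\end{align*}
```

In this form $\gamma_{10}$ is the average within-person effect, $\gamma_{01}$ is the between-person effect, and $\operatorname{Var}(u_{1i})$ is the slope variance that lets within-person slopes differ across individuals. When the uncentered predictor $x_{ti}$ is used at Level 1 instead, the coefficient on $\bar{x}_i$ is typically interpreted as the contextual effect itself, which is one way the difference can be tested directly.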

Date Created
2016