Propensity score estimation with random forests

Description

Random Forests is a statistical learning method that has been proposed for propensity score estimation models involving complex interactions, nonlinear relationships, or both among the covariates. In this dissertation I conducted a simulation study to examine the effects of three Random Forests model specifications in propensity score analysis. The results suggested that, depending on the nature of the data, optimal specification of (1) the decision rules used to select the covariate and its split value in a Classification Tree, (2) the number of covariates randomly sampled for selection, and (3) the method of estimating Random Forests propensity scores could potentially produce an unbiased average treatment effect estimate after propensity score adjustment using weighting by the odds. Compared to the logistic regression estimation model using the true propensity score model, Random Forests had an additional advantage in producing an unbiased estimated standard error and correct statistical inference for the average treatment effect. The relationship between the balance on the covariates' means and the bias of the average treatment effect estimate was examined both within and between conditions of the simulation. Within conditions, across repeated samples, there was no noticeable correlation between the covariates' mean differences and the magnitude of bias of the average treatment effect estimate for the covariates that were imbalanced before adjustment. Between conditions, small mean differences of covariates after propensity score adjustment were not sensitive enough to identify the optimal Random Forests model specification for propensity score analysis.
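As an illustration of the adjustment step named in the abstract, the following is a minimal sketch of weighting by the odds, assuming propensity scores have already been estimated (by Random Forests or any other model). The function names and the toy data are hypothetical and are not taken from the dissertation. Under this scheme treated units keep a weight of 1 and control units are weighted by e/(1 - e), which is usually described as targeting the effect for the treated group.

```python
def odds_weights(treated, propensity):
    """Weighting by the odds: treated units get 1, controls get e / (1 - e)."""
    weights = []
    for t, e in zip(treated, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 if t == 1 else e / (1.0 - e))
    return weights


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def odds_weighted_effect(outcome, treated, propensity):
    """Difference in weighted outcome means between treated and control units."""
    w = odds_weights(treated, propensity)
    treated_rows = [(y, wi) for y, wi, t in zip(outcome, w, treated) if t == 1]
    control_rows = [(y, wi) for y, wi, t in zip(outcome, w, treated) if t == 0]
    mean_treated = weighted_mean(*zip(*treated_rows))
    mean_control = weighted_mean(*zip(*control_rows))
    return mean_treated - mean_control


# Hypothetical toy data: two treated and two control units.
effect = odds_weighted_effect(
    outcome=[3.0, 5.0, 1.0, 2.0],
    treated=[1, 1, 0, 0],
    propensity=[0.5, 0.5, 0.5, 0.2],
)
```

The propensity scores are passed in rather than estimated inside the function, so the same adjustment can be compared across different estimation models.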

Contributors: Cham, Hei Ning (Author) / Tein, Jenn-Yun (Thesis advisor) / Enders, Stephen G (Thesis advisor) / Enders, Craig K. (Committee member) / Mackinnon, David P (Committee member) / Arizona State University (Publisher)

Created: 2013

Performance of contextual multilevel models for comparing between-person and within-person effects

Description

The comparison of between- versus within-person relations addresses a central issue in psychological research regarding whether group-level relations among variables generalize to individual group members. Between- and within-person effects may differ in magnitude as well as direction, and contextual multilevel models can accommodate this difference. Contextual multilevel models have been explicated mostly for cross-sectional data, but they can also be applied to longitudinal data where level-1 effects represent within-person relations and level-2 effects represent between-person relations. With longitudinal data, estimating the contextual effect allows direct evaluation of whether between-person and within-person effects differ. Furthermore, these models, unlike single-level models, permit individual differences by allowing within-person slopes to vary across individuals. This study examined the statistical performance of the contextual model with a random slope for longitudinal within-person fluctuation data.
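One common way to write a contextual model with a random slope for longitudinal data is sketched below. The notation is an assumption made for illustration (person-mean centering of the predictor, occasions t nested in persons i) and may differ from the parameterization used in the dissertation.

```latex
\begin{align}
  \text{Level 1:}\quad y_{ti} &= \beta_{0i} + \beta_{1i}\,(x_{ti} - \bar{x}_{i}) + e_{ti} \\
  \text{Level 2:}\quad \beta_{0i} &= \gamma_{00} + \gamma_{01}\,\bar{x}_{i} + u_{0i} \\
  \beta_{1i} &= \gamma_{10} + u_{1i} \\
  \text{Contextual effect:}\quad \gamma_{c} &= \gamma_{01} - \gamma_{10}
\end{align}
```

Under this parameterization, \gamma_{10} is the average within-person effect, \gamma_{01} is the between-person effect, and the variance of u_{1i} is the slope variance that lets within-person slopes differ across individuals.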

A Monte Carlo simulation was used to generate data based on the contextual multilevel model, where sample size, effect size, and intraclass correlation (ICC) of the predictor variable were varied. The effects of simulation factors on parameter bias, parameter variability, and standard error accuracy were assessed. Parameter estimates were in general unbiased. Power to detect the slope variance and contextual effect was over 80% for most conditions, except some of the smaller sample size conditions. Type I error rates for the contextual effect were also high for some of the smaller sample size conditions. Conclusions and future directions are discussed.

Contributors: Wurpts, Ingrid Carlson (Author) / Mackinnon, David P (Thesis advisor) / West, Stephen G. (Committee member) / Grimm, Kevin J. (Committee member) / Suk, Hye Won (Committee member) / Arizona State University (Publisher)

Created: 2016

A Bayesian Synthesis approach to data fusion using augmented data-dependent priors

Description

The process of combining data is one in which information from disjoint datasets sharing at least a number of common variables is merged. This process is commonly referred to as data fusion, with the main objective of creating a new dataset permitting more flexible analyses than the separate analysis of each individual dataset. Many data fusion methods have been proposed in the literature, although most utilize the frequentist framework. This dissertation investigates a new approach called Bayesian Synthesis in which information obtained from one dataset acts as priors for the next analysis. This process continues sequentially until a single posterior distribution is created using all available data. These informative augmented data-dependent priors provide an extra source of information that may improve the accuracy of estimation. To examine the performance of the proposed Bayesian Synthesis approach, results from simulated data with known population values were first examined under a variety of conditions. Next, these results were compared to those from the traditional maximum likelihood approach to data fusion, as well as the data fusion approach analyzed via Bayes. Parameter recovery under the proposed Bayesian Synthesis approach was evaluated using four criteria reflecting raw bias, relative bias, accuracy, and efficiency. Subsequently, empirical analyses with real data were conducted. For this purpose, real data from five longitudinal studies of mathematics ability, varying in their assessment of ability and in the timing of measurement occasions, were fused. Results from the Bayesian Synthesis and data fusion approaches with combined data using Bayesian and maximum likelihood estimation methods were reported. The results illustrate that Bayesian Synthesis with data-driven priors is a highly effective approach, provided that the sample sizes for the fused data are large enough to provide unbiased estimates.
Bayesian Synthesis provides another beneficial approach to data fusion that can effectively be used to enhance the validity of conclusions obtained from the merging of data from different studies.
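The sequential idea described above, in which the posterior from one dataset becomes the prior for the next, can be illustrated with a deliberately simple toy case. The sketch below assumes a normal mean with known noise variance and a conjugate normal prior; the function names, default prior values, and data are hypothetical, and the dissertation's actual models and its augmented data-dependent priors are not reproduced here.

```python
def update_normal_mean(prior_mean, prior_var, data, noise_var):
    """Conjugate update for a normal mean with known noise variance."""
    n = len(data)
    if n == 0:
        return prior_mean, prior_var
    post_precision = 1.0 / prior_var + n / noise_var
    post_var = 1.0 / post_precision
    post_mean = post_var * (prior_mean / prior_var + sum(data) / noise_var)
    return post_mean, post_var


def sequential_synthesis(datasets, prior_mean=0.0, prior_var=100.0, noise_var=1.0):
    """Feed each dataset in turn, using the last posterior as the next prior."""
    mean, var = prior_mean, prior_var
    for data in datasets:
        mean, var = update_normal_mean(mean, var, data, noise_var)
    return mean, var


# Hypothetical toy datasets standing in for separate studies.
studies = [[1.0, 2.0], [3.0], [2.5, 1.5, 2.0]]
seq_mean, seq_var = sequential_synthesis(studies)

# In this conjugate toy case, updating once on the pooled data gives the same posterior.
pooled = [y for study in studies for y in study]
pooled_mean, pooled_var = update_normal_mean(0.0, 100.0, pooled, 1.0)
```

In this simplified setting the sequential and pooled posteriors coincide. That equivalence is a property of the toy model and should not be read as a claim about the approach evaluated in the dissertation.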

Contributors: Marcoulides, Katerina M (Author) / Grimm, Kevin (Thesis advisor) / Levy, Roy (Thesis advisor) / MacKinnon, David (Committee member) / Suk, Hye Won (Committee member) / Arizona State University (Publisher)

Created: 2017

Mechanisms linking daily pain and depressive symptoms: the application of diary assessment and bio-psycho-social profiling

Description

Despite the strong link between pain and depressive symptoms, the mechanisms by which they are connected in the everyday lives of individuals with chronic pain are not well understood. In addition, previous investigations have tended to ignore biopsychosocial individual difference factors, assuming that all individuals respond to pain-related experiences and affect in the same manner. The present study tried to address these gaps in the existing literature. Two hundred twenty individuals with Fibromyalgia completed daily diaries during the morning, afternoon, and evening for 21 days. Findings were generally consistent with the hypotheses. Multilevel structural equation modeling revealed that morning pain and positive and negative affect are uniquely associated with morning negative pain appraisal, which in turn, is positively related to pain’s activity interference in the afternoon. Pain’s activity interference was the strongest predictor of evening depressive symptoms. Latent profile analysis using biopsychosocial measures identified three theoretically and clinically important subgroups (i.e., Low Functioning, Normative, and High Functioning groups). Although the daily pain-depressive symptoms link was not significantly moderated by these subgroups, individuals in the High Functioning group reported the lowest levels of average morning pain, negative affect, negative pain appraisal, afternoon pain’s activity interference, and evening depressive symptoms, and the highest levels of average morning positive affect across 21 days relative to the other two groups. The Normative group fared better on all measures than did the Low Functioning group. The findings of the present study suggest the importance of promoting morning positive affect and decreasing negative affect in disconnecting the within-day pain-depressive symptoms link, as well as the potential value of tailoring chronic pain interventions to those individuals who are in the greatest need.

Contributors: Mun, Chung Jung (Author) / Karoly, Paul (Thesis advisor) / Davis, Mary C. (Thesis advisor) / Suk, Hye Won (Committee member) / Dishion, Thomas J (Committee member) / Arizona State University (Publisher)

Created: 2017
