Propensity score estimation with random forests

Description

Random Forests is a statistical learning method that has been proposed for propensity score estimation models involving complex interactions among the covariates, nonlinear relationships, or both. In this dissertation I conducted a simulation study to examine the effects of three Random Forests model specifications in propensity score analysis. The results suggested that, depending on the nature of the data, optimal specification of (1) the decision rules used to select the covariate and its split value in a Classification Tree, (2) the number of covariates randomly sampled for selection, and (3) the method of estimating Random Forests propensity scores could potentially produce an unbiased average treatment effect estimate after adjustment by propensity score weighting by the odds. Compared to the logistic regression estimation model using the true propensity score model, Random Forests had the additional advantage of producing unbiased estimated standard errors and correct statistical inference for the average treatment effect. The relationship between balance on the covariates' means and the bias of the average treatment effect estimate was examined both within and between conditions of the simulation. Within conditions, across repeated samples, there was no noticeable correlation between the covariates' mean differences and the magnitude of bias of the average treatment effect estimate for the covariates that were imbalanced before adjustment. Between conditions, small mean differences of covariates after propensity score adjustment were not sensitive enough to identify the optimal Random Forests model specification for propensity score analysis.
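As a rough illustration of the weighting-by-the-odds adjustment mentioned in this abstract, the sketch below uses the commonly described form of the technique: treated units keep weight 1 and each control unit is weighted by e / (1 - e), where e is its estimated propensity score. The function name `odds_weighted_effect` and its arguments are hypothetical, the propensity scores are assumed to have been estimated already (for example by a Random Forests model), and the dissertation's exact estimator may differ.

```python
def odds_weighted_effect(treated, outcome, pscore):
    """Weighted mean difference under weighting by the odds (illustrative only).

    treated: iterable of 0/1 treatment indicators
    outcome: iterable of numeric outcomes
    pscore:  iterable of estimated propensity scores, each strictly in (0, 1)
    """
    t_sum = t_n = 0.0  # running sum and count for treated units (weight 1)
    c_sum = c_w = 0.0  # weighted sum and total weight for control units
    for t, y, e in zip(treated, outcome, pscore):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        if t:
            t_sum += y
            t_n += 1.0
        else:
            w = e / (1.0 - e)  # odds of treatment for this control unit
            c_sum += w * y
            c_w += w
    if t_n == 0.0 or c_w == 0.0:
        raise ValueError("need at least one treated and one control unit")
    # Treated mean minus odds-weighted control mean.
    return t_sum / t_n - c_sum / c_w


# Two treated and two control units; the first control has propensity 0.75.
print(odds_weighted_effect([1, 1, 0, 0], [5.0, 7.0, 2.0, 4.0], [0.5, 0.5, 0.75, 0.5]))
```

Control units that resemble treated units (high propensity scores) receive larger weights, so the weighted control group is pulled toward the covariate distribution of the treated group.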

Contributors: Cham, Hei Ning (Author) / Tein, Jenn-Yun (Thesis advisor) / Enders, Stephen G (Thesis advisor) / Enders, Craig K. (Committee member) / Mackinnon, David P (Committee member) / Arizona State University (Publisher)

Created: 2013

Multilevel multiple imputation: an examination of competing methods

Description

Missing data are common in psychology research and can lead to bias and reduced power if not properly handled. Multiple imputation is a state-of-the-art missing data method recommended by methodologists. Multiple imputation methods can generally be divided into two broad categories: joint model (JM) imputation and fully conditional specification (FCS) imputation. JM draws missing values simultaneously for all incomplete variables using a multivariate distribution (e.g., multivariate normal). FCS, on the other hand, imputes variables one at a time, drawing missing values from a series of univariate distributions. In the single-level context, these two approaches have been shown to be equivalent with multivariate normal data. However, less is known about the similarities and differences of these two approaches with multilevel data, and the methodological literature provides no insight into the situations under which the approaches would produce identical results. This document examined five multilevel multiple imputation approaches (three JM methods and two FCS methods) that have been proposed in the literature. An analytic section shows that only two of the methods (one JM method and one FCS method) used imputation models equivalent to a two-level joint population model that contained random intercepts and different associations across levels. The other three methods employed imputation models that differed from the population model primarily in their ability to preserve distinct level-1 and level-2 covariances. I verified the analytic work with computer simulations, and the simulation results also showed that imputation models that failed to preserve level-specific covariances produced biased estimates. The studies also highlighted conditions that exacerbated the amount of bias produced (e.g., bias was greater for conditions with small cluster sizes). The analytic work and simulations lead to a number of practical recommendations for researchers.

Contributors: Mistler, Stephen (Author) / Enders, Craig K. (Thesis advisor) / Aiken, Leona (Committee member) / Levy, Roy (Committee member) / West, Stephen G. (Committee member) / Arizona State University (Publisher)

Created: 2015

Socioeconomic and racial/ethnic disparities in cognitive trajectories among the oldest old: the role of vascular and functional health

Description

Identifying modifiable causes of chronic disease is essential to prepare for the needs of an aging population. Cognitive decline is a precursor to the development of Alzheimer's and other dementing diseases, representing some of the most prevalent and least understood sources of morbidity and mortality associated with aging. To contribute to the literature on cognitive aging, this work focuses on the role of vascular and physical health in the development of cognitive trajectories while accounting for the socioeconomic context where health disparities are developed. The Assets and Health Dynamics among the Oldest-Old study provided a nationally representative sample of non-institutionalized adults age 65 and over in 1998, with biennial follow-up continuing until 2008. Latent growth models with adjustment for non-random missing data were used to assess vascular, physical, and social predictors of cognitive change. A core aim of this project was examining socioeconomic and racial/ethnic variation in vascular predictors of cognitive trajectories. Results indicated that diabetes and heart problems were directly related to an increased rate of memory decline in whites, whereas these risk factors were only associated with baseline word-recall for blacks when conditioned on gender and household assets. These results support the vascular hypotheses of cognitive aging and attest to the significance of socioeconomic and racial/ethnic variation in vascular influences on cognitive health. The second substantive portion of this dissertation used parallel process latent growth models to examine the co-development of cognitive and functional health. Initial word-recall scores were consistently associated with later functional limitations, but baseline functional limitations were not consistently associated with later word-recall scores. Gender and household income moderated this relationship, and indicators of lifecourse SES were better equipped to explain variation in initial cognitive and functional status than change in these measures over time. Overall, this work suggests that research examining associations between cognitive decline, chronic disease, and disability must account for the social context where individuals and their health develop. These findings also indicate that reducing socioeconomic and racial/ethnic disparities in cognitive health among the aging requires interventions early in the lifecourse, as disparities in cognitive trajectories were solidified prior to late old age.

Contributors: Bishop, Nicholas Joseph (Author) / Kronenfeld, Jennie J. (Thesis advisor) / Haas, Steven A. (Committee member) / Eggum, Natalie D. (Committee member) / Arizona State University (Publisher)

Created: 2011

Three-level multiple imputation: a fully conditional specification approach

Description

Currently, there is a clear gap in the missing data literature for three-level models. To date, the literature has only focused on the theoretical and algorithmic work required to implement three-level imputation using the joint model (JM) method of imputation, leaving relatively little work done on the fully conditional specification (FCS) method. Moreover, the literature lacks any methodological evaluation of three-level imputation. Thus, this thesis serves two purposes: (1) to develop an algorithm in order to implement FCS in the context of a three-level model and (2) to evaluate both imputation methods. The simulation investigated a random intercept model under both 20% and 40% missing data rates. The findings of this thesis suggest that the estimates for both JM and FCS were largely unbiased, gave good coverage, and produced similar results. The sole exception for both methods was the slope for the level-3 variable, which was modestly biased. The bias exhibited by the methods could be due to the small number of clusters used. This finding suggests that future research ought to investigate and establish clear recommendations for the number of clusters required by these imputation methods. To conclude, this thesis serves as a preliminary start in tackling a much larger issue and gap in the current missing data literature.

Contributors: Keller, Brian Tinnell (Author) / Enders, Craig K. (Thesis advisor) / Grimm, Kevin J. (Committee member) / Levy, Roy (Committee member) / Arizona State University (Publisher)

Created: 2015
