Search Content

Propensity score estimation with random forests

Description

Random Forests is a statistical learning method which has been proposed for propensity score estimation models that involve complex interactions, nonlinear relationships, or both of the covariates. In this dissertation I conducted a simulation study to examine the effects of three Random Forests model specifications in propensity score analysis. The…

Random Forests is a statistical learning method which has been proposed for propensity score estimation models that involve complex interactions, nonlinear relationships, or both of the covariates. In this dissertation I conducted a simulation study to examine the effects of three Random Forests model specifications in propensity score analysis. The results suggested that, depending on the nature of data, optimal specification of (1) decision rules to select the covariate and its split value in a Classification Tree, (2) the number of covariates randomly sampled for selection, and (3) methods of estimating Random Forests propensity scores could potentially produce an unbiased average treatment effect estimate after propensity scores weighting by the odds adjustment. Compared to the logistic regression estimation model using the true propensity score model, Random Forests had an additional advantage in producing unbiased estimated standard error and correct statistical inference of the average treatment effect. The relationship between the balance on the covariates' means and the bias of average treatment effect estimate was examined both within and between conditions of the simulation. Within conditions, across repeated samples there was no noticeable correlation between the covariates' mean differences and the magnitude of bias of average treatment effect estimate for the covariates that were imbalanced before adjustment. Between conditions, small mean differences of covariates after propensity score adjustment were not sensitive enough to identify the optimal Random Forests model specification for propensity score analysis.

ContributorsCham, Hei Ning (Author) / Tein, Jenn-Yun (Thesis advisor) / Enders, Stephen G (Thesis advisor) / Enders, Craig K. (Committee member) / Mackinnon, David P (Committee member) / Arizona State University (Publisher)

Created2013

Psychometric and Machine Learning Approaches to Diagnostic Classification

Description

The goal of diagnostic assessment is to discriminate between groups. In many cases, a binary decision is made conditional on a cut score from a continuous scale. Psychometric methods can improve assessment by modeling a latent variable using item response theory (IRT), and IRT scores can subsequently be used to…

The goal of diagnostic assessment is to discriminate between groups. In many cases, a binary decision is made conditional on a cut score from a continuous scale. Psychometric methods can improve assessment by modeling a latent variable using item response theory (IRT), and IRT scores can subsequently be used to determine a cut score using receiver operating characteristic (ROC) curves. Psychometric methods provide reliable and interpretable scores, but the prediction of the diagnosis is not the primary product of the measurement process. In contrast, machine learning methods, such as regularization or binary recursive partitioning, can build a model from the assessment items to predict the probability of diagnosis. Machine learning predicts the diagnosis directly, but does not provide an inferential framework to explain why item responses are related to the diagnosis. It remains unclear whether psychometric and machine learning methods have comparable accuracy or if one method is preferable in some situations. In this study, Monte Carlo simulation methods were used to compare psychometric and machine learning methods on diagnostic classification accuracy. Results suggest that classification accuracy of psychometric models depends on the diagnostic-test correlation and prevalence of diagnosis. Also, machine learning methods that reduce prediction error have inflated specificity and very low sensitivity compared to the data-generating model, especially when prevalence is low. Finally, machine learning methods that use ROC curves to determine probability thresholds have comparable classification accuracy to the psychometric models as sample size, number of items, and number of item categories increase. Therefore, results suggest that machine learning models could provide a viable alternative for classification in diagnostic assessments. Strengths and limitations for each of the methods are discussed, and future directions are considered.

ContributorsGonzález, Oscar (Author) / Mackinnon, David P (Thesis advisor) / Edwards, Michael C (Thesis advisor) / Grimm, Kevin J. (Committee member) / Zheng, Yi (Committee member) / Arizona State University (Publisher)

Created2018

Evaluation of five effect size measures of measurement non-invariance for continuous outcomes

Description

To make meaningful comparisons on a construct of interest across groups or over time, measurement invariance needs to exist for at least a subset of the observed variables that define the construct. Often, chi-square difference tests are used to test for measurement invariance. However, these statistics are affected by sample…

To make meaningful comparisons on a construct of interest across groups or over time, measurement invariance needs to exist for at least a subset of the observed variables that define the construct. Often, chi-square difference tests are used to test for measurement invariance. However, these statistics are affected by sample size such that larger sample sizes are associated with a greater prevalence of significant tests. Thus, using other measures of non-invariance to aid in the decision process would be beneficial. For this dissertation project, I proposed four new effect size measures of measurement non-invariance and analyzed a Monte Carlo simulation study to evaluate their properties and behavior in addition to the properties and behavior of an already existing effect size measure of non-invariance. The effect size measures were evaluated based on bias, variability, and consistency. Additionally, the factors that affected the value of the effect size measures were analyzed. All studied effect sizes were consistent, but three were biased under certain conditions. Further work is needed to establish benchmarks for the unbiased effect sizes.

ContributorsGunn, Heather J (Author) / Grimm, Kevin J. (Thesis advisor) / Edwards, Michael C (Thesis advisor) / Tein, Jenn-Yun (Committee member) / Anderson, Samantha F. (Committee member) / Arizona State University (Publisher)

Created2019

Sensitivity analysis of longitudinal measurement non-invariance: a second-order latent growth model approach with ordered-categorical indicators

Description

Researchers who conduct longitudinal studies are inherently interested in studying individual and population changes over time (e.g., mathematics achievement, subjective well-being). To answer such research questions, models of change (e.g., growth models) make the assumption of longitudinal measurement invariance. In many applied situations, key constructs are measured by a collection…

Researchers who conduct longitudinal studies are inherently interested in studying individual and population changes over time (e.g., mathematics achievement, subjective well-being). To answer such research questions, models of change (e.g., growth models) make the assumption of longitudinal measurement invariance. In many applied situations, key constructs are measured by a collection of ordered-categorical indicators (e.g., Likert scale items). To evaluate longitudinal measurement invariance with ordered-categorical indicators, a set of hierarchical models can be sequentially tested and compared. If the statistical tests of measurement invariance fail to be supported for one of the models, it is useful to have a method with which to gauge the practical significance of the differences in measurement model parameters over time. Drawing on studies of latent growth models and second-order latent growth models with continuous indicators (e.g., Kim & Willson, 2014a; 2014b; Leite, 2007; Wirth, 2008), this study examined the performance of a potential sensitivity analysis to gauge the practical significance of violations of longitudinal measurement invariance for ordered-categorical indicators using second-order latent growth models. The change in the estimate of the second-order growth parameters following the addition of an incorrect level of measurement invariance constraints at the first-order level was used as an effect size for measurement non-invariance. This study investigated how sensitive the proposed sensitivity analysis was to different locations of non-invariance (i.e., non-invariance in the factor loadings, the thresholds, and the unique factor variances) given a sufficient sample size. This study also examined whether the sensitivity of the proposed sensitivity analysis depended on a number of other factors including the magnitude of non-invariance, the number of non-invariant indicators, the number of non-invariant occasions, and the number of response categories in the indicators.

ContributorsLiu, Yu, Ph.D (Author) / West, Stephen G. (Thesis advisor) / Tein, Jenn-Yun (Thesis advisor) / Green, Samuel (Committee member) / Grimm, Kevin J. (Committee member) / Arizona State University (Publisher)

Created2016

Time metric in latent difference score models

Description

Time metric is an important consideration for all longitudinal models because it can influence the interpretation of estimates, parameter estimate accuracy, and model convergence in longitudinal models with latent variables. Currently, the literature on latent difference score (LDS) models does not discuss the importance of time metric. Furthermore, there is…

Time metric is an important consideration for all longitudinal models because it can influence the interpretation of estimates, parameter estimate accuracy, and model convergence in longitudinal models with latent variables. Currently, the literature on latent difference score (LDS) models does not discuss the importance of time metric. Furthermore, there is little research using simulations to investigate LDS models. This study examined the influence of time metric on model estimation, interpretation, parameter estimate accuracy, and convergence in LDS models using empirical simulations. Results indicated that for a time structure with a true time metric where participants had different starting points and unequally spaced intervals, LDS models fit with a restructured and less informative time metric resulted in biased parameter estimates. However, models examined using the true time metric were less likely to converge than models using the restructured time metric, likely due to missing data. Where participants had different starting points but equally spaced intervals, LDS models fit with a restructured time metric resulted in biased estimates of intercept means, but all other parameter estimates were unbiased, and models examined using the true time metric had less convergence than the restructured time metric as well due to missing data. The findings of this study support prior research on time metric in longitudinal models, and further research should examine these findings under alternative conditions. The importance of these findings for substantive researchers is discussed.

ContributorsO'Rourke, Holly P (Author) / Grimm, Kevin J. (Thesis advisor) / Mackinnon, David P (Thesis advisor) / Chassin, Laurie (Committee member) / Aiken, Leona S. (Committee member) / Arizona State University (Publisher)

Created2016

Performance of contextual multilevel models for comparing between-person and within-person effects

Description

The comparison of between- versus within-person relations addresses a central issue in psychological research regarding whether group-level relations among variables generalize to individual group members. Between- and within-person effects may differ in magnitude as well as direction, and contextual multilevel models can accommodate this difference. Contextual multilevel models have been…

The comparison of between- versus within-person relations addresses a central issue in psychological research regarding whether group-level relations among variables generalize to individual group members. Between- and within-person effects may differ in magnitude as well as direction, and contextual multilevel models can accommodate this difference. Contextual multilevel models have been explicated mostly for cross-sectional data, but they can also be applied to longitudinal data where level-1 effects represent within-person relations and level-2 effects represent between-person relations. With longitudinal data, estimating the contextual effect allows direct evaluation of whether between-person and within-person effects differ. Furthermore, these models, unlike single-level models, permit individual differences by allowing within-person slopes to vary across individuals. This study examined the statistical performance of the contextual model with a random slope for longitudinal within-person fluctuation data.

A Monte Carlo simulation was used to generate data based on the contextual multilevel model, where sample size, effect size, and intraclass correlation (ICC) of the predictor variable were varied. The effects of simulation factors on parameter bias, parameter variability, and standard error accuracy were assessed. Parameter estimates were in general unbiased. Power to detect the slope variance and contextual effect was over 80% for most conditions, except some of the smaller sample size conditions. Type I error rates for the contextual effect were also high for some of the smaller sample size conditions. Conclusions and future directions are discussed.

ContributorsWurpts, Ingrid Carlson (Author) / Mackinnon, David P (Thesis advisor) / West, Stephen G. (Committee member) / Grimm, Kevin J. (Committee member) / Suk, Hye Won (Committee member) / Arizona State University (Publisher)

Created2016

Filtering by

Propensity score estimation with random forests

Psychometric and Machine Learning Approaches to Diagnostic Classification

Evaluation of five effect size measures of measurement non-invariance for continuous outcomes

Sensitivity analysis of longitudinal measurement non-invariance: a second-order latent growth model approach with ordered-categorical indicators

Time metric in latent difference score models

Performance of contextual multilevel models for comparing between-person and within-person effects