Matching Items (2)
Description
Decision trees are a machine learning technique that searches the predictor space for the variable and observed value that yield the best prediction when the data are split into two nodes on that variable and splitting value. Conditional Inference Trees (CTREEs) are a non-parametric class of decision trees that use statistical theory to select variables for splitting. Missing data are problematic in decision trees because an observation with a missing value on the chosen splitting variable cannot be placed into a node; moreover, missing data can alter which variables are selected for splitting. Simple missing data approaches (e.g., deletion, majority rule, and surrogate splits) have been implemented in decision tree algorithms; however, more sophisticated missing data techniques have not been thoroughly examined. In addition to these approaches, this dissertation proposed a modified multiple imputation approach to handling missing data in CTREEs. A simulation was conducted to compare this approach with the simple missing data approaches as well as with single imputation and multiple imputation with prediction averaging. Results revealed that simple approaches (i.e., majority rule, treating missingness as its own category, and listwise deletion) were effective in handling missing data in CTREEs. The modified multiple imputation approach did not outperform the simple approaches in most conditions, but it did appear best suited for small sample sizes and extreme missingness.
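The modified approach is not specified beyond this abstract, but the baseline it is compared against, multiple imputation with prediction averaging, can be sketched. Below is a hypothetical Python illustration: the dissertation studies CTREEs (e.g., R's partykit::ctree), so scikit-learn's CART-style regression tree and IterativeImputer stand in here only to show the workflow of fitting one tree per imputed data set and averaging the resulting predictions.

```python
# Hypothetical sketch of multiple imputation with prediction averaging for
# trees. Stand-ins: sklearn's CART-style tree in place of a CTREE, and
# IterativeImputer (with posterior sampling) as the imputation engine.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.tree import DecisionTreeRegressor

def mi_tree_predict(X_train, y_train, X_new, m=20, seed=0):
    """Fit one tree per imputed copy of the data; average the predictions."""
    rng = np.random.RandomState(seed)
    preds = []
    for _ in range(m):
        # sample_posterior=True draws imputed values from a predictive
        # distribution, so the m imputed data sets differ from one another
        imputer = IterativeImputer(sample_posterior=True,
                                   random_state=rng.randint(2**31 - 1))
        X_imp = imputer.fit_transform(X_train)
        tree = DecisionTreeRegressor(max_depth=3).fit(X_imp, y_train)
        preds.append(tree.predict(imputer.transform(X_new)))
    # prediction averaging: pool predictions, not parameters, across the
    # m imputations
    return np.mean(preds, axis=0)
```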
Contributors: Manapat, Danielle Marie (Author) / Grimm, Kevin J. (Thesis advisor) / Edwards, Michael C. (Thesis advisor) / McNeish, Daniel (Committee member) / Anderson, Samantha F. (Committee member) / Arizona State University (Publisher)
Created: 2023
Description
Longitudinal recursive partitioning (LRP) is a tree-based method for longitudinal data. It takes a sample of individuals who were each measured repeatedly across time and splits them on a set of covariates such that individuals with similar trajectories are grouped together into nodes. LRP does this by fitting a mixed-effects model to each node every time it is partitioned and extracting the deviance, which serves as the measure of node purity. LRP is implemented with the classification and regression tree algorithm, which suffers from a variable selection bias and does not guarantee reaching a global optimum. Additionally, fitting mixed-effects models to every potential split only to extract the deviance and discard the rest of the information is computationally intensive. Therefore, in this dissertation, I address the high computational demand, the variable selection bias, and the local optimum solution. I propose three approximation methods that reduce the computational demand of LRP and, at the same time, allow a straightforward extension to recursive partitioning algorithms that do not have a variable selection bias and can reach the global optimum solution. In all three approximations, a mixed-effects model is fit to the full data, and the growth curve coefficients for each individual are extracted. Then, (1) a principal component analysis is fit to the set of coefficients and each individual's principal component score is extracted, (2) a one-factor model is fit to the coefficients and the factor score is extracted, or (3) the coefficients are summed. Each method leaves every individual with a single score that represents the growth curve trajectory. Because the outcome is now a single score per individual, any tree-based method may be used to partition the data and group the individuals. Once the individuals are assigned to their final nodes, a mixed-effects model is fit to each terminal node with the individuals belonging to it.
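As a concrete illustration of approximation (1), the sketch below is a hypothetical Python rendering of the pipeline: fit one mixed-effects growth model to the full long-format data, extract each individual's intercept and slope, collapse them to a single PCA score, and hand that score to an off-the-shelf regression tree. The column and function names are illustrative, the dissertation does not prescribe these libraries, and the final step of refitting a mixed-effects model within each terminal node is noted but omitted.

```python
# Hypothetical sketch of approximation (1): mixed-effects coefficients ->
# PCA score -> ordinary regression tree. Assumes long_df has columns
# "id", "time", "y", and covariates_df is indexed by "id".
import numpy as np
import statsmodels.formula.api as smf
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeRegressor

def pca_score_tree(long_df, covariates_df):
    # Step 1: one mixed-effects growth model on the full data
    # (random intercept and slope per individual).
    mm = smf.mixedlm("y ~ time", data=long_df, groups="id",
                     re_formula="~time").fit()
    # Step 2: per-individual growth coefficients = fixed effects + BLUPs.
    # (Assumes fe_params and each random-effects vector share the same
    # intercept/slope ordering.)
    ids = list(mm.random_effects)
    coefs = np.vstack([mm.fe_params.values + mm.random_effects[i].values
                       for i in ids])
    # Step 3: collapse intercept and slope to one trajectory score.
    # Alternatives from the abstract: a one-factor model score, or simply
    # coefs.sum(axis=1).
    scores = PCA(n_components=1).fit_transform(coefs).ravel()
    # Step 4: any tree-based method can now partition on the covariates
    # with the score as the outcome; CART is used here for simplicity.
    tree = DecisionTreeRegressor(min_samples_leaf=30)
    tree.fit(covariates_df.loc[ids], scores)
    # Final step (omitted): refit a mixed-effects model within each
    # terminal node of the fitted tree.
    return tree, scores
```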

I conduct a simulation study showing that the approximation methods achieve the proposed goals while maintaining a level of out-of-sample prediction accuracy similar to LRP's. I then illustrate and compare the methods using an applied data set.
Contributors: Stegmann, Gabriela (Author) / Grimm, Kevin (Thesis advisor) / Edwards, Michael (Committee member) / MacKinnon, David (Committee member) / McNeish, Daniel (Committee member) / Arizona State University (Publisher)
Created: 2019