Matching Items (2)

Addressing the Variable Selection Bias and Local Optimum Limitations of Longitudinal Recursive Partitioning with Time-Efficient Approximations

Description

Longitudinal recursive partitioning (LRP) is a tree-based method for longitudinal data. It takes a sample of individuals who were each measured repeatedly across time and splits them on a set of covariates so that individuals with similar trajectories are grouped together into nodes. LRP does this by fitting a mixed-effects model to each node each time the node is partitioned and extracting the deviance, which serves as the measure of node purity. LRP is implemented using the classification and regression tree algorithm, which suffers from a variable selection bias and does not guarantee reaching a global optimum. Additionally, fitting a mixed-effects model to each potential split only to extract the deviance and discard the rest of the information is computationally intensive. Therefore, in this dissertation, I address the high computational demand, the variable selection bias, and the local optimum limitation. I propose three approximation methods that reduce the computational demand of LRP and, at the same time, allow for a straightforward extension to recursive partitioning algorithms that do not have a variable selection bias and can reach the global optimum solution. In the three proposed approximations, a mixed-effects model is fit to the full data, and the growth curve coefficients for each individual are extracted. Then, (1) a principal component analysis is fit to the set of coefficients and the principal component score is extracted for each individual, (2) a one-factor model is fit to the coefficients and the factor score is extracted, or (3) the coefficients are summed. Each of the three methods gives every individual a single score that represents the growth curve trajectory. Because the outcome is now a single score per individual, any tree-based method may be used to partition the data and group the individuals together.
Once the individuals are assigned to their final nodes, a mixed-effects model is fit to each terminal node using the individuals that belong to it.
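
The score-then-partition idea described above can be illustrated with a minimal sketch. It is not the dissertation's implementation and rests on several assumptions: per-individual least-squares lines stand in for the coefficients extracted from a mixed-effects model, the principal component analysis is run on unstandardized coefficients, a single CART-style split stands in for a full tree, and the one-factor approximation (2) is not shown. All function names and the toy data are hypothetical.

```python
import random

def ols_line(times, values):
    """Least-squares intercept and slope for one individual's repeated measures.
    Assumption: a stand-in for coefficients extracted from a mixed-effects model."""
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(values) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope

def sum_score(coefs):
    """Approximation (3): sum each individual's growth curve coefficients."""
    return [sum(c) for c in coefs]

def pc1_score(coefs, iters=200):
    """Approximation (1): score on the first principal component,
    found by power iteration on the coefficient covariance matrix."""
    n, p = len(coefs), len(coefs[0])
    means = [sum(c[j] for c in coefs) / n for j in range(p)]
    centered = [[c[j] - means[j] for j in range(p)] for c in coefs]
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return [sum(r[j] * v[j] for j in range(p)) for r in centered]

def best_split(covariate, score):
    """One CART-style split: the cut point on the covariate that
    minimizes the within-node sum of squares of the single score."""
    def sse(vals):
        m = sum(vals) / len(vals)
        return sum((x - m) ** 2 for x in vals)
    best = None
    cuts = sorted(set(covariate))
    for lo, hi in zip(cuts, cuts[1:]):
        cut = (lo + hi) / 2
        left = [s for x, s in zip(covariate, score) if x <= cut]
        right = [s for x, s in zip(covariate, score) if x > cut]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (cut, total)
    return best[0]

# Toy data (hypothetical): 40 individuals, 4 time points each; individuals
# with covariate >= 20 follow a steeper trajectory than the rest.
rng = random.Random(0)
times = [0, 1, 2, 3]
covariate, coefs = [], []
for i in range(40):
    slope = 2.0 if i >= 20 else 0.5
    y = [10 + slope * t + rng.gauss(0, 0.1) for t in times]
    covariate.append(i)
    coefs.append(ols_line(times, y))

cut_sum = best_split(covariate, sum_score(coefs))
cut_pc1 = best_split(covariate, pc1_score(coefs))
print("split on summed coefficients:", cut_sum)
print("split on first principal component score:", cut_pc1)
```

Because each individual is reduced to one number before partitioning, the split search only compares sums of squares and never refits a longitudinal model per candidate split, which is the source of the computational saving the abstract describes. In the approach described above, a mixed-effects model would then be fit within each resulting node.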

I conduct a simulation study in which I show that the approximation methods achieve the proposed goals while maintaining a level of out-of-sample prediction accuracy similar to that of LRP. I then illustrate and compare the methods using an applied data set.

Date Created
  • 2019

Handling sparse and missing data in functional data analysis: a functional mixed-effects model approach

Description

This paper investigates a relatively new analysis method for longitudinal data in the framework of functional data analysis. This approach treats longitudinal data as so-called sparse functional data. The first section of the paper introduces functional data and the general ideas of functional data analysis. The second section discusses the analysis of longitudinal data in the context of functional data analysis, while considering the unique characteristics of longitudinal data, in particular sparseness and missing data. The third section introduces functional mixed-effects models that can handle these characteristics. The next section discusses a preliminary simulation study conducted to examine the performance of a functional mixed-effects model under various conditions. An extended simulation study was then carried out to evaluate the estimation accuracy of a functional mixed-effects model. Specifically, the accuracy of the estimated trajectories was examined under various conditions, including different types of missing data and varying levels of sparseness.

Date Created
  • 2016