Matching Items (2)
Description
Coarsely grouped counts or frequencies are commonly used in the behavioral sciences. Grouped count and grouped frequency (GCGF) variables that are used as outcomes often violate the assumptions of linear regression as well as those of models designed for categorical outcomes; there is no analytic model that is designed specifically to accommodate GCGF outcomes. The purpose of this dissertation was to compare the statistical performance of four regression models (linear regression, Poisson regression, ordinal logistic regression, and beta regression) that can be used when the outcome is a GCGF variable. A simulation study was used to determine the power, type I error, and confidence interval (CI) coverage rates for these models under different conditions. Mean structure, variance structure, effect size, predictor type (continuous or binary), and sample size were included in the factorial design. Mean structures reflected either a linear relationship or an exponential relationship between the predictor and the outcome. Variance structures reflected homoscedastic (as in linear regression), heteroscedastic (monotonically increasing), or heteroscedastic (increasing then decreasing) variance. Small to medium, large, and very large effect sizes were examined. Sample sizes were 100, 200, 500, and 1000. Results of the simulation study showed that ordinal logistic regression produced type I error, statistical power, and CI coverage rates that were consistently within acceptable limits. Linear regression produced type I error and statistical power that were within acceptable limits, but CI coverage was too low for several conditions important to the analysis of counts and frequencies. Poisson regression and beta regression displayed inflated type I error, low statistical power, and low CI coverage rates for nearly all conditions. All models produced unbiased estimates of the regression coefficient.
Based on the statistical performance of the four models, ordinal logistic regression seems to be the preferred method for analyzing GCGF outcomes. Linear regression also performed well, but CI coverage was too low for conditions with an exponential mean structure and/or heteroscedastic variance. Some aspects of model prediction, such as model fit, were not assessed here; more research is necessary to determine which statistical model best captures the unique properties of GCGF outcomes.
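As a rough illustration of the kind of simulation described above, the following sketch generates Poisson counts, collapses them into coarse categories, and estimates the type I error rate of a linear regression slope test when the predictor has no effect. Everything specific here is an assumption made for illustration and is not taken from the dissertation: the category cutpoints, the Poisson mean, the number of replications, the helper names, and the use of a normal critical value (1.96) in place of the exact t critical value.

```python
import bisect
import math
import random

# Hypothetical upper bounds of the coarse response categories,
# i.e. "0", "1-2", "3-5", "6-10", "11 or more" (an assumption).
CUTPOINTS = [0, 2, 5, 10]


def group_count(count, cutpoints=CUTPOINTS):
    """Map a raw count to the index of its coarse category."""
    return bisect.bisect_left(cutpoints, count)


def poisson(rng, mean):
    """Draw one Poisson variate with Knuth's multiplication method."""
    limit = math.exp(-mean)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def slope_z(x, y):
    """Return the OLS slope divided by its standard error."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(sse / (n - 2) / sxx)
    return b / se if se > 0 else 0.0


def type_i_error(n=100, reps=500, mean=3.0, seed=1):
    """Share of replications rejecting a true null slope (|z| > 1.96)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        # The outcome ignores x, so the true slope is zero.
        y = [group_count(poisson(rng, mean)) for _ in range(n)]
        if abs(slope_z(x, y)) > 1.96:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    print("Estimated type I error:", type_i_error())
```

A rate near the nominal 0.05 would count as acceptable in this null condition. The full design described above additionally varies the mean structure, variance structure, effect size, predictor type, and sample size, and compares four models rather than one.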
Contributors: Coxe, Stefany (Author) / Aiken, Leona S. (Thesis advisor) / West, Stephen G. (Thesis advisor) / Mackinnon, David P. (Committee member) / Reiser, Mark R. (Committee member) / Arizona State University (Publisher)
Created: 2012
Description
Time-to-event analysis, or equivalently survival analysis, deals with two variables simultaneously: when an event occurs (time information) and whether or not the event is observed during the observation period (censoring information). In the behavioral and social sciences, the event of interest usually does not lead to a terminal state such as death. Other outcomes after the event can be collected, and thus the survival variable can be considered a predictor as well as an outcome in a study. One example of a case where the survival variable serves as a predictor as well as an outcome is a survival-mediator model. In a single survival-mediator model, an independent variable, X, predicts a survival variable, M, which in turn predicts a continuous outcome, Y. The survival-mediator model consists of two regression equations: X predicting M (M-regression), and M and X simultaneously predicting Y (Y-regression). To estimate the regression coefficients of the survival-mediator model, Cox regression is used for the M-regression. Ordinary least squares regression is used for the Y-regression, using complete case analysis and assuming that censored data in M are missing completely at random, so that the Y-regression is unbiased. In this dissertation research, different measures for the indirect effect were proposed and a simulation study was conducted to compare the performance of different indirect effect test methods. Bias-corrected bootstrapping produced high Type I error rates as well as low parameter coverage rates in some conditions. In contrast, the Sobel test produced low Type I error rates as well as high parameter coverage rates in some conditions. The bootstrap of the natural indirect effect produced low Type I error and low statistical power when the censoring proportion was non-zero. Percentile bootstrapping, the distribution of the product, and the joint-significance test showed the best performance. Statistical analysis of the survival-mediator model is discussed.
Two indirect effect measures, the ab-product and the natural indirect effect, are compared and discussed. Limitations and future directions of the simulation study are discussed. Last, an interpretation of the survival-mediator model for a made-up empirical data set is provided to clarify the meaning of the quantities in the survival-mediator model.
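The two regression equations of the survival-mediator model can be written out as a sketch. The notation below is an assumption based on conventional mediation and Cox regression notation, not taken from the dissertation itself: a is the coefficient of X in the M-regression (on the log-hazard scale), b and c' are the coefficients of M and X in the Y-regression, h_0(t) is the baseline hazard, i is the intercept, and ε is the residual.

```latex
% M-regression (Cox regression): X predicts the hazard of the event time M
h(t \mid X) = h_0(t)\,\exp(aX)

% Y-regression (ordinary least squares): M and X jointly predict Y
Y = i + bM + c'X + \varepsilon

% ab-product measure of the indirect effect
\text{indirect effect} = ab
```

Under this notation, the ab-product combines a coefficient on the log-hazard scale with one on the scale of Y, which may be one reason the natural indirect effect is considered as an alternative measure.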
Contributors: Kim, Han Joe (Author) / Mackinnon, David P. (Thesis advisor) / Tein, Jenn-Yun (Thesis advisor) / West, Stephen G. (Committee member) / Grimm, Kevin J. (Committee member) / Arizona State University (Publisher)
Created: 2017