Matching Items (23)
Description
Designing studies that use latent growth modeling to investigate change over time calls for optimal approaches for conducting power analysis for a priori determination of required sample size. This investigation (1) studied the impacts of variations in specified parameters, design features, and model misspecification in simulation-based power analyses and (2) compared power estimates across three common power analysis techniques: the Monte Carlo method; the Satorra-Saris method; and the method developed by MacCallum, Browne, and Cai (MBC). Choice of sample size, effect size, and slope variance parameters markedly influenced power estimates; however, level-1 error variance and number of repeated measures (3 vs. 6) when study length was held constant had little impact on resulting power. Under some conditions, having a moderate versus small effect size or using a sample size of 800 versus 200 increased power by approximately .40, and a slope variance of 10 versus 20 increased power by up to .24. Decreasing error variance from 100 to 50, however, increased power by no more than .09 and increasing measurement occasions from 3 to 6 increased power by no more than .04. Misspecification in level-1 error structure had little influence on power, whereas misspecifying the form of the growth model as linear rather than quadratic dramatically reduced power for detecting differences in slopes. Additionally, power estimates based on the Monte Carlo and Satorra-Saris techniques never differed by more than .03, even with small sample sizes, whereas power estimates for the MBC technique appeared quite discrepant from the other two techniques. Results suggest the choice between using the Satorra-Saris or Monte Carlo technique in a priori power analyses for slope differences in latent growth models is a matter of preference, although features such as missing data can only be considered within the Monte Carlo approach. Further, researchers conducting power analyses for slope differences in latent growth models should pay greatest attention to estimating slope difference, slope variance, and sample size. Arguments are also made for examining model-implied covariance matrices based on estimated parameters and graphic depictions of slope variance to help ensure parameter estimates are reasonable in a priori power analysis.
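To make the Monte Carlo power-analysis logic concrete, here is a minimal Python sketch, not the dissertation's code: it substitutes a per-person OLS slope test for a full latent growth model fit, and all parameter values (slope difference, slope variance, level-1 error variance, measurement occasions) are illustrative assumptions.

```python
import numpy as np
from scipy import stats

def mc_power(n_per_group=100, slope_diff=0.5, slope_var=10.0, err_var=100.0,
             occasions=(0, 1, 2, 3, 4, 5), n_reps=1000, alpha=0.05, seed=1):
    """Monte Carlo power for a group difference in growth slopes (sketch)."""
    rng = np.random.default_rng(seed)
    t = np.asarray(occasions, dtype=float)
    tc = t - t.mean()
    hits = 0
    for _ in range(n_reps):
        group_slopes = []
        for mean_slope in (0.0, slope_diff):
            slopes = rng.normal(mean_slope, np.sqrt(slope_var), n_per_group)
            intercepts = rng.normal(0.0, 1.0, n_per_group)
            noise = rng.normal(0.0, np.sqrt(err_var), (n_per_group, t.size))
            y = intercepts[:, None] + slopes[:, None] * t + noise
            # per-person OLS slope: sum((t - tbar)(y - ybar)) / sum((t - tbar)^2)
            bhat = (y - y.mean(axis=1, keepdims=True)) @ tc / (tc ** 2).sum()
            group_slopes.append(bhat)
        hits += stats.ttest_ind(*group_slopes).pvalue < alpha
    return hits / n_reps

# Larger samples raise power markedly, as the abstract reports.
print(mc_power(n_per_group=100), mc_power(n_per_group=400))
```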
Contributors: Van Vleet, Bethany Lucía (Author) / Thompson, Marilyn S. (Thesis advisor) / Green, Samuel B. (Committee member) / Enders, Craig K. (Committee member) / Arizona State University (Publisher)
Created: 2011
Description
Item response theory (IRT) and related latent variable models represent modern psychometric theory, the successor to classical test theory in psychological assessment. While IRT has become prevalent in the assessment of ability and achievement, it has not been widely embraced by clinical psychologists. This appears due, in part, to psychometrists' use of unidimensional models despite evidence that psychiatric disorders are inherently multidimensional. The construct validity of unidimensional and multidimensional latent variable models was compared to evaluate the utility of modern psychometric theory in clinical assessment. Archival data consisting of 688 outpatients' presenting concerns, psychiatric diagnoses, and item level responses to the Brief Symptom Inventory (BSI) were extracted from files at a university mental health clinic. Confirmatory factor analyses revealed that models with oblique factors and/or item cross-loadings better represented the internal structure of the BSI in comparison to a strictly unidimensional model. The models were generally equivalent in their ability to account for variance in criterion-related validity variables; however, bifactor models demonstrated superior validity in differentiating between mood and anxiety disorder diagnoses. Multidimensional IRT analyses showed that the orthogonal bifactor model partitioned distinct, clinically relevant sources of item variance. Similar results were also achieved through multivariate prediction with an oblique simple structure model. Receiver operating characteristic curves confirmed improved sensitivity and specificity through multidimensional models of psychopathology. Clinical researchers are encouraged to consider these and other comprehensive models of psychological distress.
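The ROC comparison described above can be sketched as follows; the data, factor-score names, and effect sizes here are hypothetical stand-ins, not the study's BSI results.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical data: 0 = anxiety diagnosis, 1 = mood diagnosis, with scores
# on a general-distress factor and a depression-specific (bifactor) factor.
rng = np.random.default_rng(0)
diagnosis = rng.integers(0, 2, 300)
general_score = rng.normal(0.2 * diagnosis, 1.0)    # weak differentiation
specific_score = rng.normal(0.8 * diagnosis, 1.0)   # stronger differentiation

# Higher AUC for the specific factor illustrates how a bifactor model can
# partition clinically relevant variance that a single score blurs together.
for name, score in [("general", general_score), ("specific", specific_score)]:
    print(name, round(roc_auc_score(diagnosis, score), 2))
```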
Contributors: Thomas, Michael Lee (Author) / Lanyon, Richard (Thesis advisor) / Barrera, Manuel (Committee member) / Levy, Roy (Committee member) / Millsap, Roger (Committee member) / Arizona State University (Publisher)
Created: 2011
Description
The current study employed item difficulty modeling procedures to evaluate the feasibility of potential generative item features for nonword repetition. Specifically, the extent to which the manipulated item features affect the theoretical mechanisms that underlie nonword repetition accuracy was estimated. Generative item features were based on the phonological loop component of Baddeley's model of working memory, which addresses phonological short-term memory (Baddeley, 2000, 2003; Baddeley & Hitch, 1974). Using researcher-developed software, nonwords were generated to adhere to the phonological constraints of Spanish. Thirty-six nonwords were chosen based on the set of item features identified by the proposed cognitive processing model. Using a planned missing data design, two hundred fifteen Spanish-English bilingual children were administered 24 of the 36 generated nonwords. Multiple regression and explanatory item response modeling techniques (e.g., linear logistic test model, LLTM; Fischer, 1973) were used to estimate the impact of item features on item difficulty. The final LLTM included three item radicals and two item incidentals. Results indicated that the LLTM-predicted item difficulties were highly correlated with the Rasch item difficulties (r = .89) and accounted for a substantial amount of the variance in item difficulty (R² = .79). The findings are discussed in terms of validity evidence in support of using the phonological loop component of Baddeley's model (2000) as a cognitive processing model for nonword repetition items and the feasibility of using the proposed radical structure as an item blueprint for the future generation of nonword repetition items.
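The LLTM idea is that item difficulty decomposes into a weighted sum of feature effects, b_i = Σ_k q_ik·η_k. A minimal sketch, under assumed feature counts and simulated difficulties (the real LLTM estimates the η weights inside the IRT likelihood; least squares is used here only as an approximation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature (Q) matrix: 36 nonwords x 5 features (3 radicals,
# 2 incidentals), plus simulated Rasch difficulties standing in for data.
Q = rng.integers(0, 3, size=(36, 5)).astype(float)
true_eta = np.array([0.6, 0.4, 0.3, 0.1, 0.05])
b_rasch = Q @ true_eta + rng.normal(0.0, 0.3, 36)

# Least-squares approximation: regress difficulties on feature values.
X = np.column_stack([np.ones(36), Q])
eta_hat, *_ = np.linalg.lstsq(X, b_rasch, rcond=None)
b_pred = X @ eta_hat

# Correlation between predicted and Rasch difficulties, as in the abstract.
r = np.corrcoef(b_pred, b_rasch)[0, 1]
print(f"r = {r:.2f}, R^2 = {r * r:.2f}")
```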
Contributors: Morgan, Gareth Philip (Author) / Gorin, Joanna (Thesis advisor) / Levy, Roy (Committee member) / Gray, Shelley (Committee member) / Arizona State University (Publisher)
Created: 2011
Description
The purpose of this study was to investigate the effect of complex structure on dimensionality assessment in compensatory and noncompensatory multidimensional item response theory (MIRT) models of assessment data, using dimensionality assessment procedures based on conditional covariances (i.e., DETECT) and a factor-analytic approach (i.e., NOHARM). The DETECT-based methods typically outperformed the NOHARM-based methods in both two- (2D) and three-dimensional (3D) compensatory MIRT conditions. The DETECT-based methods yielded a high proportion correct, especially when correlations were .60 or smaller, data exhibited 30% or less complexity, and sample sizes were larger. As complexity increased and sample size decreased, performance typically diminished. As complexity increased, it also became more difficult to label the resulting sets of items from DETECT in terms of the dimensions. DETECT was consistent in classifying simple items, but less consistent in classifying complex items. Of the three NOHARM-based methods, χ²G/D and ALR generally outperformed RMSR. χ²G/D was more accurate when N = 500 and complexity levels were 30% or lower. As the number of items increased, ALR performance improved at a correlation of .60 and 30% or less complexity. When the data followed a noncompensatory MIRT model, the NOHARM-based methods, specifically χ²G/D and ALR, were the most accurate of all five methods. The marginal proportions for labeling sets of items as dimension-like were typically low, suggesting that the methods generally failed to label two (three) sets of items as dimension-like in 2D (3D) noncompensatory situations. The DETECT-based methods were more consistent in classifying simple items across complexity levels, sample sizes, and correlations. However, as complexity and correlation levels increased, the classification rates for all methods decreased. In most conditions, the DETECT-based methods classified complex items as consistently as or more consistently than the NOHARM-based methods. In particular, as complexity, the number of items, and the true dimensionality increased, the DETECT-based methods were notably more consistent than any NOHARM-based method. Despite DETECT's consistency, when data follow a noncompensatory MIRT model, the NOHARM-based methods should be preferred over the DETECT-based methods for assessing dimensionality, due to DETECT's poor performance in identifying the true dimensionality.
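DETECT is built on item-pair covariances conditional on the remaining items' total ("rest") score. A minimal sketch of that building block only; the operational DETECT index additionally weights strata by size and searches over item-cluster partitions, which is omitted here.

```python
import numpy as np

def conditional_cov(responses, i, j):
    """Mean covariance of items i and j conditional on the rest score.
    For unidimensional data this is near zero; a positive value suggests
    items i and j share a dimension beyond the one driving the rest score."""
    rest = responses.sum(axis=1) - responses[:, i] - responses[:, j]
    covs = []
    for s in np.unique(rest):
        grp = responses[rest == s]      # examinees in this rest-score stratum
        if len(grp) > 1:
            covs.append(np.cov(grp[:, i], grp[:, j])[0, 1])
    return float(np.mean(covs))

# Example with random (structureless) 0/1 data: 500 examinees, 10 items.
data = (np.random.default_rng(0).random((500, 10)) > 0.5).astype(int)
print(conditional_cov(data, 0, 1))
```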
Contributors: Svetina, Dubravka (Author) / Levy, Roy (Thesis advisor) / Gorin, Joanna S. (Committee member) / Millsap, Roger (Committee member) / Arizona State University (Publisher)
Created: 2011
Description
Code-switching, a bilingual language phenomenon that may be defined as the concurrent use of two or more languages by fluent speakers, is frequently misunderstood and stigmatized. Given that the majority of the world's population is bilingual rather than monolingual, the study of code-switching provides a fundamental window into human cognition and the systematic structural outcomes of language contact. Intra-sentential code-switching is said to occur systematically, constrained by the lexicons of each respective language. To access information about the acceptability of certain switches, linguists often elicit grammaticality judgments from bilingual informants. In current linguistic research, grammaticality judgment tasks are often scrutinized on account of the instability of responses to individual sentences. Although this claim is largely motivated by research on monolingual strings under a variety of conditions, the stability of code-switched grammaticality judgment data given by bilingual informants has yet to be systematically investigated. By comparing grammaticality judgment data from three groups of German-English bilinguals, Group A (N=50), Group B (N=34), and Group C (N=40), this thesis investigates the stability of grammaticality judgments in code-switching over time, as well as potential differences between judgments of spoken and written code-switching stimuli. In a web-based survey, informants rated each code-switched token. A correlated-groups t test attests to the stability of code-switched judgment data over time (p = .271) and thus to the validity of the methodologies currently in place. Furthermore, an independent-groups t test found no statistically significant difference between spoken and written judgment data (p = .186), contributing a valuable finding to the body of data-collection practice in bilingualism research. ANOVA results indicate significant differences attributable to language dominance for specific token types; however, when group composite scores across all tokens were used, the ANOVA returned a non-significant result (p = .234), suggesting that bilinguals with differing language dominance rank tokens in a similar manner. These findings should help clarify current practices in code-switching research.
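The two significance tests named above map directly onto standard SciPy calls; the ratings below are simulated placeholders, not the thesis data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical mean acceptability ratings per informant at two time points
# (paired design, as in a correlated-groups t test of stability over time).
time1 = rng.normal(3.5, 0.6, 34)
time2 = time1 + rng.normal(0.0, 0.3, 34)
print("stability over time: p =", round(stats.ttest_rel(time1, time2).pvalue, 3))

# Hypothetical ratings for spoken vs. written stimuli (independent groups).
spoken = rng.normal(3.4, 0.6, 50)
written = rng.normal(3.5, 0.6, 40)
print("spoken vs. written:  p =", round(stats.ttest_ind(spoken, written).pvalue, 3))
```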
Contributors: Grabowski, Jane (Author) / Gilfillan, Daniel (Thesis advisor) / MacSwan, Jeff (Thesis advisor) / Ghanem, Carla (Committee member) / Arizona State University (Publisher)
Created: 2011
Description
In investigating mediating processes, researchers usually use randomized experiments and linear regression or structural equation modeling to determine whether the treatment affects the hypothesized mediator and whether the mediator affects the targeted outcome. However, randomizing the treatment will not yield accurate causal path estimates unless certain assumptions are satisfied. Since randomization of the mediator may not be plausible for most studies (i.e., mediator status is not randomly assigned, but self-selected by participants), both the direct and indirect effects may be biased by confounding variables. The purpose of this dissertation is (1) to investigate the extent to which traditional mediation methods are affected by confounding variables and (2) to assess the statistical performance of several modern methods that address confounding variable effects in mediation analysis. This dissertation first reviewed the theoretical foundations of causal inference in statistical mediation analysis and modern statistical analysis for causal inference, and then described different methods to estimate causal direct and indirect effects in the presence of two post-treatment confounders. A large simulation study was designed to evaluate the extent to which ordinary regression and modern causal inference methods are able to obtain correct estimates of the direct and indirect effects when confounding variables that are present in the population are not included in the analysis. Five methods were compared in terms of bias, relative bias, mean square error, statistical power, Type I error rates, and confidence interval coverage to test how robust the methods were to violation of the no-unmeasured-confounders assumption and to confounder effect sizes. The methods explored were linear regression with adjustment, inverse propensity weighting, inverse propensity weighting with truncated weights, sequential g-estimation, and doubly robust sequential g-estimation. Results showed that, in estimating the direct and indirect effects, sequential g-estimation generally performed best in terms of bias, Type I error rates, power, and coverage across different confounder effects, direct effects, and sample sizes when all confounders were included in the estimation. When one of the two confounders was omitted from the estimation process, none of the methods had acceptable relative bias in the simulation study. Omitting one of the confounders corresponds to the common case in mediation studies where no measure of a confounder is available but the confounder may affect the analysis. Failing to measure potential post-treatment confounder variables in a mediation model leads to biased estimates regardless of the analysis method used, which emphasizes the importance of sensitivity analysis for causal mediation analysis.
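A rough sketch of the two-step logic of sequential g-estimation, assuming linear models and no treatment-by-mediator interaction; this is an illustration of the general technique, not the dissertation's implementation.

```python
import numpy as np
import statsmodels.api as sm

def sequential_g(y, t, m, post_confounders):
    """Sequential g-estimation sketch for the controlled direct effect.
    Step 1: estimate the mediator's effect on y, adjusting for treatment
    and post-treatment confounders. Step 2: strip that effect from y and
    regress the adjusted outcome on treatment alone."""
    X1 = sm.add_constant(np.column_stack([t, m, post_confounders]))
    b_m = sm.OLS(y, X1).fit().params[2]        # coefficient on the mediator
    y_adj = y - b_m * m                        # mediator effect removed
    X2 = sm.add_constant(t)
    direct = sm.OLS(y_adj, X2).fit().params[1]
    total = sm.OLS(y, X2).fit().params[1]
    return direct, total - direct              # direct and indirect effects
```

The key design point is that the confounders enter step 1 (so b_m is not confounded) but are deliberately left out of step 2, avoiding the collider bias that plain covariate adjustment for post-treatment variables can induce.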
Contributors: Kisbu Sakarya, Yasemin (Author) / Mackinnon, David Peter (Thesis advisor) / Aiken, Leona (Committee member) / West, Stephen (Committee member) / Millsap, Roger (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
The use of exams for classification purposes has become prevalent across many fields, including professional assessment for employment screening and standards-based testing in educational settings. Classification exams assign individuals to performance groups based on the comparison of their observed test scores to a pre-selected criterion (e.g., masters vs. nonmasters in dichotomous classification scenarios). The successful use of exams for classification purposes assumes at least minimal levels of accuracy in these classifications. Classification accuracy is an index that reflects the rate at which individuals are correctly classified into the category that contains their true ability score. Traditional methods estimate classification accuracy via methods that assume true scores follow a four-parameter beta-binomial distribution. Recent research suggests that item response theory may be a preferable framework for estimating examinees' true scores and may return more accurate classifications based on these scores. Researchers hypothesized that test length, the location of the cut score, the distribution of items, and the distribution of examinee ability would impact the recovery of accurate estimates of classification accuracy. The current simulation study manipulated these factors to assess their potential influence on classification accuracy. Observed classification as masters vs. nonmasters, true classification accuracy, estimated classification accuracy, BIAS, and RMSE were analyzed. In addition, analysis of variance tests were conducted to determine whether an interrelationship existed between levels of the four manipulated factors. Results showed small values of estimated classification accuracy and increased BIAS in accuracy estimates with few items, mismatched distributions of item difficulty and examinee ability, and extreme cut scores. A significant four-way interaction between the manipulated variables was observed. In addition to interpretations of these findings and explanations of potential causes for the recovered values, recommendations that inform practice and avenues for future research are provided.
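Classification accuracy has a direct Monte Carlo reading: simulate true scores, generate observed scores from them, and count how often observed and true scores land on the same side of the cut. A minimal sketch under an assumed beta true-score distribution and binomial observed scores (illustrative values throughout, not the study's design):

```python
import numpy as np

def classification_accuracy(n_items, cut=0.6, n_examinees=100_000, seed=0):
    """Proportion of examinees whose observed pass/fail classification
    matches the classification implied by their true score."""
    rng = np.random.default_rng(seed)
    true_p = rng.beta(4, 2, n_examinees)                # true proportion-correct
    observed = rng.binomial(n_items, true_p) / n_items  # observed score
    return float(np.mean((true_p >= cut) == (observed >= cut)))

# Test length effect: fewer items -> noisier scores -> lower accuracy.
print(classification_accuracy(n_items=10))
print(classification_accuracy(n_items=80))
```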
Contributors: Kunze, Katie (Author) / Gorin, Joanna (Thesis advisor) / Levy, Roy (Thesis advisor) / Green, Samuel (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Including a covariate can increase power to detect an effect between two variables. Although previous research has studied power in mediation models, the extent to which the inclusion of a mediator will increase the power to detect a relation between two variables has not been investigated. The first study identified situations where empirical and analytical power of two tests of significance for a single mediator model was greater than power of a bivariate significance test. Results from the first study indicated that including a mediator increased statistical power in small samples with large effects and in large samples with small effects. Next, a study was conducted to assess when power was greater for a significance test for a two mediator model as compared with power of a bivariate significance test. Results indicated that including two mediators increased power in small samples when both specific mediated effects were large and in large samples when both specific mediated effects were small. Implications of the results and directions for future research are then discussed.
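The empirical power comparison for the single-mediator case can be sketched as follows, assuming complete mediation and illustrative path values; the joint-significance test here (both the a and b paths significant) is one of several tests of the mediated effect.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

def power_comparison(n=100, a=0.3, b=0.3, n_reps=2000, alpha=0.05, seed=1):
    """Empirical power of the joint-significance test of the mediated effect
    vs. a simple bivariate test of the x -> y relation."""
    rng = np.random.default_rng(seed)
    joint = bivariate = 0
    for _ in range(n_reps):
        x = rng.normal(size=n)
        m = a * x + rng.normal(size=n)
        y = b * m + rng.normal(size=n)              # complete mediation
        p_a = stats.pearsonr(x, m)[1]               # a path: x -> m
        fit = sm.OLS(y, sm.add_constant(np.column_stack([x, m]))).fit()
        p_b = fit.pvalues[2]                        # b path: m -> y given x
        joint += (p_a < alpha) and (p_b < alpha)
        bivariate += stats.pearsonr(x, y)[1] < alpha
    return joint / n_reps, bivariate / n_reps

print(power_comparison(n=100, a=0.3, b=0.3))
```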
Contributors: O'Rourke, Holly Patricia (Author) / Mackinnon, David P (Thesis advisor) / Enders, Craig K. (Committee member) / Millsap, Roger (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
The phenomenon of cyberbullying has captured the attention of educators and researchers alike, as it has been associated with multiple aversive outcomes including suicide. Young people today have easy access to computer-mediated communication (CMC) and frequently use it to harass one another -- a practice that many researchers have equated with cyberbullying. However, there is great disagreement among researchers over whether intentional harmful actions carried out by way of CMC constitute cyberbullying, and some authors have argued that "cyber-aggression" is a more accurate term for this phenomenon. Disagreement over cyberbullying's definition and methodological inconsistencies, including the choice of questionnaire items, have resulted in highly variable results across cyberbullying studies. Researchers agree, however, that cyber and traditional forms of aggression are closely related phenomena, and have suggested that they may be extensions of one another. This research developed a comprehensive set of items to span cyber-aggression's content domain in order to (1) fully address all types of cyber-aggression and (2) assess the interrelated nature of cyber and traditional aggression. These items were administered to 553 middle school students in a central Illinois school district. Results from confirmatory factor analyses suggested that cyber-aggression is best conceptualized as integrated with traditional aggression, and that cyber and traditional aggression share two dimensions: direct-verbal and relational aggression. Additionally, results indicated that all forms of aggression are a function of general aggressive tendencies. This research identified two synthesized models combining cyber and traditional aggression into a shared framework that demonstrated excellent fit to the item data.
Contributors: Lerner, David (Author) / Green, Samuel B (Thesis advisor) / Caterino, Linda (Committee member) / Atkinson, Robert (Committee member) / Nakagawa, Kathryn (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Random Forests is a statistical learning method that has been proposed for propensity score estimation models involving complex interactions, nonlinear relationships, or both among the covariates. In this dissertation I conducted a simulation study to examine the effects of three Random Forests model specifications in propensity score analysis. The results suggested that, depending on the nature of the data, optimal specification of (1) the decision rule used to select a covariate and its split value in a classification tree, (2) the number of covariates randomly sampled for selection, and (3) the method of estimating Random Forests propensity scores could potentially produce an unbiased average treatment effect estimate after propensity score weighting by the odds. Compared to a logistic regression estimation model using the true propensity score model, Random Forests had the additional advantage of producing unbiased estimated standard errors and correct statistical inference for the average treatment effect. The relationship between balance on the covariates' means and the bias of the average treatment effect estimate was examined both within and between conditions of the simulation. Within conditions, across repeated samples there was no noticeable correlation between the covariates' mean differences and the magnitude of bias of the average treatment effect estimate for the covariates that were imbalanced before adjustment. Between conditions, small mean differences of covariates after propensity score adjustment were not sensitive enough to identify the optimal Random Forests model specification for propensity score analysis.
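A minimal sketch of Random Forests propensity scores combined with weighting by the odds; the function name, clipping threshold, and hyperparameters are illustrative assumptions, and the dissertation's actual specifications (split rules, covariate sampling, score-estimation method) are not reproduced here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def rf_weighted_att(X, treat, y, seed=0):
    """Random Forests propensity scores with weighting by the odds:
    treated units get weight 1, controls get p/(1 - p), targeting the
    average treatment effect on the treated (ATT)."""
    rf = RandomForestClassifier(n_estimators=500, random_state=seed)
    p = rf.fit(X, treat).predict_proba(X)[:, 1]   # estimated propensity scores
    p = np.clip(p, 0.01, 0.99)                    # guard against extreme scores
    w = np.where(treat == 1, 1.0, p / (1.0 - p))  # odds weights
    return y[treat == 1].mean() - np.average(y[treat == 0], weights=w[treat == 0])
```

Weighting by the odds up-weights control units that resemble the treated group, so the weighted control mean estimates the treated group's counterfactual outcome.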
Contributors: Cham, Hei Ning (Author) / Tein, Jenn-Yun (Thesis advisor) / Enders, Stephen G (Thesis advisor) / Enders, Craig K. (Committee member) / Mackinnon, David P (Committee member) / Arizona State University (Publisher)
Created: 2013