Matching Items (13)
Description
In the current context of fiscal austerity and neo-colonial criticism, the discipline of religious studies has been challenged to critically assess its teaching methods and to articulate its relevance in the modern university. Responding to these needs, this dissertation explores the educational outcomes of religious studies curricula for undergraduate students. The research employs a quantitative methodology designed to assess the impact of the courses while controlling for a number of covariates. Based on data collected from pre- and post-course surveys of a combined 1,116 students enrolled at Arizona State University (ASU) and two area community colleges, the research examines student change across five outcomes: attributional complexity, multi-religious awareness, commitment to social justice, individual religiosity, and a newly developed set of neo-colonial measures. The sample was taken in the Fall of 2009 from courses including Religions of the World and introductory Islamic studies courses, along with a control group of engineering and political science students. The findings were mixed. From the "virtues of the humanities" standpoint, select within-group changes showed statistically significant positive shifts, but comparisons across groups, including the control group, yielded no statistically significant differences after controlling for key variables. The students' pre-course survey score was the best predictor of their post-course survey score. With respect to the neo-colonial critiques, the null findings suggest that their pedagogical impact in the classroom has been overstated.
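To make the analytic approach concrete, the following is a minimal sketch of the kind of pre/post analysis described above: regressing post-course scores on pre-course scores plus group membership (an ANCOVA-style model). The column names, group labels, and simulated data are illustrative assumptions, not from the study.

```python
# Hypothetical pre/post ANCOVA sketch; data and names are invented.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "pre": rng.normal(50, 10, n),                    # pre-course survey score
    "group": rng.choice(["religion", "control"], n), # course vs. control group
})
# Post scores dominated by pre scores, mirroring the study's key finding.
df["post"] = 0.8 * df["pre"] + rng.normal(0, 5, n)

model = smf.ols("post ~ pre + C(group)", data=df).fit()
print(model.params)  # the pre-score coefficient dwarfs the group effect
```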
Contributors: Lewis, Bret (Author) / Gereboff, Joel (Thesis advisor) / Foard, James (Committee member) / Levy, Roy (Committee member) / Woodward, Mark (Committee member) / Arizona State University (Publisher)
Created: 2011
Description
The Culture-Language Interpretive Matrix (C-LIM) is a new tool hypothesized to help practitioners accurately determine whether students who are administered an IQ test are culturally and linguistically different from the normative comparison group (i.e., different) or culturally and linguistically similar to the normative comparison group and possibly have Specific Learning Disabilities (SLD) or other neurocognitive disabilities (i.e., disordered). Diagnostic utility statistics were used to test the ability of the Wechsler Intelligence Scale for Children-Fourth Edition (WISC-IV) C-LIM to accurately classify students from a referred sample of English language learners (ELLs) (n = 86), for whom Spanish was the primary language spoken at home, and a sample of students from the WISC-IV normative sample (n = 2,033) as either culturally and linguistically different from or similar to the WISC-IV normative sample. WISC-IV scores from three paired comparison groups were analyzed using the Receiver Operating Characteristic (ROC) curve: (a) ELLs with SLD and the WISC-IV normative sample, (b) ELLs without SLD and the WISC-IV normative sample, and (c) ELLs with SLD and ELLs without SLD. The ROC analyses yielded Area Under the Curve (AUC) values ranging between 0.51 and 0.53 for the comparison between ELLs with SLD and the WISC-IV normative sample, between 0.48 and 0.53 for the comparison between ELLs without SLD and the WISC-IV normative sample, and between 0.49 and 0.55 for the comparison between ELLs with SLD and ELLs without SLD. These values indicate that the C-LIM has low diagnostic accuracy in differentiating between a sample of ELLs and the WISC-IV normative sample. Currently available evidence does not support use of the C-LIM in applied practice.
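The diagnostic-utility analysis described above can be illustrated with a short sketch: computing the Area Under the ROC Curve for a score intended to separate two groups. The data here are simulated stand-ins, not C-LIM values; an AUC near 0.5 indicates chance-level accuracy, consistent with the reported values of roughly 0.48 to 0.55.

```python
# Illustrative AUC computation; the scores are simulated, not C-LIM output.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
# 1 = referred ELL sample (n = 86), 0 = normative sample (n = 2,033)
y_true = np.concatenate([np.ones(86), np.zeros(2033)])
# Nearly identical score distributions imply low diagnostic accuracy.
y_score = np.concatenate([rng.normal(0.05, 1, 86), rng.normal(0.0, 1, 2033)])

print(round(roc_auc_score(y_true, y_score), 3))  # ~0.5, i.e., chance level
```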
Contributors: Styck, Kara M. (Author) / Watkins, Marley W. (Thesis advisor) / Levy, Roy (Thesis advisor) / Balles, John (Committee member) / Arizona State University (Publisher)
Created: 2012
Description
Post-traumatic stress disorder (PTSD) is prevalent among refugees. The population of refugees in the United States continues to increase, and the majority of incoming refugees are children. A more comprehensive approach is needed to assess children for PTSD. This creative project involved reviewing existing literature on refugees in the United States, child refugees, Erik Erikson's stages of psychosocial development, and available and applicable PTSD assessment tools. I developed a reference chart comparing the available assessment tools and recognized that no PTSD assessment tool for refugee children exists. In response, I created an approach to assessing PTSD in refugee children ages 5-12. In creating this toolkit, I determined who is appropriate to administer the assessment, explored how to build trust between the clinician and the child, created the assessment tool itself, including implementation instructions, and provided directions on scoring and referrals. The tool is called the Child Refugee PTSD Assessment Tool (CRPAT-12). The CRPAT-12 will hopefully be disseminated and encourage refugee resettlement organizations to assess children for PTSD upon intake. Early identification of symptoms of distress will help children receive appropriate treatment and prevent more severe mental health complications.
Contributors: Buizer, Danyela Sutthida (Author) / Walker, Beth (Thesis director) / Stevens, Carol (Committee member) / Arizona State University. College of Nursing & Healthcare Innovation (Contributor) / Barrett, The Honors College (Contributor)
Created: 2017-05
Description
The last two decades have seen growing awareness of and emphasis on the replication of empirical findings. While this literature is large, very little of it has focused on the interaction of replication and psychometrics. This is unfortunate given that sound measurement is crucial when considering the complex constructs studied in psychological research. If the psychometric properties of a scale fail to replicate, then inferences made using scores from that scale are questionable at best. In this dissertation, I begin to address replication issues in factor analysis, a widely used psychometric method in psychology. After noticing inconsistencies across results for studies that factor analyzed the same scale, I sought to gain a better understanding of what replication means in factor analysis and to address issues that affect the replicability of factor analytic models. With this work, I take steps toward integrating factor analysis into the broader replication discussion. Ultimately, the goal of this dissertation was to highlight the importance of psychometric replication and bring attention to its role in fostering a more replicable scientific literature.
Contributors: Manapat, Patrick D. (Author) / Edwards, Michael C. (Thesis advisor) / Anderson, Samantha F. (Thesis advisor) / Grimm, Kevin J. (Committee member) / Levy, Roy (Committee member) / Arizona State University (Publisher)
Created: 2022
Description
The proliferation of intensive longitudinal datasets has necessitated the development of analytical techniques that are flexible and accessible to researchers collecting dyadic or individual data. Dynamic structural equation models (DSEMs), as implemented in Mplus, provide the flexibility researchers require by combining components from multilevel modeling, structural equation modeling, and time series analysis. This dissertation project presents a simulation study that evaluates the performance of categorical DSEM using a probit link function across different numbers of clusters (N = 50 or 200), timepoints (T = 14, 28, or 56), categories on the outcome (2, 3, or 5), and distributions of responses on the outcome (symmetric/approximately normal, skewed, or uniform) for both univariate and multivariate models (representing individual data and dyadic longitudinal Actor-Partner Interdependence Model data, respectively). The 3- and 5-category model conditions were also evaluated as continuous DSEMs across the same cluster, timepoint, and distribution conditions to evaluate the extent to which ignoring the categorical nature of the outcome affected model performance. Results indicated that minimums for the number of clusters and timepoints previously suggested by studies of DSEM with continuous outcomes are not large enough to produce unbiased and adequately powered models in categorical DSEM. The distribution of responses on the outcome did not have a noticeable impact on model performance for categorical DSEM, but it did affect model performance when a continuous DSEM was fit to the same datasets. Ignoring the categorical nature of the outcome led to underestimated effects across parameters and conditions and produced large Type I error rates in the N = 200 cluster conditions.
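As a hedged illustration (not the study's Mplus code), the sketch below generates one cluster's categorical time series the way a probit-link DSEM conceives it: a latent AR(1) process cut at thresholds to yield ordered categories. All parameter values are invented.

```python
# Latent AR(1) propensity + probit-style thresholds -> ordinal outcome.
import numpy as np

rng = np.random.default_rng(2)
T, phi = 28, 0.4                    # timepoints and AR(1) autoregression
thresholds = np.array([-0.5, 0.5])  # 2 thresholds -> 3 ordered categories

latent = np.zeros(T)
for t in range(1, T):
    latent[t] = phi * latent[t - 1] + rng.normal()  # latent propensity

# Observed category = number of thresholds the latent value exceeds.
y = np.searchsorted(thresholds, latent)  # values in {0, 1, 2}
print(y)
```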
Contributors: Savord, Andrea (Author) / McNeish, Daniel (Thesis advisor) / Grimm, Kevin J. (Committee member) / Iida, Masumi (Committee member) / Levy, Roy (Committee member) / Arizona State University (Publisher)
Created: 2023
Description
Scale scores play a significant role in research and practice across a wide range of areas such as education, psychology, and the health sciences. Although methods of scale scoring have advanced considerably over the last 100 years, researchers and practitioners have generally been slow to implement these advances. Many topics fall under this umbrella, but the current study focuses on two. The first is the relationship between subscores and total scores. Many scales in psychological and health research are designed to yield subscores, yet it is common to see total scores reported instead. Simplifying scores in this way, however, may have important implications for researchers and scale users in terms of interpretation and use. The second topic is subscore augmentation: if there are subscores, how much value is there in using a subscore augmentation method? Most people using psychological assessments are unfamiliar with score augmentation techniques and the potential benefits they may have over the traditional sum score approach. The current study borrows methods from education to explore the magnitude of improvement from using augmented scores over observed scores. Data were simulated using the Graded Response Model. Factors controlled in the simulation were the number of subscales, the number of items per subscale, the level of correlation between subscales, and sample size. Four estimates of the true subscore were considered: raw, subscore-adjusted, total-score-adjusted, and joint-adjusted. Results from the simulation suggest that scores adjusted with total-score information may perform poorly when the correlation between subscales is 0.3. Joint-adjusted scores performed well most of the time, and both subscore-adjusted and joint-adjusted scores always outperformed raw scores. Finally, general advice for applied users is provided.
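For readers unfamiliar with the data-generating model named above, here is a minimal sketch of simulating item responses from the Graded Response Model; the parameter values are illustrative, not those used in the study.

```python
# Simulate one item under the logistic Graded Response Model (GRM).
import numpy as np

rng = np.random.default_rng(3)

def simulate_grm_item(theta, a, b, rng):
    """theta: (n,) abilities; a: discrimination; b: ordered (K-1,) boundaries."""
    # Cumulative probabilities P(Y >= k) for each boundary.
    p_star = 1 / (1 + np.exp(-a * (theta[:, None] - b[None, :])))
    u = rng.uniform(size=len(theta))
    # One uniform per person; the response is the number of boundaries cleared,
    # which reproduces the GRM category probabilities since p_star decreases in k.
    return (u[:, None] < p_star).sum(axis=1)

theta = rng.normal(size=1000)  # latent trait
y = simulate_grm_item(theta, a=1.5, b=np.array([-1.0, 0.0, 1.0]), rng=rng)
print(np.bincount(y))          # response counts for categories 0-3
```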
Contributors: Gardner, Molly (Author) / Edwards, Michael C. (Thesis advisor) / McNeish, Daniel (Committee member) / Levy, Roy (Committee member) / Arizona State University (Publisher)
Created: 2022
Description
Dynamic Bayesian networks (DBNs; Reye, 2004) are a promising tool for modeling student proficiency under rich measurement scenarios (Reichenberg, in press). These scenarios often present assessment conditions far more complex than those seen with more traditional assessments and require assessment arguments and psychometric models capable of integrating those complexities. Unfortunately, DBNs remain understudied and their psychometric properties relatively unknown. If the apparent strengths of DBNs are to be leveraged, the body of literature surrounding their properties and use needs to be expanded. To this end, the current work explored the properties of DBNs under a variety of realistic psychometric conditions. A two-phase Monte Carlo simulation study was conducted to evaluate parameter recovery for DBNs using maximum likelihood estimation with the Netica software package. Phase 1 included a limited number of conditions and was exploratory in nature, while Phase 2 included a larger and more targeted complement of conditions. Manipulated factors included sample size, measurement quality, test length, and the number of measurement occasions. Results suggested that measurement quality has the most prominent impact on estimation quality, with more distinct performance categories yielding better estimation. While increasing sample size tended to improve estimation, there were a limited number of conditions under which greater sample size led to more estimation bias; an exploration of this phenomenon is included. From a practical perspective, parameter recovery appeared to be sufficient with samples as low as N = 400 as long as measurement quality was not poor and at least three items were present at each measurement occasion. Tests consisting of only a single item required exceptional measurement quality to adequately recover model parameters. The study was somewhat limited by potentially software-specific issues as well as a non-comprehensive collection of experimental conditions. Further research should replicate and potentially expand the current work using other software packages and explore alternate estimation methods (e.g., Markov chain Monte Carlo).
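While the study itself used Netica, the following sketch conveys the data structure a DBN of this kind implies: a latent proficiency state that evolves across measurement occasions and is observed through noisy items at each occasion. All values are hypothetical.

```python
# Simulate DBN-style data: latent proficiency Markov chain + noisy items.
import numpy as np

rng = np.random.default_rng(4)
N, T, ITEMS = 400, 4, 3             # examinees, occasions, items per occasion
transition = np.array([[0.7, 0.3],  # P(next state | not proficient)
                       [0.1, 0.9]]) # P(next state | proficient)
p_correct = np.array([0.2, 0.85])   # P(correct | state): "measurement quality"

state = rng.binomial(1, 0.5, N)     # initial proficiency
responses = np.zeros((N, T, ITEMS), dtype=int)
for t in range(T):
    responses[:, t, :] = rng.binomial(1, p_correct[state][:, None], (N, ITEMS))
    # Transition each examinee to the next occasion's proficiency state.
    state = (rng.uniform(size=N) < transition[state, 1]).astype(int)

print(responses.mean(axis=(0, 2)))  # proportion correct per occasion
```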
Contributors: Reichenberg, Raymond E. (Author) / Levy, Roy (Thesis advisor) / Eggum-Wilkens, Natalie (Thesis advisor) / Iida, Masumi (Committee member) / DeLay, Dawn (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
Investigation of measurement invariance (MI) commonly assumes correct specification of dimensionality across multiple groups. Although research shows that violation of the dimensionality assumption can bias model parameter estimation in single-group analyses, little research on this issue has been conducted for multiple-group analyses. This study explored the effects of mismatch in dimensionality between data and analysis models in multiple-group analyses at the population and sample levels. Datasets were generated using a bifactor model with different factor structures and were analyzed with bifactor and single-factor models to assess the effects of misspecification on assessments of MI and latent mean differences. As baseline models, the bifactor models fit the data well and showed minimal bias in latent mean estimation. However, the low convergence rates when fitting bifactor models to data with complex structures and small sample sizes were a concern. The effects of fitting misspecified single-factor models on assessments of MI and latent means differed by the bifactor structure underlying the data. For data following one general factor and one group factor affecting a small set of indicators, the effects of ignoring the group factor in the analysis model on tests of MI and latent mean differences were mild. In contrast, for data following one general factor and several group factors, oversimplification of the analysis model can lead to inaccurate conclusions regarding MI assessment and latent mean estimation.
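To fix ideas, here is a hedged sketch of the bifactor data-generating structure described above: every indicator loads on a general factor, and disjoint subsets load on orthogonal group factors. Loadings and dimensions are invented for illustration.

```python
# Generate data from a bifactor structure: general + group factors.
import numpy as np

rng = np.random.default_rng(5)
n, n_items, n_groups = 500, 9, 3
general = rng.normal(size=(n, 1))
groups = rng.normal(size=(n, n_groups))          # orthogonal group factors

lam_g = np.full((n_items, 1), 0.6)               # general-factor loadings
lam_s = np.zeros((n_items, n_groups))            # group-factor loadings
for g in range(n_groups):                        # items 0-2, 3-5, 6-8
    lam_s[3 * g:3 * g + 3, g] = 0.4

y = general @ lam_g.T + groups @ lam_s.T + rng.normal(0, 0.6, (n, n_items))
print(np.round(np.corrcoef(y, rowvar=False), 2)) # block pattern from group factors
```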
Contributors: Xu, Yuning (Author) / Green, Samuel (Thesis advisor) / Levy, Roy (Committee member) / Thompson, Marilyn (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
Through a two-study simulation design with varying conditions (level-1 (L1) sample size fixed at 3, level-2 (L2) sample size ranging from 10 to 75, level-3 (L3) sample size ranging from 30 to 150, intraclass correlations (ICCs) ranging from 0.10 to 0.50, and model complexity ranging from one to three predictors), this study provides general guidelines about adequate sample sizes at all three levels under varying ICC conditions for a viable three-level HLM analysis (e.g., reasonably unbiased and accurate parameter estimates). The data-generating parameters were obtained from a large-scale longitudinal dataset from North Carolina, provided by the National Center on Assessment and Accountability for Special Education (NCAASE). I discuss ranges of sample sizes that are inadequate or adequate with respect to convergence, absolute bias, relative bias, root mean squared error (RMSE), and coverage of individual parameter estimates. With the help of a detailed two-part simulation design covering various sample sizes, levels of model complexity, and ICCs, the current study offers multiple options for adequate sample sizes under different conditions, and it emphasizes that adequate sample sizes at L1, L2, and L3 can be adjusted according to the parameter estimates of interest and to different acceptable ranges of absolute bias, relative bias, RMSE, and coverage. Under different model complexity and ICC conditions, this study helps researchers identify the L1, L2, or L3 sample size, or some combination of them, as the source of variation in absolute bias, relative bias, RMSE, or coverage proportions for a given parameter estimate, assisting researchers in making better decisions when selecting sample sizes for a three-level HLM analysis. A limitation of the study was the use of a single distribution for the dependent and explanatory variables; different distributions might result in different sample size recommendations.
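As a small illustration (not the NCAASE data), the sketch below generates unconditional three-level data with chosen variance shares at each level, which is the kind of setup the simulation manipulates through its ICC conditions.

```python
# Three-level data: level-3 and level-2 random effects plus level-1 residuals.
import numpy as np

rng = np.random.default_rng(6)
L3, L2, L1 = 30, 10, 3         # e.g., 30 units, 10 subunits, 3 observations
icc3, icc2 = 0.10, 0.20        # variance shares at levels 3 and 2
var_e = 1.0 - icc3 - icc2      # remaining level-1 variance share

u3 = rng.normal(0, np.sqrt(icc3), L3)            # level-3 effects
u2 = rng.normal(0, np.sqrt(icc2), (L3, L2))      # level-2 effects
e = rng.normal(0, np.sqrt(var_e), (L3, L2, L1))  # level-1 residuals
y = u3[:, None, None] + u2[:, :, None] + e

print(round(float(y.var()), 2))  # close to 1.0, the total variance by design
```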
Contributors: Yel, Nedim (Author) / Levy, Roy (Thesis advisor) / Elliott, Stephen N. (Thesis advisor) / Schulte, Ann C. (Committee member) / Iida, Masumi (Committee member) / Arizona State University (Publisher)
Created: 2016
Description
Although models for describing longitudinal data have become increasingly sophisticated, criticism of even foundational growth curve models remains challenging. The challenge arises from the need to disentangle data-model misfit at multiple, interrelated levels of analysis. Using posterior predictive model checking (PPMC), a popular Bayesian framework for model criticism, the performance of several discrepancy functions was investigated in a Monte Carlo simulation study. The discrepancy functions of interest included two types of conditional concordance correlation (CCC) functions, two types of R² functions, two types of standardized generalized dimensionality discrepancy (SGDDM) functions, the likelihood ratio (LR), and the likelihood ratio difference test (LRT). Key outcomes included effect sizes of the design factors on the realized values of the discrepancy functions, distributions of posterior predictive p-values (PPP-values), and the proportion of extreme PPP-values.

In terms of realized values, the behavior of the CCC and R² functions was generally consistent with prior research. As diagnostics, however, these functions were extremely conservative even when some aspect of the data was unaccounted for. In contrast, the conditional SGDDM (SGDDMC), LR, and LRT were generally sensitive to the underspecifications investigated in this work on all outcomes considered. Although the proportions of extreme PPP-values for these functions tended to increase in null situations for non-normal data, this behavior may have reflected true misfit resulting from the specification of normal prior distributions. Importantly, the LR and, to a greater extent, the SGDDMC exhibited some potential for untangling the sources of data-model misfit. Owing to the connections of growth curve models to the more fundamental frameworks of multilevel modeling, structural equation models with a mean structure, and Bayesian hierarchical models, the results of the current work may have broader implications that warrant further research.
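The PPMC logic described above can be sketched schematically: for each posterior draw, compare a discrepancy computed on the observed data against the same discrepancy computed on data replicated under the model; the PPP-value is the proportion of draws in which the replicated discrepancy exceeds the realized one. The model and discrepancy below are deliberately simple stand-ins, not the growth curve models or functions from the study.

```python
# Toy posterior predictive model check yielding a PPP-value.
import numpy as np

rng = np.random.default_rng(7)
y_obs = rng.normal(0.0, 1.5, 200)   # "observed" data, more spread than modeled

# Toy posterior for mu in a normal(mu, 1) model fit to y_obs.
mu_draws = rng.normal(y_obs.mean(), 1 / np.sqrt(len(y_obs)), 1000)

def discrepancy(y, mu):
    return np.var(y - mu)           # targets spread, which the model fixes at 1

extreme = 0
for mu in mu_draws:
    y_rep = rng.normal(mu, 1.0, len(y_obs))  # replicate data under the model
    extreme += discrepancy(y_rep, mu) > discrepancy(y_obs, mu)

print(extreme / len(mu_draws))      # PPP-value near 0 or 1 signals misfit
```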
Contributors: Fay, Derek (Author) / Levy, Roy (Thesis advisor) / Thompson, Marilyn (Committee member) / Enders, Craig (Committee member) / Arizona State University (Publisher)
Created: 2015