Matching Items (15)
Description
In the current context of fiscal austerity and neo-colonial criticisms, the discipline of religious studies has been challenged to critically assess its teaching methods and to articulate its relevance in the modern university setting. Responding to these needs, this dissertation explores the educational outcomes for undergraduate students that result from religious studies curricula. This research employs a robust quantitative methodology designed to assess the impact of the courses while controlling for a number of covariates. Based on data collected from pre- and post-course surveys of a combined 1,116 students enrolled at Arizona State University (ASU) and two area community colleges, the research examines student change across five outcomes: attributional complexity, multi-religious awareness, commitment to social justice, individual religiosity, and neo-colonial measures, the first of their kind to be developed. The sample was taken in the fall of 2009 from courses including Religions of the World, introductory Islamic studies courses, and a control group consisting of engineering and political science students. The findings were mixed. From the "virtues of the humanities" standpoint, select within-group changes showed a statistically significant positive shift, but comparisons across groups and against the control group yielded no statistically significant findings after controlling for key variables. The students' pre-course survey score was the best predictor of their post-course survey score. In response to the neo-colonial critiques, the non-findings suggest the critiques have been overstated in terms of their pedagogical or classroom impact.
Contributors: Lewis, Bret (Author) / Gereboff, Joel (Thesis advisor) / Foard, James (Committee member) / Levy, Roy (Committee member) / Woodward, Mark (Committee member) / Arizona State University (Publisher)
Created: 2011
Description
The purpose of this study was to investigate the effect of complex structure on dimensionality assessment in compensatory and noncompensatory multidimensional item response models (MIRT) of assessment data using dimensionality assessment procedures based on conditional covariances (i.e., DETECT) and a factor analytical approach (i.e., NOHARM). The DETECT-based methods typically outperformed the NOHARM-based methods in both two- (2D) and three-dimensional (3D) compensatory MIRT conditions. The DETECT-based methods yielded a high proportion correct, especially when correlations were .60 or smaller, data exhibited 30% or less complexity, and sample sizes were larger. As the complexity increased and the sample size decreased, the performance typically diminished. As the complexity increased, it also became more difficult to label the resulting sets of items from DETECT in terms of the dimensions. DETECT was consistent in classification of simple items, but less consistent in classification of complex items. Of the three NOHARM-based methods, χ²G/D and ALR generally outperformed RMSR. χ²G/D was more accurate when N = 500 and complexity levels were 30% or lower. As the number of items increased, ALR performance improved at a correlation of .60 and 30% or less complexity. When the data followed a noncompensatory MIRT model, the NOHARM-based methods, specifically χ²G/D and ALR, were the most accurate of all five methods. The marginal proportions for labeling sets of items as dimension-like were typically low, suggesting that the methods generally failed to label two (three) sets of items as dimension-like in 2D (3D) noncompensatory situations. The DETECT-based methods were more consistent in classifying simple items across complexity levels, sample sizes, and correlations. However, as complexity and correlation levels increased, the classification rates for all methods decreased.
In most conditions, the DETECT-based methods classified complex items as consistently as or more consistently than the NOHARM-based methods. In particular, as complexity, the number of items, and the true dimensionality increased, the DETECT-based methods were notably more consistent than any NOHARM-based method. Despite DETECT's consistency, when data follow a noncompensatory MIRT model, the NOHARM-based methods should be preferred over the DETECT-based methods to assess dimensionality because DETECT performs poorly in identifying the true dimensionality.
Contributors: Svetina, Dubravka (Author) / Levy, Roy (Thesis advisor) / Gorin, Joanna S. (Committee member) / Millsap, Roger (Committee member) / Arizona State University (Publisher)
Created: 2011
Description
The health benefits of physical activity are widely accepted. Emerging research also indicates that sedentary behaviors can carry negative health consequences regardless of physical activity level. This dissertation comprised four projects that examined measurement properties of physical activity and sedentary behavior monitors. Project one identified the oxygen costs of four other care activities in seventeen adults. Pushing a wheelchair and pushing a stroller were identified as moderate-intensity activities. Minutes spent engaged in these activities contribute towards meeting the 2008 Physical Activity Guidelines. Project two identified the oxygen costs of common cleaning activities in sixteen adults. Mopping a floor was identified as moderate-intensity physical activity, while cleaning a kitchen and cleaning a bathtub were identified as light-intensity physical activity. Minutes spent engaged in mopping a floor contribute towards meeting the 2008 Physical Activity Guidelines. Project three evaluated the differences in number of minutes spent in activity levels when utilizing different epoch lengths in accelerometry. The shorter epoch lengths (1 second, 5 seconds) accumulated significantly more minutes of sedentary behaviors than the longer epoch length (60 seconds). The longer epoch length also identified significantly more time engaged in light-intensity activities than the shorter epoch lengths. Future research needs to account for epoch length selection when conducting physical activity and sedentary behavior assessment. Project four investigated the accuracy of four activity monitors in assessing activities that were either sedentary behaviors or light-intensity physical activities. The ActiGraph GT3X+ assessed the activities least accurately, while the SenseWear Armband and ActivPAL assessed activities equally accurately. The monitor used to assess physical activity and sedentary behaviors may influence the accuracy of the measurement of a construct.
Contributors: Meckes, Nathanael (Author) / Ainsworth, Barbara E (Thesis advisor) / Belyea, Michael (Committee member) / Buman, Matthew (Committee member) / Gaesser, Glenn (Committee member) / Wharton, Christopher (Christopher Mack), 1977- (Committee member) / Arizona State University (Publisher)
Created: 2012
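The epoch-length effect described in Project three of the abstract above can be illustrated with a minimal Python sketch. It assumes a sedentary cut-point of 100 counts per minute and scales that cut-point linearly to shorter epochs; both choices, the function name, and the one-minute trace are hypothetical and are not taken from the dissertation.

```python
def sedentary_minutes(counts_per_second, epoch_seconds, cutpoint_per_minute=100):
    """Total minutes classified as sedentary for a given epoch length.

    The cut-point (assumed here, not from the study) is scaled linearly
    to the epoch length before each epoch is classified.
    """
    threshold = cutpoint_per_minute * epoch_seconds / 60
    sedentary_seconds = 0
    for start in range(0, len(counts_per_second), epoch_seconds):
        epoch = counts_per_second[start:start + epoch_seconds]
        if sum(epoch) < threshold:
            sedentary_seconds += len(epoch)
    return sedentary_seconds / 60


# Hypothetical trace: 55 still seconds followed by a 5-second burst of movement.
one_minute = [0] * 55 + [30] * 5

for epoch in (1, 5, 60):
    print(epoch, sedentary_minutes(one_minute, epoch))
```

In this made-up trace the short burst pushes the whole 60-second epoch over the cut-point, so the long epoch records no sedentary time, whereas the 1- and 5-second epochs still count the 55 still seconds as sedentary.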
Description
The Rapid Eating and Activity Assessment for Participants Short Version (REAP-S) represents a method for rapid diet quality assessment; however, few studies have tested its validity. The Healthy Eating Index-2005 (HEI-2005) and the Diet Quality Index Revised (DQI-R) are tools that effectively assess diet quality; however, both are complex and time consuming. The objective of this study was to evaluate the validity of the REAP-S against the HEI-2005 and the DQI-R. Fifty males, 18 to 33 years of age, completed the REAP-S as well as a 24-hour diet recall. HEI-2005 and DQI-R scores were determined for each 24-hour recall. Scores from the REAP-S were evaluated against the HEI-2005 and DQI-R scores using Spearman rank order correlations and chi-square tests. Modifications were also made to the original method of scoring the REAP-S to evaluate how the correlations changed when certain questions were removed. The correlation coefficient for the REAP-S and the HEI-2005 was 0.367 (P=0.009), and the correlation coefficient for the REAP-S and the DQI-R was 0.323 (P=0.022). Chi-square analysis placed the precision of the REAP-S for overall diet quality at 64% against the HEI-2005 and 62% against the DQI-R. Scores that were considered extreme (n=21) by the HEI-2005 (scores <40 and >60) had 76% precision with the REAP-S. The correlations for the modified version of scoring the REAP-S with the overall HEI-2005 and DQI-R were 0.395 (P=0.005) and 0.417 (P=0.003), respectively. Chi-square statistics revealed that the REAP-S accurately distinguished high-quality from low-quality diets, with 64% precision against the HEI-2005 and 62% against the DQI-R. When evaluating the modified REAP-S scores against the extreme HEI-2005 scores, precision increased to 81%. It appears the REAP-S is an acceptable tool to rapidly assess diet quality. It has a significant, moderate correlation with both the HEI-2005 and the DQI-R, with strong precision as well.
Both correlation and precision are strengthened when values are compared to only the extreme scores of the HEI-2005; however, more research studies are needed to evaluate the validity of the REAP-S in a more diverse population and to evaluate whether changes to select questions can improve its accuracy in assessing diet quality.
Contributors: Fawcett, Rachael (Author) / Johnston, Carol (Thesis advisor) / Mayol-Kreiser, Sandra (Committee member) / Wharton, Christopher (Christopher Mack), 1977- (Committee member) / Arizona State University (Publisher)
Created: 2012
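For readers unfamiliar with the Spearman rank order correlation named in the abstract above, the following is a minimal standard-library Python sketch: it ranks each variable (averaging tied ranks) and then computes the Pearson correlation of the ranks. The function names and the example scores are hypothetical and do not reproduce the study's data or software.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical questionnaire and index scores, for illustration only.
short_form = [12, 15, 15, 20, 22]
index_score = [41, 55, 48, 60, 71]
print(round(spearman(short_form, index_score), 3))
```

Because only ranks enter the calculation, any monotonic relationship between the two scores yields a coefficient of 1 or -1 regardless of the scales involved.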
Description
The Culture-Language Interpretive Matrix (C-LIM) is a new tool hypothesized to help practitioners accurately determine whether students who are administered an IQ test are culturally and linguistically different from the normative comparison group (i.e., different) or culturally and linguistically similar to the normative comparison group and possibly have Specific Learning Disabilities (SLD) or other neurocognitive disabilities (i.e., disordered). Diagnostic utility statistics were used to test the ability of the Wechsler Intelligence Scales for Children-Fourth Edition (WISC-IV) C-LIM to accurately identify students from a referred sample of English language learners (ELLs) (n = 86) for whom Spanish was the primary language spoken at home and a sample of students from the WISC-IV normative sample (n = 2,033) as either culturally and linguistically different from the WISC-IV normative sample or culturally and linguistically similar to the WISC-IV normative sample. WISC-IV scores from three paired comparison groups were analyzed using the Receiver Operating Characteristic (ROC) curve: (a) ELLs with SLD and the WISC-IV normative sample, (b) ELLs without SLD and the WISC-IV normative sample, and (c) ELLs with SLD and ELLs without SLD. The ROC analyses yielded Area Under the Curve (AUC) values that ranged between 0.51 and 0.53 for the comparison between ELLs with SLD and the WISC-IV normative sample, between 0.48 and 0.53 for the comparison between ELLs without SLD and the WISC-IV normative sample, and between 0.49 and 0.55 for the comparison between ELLs with SLD and ELLs without SLD. These values indicate that the C-LIM has low diagnostic accuracy in terms of differentiating between a sample of ELLs and the WISC-IV normative sample. Currently available evidence does not support use of the C-LIM in applied practice at this time.
Contributors: Styck, Kara M (Author) / Watkins, Marley W. (Thesis advisor) / Levy, Roy (Thesis advisor) / Balles, John (Committee member) / Arizona State University (Publisher)
Created: 2012
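To see why the AUC values near 0.50 reported in the abstract above indicate low diagnostic accuracy, it may help to recall that the AUC equals the probability that a randomly chosen case from one group scores higher than a randomly chosen case from the other. The Python sketch below computes it by direct pair counting; the function name and the scores are hypothetical and are not data from the study.

```python
def auc(positive_scores, negative_scores):
    """Probability that a random positive case outscores a random negative case.

    Ties count as half a win, which matches the area under the empirical
    ROC curve.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical scores, for illustration only.
print(auc([1, 2, 3], [1, 2, 3]))  # identical score distributions
print(auc([3, 4], [1, 2]))        # completely separated groups
```

Identical distributions give 0.5 (chance-level discrimination), while perfectly separated groups give 1.0.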
Description
Dynamic Bayesian networks (DBNs; Reye, 2004) are a promising tool for modeling student proficiency under rich measurement scenarios (Reichenberg, in press). These scenarios often present assessment conditions far more complex than what is seen with more traditional assessments and require assessment arguments and psychometric models capable of integrating those complexities. Unfortunately, DBNs remain understudied and their psychometric properties relatively unknown. If the apparent strengths of DBNs are to be leveraged, then the body of literature surrounding their properties and use needs to be expanded. To this end, the current work aimed to explore the properties of DBNs under a variety of realistic psychometric conditions. A two-phase Monte Carlo simulation study was conducted in order to evaluate parameter recovery for DBNs using maximum likelihood estimation with the Netica software package. Phase 1 included a limited number of conditions and was exploratory in nature, while Phase 2 included a larger and more targeted complement of conditions. Manipulated factors included sample size, measurement quality, test length, and the number of measurement occasions. Results suggested that measurement quality has the most prominent impact on estimation quality, with more distinct performance categories yielding better estimation. While increasing sample size tended to improve estimation, there were a limited number of conditions under which greater sample size led to more estimation bias. An exploration of this phenomenon is included. From a practical perspective, parameter recovery appeared to be sufficient with samples as low as N = 400 as long as measurement quality was not poor and at least three items were present at each measurement occasion. Tests consisting of only a single item required exceptional measurement quality in order to adequately recover model parameters.
The study was somewhat limited due to potentially software-specific issues as well as a non-comprehensive collection of experimental conditions. Further research should replicate and potentially expand the current work using other software packages, including exploring alternate estimation methods (e.g., Markov chain Monte Carlo).
Contributors: Reichenberg, Raymond E (Author) / Levy, Roy (Thesis advisor) / Eggum-Wilkens, Natalie (Thesis advisor) / Iida, Masumi (Committee member) / DeLay, Dawn (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
Investigation of measurement invariance (MI) commonly assumes correct specification of dimensionality across multiple groups. Although research shows that violation of the dimensionality assumption can cause bias in model parameter estimation for single-group analyses, little research on this issue has been conducted for multiple-group analyses. This study explored the effects of mismatch in dimensionality between data and analysis models with multiple-group analyses at the population and sample levels. Datasets were generated using a bifactor model with different factor structures and were analyzed with bifactor and single-factor models to assess misspecification effects on assessments of MI and latent mean differences. As baseline models, the bifactor models fit data well and had minimal bias in latent mean estimation. However, the low convergence rates of fitting bifactor models to data with complex structures and small sample sizes caused concern. On the other hand, effects of fitting the misspecified single-factor models on the assessments of MI and latent means differed by the bifactor structures underlying data. For data following one general factor and one group factor affecting a small set of indicators, the effects of ignoring the group factor in analysis models on the tests of MI and latent mean differences were mild. In contrast, for data following one general factor and several group factors, oversimplifications of analysis models can lead to inaccurate conclusions regarding MI assessment and latent mean estimation.
Contributors: Xu, Yuning (Author) / Green, Samuel (Thesis advisor) / Levy, Roy (Committee member) / Thompson, Marilyn (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
A simulation study was conducted to explore the robustness of general factor mean difference estimation in bifactor ordered-categorical data. In the No Differential Item Functioning (DIF) conditions, the data generation conditions varied were sample size, the number of categories per item, effect size of the general factor mean difference, and the size of specific factor loadings; in data analysis, misspecification conditions were introduced in which the generated bifactor data were fit using a unidimensional model, and/or ordered-categorical data were treated as continuous data. In the DIF conditions, the data generation conditions varied were sample size, the number of categories per item, effect size of latent mean difference for the general factor, the type of item parameters that had DIF, and the magnitude of DIF; the data analysis conditions varied in whether equality constraints were set on the noninvariant item parameters.

Results showed that falsely fitting bifactor data using unidimensional models or failing to account for DIF in item parameters resulted in estimation bias in the general factor mean difference, while treating ordinal data as continuous had little influence on the estimation bias as long as there was no severe model misspecification. The extent of estimation bias produced by misspecification of bifactor datasets with unidimensional models was mainly determined by the degree of unidimensionality (i.e., size of specific factor loadings) and the general factor mean difference size. When DIF was present, the estimation accuracy of the general factor mean difference was completely robust to ignoring noninvariance in specific factor loadings, while it was very sensitive to failing to account for DIF in threshold parameters. With respect to ignoring DIF in general factor loadings, the estimation bias of the general factor mean difference was substantial when the DIF was -0.15 and negligible for smaller sizes of DIF. Despite the impact of model misspecification on estimation accuracy, the power to detect the general factor mean difference was mainly influenced by the sample size and effect size. Serious Type I error rate inflation occurred only when DIF was present in threshold parameters.
Contributors: Liu, Yixing (Author) / Thompson, Marilyn (Thesis advisor) / Levy, Roy (Committee member) / O’Rourke, Holly (Committee member) / Arizona State University (Publisher)
Created: 2019
Description
This simulation study compared the utility of various discrepancy measures within a posterior predictive model checking (PPMC) framework for detecting different types of data-model misfit in multidimensional Bayesian network (BN) models. The investigated conditions were motivated by an applied research program utilizing an operational complex performance assessment within a digital-simulation educational context grounded in theories of cognition and learning. BN models were manipulated along two factors: latent variable dependency structure and number of latent classes. Distributions of posterior predicted p-values (PPP-values) served as the primary outcome measure and were summarized in graphical presentations, by median values across replications, and by proportions of replications in which the PPP-values were extreme. An effect size measure for PPMC was introduced as a supplemental numerical summary to the PPP-value. Consistent with previous PPMC research, all investigated fit functions tended to perform conservatively, but Standardized Generalized Dimensionality Discrepancy Measure (SGDDM), Yen's Q3, and Hierarchy Consistency Index (HCI) only mildly so. Adequate power to detect at least some types of misfit was demonstrated by SGDDM, Q3, HCI, Item Consistency Index (ICI), and to a lesser extent Deviance, while proportion correct (PC), a chi-square-type item-fit measure, Ranked Probability Score (RPS), and Good's Logarithmic Scale (GLS) were powerless across all investigated factors. Bivariate SGDDM and Q3 were found to provide powerful and detailed feedback for all investigated types of misfit.
Contributors: Crawford, Aaron (Author) / Levy, Roy (Thesis advisor) / Green, Samuel (Committee member) / Thompson, Marilyn (Committee member) / Arizona State University (Publisher)
Created: 2014
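The PPP-value that serves as the primary outcome in the abstract above is commonly computed as the proportion of posterior draws in which the discrepancy for replicated data is at least as large as the discrepancy for the observed data. The Python sketch below shows only that final counting step under this common definition; the function name and the discrepancy values are hypothetical, and the model fitting and discrepancy measures used in the dissertation are not reproduced here.

```python
def ppp_value(replicated_discrepancies, realized_discrepancies):
    """Share of posterior draws where the replicated discrepancy is at least
    as large as the realized (observed-data) discrepancy.

    Both sequences hold one value per posterior draw, in matching order.
    """
    exceed = sum(
        1
        for rep, obs in zip(replicated_discrepancies, realized_discrepancies)
        if rep >= obs
    )
    return exceed / len(replicated_discrepancies)


# Hypothetical discrepancy values for four posterior draws.
replicated = [1.0, 2.0, 3.0, 4.0]
realized = [2.0, 2.0, 2.0, 2.0]
print(ppp_value(replicated, realized))
```

Values near 0 or 1 are the "extreme" PPP-values that flag data-model misfit, while values near 0.5 suggest the observed discrepancy is typical of what the model predicts.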
Description
Many methodological approaches have been utilized to predict student retention and persistence over the years, yet few have utilized a Bayesian framework. It is believed this is due in part to the absence of an established process for guiding educational researchers reared in a frequentist perspective into the realms of Bayesian analysis and educational data mining. The current study aimed to address this by providing a model-building process for developing a Bayesian network (BN) that leveraged educational data mining, Bayesian analysis, and traditional iterative model-building techniques in order to predict whether community college students will stop out at the completion of each of their first six terms. The study utilized exploratory and confirmatory techniques to reduce an initial pool of more than 50 potential predictor variables to a parsimonious final BN with only four predictor variables. The average in-sample classification accuracy rate for the model was 80% (Cohen's κ = 53%). The model was shown to be generalizable across samples with an average out-of-sample classification accuracy rate of 78% (Cohen's κ = 49%). The classification rates for the BN were also found to be superior to the classification rates produced by an analog frequentist discrete-time survival analysis model.
Contributors: Arcuria, Philip (Author) / Levy, Roy (Thesis advisor) / Green, Samuel B (Committee member) / Thompson, Marilyn S (Committee member) / Arizona State University (Publisher)
Created: 2015
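The abstract above reports classification accuracy alongside Cohen's κ, which discounts the agreement expected by chance. A minimal Python sketch of the standard κ calculation from a confusion matrix follows; the function name and the 2×2 counts are hypothetical and are not the study's classification results.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square confusion matrix (rows: actual, columns: predicted)."""
    k = len(table)
    total = sum(sum(row) for row in table)
    # Observed agreement: proportion of cases on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / total
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical counts: 80% of cases fall on the diagonal.
confusion = [[40, 10],
             [10, 40]]
print(round(cohens_kappa(confusion), 3))
```

With balanced classes, as in this made-up table, chance agreement is 0.5, so an accuracy of 0.8 corresponds to κ = 0.6; with unbalanced classes the same accuracy yields a different κ.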