Impact of violations of longitudinal measurement invariance in latent growth models and autoregressive quasi-simplex models

Description

To analyze data from an instrument administered at multiple time points, it is common practice to form composites of the items at each wave and to fit a longitudinal model to the composites. The advantage of using composites of items is that smaller sample sizes are required than for second-order models that include both the measurement and the structural relationships among the variables. However, the use of composites assumes that longitudinal measurement invariance holds; that is, it is assumed that the relationships among the items and the latent variables remain constant over time. Previous studies conducted on latent growth models (LGM) have shown that when longitudinal metric invariance is violated, the parameter estimates are biased and mistaken conclusions about growth can be made. The purpose of the current study was to examine the impact of non-invariant loadings and non-invariant intercepts on two longitudinal models: the LGM and the autoregressive quasi-simplex model (AR quasi-simplex). A second purpose was to determine whether there are conditions in which researchers can reach adequate conclusions about stability and growth even in the presence of violations of invariance. A Monte Carlo simulation study was conducted to achieve these purposes. The method consisted of generating items under a linear curve of factors model (COFM) or under the AR quasi-simplex. Composites of the items were formed at each time point and analyzed with a linear LGM or an AR quasi-simplex model. The results showed that the AR quasi-simplex model yielded biased path coefficients only in the conditions with large violations of invariance. The fit of the AR quasi-simplex was not affected by violations of invariance. In general, the growth parameter estimates of the LGM were biased under violations of invariance. Further, in the presence of non-invariant loadings, the rejection rates of the hypothesis of linear growth increased as the proportion of non-invariant items and the magnitude of violations of invariance increased. A discussion of the results and limitations of the study is provided, along with general recommendations.
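
The link between non-invariant loadings and rejected linear growth can be illustrated with a small deterministic sketch. It is not the study's simulation design: the three items, the loading values, the drift of 0.1 per wave, and the function names are all assumptions chosen for illustration. The sketch computes the expected value of an item-sum composite at each wave and shows that, when one loading drifts over time, the composite trajectory is no longer linear even though the latent mean grows linearly.

```python
def expected_composite_means(loadings, intercepts, latent_means):
    """Expected value of the item-sum composite at each wave.

    loadings[t][j] and intercepts[t][j] belong to item j at wave t;
    latent_means[t] is the factor mean at wave t.
    """
    return [
        sum(tau + lam * kappa for lam, tau in zip(lam_t, tau_t))
        for lam_t, tau_t, kappa in zip(loadings, intercepts, latent_means)
    ]


def second_differences(values):
    """For equally spaced waves, all-zero second differences mean a linear trajectory."""
    return [values[t + 2] - 2 * values[t + 1] + values[t]
            for t in range(len(values) - 2)]


WAVES = 4  # hypothetical number of waves
latent_means = [0.5 * t for t in range(WAVES)]        # linear latent growth
intercepts = [[0.0, 0.0, 0.0] for _ in range(WAVES)]  # invariant intercepts

# Hypothetical loadings: fully invariant versus one loading drifting upward.
invariant = [[0.8, 0.7, 0.6] for _ in range(WAVES)]
drifting = [[0.8, 0.7, 0.6 + 0.1 * t] for t in range(WAVES)]

linear_case = expected_composite_means(invariant, intercepts, latent_means)
curved_case = expected_composite_means(drifting, intercepts, latent_means)

print(second_differences(linear_case))  # approximately zero: composite is linear
print(second_differences(curved_case))  # nonzero: composite shows spurious curvature
```

Under these assumed values the curvature comes entirely from the measurement side, which is the kind of artifact a linear LGM fitted to composites could mistake for nonlinear growth.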

Contributors: Olivera-Aguilar, Margarita (Author) / Millsap, Roger E. (Thesis advisor) / Levy, Roy (Committee member) / MacKinnon, David (Committee member) / West, Stephen G. (Committee member) / Arizona State University (Publisher)

Created: 2013

Maximizing the benefits of collaborative learning in the college classroom

Description

This study tested the effects of two kinds of cognitive, domain-based preparation tasks on learning outcomes after engaging in a collaborative activity with a partner. The collaborative learning method of interest was termed "preparing-to-interact," and is supported in theory by the Preparation for Future Learning (PFL) paradigm and the Interactive-Constructive-Active-Passive (ICAP) framework. The current work combined these two cognitive-based approaches to design collaborative learning activities that can serve as alternatives to existing methods, which carry limitations and challenges. The "preparing-to-interact" method avoids the need for training students in specific collaboration skills or guiding/scripting their dialogic behaviors, while providing the opportunity for students to acquire the necessary prior knowledge for maximizing their discussions towards learning. The study used a 2x2 experimental design, investigating the factors of Preparation (No Prep and Prep) and Type of Activity (Active and Constructive) on deep and shallow learning. The sample was community college students in introductory psychology classes; the domain tested was "memory," in particular, concepts related to the process of remembering/forgetting information. Results showed that Preparation was a significant factor affecting deep learning, while shallow learning was not affected differently by the interventions. Essentially, with time-on-task and content equalized across all conditions, time spent individually preparing by working on the task alone and then discussing the content with a partner produced deeper learning than engaging in the task jointly for the duration of the learning period. Type of Task was not a significant factor in learning outcomes; however, exploratory analyses showed evidence of Constructive-type behaviors leading to deeper learning of the content. Additionally, a novel method of multilevel analysis (MLA) was used to examine the data to account for the dependency between partners within dyads. This work showed that "preparing-to-interact" is a way to maximize the benefits of collaborative learning. When students are first cognitively prepared, they seem to make the most efficient use of discussion towards learning and engage more deeply in the content during learning, leading to deeper knowledge of the content. Additionally, in using MLA to account for subject nonindependency, this work introduces new questions about the validity of statistical analyses for dyadic data.

Contributors: Lam, Rachel Jane (Author) / Nakagawa, Kathryn (Thesis advisor) / Green, Samuel (Committee member) / Stamm, Jill (Committee member) / Arizona State University (Publisher)

Created: 2013

Structure of perfectionism and relation to career indecision

Description

Perfectionism has been conceptualized as a relatively stable, independent, multidimensional personality construct in research during the last two decades. Despite general agreement that perfectionism is dimensional in nature, analyses using these instruments vacillate between a dimensional approach and a categorical approach (Broman-Fulks, Hill, & Green, 2008; Stoeber & Otto, 2006). The goal of the current study was two-fold. One aim was to examine the structural nature of two commonly used measures of perfectionism, the APS-R and the HFMPS. Latent class and factor analyses were conducted to determine the dimensions and categories that underlie the items of these two instruments. A second aim was to determine whether perfectionism classes or perfectionism factors better predicted four criterion variables of career indecision. Results lent evidence to the claim that both the APS-R and HFMPS are best used as dimensional, rather than categorical, instruments. From a substantive perspective, results indicated that both positive and negative aspects of perfectionism successfully predicted career indecision factors. The study concludes with a discussion of limitations and of implications for future research and for counseling individuals with career indecision concerns.

Contributors: Rohlfing, Jessica Elizabeth (Author) / Tracey, Terence J. G. (Thesis advisor) / Green, Samuel (Committee member) / Kinnier, Richard T. (Committee member) / Arizona State University (Publisher)

Created: 2013

Looking out the window: toward a visual understanding of school grounds as place

Description

This study looked at ways of understanding how schoolyards might act as meaningful places in children's developing sense of identity and possibility. Photographs and other images such as historical photographs and maps were used to look at how built environments outside of school reflect demographic and social differences within one southwest city. Intersections of children's worlds with various socio-political communities, woven into and through schooling, were examined for evidence of ways that schools act as the embodiment of a community's values: they are the material and observable effects of resource-allocation decisions. Scholarly materials were consulted to examine relationships in the images to existing theories of place and its effect on children, as well as to consider theories of the hidden curriculum and its relationship to social reproduction, and the nature of visual representation as a form of data rather than strictly in the service of illustrating other forms of data. The focus of the study was on identifying appropriate research methods for investigating ways to understand the importance of the material worlds of school and childhood. Using a combination of visual and narrative approaches to contribute to our understanding of those material worlds, I sought to expose areas of inequity and class differences in ways that children experience schooling, as evidenced by differences in the material environment. Using a mixed-methods approach, created and found images were coded for categories of material culture, such as the existence of fences, trees, and views from the playground or from walking in the neighborhood, at four Tempe schools. Findings were connected to a rich body of knowledge in areas such as theories of space and place, the nature of the hidden curriculum, visual culture, and visual research methods, including mapping. Familiar aspects of schooling were exposed in different ways, linking past decisions made by adults to their continuing effects on children today. In this way I arrived at an expanded and enriched understanding of the present worlds of children as communicated through the material environment. Visually examining children's worlds, by looking at the material artifacts of everyday worlds that children experience at school and including the child's-eye view in decision processes, has promise in moving decision makers away from strictly analytical and impersonal approaches to decision making about schooling children of the future. I proposed that by weighting data points, as used in decision-making processes regarding schooling, differently than is currently done, and by paying closer attention to possible longer-term effects of place for all children, not just a few, there is the potential to improve the quality of life for today's children and tomorrow's adults.

Contributors: Walsum, Joyce Van (Author) / Margolis, Eric M. (Thesis advisor) / Green, Samuel (Thesis advisor) / Collins, Daniel (Committee member) / Arizona State University (Publisher)

Created: 2013

The validation study of the Persistent Academic Possible Selves Scale for Adolescents

Description

Possible selves researchers have uncovered many issues associated with the current possible selves measures. For instance, one of the most famous possible selves measures, Oyserman's (2004) open-ended possible selves measure, has proven to be difficult to score reliably and also involves laborious scoring procedures. Therefore, this study was initiated to develop a closed-ended measure, called the Persistent Academic Possible Selves Scale for Adolescents (PAPSS), that meets these challenges. The PAPSS integrates possible selves theories (personal and social identities) and educational psychology (self-regulation in social cognitive theory). Four hundred ninety-five junior high and high school students participated in the validation study of the PAPSS. I conducted confirmatory factor analyses (CFA) to compare fit for a baseline model to the hypothesized models using Mplus version 7 (Muthén & Muthén, 2012). A weighted least squares mean- and variance-adjusted (WLSMV) estimation method was used for handling multivariate nonnormality of ordered categorical data. The final PAPSS has validity evidence based on the internal structure. The factor structure is composed of three goal-driven factors, one self-regulated factor that focuses on peers, and four self-regulated factors that emphasize the self. Oyserman's (2004) open-ended questionnaire was used for exploring the evidence of convergent validity. Many issues regarding Oyserman's (2003) instructions were found during the coding process of academic plausibility. It was complicated to detect hidden academic possible selves and strategies from non-academic possible selves and strategies. Also, interpersonal-related strategies were overweighted in the scoring process compared to interpersonal-related academic possible selves. The study results showed that all of the academic goal-related factors in the PAPSS are significantly and positively related to academic plausibility. However, the self-regulated factors in the PAPSS are not. The correlation results between the self-regulated factors and academic plausibility do not provide evidence of convergent validity. Theoretical and methodological explanations for the test results are discussed.

Contributors: Lee, Ji Eun (Author) / Husman, Jenefer (Thesis advisor) / Green, Samuel (Committee member) / Millsap, Roger (Committee member) / Brem, Sarah (Committee member) / Arizona State University (Publisher)

Created: 2013

A comparison of DIMTEST and generalized dimensionality discrepancy approaches to assessing dimensionality in item response theory

Description

Dimensionality assessment is an important component of evaluating item response data. Existing approaches to evaluating common assumptions of unidimensionality, such as DIMTEST (Nandakumar & Stout, 1993; Stout, 1987; Stout, Froelich, & Gao, 2001), have been shown to work well under large-scale assessment conditions (e.g., large sample sizes and item pools; see e.g., Froelich & Habing, 2007). It remains to be seen how such procedures perform in the context of small-scale assessments characterized by relatively small sample sizes and/or short tests. The fact that some procedures come with minimum allowable values for characteristics of the data, such as the number of items, may even render them unusable for some small-scale assessments. Other measures designed to assess dimensionality do not come with such limitations and, as such, may perform better under conditions that do not lend themselves to evaluation via statistics that rely on asymptotic theory. The current work aimed to evaluate the performance of one such metric, the standardized generalized dimensionality discrepancy measure (SGDDM; Levy & Svetina, 2011; Levy, Xu, Yel, & Svetina, 2012), under both large- and small-scale testing conditions. A Monte Carlo study was conducted to compare the performance of DIMTEST and the SGDDM statistic in terms of evaluating assumptions of unidimensionality in item response data under a variety of conditions, with an emphasis on the examination of these procedures in small-scale assessments. Similar to previous research, increases in either test length or sample size resulted in increased power. The DIMTEST procedure appeared to be a conservative test of the null hypothesis of unidimensionality. The SGDDM statistic exhibited rejection rates near the nominal rate of .05 under unidimensional conditions, though the reliability of these results may have been less than optimal due to high sampling variability resulting from a relatively limited number of replications. 
Power values were at or near 1.0 for many of the multidimensional conditions. It was only when the sample size was reduced to N = 100 that the two approaches diverged in performance. Results suggested that both procedures may be appropriate for sample sizes as low as N = 250 and tests as short as J = 12 (SGDDM) or J = 19 (DIMTEST). When used as a diagnostic tool, SGDDM may be appropriate with as few as N = 100 cases combined with J = 12 items. The study was somewhat limited in that it did not include any complex factorial designs, and neither the strength of item discrimination parameters nor the correlation between factors was manipulated. It is recommended that further research be conducted with the inclusion of these factors, as well as an increase in the number of replications when using the SGDDM procedure.
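
The concern about sampling variability with few replications can be made concrete with the usual binomial standard error of an estimated rejection rate, sqrt(p(1 - p) / R). The sketch below is illustrative only: the replication counts (50, 100, 1000) are hypothetical and are not taken from the study, and the interval uses a simple normal approximation.

```python
import math


def rejection_rate_se(rate, replications):
    """Binomial standard error of a Monte Carlo rejection rate."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must lie in [0, 1]")
    if replications <= 0:
        raise ValueError("replications must be positive")
    return math.sqrt(rate * (1.0 - rate) / replications)


def approx_95_interval(rate, replications):
    """Normal-approximation 95% interval, clipped to [0, 1]."""
    half_width = 1.96 * rejection_rate_se(rate, replications)
    return max(0.0, rate - half_width), min(1.0, rate + half_width)


# Hypothetical replication counts, evaluated at the nominal rate of .05.
for reps in (50, 100, 1000):
    low, high = approx_95_interval(0.05, reps)
    print(f"R = {reps}: SE = {rejection_rate_se(0.05, reps):.4f}, "
          f"interval = ({low:.3f}, {high:.3f})")
```

With a small number of replications the interval around .05 is wide, so an observed rate near nominal is only weak evidence of good Type I error control, which is consistent with the recommendation to increase the number of replications.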

Contributors: Reichenberg, Ray E (Author) / Levy, Roy (Thesis advisor) / Thompson, Marilyn S. (Thesis advisor) / Green, Samuel B. (Committee member) / Arizona State University (Publisher)

Created: 2013

Do more comprehensive psychoeducational evaluations promote TBI educational diagnosis?

Description

Students with traumatic brain injury (TBI) sometimes experience impairments that can adversely affect educational performance. Consequently, school psychologists may be needed to help determine if a TBI diagnosis is warranted (i.e., in compliance with the Individuals with Disabilities Education Improvement Act, IDEIA) and to suggest accommodations to assist those students. This analogue study investigated whether school psychologists provided with more comprehensive psychoeducational evaluations of a student with TBI succeeded in detecting TBI and in making TBI-related accommodations, and whether they were more confident in their decisions. To test these hypotheses, 76 school psychologists were randomly assigned to one of three groups that received increasingly comprehensive levels of psychoeducational evaluation embedded in a cumulative folder of a hypothetical student whose history included a recent head injury and TBI-compatible school problems. As expected, school psychologists who received a more comprehensive psychoeducational evaluation were more likely to make a TBI educational diagnosis, but the effect size was not strong, and the predictive value came from the variance between the first and third groups. Likewise, school psychologists receiving more comprehensive evaluation data produced more accommodations related to student needs and felt more confidence in those accommodations, but significant differences were not found at all levels of evaluation. Contrary to expectations, however, providing more comprehensive information failed to engender more confidence in decisions about TBI educational diagnoses. Concluding that a TBI is present may itself facilitate accommodations; school psychologists who judged that the student warranted a TBI educational diagnosis produced more TBI-related accommodations. The findings suggest the importance of training school psychologists in the interpretation of neuropsychology test results to aid in educational diagnosis and to increase confidence in their use.

Contributors: Hildreth, Lisa Jane (Author) / Hildreth, Lisa J (Thesis advisor) / Wodrich, David (Committee member) / Levy, Roy (Committee member) / Lavoie, Michael (Committee member) / Arizona State University (Publisher)

Created: 2012

Posterior predictive model checking in Bayesian networks

Description

This simulation study compared the utility of various discrepancy measures within a posterior predictive model checking (PPMC) framework for detecting different types of data-model misfit in multidimensional Bayesian network (BN) models. The investigated conditions were motivated by an applied research program utilizing an operational complex performance assessment within a digital-simulation educational context grounded in theories of cognition and learning. BN models were manipulated along two factors: latent variable dependency structure and number of latent classes. Distributions of posterior predicted p-values (PPP-values) served as the primary outcome measure and were summarized in graphical presentations, by median values across replications, and by proportions of replications in which the PPP-values were extreme. An effect size measure for PPMC was introduced as a supplemental numerical summary to the PPP-value. Consistent with previous PPMC research, all investigated fit functions tended to perform conservatively, but Standardized Generalized Dimensionality Discrepancy Measure (SGDDM), Yen's Q3, and Hierarchy Consistency Index (HCI) only mildly so. Adequate power to detect at least some types of misfit was demonstrated by SGDDM, Q3, HCI, Item Consistency Index (ICI), and to a lesser extent Deviance, while proportion correct (PC), a chi-square-type item-fit measure, Ranked Probability Score (RPS), and Good's Logarithmic Scale (GLS) were powerless across all investigated factors. Bivariate SGDDM and Q3 were found to provide powerful and detailed feedback for all investigated types of misfit.
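
As a rough illustration of the outcome measures described above, a posterior predictive p-value can be estimated as the proportion of posterior draws in which the discrepancy computed on replicated data is at least as large as the discrepancy computed on the observed data. The sketch below assumes the per-draw discrepancy values have already been computed; the function names and the cutoffs of .05 and .95 for "extreme" are assumptions for illustration, not the study's definitions.

```python
def ppp_value(realized, replicated):
    """Share of posterior draws where the replicated discrepancy >= the realized one."""
    if len(realized) != len(replicated) or not realized:
        raise ValueError("need two non-empty sequences of equal length")
    exceed = sum(1 for obs, rep in zip(realized, replicated) if rep >= obs)
    return exceed / len(realized)


def is_extreme(ppp, lower=0.05, upper=0.95):
    """Flag a PPP-value as extreme; the cutoffs are illustrative assumptions."""
    return ppp < lower or ppp > upper


def proportion_extreme(ppp_values, lower=0.05, upper=0.95):
    """Proportion of simulation replications whose PPP-value is extreme."""
    flags = [is_extreme(p, lower, upper) for p in ppp_values]
    return sum(flags) / len(flags)


# Toy discrepancy values for four posterior draws (made up for the example).
print(ppp_value([1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 3.0, 5.0]))  # 0.75
print(proportion_extreme([0.01, 0.50, 0.99, 0.40]))           # 0.5
```

A PPP-value near .5 suggests the observed discrepancy is typical of what the model predicts, while values near 0 or 1 point to misfit.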

Contributors: Crawford, Aaron (Author) / Levy, Roy (Thesis advisor) / Green, Samuel (Committee member) / Thompson, Marilyn (Committee member) / Arizona State University (Publisher)

Created: 2014

Competency Assessment in Nursing Using Simulation: A Generalizability Study and Scenario Validation Process

Description

The measurement of competency in nursing is critical to ensure safe and effective care of patients. This study had two purposes. First, the psychometric characteristics of the Nursing Performance Profile (NPP), an instrument used to measure nursing competency, were evaluated using generalizability theory and a sample of 18 nurses in the Measuring Competency with Simulation (MCWS) Phase I dataset. The relative magnitudes of various error sources and their interactions were estimated in a generalizability study involving a fully crossed, three-facet random design with nurse participants as the object of measurement and scenarios, raters, and items as the three facets. A design corresponding to that of the MCWS Phase I data--involving three scenarios, three raters, and 41 items--showed nurse participants contributed the greatest proportion to total variance (50.00%), followed, in decreasing magnitude, by: rater (19.40%), the two-way participant x scenario interaction (12.93%), and the two-way participant x rater interaction (8.62%). The generalizability (G) coefficient was .65 and the dependability coefficient was .50. In decision study designs minimizing number of scenarios, the desired generalizability coefficients of .70 and .80 were reached at three scenarios with five raters, and five scenarios with nine raters, respectively. In designs minimizing number of raters, G coefficients of .72 and .80 were reached at three raters and five scenarios and four raters and nine scenarios, respectively. A dependability coefficient of .71 was attained with six scenarios and nine raters or seven raters and nine scenarios. Achieving high reliability with designs involving fewer raters may be possible with enhanced rater training to decrease variance components for rater main and interaction effects. 
The second part of this study involved the design and implementation of a validation process for evidence-based human patient simulation scenarios in assessment of nursing competency. A team of experts validated the new scenario using a modified Delphi technique, involving three rounds of iterative feedback and revisions. In tandem, the psychometric study of the NPP and the development of a validation process for human patient simulation scenarios both advance and encourage best practices for studying the validity of simulation-based assessments.
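
The decision-study logic described above can be sketched with the standard formulas for a fully crossed participant x scenario x rater x item random design: the generalizability coefficient divides participant variance by itself plus relative error, and the dependability coefficient also counts the facet main effects and their interactions as error. The variance components below are made-up values for illustration only and are not the NPP estimates; the dictionary keys and function name are likewise assumptions.

```python
def g_and_phi(var, n_s, n_r, n_i):
    """Generalizability (G) and dependability (Phi) coefficients.

    var maps effect labels to variance components for a fully crossed
    p x s x r x i random design; n_s, n_r, n_i are the numbers of
    scenarios, raters, and items in the decision-study design.
    """
    relative_error = (
        var["ps"] / n_s + var["pr"] / n_r + var["pi"] / n_i
        + var["psr"] / (n_s * n_r) + var["psi"] / (n_s * n_i)
        + var["pri"] / (n_r * n_i) + var["psri,e"] / (n_s * n_r * n_i)
    )
    absolute_error = relative_error + (
        var["s"] / n_s + var["r"] / n_r + var["i"] / n_i
        + var["sr"] / (n_s * n_r) + var["si"] / (n_s * n_i)
        + var["ri"] / (n_r * n_i)
    )
    g = var["p"] / (var["p"] + relative_error)
    phi = var["p"] / (var["p"] + absolute_error)
    return g, phi


# Hypothetical variance components (proportions of total variance).
components = {
    "p": 0.40, "s": 0.02, "r": 0.15, "i": 0.03,
    "ps": 0.12, "pr": 0.08, "pi": 0.02,
    "sr": 0.01, "si": 0.01, "ri": 0.01,
    "psr": 0.05, "psi": 0.02, "pri": 0.02, "psri,e": 0.06,
}

for n_s, n_r in [(1, 1), (3, 3), (5, 9)]:
    g, phi = g_and_phi(components, n_s, n_r, n_i=41)
    print(f"scenarios={n_s}, raters={n_r}: G={g:.2f}, Phi={phi:.2f}")
```

Because rater variance enters only the absolute error term, a large rater main effect lowers the dependability coefficient more than the G coefficient, which fits the suggestion that rater training could reduce the number of raters needed.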

Contributors: O'Brien, Janet Elaine (Author) / Thompson, Marilyn (Thesis advisor) / Hagler, Debra (Thesis advisor) / Green, Samuel (Committee member) / Arizona State University (Publisher)

Created: 2014

Analytic Selection of a Valid Subtest for DIF Analysis when DIF has Multiple Potential Causes among Multiple Groups

Description

The study examined how ATFIND, Mantel-Haenszel, SIBTEST, and Crossing SIBTEST function when items in the dataset are modelled to differentially advantage a lower ability focal group over a higher ability reference group. The primary purpose of the study was to examine ATFIND's usefulness as a valid subtest selection tool, but it also explored the influence of DIF items, item difficulty, and presence of multiple examinee populations with different ability distributions on both its selection of the assessment test (AT) and partitioning test (PT) lists and on all three differential item functioning (DIF) analysis procedures. The results of SIBTEST were also combined with those of Crossing SIBTEST, as might be done in practice.

ATFIND was found to be a less-than-effective matching subtest selection tool with DIF items that are modelled unidimensionally. If an item was modelled with uniform DIF or if it had a referent difficulty parameter in the Medium range, it was found to be selected slightly more often for the AT List than the PT List. These trends were seen to increase as sample size increased. All three DIF analyses, and the combined SIBTEST and Crossing SIBTEST, generally were found to perform less well as DIF contaminated the matching subtest, as well as when DIF was modelled less severely or when the focal group ability was skewed. While the combined SIBTEST and Crossing SIBTEST was found to have the highest power among the DIF analyses, it also was found to have Type I error rates that were sometimes extremely high.
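
For readers unfamiliar with the Mantel-Haenszel procedure named above, the sketch below computes its common odds ratio across matched score strata and the delta-scale transformation, -2.35 ln(alpha), often used to report it. The counts are invented for the example, the function names are assumptions, and the sketch omits the chi-square significance test and any purification of the matching subtest.

```python
import math


def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio across matched strata.

    Each stratum is a tuple of counts:
    (reference correct, reference incorrect, focal correct, focal incorrect).
    """
    numerator = 0.0
    denominator = 0.0
    for ref_right, ref_wrong, foc_right, foc_wrong in strata:
        total = ref_right + ref_wrong + foc_right + foc_wrong
        if total == 0:
            continue  # an empty stratum carries no information
        numerator += ref_right * foc_wrong / total
        denominator += ref_wrong * foc_right / total
    if denominator == 0:
        raise ZeroDivisionError("odds ratio is undefined for these counts")
    return numerator / denominator


def mh_delta(alpha):
    """Delta-scale index; negative values indicate the item favors the reference group."""
    return -2.35 * math.log(alpha)


# Invented counts for two matching-score strata.
strata = [(10, 10, 5, 15), (20, 20, 10, 30)]
alpha = mantel_haenszel_odds_ratio(strata)
print(alpha)            # 3.0
print(mh_delta(alpha))  # about -2.58
```

The strata are formed from the matching subtest score, which is why contamination of that subtest by DIF items, as reported above, degrades the procedure.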

Contributors: Scott, Lietta Marie (Author) / Levy, Roy (Thesis advisor) / Green, Samuel B (Thesis advisor) / Gorin, Joanna S (Committee member) / Williams, Leila E (Committee member) / Arizona State University (Publisher)

Created: 2014