Maximizing the benefits of collaborative learning in the college classroom

Description

This study tested the effects of two kinds of cognitive, domain-based preparation tasks on learning outcomes after engaging in a collaborative activity with a partner. The collaborative learning method of interest was termed "preparing-to-interact," and is supported in theory by the Preparation for Future Learning (PFL) paradigm and the Interactive-Constructive-Active-Passive (ICAP) framework. The current work combined these two cognitive-based approaches to design collaborative learning activities that can serve as alternatives to existing methods, which carry limitations and challenges. The "preparing-to-interact" method avoids the need for training students in specific collaboration skills or guiding/scripting their dialogic behaviors, while providing the opportunity for students to acquire the necessary prior knowledge for maximizing their discussions towards learning. The study used a 2x2 experimental design, investigating the effects of Preparation (No Prep and Prep) and Type of Activity (Active and Constructive) on deep and shallow learning. The sample was community college students in introductory psychology classes; the domain tested was "memory," in particular, concepts related to the process of remembering/forgetting information. Results showed that Preparation was a significant factor affecting deep learning, while shallow learning was not affected differently by the interventions. Essentially, with time-on-task and content equalized across all conditions, preparing individually by working on the task alone and then discussing the content with a partner produced deeper learning than engaging in the task jointly for the duration of the learning period. Type of Activity was not a significant factor in learning outcomes; however, exploratory analyses showed evidence of Constructive-type behaviors leading to deeper learning of the content.
Additionally, a novel method of multilevel analysis (MLA) was used to examine the data to account for the dependency between partners within dyads. This work showed that "preparing-to-interact" is a way to maximize the benefits of collaborative learning. When students are first cognitively prepared, they seem to make the most efficient use of discussion towards learning and engage more deeply in the content during learning, which leads to deeper knowledge of the content. Additionally, in using MLA to account for subject nonindependence, this work introduces new questions about the validity of statistical analyses for dyadic data.

Contributors: Lam, Rachel Jane (Author) / Nakagawa, Kathryn (Thesis advisor) / Green, Samuel (Committee member) / Stamm, Jill (Committee member) / Arizona State University (Publisher)

Created: 2013

A comparison of DIMTEST and generalized dimensionality discrepancy approaches to assessing dimensionality in item response theory

Description

Dimensionality assessment is an important component of evaluating item response data. Existing approaches to evaluating common assumptions of unidimensionality, such as DIMTEST (Nandakumar & Stout, 1993; Stout, 1987; Stout, Froelich, & Gao, 2001), have been shown to work well under large-scale assessment conditions (e.g., large sample sizes and item pools; see e.g., Froelich & Habing, 2007). It remains to be seen how such procedures perform in the context of small-scale assessments characterized by relatively small sample sizes and/or short tests. The fact that some procedures come with minimum allowable values for characteristics of the data, such as the number of items, may even render them unusable for some small-scale assessments. Other measures designed to assess dimensionality do not come with such limitations and, as such, may perform better under conditions that do not lend themselves to evaluation via statistics that rely on asymptotic theory. The current work aimed to evaluate the performance of one such metric, the standardized generalized dimensionality discrepancy measure (SGDDM; Levy & Svetina, 2011; Levy, Xu, Yel, & Svetina, 2012), under both large- and small-scale testing conditions. A Monte Carlo study was conducted to compare the performance of DIMTEST and the SGDDM statistic in terms of evaluating assumptions of unidimensionality in item response data under a variety of conditions, with an emphasis on the examination of these procedures in small-scale assessments. Similar to previous research, increases in either test length or sample size resulted in increased power. The DIMTEST procedure appeared to be a conservative test of the null hypothesis of unidimensionality. The SGDDM statistic exhibited rejection rates near the nominal rate of .05 under unidimensional conditions, though the reliability of these results may have been less than optimal due to high sampling variability resulting from a relatively limited number of replications. 
Power values were at or near 1.0 for many of the multidimensional conditions. It was only when the sample size was reduced to N = 100 that the two approaches diverged in performance. Results suggested that both procedures may be appropriate for sample sizes as low as N = 250 and tests as short as J = 12 (SGDDM) or J = 19 (DIMTEST). When used as a diagnostic tool, SGDDM may be appropriate with as few as N = 100 cases combined with J = 12 items. The study was somewhat limited in that it did not include any complex factorial designs, nor did it manipulate the strength of item discrimination parameters or the correlation between factors. It is recommended that further research include these factors and use a larger number of replications when applying the SGDDM procedure.
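
As a rough illustration of why a limited number of replications produces high sampling variability in an estimated rejection rate, the sketch below computes an empirical rejection rate from simulated p-values and its Monte Carlo standard error. The function names and the p-values are hypothetical and are not taken from the study; the standard error uses the usual binomial formula sqrt(p(1 - p)/R) for R replications.

```python
import math


def rejection_rate(p_values, alpha=0.05):
    """Proportion of replications whose p-value falls below alpha."""
    return sum(p < alpha for p in p_values) / len(p_values)


def monte_carlo_se(rate, replications):
    """Binomial standard error of a rejection rate estimated from R replications."""
    return math.sqrt(rate * (1.0 - rate) / replications)


# Hypothetical p-values from four replications of one simulated condition.
example_p_values = [0.01, 0.20, 0.04, 0.50]
example_rate = rejection_rate(example_p_values)

# Standard error of a rate near the nominal .05 level for two
# hypothetical replication counts.
se_100 = monte_carlo_se(0.05, 100)
se_1000 = monte_carlo_se(0.05, 1000)
```

Under these assumptions, a rate near .05 estimated from 100 replications carries a standard error of roughly .022, which shrinks to about .007 with 1,000 replications. This is one way to read the recommendation to increase the number of replications.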

Contributors: Reichenberg, Ray E (Author) / Levy, Roy (Thesis advisor) / Thompson, Marilyn S. (Thesis advisor) / Green, Samuel B. (Committee member) / Arizona State University (Publisher)

Created: 2013

Assessing dimensionality in complex data structures: a performance comparison of DETECT and NOHARM procedures

Description

The purpose of this study was to investigate the effect of complex structure on dimensionality assessment in compensatory and noncompensatory multidimensional item response models (MIRT) of assessment data using dimensionality assessment procedures based on conditional covariances (i.e., DETECT) and a factor analytical approach (i.e., NOHARM). The DETECT-based methods typically outperformed the NOHARM-based methods in both two- (2D) and three-dimensional (3D) compensatory MIRT conditions. The DETECT-based methods yielded a high proportion correct, especially when correlations were .60 or smaller, data exhibited 30% or less complexity, and sample sizes were larger. As the complexity increased and the sample size decreased, the performance typically diminished. As the complexity increased, it also became more difficult to label the resulting sets of items from DETECT in terms of the dimensions. DETECT was consistent in classification of simple items, but less consistent in classification of complex items. Out of the three NOHARM-based methods, χ²G/D and ALR generally outperformed RMSR. χ²G/D was more accurate when N = 500 and complexity levels were 30% or lower. As the number of items increased, ALR performance improved at a correlation of .60 and 30% or less complexity. When the data followed a noncompensatory MIRT model, the NOHARM-based methods, specifically χ²G/D and ALR, were the most accurate of all five methods. The marginal proportions for labeling sets of items as dimension-like were typically low, suggesting that the methods generally failed to label two (three) sets of items as dimension-like in 2D (3D) noncompensatory situations. The DETECT-based methods were more consistent in classifying simple items across complexity levels, sample sizes, and correlations. However, as complexity and correlation levels increased, the classification rates for all methods decreased.
In most conditions, the DETECT-based methods classified complex items as consistently as or more consistently than the NOHARM-based methods. In particular, as complexity, the number of items, and the true dimensionality increased, the DETECT-based methods were notably more consistent than any NOHARM-based method. Despite DETECT's consistency, when data follow a noncompensatory MIRT model, the NOHARM-based methods should be preferred over the DETECT-based methods to assess dimensionality due to the poor performance of DETECT in identifying the true dimensionality.

Contributors: Svetina, Dubravka (Author) / Levy, Roy (Thesis advisor) / Gorin, Joanna S. (Committee member) / Millsap, Roger (Committee member) / Arizona State University (Publisher)

Created: 2011

Assessment of item parameter drift of known items in a university placement exam

Description

This study investigated the possibility of item parameter drift (IPD) in a calculus placement examination administered to approximately 3,000 students at a large university in the United States. A single form of the exam was administered continuously for a period of two years, possibly allowing later examinees to have prior knowledge of specific items on the exam. An analysis of IPD was conducted to explore evidence of possible item exposure. Two assumptions concerning item exposure were made: 1) item recall and item exposure are positively correlated, and 2) item exposure results in the items becoming easier over time. Special consideration was given to two contextual item characteristics: 1) item location within the test, specifically items at the beginning and end of the exam, and 2) the use of an associated diagram. The hypotheses stated that these item characteristics would make the items easier to recall and, therefore, more likely to be exposed, resulting in item drift. BILOG-MG 3 was used to calibrate the items and assess for IPD. No evidence was found to support the hypotheses that the items located at the beginning of the test or with an associated diagram drifted as a result of item exposure. Three items among the last ten on the exam drifted significantly and became easier, consistent with item exposure. However, in this study, the possible effects of item exposure could not be separated from the effects of other potential factors such as speededness, curriculum changes, better test preparation on the part of subsequent examinees, or guessing.

Contributors: Krause, Janet (Author) / Levy, Roy (Thesis advisor) / Thompson, Marilyn (Thesis advisor) / Gorin, Joanna (Committee member) / Arizona State University (Publisher)

Created: 2012

Sample size and test length minima for DIMTEST with conditional covariance-based subtest selection

Description

The existing recommended minima for sample size and test length for DIMTEST (750 examinees and 25 items) are tied to features of the procedure that are no longer in use. The current version of DIMTEST uses a bootstrapping procedure to remove bias from the test statistic and is packaged with a conditional covariance-based procedure called ATFIND for partitioning test items. Key factors such as sample size, test length, test structure, the correlation between dimensions, and strength of dependence were manipulated in a Monte Carlo study to assess the effectiveness of the current version of DIMTEST with fewer examinees and items. In addition, the DETECT program was used to partition test items, and a second feature of this study compared the structure of test partitions obtained with ATFIND and DETECT in a number of ways. With some exceptions, the performance of DIMTEST was quite conservative in unidimensional conditions. The performance of DIMTEST in multidimensional conditions depended on each of the manipulated factors, and suggested that the minima of sample size and test length can be lowered for some conditions. In terms of partitioning test items in unidimensional conditions, DETECT tended to produce longer assessment subtests than ATFIND, in turn yielding different test partitions. In multidimensional conditions, test partitions became more similar and were more accurate with increased sample size, factorially simple data, greater strength of dependence, and a decreased correlation between dimensions. Recommendations for sample size and test length minima are provided along with suggestions for future research.

Contributors: Fay, Derek (Author) / Levy, Roy (Thesis advisor) / Green, Samuel (Committee member) / Gorin, Joanna (Committee member) / Arizona State University (Publisher)

Created: 2012

Modern psychometric theory in clinical assessment

Description

Item response theory (IRT) and related latent variable models represent modern psychometric theory, the successor to classical test theory in psychological assessment. While IRT has become prevalent in the assessment of ability and achievement, it has not been widely embraced by clinical psychologists. This appears due, in part, to psychometrists' use of unidimensional models despite evidence that psychiatric disorders are inherently multidimensional. The construct validity of unidimensional and multidimensional latent variable models was compared to evaluate the utility of modern psychometric theory in clinical assessment. Archival data consisting of 688 outpatients' presenting concerns, psychiatric diagnoses, and item level responses to the Brief Symptom Inventory (BSI) were extracted from files at a university mental health clinic. Confirmatory factor analyses revealed that models with oblique factors and/or item cross-loadings better represented the internal structure of the BSI in comparison to a strictly unidimensional model. The models were generally equivalent in their ability to account for variance in criterion-related validity variables; however, bifactor models demonstrated superior validity in differentiating between mood and anxiety disorder diagnoses. Multidimensional IRT analyses showed that the orthogonal bifactor model partitioned distinct, clinically relevant sources of item variance. Similar results were also achieved through multivariate prediction with an oblique simple structure model. Receiver operating characteristic curves confirmed improved sensitivity and specificity through multidimensional models of psychopathology. Clinical researchers are encouraged to consider these and other comprehensive models of psychological distress.

Contributors: Thomas, Michael Lee (Author) / Lanyon, Richard (Thesis advisor) / Barrera, Manuel (Committee member) / Levy, Roy (Committee member) / Millsap, Roger (Committee member) / Arizona State University (Publisher)

Created: 2011

Association between Student Engagement and Resilience in the Context of COVID-19

Description

During the global COVID-19 pandemic in 2020, many universities shifted their focus to hosting classes and events online for their student population in order to keep them engaged. The present study investigated whether an association exists between student engagement (an individual’s engagement with class and campus) and resilience. A single-shot survey was administered to 200 participants currently enrolled as undergraduate students at Arizona State University. A multiple regression analysis was conducted and Pearson correlations were calculated. A moderate, significant correlation was found between student engagement (total score) and resilience. Significant correlations were also found between cognitive engagement (a student’s approach to and understanding of their learning) and resilience, and between valuing and resilience. Contrary to expectations, participation was not associated with resilience. Potential explanations for these results were explored and practical applications for the university were discussed.
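
For readers unfamiliar with the statistic, the following minimal sketch shows how a Pearson correlation between two sets of scores can be computed. The function name and the score lists are hypothetical illustrations, not the study's data or analysis code.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical engagement and resilience scores for five respondents.
engagement = [10, 12, 15, 18, 20]
resilience = [20, 24, 30, 36, 40]
r = pearson_r(engagement, resilience)
```

In this made-up example the resilience scores are exactly twice the engagement scores, so the correlation is 1; real survey data would give a value between -1 and 1.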

Contributors: Emmanuelli, Michelle (Author) / Jimenez Arista, Laura (Thesis director) / Sever, Amy (Committee member) / College of Integrative Sciences and Arts (Contributor) / Barrett, The Honors College (Contributor)

Created: 2021-05

Using Logistic Regression to Predict Stock Trends Based on Bag-of-Words Representations of News Article Headlines

Description

We attempted to apply a novel approach to stock market predictions. The Logistic Regression machine learning algorithm (Joseph Berkson) was applied to analyze news article headlines as represented by a bag-of-words (tri-gram and single-gram) representation in an attempt to predict the trends of stock prices based on the Dow Jones Industrial Average. The results showed that a tri-gram bag led to a 49% trend accuracy, a 1% increase when compared to the single-gram representation’s accuracy of 48%.
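
To make the two representations concrete, here is a minimal sketch of how a headline might be turned into single-gram and tri-gram bags of words. The tokenization (lowercasing and splitting on whitespace), the function name, and the example headline are assumptions for illustration; the abstract does not describe the authors' preprocessing.

```python
from collections import Counter


def ngram_bag(headline, n):
    """Count the word n-grams in one headline (lowercased, whitespace-split)."""
    tokens = headline.lower().split()
    grams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return Counter(grams)


# Hypothetical headline, not taken from the study's data set.
headline = "Stocks rally as markets rally"
unigrams = ngram_bag(headline, 1)  # single-gram bag: counts of individual words
trigrams = ngram_bag(headline, 3)  # tri-gram bag: counts of three-word sequences
```

Counts such as these would then serve as the input features of a logistic regression classifier. A tri-gram bag keeps some local word order but is much sparser than a single-gram bag.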

Contributors: Barolli, Adeiron (Author) / Jimenez Arista, Laura (Thesis director) / Wilson, Jeffrey (Committee member) / School of Life Sciences (Contributor) / Barrett, The Honors College (Contributor)

Created: 2021-05

The accuracy of accuracy estimates for single form dichotomous classification exams

Description

The use of exams for classification purposes has become prevalent across many fields, including professional assessment for employment screening and standards-based testing in educational settings. Classification exams assign individuals to performance groups based on the comparison of their observed test scores to a pre-selected criterion (e.g., masters vs. nonmasters in dichotomous classification scenarios). The successful use of exams for classification purposes assumes at least minimal levels of accuracy of these classifications. Classification accuracy is an index that reflects the rate of correct classification of individuals into the category that contains their true ability score. Traditional methods estimate classification accuracy by assuming that true scores follow a four-parameter beta-binomial distribution. Recent research suggests that Item Response Theory may be a preferable alternative framework for estimating examinees' true scores and may return more accurate classifications based on these scores. Researchers hypothesized that test length, the location of the cut score, the distribution of items, and the distribution of examinee ability would impact the recovery of accurate estimates of classification accuracy. The current simulation study manipulated these factors to assess their potential influence on classification accuracy. Observed classification as masters vs. nonmasters, true classification accuracy, estimated classification accuracy, BIAS, and RMSE were analyzed. In addition, Analysis of Variance tests were conducted to determine whether an interrelationship existed between levels of the four manipulated factors. Results showed small values of estimated classification accuracy and increased BIAS in accuracy estimates with few items, mismatched distributions of item difficulty and examinee ability, and extreme cut scores. A significant four-way interaction between manipulated variables was observed.
In addition to interpretations of these findings and explanation of potential causes for the recovered values, recommendations that inform practice and avenues of future research are provided.
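
The quantities named in this abstract can be sketched as follows. This is an illustrative sketch only: it assumes that a master is any examinee whose score is at or above the cut score, and the function names, scores, and accuracy estimates are hypothetical rather than taken from the study.

```python
import math


def classification_accuracy(true_scores, observed_scores, cut_score):
    """Proportion of examinees placed in the same category by true and observed scores.

    Assumption: a score at or above cut_score is classified as a master.
    """
    if len(true_scores) != len(observed_scores):
        raise ValueError("score lists must have the same length")
    matches = sum(
        (t >= cut_score) == (o >= cut_score)
        for t, o in zip(true_scores, observed_scores)
    )
    return matches / len(true_scores)


def bias(estimates, true_value):
    """Mean signed difference between estimates and the true value."""
    return sum(e - true_value for e in estimates) / len(estimates)


def rmse(estimates, true_value):
    """Root mean squared error of estimates around the true value."""
    return math.sqrt(sum((e - true_value) ** 2 for e in estimates) / len(estimates))


# Hypothetical true and observed scores for four examinees, cut score of 15.
accuracy = classification_accuracy([10, 14, 15, 20], [12, 16, 14, 19], 15)

# Hypothetical accuracy estimates from two replications, true accuracy of 0.85.
example_bias = bias([0.80, 0.90], 0.85)
example_rmse = rmse([0.80, 0.90], 0.85)
```

In the example, two of the four examinees land in the same category under both score types, giving an accuracy of 0.5. The two hypothetical estimates miss the true value by equal amounts in opposite directions, so BIAS is zero while RMSE is 0.05, which shows why a simulation study would report both.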

Contributors: Kunze, Katie (Author) / Gorin, Joanna (Thesis advisor) / Levy, Roy (Thesis advisor) / Green, Samuel (Committee member) / Arizona State University (Publisher)

Created: 2013
