Matching Items (28)

Improving Arizona English language learners' mathematics achievement using curriculum-based measures

Description

ABSTRACT

This study was an investigation of the effectiveness of curriculum-based measures (CBMs) on the math achievement of first and second grade English Language Learners (ELL). The No Child Left Behind Act (NCLB) of 2001 led to a new educational reform, which identifies and provides services to students in need of academic support based on English language proficiency. These students come from certain demographic groups: minorities, low-income families, students with disabilities, and students with limited English proficiency. NCLB was intended to lead to improvement in the quality of the United States educational system.

Four classes from the community of Kayenta, Arizona in the Navajo Nation were randomly assigned to control and experimental groups, one each per grade. All four classes used the state-approved, core math curriculum, but one class in each grade was provided with weekly CBMs for an entire school year that included sample questions developed from the Arizona Department of Education performance standards. The CBMs contained at least one question from each of the five math strands: number and operations, algebra, geometry, measurement, and data and probability.

The NorthWest Evaluation Assessment (NWEA) served as the pretest and posttest for all four groups. The SAT 10 (RIT scores) math test, administered near the time of the pretest, served as the covariate in the analysis. Two analysis of covariance tests revealed no statistically significant treatment effects, subject gender effects, or interactions for either Grade 1 or Grade 2. Achievement levels were relatively constant across both genders and the two grade levels.

Despite increasing emphasis on assessment and accountability, the achievement gaps between these subpopulations and the general population of students continue to widen. It appears that other variables are responsible for the different achievement levels found among students. Researchers have found that students of teachers with math certification, degrees related to math, and advanced course work in math show improved math performance over students of teachers who lack those qualifications. The design of the current study did not permit analyses of teacher or school effects.

Date Created
  • 2014

Implications of Learning Outcomes of In-Person and Virtual Field-Based Geoscience Instruction at Grand Canyon National Park

Description

Education through field exploration is fundamental in geoscience. But not all students enjoy equal access to field-based learning because of time, cost, distance, ability, and safety constraints. At the same time, technological advances afford ever more immersive, rich, and student-centered virtual field experiences. Virtual field trips may be the only practical options for most students to explore pedagogically rich but inaccessible places. A mixed-methods research project was conducted on an introductory and an advanced geology class to explore the implications of learning outcomes of in-person and virtual field-based instruction at Grand Canyon National Park. The study incorporated the Great Unconformity in the Grand Canyon, a 1.2-billion-year break in the rock record; the Trail of Time, an interpretive walking timeline; and two immersive, interactive virtual field trips (iVFTs). The in-person field trip (ipFT) groups collectively explored the canyon and took an instructor-guided inquiry hike along the interpretive Trail of Time from rim level, while iVFT students individually explored the canyon and took a guided-inquiry virtual tour of Grand Canyon geology from river level. High-resolution 360° spherical images anchor the iVFTs and serve as a framework for programmed overlays that enable interactivity and allow the iVFT to provide feedback in response to student actions. Students in both modalities received pre- and post-trip Positive and Negative Affect Schedules (PANAS). The iVFT students recorded pre- to post-trip increases in positive affect (PA) scores and decreases in negative affect (NA) scores, representing an affective state conducive to learning. Pre- to post-trip mean scores on concept sketches used to assess visualization and geological knowledge increased for both classes and modalities. However, the iVFT pre- to post-trip increases were three times greater (statistically significant) than the ipFT gains. Both iVFT and ipFT students scored 92-98% on guided-inquiry worksheets completed during the trips, signifying both met learning outcomes. Virtual field trips do not trump traditional in-person field work, but they can meet and/or exceed similar learning objectives and may replace an inaccessible or impractical in-person field trip.

Date Created
  • 2018

The Land of Disenchantment: Bias in New Mexico Teacher Evaluation Measures

Description

Over the past 20 years in the United States (U.S.), teachers have seen a marked shift in how teacher evaluation policies govern the evaluation of their performance. Spurred by federal mandates, teachers have been increasingly held accountable for their students’ academic achievement, most notably through the use of value-added models (VAMs), a statistically complex tool that aims to isolate and then quantify the effect of teachers on their students’ achievement. This increased focus on accountability ultimately resulted in numerous lawsuits across the U.S. where teachers protested what they felt were unfair evaluations informed by invalid, unreliable, and biased measures, most notably VAMs.

While New Mexico’s teacher evaluation system was labeled as a “gold standard” due to its purported ability to objectively and accurately differentiate between effective and ineffective teachers, in 2015, teachers filed suit contesting the fairness and accuracy of their evaluations. Amrein-Beardsley and Geiger’s (revise and resubmit) initial analyses of the state’s teacher evaluation data revealed that the four individual measures comprising teachers’ overall evaluation scores showed evidence of bias, and specifically, teachers who taught in schools with different student body compositions (e.g., special education students, poorer students, gifted students) had significantly different scores than their peers. The purpose of this study was to expand upon these prior analyses by investigating whether those conclusions still held true when controlling for a variety of confounding factors at the school, class, and teacher levels, as such covariates were not included in prior analyses.

Results from multiple linear regression analyses indicated that, overall, the measures used to inform New Mexico teachers’ overall evaluation scores still showed evidence of bias by school-level student demographic factors, with VAMs potentially being the most susceptible and classroom observations being the least. This study is especially unique given the juxtaposition of such a highly touted evaluation system also being one where teachers contested its constitutionality. Study findings are important for all education stakeholders to consider, especially as teacher evaluation systems and related policies continue to be transformed.

Date Created
  • 2020

The factor structure of curriculum-based writing indices at Grades 3, 7, and 10

Description

National assessment data indicate that the large majority of students in America perform below expected proficiency levels in the area of writing. Given the importance of writing skills, this is a significant problem. Curriculum-based measurement, when used for progress monitoring and intervention planning, has been shown to lead to improved academic achievement. However, researchers have not yet been able to establish the validity of curriculum-based measures of writing (CBM-W). This study examined the structural validity of CBM-W using exploratory factor analysis. The participants for this study were 253 third, 154 seventh, and 154 tenth grade students. Each participant completed a 3-minute writing sample in response to a narrative prompt. The writing samples were scored for fifteen different CBM-W indices. Separate analyses were conducted for each grade level to examine differences in the CBM-W construct across grade levels. Due to extreme multicollinearity, principal components analysis rather than common factor analysis was used to examine the structure of writing as measured by CBM-W indices. The overall structure of CBM-W indices was found to remain stable across grade levels. In all cases a three-component solution was supported, with the components being labeled production, accuracy, and sentence complexity. Limitations of the study and implications for progress monitoring with CBM-W are discussed, including the recommendation for a combination of variables that may provide more reliable and valid measurement of the writing construct.

Date Created
  • 2012

Screening in school-wide positive behavior supports: methodological comparisons

Description

Many schools have adopted programming designed to promote students' behavioral aptitude. A specific type of programming with this focus is School Wide Positive Behavior Supports (SWPBS), which combines positive behavior techniques with a system-wide problem-solving model. Aspects of this model are still being developed in the research community, including assessment techniques that aid the decision-making process. Tools for screening entire student populations are examples of such assessment interests. Although screening tools that have been described as "empirically validated" and "cost effective" have been around since at least 1991, they have yet to become standard practice (Lane, Gresham, & O'Shaughnessy, 2002). The lack of widespread implementation to date raises questions regarding their ecological validity and actual cost-effectiveness, leaving the development of useful tools for screening an ongoing project for many researchers. It may be beneficial for educators to expand the range of measurement to include tools that measure the symptoms at the root of the problematic behaviors. Lane, Gresham, and O'Shaughnessy (2002) note the possibility that factors from within a student, including those that are cognitive in nature, may influence not only his or her academic performance, but also aspects of behavior. A line of logic follows wherein measurement of those factors may aid the early identification of students at risk for developing disorders with related symptoms. The validity and practicality of various tools available for screening in SWPBS were investigated, including brief behavior rating scales completed by parents and teachers, as well as performance tasks borrowed from the field of neuropsychology. All instruments showed an ability to predict children's behavior, although not to equal extents. A discussion of practicality and predictive utility of each instrument follows.

Date Created
  • 2012

Native American students' perceptions of high-stakes testing in New Mexico

Description

Given the political and public demands for accountability, using the voices of students from the frontlines, this study investigated student perceptions of whether New Mexico's high-stakes testing program was taking public schools in the right direction. Did the students perceive the program as having an impact on retention, dropouts, or graduation requirements? What were the perceptions of Navajo students in Navajo reservation schools as to the impact of high-stakes testing on their emotional, physical, social, and academic well-being? The specific tests examined for their effects on Native American students were the New Mexico High School Competency Exam (NMHSCE) and the New Mexico Standard Based Assessment (SBA/High School Graduation Assessment). Based on interviews published by the Daily Times of Farmington, New Mexico, our local newspaper, some of the students reported that the testing program was not taking schools in the right direction, that the test was used improperly, and that the one-time test scores were not an accurate assessment of student learning. In addition, they cited negative and positive effects on the curriculum, teaching and learning, and student and teacher motivation. Based on the survey results, the students' concerns about and praise of high-stakes testing were categorized into themes. The positive effects cited included the fact that the testing held students, educators, and parents accountable for their actions. The students were not opposed to accountability, but rather, opposed to the manner in which it was currently implemented. Several implications of these findings were examined: (a) requirements to pass the New Mexico High School Competency Exam; (b) what high-stakes testing meant for the emotional well-being of the students; (c) the impact of sanctions under New Mexico's high-stakes testing proficiency; and (d) the effects of high-stakes tests on students' perceptions, experiences and attitudes.
Student voices are not commonly heard in meetings and discussions about K-12 education policy. Yet, the adults who control policy could learn much from listening to what students have to say about their experiences.

Date Created
  • 2012

Assessment of item parameter drift of known items in a university placement exam

Description

ABSTRACT

This study investigated the possibility of item parameter drift (IPD) in a calculus placement examination administered to approximately 3,000 students at a large university in the United States. A single form of the exam was administered continuously for a period of two years, possibly allowing later examinees to have prior knowledge of specific items on the exam. An analysis of IPD was conducted to explore evidence of possible item exposure. Two assumptions concerning item exposure were made: 1) item recall and item exposure are positively correlated, and 2) item exposure results in the items becoming easier over time. Special consideration was given to two contextual item characteristics: 1) item location within the test, specifically items at the beginning and end of the exam, and 2) the use of an associated diagram. The hypotheses stated that these item characteristics would make the items easier to recall and, therefore, more likely to be exposed, resulting in item drift. BILOG-MG 3 was used to calibrate the items and assess for IPD. No evidence was found to support the hypotheses that the items located at the beginning of the test or with an associated diagram drifted as a result of item exposure. Three items among the last ten on the exam drifted significantly and became easier, consistent with item exposure. However, in this study, the possible effects of item exposure could not be separated from the effects of other potential factors such as speededness, curriculum changes, better test preparation on the part of subsequent examinees, or guessing.

Date Created
  • 2012

Norming at scale: faculty perceptions of assessment culture and student learning outcomes assessment

Description

To foster both external and internal accountability, universities seek more effective models for student learning outcomes assessment (SLOA). Meaningful and authentic measurement of program-level student learning outcomes requires engagement with an institution’s faculty members, especially to gather student performance assessment data using common scoring instruments, or rubrics, across a university’s many colleges and programs. Too often, however, institutions rely on faculty engagement for SLOA initiatives like this without providing necessary support, communication, and training. The resulting data may lack sufficient reliability and reflect deficiencies in an institution’s culture of assessment.

This mixed methods action research study gauged how well one form of SLOA training – a rubric-norming workshop – could affect both inter-rater reliability for faculty scorers and faculty perceptions of SLOA while exploring the nature of faculty collaboration toward a shared understanding of student learning outcomes. The study participants, ten part-time faculty members at the institution, each held primary careers in the health care industry, apart from their secondary role teaching university courses. Accordingly, each contributed expertise and experience to the rubric-norming discussions, surveys of assessment-related perceptions, and individual scoring of student performance with a common rubric. Drawing on sociocultural learning principles and the specific lens of activity theory, influences on faculty SLOA were arranged and analyzed within the heuristic framework of an activity system to discern effects of collaboration and perceptions toward SLOA on consistent rubric-scoring by faculty participants.

Findings suggest participation in the study did not correlate to increased inter-rater reliability for faculty scorers when using the common rubric. Constraints found within assessment tools and unclear institutional leadership prevented more reliable use of common rubrics. Instead, faculty participants resorted to individual assessment approaches to meaningfully guide students to classroom achievement and preparation for careers in the health care field. Despite this, faculty participants valued SLOA, collaborated readily with colleagues for shared assessment goals, and worked hard to teach and assess students meaningfully.

Date Created
  • 2018

Multiple imputation for two-level hierarchical models with categorical variables and missing at random data

Description

Accurate data analysis and interpretation of results may be influenced by many potential factors. The factors of interest in the current work are the chosen analysis model(s), the presence of missing data, and the type(s) of data collected. If analysis models are used which a) do not accurately capture the structure of relationships in the data such as clustered/hierarchical data, b) do not allow or control for missing values present in the data, or c) do not accurately compensate for different data types such as categorical data, then the assumptions associated with the model have not been met and the results of the analysis may be inaccurate. In the presence of clustered/nested data, hierarchical linear modeling or multilevel modeling (MLM; Raudenbush & Bryk, 2002) has the ability to predict outcomes for each level of analysis and across multiple levels (accounting for relationships between levels), providing a significant advantage over single-level analyses. When multilevel data contain missingness, multilevel multiple imputation (MLMI) techniques may be used to model both the missingness and the clustered nature of the data. With categorical multilevel data with missingness, categorical MLMI must be used. Two such routines for MLMI with continuous and categorical data were explored with missing at random (MAR) data: a formal Bayesian imputation and analysis routine in JAGS (R/JAGS) and a common MLM procedure of imputation via Bayesian estimation in BLImP with frequentist analysis of the multilevel model in Mplus (BLImP/Mplus). Manipulated variables included intraclass correlations, number of clusters, and the rate of missingness. Results showed that with continuous data, R/JAGS returned more accurate parameter estimates than BLImP/Mplus for almost all parameters of interest across levels of the manipulated variables. Both R/JAGS and BLImP/Mplus encountered convergence issues and returned inaccurate parameter estimates when imputing and analyzing dichotomous data. Follow-up studies showed that JAGS and BLImP returned similar imputed datasets but the choice of analysis software for MLM impacted the recovery of accurate parameter estimates. Implications of these findings and recommendations for further research will be discussed.

Date Created
  • 2016

Integration of traditional assessment and response to intervention in psychoeducational evaluations of culturally and linguistically diverse students

Description

The popularity of response-to-intervention (RTI) frameworks of service delivery has increased in recent years. Scholars have speculated that RTI may be particularly relevant to the special education assessment process for culturally and linguistically diverse (CLD) students, due to its suspected utility in ruling out linguistic proficiency as the primary factor in learning difficulties. The present study explored how RTI and traditional assessment methods were integrated into the psychoeducational evaluation process for students suspected of having specific learning disabilities (SLD). The content of psychoeducational evaluation reports completed on students who were found eligible for special education services under the SLD category from 2009-2013 was analyzed. Two main research questions were addressed: how RTI influenced the psychoeducational evaluation process, and how this process differed for CLD and non-CLD students. Findings indicated variability in the incorporation of RTI in evaluation reports, with an increase across time in the tendency to reference the prereferral intervention process. However, actual RTI data was present in a minority of reports, with the inclusion of such data more common for reading than other academic areas, as well as more likely for elementary students than secondary students. Contrary to expectations, RTI did not play a larger role in evaluation reports for CLD students than reports for non-CLD students. Evaluations of CLD students also did not demonstrate greater variability in the use of traditional assessments, and were more likely to rely on nonverbal cognitive measures than evaluations of non-CLD students. Methods by which practitioners addressed linguistic proficiency were variable, with parent input, educational history, and individually-administered proficiency test data commonly used. Assessment practices identified in this study are interpreted in the context of best practice recommendations.

Date Created
  • 2014