Description
ABSTRACT

This study examines validity evidence of a state policy-directed teacher evaluation system implemented in Arizona during school year 2012-2013. The purpose was to evaluate the warrant for making high stakes, consequential judgments of teacher competence based on value-added (VAM) estimates of instructional impact and observations of professional practice (PP). The research also explores educator influence (voice) in evaluation design and the role information brokers have in local decision making. Findings are situated in an evidentiary and policy context at both the LEA and state policy levels.

The study employs a single-phase, concurrent, mixed-methods research design triangulating multiple sources of qualitative and quantitative evidence onto a single (unified) validation construct: Teacher Instructional Quality. It focuses on assessing the characteristics of metrics used to construct quantitative ratings of instructional competence and the alignment of stakeholder perspectives to facets implicit in the evaluation framework. Validity examinations include assembly of criterion, content, reliability, consequential and construct articulation evidences. Perceptual perspectives were obtained from teachers, principals, district leadership, and state policy decision makers. Data for this study came from a large suburban public school district in metropolitan Phoenix, Arizona.

Study findings suggest that the evaluation framework is insufficient for supporting high stakes, consequential inferences of teacher instructional quality. This is based, in part, on the following: (1) Weak associations between VAM and PP metrics; (2) Unstable VAM measures across time and between tested content areas; (3) Less than adequate scale reliabilities; (4) Lack of coherence between theorized and empirical PP factor structures; (5) Omission/underrepresentation of important instructional attributes/effects; (6) Stakeholder concerns over rater consistency, bias, and the inability of test scores to adequately represent instructional competence; (7) Negative sentiments regarding the system's ability to improve instructional competence and/or student learning; (8) Concerns regarding unintended consequences including increased stress, lower morale, harm to professional identity, and restricted learning opportunities; and (9) The general lack of empowerment and educator exclusion from the decision making process. Study findings also highlight the value of information brokers in policy decision making and the importance of having access to unbiased empirical information during the design and implementation phases of important change initiatives.
Contributors: Sloat, Edward F. (Author) / Wetzel, Keith (Thesis advisor) / Amrein-Beardsley, Audrey (Thesis advisor) / Ewbank, Ann (Committee member) / Shough, Lori (Committee member) / Arizona State University (Publisher)
Created: 2015
Description
There is a serious need for early childhood intervention practices for children who are living at or below the poverty line. Since 1965 Head Start has provided a federally funded, free preschool program for children in this population. The City of Phoenix Head Start program consists of nine delegate agencies, seven of which reside in school districts. These agencies are currently not conducting local longitudinal evaluations of their preschool graduates. The purpose of this study was to recommend initial steps the City of Phoenix grantee and the delegate agencies can take to begin a longitudinal evaluation process of their Head Start programs. Seven City of Phoenix Head Start agency directors were interviewed. These interviews provided information about the attitudes of the directors when considering longitudinal evaluations and how Head Start already evaluates their programs through internal assessments. The researcher also took notes on the Third Grade Follow-Up to the Head Start Executive Summary in order to make recommendations to the City of Phoenix Head Start programs about the best practices for longitudinal student evaluations.
Created: 2014-05
Description

Preventing heat-associated morbidity and mortality is a public health priority in Maricopa County, Arizona (United States). The objective of this project was to evaluate Maricopa County cooling centers and gain insight into their capacity to provide relief for the public during extreme heat events. During the summer of 2014, 53 cooling centers were evaluated to assess facility and visitor characteristics. Maricopa County staff collected data by directly observing daily operations and by surveying managers and visitors. The cooling centers in Maricopa County were often housed within community, senior, or religious centers, which offered various services for at least 1500 individuals daily. Many visitors were unemployed and/or homeless. Many learned about a cooling center by word of mouth or by having seen the cooling center’s location. The cooling centers provide a valuable service and reach some of the region’s most vulnerable populations. This project is among the first to systematically evaluate cooling centers from a public health perspective and provides helpful insight to community leaders who are implementing or improving their own network of cooling centers.

Contributors: Berisha, Vjollca (Author) / Hondula, David M. (Author) / Roach, Matthew (Author) / White, Jessica R. (Author) / McKinney, Benita (Author) / Bentz, Darcie (Author) / Mohamed, Ahmed (Author) / Uebelherr, Joshua (Author) / Goodin, Kate (Author)
Created: 2016-09-23
Description
In this mixed-methods study, I sought to design and develop a test delivery method to reduce linguistic bias in English-based mathematics tests. Guided by translanguaging, a recent linguistic theory recognizing the complexity of multilingualism, I designed a computer-based test delivery method allowing test-takers to toggle between English and their self-identified dominant language. This three-part study asks and answers research questions from all phases of the novel test delivery design. In the first phase, I conducted cognitive interviews with 11 dominant speakers of Mandarin Chinese and 11 Spanish-dominant undergraduate students while they took a well-regarded calculus conceptual exam, the Precalculus Concept Assessment (PCA). In the second phase, I designed and developed the linguistically adaptive test (LAT) version of the PCA using the Concerto test delivery platform. In the third phase, I conducted a within-subjects random-assignment study of the efficacy of the LAT. I also conducted in-depth interviews with a subset of the test-takers. Nine items on the PCA revealed linguistic issues during the cognitive interviews, demonstrating the need to reduce linguistic bias in the test items. Additionally, the newly developed LAT demonstrated evidence of reliability and validity. However, the large-scale efficacy study showed that the LAT did not appear to make a significant difference in scores for dominant speakers of Spanish or dominant speakers of Mandarin Chinese. This finding held true for overall test scores as well as at the item level, indicating that the LAT test delivery system does not appear to reduce linguistic bias in testing. Additionally, in-depth interviews revealed that many students felt that the linguistically adaptive test was either the same or essentially the same as the non-LAT version of the test. Some participants felt that the toggle button was not necessary if they could understand the mathematics item well enough. As one participant noted, “It's math, It's math. It doesn't matter if it's in English or in Spanish.” This dissertation concludes with a discussion of the implications for test developers and suggestions for future directions of study.
Contributors: Close, Kevin (Author) / Zheng, Yi (Thesis advisor) / Amrein-Beardsley, Audrey (Thesis advisor) / Anderson, Kate (Committee member) / Arizona State University (Publisher)
Created: 2021
Description
This study investigated the impact of learning about cultural intelligence (CQ) from senior U.S. Army Special Forces leaders (Group Commanders and Group Command Sergeants Major) on aspiring Special Forces Captains (students) at the Captains Career Course. Three research questions addressed the influence of senior leader interventions on students’ CQ scores, motivation to work with partner forces, and intentions to improve CQ. The study involved quantitative and qualitative data for each of the three comparison groups: control, face-to-face (in-person interaction with senior leaders), and podcast (audio-only recordings). The quantitative data measured CQ capabilities of motivation, cognition, metacognition, and behavior. Descriptive statistics revealed that from the pre-test to the post-test, the control and podcast groups experienced increased self-assessment scores on all four constructs but decreased observer assessment scores. By contrast, the face-to-face group experienced both a decrease in observer assessment scores as well as a marginal decrease in self-assessment scores (on motivation and metacognition). Exploring motivation to work with partner forces, analysis of the group interview transcripts revealed that the control group attributed their motivation primarily to their prior experiences, while participants in the face-to-face group reported mixed feelings regarding prior experiences but highlighted the impact of senior Special Forces leaders' stories on their motivation. The podcast group credited their course experience and the senior leaders' narratives for their increased motivation. Examining the influence of senior leader stories on intent to improve CQ, the control group provided generic responses focused on improving cognition. The face-to-face group offered more specific, action-oriented answers emphasizing business systems, sociolinguistics, and cultural values. The podcast group produced varying responses, with some sharing basic intent and others detailing specific strategies such as language fluency and cultural immersion. Participants across all three groups expressed a strong intention to seek out mentorship and stories from experienced individuals. In conclusion, this study highlights the myriad influences on aspiring Special Forces Captains' CQ and the multifaceted impact of senior Special Forces leaders' stories. The narratives contributed to increased motivation, deeper understanding of the Special Forces mission, and specific strategies for improving CQ, providing valuable insights for military education and training programs.
Contributors: Kohistany, Mahboba Lyla (Author) / Dorn, Sherman (Thesis advisor) / Livermore, David (Committee member) / Amrein-Beardsley, Audrey (Committee member) / Arizona State University (Publisher)
Created: 2023
Description
Over the past 20 years in the United States (U.S.), teachers have seen a marked shift in how teacher evaluation policies govern the evaluation of their performance. Spurred by federal mandates, teachers have been increasingly held accountable for their students’ academic achievement, most notably through the use of value-added models (VAMs), a statistically complex tool that aims to isolate and then quantify the effect of teachers on their students’ achievement. This increased focus on accountability ultimately resulted in numerous lawsuits across the U.S. where teachers protested what they felt were unfair evaluations informed by invalid, unreliable, and biased measures, most notably VAMs.

While New Mexico’s teacher evaluation system was labeled as a “gold standard” due to its purported ability to objectively and accurately differentiate between effective and ineffective teachers, in 2015, teachers filed suit contesting the fairness and accuracy of their evaluations. Amrein-Beardsley and Geiger’s (revise and resubmit) initial analyses of the state’s teacher evaluation data revealed that the four individual measures comprising teachers’ overall evaluation scores showed evidence of bias, and specifically, teachers who taught in schools with different student body compositions (e.g., special education students, poorer students, gifted students) had significantly different scores than their peers. The purpose of this study was to expand upon these prior analyses by investigating whether those conclusions still held true when controlling for a variety of confounding factors at the school, class, and teacher levels, as such covariates were not included in prior analyses.

Results from multiple linear regression analyses indicated that, overall, the measures used to inform New Mexico teachers’ overall evaluation scores still showed evidence of bias by school-level student demographic factors, with VAMs potentially being the most susceptible and classroom observations being the least. This study is especially unique given the juxtaposition of such a highly touted evaluation system also being one where teachers contested its constitutionality. Study findings are important for all education stakeholders to consider, especially as teacher evaluation systems and related policies continue to be transformed.
Contributors: Geiger, Tray (Author) / Amrein-Beardsley, Audrey (Thesis advisor) / Anderson, Kate (Committee member) / McGuire, Keon (Committee member) / Holloway, Jessica (Committee member) / Arizona State University (Publisher)
Created: 2020