ASU Electronic Theses and Dissertations
This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations/theses includes degree information, committee members, an abstract, supporting data or media.
In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.
Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection contact or visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.
This thesis develops a unique type of textual features that generalize triplets extracted from text, by clustering them into high-level concepts. These concepts are utilized as features to detect frames in text. Compared to uni-gram and bi-gram based models, classification and clustering using generalized concepts yield better discriminating features and a higher classification accuracy with a 12% boost (i.e. from 74% to 83% F-measure) and 0.91 clustering purity for Frame/Non-Frame detection.
The automatic discovery of complex causal chains among interlinked events and their participating actors has not yet been thoroughly studied. Previous studies related to extracting causal relationships from text were based on laborious and incomplete hand-developed lists of explicit causal verbs, such as “causes" and “results in." Such approaches result in limited recall because standard causal verbs may not generalize well to accommodate surface variations in texts when different keywords and phrases are used to express similar causal effects. Therefore, I present a system that utilizes generalized concepts to extract causal relationships. The proposed algorithms overcome surface variations in written expressions of causal relationships and discover the domino effects between climate events and human security. This semi-supervised approach alleviates the need for labor intensive keyword list development and annotated datasets. Experimental evaluations by domain experts achieve an average precision of 82%. Qualitative assessments of causal chains show that results are consistent with the 2014 IPCC report illuminating causal mechanisms underlying the linkages between climatic stresses and social instability.
In this dissertation, I seek to advance both the knowledge of limitations in current technologies used in practice as well as the mechanisms that can be used for large-scale support. The overall research question I explore is: “How can we support large-scale creative collaboration in distributed online communities?” I first advance existing support techniques by evaluating the impact of active support in brainstorming performance. Furthermore, I leverage existing theoretical models of individual idea generation as well as recommender system techniques to design CrowdMuse, a novel adaptive large-scale idea generation system. CrowdMuse models users in order to adapt itself to each individual. I evaluate the system’s efficacy through two large-scale studies. I also advance knowledge of current large-scale practices by examining common communication channels under the lens of Creativity Support Tools, yielding a list of creativity bottlenecks brought about by the affordances of these channels. Finally, I connect both ends of this dissertation by deploying CrowdMuse in an Open Source online community for two weeks. I evaluate their usage of the system as well as its perceived benefits and issues compared to traditional communication tools.
This dissertation makes the following contributions to the field of large-scale creativity: 1) the design and evaluation of a first-of-its-kind adaptive brainstorming system; 2) the evaluation of the effects of active inspirations compared to simple idea exposure; 3) the development and application of a set of creativity support design heuristics to uncover creativity bottlenecks; and 4) an exploration of large-scale brainstorming systems’ usefulness to online communities.
This dissertation has laid the groundwork for studying these relationships and applying them to three real-world problems. In criminal modeling, inductive and deductive reasonings are applied to early prediction of violent criminal gang members. To address this problem the features derived from the co-arrestee social network as well as geographical and temporal features are leveraged. Then, a data-driven variant of geospatial abductive inference is studied in missing person problem to locate the missing person. Finally, induction and abduction reasonings are studied for identifying pathogenic accounts of a cascade in social networks.
In this dissertation, three methods that facilitate the testing and verification process for CPS are presented:
1. A graphical formalism and tool which enables the elicitation of formal requirements. To evaluate the performance of the tool, a usability study is conducted.
2. A parameter mining method to infer, analyze, and visually represent falsifying ranges for parametrized system specifications.
3. A notion of conformance between a CPS model and implementation along with a testing framework.
The methods are evaluated over high-fidelity case studies from the industry.
The new spatially explicit counterfactual framework considers how spatial effects impact treatment choice, treatment variation, and treatment effects. To illustrate this new methodological framework, I first replicate a classic quasi-experimental study that evaluates the effect of drinking age policy on mortality in the United States from 1970 to 1984, and further extend it with a spatial perspective. In another example, I evaluate food access dynamics in Chicago from 2007 to 2014 by implementing advanced spatial analytics that better account for the complex patterns of food access, and quasi-experimental research design to distill the impact of the Great Recession on the foodscape. Inference interpretation is sensitive to both research design framing and underlying processes that drive geographically distributed relationships. Finally, I advance a new Spatial Data Science Infrastructure to integrate and manage data in dynamic, open environments for public health systems research and decision- making. I demonstrate an infrastructure prototype in a final case study, developed in collaboration with health department officials and community organizations.
Researchers and practitioners use social media to extract actionable patterns such as where aid should be distributed in a crisis. However, the validity of these patterns relies on having a representative dataset. As this dissertation shows, the data collected from social media is seldom representative of the activity of the site itself, and less so of human activity. This means that the results of many studies are limited by the quality of data they collect.
The finding that social media data is biased inspires the main challenge addressed by this thesis. I introduce three sets of methodologies to correct for bias. First, I design methods to deal with data collection bias. I offer a methodology which can find bias within a social media dataset. This methodology works by comparing the collected data with other sources to find bias in a stream. The dissertation also outlines a data collection strategy which minimizes the amount of bias that will appear in a given dataset. It introduces a crawling strategy which mitigates the amount of bias in the resulting dataset. Second, I introduce a methodology to identify bots and shills within a social media dataset. This directly addresses the concern that the users of a social media site are not representative. Applying these methodologies allows the population under study on a social media site to better match that of the real world. Finally, the dissertation discusses perceptual biases, explains how they affect analysis, and introduces computational approaches to mitigate them.
The results of the dissertation allow for the discovery and removal of different levels of bias within a social media dataset. This has important implications for social media mining, namely that the behavioral patterns and insights extracted from social media will be more representative of the populations under study.