This collection includes both ASU Theses and Dissertations, submitted by graduate students, and the Barrett, Honors College theses submitted by undergraduate students. 

Displaying 1 - 10 of 105
151718-Thumbnail Image.png
Description
The increasing popularity of Twitter renders improved trustworthiness and relevance assessment of tweets much more important for search. However, given the limitations on the size of tweets, it is hard to extract measures for ranking from the tweet's content alone. I propose a method of ranking tweets by generating a

The increasing popularity of Twitter renders improved trustworthiness and relevance assessment of tweets much more important for search. However, given the limitations on the size of tweets, it is hard to extract measures for ranking from the tweet's content alone. I propose a method of ranking tweets by generating a reputation score for each tweet that is based not just on content, but also additional information from the Twitter ecosystem that consists of users, tweets, and the web pages that tweets link to. This information is obtained by modeling the Twitter ecosystem as a three-layer graph. The reputation score is used to power two novel methods of ranking tweets by propagating the reputation over an agreement graph based on tweets' content similarity. Additionally, I show how the agreement graph helps counter tweet spam. An evaluation of my method on 16~million tweets from the TREC 2011 Microblog Dataset shows that it doubles the precision over baseline Twitter Search and achieves higher precision than current state of the art method. I present a detailed internal empirical evaluation of RAProp in comparison to several alternative approaches proposed by me, as well as external evaluation in comparison to the current state of the art method.
ContributorsRavikumar, Srijith (Author) / Kambhampati, Subbarao (Thesis advisor) / Davulcu, Hasan (Committee member) / Liu, Huan (Committee member) / Arizona State University (Publisher)
Created2013
152236-Thumbnail Image.png
Description
Continuous Delivery, as one of the youngest and most popular member of agile model family, has become a popular concept and method in software development industry recently. Instead of the traditional software development method, which requirements and solutions must be fixed before starting software developing, it promotes adaptive planning, evolutionary

Continuous Delivery, as one of the youngest and most popular member of agile model family, has become a popular concept and method in software development industry recently. Instead of the traditional software development method, which requirements and solutions must be fixed before starting software developing, it promotes adaptive planning, evolutionary development and delivery, and encourages rapid and flexible response to change. However, several problems prevent Continuous Delivery to be introduced into education world. Taking into the consideration of the barriers, we propose a new Cloud based Continuous Delivery Software Developing System. This system is designed to fully utilize the whole life circle of software developing according to Continuous Delivery concepts in a virtualized environment in Vlab platform.
ContributorsDeng, Yuli (Author) / Huang, Dijiang (Thesis advisor) / Davulcu, Hasan (Committee member) / Chen, Yinong (Committee member) / Arizona State University (Publisher)
Created2013
151371-Thumbnail Image.png
Description
This dissertation presents the Temporal Event Query Language (TEQL), a new language for querying event streams. Event Stream Processing enables online querying of streams of events to extract relevant data in a timely manner. TEQL enables querying of interval-based event streams using temporal database operators. Temporal databases and temporal query

This dissertation presents the Temporal Event Query Language (TEQL), a new language for querying event streams. Event Stream Processing enables online querying of streams of events to extract relevant data in a timely manner. TEQL enables querying of interval-based event streams using temporal database operators. Temporal databases and temporal query languages have been a subject of research for more than 30 years and are a natural fit for expressing queries that involve a temporal dimension. However, operators developed in this context cannot be directly applied to event streams. The research extends a preexisting relational framework for event stream processing to support temporal queries. The language features and formal semantic extensions to extend the relational framework are identified. The extended framework supports continuous, step-wise evaluation of temporal queries. The incremental evaluation of TEQL operators is formalized to avoid re-computation of previous results. The research includes the development of a prototype that supports the integrated event and temporal query processing framework, with support for incremental evaluation and materialization of intermediate results. TEQL enables reporting temporal data in the output, direct specification of conditions over timestamps, and specification of temporal relational operators. Through the integration of temporal database operators with event languages, a new class of temporal queries is made possible for querying event streams. New features include semantic aggregation, extraction of temporal patterns using set operators, and a more accurate specification of event co-occurrence.
ContributorsShiva, Foruhar Ali (Author) / Urban, Susan D (Thesis advisor) / Chen, Yi (Thesis advisor) / Davulcu, Hasan (Committee member) / Sarjoughian, Hessam S. (Committee member) / Arizona State University (Publisher)
Created2012
151275-Thumbnail Image.png
Description
The pay-as-you-go economic model of cloud computing increases the visibility, traceability, and verifiability of software costs. Application developers must understand how their software uses resources when running in the cloud in order to stay within budgeted costs and/or produce expected profits. Cloud computing's unique economic model also leads naturally to

The pay-as-you-go economic model of cloud computing increases the visibility, traceability, and verifiability of software costs. Application developers must understand how their software uses resources when running in the cloud in order to stay within budgeted costs and/or produce expected profits. Cloud computing's unique economic model also leads naturally to an earn-as-you-go profit model for many cloud based applications. These applications can benefit from low level analyses for cost optimization and verification. Testing cloud applications to ensure they meet monetary cost objectives has not been well explored in the current literature. When considering revenues and costs for cloud applications, the resource economic model can be scaled down to the transaction level in order to associate source code with costs incurred while running in the cloud. Both static and dynamic analysis techniques can be developed and applied to understand how and where cloud applications incur costs. Such analyses can help optimize (i.e. minimize) costs and verify that they stay within expected tolerances. An adaptation of Worst Case Execution Time (WCET) analysis is presented here to statically determine worst case monetary costs of cloud applications. This analysis is used to produce an algorithm for determining control flow paths within an application that can exceed a given cost threshold. The corresponding results are used to identify path sections that contribute most to cost excess. A hybrid approach for determining cost excesses is also presented that is comprised mostly of dynamic measurements but that also incorporates calculations that are based on the static analysis approach. This approach uses operational profiles to increase the precision and usefulness of the calculations.
ContributorsBuell, Kevin, Ph.D (Author) / Collofello, James (Thesis advisor) / Davulcu, Hasan (Committee member) / Lindquist, Timothy (Committee member) / Sen, Arunabha (Committee member) / Arizona State University (Publisher)
Created2012
151467-Thumbnail Image.png
Description
A semiconductor supply chain modeling and simulation platform using Linear Program (LP) optimization and parallel Discrete Event System Specification (DEVS) process models has been developed in a joint effort by ASU and Intel Corporation. A Knowledge Interchange Broker (KIBDEVS/LP) was developed to broker information synchronously between the DEVS and LP

A semiconductor supply chain modeling and simulation platform using Linear Program (LP) optimization and parallel Discrete Event System Specification (DEVS) process models has been developed in a joint effort by ASU and Intel Corporation. A Knowledge Interchange Broker (KIBDEVS/LP) was developed to broker information synchronously between the DEVS and LP models. Recently a single-echelon heuristic Inventory Strategy Module (ISM) was added to correct for forecast bias in customer demand data using different smoothing techniques. The optimization model could then use information provided by the forecast model to make better decisions for the process model. The composition of ISM with LP and DEVS models resulted in the first realization of what is now called the Optimization Simulation Forecast (OSF) platform. It could handle a single echelon supply chain system consisting of single hubs and single products In this thesis, this single-echelon simulation platform is extended to handle multiple echelons with multiple inventory elements handling multiple products. The main aspect for the multi-echelon OSF platform was to extend the KIBDEVS/LP such that ISM interactions with the LP and DEVS models could also be supported. To achieve this, a new, scalable XML schema for the KIB has been developed. The XML schema has also resulted in strengthening the KIB execution engine design. A sequential scheme controls the executions of the DEVS-Suite simulator, CPLEX optimizer, and ISM engine. To use the ISM for multiple echelons, it is extended to compute forecast customer demands and safety stocks over multiple hubs and products. Basic examples for semiconductor manufacturing spanning single and two echelon supply chain systems have been developed and analyzed. Experiments using perfect data were conducted to show the correctness of the OSF platform design and implementation. Simple, but realistic experiments have also been conducted. They highlight the kinds of supply chain dynamics that can be evaluated using discrete event process simulation, linear programming optimization, and heuristics forecasting models.
ContributorsSmith, James Melkon (Author) / Sarjoughian, Hessam S. (Thesis advisor) / Davulcu, Hasan (Committee member) / Fainekos, Georgios (Committee member) / Arizona State University (Publisher)
Created2012
151524-Thumbnail Image.png
Description
Process migration is a heavily studied research area and has a number of applications in distributed systems. Process migration means transferring a process running on one machine to another such that it resumes execution from the point at which it was suspended. The conventional approach to implement process migration is

Process migration is a heavily studied research area and has a number of applications in distributed systems. Process migration means transferring a process running on one machine to another such that it resumes execution from the point at which it was suspended. The conventional approach to implement process migration is to move the entire state information of the process (including hardware context, virtual memory, files etc.) from one machine to another. Copying all the state information is costly. This thesis proposes and demonstrates a new approach of migrating a process between two cores of Intel Single Chip Cloud (SCC), an experimental 48-core processor by Intel, with each core running a separate instance of the operating system. In this method the amount of process state to be transferred from one core's memory to another is reduced by making use of special registers called Lookup tables (LUTs) present on each core of SCC. Thus this new approach is faster than the conventional method.
ContributorsJain, Vaibhav (Author) / Dasgupta, Partha (Thesis advisor) / Shriavstava, Aviral (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)
Created2013
152310-Thumbnail Image.png
Description
The wide adoption and continued advancement of information and communications technologies (ICT) have made it easier than ever for individuals and groups to stay connected over long distances. These advances have greatly contributed in dramatically changing the dynamics of the modern day workplace to the point where it is now

The wide adoption and continued advancement of information and communications technologies (ICT) have made it easier than ever for individuals and groups to stay connected over long distances. These advances have greatly contributed in dramatically changing the dynamics of the modern day workplace to the point where it is now commonplace to see large, distributed multidisciplinary teams working together on a daily basis. However, in this environment, motivating, understanding, and valuing the diverse contributions of individual workers in collaborative enterprises becomes challenging. To address these issues, this thesis presents the goals, design, and implementation of Taskville, a distributed workplace game played by teams on large, public displays. Taskville uses a city building metaphor to represent the completion of individual and group tasks within an organization. Promising results from two usability studies and two longitudinal studies at a multidisciplinary school demonstrate that Taskville supports personal reflection and improves team awareness through an engaging workplace activity.
ContributorsNikkila, Shawn (Author) / Sundaram, Hari (Thesis advisor) / Byrne, Daragh (Committee member) / Davulcu, Hasan (Committee member) / Olson, Loren (Committee member) / Arizona State University (Publisher)
Created2013
152158-Thumbnail Image.png
Description
Most data cleaning systems aim to go from a given deterministic dirty database to another deterministic but clean database. Such an enterprise pre–supposes that it is in fact possible for the cleaning process to uniquely recover the clean versions of each dirty data tuple. This is not possible in many

Most data cleaning systems aim to go from a given deterministic dirty database to another deterministic but clean database. Such an enterprise pre–supposes that it is in fact possible for the cleaning process to uniquely recover the clean versions of each dirty data tuple. This is not possible in many cases, where the most a cleaning system can do is to generate a (hopefully small) set of clean candidates for each dirty tuple. When the cleaning system is required to output a deterministic database, it is forced to pick one clean candidate (say the "most likely" candidate) per tuple. Such an approach can lead to loss of information. For example, consider a situation where there are three equally likely clean candidates of a dirty tuple. An appealing alternative that avoids such an information loss is to abandon the requirement that the output database be deterministic. In other words, even though the input (dirty) database is deterministic, I allow the reconstructed database to be probabilistic. Although such an approach does avoid the information loss, it also brings forth several challenges. For example, how many alternatives should be kept per tuple in the reconstructed database? Maintaining too many alternatives increases the size of the reconstructed database, and hence the query processing time. Second, while processing queries on the probabilistic database may well increase recall, how would they affect the precision of the query processing? In this thesis, I investigate these questions. My investigation is done in the context of a data cleaning system called BayesWipe that has the capability of producing multiple clean candidates per each dirty tuple, along with the probability that they are the correct cleaned version. I represent these alternatives as tuples in a tuple disjoint probabilistic database, and use the Mystiq system to process queries on it. This probabilistic reconstruction (called BayesWipe–PDB) is compared to a deterministic reconstruction (called BayesWipe–DET)—where the most likely clean candidate for each tuple is chosen, and the rest of the alternatives discarded.
ContributorsRihan, Preet Inder Singh (Author) / Kambhampati, Subbarao (Thesis advisor) / Liu, Huan (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)
Created2013
152168-Thumbnail Image.png
Description
There has been a lot of research in the field of artificial intelligence about thinking machines. Alan Turing proposed a test to observe a machine's intelligent behaviour with respect to natural language conversation. The Winograd schema challenge is suggested as an alternative, to the Turing test. It needs inferencing capabilities,

There has been a lot of research in the field of artificial intelligence about thinking machines. Alan Turing proposed a test to observe a machine's intelligent behaviour with respect to natural language conversation. The Winograd schema challenge is suggested as an alternative, to the Turing test. It needs inferencing capabilities, reasoning abilities and background knowledge to get the answer right. It involves a coreference resolution task in which a machine is given a sentence containing a situation which involves two entities, one pronoun and some more information about the situation and the machine has to come up with the right resolution of a pronoun to one of the entities. The complexity of the task is increased with the fact that the Winograd sentences are not constrained by one domain or specific sentence structure and it also contains a lot of human proper names. This modification makes the task of association of entities, to one particular word in the sentence, to derive the answer, difficult. I have developed a pronoun resolver system for the confined domain Winograd sentences. I have developed a classifier or filter which takes input sentences and decides to accept or reject them based on a particular criteria. Once the sentence is accepted. I run parsers on it to obtain the detailed analysis. Furthermore I have developed four answering modules which use world knowledge and inferencing mechanisms to try and resolve the pronoun. The four techniques I use are : ConceptNet knowledgebase, Search engine pattern counts,Narrative event chains and sentiment analysis. I have developed a particular aggregation mechanism for the answers from these modules to arrive at a final answer. I have used caching technique for the association relations that I obtain for different modules, so as to boost the performance. I run my system on the standard ‘nyu dataset’ of Winograd sentences and questions. This dataset is then restricted, by my classifier, to 90 sentences. I evaluate my system on this 90 sentence dataset. When I compare my results against the state of the art system on the same dataset, I get nearly 4.5 % improvement in the restricted domain.
ContributorsBudukh, Tejas Ulhas (Author) / Baral, Chitta (Thesis advisor) / VanLehn, Kurt (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)
Created2013
152428-Thumbnail Image.png
Description
Biological organisms are made up of cells containing numerous interconnected biochemical processes. Diseases occur when normal functionality of these processes is disrupted, manifesting as disease symptoms. Thus, understanding these biochemical processes and their interrelationships is a primary task in biomedical research and a prerequisite for activities including diagnosing diseases and

Biological organisms are made up of cells containing numerous interconnected biochemical processes. Diseases occur when normal functionality of these processes is disrupted, manifesting as disease symptoms. Thus, understanding these biochemical processes and their interrelationships is a primary task in biomedical research and a prerequisite for activities including diagnosing diseases and drug development. Scientists studying these interconnected processes have identified various pathways involved in drug metabolism, diseases, and signal transduction, etc. High-throughput technologies, new algorithms and speed improvements over the last decade have resulted in deeper knowledge about biological systems, leading to more refined pathways. Such pathways tend to be large and complex, making it difficult for an individual to remember all aspects. Thus, computer models are needed to represent and analyze them. The refinement activity itself requires reasoning with a pathway model by posing queries against it and comparing the results against the real biological system. Many existing models focus on structural and/or factoid questions, relying on surface-level information. These are generally not the kind of questions that a biologist may ask someone to test their understanding of biological processes. Examples of questions requiring understanding of biological processes are available in introductory college level biology text books. Such questions serve as a model for the question answering system developed in this thesis. Thus, the main goal of this thesis is to develop a system that allows the encoding of knowledge about biological pathways to answer questions demonstrating understanding of the pathways. To that end, a language is developed to specify a pathway and pose questions against it. Some existing tools are modified and used to accomplish this goal. The utility of the framework developed in this thesis is illustrated with applications in the biological domain. Finally, the question answering system is used in real world applications by extracting pathway knowledge from text and answering questions related to drug development.
ContributorsAnwar, Saadat (Author) / Baral, Chitta (Thesis advisor) / Inoue, Katsumi (Committee member) / Chen, Yi (Committee member) / Davulcu, Hasan (Committee member) / Lee, Joohyung (Committee member) / Arizona State University (Publisher)
Created2014