Matching Items (298)
Filtering by

Clear all filters

151716-Thumbnail Image.png
Description
The rapid escalation of technology and the widespread emergence of modern technological equipments have resulted in the generation of humongous amounts of digital data (in the form of images, videos and text). This has expanded the possibility of solving real world problems using computational learning frameworks. However, while gathering a

The rapid escalation of technology and the widespread emergence of modern technological equipments have resulted in the generation of humongous amounts of digital data (in the form of images, videos and text). This has expanded the possibility of solving real world problems using computational learning frameworks. However, while gathering a large amount of data is cheap and easy, annotating them with class labels is an expensive process in terms of time, labor and human expertise. This has paved the way for research in the field of active learning. Such algorithms automatically select the salient and exemplar instances from large quantities of unlabeled data and are effective in reducing human labeling effort in inducing classification models. To utilize the possible presence of multiple labeling agents, there have been attempts towards a batch mode form of active learning, where a batch of data instances is selected simultaneously for manual annotation. This dissertation is aimed at the development of novel batch mode active learning algorithms to reduce manual effort in training classification models in real world multimedia pattern recognition applications. Four major contributions are proposed in this work: $(i)$ a framework for dynamic batch mode active learning, where the batch size and the specific data instances to be queried are selected adaptively through a single formulation, based on the complexity of the data stream in question, $(ii)$ a batch mode active learning strategy for fuzzy label classification problems, where there is an inherent imprecision and vagueness in the class label definitions, $(iii)$ batch mode active learning algorithms based on convex relaxations of an NP-hard integer quadratic programming (IQP) problem, with guaranteed bounds on the solution quality and $(iv)$ an active matrix completion algorithm and its application to solve several variants of the active learning problem (transductive active learning, multi-label active learning, active feature acquisition and active learning for regression). These contributions are validated on the face recognition and facial expression recognition problems (which are commonly encountered in real world applications like robotics, security and assistive technology for the blind and the visually impaired) and also on collaborative filtering applications like movie recommendation.
ContributorsChakraborty, Shayok (Author) / Panchanathan, Sethuraman (Thesis advisor) / Balasubramanian, Vineeth N. (Committee member) / Li, Baoxin (Committee member) / Mittelmann, Hans (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)
Created2013
151718-Thumbnail Image.png
Description
The increasing popularity of Twitter renders improved trustworthiness and relevance assessment of tweets much more important for search. However, given the limitations on the size of tweets, it is hard to extract measures for ranking from the tweet's content alone. I propose a method of ranking tweets by generating a

The increasing popularity of Twitter renders improved trustworthiness and relevance assessment of tweets much more important for search. However, given the limitations on the size of tweets, it is hard to extract measures for ranking from the tweet's content alone. I propose a method of ranking tweets by generating a reputation score for each tweet that is based not just on content, but also additional information from the Twitter ecosystem that consists of users, tweets, and the web pages that tweets link to. This information is obtained by modeling the Twitter ecosystem as a three-layer graph. The reputation score is used to power two novel methods of ranking tweets by propagating the reputation over an agreement graph based on tweets' content similarity. Additionally, I show how the agreement graph helps counter tweet spam. An evaluation of my method on 16~million tweets from the TREC 2011 Microblog Dataset shows that it doubles the precision over baseline Twitter Search and achieves higher precision than current state of the art method. I present a detailed internal empirical evaluation of RAProp in comparison to several alternative approaches proposed by me, as well as external evaluation in comparison to the current state of the art method.
ContributorsRavikumar, Srijith (Author) / Kambhampati, Subbarao (Thesis advisor) / Davulcu, Hasan (Committee member) / Liu, Huan (Committee member) / Arizona State University (Publisher)
Created2013
151780-Thumbnail Image.png
Description
Objective of this thesis project is to build a prototype using Linear Temporal Logic specifications for generating a 2D motion plan commanding an iRobot to fulfill the specifications. This thesis project was created for Cyber Physical Systems Lab in Arizona State University. The end product of this thesis is creation

Objective of this thesis project is to build a prototype using Linear Temporal Logic specifications for generating a 2D motion plan commanding an iRobot to fulfill the specifications. This thesis project was created for Cyber Physical Systems Lab in Arizona State University. The end product of this thesis is creation of a software solution which can be used in the academia and industry for research in cyber physical systems related applications. The major features of the project are: creating a modular system for motion planning, use of Robot Operating System (ROS), use of triangulation for environment decomposition and using stargazer sensor for localization. The project is built on an open source software called ROS which provides an environment where it is very easy to integrate different modules be it software or hardware on a Linux based platform. Use of ROS implies the project or its modules can be adapted quickly for different applications as the need arises. The final software package created and tested takes a data file as its input which contains the LTL specifications, a symbols list used in the LTL and finally the environment polygon data containing real world coordinates for all polygons and also information on neighbors and parents of each polygon. The software package successfully ran the experiment of coverage, reachability with avoidance and sequencing.
ContributorsPandya, Parth (Author) / Fainekos, Georgios (Thesis advisor) / Dasgupta, Partha (Committee member) / Lee, Yann-Hang (Committee member) / Arizona State University (Publisher)
Created2013
151793-Thumbnail Image.png
Description
Linear Temporal Logic is gaining increasing popularity as a high level specification language for robot motion planning due to its expressive power and scalability of LTL control synthesis algorithms. This formalism, however, requires expert knowledge and makes it inaccessible to non-expert users. This thesis introduces a graphical specification environment to

Linear Temporal Logic is gaining increasing popularity as a high level specification language for robot motion planning due to its expressive power and scalability of LTL control synthesis algorithms. This formalism, however, requires expert knowledge and makes it inaccessible to non-expert users. This thesis introduces a graphical specification environment to create high level motion plans to control robots in the field by converting a visual representation of the motion/task plan into a Linear Temporal Logic (LTL) specification. The visual interface is built on the Android tablet platform and provides functionality to create task plans through a set of well defined gestures and on screen controls. It uses the notion of waypoints to quickly and efficiently describe the motion plan and enables a variety of complex Linear Temporal Logic specifications to be described succinctly and intuitively by the user without the need for the knowledge and understanding of LTL specification. Thus, it opens avenues for its use by personnel in military, warehouse management, and search and rescue missions. This thesis describes the construction of LTL for various scenarios used for robot navigation using the visual interface developed and leverages the use of existing LTL based motion planners to carry out the task plan by a robot.
ContributorsSrinivas, Shashank (Author) / Fainekos, Georgios (Thesis advisor) / Baral, Chitta (Committee member) / Burleson, Winslow (Committee member) / Arizona State University (Publisher)
Created2013
151653-Thumbnail Image.png
Description
Answer Set Programming (ASP) is one of the most prominent and successful knowledge representation paradigms. The success of ASP is due to its expressive non-monotonic modeling language and its efficient computational methods originating from building propositional satisfiability solvers. The wide adoption of ASP has motivated several extensions to its modeling

Answer Set Programming (ASP) is one of the most prominent and successful knowledge representation paradigms. The success of ASP is due to its expressive non-monotonic modeling language and its efficient computational methods originating from building propositional satisfiability solvers. The wide adoption of ASP has motivated several extensions to its modeling language in order to enhance expressivity, such as incorporating aggregates and interfaces with ontologies. Also, in order to overcome the grounding bottleneck of computation in ASP, there are increasing interests in integrating ASP with other computing paradigms, such as Constraint Programming (CP) and Satisfiability Modulo Theories (SMT). Due to the non-monotonic nature of the ASP semantics, such enhancements turned out to be non-trivial and the existing extensions are not fully satisfactory. We observe that one main reason for the difficulties rooted in the propositional semantics of ASP, which is limited in handling first-order constructs (such as aggregates and ontologies) and functions (such as constraint variables in CP and SMT) in natural ways. This dissertation presents a unifying view on these extensions by viewing them as instances of formulas with generalized quantifiers and intensional functions. We extend the first-order stable model semantics by by Ferraris, Lee, and Lifschitz to allow generalized quantifiers, which cover aggregate, DL-atoms, constraints and SMT theory atoms as special cases. Using this unifying framework, we study and relate different extensions of ASP. We also present a tight integration of ASP with SMT, based on which we enhance action language C+ to handle reasoning about continuous changes. Our framework yields a systematic approach to study and extend non-monotonic languages.
ContributorsMeng, Yunsong (Author) / Lee, Joohyung (Thesis advisor) / Ahn, Gail-Joon (Committee member) / Baral, Chitta (Committee member) / Fainekos, Georgios (Committee member) / Lifschitz, Vladimir (Committee member) / Arizona State University (Publisher)
Created2013
152168-Thumbnail Image.png
Description
There has been a lot of research in the field of artificial intelligence about thinking machines. Alan Turing proposed a test to observe a machine's intelligent behaviour with respect to natural language conversation. The Winograd schema challenge is suggested as an alternative, to the Turing test. It needs inferencing capabilities,

There has been a lot of research in the field of artificial intelligence about thinking machines. Alan Turing proposed a test to observe a machine's intelligent behaviour with respect to natural language conversation. The Winograd schema challenge is suggested as an alternative, to the Turing test. It needs inferencing capabilities, reasoning abilities and background knowledge to get the answer right. It involves a coreference resolution task in which a machine is given a sentence containing a situation which involves two entities, one pronoun and some more information about the situation and the machine has to come up with the right resolution of a pronoun to one of the entities. The complexity of the task is increased with the fact that the Winograd sentences are not constrained by one domain or specific sentence structure and it also contains a lot of human proper names. This modification makes the task of association of entities, to one particular word in the sentence, to derive the answer, difficult. I have developed a pronoun resolver system for the confined domain Winograd sentences. I have developed a classifier or filter which takes input sentences and decides to accept or reject them based on a particular criteria. Once the sentence is accepted. I run parsers on it to obtain the detailed analysis. Furthermore I have developed four answering modules which use world knowledge and inferencing mechanisms to try and resolve the pronoun. The four techniques I use are : ConceptNet knowledgebase, Search engine pattern counts,Narrative event chains and sentiment analysis. I have developed a particular aggregation mechanism for the answers from these modules to arrive at a final answer. I have used caching technique for the association relations that I obtain for different modules, so as to boost the performance. I run my system on the standard ‘nyu dataset’ of Winograd sentences and questions. This dataset is then restricted, by my classifier, to 90 sentences. I evaluate my system on this 90 sentence dataset. When I compare my results against the state of the art system on the same dataset, I get nearly 4.5 % improvement in the restricted domain.
ContributorsBudukh, Tejas Ulhas (Author) / Baral, Chitta (Thesis advisor) / VanLehn, Kurt (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)
Created2013
152428-Thumbnail Image.png
Description
Biological organisms are made up of cells containing numerous interconnected biochemical processes. Diseases occur when normal functionality of these processes is disrupted, manifesting as disease symptoms. Thus, understanding these biochemical processes and their interrelationships is a primary task in biomedical research and a prerequisite for activities including diagnosing diseases and

Biological organisms are made up of cells containing numerous interconnected biochemical processes. Diseases occur when normal functionality of these processes is disrupted, manifesting as disease symptoms. Thus, understanding these biochemical processes and their interrelationships is a primary task in biomedical research and a prerequisite for activities including diagnosing diseases and drug development. Scientists studying these interconnected processes have identified various pathways involved in drug metabolism, diseases, and signal transduction, etc. High-throughput technologies, new algorithms and speed improvements over the last decade have resulted in deeper knowledge about biological systems, leading to more refined pathways. Such pathways tend to be large and complex, making it difficult for an individual to remember all aspects. Thus, computer models are needed to represent and analyze them. The refinement activity itself requires reasoning with a pathway model by posing queries against it and comparing the results against the real biological system. Many existing models focus on structural and/or factoid questions, relying on surface-level information. These are generally not the kind of questions that a biologist may ask someone to test their understanding of biological processes. Examples of questions requiring understanding of biological processes are available in introductory college level biology text books. Such questions serve as a model for the question answering system developed in this thesis. Thus, the main goal of this thesis is to develop a system that allows the encoding of knowledge about biological pathways to answer questions demonstrating understanding of the pathways. To that end, a language is developed to specify a pathway and pose questions against it. Some existing tools are modified and used to accomplish this goal. The utility of the framework developed in this thesis is illustrated with applications in the biological domain. Finally, the question answering system is used in real world applications by extracting pathway knowledge from text and answering questions related to drug development.
ContributorsAnwar, Saadat (Author) / Baral, Chitta (Thesis advisor) / Inoue, Katsumi (Committee member) / Chen, Yi (Committee member) / Davulcu, Hasan (Committee member) / Lee, Joohyung (Committee member) / Arizona State University (Publisher)
Created2014
152514-Thumbnail Image.png
Description
As the size and scope of valuable datasets has exploded across many industries and fields of research in recent years, an increasingly diverse audience has sought out effective tools for their large-scale data analytics needs. Over this period, machine learning researchers have also been very prolific in designing improved algorithms

As the size and scope of valuable datasets has exploded across many industries and fields of research in recent years, an increasingly diverse audience has sought out effective tools for their large-scale data analytics needs. Over this period, machine learning researchers have also been very prolific in designing improved algorithms which are capable of finding the hidden structure within these datasets. As consumers of popular Big Data frameworks have sought to apply and benefit from these improved learning algorithms, the problems encountered with the frameworks have motivated a new generation of Big Data tools to address the shortcomings of the previous generation. One important example of this is the improved performance in the newer tools with the large class of machine learning algorithms which are highly iterative in nature. In this thesis project, I set about to implement a low-rank matrix completion algorithm (as an example of a highly iterative algorithm) within a popular Big Data framework, and to evaluate its performance processing the Netflix Prize dataset. I begin by describing several approaches which I attempted, but which did not perform adequately. These include an implementation of the Singular Value Thresholding (SVT) algorithm within the Apache Mahout framework, which runs on top of the Apache Hadoop MapReduce engine. I then describe an approach which uses the Divide-Factor-Combine (DFC) algorithmic framework to parallelize the state-of-the-art low-rank completion algorithm Orthogoal Rank-One Matrix Pursuit (OR1MP) within the Apache Spark engine. I describe the results of a series of tests running this implementation with the Netflix dataset on clusters of various sizes, with various degrees of parallelism. For these experiments, I utilized the Amazon Elastic Compute Cloud (EC2) web service. In the final analysis, I conclude that the Spark DFC + OR1MP implementation does indeed produce competitive results, in both accuracy and performance. In particular, the Spark implementation performs nearly as well as the MATLAB implementation of OR1MP without any parallelism, and improves performance to a significant degree as the parallelism increases. In addition, the experience demonstrates how Spark's flexible programming model makes it straightforward to implement this parallel and iterative machine learning algorithm.
ContributorsKrouse, Brian (Author) / Ye, Jieping (Thesis advisor) / Liu, Huan (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)
Created2014
152834-Thumbnail Image.png
Description
Current work in planning assumes that user preferences and/or domain dynamics are completely specified in advance, and aims to search for a single solution plan to satisfy these. In many real world scenarios, however, providing a complete specification of user preferences and domain dynamics becomes a time-consuming and error-prone task.

Current work in planning assumes that user preferences and/or domain dynamics are completely specified in advance, and aims to search for a single solution plan to satisfy these. In many real world scenarios, however, providing a complete specification of user preferences and domain dynamics becomes a time-consuming and error-prone task. More often than not, a user may provide no knowledge or at best partial knowledge of her preferences with respect to a desired plan. Similarly, a domain writer may only be able to determine certain parts, not all, of the model of some actions in a domain. Such modeling issues requires new concepts on what a solution should be, and novel techniques in solving the problem. When user preferences are incomplete, rather than presenting a single plan, the planner must instead provide a set of plans containing one or more plans that are similar to the one that the user prefers. This research first proposes the usage of different measures to capture the quality of such plan sets. These are domain-independent distance measures based on plan elements if no knowledge of the user preferences is given, or the Integrated Preference Function measure in case incomplete knowledge of such preferences is provided. It then investigates various heuristic approaches to generate plan sets in accordance with these measures, and presents empirical results demonstrating the promise of the methods. The second part of this research addresses planning problems with incomplete domain models, specifically those annotated with possible preconditions and effects of actions. It formalizes the notion of plan robustness capturing the probability of success for plans during execution. A method of assessing plan robustness based on the weighted model counting approach is proposed. Two approaches for synthesizing robust plans are introduced. The first one compiles the robust plan synthesis problems to the conformant probabilistic planning problems. The second approximates the robustness measure with lower and upper bounds, incorporating them into a stochastic local search for estimating distance heuristic to a goal state. The resulting planner outperforms a state-of-the-art planner that can handle incomplete domain models in both plan quality and planning time.
ContributorsNguyễn, Tuấn Anh (Author) / Kambhampati, Subbarao (Thesis advisor) / Baral, Chitta (Committee member) / Do, Minh (Committee member) / Lee, Joohyung (Committee member) / Smith, David E. (Committee member) / Arizona State University (Publisher)
Created2014
153492-Thumbnail Image.png
Description
Although current urban search and rescue (USAR) robots are little more than remotely controlled cameras, the end goal is for them to work alongside humans as trusted teammates. Natural language communications and performance data are collected as a team of humans works to carry out a simulated search and rescue

Although current urban search and rescue (USAR) robots are little more than remotely controlled cameras, the end goal is for them to work alongside humans as trusted teammates. Natural language communications and performance data are collected as a team of humans works to carry out a simulated search and rescue task in an uncertain virtual environment. Conditions are tested emulating a remotely controlled robot versus an intelligent one. Differences in performance, situation awareness, trust, workload, and communications are measured. The Intelligent robot condition resulted in higher levels of performance and operator situation awareness (SA).
ContributorsBartlett, Cade Earl (Author) / Cooke, Nancy J. (Thesis advisor) / Kambhampati, Subbarao (Committee member) / Wu, Bing (Committee member) / Arizona State University (Publisher)
Created2015