Search Content

Batch mode active learning for multimedia pattern recognition

Description

The rapid escalation of technology and the widespread emergence of modern technological equipments have resulted in the generation of humongous amounts of digital data (in the form of images, videos and text). This has expanded the possibility of solving real world problems using computational learning frameworks. However, while gathering a…

The rapid escalation of technology and the widespread emergence of modern technological equipments have resulted in the generation of humongous amounts of digital data (in the form of images, videos and text). This has expanded the possibility of solving real world problems using computational learning frameworks. However, while gathering a large amount of data is cheap and easy, annotating them with class labels is an expensive process in terms of time, labor and human expertise. This has paved the way for research in the field of active learning. Such algorithms automatically select the salient and exemplar instances from large quantities of unlabeled data and are effective in reducing human labeling effort in inducing classification models. To utilize the possible presence of multiple labeling agents, there have been attempts towards a batch mode form of active learning, where a batch of data instances is selected simultaneously for manual annotation. This dissertation is aimed at the development of novel batch mode active learning algorithms to reduce manual effort in training classification models in real world multimedia pattern recognition applications. Four major contributions are proposed in this work: $(i)$ a framework for dynamic batch mode active learning, where the batch size and the specific data instances to be queried are selected adaptively through a single formulation, based on the complexity of the data stream in question, $(ii)$ a batch mode active learning strategy for fuzzy label classification problems, where there is an inherent imprecision and vagueness in the class label definitions, $(iii)$ batch mode active learning algorithms based on convex relaxations of an NP-hard integer quadratic programming (IQP) problem, with guaranteed bounds on the solution quality and $(iv)$ an active matrix completion algorithm and its application to solve several variants of the active learning problem (transductive active learning, multi-label active learning, active feature acquisition and active learning for regression). These contributions are validated on the face recognition and facial expression recognition problems (which are commonly encountered in real world applications like robotics, security and assistive technology for the blind and the visually impaired) and also on collaborative filtering applications like movie recommendation.

ContributorsChakraborty, Shayok (Author) / Panchanathan, Sethuraman (Thesis advisor) / Balasubramanian, Vineeth N. (Committee member) / Li, Baoxin (Committee member) / Mittelmann, Hans (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)

Created2013

Answer set programming and other computing paradigms

Description

Answer Set Programming (ASP) is one of the most prominent and successful knowledge representation paradigms. The success of ASP is due to its expressive non-monotonic modeling language and its efficient computational methods originating from building propositional satisfiability solvers. The wide adoption of ASP has motivated several extensions to its modeling…

Answer Set Programming (ASP) is one of the most prominent and successful knowledge representation paradigms. The success of ASP is due to its expressive non-monotonic modeling language and its efficient computational methods originating from building propositional satisfiability solvers. The wide adoption of ASP has motivated several extensions to its modeling language in order to enhance expressivity, such as incorporating aggregates and interfaces with ontologies. Also, in order to overcome the grounding bottleneck of computation in ASP, there are increasing interests in integrating ASP with other computing paradigms, such as Constraint Programming (CP) and Satisfiability Modulo Theories (SMT). Due to the non-monotonic nature of the ASP semantics, such enhancements turned out to be non-trivial and the existing extensions are not fully satisfactory. We observe that one main reason for the difficulties rooted in the propositional semantics of ASP, which is limited in handling first-order constructs (such as aggregates and ontologies) and functions (such as constraint variables in CP and SMT) in natural ways. This dissertation presents a unifying view on these extensions by viewing them as instances of formulas with generalized quantifiers and intensional functions. We extend the first-order stable model semantics by by Ferraris, Lee, and Lifschitz to allow generalized quantifiers, which cover aggregate, DL-atoms, constraints and SMT theory atoms as special cases. Using this unifying framework, we study and relate different extensions of ASP. We also present a tight integration of ASP with SMT, based on which we enhance action language C+ to handle reasoning about continuous changes. Our framework yields a systematic approach to study and extend non-monotonic languages.

ContributorsMeng, Yunsong (Author) / Lee, Joohyung (Thesis advisor) / Ahn, Gail-Joon (Committee member) / Baral, Chitta (Committee member) / Fainekos, Georgios (Committee member) / Lifschitz, Vladimir (Committee member) / Arizona State University (Publisher)

Created2013

Representing, reasoning and answering questions about biological pathways various applications

Description

Biological organisms are made up of cells containing numerous interconnected biochemical processes. Diseases occur when normal functionality of these processes is disrupted, manifesting as disease symptoms. Thus, understanding these biochemical processes and their interrelationships is a primary task in biomedical research and a prerequisite for activities including diagnosing diseases and…

Biological organisms are made up of cells containing numerous interconnected biochemical processes. Diseases occur when normal functionality of these processes is disrupted, manifesting as disease symptoms. Thus, understanding these biochemical processes and their interrelationships is a primary task in biomedical research and a prerequisite for activities including diagnosing diseases and drug development. Scientists studying these interconnected processes have identified various pathways involved in drug metabolism, diseases, and signal transduction, etc. High-throughput technologies, new algorithms and speed improvements over the last decade have resulted in deeper knowledge about biological systems, leading to more refined pathways. Such pathways tend to be large and complex, making it difficult for an individual to remember all aspects. Thus, computer models are needed to represent and analyze them. The refinement activity itself requires reasoning with a pathway model by posing queries against it and comparing the results against the real biological system. Many existing models focus on structural and/or factoid questions, relying on surface-level information. These are generally not the kind of questions that a biologist may ask someone to test their understanding of biological processes. Examples of questions requiring understanding of biological processes are available in introductory college level biology text books. Such questions serve as a model for the question answering system developed in this thesis. Thus, the main goal of this thesis is to develop a system that allows the encoding of knowledge about biological pathways to answer questions demonstrating understanding of the pathways. To that end, a language is developed to specify a pathway and pose questions against it. Some existing tools are modified and used to accomplish this goal. The utility of the framework developed in this thesis is illustrated with applications in the biological domain. Finally, the question answering system is used in real world applications by extracting pathway knowledge from text and answering questions related to drug development.

ContributorsAnwar, Saadat (Author) / Baral, Chitta (Thesis advisor) / Inoue, Katsumi (Committee member) / Chen, Yi (Committee member) / Davulcu, Hasan (Committee member) / Lee, Joohyung (Committee member) / Arizona State University (Publisher)

Created2014

Planning with incomplete user preferences and domain models

Description

Current work in planning assumes that user preferences and/or domain dynamics are completely specified in advance, and aims to search for a single solution plan to satisfy these. In many real world scenarios, however, providing a complete specification of user preferences and domain dynamics becomes a time-consuming and error-prone task.…

Current work in planning assumes that user preferences and/or domain dynamics are completely specified in advance, and aims to search for a single solution plan to satisfy these. In many real world scenarios, however, providing a complete specification of user preferences and domain dynamics becomes a time-consuming and error-prone task. More often than not, a user may provide no knowledge or at best partial knowledge of her preferences with respect to a desired plan. Similarly, a domain writer may only be able to determine certain parts, not all, of the model of some actions in a domain. Such modeling issues requires new concepts on what a solution should be, and novel techniques in solving the problem. When user preferences are incomplete, rather than presenting a single plan, the planner must instead provide a set of plans containing one or more plans that are similar to the one that the user prefers. This research first proposes the usage of different measures to capture the quality of such plan sets. These are domain-independent distance measures based on plan elements if no knowledge of the user preferences is given, or the Integrated Preference Function measure in case incomplete knowledge of such preferences is provided. It then investigates various heuristic approaches to generate plan sets in accordance with these measures, and presents empirical results demonstrating the promise of the methods. The second part of this research addresses planning problems with incomplete domain models, specifically those annotated with possible preconditions and effects of actions. It formalizes the notion of plan robustness capturing the probability of success for plans during execution. A method of assessing plan robustness based on the weighted model counting approach is proposed. Two approaches for synthesizing robust plans are introduced. The first one compiles the robust plan synthesis problems to the conformant probabilistic planning problems. The second approximates the robustness measure with lower and upper bounds, incorporating them into a stochastic local search for estimating distance heuristic to a goal state. The resulting planner outperforms a state-of-the-art planner that can handle incomplete domain models in both plan quality and planning time.

ContributorsNguyễn, Tuấn Anh (Author) / Kambhampati, Subbarao (Thesis advisor) / Baral, Chitta (Committee member) / Do, Minh (Committee member) / Lee, Joohyung (Committee member) / Smith, David E. (Committee member) / Arizona State University (Publisher)

Created2014

Solving Winograd Schema Challenge: using semantic parsing, automatic knowledge acquisition and logical reasoning

Description

Turing test has been a benchmark scale for measuring the human level intelligence in computers since it was proposed by Alan Turing in 1950. However, for last 60 years, the applications such as ELIZA, PARRY, Cleverbot and Eugene Goostman, that claimed to pass the test. These applications are either based…

Turing test has been a benchmark scale for measuring the human level intelligence in computers since it was proposed by Alan Turing in 1950. However, for last 60 years, the applications such as ELIZA, PARRY, Cleverbot and Eugene Goostman, that claimed to pass the test. These applications are either based on tricks to fool humans on a textual chat based test or there has been a disagreement between AI communities on them passing the test. This has led to the school of thought that it might not be the ideal test for predicting the human level intelligence in machines.

Consequently, the Winograd Schema Challenge has been suggested as an alternative to the Turing test. As opposed to deciding the intelligent behavior with the help of chat servers, like it was done in the Turing test, the Winograd Schema Challenge is a question answering test. It consists of sentence and question pairs such that the answer to the question depends on the resolution of a definite pronoun or adjective in the sentence. The answers are fairly intuitive for humans but they are difficult for machines because it requires some sort of background or commonsense knowledge about the sentence.

In this thesis, I propose a novel technique to solve the Winograd Schema Challenge. The technique has three basic modules at its disposal, namely, a Semantic Parser that parses the English text (both sentences and questions) into a formal representation, an Automatic Background Knowledge Extractor that extracts the Background Knowledge pertaining to the given Winograd sentence, and an Answer Set Programming Reasoning Engine that reasons on the given Winograd sentence and the corresponding Background Knowledge. The applicability of the technique is illustrated by solving a subset of Winograd Schema Challenge pertaining to a certain type of Background Knowledge. The technique is evaluated on the subset and a notable accuracy is achieved.

ContributorsSharma, Arpita (Author) / Baral, Chita (Thesis advisor) / Lee, Joohyung (Committee member) / Pon-Barry, Heather (Committee member) / Arizona State University (Publisher)

Created2014

Representing the language of the causal calculator in answer set programming

Description

Action language C+ is a formalism for describing properties of actions, which is based on nonmonotonic causal logic. The definite fragment of C+ is implemented in the Causal Calculator (CCalc), which is based on the reduction of nonmonotonic causal logic to propositional logic. This thesis describes the language…

Action language C+ is a formalism for describing properties of actions, which is based on nonmonotonic causal logic. The definite fragment of C+ is implemented in the Causal Calculator (CCalc), which is based on the reduction of nonmonotonic causal logic to propositional logic. This thesis describes the language of CCalc in terms of answer set programming (ASP), based on the translation of nonmonotonic causal logic to formulas under the stable model semantics. I designed a standard library which describes the constructs of the input language of CCalc in terms of ASP, allowing a simple modular method to represent CCalc input programs in the language of ASP. Using the combination of system F2LP and answer set solvers, this method achieves functionality close to that of CCalc while taking advantage of answer set solvers to yield efficient computation that is orders of magnitude faster than CCalc for many benchmark examples. In support of this, I created an automated translation system Cplus2ASP that implements the translation and encoding method and automatically invokes the necessary software to solve the translated input programs.

ContributorsCasolary, Michael (Author) / Lee, Joohyung (Thesis advisor) / Ahn, Gail-Joon (Committee member) / Baral, Chitta (Committee member) / Arizona State University (Publisher)

Created2011

Bridging the gap between classical logic based formalisms and logic programs

Description

Different logic-based knowledge representation formalisms have different limitations either with respect to expressivity or with respect to computational efficiency. First-order logic, which is the basis of Description Logics (DLs), is not suitable for defeasible reasoning due to its monotonic nature. The nonmonotonic formalisms that extend first-order logic, such as circumscription…

Different logic-based knowledge representation formalisms have different limitations either with respect to expressivity or with respect to computational efficiency. First-order logic, which is the basis of Description Logics (DLs), is not suitable for defeasible reasoning due to its monotonic nature. The nonmonotonic formalisms that extend first-order logic, such as circumscription and default logic, are expressive but lack efficient implementations. The nonmonotonic formalisms that are based on the declarative logic programming approach, such as Answer Set Programming (ASP), have efficient implementations but are not expressive enough for representing and reasoning with open domains. This dissertation uses the first-order stable model semantics, which extends both first-order logic and ASP, to relate circumscription to ASP, and to integrate DLs and ASP, thereby partially overcoming the limitations of the formalisms. By exploiting the relationship between circumscription and ASP, well-known action formalisms, such as the situation calculus, the event calculus, and Temporal Action Logics, are reformulated in ASP. The advantages of these reformulations are shown with respect to the generality of the reasoning tasks that can be handled and with respect to the computational efficiency. The integration of DLs and ASP presented in this dissertation provides a framework for integrating rules and ontologies for the semantic web. This framework enables us to perform nonmonotonic reasoning with DL knowledge bases. Observing the need to integrate action theories and ontologies, the above results are used to reformulate the problem of integrating action theories and ontologies as a problem of integrating rules and ontologies, thus enabling us to use the computational tools developed in the context of the latter for the former.

ContributorsPalla, Ravi (Author) / Lee, Joohyung (Thesis advisor) / Baral, Chitta (Committee member) / Kambhampati, Subbarao (Committee member) / Lifschitz, Vladimir (Committee member) / Arizona State University (Publisher)

Created2012

A study of boosting based transfer learning for activity and gesture recognition

Description

Real-world environments are characterized by non-stationary and continuously evolving data. Learning a classification model on this data would require a framework that is able to adapt itself to newer circumstances. Under such circumstances, transfer learning has come to be a dependable methodology for improving classification performance with reduced training costs…

Real-world environments are characterized by non-stationary and continuously evolving data. Learning a classification model on this data would require a framework that is able to adapt itself to newer circumstances. Under such circumstances, transfer learning has come to be a dependable methodology for improving classification performance with reduced training costs and without the need for explicit relearning from scratch. In this thesis, a novel instance transfer technique that adapts a "Cost-sensitive" variation of AdaBoost is presented. The method capitalizes on the theoretical and functional properties of AdaBoost to selectively reuse outdated training instances obtained from a "source" domain to effectively classify unseen instances occurring in a different, but related "target" domain. The algorithm is evaluated on real-world classification problems namely accelerometer based 3D gesture recognition, smart home activity recognition and text categorization. The performance on these datasets is analyzed and evaluated against popular boosting-based instance transfer techniques. In addition, supporting empirical studies, that investigate some of the less explored bottlenecks of boosting based instance transfer methods, are presented, to understand the suitability and effectiveness of this form of knowledge transfer.

ContributorsVenkatesan, Ashok (Author) / Panchanathan, Sethuraman (Thesis advisor) / Li, Baoxin (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)

Created2011

Representing and Reasoning about Dynamic Multi-Agent Domains: An Action Language Approach

Description

Reasoning about actions forms the basis of many tasks such as prediction, planning, and diagnosis in a dynamic domain. Within the reasoning about actions community, a broad class of languages, called action languages, has been developed together with a methodology for their use in representing and reasoning about dynamic domains.…

Reasoning about actions forms the basis of many tasks such as prediction, planning, and diagnosis in a dynamic domain. Within the reasoning about actions community, a broad class of languages, called action languages, has been developed together with a methodology for their use in representing and reasoning about dynamic domains. With a few notable exceptions, the focus of these efforts has largely centered around single-agent systems. Agents rarely operate in a vacuum however, and almost in parallel, substantial work has been done within the dynamic epistemic logic community towards understanding how the actions of an agent may effect not just his own knowledge and/or beliefs, but those of his fellow agents as well. What is less understood by both communities is how to represent and reason about both the direct and indirect effects of both ontic and epistemic actions within a multi-agent setting. This dissertation presents ongoing research towards a framework for representing and reasoning about dynamic multi-agent domains involving both classes of actions.

The contributions of this work are as follows: the formulation of a precise mathematical model of a dynamic multi-agent domain based on the notion of a transition diagram; the development of the multi-agent action languages mA+ and mAL based upon this model, as well as preliminary investigations of their properties and implementations via logic programming under the answer set semantics; precise formulations of the temporal projection, and planning problems within a multi-agent context; and an investigation of the application of the proposed approach to the representation of, and reasoning about, scenarios involving the modalities of knowledge and belief.

ContributorsGelfond, Gregory (Author) / Baral, Chitta (Thesis advisor) / Kambhampati, Subbarao (Committee member) / Lee, Joohyung (Committee member) / Moss, Larry (Committee member) / Cao Son, Tran (Committee member) / Arizona State University (Publisher)

Created2018

Knowledge and Reasoning for Image Understanding

Description

Image Understanding is a long-established discipline in computer vision, which encompasses a body of advanced image processing techniques, that are used to locate (“where”), characterize and recognize (“what”) objects, regions, and their attributes in the image. However, the notion of “understanding” (and the goal of artificial intelligent machines) goes beyond…

Image Understanding is a long-established discipline in computer vision, which encompasses a body of advanced image processing techniques, that are used to locate (“where”), characterize and recognize (“what”) objects, regions, and their attributes in the image. However, the notion of “understanding” (and the goal of artificial intelligent machines) goes beyond factual recall of the recognized components and includes reasoning and thinking beyond what can be seen (or perceived). Understanding is often evaluated by asking questions of increasing difficulty. Thus, the expected functionalities of an intelligent Image Understanding system can be expressed in terms of the functionalities that are required to answer questions about an image. Answering questions about images require primarily three components: Image Understanding, question (natural language) understanding, and reasoning based on knowledge. Any question, asking beyond what can be directly seen, requires modeling of commonsense (or background/ontological/factual) knowledge and reasoning.

Knowledge and reasoning have seen scarce use in image understanding applications. In this thesis, we demonstrate the utilities of incorporating background knowledge and using explicit reasoning in image understanding applications. We first present a comprehensive survey of the previous work that utilized background knowledge and reasoning in understanding images. This survey outlines the limited use of commonsense knowledge in high-level applications. We then present a set of vision and reasoning-based methods to solve several applications and show that these approaches benefit in terms of accuracy and interpretability from the explicit use of knowledge and reasoning. We propose novel knowledge representations of image, knowledge acquisition methods, and a new implementation of an efficient probabilistic logical reasoning engine that can utilize publicly available commonsense knowledge to solve applications such as visual question answering, image puzzles. Additionally, we identify the need for new datasets that explicitly require external commonsense knowledge to solve. We propose the new task of Image Riddles, which requires a combination of vision, and reasoning based on ontological knowledge; and we collect a sufficiently large dataset to serve as an ideal testbed for vision and reasoning research. Lastly, we propose end-to-end deep architectures that can combine vision, knowledge and reasoning modules together and achieve large performance boosts over state-of-the-art methods.

ContributorsAditya, Somak (Author) / Baral, Chitta (Thesis advisor) / Yang, Yezhou (Thesis advisor) / Aloimonos, Yiannis (Committee member) / Lee, Joohyung (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)

Created2018

Filtering by