Knowledge and Reasoning for Image Understanding

Description

Image Understanding is a long-established discipline in computer vision that encompasses a body of advanced image processing techniques used to locate (“where”), characterize, and recognize (“what”) objects, regions, and their attributes in an image. However, the notion of “understanding” (and the goal of artificially intelligent machines) goes beyond factual recall of the recognized components and includes reasoning and thinking beyond what can be seen (or perceived). Understanding is often evaluated by asking questions of increasing difficulty. Thus, the expected functionalities of an intelligent Image Understanding system can be expressed in terms of the functionalities required to answer questions about an image. Answering questions about images requires primarily three components: Image Understanding, question (natural language) understanding, and reasoning based on knowledge. Any question that asks beyond what can be directly seen requires modeling of commonsense (or background/ontological/factual) knowledge and reasoning.

Knowledge and reasoning have seen scarce use in image understanding applications. In this thesis, we demonstrate the utility of incorporating background knowledge and using explicit reasoning in image understanding applications. We first present a comprehensive survey of previous work that utilized background knowledge and reasoning in understanding images. This survey outlines the limited use of commonsense knowledge in high-level applications. We then present a set of vision and reasoning-based methods to solve several applications and show that these approaches benefit in terms of accuracy and interpretability from the explicit use of knowledge and reasoning. We propose novel knowledge representations of images, knowledge acquisition methods, and a new implementation of an efficient probabilistic logical reasoning engine that can utilize publicly available commonsense knowledge to solve applications such as visual question answering and image puzzles. Additionally, we identify the need for new datasets that explicitly require external commonsense knowledge to solve. We propose the new task of Image Riddles, which requires a combination of vision and reasoning based on ontological knowledge, and we collect a sufficiently large dataset to serve as an ideal testbed for vision and reasoning research. Lastly, we propose end-to-end deep architectures that can combine vision, knowledge, and reasoning modules and achieve large performance boosts over state-of-the-art methods.

Contributors: Aditya, Somak (Author) / Baral, Chitta (Thesis advisor) / Yang, Yezhou (Thesis advisor) / Aloimonos, Yiannis (Committee member) / Lee, Joohyung (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)

Created: 2018

Computing a Probabilistic Extension of Answer Set Program Language Using ASP and Markov Logic Solvers

Description

LPMLN is a recent probabilistic logic programming language that combines Answer Set Programming (ASP) and Markov Logic. It is a proper extension of answer set programs that allows for reasoning about uncertainty using weighted rules under the stable model semantics, with a weight scheme adopted from Markov Logic. LPMLN has been shown to be related to several formalisms from the knowledge representation (KR) side, such as ASP and P-Log, and from the statistical relational learning (SRL) side, such as Markov Logic Networks (MLN), ProbLog, and Pearl’s causal models (PCM). Formalisms like ASP, P-Log, ProbLog, MLN, and PCM have all been shown to be embeddable in LPMLN, which demonstrates the expressivity of the language. LPMLN has also been shown to be reducible to ASP and MLN, which is not only theoretically interesting but also practically important from a computational point of view, in that the reductions yield ways to compute LPMLN programs using ASP and MLN solvers. Additionally, the reductions allow users to compute other formalisms that can be reduced to LPMLN.

This thesis realizes two implementations of LPMLN based on the reductions from LPMLN to ASP and from LPMLN to MLN. It first presents an implementation called LPMLN2ASP, which uses standard ASP solvers to compute MAP inference using weak constraints, and marginal and conditional probabilities using stable model enumeration. It then presents another implementation called LPMLN2MLN, which uses MLN solvers that apply completion to compute the tight fragment of LPMLN programs for MAP inference and for marginal and conditional probabilities. The computation using ASP solvers yields exact inference, as opposed to the approximate inference obtained with MLN solvers. Using these implementations, the usefulness of LPMLN for computing other formalisms is demonstrated by reducing them to LPMLN. The thesis also shows how the implementations are better than the native solvers of some of these formalisms on certain domains. The implementations make use of current state-of-the-art solving technologies in ASP and MLN, and therefore they benefit from any theoretical and practical advances in these technologies, thereby also benefiting the computation of other formalisms that can be reduced to LPMLN. Furthermore, the implementations allow certain SRL formalisms to be computed by ASP solvers, and certain KR formalisms to be computed by MLN solvers.
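
The weight scheme described in this abstract can be illustrated with a minimal brute-force sketch. It is not the thesis's LPMLN2ASP or LPMLN2MLN implementation: it assumes a tiny propositional program with soft rules only, and the rule encoding and the function names (satisfies, least_model, is_stable, lpmln_distribution, marginal) are hypothetical choices made for this illustration. An interpretation receives weight exp(sum of the weights of the rules it satisfies) if it is a stable model of those satisfied rules, and weight 0 otherwise; probabilities are the normalized weights.

```python
import math
from itertools import combinations

# Hypothetical encoding for this sketch: a soft rule is a tuple
# (weight, head, positive_body, negative_body), read as
#   weight : head <- positive_body, not negative_body.

def satisfies(interp, rule):
    """True if the interpretation satisfies the rule, read as an implication."""
    _, head, pos, neg = rule
    body_holds = all(a in interp for a in pos) and not any(a in interp for a in neg)
    return head in interp or not body_holds

def least_model(definite_rules):
    """Least model of a negation-free program, by fixpoint iteration."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and all(a in model for a in pos):
                model.add(head)
                changed = True
    return model

def is_stable(interp, rules):
    """Stable model test via the Gelfond-Lifschitz reduct."""
    reduct = [(head, pos) for _, head, pos, neg in rules
              if not any(a in interp for a in neg)]
    return least_model(reduct) == set(interp)

def lpmln_distribution(rules):
    """Enumerate all interpretations and return their normalized weights."""
    atoms = sorted({a for _, head, pos, neg in rules for a in (head, *pos, *neg)})
    weights = {}
    for size in range(len(atoms) + 1):
        for combo in combinations(atoms, size):
            interp = frozenset(combo)
            satisfied = [r for r in rules if satisfies(interp, r)]
            stable = is_stable(interp, satisfied)
            weights[interp] = math.exp(sum(r[0] for r in satisfied)) if stable else 0.0
    total = sum(weights.values())
    return {interp: w / total for interp, w in weights.items()}

def marginal(dist, atom):
    """Marginal probability that an atom is true."""
    return sum(p for interp, p in dist.items() if atom in interp)

# Illustrative program (made up for this sketch):
#   1 : p.
#   2 : q <- p.
rules = [
    (1.0, "p", (), ()),
    (2.0, "q", ("p",), ()),
]
dist = lpmln_distribution(rules)
map_model = max(dist, key=dist.get)  # MAP inference by exhaustive search
```

In this toy program the interpretation {q} satisfies the second rule but is not a stable model of it, so it gets probability 0, whereas a plain Markov Logic reading of the same weighted implications would give it a positive weight. Exhaustive enumeration is exponential in the number of atoms and is used here only to make the semantics concrete.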

Contributors: Talsania, Samidh (Author) / Lee, Joohyung (Thesis advisor, Committee member) / Baral, Chitta (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)

Created: 2017

Towards understanding natural language: semantic parsing, commonsense knowledge acquisition, reasoning framework and applications

Description

Reasoning with commonsense knowledge is an integral component of human behavior. It is due to this capability that people know that a weak person may not be able to lift someone. It has been a long-standing goal of the Artificial Intelligence community to simulate such commonsense reasoning abilities in machines. Over the years, many advances have been made, and various challenges have been proposed to test these abilities. The Winograd Schema Challenge (WSC) is one such Natural Language Understanding (NLU) task, which was also proposed as an alternative to the Turing Test. It is made up of textual question answering problems that require resolving a pronoun to its correct antecedent.

In this thesis, two approaches to developing NLU systems that solve the Winograd Schema Challenge are demonstrated. To this end, a semantic parser is presented, various kinds of commonsense knowledge are identified, techniques to extract commonsense knowledge are developed, and two commonsense reasoning algorithms are presented. The usefulness of the developed tools and techniques is shown by applying them to solve the challenge.

Contributors: Sharma, Arpita (Author) / Baral, Chitta (Thesis advisor) / Lee, Joohyung (Committee member) / Papotti, Paolo (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)

Created: 2019

Knowledge Representation, Reasoning and Learning for Non-Extractive Reading Comprehension

Description

While deep learning (DL) based approaches have in recent years been the popular choice for developing end-to-end question answering (QA) systems, such systems lack several desired properties, such as the ability to do sophisticated reasoning with knowledge, the ability to learn using fewer resources, and interpretability. In this thesis, I explore solutions that aim to address these drawbacks.

Towards this goal, I work with a specific family of reading comprehension tasks, normally referred to as Non-Extractive Reading Comprehension (NRC), where the given passage does not contain enough information, so answering correctly requires sophisticated reasoning and “additional knowledge”. I have organized the NRC tasks into three categories. Here I present my solutions to the first two categories and some preliminary results on the third category.

Category 1 NRC tasks refer to the scenarios where the required “additional knowledge” is missing but a decent natural language parser exists. For these tasks, I learn the missing “additional knowledge” with the help of the parser and a novel inductive logic programming algorithm. The learned knowledge is then used to answer new questions. Experiments on three NRC tasks show that this approach, along with providing an interpretable solution, achieves accuracy better than or comparable to that of state-of-the-art DL based approaches.

Category 2 NRC tasks refer to the alternate scenario where the “additional knowledge” is available but no natural language parser works well for the sentences of the target domain. To deal with these tasks, I present a novel hybrid reasoning approach that combines symbolic and natural language inference (neural reasoning) and ultimately allows symbolic modules to reason over raw text without requiring any translation. Experiments on two NRC tasks show its effectiveness.

Category 3 NRC tasks provide neither the “missing knowledge” nor a good parser. This thesis does not provide an interpretable solution for this category, but it presents some preliminary results and an analysis of a pure DL based approach. Nonetheless, the thesis shows that, beyond the world of pure DL based approaches, there are tools that can offer interpretable solutions for challenging tasks without using many resources and possibly with better accuracy.

Contributors: Mitra, Arindam (Author) / Baral, Chitta (Thesis advisor) / Lee, Joohyung (Committee member) / Yang, Yezhou (Committee member) / Devarakonda, Murthy (Committee member) / Arizona State University (Publisher)

Created: 2019
