Search Content

A graphical language for LTL motion and mission planning

Description

Linear Temporal Logic is gaining increasing popularity as a high level specification language for robot motion planning due to its expressive power and scalability of LTL control synthesis algorithms. This formalism, however, requires expert knowledge and makes it inaccessible to non-expert users. This thesis introduces a graphical specification environment to…

Linear Temporal Logic is gaining increasing popularity as a high level specification language for robot motion planning due to its expressive power and scalability of LTL control synthesis algorithms. This formalism, however, requires expert knowledge and makes it inaccessible to non-expert users. This thesis introduces a graphical specification environment to create high level motion plans to control robots in the field by converting a visual representation of the motion/task plan into a Linear Temporal Logic (LTL) specification. The visual interface is built on the Android tablet platform and provides functionality to create task plans through a set of well defined gestures and on screen controls. It uses the notion of waypoints to quickly and efficiently describe the motion plan and enables a variety of complex Linear Temporal Logic specifications to be described succinctly and intuitively by the user without the need for the knowledge and understanding of LTL specification. Thus, it opens avenues for its use by personnel in military, warehouse management, and search and rescue missions. This thesis describes the construction of LTL for various scenarios used for robot navigation using the visual interface developed and leverages the use of existing LTL based motion planners to carry out the task plan by a robot.

ContributorsSrinivas, Shashank (Author) / Fainekos, Georgios (Thesis advisor) / Baral, Chitta (Committee member) / Burleson, Winslow (Committee member) / Arizona State University (Publisher)

Created2013

An intelligent co-reference resolver for Winograd schema sentences containing resolved semantic entities

Description

There has been a lot of research in the field of artificial intelligence about thinking machines. Alan Turing proposed a test to observe a machine's intelligent behaviour with respect to natural language conversation. The Winograd schema challenge is suggested as an alternative, to the Turing test. It needs inferencing capabilities,…

There has been a lot of research in the field of artificial intelligence about thinking machines. Alan Turing proposed a test to observe a machine's intelligent behaviour with respect to natural language conversation. The Winograd schema challenge is suggested as an alternative, to the Turing test. It needs inferencing capabilities, reasoning abilities and background knowledge to get the answer right. It involves a coreference resolution task in which a machine is given a sentence containing a situation which involves two entities, one pronoun and some more information about the situation and the machine has to come up with the right resolution of a pronoun to one of the entities. The complexity of the task is increased with the fact that the Winograd sentences are not constrained by one domain or specific sentence structure and it also contains a lot of human proper names. This modification makes the task of association of entities, to one particular word in the sentence, to derive the answer, difficult. I have developed a pronoun resolver system for the confined domain Winograd sentences. I have developed a classifier or filter which takes input sentences and decides to accept or reject them based on a particular criteria. Once the sentence is accepted. I run parsers on it to obtain the detailed analysis. Furthermore I have developed four answering modules which use world knowledge and inferencing mechanisms to try and resolve the pronoun. The four techniques I use are : ConceptNet knowledgebase, Search engine pattern counts,Narrative event chains and sentiment analysis. I have developed a particular aggregation mechanism for the answers from these modules to arrive at a final answer. I have used caching technique for the association relations that I obtain for different modules, so as to boost the performance. I run my system on the standard ‘nyu dataset’ of Winograd sentences and questions. This dataset is then restricted, by my classifier, to 90 sentences. I evaluate my system on this 90 sentence dataset. When I compare my results against the state of the art system on the same dataset, I get nearly 4.5 % improvement in the restricted domain.

ContributorsBudukh, Tejas Ulhas (Author) / Baral, Chitta (Thesis advisor) / VanLehn, Kurt (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)

Created2013

Representing the language of the causal calculator in answer set programming

Description

Action language C+ is a formalism for describing properties of actions, which is based on nonmonotonic causal logic. The definite fragment of C+ is implemented in the Causal Calculator (CCalc), which is based on the reduction of nonmonotonic causal logic to propositional logic. This thesis describes the language…

Action language C+ is a formalism for describing properties of actions, which is based on nonmonotonic causal logic. The definite fragment of C+ is implemented in the Causal Calculator (CCalc), which is based on the reduction of nonmonotonic causal logic to propositional logic. This thesis describes the language of CCalc in terms of answer set programming (ASP), based on the translation of nonmonotonic causal logic to formulas under the stable model semantics. I designed a standard library which describes the constructs of the input language of CCalc in terms of ASP, allowing a simple modular method to represent CCalc input programs in the language of ASP. Using the combination of system F2LP and answer set solvers, this method achieves functionality close to that of CCalc while taking advantage of answer set solvers to yield efficient computation that is orders of magnitude faster than CCalc for many benchmark examples. In support of this, I created an automated translation system Cplus2ASP that implements the translation and encoding method and automatically invokes the necessary software to solve the translated input programs.

ContributorsCasolary, Michael (Author) / Lee, Joohyung (Thesis advisor) / Ahn, Gail-Joon (Committee member) / Baral, Chitta (Committee member) / Arizona State University (Publisher)

Created2011

Multimodal Representation Learning for Visual Reasoning and Text-to-Image Translation

Description

Multimodal Representation Learning is a multi-disciplinary research field which aims to integrate information from multiple communicative modalities in a meaningful manner to help solve some downstream task. These modalities can be visual, acoustic, linguistic, haptic etc. The interpretation of ’meaningful integration of information from different modalities’ remains modality and task…

Multimodal Representation Learning is a multi-disciplinary research field which aims to integrate information from multiple communicative modalities in a meaningful manner to help solve some downstream task. These modalities can be visual, acoustic, linguistic, haptic etc. The interpretation of ’meaningful integration of information from different modalities’ remains modality and task dependent. The downstream task can range from understanding one modality in the presence of information from other modalities, to that of translating input from one modality to another. In this thesis the utility of multimodal representation learning for understanding one modality vis-à-vis Image Understanding for Visual Reasoning given corresponding information in other modalities, as well as translating from one modality to the other, specifically, Text to Image Translation was investigated.

Visual Reasoning has been an active area of research in computer vision. It encompasses advanced image processing and artificial intelligence techniques to locate, characterize and recognize objects, regions and their attributes in the image in order to comprehend the image itself. One way of building a visual reasoning system is to ask the system to answer questions about the image that requires attribute identification, counting, comparison, multi-step attention, and reasoning. An intelligent system is thought to have a proper grasp of the image if it can answer said questions correctly and provide a valid reasoning for the given answers. In this work how a system can be built by learning a multimodal representation between the stated image and the questions was investigated. Also, how background knowledge, specifically scene-graph information, if available, can be incorporated into existing image understanding models was demonstrated.

Multimodal learning provides an intuitive way of learning a joint representation between different modalities. Such a joint representation can be used to translate from one modality to the other. It also gives way to learning a shared representation between these varied modalities and allows to provide meaning to what this shared representation should capture. In this work, using the surrogate task of text to image translation, neural network based architectures to learn a shared representation between these two modalities was investigated. Also, the ability that such a shared representation is capable of capturing parts of different modalities that are equivalent in some sense is proposed. Specifically, given an image and a semantic description of certain objects present in the image, a shared representation between the text and the image modality capable of capturing parts of the image being mentioned in the text was demonstrated. Such a capability was showcased on a publicly available dataset.

ContributorsSaha, Rudra (Author) / Yang, Yezhou (Thesis advisor) / Singh, Maneesh Kumar (Committee member) / Baral, Chitta (Committee member) / Arizona State University (Publisher)

Created2018

Fixed verse generation using neural word embeddings

Description

For the past three decades, the design of an effective strategy for generating poetry that matches that of a human’s creative capabilities and complexities has been an elusive goal in artificial intelligence (AI) and natural language generation (NLG) research, and among linguistic creativity researchers in particular. This thesis presents a…

For the past three decades, the design of an effective strategy for generating poetry that matches that of a human’s creative capabilities and complexities has been an elusive goal in artificial intelligence (AI) and natural language generation (NLG) research, and among linguistic creativity researchers in particular. This thesis presents a novel approach to fixed verse poetry generation using neural word embeddings. During the course of generation, a two layered poetry classifier is developed. The first layer uses a lexicon based method to classify poems into types based on form and structure, and the second layer uses a supervised classification method to classify poems into subtypes based on content with an accuracy of 92%. The system then uses a two-layer neural network to generate poetry based on word similarities and word movements in a 50-dimensional vector space.

The verses generated by the system are evaluated using rhyme, rhythm, syllable counts and stress patterns. These computational features of language are considered for generating haikus, limericks and iambic pentameter verses. The generated poems are evaluated using a Turing test on both experts and non-experts. The user study finds that only 38% computer generated poems were correctly identified by nonexperts while 65% of the computer generated poems were correctly identified by experts. Although the system does not pass the Turing test, the results from the Turing test suggest an improvement of over 17% when compared to previous methods which use Turing tests to evaluate poetry generators.

ContributorsMagge, Arjun (Author) / Syrotiuk, Violet R. (Thesis advisor) / Baral, Chitta (Committee member) / Hogue, Cynthia (Committee member) / Bazzi, Rida (Committee member) / Arizona State University (Publisher)

Created2016

On the Relationships Among Probabilistic Extensions of Answer Set Semantics

Description

Answer Set Programming (ASP) is one of the main formalisms in Knowledge Representation (KR) that is being widely applied in a large number of applications. While ASP is effective on Boolean decision problems, it has difficulty in expressing quantitative uncertainty and probability in a natural way.

Logic Programs under the answer…

Answer Set Programming (ASP) is one of the main formalisms in Knowledge Representation (KR) that is being widely applied in a large number of applications. While ASP is effective on Boolean decision problems, it has difficulty in expressing quantitative uncertainty and probability in a natural way.

Logic Programs under the answer set semantics and Markov Logic Network (LPMLN) is a recent extension of answer set programs to overcome the limitation of the deterministic nature of ASP by adopting the log-linear weight scheme of Markov Logic. This thesis investigates the relationships between LPMLN and two other extensions of ASP: weak constraints to express a quantitative preference among answer sets, and P-log to incorporate probabilistic uncertainty. The studied relationships show how different extensions of answer set programs are related to each other, and how they are related to formalisms in Statistical Relational Learning, such as Problog and MLN, which have shown to be closely related to LPMLN. The studied relationships compare the properties of the involved languages and provide ways to compute one language using an implementation of another language.

This thesis first presents a translation of LPMLN into programs with weak constraints. The translation allows for computing the most probable stable models (i.e., MAP estimates) or probability distribution in LPMLN programs using standard ASP solvers so that the well-developed techniques in ASP can be utilized. This result can be extended to other formalisms, such as Markov Logic, ProbLog, and Pearl’s Causal Models, that are shown to be translatable into LPMLN.

This thesis also presents a translation of P-log into LPMLN. The translation tells how probabilistic nonmonotonicity (the ability of the reasoner to change his probabilistic model as a result of new information) of P-log can be represented in LPMLN, which yields a way to compute P-log using standard ASP solvers or MLN solvers.

ContributorsYang, Zhun (Author) / Lee, Joohyung (Thesis advisor) / Baral, Chitta (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)

Created2017

Word and Relation Embedding for Sentence Representation

Description

In recent years, several methods have been proposed to encode sentences into fixed length continuous vectors called sentence representation or sentence embedding. With the recent advancements in various deep learning methods applied in Natural Language Processing (NLP), these representations play a crucial role in tasks such as named entity recognition,…

In recent years, several methods have been proposed to encode sentences into fixed length continuous vectors called sentence representation or sentence embedding. With the recent advancements in various deep learning methods applied in Natural Language Processing (NLP), these representations play a crucial role in tasks such as named entity recognition, question answering and sentence classification.

Traditionally, sentence vector representations are learnt from its constituent word representations, also known as word embeddings. Various methods to learn the distributed representation (embedding) of words have been proposed using the notion of Distributional Semantics, i.e. “meaning of a word is characterized by the company it keeps”. However, principle of compositionality states that meaning of a sentence is a function of the meanings of words and also the way they are syntactically combined. In various recent methods for sentence representation, the syntactic information like dependency or relation between words have been largely ignored.

In this work, I have explored the effectiveness of sentence representations that are composed of the representation of both, its constituent words and the relations between the words in a sentence. The word and relation embeddings are learned based on their context. These general-purpose embeddings can also be used as off-the- shelf semantic and syntactic features for various NLP tasks. Similarity Evaluation tasks was performed on two datasets showing the usefulness of the learned word embeddings. Experiments were conducted on three different sentence classification tasks showing that our sentence representations outperform the original word-based sentence representations, when used with the state-of-the-art Neural Network architectures.

ContributorsRath, Trideep (Author) / Baral, Chitta (Thesis advisor) / Li, Baoxin (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)

Created2017

An inverse lambda calculus algorithm for natural language processing

Description

Natural Language Processing is a subject that combines computer science and linguistics, aiming to provide computers with the ability to understand natural language and to develop a more intuitive human-computer interaction. The research community has developed ways to translate natural language to mathematical formalisms. It has not yet been shown,…

Natural Language Processing is a subject that combines computer science and linguistics, aiming to provide computers with the ability to understand natural language and to develop a more intuitive human-computer interaction. The research community has developed ways to translate natural language to mathematical formalisms. It has not yet been shown, however, how to automatically translate different kinds of knowledge in English to distinct formal languages. Most of the recent work presents the problem that the translation method aims to a specific formal language or is hard to generalize. In this research, I take a first step to overcome this difficulty and present two algorithms which take as input two lambda-calculus expressions G and H and compute a lambda-calculus expression F. The expression F returned by the first algorithm satisfies F@G=H and, in the case of the second algorithm, we obtain G@F=H. The lambda expressions represent the meanings of words and sentences. For each formal language that one desires to use with the algorithms, the language must be defined in terms of lambda calculus. Also, some additional concepts must be included. After doing this, given a sentence, its representation and knowing the representation of several words in the sentence, the algorithms can be used to obtain the representation of the other words in that sentence. In this work, I define two languages and show examples of their use with the algorithms. The algorithms are illustrated along with soundness and completeness proofs, the latter with respect to typed lambda-calculus formulas up to the second order. These algorithms are a core part of a natural language semantics system that translates sentences from English to formulas in different formal languages.

ContributorsAlvarez Gonzalez, Marcos (Author) / Baral, Chitta (Thesis advisor) / Lee, Joohyung (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)

Created2010

Medical Question Answering using Instructional Prompts

Description

Instructional prompts are a novel technique that can significantly improve the performance of natural language processing tasks by specifying the task instruction to the language model. This is the first paper that uses instructional prompts to improve performance of the question answering task in biomedical domain. This work makes two…

Instructional prompts are a novel technique that can significantly improve the performance of natural language processing tasks by specifying the task instruction to the language model. This is the first paper that uses instructional prompts to improve performance of the question answering task in biomedical domain. This work makes two significant contributions. Firstly, a question answer dataset of 600K question answer pairs has been developed by using the medical textbook ‘Differential Diagnosis Primary Care’, which contains information on how to diagnose a patient by observing their disease symptoms. Secondly, a question answering language model augmented with instructional prompts has been developed by training on the medical information extracted from the book ‘Differential Diagnosis Primary Care’. Experiments have been conducted to demonstrate that it performs better than a normal question answering model that does not use instructional prompts. Instructional prompts are based on prompt tuning and prefix tuning, which are novel techniques which can help train language model to do specific downstream tasks by keeping majority of model parameters frozen, and only optimizing a small number of continuous task-specific vectors (called the prefixes).

ContributorsSaxena, Sharad (Author) / Baral, Chitta (Thesis advisor) / Blanco, Eduardo (Committee member) / Anwar, Saadat (Committee member) / Arizona State University (Publisher)

Created2021

Event Detection as Multi-Task Text Generation

Description

Event detection refers to the task of identifying event occurrences in a given natural language text. Event detection comprises two subtasks; recognizing event mention (event identification) and the type of event (event classification). Breaking from the sequence labeling and word classification approaches, this work models event detection, and its constituent…

Event detection refers to the task of identifying event occurrences in a given natural language text. Event detection comprises two subtasks; recognizing event mention (event identification) and the type of event (event classification). Breaking from the sequence labeling and word classification approaches, this work models event detection, and its constituent subtasks of trigger identification and trigger classification, as independent sequence generation tasks. This work proposes a prompted multi-task generative model trained on event identification, classification, and combined event detection. The model is evaluated on on general-domain and biomedical-domain event detection datasets, achieving state-of-the-art results on the general-domain Roles Across Multiple Sentences (RAMS) dataset, establishing event detection benchmark performance on WikiEvents, and achieving competitive performance on the general-domain Massive Event Detection (MAVEN) dataset and the biomedical-domain Multi-Level Event Extraction (MLEE) dataset.

ContributorsAnantheswaran, Ujjwala (Author) / Baral, Chitta (Thesis advisor) / Kerner, Hannah (Committee member) / Gopalan, Nakul (Committee member) / Arizona State University (Publisher)

Created2022

Filtering by