Towards Development of Models that Learn New Tasks from Instructions

Description

Humans have the remarkable ability to solve different tasks by simply reading textual instructions that define the tasks and looking at a few examples. Natural Language Processing (NLP) models built with the conventional machine learning paradigm, however, often struggle to generalize across tasks (e.g., a question-answering system cannot solve classification tasks) despite training on many examples. A long-standing challenge in Artificial Intelligence (AI) is to build a model that learns a new task by understanding the human-readable instructions that define it. To study this, I led the development of NATURAL INSTRUCTIONS and SUPERNATURAL INSTRUCTIONS, large-scale datasets of diverse tasks, their human-authored instructions, and instances. I adopt generative pre-trained language models to encode task-specific instructions along with the input and generate the task output. Empirical results in my experiments indicate that instruction tuning helps models achieve cross-task generalization. This leads to the question: how should good instructions be written? Backed by extensive empirical analysis on large language models, I identify important attributes of successful instructional prompts and propose several reframing techniques that model designers can use to create such prompts. My experiments show that reframing notably improves few-shot learning performance; this is particularly important for large language models such as GPT-3, where tuning models or prompts on large datasets is expensive. In another experiment, I observe that representing the chain-of-thought instruction for mathematical reasoning questions as a program improves model performance significantly. This observation leads to the development of a large-scale mathematical reasoning model, BHASKAR, and a unified benchmark, LILA. For program synthesis tasks, however, summarizing a question (instead of expanding it, as in chain of thought) helps models significantly.
This thesis also studies instruction-example equivalence, the power of decomposition instructions to replace the need for new models, and the origin of dataset bias in crowdsourcing instructions, in order to better understand the advantages and disadvantages of the instruction paradigm. Finally, I apply the instruction paradigm to match real user needs and introduce a new prompting technique, HELP ME THINK, to help humans perform various tasks by asking questions.
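
To make the idea of encoding task-specific instructions along with the input more concrete, the following is a minimal sketch of how an instructional prompt might be assembled before being passed to a generative language model. The function name, the field labels ("Definition", "Input", "Output"), and the sentiment task are illustrative assumptions and are not taken from the thesis or its datasets.

```python
# Hypothetical sketch: assemble a task definition, a few demonstrations,
# and a new input into one text prompt. Labels and layout are assumptions.

def build_instruction_prompt(definition, examples, task_input):
    """Concatenate a task definition, few-shot examples, and a new input."""
    parts = [f"Definition: {definition.strip()}"]
    for i, (ex_in, ex_out) in enumerate(examples, start=1):
        parts.append(f"Example {i}\nInput: {ex_in}\nOutput: {ex_out}")
    # The prompt ends with an open "Output:" slot for the model to complete.
    parts.append(f"Now complete the following.\nInput: {task_input}\nOutput:")
    return "\n\n".join(parts)


prompt = build_instruction_prompt(
    "Classify the sentiment of the review as positive or negative.",
    [
        ("Great battery life.", "positive"),
        ("Stopped working in a week.", "negative"),
    ],
    "The screen is sharp and bright.",
)
print(prompt)
```

A model trained on many such (prompt, output) pairs across different tasks would then be evaluated on prompts built from task definitions it has not seen.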

Contributors: Mishra, Swaroop (Author) / Baral, Chitta (Thesis advisor) / Mitra, Arindam (Committee member) / Blanco, Eduardo (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)

Created: 2023

Knowledge Representation, Reasoning and Learning for Non-Extractive Reading Comprehension

Description

While deep learning (DL) based approaches have in recent years been the popular choice for developing end-to-end question answering (QA) systems, such systems lack several desired properties, such as the ability to do sophisticated reasoning with knowledge, the ability to learn using fewer resources, and interpretability. In this thesis, I explore solutions that aim to address these drawbacks.

Towards this goal, I work with a specific family of reading comprehension tasks, normally referred to as Non-Extractive Reading Comprehension (NRC), where the given passage does not contain enough information, so answering correctly requires sophisticated reasoning and "additional knowledge". I have organized the NRC tasks into three categories. Here I present my solutions to the first two categories and some preliminary results on the third category.

Category 1 NRC tasks refer to the scenarios where the required "additional knowledge" is missing but there exists a decent natural language parser. For these tasks, I learn the missing "additional knowledge" with the help of the parser and a novel inductive logic programming approach. The learned knowledge is then used to answer new questions. Experiments on three NRC tasks show that this approach, in addition to providing an interpretable solution, achieves better or comparable accuracy relative to state-of-the-art DL based approaches.

Category 2 NRC tasks refer to the alternate scenario where the "additional knowledge" is available but no natural language parser works well for the sentences of the target domain. To deal with these tasks, I present a novel hybrid reasoning approach that combines symbolic reasoning and natural language inference (neural reasoning) and ultimately allows symbolic modules to reason over raw text without requiring any translation. Experiments on two NRC tasks show its effectiveness.

Category 3 NRC tasks provide neither the "missing knowledge" nor a good parser. This thesis does not provide an interpretable solution for this category, only some preliminary results and an analysis of a pure DL based approach. Nonetheless, the thesis shows that, beyond the world of pure DL based approaches, there are tools that can offer interpretable solutions for challenging tasks without using many resources and possibly with better accuracy.

Contributors: Mitra, Arindam (Author) / Baral, Chitta (Thesis advisor) / Lee, Joohyung (Committee member) / Yang, Yezhou (Committee member) / Devarakonda, Murthy (Committee member) / Arizona State University (Publisher)

Created: 2019

Interpretable Question Answering using Deep Embedded Knowledge Reasoning to Solve Qualitative Word Problems

Description

One way to measure the intelligence of a system is through Question Answering, as it requires a system to comprehend a question and reason using its knowledge base to answer it accurately. Qualitative word problems are an important subset of such problems, as they require a system to recognize and reason with qualitative knowledge expressed in natural language. Traditional approaches in this domain include multiple modules to parse a given problem and to perform the required reasoning. Recent approaches use large pre-trained language models such as Bidirectional Encoder Representations from Transformers (BERT) for downstream question answering tasks through supervision. These approaches, however, either suffer from errors propagating between multiple modules or are not interpretable with respect to the reasoning process employed. The proposed solution in this work aims to overcome these drawbacks through a single end-to-end trainable model that performs both the required parsing and reasoning. The parsing is achieved through an attention mechanism, whereas the reasoning is performed in vector space using soft logic operations. The model also enforces constraints in the form of auxiliary loss terms to increase the interpretability of the underlying reasoning process. The work achieves state-of-the-art accuracy on the QuaRel dataset and matches the state of the art on the QuaRTz dataset, with additional interpretability.
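
As an illustration of what soft logic operations in vector space can look like, the sketch below applies differentiable versions of NOT, AND, and OR elementwise to vectors of truth values in [0, 1]. The abstract does not say which operators the model uses; the product-style definitions and the function names here are assumptions chosen only because they are a common, simple option.

```python
# Hypothetical sketch: product-style soft logic on vectors of truth values
# in [0, 1]. These operator definitions are an assumption, not the thesis's.

def soft_not(a):
    """Elementwise negation: 1 - a."""
    return [1.0 - x for x in a]

def soft_and(a, b):
    """Elementwise conjunction: a * b."""
    return [x * y for x, y in zip(a, b)]

def soft_or(a, b):
    """Elementwise disjunction: a + b - a * b."""
    return [x + y - x * y for x, y in zip(a, b)]


p = [1.0, 0.5, 0.0]
q = [1.0, 0.5, 1.0]
print(soft_and(p, q))
print(soft_or(p, q))
print(soft_not(p))
```

At the extremes 0 and 1 these operators agree with Boolean logic, and in between they vary smoothly, which is what allows such operations to sit inside an end-to-end trainable model.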

Contributors: Narayana, Sanjay (Author) / Baral, Chitta (Thesis advisor) / Mitra, Arindam (Committee member) / Anwar, Saadat (Committee member) / Arizona State University (Publisher)

Created: 2020

Comparative Genome Analysis of the High Pathogenicity Salmonella Typhimurium Strain UK-1

Description

Salmonella enterica serovar Typhimurium, a Gram-negative facultative rod-shaped bacterium causing salmonellosis and foodborne disease, is one of the most commonly isolated Salmonella serovars in both developed and developing nations. Several S. Typhimurium genomes have been completed and many more genome-sequencing projects are underway. Comparative genome analysis of multiple strains leads to a better understanding of the evolution of S. Typhimurium and its pathogenesis. S. Typhimurium strain UK-1 (which belongs to phage type 1) is highly virulent when orally administered to mice and chickens and efficiently colonizes lymphoid tissues of these species. These characteristics make this strain a good choice for use in vaccine development. In fact, UK-1 has been used as the parent strain for a number of nonrecombinant and recombinant vaccine strains, including several commercial vaccines for poultry. In this study, we conducted a thorough comparative genome analysis of the UK-1 strain with other S. Typhimurium strains and examined the phenotypic impact of several genomic differences. Whole-genome comparison highlights an extremely close relationship between the UK-1 strain and other S. Typhimurium strains; however, many interesting genetic and genomic variations specific to UK-1 were explored. In particular, deletion of a UK-1-specific gene that is highly similar to the gene encoding the T3SS effector protein NleC resulted in a significant decrease in oral virulence in BALB/c mice. The complete genetic complements in UK-1, especially those elements that contribute to virulence or aid in determining the diversity within bacterial species, provide key information for the functional characterization of important genetic determinants and for the development of vaccines.

Contributors: Luo, Yingqin (Author) / Kong, Qingke (Author) / Yang, Jiseon (Author) / Mitra, Arindam (Author) / Golden, Greg (Author) / Wanda, Soo-Young (Author) / Roland, Kenneth (Author) / Jensen, Roderick V. (Author) / Ernst, Peter B. (Author) / Curtiss, Roy (Author) / ASU Biodesign Center Immunotherapy, Vaccines and Virotherapy (Contributor) / Biodesign Institute (Contributor)

Created: 2012-07-06

BarA-UvrY Two-Component System Regulates Virulence of Uropathogenic E. Coli CFT073

Description

Uropathogenic Escherichia coli (UPEC), a member of extraintestinal pathogenic E. coli, causes ∼80% of community-acquired urinary tract infections (UTI) in humans. UPEC initiates its colonization in epithelial cells lining the urinary tract with a complicated life cycle, replicating and persisting in intracellular and extracellular niches. Consequently, UPEC causes cystitis and the more severe pyelonephritis. To further understand the virulence characteristics of UPEC, we investigated the role of the BarA-UvrY two-component system (TCS) in regulating UPEC virulence. Our results showed that mutation of the BarA-UvrY TCS significantly decreased the virulence of UPEC CFT073, as assessed by mouse urinary tract infection, chicken embryo killing assay, and cytotoxicity assay on human kidney and uroepithelial cell lines. Furthermore, mutation of either the barA or the uvrY gene reduced the production of hemolysin, lipopolysaccharide (LPS), proinflammatory cytokines (TNF-α and IL-6), and chemokine (IL-8). The virulence phenotype was restored to a level similar to that of the wild type by complementation of either the barA or the uvrY gene in trans. In addition, we discuss a possible link between the BarA-UvrY TCS and CsrA in positively and negatively controlling virulence in UPEC. Overall, this study provides evidence that the BarA-UvrY TCS regulates the virulence of UPEC CFT073 and may point to mechanisms by which different modes of virulence regulation control the long-term survival of UPEC in the urinary tract.

Contributors: Palaniyandi, Senthilkumar (Author) / Mitra, Arindam (Author) / Herren, Christopher D. (Author) / Lockatell, C. Virginia (Author) / Johnson, David E. (Author) / Zhu, Xiaoping (Author) / Mukhopadhyay, Suman (Author) / ASU Biodesign Center Immunotherapy, Vaccines and Virotherapy (Contributor) / Biodesign Institute (Contributor)

Created: 2012-02-21