Matching Items (34)

Voice Reconfigurable Networks

Description

The software element of home and small business networking solutions has failed to keep pace with the annual development of newer and faster hardware. The software running on these devices is an afterthought, oftentimes equipped with minimal features, an obtuse user interface, or both. At the same time, this past year has seen the rise of smart home assistants that represent the next step in human-computer interaction with their advanced use of natural language processing. This project seeks to quell the issues with the former by exploring a possible fusion of a powerful, feature-rich software-defined networking stack and the natural language processing tools of smart home assistants. To accomplish these ends, a piece of software was developed to leverage the powerful natural language processing capabilities of one such smart home assistant, the Amazon Echo. On one end, this software interacts with Amazon Web Services to retrieve information about a user's speech patterns and key information contained in their speech. On the other end, the software joins that information with its previous session state to intelligently translate speech into a series of commands for the separate components of a networking stack. The software developed for this project empowers a user to quickly make changes to several facets of their networking gear, or acquire information about it, with just their language: no terminals, Java applets, or web configuration interfaces are needed, circumventing clunky UIs and the need to jump from shell to shell. It is the author's hope that showing how networking equipment can be configured in this innovative way will draw more attention to the current failings of networking equipment and inspire a new series of intuitive user interfaces.
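
As a rough illustration of the translation layer described above, the sketch below maps a recognized voice intent, joined with remembered session state, onto a concrete command for the networking stack. The intent names, slots, and commands are invented for illustration; the project's actual Alexa skill schema and command set are not reproduced here.

```python
# Invented sketch of the speech-to-command translation layer; intent names,
# slot names, and commands are illustrative placeholders.
COMMAND_TEMPLATES = {
    "BlockDeviceIntent": "iptables -A FORWARD -m mac --mac-source {mac} -j DROP",
    "ShowBandwidthIntent": "vnstat -i {interface}",
}

SESSION = {"last_interface": "eth0"}  # state carried over from the previous session

def handle_intent(intent_name, slots):
    """Join spoken slot values with session state and emit a concrete command."""
    template = COMMAND_TEMPLATES.get(intent_name)
    if template is None:
        return None
    # Fall back on remembered state when a slot was not spoken.
    slots.setdefault("interface", SESSION["last_interface"])
    SESSION["last_interface"] = slots["interface"]
    return template.format(**slots)

print(handle_intent("ShowBandwidthIntent", {}))  # -> vnstat -i eth0
```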

Date Created
  • 2016-12

Exploring Financial Credit Contracts Using Natural Language Processing Techniques

Description

Natural Language Processing (NLP) techniques have increasingly been used in finance, accounting, and economics research to analyze text-based information more efficiently and effectively than primarily human-centered methods. The literature is rich with computational textual analysis techniques applied to consistent annual or quarterly financial filings, with promising results in identifying similarities between documents and firms, and in relating this information to other economic phenomena. Building upon the knowledge gained from previous research and extending the application of NLP methods to other categories of financial documents, this project explores financial credit contracts, using their textual data to assess patterns and relationships between documents and firms. The main methods used throughout this project are Term Frequency-Inverse Document Frequency (to represent each document as a numerical vector), Cosine Similarity (to measure the similarity between contracts), and K-Means Clustering (to organically derive clusters of documents based on the text included in the contract itself). Using these methods, the dimensions analyzed are grouping methodology (external industry classifications versus text-derived classifications), granularity (document-wise and firm-wise), different financial documents associated with a single firm (the relationship between credit contracts and 10-K product descriptions), and how mean cosine similarity distributions change over time.
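
The three methods named above compose naturally into a short pipeline. A minimal sketch with scikit-learn, using invented contract snippets in place of the real corpus:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.cluster import KMeans

# Toy stand-ins for credit contract texts.
contracts = [
    "The borrower shall maintain a minimum fixed charge coverage ratio.",
    "Revolving credit facility subject to a borrowing base of eligible receivables.",
    "The lenders agree to extend a term loan secured by substantially all assets.",
]

# Represent each contract as a TF-IDF vector.
X = TfidfVectorizer(stop_words="english").fit_transform(contracts)

# Pairwise cosine similarity between contracts.
print(cosine_similarity(X).round(2))

# Derive clusters organically from the contract text itself.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)
```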

Date Created
  • 2020-05

Prescription Information Extraction from Electronic Health Records using BiLSTM-CRF and Word Embeddings

Description

Medical records are increasingly being recorded in the form of electronic health records (EHRs), with a significant amount of patient data recorded as unstructured natural language text. Consequently, being able to extract and utilize clinical data present within these records is an important step in furthering clinical care. One important aspect within these records is the presence of prescription information. Existing techniques for extracting prescription information — which includes medication names, dosages, frequencies, reasons for taking, and mode of administration — from unstructured text have focused on the application of rule- and classifier-based methods. While state-of-the-art systems can be effective in extracting many types of information, they require significant effort to develop hand-crafted rules and conduct effective feature engineering. This paper presents a bidirectional LSTM with a CRF tagging model, initialized with precomputed word embeddings, for extracting prescription information from sentences without requiring significant feature engineering. Evaluated on the i2b2 2009 dataset, the model achieves an F1 macro score of 0.8562 and scores above 0.9449 on four of the six categories, indicating significant potential for this approach.
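
A minimal sketch of such a tagger, assuming the pytorch-crf package for the CRF layer; the embedding size, hidden size, and tag inventory below are placeholders, not the paper's actual configuration:

```python
import torch
import torch.nn as nn
from torchcrf import CRF  # pip install pytorch-crf

class BiLSTMCRF(nn.Module):
    def __init__(self, vocab_size, num_tags, embed_dim=100, hidden_dim=128,
                 pretrained=None):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        if pretrained is not None:
            # Initialize with precomputed word embeddings when available.
            self.embed.weight.data.copy_(pretrained)
        self.lstm = nn.LSTM(embed_dim, hidden_dim // 2, bidirectional=True,
                            batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, tokens, tags=None, mask=None):
        emissions = self.fc(self.lstm(self.embed(tokens))[0])
        if tags is not None:
            return -self.crf(emissions, tags, mask=mask)  # training loss
        return self.crf.decode(emissions, mask=mask)      # best tag sequences

# Assumed BIO scheme over six prescription fields plus an outside tag -> 13 tags.
model = BiLSTMCRF(vocab_size=5000, num_tags=13)
tokens = torch.randint(0, 5000, (2, 12))  # two dummy 12-token sentences
print(model(tokens))
```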

Date Created
  • 2018-05

Using Machine Learning Models to Detect Fake News, Bots, and Rumors on Social Media

Description

In this paper, I introduce the fake news problem and detail how it has been exacerbated through social media. I explore current practices for fake news detection using natural language processing and current benchmarks in ranking the efficacy of various language models. Using a Twitter-specific benchmark, I attempt to reproduce the scores of six language models demonstrating their effectiveness in seven tweet classification tasks. I explain the successes and challenges in reproducing these results and provide analysis for the future implications of fake news research.
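
Reproduction work of this kind typically fine-tunes each pretrained language model on the benchmark's tweet classification tasks. A generic sketch with the Hugging Face transformers library; the checkpoint and example tweets are placeholders, since the six models evaluated in the paper are not named here:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder checkpoint; each of the paper's six models would be loaded this way.
name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

tweets = [
    "BREAKING: celebrity endorses miracle cure, doctors hate it!",
    "City council meets Tuesday to vote on the new transit budget.",
]
batch = tokenizer(tweets, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    probs = model(**batch).logits.softmax(dim=-1)
print(probs)  # untrained classification head: fine-tuning on labeled tweets comes first
```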

Date Created
  • 2021-05

Instructional Design with Natural Language Processing in a Virtual Reality Environment

Description

Natural Language Processing and Virtual Reality are both hot topics at present. How can we synthesize the two into a cohesive experience? The game has users issue vocal commands, build structures, and memorize spatial objects. To interpret vocal commands, the IBM Watson natural language processing API was incorporated into our game system. User experience elements like gestures, UI color changes, and images were used to help guide users in memorizing and building structures. The process of creating these elements was streamlined through the VRTK library in Unity. The game has two segments. The first segment is a tutorial level where the user learns to perform motions and in-game actions. The second segment is a game where the user must correctly create a structure by utilizing vocal commands and spatial recognition. A standardized usability test, the System Usability Scale, was used to evaluate the effectiveness of the game. A survey was also created to gather more descriptive user opinions. Overall, users gave a positive score on the System Usability Scale and slightly positive reviews in the custom survey.
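
The actual project calls the IBM Watson API from Unity; as a language-neutral stand-in, the toy parser below illustrates the kind of command-to-action mapping the game performs on transcribed speech. The command patterns and action names are invented:

```python
import re

# Toy stand-in for the intent extraction the game obtains from IBM Watson;
# the patterns and actions here are invented for illustration.
COMMANDS = {
    r"(?:place|build)\s+(?:a\s+|an\s+)?(?P<shape>cube|sphere|pyramid)": "spawn",
    r"(?:remove|delete)\s+the\s+(?P<shape>cube|sphere|pyramid)": "despawn",
}

def parse_command(utterance):
    """Map a transcribed vocal command to an in-game action and its target."""
    for pattern, action in COMMANDS.items():
        match = re.search(pattern, utterance.lower())
        if match:
            return action, match.group("shape")
    return None, None

print(parse_command("Please place a cube next to the sphere"))  # ('spawn', 'cube')
```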

Date Created
  • 2018-05

Conjugating Honorifics in English-to-Japanese Machine Translation

Description

This research lays down foundational work in the semantic reconstruction of linguistic politeness in English-to-Japanese machine translation and thereby advances semantic-based automated translation of English into other natural languages. I developed a Java project called the PoliteParser that is intended as a plug-in to existing semantic parsers to determine whether verbs in dialogue in an English corpus should be conjugated into the plain or the polite honorific form when translated into Japanese. The PoliteParser bases this decision on semantic information about the social relationship between the speaker and the listener, the speaker's personality, and the circumstances of the utterance. Testing conducted during this research demonstrates that the PoliteParser can achieve levels of accuracy 31 percentage points higher than those of statistical translation systems when integrated with a semantic parser, and 54 percentage points higher when used with pre-parsed data.
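
The decision can be pictured as a small function over those three inputs. The sketch below is an invented illustration of such a rule set; the feature names and rule ordering are hypothetical, not taken from the PoliteParser itself:

```python
# Hypothetical politeness-register rules over the three inputs the
# PoliteParser considers; not the project's actual logic.
def choose_register(speaker, listener, circumstance):
    """Return 'polite' (teineigo) or 'plain' form for a verb in dialogue."""
    # Social distance dominates: strangers and superiors get the polite form.
    if listener.get("is_superior") or not speaker.get("knows_listener"):
        return "polite"
    # A formal setting overrides familiarity.
    if circumstance.get("formal"):
        return "polite"
    # Familiar equals in casual settings default to the plain form.
    return "plain"

speaker = {"knows_listener": True, "personality": "blunt"}
listener = {"is_superior": False}
print(choose_register(speaker, listener, {"formal": False}))  # plain
```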

Date Created
  • 2016-12

Learning the Initial Lexicon in Translating Natural Language to Formal Language

Description

The objective of this research is to determine an approach for automating the learning of the initial lexicon used in translating natural language sentences to their formal knowledge representations based on lambda-calculus expressions. Using a universal knowledge representation and its associated parser, this research applies word alignment techniques to align natural language sentences with the linearized parses of their associated knowledge representations in order to learn the meanings of individual words. The work proposes and analyzes an approach that can learn some of the initial lexicon.
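
A minimal sketch of the co-occurrence intuition behind alignment-based lexicon learning, with invented sentence/parse pairs standing in for the real corpus; real word aligners such as IBM Model 1 use EM rather than this one-step scoring:

```python
from collections import Counter, defaultdict

# Invented sentence/parse pairs; the real data aligns sentences with
# linearized lambda-calculus parses.
pairs = [
    ("the robot picks the box", "pick ( robot , box )"),
    ("the robot drops the box", "drop ( robot , box )"),
]

cooc = defaultdict(Counter)   # word -> co-occurring parse symbols
sym_freq = Counter()          # how many parses each symbol appears in
for sentence, parse in pairs:
    symbols = {s for s in parse.split() if s not in "(),"}
    sym_freq.update(symbols)
    for word in set(sentence.split()):
        for symbol in symbols:
            cooc[word][symbol] += 1

def best_symbol(word):
    # Prefer symbols that occur mostly alongside this word (exclusivity).
    return max(cooc[word], key=lambda s: cooc[word][s] / sym_freq[s])

print(best_symbol("picks"), best_symbol("drops"))  # pick drop
```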

Date Created
  • 2015-05

Language Image Transformer

Description

Humans perceive the environment using multiple modalities like vision, speech (language), touch, taste, and smell. The knowledge obtained from one modality usually complements the others. Learning through several modalities helps in constructing an accurate model of the environment. Most current vision and language models are modality-specific and, in many cases, extensively use deep-learning-based attention mechanisms for learning powerful representations. This work discusses the role of attention in associating vision and language to generate a shared representation. The Language Image Transformer (LIT) is proposed for learning multi-modal representations of the environment. It uses a training objective based on Contrastive Predictive Coding (CPC) to maximize the Mutual Information (MI) between the visual and linguistic representations, and it learns the relationship between the modalities using the proposed cross-modal attention layers. It is trained and evaluated on the MS COCO and Conceptual Captions captioning datasets. The results and the analysis offer a perspective on the use of Mutual Information Maximization (MIM) for generating generalizable representations across multiple modalities.
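
The CPC-style objective treats each image's own caption in a batch as the positive pair and every other caption as a negative, which lower-bounds the mutual information between the two representations. A generic InfoNCE sketch in PyTorch, not the LIT model's exact head:

```python
import torch
import torch.nn.functional as F

def infonce_loss(image_repr, text_repr, temperature=0.07):
    """Symmetric contrastive loss over a batch of matched image/text vectors."""
    image_repr = F.normalize(image_repr, dim=-1)
    text_repr = F.normalize(text_repr, dim=-1)
    logits = image_repr @ text_repr.t() / temperature  # pairwise similarities
    targets = torch.arange(logits.size(0))             # diagonal = positive pairs
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

img = torch.randn(8, 256)  # batch of visual representations
txt = torch.randn(8, 256)  # matching linguistic representations
print(infonce_loss(img, txt))
```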

Date Created
  • 2020

Automatic text summarization using importance of sentences for email corpus

Description

With the advent of the Internet, data is being added online at an enormous rate. Though search engines use IR techniques to handle users' search requests, the results are often not well matched to the query, and the user has to go through several webpages before reaching the one he/she wanted. This problem of information overload can be addressed with automatic text summarization. Summarization is the process of producing an abridged version of a document so that a user can get a quick view of what the document is about. Email threads from W3C are used in this system. Apart from common IR features like Term Frequency and Inverse Document Frequency, the system implements Term Rank, a variation of PageRank based on a graph model that can cluster words with respect to word ambiguity. Term Rank also considers the possibility of co-occurrence of words within the corpus and evaluates the rank of a word accordingly. Sentences of email threads are ranked on these features and summaries are generated. The system implements the concept of pyramid evaluation in content selection, and it can be considered a framework for unsupervised learning in text summarization.
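
A minimal sketch of the extractive step: score each sentence by the average TF-IDF weight of its terms and keep the top-ranked ones. The toy sentences are invented, and the graph-based Term Rank feature is omitted:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy email sentences standing in for a W3C email thread.
sentences = [
    "The working group will meet on Thursday to review the draft specification.",
    "Please send any editorial comments on the specification before the meeting.",
    "Lunch will be provided in the main conference room.",
]

X = TfidfVectorizer(stop_words="english").fit_transform(sentences)
scores = np.asarray(X.mean(axis=1)).ravel()  # average term weight per sentence
top = sorted(scores.argsort()[::-1][:2])     # keep the two highest, in order
print([sentences[i] for i in top])
```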

Date Created
  • 2015

Domain-Agnostic Context-Aware Assistant Framework for Task-Based Environment

Description

Smart home assistants are becoming the norm due to their ease of use. They employ spoken language as an interface, facilitating easy interaction with their users. Even with their obvious advantages, natural-language-based interfaces are not prevalent outside the domain of home assistants. They are hard to adopt for computer-controlled systems because of the numerous complexities involved in implementing them across varying fields. The main challenge is the grounding of natural language terms into the underlying system's primitives. The existing systems that do use natural language interfaces are specific to a single problem domain.

In this thesis, a domain-agnostic framework that creates natural language interfaces for computer-controlled systems has been developed by making the mapping between language constructs and system primitives customizable. The framework employs ontologies built using OWL (Web Ontology Language) for knowledge representation and machine learning models for language processing tasks. It has been evaluated within a simulation environment consisting of objects and a robot. This environment has been deployed as a web application, providing anonymous user testing for evaluation and generating training data for the machine learning components. Performance has been evaluated on metrics such as the time taken to complete a task and the number of instructions the user gives the robot to accomplish it. Additionally, the framework has been used to create a natural language interface for a database system, demonstrating its domain independence.
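
The customizability described above can be pictured as swapping out a per-domain grounding table while the rest of the pipeline stays fixed. A toy sketch, with invented constructs and primitives for the robot and database domains:

```python
# Invented grounding tables; only this mapping changes between domains,
# illustrating the customizable language-to-primitive binding.
GROUNDINGS = {
    "robot": {"grab": "gripper.close", "go to": "base.move_to"},
    "database": {"show": "SELECT", "remove": "DELETE"},
}

def ground(domain, utterance):
    """Resolve the first known language construct to a domain primitive."""
    for construct, primitive in GROUNDINGS[domain].items():
        if construct in utterance.lower():
            return primitive
    raise ValueError("no grounding for: " + utterance)

print(ground("robot", "Go to the red block"))    # base.move_to
print(ground("database", "Show all employees"))  # SELECT
```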

Date Created
  • 2020