Description
With the advent of sophisticated computer technology, we increasingly see the use of computational techniques in the study of problems from a variety of disciplines, including the humanities. In a field such as poetry, where classic works are subject to frequent re-analysis over the course of years, decades, or even centuries, there is a certain demand for fresh approaches to familiar tasks, and such breaks from convention may even be necessary for the advancement of the field. Existing quantitative studies of poetry have employed computational techniques in their analyses; however, there remains work to be done on deploying deep neural networks on large corpora of poetry to classify portions of the works contained therein based on certain features. While applications of neural networks to social media sites, consumer reviews, and other web-originated data are common within computational linguistics and natural language processing, comparatively little work has been done on the computational analysis of poetry using the same techniques. In this work, I lay out first steps for the study of poetry using neural networks. Using a convolutional neural network to classify author birth date, I was able not only to extract a non-trivial signal from the data but also to identify clustering in by-author model accuracy. While definitive conclusions about the cause of this clustering were not reached, investigating it reveals immense heterogeneity in the traits of accurately classified authors. Further study may unpack this clustering and reveal key insights about how temporal information is encoded in poetry. The study of poetry using neural networks remains wide open but shows potential to be an interesting and deep area of work.
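The abstract does not specify the network's architecture, so purely as an illustrative sketch: the core operation of a text CNN is a one-dimensional convolution over token embeddings followed by max-over-time pooling, which turns a text of arbitrary length into a fixed-length feature vector for classification. A minimal NumPy version (all sizes and filter widths below are invented for illustration, not taken from the thesis):

```python
import numpy as np

def conv1d_text(embeddings, filters):
    """Slide each filter (a width-k n-gram detector) over the token
    embeddings, apply ReLU, then max-over-time pooling: one scalar
    feature per filter, regardless of sequence length."""
    seq_len, _ = embeddings.shape
    n_filters, k, _ = filters.shape
    features = np.empty(n_filters)
    for f in range(n_filters):
        acts = [np.sum(embeddings[i:i + k] * filters[f])
                for i in range(seq_len - k + 1)]
        features[f] = max(0.0, max(acts))  # ReLU folded into the max-pool
    return features

rng = np.random.default_rng(0)
tokens = rng.normal(size=(12, 8))     # 12 tokens, 8-dim embeddings (invented sizes)
filters = rng.normal(size=(4, 3, 8))  # 4 trigram filters
feats = conv1d_text(tokens, filters)
print(feats.shape)  # (4,)
```

A real classifier would stack several filter widths and feed the pooled features to a softmax layer.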
Contributors: Goodloe, Oscar Laurence (Author) / Nishimura, Joel (Thesis director) / Broatch, Jennifer (Committee member) / School of Mathematical and Natural Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created: 2019-05
Description
The explosive Web growth of the last decade has drastically changed the way billions of people around the globe conduct numerous activities, including creating, sharing, and consuming information. The massive amount of user-generated information encourages companies and service providers to collect users' information and use it both to advance their own goals and to provide personalized services to users. However, users' information contains private and sensitive details, and its use can lead to breaches of privacy. Anonymizing users' information before publishing or using such data is vital to securing their privacy. Because user information takes many forms (e.g., network structure, interactions), different techniques are required to anonymize each type of data. In this thesis, we first discuss anonymization techniques for various types of user-generated data, i.e., network graphs, web browsing history, and user-item interactions. Our experimental results show the effectiveness of such techniques for data anonymization. We then briefly touch on securely and privately sharing information through blockchains.
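As one concrete illustration of anonymization for structured user data (a standard technique, not necessarily one of the methods used in this thesis): a table is k-anonymous when every combination of quasi-identifier values is shared by at least k records, so no individual stands out on those fields. A minimal check, with invented toy records:

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values is shared by
    at least k records, so no individual stands out on those fields."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(c >= k for c in counts.values())

rows = [  # toy, pre-generalized records (invented for this sketch)
    {"age": "20-30", "zip": "85*", "diagnosis": "flu"},
    {"age": "20-30", "zip": "85*", "diagnosis": "cold"},
    {"age": "30-40", "zip": "86*", "diagnosis": "flu"},
    {"age": "30-40", "zip": "86*", "diagnosis": "flu"},
]
print(is_k_anonymous(rows, ["age", "zip"], k=2))  # True
print(is_k_anonymous(rows, ["age", "zip"], k=3))  # False
```

Anonymizing graphs or browsing histories, as the thesis does, requires different machinery, but the underlying goal (bounding how identifiable any one user is) is the same.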
Contributors: Nou, Alex Sheavin (Author) / Liu, Huan (Thesis director) / Beigi, Ghazaleh (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created: 2019-05
Description
In the field of machine learning, reinforcement learning stands out for its ability to discover approaches to complex, high-dimensional problems that outperform even expert humans. For robotic locomotion tasks, reinforcement learning offers a way to solve them without hand-designed controllers. In this thesis, two reinforcement learning algorithms, Deep Deterministic Policy Gradient and Group Factor Policy Search, are compared based upon their performance in the bipedal walking environment provided by OpenAI Gym. The algorithms are evaluated on their performance in the environment and on their sample efficiency.
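The full DDPG and Group Factor Policy Search algorithms are lengthy, but both are evaluated through OpenAI Gym's standard reset/step interface. The sketch below substitutes a self-contained stub for the real BipedalWalker environment (its 24-dimensional observations and 4 motor torques are the only details borrowed; the stub's dynamics and reward are placeholders) to show the rollout loop used to measure performance as mean episode return:

```python
import random

class StubEnv:
    """Stand-in exposing the classic Gym reset/step interface, used here
    instead of BipedalWalker so the sketch runs with no dependencies."""
    def reset(self):
        self.t = 0
        return [0.0] * 24                     # BipedalWalker observations are 24-dim
    def step(self, action):
        self.t += 1
        obs = [random.uniform(-1, 1) for _ in range(24)]
        reward = -sum(a * a for a in action)  # placeholder cost, not the real reward
        done = self.t >= 50                   # fixed-length episodes for the stub
        return obs, reward, done, {}

def evaluate(env, policy, episodes=3):
    """Roll out a policy and return mean episode return, the headline
    metric on which the two algorithms are compared."""
    returns = []
    for _ in range(episodes):
        obs, done, total = env.reset(), False, 0.0
        while not done:
            obs, reward, done, _ = env.step(policy(obs))
            total += reward
        returns.append(total)
    return sum(returns) / len(returns)

random_policy = lambda obs: [random.uniform(-1, 1) for _ in range(4)]  # 4 motor torques
print(evaluate(StubEnv(), random_policy))
```

Sample efficiency is then simply this same curve plotted against the number of environment steps consumed during training.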
Contributors: McDonald, Dax (Author) / Ben Amor, Heni (Thesis director) / Yang, Yezhou (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)
Created: 2018-12
Description
Understanding the skills required to work in an industry is a difficult task with many potential uses. By predicting a person's industry from their skills, professional social networks could improve search with automated tagging, advertisers could target more carefully, and students could better find a career path that fits their skillset. The aim of this project is to apply deep learning to the world of professional networking. Deep learning is a type of machine learning that has recently been making breakthroughs in the analysis of complex datasets that previously were not of much use. Initially the goal was to apply deep learning to the skills-to-company relationship, but a lack of quality data required a change to the skills-to-industry relationship. To accomplish the new goal, a database of LinkedIn profiles from various industries was gathered and processed. From this dataset a model was created that takes a list of skills and outputs an industry in which people with those skills work. Such a model has value in the insights it provides, allowing candidates to determine what industry fits a skillset, identify key skills for industries, and locate the industries possible candidates may best fit. Various models were trained and tested on the skills-to-industry dataset. The model was able to learn similarities between industries and predict the most likely industries for each profile's skillset.
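The abstract does not detail the final model, so purely as a hedged illustration of the same skills-to-industry task, here is a trivial non-neural baseline: assign the industry whose known skill set has the highest Jaccard overlap with the candidate's skills. The toy skill sets below are invented; the thesis used LinkedIn data at far larger scale.

```python
def jaccard(a, b):
    """Jaccard similarity: |intersection| / |union| of two skill sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def predict_industry(skills, industry_skills):
    """Assign the industry whose known skill set overlaps most with the input."""
    return max(industry_skills, key=lambda ind: jaccard(skills, industry_skills[ind]))

industry_skills = {  # invented toy profiles
    "software": {"python", "java", "git", "sql"},
    "finance": {"excel", "accounting", "sql", "valuation"},
}
print(predict_industry({"python", "git"}, industry_skills))  # software
```

A deep model improves on this baseline by learning which skills co-occur and which are interchangeable, rather than relying on exact string overlap.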
Contributors: Andrew, Benjamin (Co-author) / Thiel, Alex (Co-author) / Sodemann, Angela (Thesis director) / Sebold, Brent (Committee member) / Engineering Programs (Contributor) / Barrett, The Honors College (Contributor)
Created: 2017-12
Description
Lyric classification and generation are trending topics in the machine learning community. Long Short-Term Memory networks (LSTMs) are effective tools for classifying and generating text. We explored their effectiveness in the generation and classification of lyrical data and proposed methods of evaluating their accuracy. We found that LSTM networks with dropout layers were effective at lyric classification. We also found that word-embedding LSTM networks were extremely effective at lyric generation.
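For reference, a single step of the standard LSTM cell, whose gating is what makes these networks effective on sequential text, can be written in a few lines of NumPy. The dimensions below are arbitrary, and this is a textbook cell, not the thesis's exact model:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One step of a standard LSTM cell: input (i), forget (f), and
    output (o) gates control what enters, persists in, and leaves the
    cell state c; g is the candidate update."""
    z = W @ x + U @ h + b                      # four gate pre-activations, stacked
    d = h.size
    i, f, o = sigmoid(z[:d]), sigmoid(z[d:2*d]), sigmoid(z[2*d:3*d])
    g = np.tanh(z[3*d:])
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

rng = np.random.default_rng(1)
d_in, d_h = 8, 4                               # invented sizes
W = rng.normal(size=(4 * d_h, d_in))
U = rng.normal(size=(4 * d_h, d_h))
b = np.zeros(4 * d_h)
h, c = np.zeros(d_h), np.zeros(d_h)
for _ in range(5):                             # run the cell over a 5-token sequence
    h, c = lstm_step(rng.normal(size=d_in), h, c, W, U, b)
print(h.shape)  # (4,)
```

Dropout, as used in the classification experiments, would randomly zero components of h between layers during training to reduce overfitting.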
Contributors: Tallapragada, Amit (Author) / Ben Amor, Heni (Thesis director) / Caviedes, Jorge (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created: 2019-05
Description
Currently, conventional Subtitle D landfills are the primary means of disposing of waste in the United States. While this method of waste disposal aims to protect the environment, it does so through liners and caps that effectively freeze the breakdown of waste. Because this method can keep landfills active, and thus a potential groundwater threat, for over a hundred years, I take an in-depth look at the ability of bioreactor landfills to quickly stabilize waste. In this thesis I detail the current state of bioreactor landfill technologies, assessing the pros and cons of anaerobic and aerobic bioreactor technologies. Finally, with an industrial perspective, I conclude that moving to bioreactor landfills as an alternative is not as simple as it may first appear, and that they are a contextually specific solution that must be further refined before replacing current landfills.
Contributors: Whitten, George Avery (Author) / Kavazanjian, Edward (Thesis director) / Allenby, Braden (Committee member) / Houston, Sandra (Committee member) / Civil, Environmental and Sustainable Engineering Programs (Contributor) / Barrett, The Honors College (Contributor)
Created: 2013-05
Description
The artificial neural network is a form of machine learning that is highly effective at recognizing patterns in large, noise-filled datasets. Possessing these attributes uniquely qualifies the neural network as a mathematical basis for adaptability in personal biomedical devices. The purpose of this study was to determine the viability of neural networks in predicting Freezing of Gait (FoG), a symptom of Parkinson's disease in which the patient's legs are suddenly rendered unable to move. More specifically, a class of neural networks known as layered recurrent networks (LRNs) was applied to an open-source FoG experimental dataset donated to the Machine Learning Repository of the University of California, Irvine. The independent variables in this experiment (the subject being tested, the neural network architecture, and the sampling of the majority classes) were each varied and compared against the performance of the neural network in predicting future FoG events. It was determined that single-layered recurrent networks are a viable method of predicting FoG events given the volume of the training data available, though results varied significantly between different patients. For the three patients tested, shank acceleration data was used to train networks with peak precision/recall values of 41.88%/47.12%, 89.05%/29.60%, and 57.19%/27.39%, respectively. These values were obtained for networks optimized using detection theory rather than optimized for desired values of precision and recall. Furthermore, due to the nature of the experiments performed in this study, these values are representative of the lower-bound performance of layered recurrent networks trained to detect gait freezing. As such, these values may be improved through a variety of measures.
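The precision/recall figures above follow directly from confusion counts over predicted versus actual FoG windows. A minimal sketch with invented toy labels:

```python
def precision_recall(y_true, y_pred):
    """Precision = TP/(TP+FP): how many predicted freezes were real.
    Recall = TP/(TP+FN): how many real freezes were predicted."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy window labels: 1 = freezing-of-gait event in this window (invented data).
actual    = [0, 1, 1, 0, 1, 0, 0, 1]
predicted = [0, 1, 0, 1, 1, 0, 0, 0]
print(precision_recall(actual, predicted))  # (0.6666666666666666, 0.5)
```

The trade-off between the two is what the study's detection-theory optimization navigates: a network tuned for higher recall flags more windows and inevitably admits more false positives.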
Contributors: Zia, Jonathan Sargon (Author) / Panchanathan, Sethuraman (Thesis director) / McDaniel, Troy (Committee member) / Adler, Charles (Committee member) / Electrical Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created: 2016-05
Description
Food safety is vital to the well-being of society; therefore, it is important to inspect food products to ensure minimal health risks are present. A crucial phase of food inspection is the identification of foreign particles found in the sample, such as insect body parts. The presence of certain species of insects, especially storage beetles, is a reliable indicator of possible contamination during storage and food processing. However, the current approach to identifying species is visual examination by human analysts; this method is subjective and time-consuming, and confident identification requires extensive experience and training. To aid this inspection process, we have developed, in collaboration with FDA analysts, image-analysis-based machine intelligence that achieves species identification with up to 90% accuracy. The current project is a continuation of this development effort. Here we present an image analysis environment that allows practical deployment of the machine intelligence on computers with limited processing power and memory. Using this environment, users can prepare input sets by selecting images for analysis and inspect those images through the integrated pan, zoom, and color-analysis capabilities. After species analysis, the results panel allows the user to compare the analyzed images with reference images of the proposed species. Future additions to this environment should include a log of previously analyzed images and, eventually, interaction with a central cloud repository of images through a web-based interface. Additional issues to address include standardization of image layout, extension of the feature-extraction algorithm, and use of image classification to build a central search engine for widespread use.
Contributors: Martin, Daniel Luis (Author) / Ahn, Gail-Joon (Thesis director) / Doupé, Adam (Committee member) / Xu, Joshua (Committee member) / Computer Science and Engineering Program (Contributor) / Department of Finance (Contributor) / Barrett, The Honors College (Contributor)
Created: 2016-05
Description

The aim of this project is to understand the basic algorithmic components of the transformer deep learning architecture. At a high level, a transformer is a machine learning model built around a self-attention mechanism, which weighs the significant parts of sequential input data; this is very useful for solving problems in natural language processing and computer vision. Other approaches to these problems have been implemented in the past (i.e., convolutional neural networks and recurrent neural networks), but those architectures suffer from the vanishing gradient problem when an input becomes too long (which essentially means the network loses its memory and halts learning) and have slow training times in general. The transformer architecture's features enable a much better "memory" and a faster training time, which makes it a more suitable architecture for these problems. Most of this project will be spent producing a survey that captures the current state of research on the transformer, along with any background material needed to understand it. First, I will do a keyword search of the most well-cited and up-to-date peer-reviewed publications on transformers to understand them conceptually. Next, I will investigate the programming frameworks required to implement the architecture, and use them to implement a simplified version of the architecture or follow an accessible guide or tutorial. Once the programming aspect of the architecture is understood, I will implement a transformer based on the academic paper "Attention Is All You Need". I will then slightly tweak this model, using my understanding of the architecture, to improve performance. Once finished, the details (i.e., successes, failures, process, and inner workings) of the implementation will be evaluated and reported, as well as the fundamental concepts surveyed.
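The centerpiece of "Attention Is All You Need" is scaled dot-product attention, Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V. A minimal NumPy sketch (token count and dimensions below are chosen arbitrarily for illustration):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V: each output row
    is a weighted average of the value rows, with weights given by
    query-key similarity."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    scores -= scores.max(axis=-1, keepdims=True)    # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V, weights

rng = np.random.default_rng(42)
Q, K, V = (rng.normal(size=(5, 16)) for _ in range(3))  # 5 tokens, d_k = 16 (invented)
out, attn = scaled_dot_product_attention(Q, K, V)
print(out.shape, attn.shape)  # (5, 16) (5, 5)
```

Because every token attends to every other token in a single matrix product, gradients do not have to flow through a long recurrent chain, which is the architectural reason for the better "memory" and faster training described above.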
The motivation behind this project is to explore the rapidly growing area of AI algorithms; the transformer in particular was chosen because it is a major milestone for engineering with AI and software. Since their introduction, transformers have provided a very effective way of solving natural language processing problems, allowing related applications to run quickly while maintaining accuracy. This type of model can also be applied to more cutting-edge natural language processing applications, such as extracting semantic information from a text description and generating an image to satisfy it.

Contributors: Cereghini, Nicola (Author) / Acuna, Ruben (Thesis director) / Bansal, Ajay (Committee member) / Barrett, The Honors College (Contributor) / Software Engineering (Contributor)
Created: 2023-05
Description
Environmentally harmful byproducts of solid waste decomposition, including methane (CH4) emissions, are managed through standardized landfill engineering and gas-capture mechanisms. Yet only a limited number of studies have analyzed the development and composition of the Bacteria and Archaea involved in CH4 production in landfills. The objectives of this research were to compare the microbiomes and bioactivity of CH4-producing communities in contrasting spatial areas of arid landfills and to test a new technology to biostimulate CH4 production (methanogenesis) from solid waste under dynamic environmental conditions controlled in the laboratory. My hypothesis was that the diversity and abundance of methanogenic Archaea in municipal solid waste (MSW), or its leachate, play an important role in CH4 production, partially attributable to the group's wide hydrogen (H2) consumption capabilities. I tested this hypothesis by conducting complementary field observations and laboratory experiments. I describe niches of methanogenic Archaea in MSW leachate across defined areas within a single landfill, while demonstrating functional H2-dependent activity. To alleviate the limited H2 bioavailability encountered in situ, I present feasibility and proof-of-concept studies of biostimulation through the amendment of zero-valent metals (ZVMs). My results demonstrate that older-aged MSW was only minimally biostimulated toward greater CH4 production relative to a control when exposed to iron (Fe0) or manganese (Mn0), owing to clearly discernible differences in soluble carbon, nitrogen, and unidentified fluorophores found in water extracts of the young- and old-aged starting MSW. Acetate and inhibitory H2 partial pressures accumulated in microcosms containing old-aged MSW. In a final experiment, repeated amendments of ZVMs to MSW in a 600-day mesocosm experiment mediated significantly higher CH4 concentrations and yields during the first of three ZVM injections.
The Fe0 and Mn0 treatments at the mesocosm scale also highlighted the accelerated development of seemingly important but elusive Archaea, including Methanobacteriaceae, a methane-producing family found in diverse environments, as well as prokaryotic classes including Candidatus Bathyarchaeota, an uncultured group commonly found in carbon-rich ecosystems, and Clostridia. I identified all three taxa as highly predictive of the time-dependent progression of MSW decomposition. Altogether, my experiments demonstrate the importance of H2 bioavailability for CH4 production and the consistent development of Methanobacteriaceae in productive MSW microbiomes.
Contributors: Reynolds, Mark Christian (Author) / Cadillo-Quiroz, Hinsby (Thesis advisor) / Krajmalnik-Brown, Rosa (Thesis advisor) / Wang, Xuan (Committee member) / Kavazanjian, Edward (Committee member) / Arizona State University (Publisher)
Created: 2022