Matching Items (18)
Description
With the advent of sophisticated computer technology, we increasingly see the use of computational techniques in the study of problems from a variety of disciplines, including the humanities. In a field such as poetry, where classic works are subject to frequent re-analysis over the course of years, decades, or even centuries, there is a certain demand for fresh approaches to familiar tasks, and such breaks from convention may even be necessary for the advancement of the field. Existing quantitative studies of poetry have employed computational techniques in their analyses; however, there remains work to be done with regard to the deployment of deep neural networks on large corpora of poetry to classify portions of the works contained therein based on certain features. While applications of neural networks to social media sites, consumer reviews, and other web-originated data are common within computational linguistics and natural language processing, comparatively little work has been done on the computational analysis of poetry using the same techniques. In this work, I begin to lay out the first steps for the study of poetry using neural networks. Using a convolutional neural network to classify author birth date, I was able not only to extract a non-trivial signal from the data, but also to identify the presence of clustering within by-author model accuracy. While definitive conclusions about the cause of this clustering were not reached, investigation of this clustering reveals immense heterogeneity in the traits of accurately classified authors. Further study may unpack this clustering and reveal key insights about how temporal information is encoded in poetry. The study of poetry using neural networks remains very open but exhibits potential to be an interesting and deep area of work.
Contributors: Goodloe, Oscar Laurence (Author) / Nishimura, Joel (Thesis director) / Broatch, Jennifer (Committee member) / School of Mathematical and Natural Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created: 2019-05
Description
The explosive growth of the Web in the last decade has drastically changed the way billions of people around the globe conduct numerous activities, including creating, sharing, and consuming information. The massive amount of user-generated information encourages companies and service providers to collect users' information and use it to advance their own goals and to provide personalized services to users. However, this information contains private and sensitive details and can lead to a breach of users' privacy. Anonymizing users' information before publishing and using such data is vital in securing their privacy. Because user information takes many forms (e.g., structural, interactions, etc.), different techniques are required to anonymize it. In this thesis, we first discuss different anonymization techniques for various types of user-generated data, i.e., network graphs, web browsing history, and user-item interactions. Our experimental results show the effectiveness of such techniques for data anonymization. Then, we briefly touch on securely and privately sharing information through blockchains.
Contributors: Nou, Alex Sheavin (Author) / Liu, Huan (Thesis director) / Beigi, Ghazaleh (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created: 2019-05
Description
In the field of machine learning, reinforcement learning stands out for its ability to explore approaches to complex, high-dimensional problems that outperform even expert humans. For robotic locomotion tasks, reinforcement learning provides an approach to solving them without the need for unique controllers. In this thesis, two reinforcement learning algorithms, Deep Deterministic Policy Gradient and Group Factor Policy Search, are compared in the bipedal walking environment provided by OpenAI Gym. These algorithms are evaluated on their performance in the environment and their sample efficiency.
Contributors: McDonald, Dax (Author) / Ben Amor, Heni (Thesis director) / Yang, Yezhou (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)
Created: 2018-12
Description
Understanding the skills required to work in an industry is a difficult task with many potential uses. By being able to predict the industry of a person based on their skills, professional social networks could improve search with automated tagging, advertisers could target more carefully, and students could better find a career path that fits their skillset. The aim of this project is to apply deep learning to the world of professional networking. Deep learning is a type of machine learning that has recently been making breakthroughs in the analysis of complex datasets that previously were not of much use. Initially the goal was to apply deep learning to the skills-to-company relationship, but a lack of quality data required a change to the skills-to-industry relationship. To accomplish the new goal, a database of LinkedIn profiles from various industries was gathered and processed. From this dataset a model was created to take a list of skills and output an industry that people with those skills work in. Such a model has value in the insights it provides, allowing users to determine what industry fits a skillset, identify key skills for industries, and locate the industries in which possible candidates may best fit. Various models were trained and tested on a skills-to-industry dataset. The model was able to learn similarities between industries and predict the most likely industries for each profile's skillset.
Contributors: Andrew, Benjamin (Co-author) / Thiel, Alex (Co-author) / Sodemann, Angela (Thesis director) / Sebold, Brent (Committee member) / Engineering Programs (Contributor) / Barrett, The Honors College (Contributor)
Created: 2017-12
Description
Lyric classification and generation are trending topics in the machine learning community. Long Short-Term Memory (LSTM) networks are effective tools for classifying and generating text. We explored their effectiveness in the generation and classification of lyrical data and proposed methods of evaluating their accuracy. We found that LSTM networks with dropout layers were effective at lyric classification. We also found that word-embedding LSTM networks were extremely effective at lyric generation.
Contributors: Tallapragada, Amit (Author) / Ben Amor, Heni (Thesis director) / Caviedes, Jorge (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created: 2019-05
Description
The artificial neural network is a form of machine learning that is highly effective at recognizing patterns in large, noise-filled datasets. Possessing these attributes uniquely qualifies the neural network as a mathematical basis for adaptability in personal biomedical devices. The purpose of this study was to determine the viability of neural networks in predicting Freezing of Gait (FoG), a symptom of Parkinson's disease in which the patient's legs are suddenly rendered unable to move. More specifically, a class of neural networks known as layered recurrent networks (LRNs) was applied to an open-source FoG experimental dataset donated to the Machine Learning Repository of the University of California at Irvine. The independent variables in this experiment (the subject being tested, the neural network architecture, and the sampling of the majority classes) were each varied and compared against the performance of the neural network in predicting future FoG events. It was determined that single-layered recurrent networks are a viable method of predicting FoG events given the volume of the training data available, though results varied significantly between patients. For the three patients tested, shank acceleration data was used to train networks with peak precision/recall values of 41.88%/47.12%, 89.05%/29.60%, and 57.19%/27.39%, respectively. These values were obtained for networks optimized using detection theory rather than optimized for desired values of precision and recall. Furthermore, due to the nature of the experiments performed in this study, these values are representative of the lower-bound performance of layered recurrent networks trained to detect gait freezing. As such, these values may be improved through a variety of measures.
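As a reading aid for the precision/recall figures quoted above, the following is a minimal sketch of how the two metrics are computed from confusion counts. The function name and the counts are hypothetical and are not taken from the thesis; they only illustrate how a detector can have high precision and low recall at the same time.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) as fractions, using 0.0 when undefined."""
    predicted_positive = true_pos + false_pos   # events the model flagged
    actual_positive = true_pos + false_neg      # events that really occurred
    precision = true_pos / predicted_positive if predicted_positive else 0.0
    recall = true_pos / actual_positive if actual_positive else 0.0
    return precision, recall


# Hypothetical counts (not from the thesis): 30 freezing events detected,
# 10 false alarms, 70 freezing events missed.
precision, recall = precision_recall(true_pos=30, false_pos=10, false_neg=70)
print(f"precision={precision:.2%} recall={recall:.2%}")
```

With these illustrative counts, most alarms are correct but most events are missed, which is the same pattern as the second patient's reported values.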
Contributors: Zia, Jonathan Sargon (Author) / Panchanathan, Sethuraman (Thesis director) / McDaniel, Troy (Committee member) / Adler, Charles (Committee member) / Electrical Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created: 2016-05
Description
Food safety is vital to the well-being of society; therefore, it is important to inspect food products to ensure minimal health risks are present. A crucial phase of food inspection is the identification of foreign particles found in the sample, such as insect body parts. The presence of certain species of insects, especially storage beetles, is a reliable indicator of possible contamination during storage and food processing. However, the current approach to identifying species is visual examination by human analysts; this method is rather subjective and time-consuming. Furthermore, confident identification requires extensive experience and training. To aid this inspection process, we have developed, in collaboration with FDA analysts, an image-analysis-based machine intelligence that achieves species identification with up to 90% accuracy. The current project is a continuation of this development effort. Here we present an image analysis environment that allows practical deployment of the machine intelligence on computers with limited processing power and memory. Using this environment, users can prepare input sets by selecting images for analysis and inspect these images through the integrated pan, zoom, and color analysis capabilities. After species analysis, the results panel allows the user to compare the analyzed images with reference images of the proposed species. Further additions to this environment should include a log of previously analyzed images and eventually extend to interaction with a central cloud repository of images through a web-based interface. Additional issues to address include standardization of image layout, extension of the feature-extraction algorithm, and the use of image classification to build a central search engine for widespread usage.
Contributors: Martin, Daniel Luis (Author) / Ahn, Gail-Joon (Thesis director) / Doupé, Adam (Committee member) / Xu, Joshua (Committee member) / Computer Science and Engineering Program (Contributor) / Department of Finance (Contributor) / Barrett, The Honors College (Contributor)
Created: 2016-05
Description
The aim of this project is to understand the basic algorithmic components of the transformer deep learning architecture. At a high level, a transformer is a machine learning model that replaces the recurrence of a recurrent neural network with a self-attention mechanism, which weighs the significant parts of sequential input data; this is very useful for solving problems in natural language processing and computer vision. There are other approaches to solving these problems which have been implemented in the past (e.g., convolutional neural networks and recurrent neural networks), but these architectures introduce the vanishing gradient problem when an input becomes too long (which essentially means the network loses its memory and halts learning) and have a slow training time in general. The transformer architecture’s features enable a much better “memory” and a faster training time, which makes it a better-suited architecture for solving these problems. Most of this project will be spent producing a survey that captures the current state of research on the transformer and any background material needed to understand it. First, I will do a keyword search of the most well-cited and up-to-date peer-reviewed publications on transformers to understand them conceptually. Next, I will investigate any programming frameworks required to implement the architecture. I will use this to implement a simplified version of the architecture or follow an easy-to-use guide or tutorial in implementing the architecture. Once the programming aspect of the architecture is understood, I will then implement a transformer based on the academic paper “Attention Is All You Need”. I will then slightly tweak this model using my understanding of the architecture to improve performance. Once finished, the details (i.e., successes, failures, process, and inner workings) of the implementation will be evaluated and reported, as well as the fundamental concepts surveyed.
The motivation behind this project is to explore the rapidly growing area of AI algorithms, and the transformer algorithm in particular was chosen because it is a major milestone for engineering with AI and software. Since their introduction, transformers have provided a very effective way of solving natural language processing problems, which has allowed related applications to succeed with high speed while maintaining accuracy. This type of model has since been applied to more cutting-edge natural language processing applications, such as extracting semantic information from a text description and generating an image to satisfy it.
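To make the self-attention mechanism mentioned in this abstract concrete, the following is a minimal sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k))V, as defined in “Attention Is All You Need”. It is an illustration only and not the thesis implementation: it is written in plain Python, omits the learned query/key/value projections and multiple heads, and uses the input tokens directly as queries, keys, and values (an assumption made for brevity).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of row vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


# Self-attention on two toy token vectors (hypothetical data): the same
# sequence serves as queries, keys, and values.
tokens = [[1.0, 0.0], [0.0, 1.0]]
attended = scaled_dot_product_attention(tokens, tokens, tokens)
print(attended)
```

Because every position attends to every other position in one step, no hidden state has to be carried across the sequence, which is the property the abstract contrasts with recurrent networks.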

Contributors: Cereghini, Nicola (Author) / Acuna, Ruben (Thesis director) / Bansal, Ajay (Committee member) / Barrett, The Honors College (Contributor) / Software Engineering (Contributor)
Created: 2023-05
Description
With advances in computational power, algorithmic trading has become one of the primary strategies for trading on the stock market. To understand why and how these strategies have been effective, this project examined the complete process of creating tools and applications to analyze and predict stock prices in order to perform low-frequency trading. The project is composed of three main components. The first component is integrating several public resources to acquire, process, and store the financial trading data needed by the other components. Alpha Vantage API, a free open source application, provides an accurate and comprehensive dataset of features for each stock ticker requested. The second component is researching, prototyping, and implementing various trading algorithms in code. We began by focusing on the Mean Reversion algorithm as a proof-of-concept algorithm to develop meaningful trading strategies and identify patterns within our datasets. To augment our market prediction power (“alpha”), we implemented a Long Short-Term Memory recurrent neural network. Neural networks are an incredibly effective but often complex tool used frequently in data science when traditional methods are found lacking. Following the implementation, the last component is to optimize, analyze, compare, and contrast all of the algorithms and identify key features in order to assess the overall effectiveness of each algorithm. We were able to identify conclusively which aspects of each algorithm provided better alpha and to create an entire pipeline that automates this process for live trading implementation. An additional reason for automation is to provide an educational framework so that anyone interested in quantitative finance in the future can leverage this project to gain further insight.
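The abstract does not spell out its mean reversion rule, so the following is only a sketch of one common formulation: compare the latest price with a rolling mean and trade against large deviations, measured as a z-score. The function name, window length, and threshold are hypothetical choices made for illustration and are not the authors' parameters.

```python
from statistics import mean, pstdev


def mean_reversion_signal(prices, window=5, threshold=1.0):
    """Return 'buy', 'sell', or 'hold' from a rolling z-score of the last price."""
    if len(prices) < window:
        return "hold"            # not enough history yet
    recent = prices[-window:]
    mu = mean(recent)
    sigma = pstdev(recent)
    if sigma == 0:
        return "hold"            # flat prices carry no signal
    z = (prices[-1] - mu) / sigma
    if z > threshold:
        return "sell"            # price is unusually high; expect a fall
    if z < -threshold:
        return "buy"             # price is unusually low; expect a rise
    return "hold"


# Hypothetical closing prices, not real market data.
print(mean_reversion_signal([10, 10, 10, 10, 13]))  # sell
print(mean_reversion_signal([10, 10, 10, 10, 7]))   # buy
```

A rule this simple is a baseline against which a learned predictor, such as the LSTM described in the abstract, can be compared.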
Contributors: Yurowkin, Alexander (Co-author) / Kumar, Rohit (Co-author) / Welfert, Bruno (Thesis director) / Li, Baoxin (Committee member) / Economics Program in CLAS (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created: 2019-05
Description
A defense-by-randomization framework is proposed as an effective defense mechanism against different types of adversarial attacks on neural networks. Experiments were conducted by selecting combinations of differently constructed image classification neural networks to observe which combinations, when applied to this framework, were most effective in maximizing classification accuracy. Furthermore, the reasons why particular combinations were more effective than others are explored.
Contributors: Mazboudi, Yassine Ahmad (Author) / Yang, Yezhou (Thesis director) / Ren, Yi (Committee member) / School of Mathematical and Statistical Sciences (Contributor) / Economics Program in CLAS (Contributor) / Barrett, The Honors College (Contributor)
Created: 2019-05