Search Content

Towards Learning Representations in Visual Computing Tasks

Description

The performance of most of the visual computing tasks depends on the quality of the features extracted from the raw data. Insightful feature representation increases the performance of many learning algorithms by exposing the underlying explanatory factors of the output for the unobserved input. A good representation should also handle…

The performance of most of the visual computing tasks depends on the quality of the features extracted from the raw data. Insightful feature representation increases the performance of many learning algorithms by exposing the underlying explanatory factors of the output for the unobserved input. A good representation should also handle anomalies in the data such as missing samples and noisy input caused by the undesired, external factors of variation. It should also reduce the data redundancy. Over the years, many feature extraction processes have been invented to produce good representations of raw images and videos.

The feature extraction processes can be categorized into three groups. The first group contains processes that are hand-crafted for a specific task. Hand-engineering features requires the knowledge of domain experts and manual labor. However, the feature extraction process is interpretable and explainable. Next group contains the latent-feature extraction processes. While the original feature lies in a high-dimensional space, the relevant factors for a task often lie on a lower dimensional manifold. The latent-feature extraction employs hidden variables to expose the underlying data properties that cannot be directly measured from the input. Latent features seek a specific structure such as sparsity or low-rank into the derived representation through sophisticated optimization techniques. The last category is that of deep features. These are obtained by passing raw input data with minimal pre-processing through a deep network. Its parameters are computed by iteratively minimizing a task-based loss.

In this dissertation, I present four pieces of work where I create and learn suitable data representations. The first task employs hand-crafted features to perform clinically-relevant retrieval of diabetic retinopathy images. The second task uses latent features to perform content-adaptive image enhancement. The third task ranks a pair of images based on their aestheticism. The goal of the last task is to capture localized image artifacts in small datasets with patch-level labels. For both these tasks, I propose novel deep architectures and show significant improvement over the previous state-of-art approaches. A suitable combination of feature representations augmented with an appropriate learning approach can increase performance for most visual computing tasks.

ContributorsChandakkar, Parag Shridhar (Author) / Li, Baoxin (Thesis advisor) / Yang, Yezhou (Committee member) / Turaga, Pavan (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)

Created2017

Content Detection in Handwritten Documents

Description

Handwritten documents have gained popularity in various domains including education and business. A key task in analyzing a complex document is to distinguish between various content types such as text, math, graphics, tables and so on. For example, one such aspect could be a region on the document with a…

Handwritten documents have gained popularity in various domains including education and business. A key task in analyzing a complex document is to distinguish between various content types such as text, math, graphics, tables and so on. For example, one such aspect could be a region on the document with a mathematical expression; in this case, the label would be math. This differentiation facilitates the performance of specific recognition tasks depending on the content type. We hypothesize that the recognition accuracy of the subsequent tasks such as textual, math, and shape recognition will increase, further leading to a better analysis of the document.

Content detection on handwritten documents assigns a particular class to a homogeneous portion of the document. To complete this task, a set of handwritten solutions was digitally collected from middle school students located in two different geographical regions in 2017 and 2018. This research discusses the methods to collect, pre-process and detect content type in the collected handwritten documents. A total of 4049 documents were extracted in the form of image, and json format; and were labelled using an object labelling software with tags being text, math, diagram, cross out, table, graph, tick mark, arrow, and doodle. The labelled images were fed to the Tensorflow’s object detection API to learn a neural network model. We show our results from two neural networks models, Faster Region-based Convolutional Neural Network (Faster R-CNN) and Single Shot detection model (SSD).

ContributorsFaizaan, Shaik Mohammed (Author) / VanLehn, Kurt (Thesis advisor) / Cheema, Salman Shaukat (Thesis advisor) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)

Created2018

Pain-Inspired Intrinsic Reward For Deep Reinforcement Learning

Description

Reinforcement learning (RL) is a powerful methodology for teaching autonomous agents complex behaviors and skills. A critical component in most RL algorithms is the reward function -- a mathematical function that provides numerical estimates for desirable and undesirable states. Typically, the reward function must be hand-designed by a human expert…

Reinforcement learning (RL) is a powerful methodology for teaching autonomous agents complex behaviors and skills. A critical component in most RL algorithms is the reward function -- a mathematical function that provides numerical estimates for desirable and undesirable states. Typically, the reward function must be hand-designed by a human expert and, as a result, the scope of a robot's autonomy and ability to safely explore and learn in new and unforeseen environments is constrained by the specifics of the designed reward function. In this thesis, I design and implement a stateful collision anticipation model with powerful predictive capability based upon my research of sequential data modeling and modern recurrent neural networks. I also develop deep reinforcement learning methods whose rewards are generated by self-supervised training and intrinsic signals. The main objective is to work towards the development of resilient robots that can learn to anticipate and avoid damaging interactions by combining visual and proprioceptive cues from internal sensors. The introduced solutions are inspired by pain pathways in humans and animals, because such pathways are known to guide decision-making processes and promote self-preservation. A new "robot dodge ball' benchmark is introduced in order to test the validity of the developed algorithms in dynamic environments.

ContributorsRichardson, Trevor W (Author) / Ben Amor, Heni (Thesis advisor) / Yang, Yezhou (Committee member) / Srivastava, Siddharth (Committee member) / Arizona State University (Publisher)

Created2018

Data Management Behind Machine Learning

Description

This thesis dives into the world of artificial intelligence by exploring the functionality of a single layer artificial neural network through a simple housing price classification example while simultaneously considering its impact from a data management perspective on both the software and hardware level. To begin this study, the universally…

This thesis dives into the world of artificial intelligence by exploring the functionality of a single layer artificial neural network through a simple housing price classification example while simultaneously considering its impact from a data management perspective on both the software and hardware level. To begin this study, the universally accepted model of an artificial neuron is broken down into its key components and then analyzed for functionality by relating back to its biological counterpart. The role of a neuron is then described in the context of a neural network, with equal emphasis placed on how it individually undergoes training and then for an entire network. Using the technique of supervised learning, the neural network is trained with three main factors for housing price classification, including its total number of rooms, bathrooms, and square footage. Once trained with most of the generated data set, it is tested for accuracy by introducing the remainder of the data-set and observing how closely its computed output for each set of inputs compares to the target value. From a programming perspective, the artificial neuron is implemented in C so that it would be more closely tied to the operating system and therefore make the collected profiler data more precise during the program's execution. The program is designed to break down each stage of the neuron's training process into distinct functions. In addition to utilizing more functional code, the struct data type is used as the underlying data structure for this project to not only represent the neuron but for implementing the neuron's training and test data. Once fully trained, the neuron's test results are then graphed to visually depict how well the neuron learned from its sample training set. Finally, the profiler data is analyzed to describe how the program operated from a data management perspective on the software and hardware level.

ContributorsRichards, Nicholas Giovanni (Author) / Miller, Phillip (Thesis director) / Meuth, Ryan (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created2018-05

Exploring Computational Thinking in 9-12 Education: Developing a Computer Science Curriculum for Bioscience High School

Description

Bioscience High School, a small magnet high school located in Downtown Phoenix and a STEAM (Science, Technology, Engineering, Arts, Math) focused school, has been pushing to establish a computer science curriculum for all of their students from freshman to senior year. The school's Mision (Mission and Vision) is to: "..provide…

Bioscience High School, a small magnet high school located in Downtown Phoenix and a STEAM (Science, Technology, Engineering, Arts, Math) focused school, has been pushing to establish a computer science curriculum for all of their students from freshman to senior year. The school's Mision (Mission and Vision) is to: "..provide a rigorous, collaborative, and relevant academic program emphasizing an innovative, problem-based curriculum that develops literacy in the sciences, mathematics, and the arts, thus cultivating critical thinkers, creative problem-solvers, and compassionate citizens, who are able to thrive in our increasingly complex and technological communities." Computational thinking is an important part in developing a future problem solver Bioscience High School is looking to produce. Bioscience High School is unique in the fact that every student has a computer available for him or her to use. Therefore, it makes complete sense for the school to add computer science to their curriculum because one of the school's goals is to be able to utilize their resources to their full potential. However, the school's attempt at computer science integration falls short due to the lack of expertise amongst the math and science teachers. The lack of training and support has postponed the development of the program and they are desperately in need of someone with expertise in the field to help reboot the program. As a result, I've decided to create a course that is focused on teaching students the concepts of computational thinking and its application through Scratch and Arduino programming.

ContributorsLiu, Deming (Author) / Meuth, Ryan (Thesis director) / Nakamura, Mutsumi (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created2016-05

Comparative Analysis in Acquisition of Coding Skills

Description

Students learn in various ways \u2014 visualization, auditory, memorizing, or making analogies. Traditional lecturing in engineering courses and the learning styles of engineering students are inharmonious causing students to be at a disadvantage based on their learning style (Felder & Silverman, 1988). My study analyzes the traditional approach to learning…

Students learn in various ways \u2014 visualization, auditory, memorizing, or making analogies. Traditional lecturing in engineering courses and the learning styles of engineering students are inharmonious causing students to be at a disadvantage based on their learning style (Felder & Silverman, 1988). My study analyzes the traditional approach to learning coding skills which is unnatural to engineering students with no previous exposure and examining if visual learning enhances introductory computer science education. Visual and text-based learning are evaluated to determine how students learn introductory coding skills and associated problem solving skills. My study was conducted to observe how the two types of learning aid the students in learning how to problem solve as well as how much knowledge can be obtained in a short period of time. The application used for visual learning was Scratch and Repl.it was used for text-based learning. Two exams were made to measure the progress made by each student. The topics covered by the exam were initialization, variable reassignment, output, if statements, if else statements, nested if statements, logical operators, arrays/lists, while loop, type casting, functions, object orientation, and sorting. Analysis of the data collected in the study allow us to observe whether the traditional method of teaching programming or block-based programming is more beneficial and in what topics of introductory computer science concepts.

ContributorsVidaure, Destiny Vanessa (Author) / Meuth, Ryan (Thesis director) / Yang, Yezhou (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created2018-05

Predicting Sneaker Resale Prices using Machine Learning

Description

This thesis dives into the world of machine learning by attempting to create an application that will accurately predict whether or not a sneaker will resell at a profit. To begin this study, I first researched different machine learning algorithms to determine which would be best for this project. After…

This thesis dives into the world of machine learning by attempting to create an application that will accurately predict whether or not a sneaker will resell at a profit. To begin this study, I first researched different machine learning algorithms to determine which would be best for this project. After ultimately deciding on using an artificial neural network, I then moved on to collecting data, using StockX and Twitter. StockX is a platform where individuals can post and resell shoes, while also providing statistics and analytics about each pair of shoes. I used StockX to retrieve data about the actual shoe, which involved retrieving data for the network feature variables: gender, brand, and retail price. Additionally, I also retrieved the data for the average deadstock price for each shoe, which describes what the mean price of new, unworn shoes are selling for on StockX. This data was used with the retail price data to determine whether or not a shoe has been, on average, selling for a profit. I used Twitter’s API to retrieve links to different shoes on StockX along with retrieving the number of favorites and retweets each of those links had. These metrics were used to account for ‘hype’ of the shoe, with shoes traditionally being more profitable the larger the hype surrounding them. After preprocessing the data, I trained the model using a randomized 80% of the data. On average, the model had about a 65-70% accuracy range when tested with the remaining 20% of the data. Once the model was optimized, I saved it and uploaded it to a web application that took in user input for the five feature variables, tested the datapoint using the model, and outputted the confidence in whether or not the shoe would generate a profit.
From a technical perspective, I used Python for the whole project, while also using HTML/CSS for the front-end of the application. As for key packages, I used Keras, an open source neural network library to build the model; data preprocessing was done using sklearn’s various subpackages. All charts and graphs were done using data visualization libraries matplotlib and seaborn. These charts provided insight as to what the final dataset looked like. They showed how the brand distribution is relatively close to what it should be, while the gender distribution was heavily skewed. Future work on this project would involve expanding the dataset, automating the entirety of the data retrieval process, and finally deploying the project on the cloud for users everywhere to use the application.

ContributorsShah, Shail (Author) / Meuth, Ryan (Thesis director) / Nakamura, Mutsumi (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created2019-05

Deep Periodic Networks

Description

In the field of machine learning, reinforcement learning stands out for its ability to explore approaches to complex, high dimensional problems that outperform even expert humans. For robotic locomotion tasks reinforcement learning provides an approach to solving them without the need for unique controllers. In this thesis, two reinforcement learning…

In the field of machine learning, reinforcement learning stands out for its ability to explore approaches to complex, high dimensional problems that outperform even expert humans. For robotic locomotion tasks reinforcement learning provides an approach to solving them without the need for unique controllers. In this thesis, two reinforcement learning algorithms, Deep Deterministic Policy Gradient and Group Factor Policy Search are compared based upon their performance in the bipedal walking environment provided by OpenAI gym. These algorithms are evaluated on their performance in the environment and their sample efficiency.

ContributorsMcDonald, Dax (Author) / Ben Amor, Heni (Thesis director) / Yang, Yezhou (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)

Created2018-12

Utilizing Machine Learning Methods to Model Cryptocurrency

Description

Cryptocurrencies have become one of the most fascinating forms of currency and economics due to their fluctuating values and lack of centralization. This project attempts to use machine learning methods to effectively model in-sample data for Bitcoin and Ethereum using rule induction methods. The dataset is cleaned by removing entries…

Cryptocurrencies have become one of the most fascinating forms of currency and economics due to their fluctuating values and lack of centralization. This project attempts to use machine learning methods to effectively model in-sample data for Bitcoin and Ethereum using rule induction methods. The dataset is cleaned by removing entries with missing data. The new column is created to measure price difference to create a more accurate analysis on the change in price. Eight relevant variables are selected using cross validation: the total number of bitcoins, the total size of the blockchains, the hash rate, mining difficulty, revenue from mining, transaction fees, the cost of transactions and the estimated transaction volume. The in-sample data is modeled using a simple tree fit, first with one variable and then with eight. Using all eight variables, the in-sample model and data have a correlation of 0.6822657. The in-sample model is improved by first applying bootstrap aggregation (also known as bagging) to fit 400 decision trees to the in-sample data using one variable. Then the random forests technique is applied to the data using all eight variables. This results in a correlation between the model and data of 9.9443413. The random forests technique is then applied to an Ethereum dataset, resulting in a correlation of 9.6904798. Finally, an out-of-sample model is created for Bitcoin and Ethereum using random forests, with a benchmark correlation of 0.03 for financial data. The correlation between the training model and the testing data for Bitcoin was 0.06957639, while for Ethereum the correlation was -0.171125. In conclusion, it is confirmed that cryptocurrencies can have accurate in-sample models by applying the random forests method to a dataset. However, out-of-sample modeling is more difficult, but in some cases better than typical forms of financial data. It should also be noted that cryptocurrency data has similar properties to other related financial datasets, realizing future potential for system modeling for cryptocurrency within the financial world.

ContributorsBrowning, Jacob Christian (Author) / Meuth, Ryan (Thesis director) / Jones, Donald (Committee member) / McCulloch, Robert (Committee member) / Computer Science and Engineering Program (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Barrett, The Honors College (Contributor)

Created2018-05

Machine Learning Enabled Analytics for Health-Related Demographics: a Case Study Identifying Important Factors in Cardiac Disease

Description

Machine learning for analytics has exponentially increased in the past few years due to its ability to identify hidden insights in data. It also has a plethora of applications in healthcare ranging from improving image recognition in CT scans to extracting semantic meaning from thousands of medical form PDFs. Currently…

Machine learning for analytics has exponentially increased in the past few years due to its ability to identify hidden insights in data. It also has a plethora of applications in healthcare ranging from improving image recognition in CT scans to extracting semantic meaning from thousands of medical form PDFs. Currently in the BioElectrical Systems and Technology Lab, there is a biosensor in development that retrieves and analyzes data manually. In a proof of concept, this project uses the neural network architecture to automatically parse and classify a cardiac disease data set as well as explore health related factors impacting cardiac disease in patients of all ages.

ContributorsMurella, Akhila Sainagaki (Author) / Blain-Christen, Jennifer (Thesis director) / Meuth, Ryan (Committee member) / Computer Science and Engineering Program (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / School of the Arts, Media and Engineering (Contributor) / Barrett, The Honors College (Contributor)

Created2018-05

Filtering by