Matching Items (20)
Filtering by

Clear all filters

156084-Thumbnail Image.png
Description
The performance of most of the visual computing tasks depends on the quality of the features extracted from the raw data. Insightful feature representation increases the performance of many learning algorithms by exposing the underlying explanatory factors of the output for the unobserved input. A good representation should also handle

The performance of most of the visual computing tasks depends on the quality of the features extracted from the raw data. Insightful feature representation increases the performance of many learning algorithms by exposing the underlying explanatory factors of the output for the unobserved input. A good representation should also handle anomalies in the data such as missing samples and noisy input caused by the undesired, external factors of variation. It should also reduce the data redundancy. Over the years, many feature extraction processes have been invented to produce good representations of raw images and videos.

The feature extraction processes can be categorized into three groups. The first group contains processes that are hand-crafted for a specific task. Hand-engineering features requires the knowledge of domain experts and manual labor. However, the feature extraction process is interpretable and explainable. Next group contains the latent-feature extraction processes. While the original feature lies in a high-dimensional space, the relevant factors for a task often lie on a lower dimensional manifold. The latent-feature extraction employs hidden variables to expose the underlying data properties that cannot be directly measured from the input. Latent features seek a specific structure such as sparsity or low-rank into the derived representation through sophisticated optimization techniques. The last category is that of deep features. These are obtained by passing raw input data with minimal pre-processing through a deep network. Its parameters are computed by iteratively minimizing a task-based loss.

In this dissertation, I present four pieces of work where I create and learn suitable data representations. The first task employs hand-crafted features to perform clinically-relevant retrieval of diabetic retinopathy images. The second task uses latent features to perform content-adaptive image enhancement. The third task ranks a pair of images based on their aestheticism. The goal of the last task is to capture localized image artifacts in small datasets with patch-level labels. For both these tasks, I propose novel deep architectures and show significant improvement over the previous state-of-art approaches. A suitable combination of feature representations augmented with an appropriate learning approach can increase performance for most visual computing tasks.
ContributorsChandakkar, Parag Shridhar (Author) / Li, Baoxin (Thesis advisor) / Yang, Yezhou (Committee member) / Turaga, Pavan (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)
Created2017
156611-Thumbnail Image.png
Description
Handwritten documents have gained popularity in various domains including education and business. A key task in analyzing a complex document is to distinguish between various content types such as text, math, graphics, tables and so on. For example, one such aspect could be a region on the document with a

Handwritten documents have gained popularity in various domains including education and business. A key task in analyzing a complex document is to distinguish between various content types such as text, math, graphics, tables and so on. For example, one such aspect could be a region on the document with a mathematical expression; in this case, the label would be math. This differentiation facilitates the performance of specific recognition tasks depending on the content type. We hypothesize that the recognition accuracy of the subsequent tasks such as textual, math, and shape recognition will increase, further leading to a better analysis of the document.

Content detection on handwritten documents assigns a particular class to a homogeneous portion of the document. To complete this task, a set of handwritten solutions was digitally collected from middle school students located in two different geographical regions in 2017 and 2018. This research discusses the methods to collect, pre-process and detect content type in the collected handwritten documents. A total of 4049 documents were extracted in the form of image, and json format; and were labelled using an object labelling software with tags being text, math, diagram, cross out, table, graph, tick mark, arrow, and doodle. The labelled images were fed to the Tensorflow’s object detection API to learn a neural network model. We show our results from two neural networks models, Faster Region-based Convolutional Neural Network (Faster R-CNN) and Single Shot detection model (SSD).
ContributorsFaizaan, Shaik Mohammed (Author) / VanLehn, Kurt (Thesis advisor) / Cheema, Salman Shaukat (Thesis advisor) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created2018
156771-Thumbnail Image.png
Description
Reinforcement learning (RL) is a powerful methodology for teaching autonomous agents complex behaviors and skills. A critical component in most RL algorithms is the reward function -- a mathematical function that provides numerical estimates for desirable and undesirable states. Typically, the reward function must be hand-designed by a human expert

Reinforcement learning (RL) is a powerful methodology for teaching autonomous agents complex behaviors and skills. A critical component in most RL algorithms is the reward function -- a mathematical function that provides numerical estimates for desirable and undesirable states. Typically, the reward function must be hand-designed by a human expert and, as a result, the scope of a robot's autonomy and ability to safely explore and learn in new and unforeseen environments is constrained by the specifics of the designed reward function. In this thesis, I design and implement a stateful collision anticipation model with powerful predictive capability based upon my research of sequential data modeling and modern recurrent neural networks. I also develop deep reinforcement learning methods whose rewards are generated by self-supervised training and intrinsic signals. The main objective is to work towards the development of resilient robots that can learn to anticipate and avoid damaging interactions by combining visual and proprioceptive cues from internal sensors. The introduced solutions are inspired by pain pathways in humans and animals, because such pathways are known to guide decision-making processes and promote self-preservation. A new "robot dodge ball' benchmark is introduced in order to test the validity of the developed algorithms in dynamic environments.
ContributorsRichardson, Trevor W (Author) / Ben Amor, Heni (Thesis advisor) / Yang, Yezhou (Committee member) / Srivastava, Siddharth (Committee member) / Arizona State University (Publisher)
Created2018
Description
In the field of machine learning, reinforcement learning stands out for its ability to explore approaches to complex, high dimensional problems that outperform even expert humans. For robotic locomotion tasks reinforcement learning provides an approach to solving them without the need for unique controllers. In this thesis, two reinforcement learning

In the field of machine learning, reinforcement learning stands out for its ability to explore approaches to complex, high dimensional problems that outperform even expert humans. For robotic locomotion tasks reinforcement learning provides an approach to solving them without the need for unique controllers. In this thesis, two reinforcement learning algorithms, Deep Deterministic Policy Gradient and Group Factor Policy Search are compared based upon their performance in the bipedal walking environment provided by OpenAI gym. These algorithms are evaluated on their performance in the environment and their sample efficiency.
ContributorsMcDonald, Dax (Author) / Ben Amor, Heni (Thesis director) / Yang, Yezhou (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)
Created2018-12
Description
Graph matching is a fundamental but notoriously difficult problem due to its NP-hard nature, and serves as a cornerstone for a series of applications in machine learning and computer vision, such as image matching, dynamic routing, drug design, to name a few. Although there has been massive previous investigation on

Graph matching is a fundamental but notoriously difficult problem due to its NP-hard nature, and serves as a cornerstone for a series of applications in machine learning and computer vision, such as image matching, dynamic routing, drug design, to name a few. Although there has been massive previous investigation on high-performance graph matching solvers, it still remains a challenging task to tackle the matching problem under real-world scenarios with severe graph uncertainty (e.g., noise, outlier, misleading or ambiguous link).In this dissertation, a main focus is to investigate the essence and propose solutions to graph matching with higher reliability under such uncertainty. To this end, the proposed research was conducted taking into account three perspectives related to reliable graph matching: modeling, optimization and learning. For modeling, graph matching is extended from typical quadratic assignment problem to a more generic mathematical model by introducing a specific family of separable function, achieving higher capacity and reliability. In terms of optimization, a novel high gradient-efficient determinant-based regularization technique is proposed in this research, showing high robustness against outliers. Then learning paradigm for graph matching under intrinsic combinatorial characteristics is explored. First, a study is conducted on the way of filling the gap between discrete problem and its continuous approximation under a deep learning framework. Then this dissertation continues to investigate the necessity of more reliable latent topology of graphs for matching, and propose an effective and flexible framework to obtain it. Coherent findings in this dissertation include theoretical study and several novel algorithms, with rich experiments demonstrating the effectiveness.
ContributorsYu, Tianshu (Author) / Li, Baoxin (Thesis advisor) / Wang, Yalin (Committee member) / Yang, Yezhou (Committee member) / Yang, Yingzhen (Committee member) / Arizona State University (Publisher)
Created2021
168275-Thumbnail Image.png
Description
Graph matching is a fundamental but notoriously difficult problem due to its NP-hard nature, and serves as a cornerstone for a series of applications in machine learning and computer vision, such as image matching, dynamic routing, drug design, to name a few. Although there has been massive previous investigation on

Graph matching is a fundamental but notoriously difficult problem due to its NP-hard nature, and serves as a cornerstone for a series of applications in machine learning and computer vision, such as image matching, dynamic routing, drug design, to name a few. Although there has been massive previous investigation on high-performance graph matching solvers, it still remains a challenging task to tackle the matching problem under real-world scenarios with severe graph uncertainty (e.g., noise, outlier, misleading or ambiguous link).In this dissertation, a main focus is to investigate the essence and propose solutions to graph matching with higher reliability under such uncertainty. To this end, the proposed research was conducted taking into account three perspectives related to reliable graph matching: modeling, optimization and learning. For modeling, graph matching is extended from typical quadratic assignment problem to a more generic mathematical model by introducing a specific family of separable function, achieving higher capacity and reliability. In terms of optimization, a novel high gradient-efficient determinant-based regularization technique is proposed in this research, showing high robustness against outliers. Then learning paradigm for graph matching under intrinsic combinatorial characteristics is explored. First, a study is conducted on the way of filling the gap between discrete problem and its continuous approximation under a deep learning framework. Then this dissertation continues to investigate the necessity of more reliable latent topology of graphs for matching, and propose an effective and flexible framework to obtain it. Coherent findings in this dissertation include theoretical study and several novel algorithms, with rich experiments demonstrating the effectiveness.
ContributorsYu, Tianshu (Author) / Li, Baoxin (Thesis advisor) / Wang, Yalin (Committee member) / Yang, Yezhou (Committee member) / Yang, Yingzhen (Committee member) / Arizona State University (Publisher)
Created2021
Description

Machine learning has a near infinite number of applications, of which the potential has yet to have been fully harnessed and realized. This thesis will outline two departments that machine learning can be utilized in, and demonstrate the execution of one methodology in each department. The first department that will

Machine learning has a near infinite number of applications, of which the potential has yet to have been fully harnessed and realized. This thesis will outline two departments that machine learning can be utilized in, and demonstrate the execution of one methodology in each department. The first department that will be described is self-play in video games, where a neural model will be researched and described that will teach a computer to complete a level of Super Mario World (1990) on its own. The neural model in question was inspired by the academic paper “Evolving Neural Networks through Augmenting Topologies”, which was written by Kenneth O. Stanley and Risto Miikkulainen of University of Texas at Austin. The model that will actually be described is from YouTuber SethBling of the California Institute of Technology. The second department that will be described is cybersecurity, where an algorithm is described from the academic paper “Process Based Volatile Memory Forensics for Ransomware Detection”, written by Asad Arfeen, Muhammad Asim Khan, Obad Zafar, and Usama Ahsan. This algorithm utilizes Python and the Volatility framework to detect malicious software in an infected system.

ContributorsBallecer, Joshua (Author) / Yang, Yezhou (Thesis director) / Luo, Yiran (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)
Created2023-05
132368-Thumbnail Image.png
Description
A defense-by-randomization framework is proposed as an effective defense mechanism against different types of adversarial attacks on neural networks. Experiments were conducted by selecting a combination of differently constructed image classification neural networks to observe which combinations applied to this framework were most effective in maximizing classification accuracy. Furthermore, the

A defense-by-randomization framework is proposed as an effective defense mechanism against different types of adversarial attacks on neural networks. Experiments were conducted by selecting a combination of differently constructed image classification neural networks to observe which combinations applied to this framework were most effective in maximizing classification accuracy. Furthermore, the reasons why particular combinations were more effective than others is explored.
ContributorsMazboudi, Yassine Ahmad (Author) / Yang, Yezhou (Thesis director) / Ren, Yi (Committee member) / School of Mathematical and Statistical Sciences (Contributor) / Economics Program in CLAS (Contributor) / Barrett, The Honors College (Contributor)
Created2019-05
Description
Propaganda bots are malicious bots on Twitter that spread divisive opinions and support political accounts. This project is based on detecting propaganda bots on Twitter using machine learning. Once I began to observe patterns within propaganda followers on Twitter, I determined that I could train algorithms to detect

Propaganda bots are malicious bots on Twitter that spread divisive opinions and support political accounts. This project is based on detecting propaganda bots on Twitter using machine learning. Once I began to observe patterns within propaganda followers on Twitter, I determined that I could train algorithms to detect these bots. The paper focuses on my development and process of training classifiers and using them to create a user-facing server that performs prediction functions automatically. The learning goals of this project were detailed, the focus of which was to learn some form of machine learning architecture. I needed to learn some aspect of large data handling, as well as being able to maintain these datasets for training use. I also needed to develop a server that would execute these functionalities on command. I wanted to be able to design a full-stack system that allowed me to create every aspect of a user-facing server that can execute predictions using the classifiers that I design.
Throughout this project, I decided on a number of learning goals to consider it a success. I needed to learn how to use the supporting libraries that would help me to design this system. I also learned how to use the Twitter API, as well as create the infrastructure behind it that would allow me to collect large amounts of data for machine learning. I needed to become familiar with common machine learning libraries in Python in order to create the necessary algorithms and pipelines to make predictions based on Twitter data.
This paper details the steps and decisions needed to determine how to collect this data and apply it to machine learning algorithms. I determined how to create labelled data using pre-existing Botometer ratings, and the levels of confidence I needed to label data for training. I use the scikit-learn library to create these algorithms to best detect these bots. I used a number of pre-processing routines to refine the classifiers’ precision, including natural language processing and data analysis techniques. I eventually move to remotely-hosted versions of the system on Amazon web instances to collect larger amounts of data and train more advanced classifiers. This leads to the details of my final implementation of a user-facing server, hosted on AWS and interfacing over Gmail’s IMAP server.
The current and future development of this system is laid out. This includes more advanced classifiers, better data analysis, conversions to third party Twitter data collection systems, and user features. I detail what it is I have learned from this exercise, and what it is I hope to continue working on.
ContributorsPeterson, Austin (Author) / Yang, Yezhou (Thesis director) / Sadasivam, Aadhavan (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created2019-05
165084-Thumbnail Image.png
Description
This project aspires to develop an AI capable of playing on a variety of maps in a Risk-like board game. While AI has been successfully applied to many other board games, such as Chess and Go, most research is confined to a single board and is inflexible to topological changes.

This project aspires to develop an AI capable of playing on a variety of maps in a Risk-like board game. While AI has been successfully applied to many other board games, such as Chess and Go, most research is confined to a single board and is inflexible to topological changes. Further, almost all of these games are played on a rectangular grid. Contrarily, this project develops an AI player, referred to as GG-net, to play the online strategy game Warzone, which is based on the classic board game Risk. Warzone is played on a wide variety of irregularly shaped maps. Prior research has struggled to create an effective AI for Risk-like games due to the immense branching factor. The most successful attempts tended to rely on manually restricting the set of actions the AI considered while also engineering useful features for the AI to consider. GG-net uses no human knowledge, but rather a genetic algorithm combined with a graph neural network. Together, these methods allow GG-net to perform competitively across a multitude of maps. GG-net outperformed the built-in rule-based AI by 413 Elo (representing an 80.7% chance of winning) and an approach based on AlphaZero using graph neural networks by 304 Elo (representing a 74.2% chance of winning). This same advantage holds across both seen and unseen maps. GG-net appears to be a strong opponent on both small and medium maps, however, on large maps with hundreds of territories, inefficiencies in GG-net become more significant and GG-net struggles against the rule-based approach. Overall, GG-net was able to successfully learn the game and generalize across maps of a similar size, albeit further work is required for GG-net to become more successful on large maps.
ContributorsBauer, Andrew (Author) / Yang, Yezhou (Thesis director) / Harrison, Blake (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor) / School of Mathematical and Statistical Sciences (Contributor)
Created2022-05