Matching Items (41)
Filtering by

Clear all filters

156044-Thumbnail Image.png
Description
In a collaborative environment where multiple robots and human beings are expected

to collaborate to perform a task, it becomes essential for a robot to be aware of multiple

agents working in its work environment. A robot must also learn to adapt to

different agents in the workspace and conduct its interaction based

In a collaborative environment where multiple robots and human beings are expected

to collaborate to perform a task, it becomes essential for a robot to be aware of multiple

agents working in its work environment. A robot must also learn to adapt to

different agents in the workspace and conduct its interaction based on the presence

of these agents. A theoretical framework was introduced which performs interaction

learning from demonstrations in a two-agent work environment, and it is called

Interaction Primitives.

This document is an in-depth description of the new state of the art Python

Framework for Interaction Primitives between two agents in a single as well as multiple

task work environment and extension of the original framework in a work environment

with multiple agents doing a single task. The original theory of Interaction

Primitives has been extended to create a framework which will capture correlation

between more than two agents while performing a single task. The new state of the

art Python framework is an intuitive, generic, easy to install and easy to use python

library which can be applied to use the Interaction Primitives framework in a work

environment. This library was tested in simulated environments and controlled laboratory

environment. The results and benchmarks of this library are available in the

related sections of this document.
ContributorsKumar, Ashish, M.S (Author) / Amor, Hani Ben (Thesis advisor) / Zhang, Yu (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created2017
156084-Thumbnail Image.png
Description
The performance of most of the visual computing tasks depends on the quality of the features extracted from the raw data. Insightful feature representation increases the performance of many learning algorithms by exposing the underlying explanatory factors of the output for the unobserved input. A good representation should also handle

The performance of most of the visual computing tasks depends on the quality of the features extracted from the raw data. Insightful feature representation increases the performance of many learning algorithms by exposing the underlying explanatory factors of the output for the unobserved input. A good representation should also handle anomalies in the data such as missing samples and noisy input caused by the undesired, external factors of variation. It should also reduce the data redundancy. Over the years, many feature extraction processes have been invented to produce good representations of raw images and videos.

The feature extraction processes can be categorized into three groups. The first group contains processes that are hand-crafted for a specific task. Hand-engineering features requires the knowledge of domain experts and manual labor. However, the feature extraction process is interpretable and explainable. Next group contains the latent-feature extraction processes. While the original feature lies in a high-dimensional space, the relevant factors for a task often lie on a lower dimensional manifold. The latent-feature extraction employs hidden variables to expose the underlying data properties that cannot be directly measured from the input. Latent features seek a specific structure such as sparsity or low-rank into the derived representation through sophisticated optimization techniques. The last category is that of deep features. These are obtained by passing raw input data with minimal pre-processing through a deep network. Its parameters are computed by iteratively minimizing a task-based loss.

In this dissertation, I present four pieces of work where I create and learn suitable data representations. The first task employs hand-crafted features to perform clinically-relevant retrieval of diabetic retinopathy images. The second task uses latent features to perform content-adaptive image enhancement. The third task ranks a pair of images based on their aestheticism. The goal of the last task is to capture localized image artifacts in small datasets with patch-level labels. For both these tasks, I propose novel deep architectures and show significant improvement over the previous state-of-art approaches. A suitable combination of feature representations augmented with an appropriate learning approach can increase performance for most visual computing tasks.
ContributorsChandakkar, Parag Shridhar (Author) / Li, Baoxin (Thesis advisor) / Yang, Yezhou (Committee member) / Turaga, Pavan (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)
Created2017
156611-Thumbnail Image.png
Description
Handwritten documents have gained popularity in various domains including education and business. A key task in analyzing a complex document is to distinguish between various content types such as text, math, graphics, tables and so on. For example, one such aspect could be a region on the document with a

Handwritten documents have gained popularity in various domains including education and business. A key task in analyzing a complex document is to distinguish between various content types such as text, math, graphics, tables and so on. For example, one such aspect could be a region on the document with a mathematical expression; in this case, the label would be math. This differentiation facilitates the performance of specific recognition tasks depending on the content type. We hypothesize that the recognition accuracy of the subsequent tasks such as textual, math, and shape recognition will increase, further leading to a better analysis of the document.

Content detection on handwritten documents assigns a particular class to a homogeneous portion of the document. To complete this task, a set of handwritten solutions was digitally collected from middle school students located in two different geographical regions in 2017 and 2018. This research discusses the methods to collect, pre-process and detect content type in the collected handwritten documents. A total of 4049 documents were extracted in the form of image, and json format; and were labelled using an object labelling software with tags being text, math, diagram, cross out, table, graph, tick mark, arrow, and doodle. The labelled images were fed to the Tensorflow’s object detection API to learn a neural network model. We show our results from two neural networks models, Faster Region-based Convolutional Neural Network (Faster R-CNN) and Single Shot detection model (SSD).
ContributorsFaizaan, Shaik Mohammed (Author) / VanLehn, Kurt (Thesis advisor) / Cheema, Salman Shaukat (Thesis advisor) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created2018
156771-Thumbnail Image.png
Description
Reinforcement learning (RL) is a powerful methodology for teaching autonomous agents complex behaviors and skills. A critical component in most RL algorithms is the reward function -- a mathematical function that provides numerical estimates for desirable and undesirable states. Typically, the reward function must be hand-designed by a human expert

Reinforcement learning (RL) is a powerful methodology for teaching autonomous agents complex behaviors and skills. A critical component in most RL algorithms is the reward function -- a mathematical function that provides numerical estimates for desirable and undesirable states. Typically, the reward function must be hand-designed by a human expert and, as a result, the scope of a robot's autonomy and ability to safely explore and learn in new and unforeseen environments is constrained by the specifics of the designed reward function. In this thesis, I design and implement a stateful collision anticipation model with powerful predictive capability based upon my research of sequential data modeling and modern recurrent neural networks. I also develop deep reinforcement learning methods whose rewards are generated by self-supervised training and intrinsic signals. The main objective is to work towards the development of resilient robots that can learn to anticipate and avoid damaging interactions by combining visual and proprioceptive cues from internal sensors. The introduced solutions are inspired by pain pathways in humans and animals, because such pathways are known to guide decision-making processes and promote self-preservation. A new "robot dodge ball' benchmark is introduced in order to test the validity of the developed algorithms in dynamic environments.
ContributorsRichardson, Trevor W (Author) / Ben Amor, Heni (Thesis advisor) / Yang, Yezhou (Committee member) / Srivastava, Siddharth (Committee member) / Arizona State University (Publisher)
Created2018
133397-Thumbnail Image.png
Description
Students learn in various ways \u2014 visualization, auditory, memorizing, or making analogies. Traditional lecturing in engineering courses and the learning styles of engineering students are inharmonious causing students to be at a disadvantage based on their learning style (Felder & Silverman, 1988). My study analyzes the traditional approach to learning

Students learn in various ways \u2014 visualization, auditory, memorizing, or making analogies. Traditional lecturing in engineering courses and the learning styles of engineering students are inharmonious causing students to be at a disadvantage based on their learning style (Felder & Silverman, 1988). My study analyzes the traditional approach to learning coding skills which is unnatural to engineering students with no previous exposure and examining if visual learning enhances introductory computer science education. Visual and text-based learning are evaluated to determine how students learn introductory coding skills and associated problem solving skills. My study was conducted to observe how the two types of learning aid the students in learning how to problem solve as well as how much knowledge can be obtained in a short period of time. The application used for visual learning was Scratch and Repl.it was used for text-based learning. Two exams were made to measure the progress made by each student. The topics covered by the exam were initialization, variable reassignment, output, if statements, if else statements, nested if statements, logical operators, arrays/lists, while loop, type casting, functions, object orientation, and sorting. Analysis of the data collected in the study allow us to observe whether the traditional method of teaching programming or block-based programming is more beneficial and in what topics of introductory computer science concepts.
ContributorsVidaure, Destiny Vanessa (Author) / Meuth, Ryan (Thesis director) / Yang, Yezhou (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created2018-05
Description
In the field of machine learning, reinforcement learning stands out for its ability to explore approaches to complex, high dimensional problems that outperform even expert humans. For robotic locomotion tasks reinforcement learning provides an approach to solving them without the need for unique controllers. In this thesis, two reinforcement learning

In the field of machine learning, reinforcement learning stands out for its ability to explore approaches to complex, high dimensional problems that outperform even expert humans. For robotic locomotion tasks reinforcement learning provides an approach to solving them without the need for unique controllers. In this thesis, two reinforcement learning algorithms, Deep Deterministic Policy Gradient and Group Factor Policy Search are compared based upon their performance in the bipedal walking environment provided by OpenAI gym. These algorithms are evaluated on their performance in the environment and their sample efficiency.
ContributorsMcDonald, Dax (Author) / Ben Amor, Heni (Thesis director) / Yang, Yezhou (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)
Created2018-12
155378-Thumbnail Image.png
Description
To ensure system integrity, robots need to proactively avoid any unwanted physical perturbation that may cause damage to the underlying hardware. In this thesis work, we investigate a machine learning approach that allows robots to anticipate impending physical perturbations from perceptual cues. In contrast to other approaches that require knowledge

To ensure system integrity, robots need to proactively avoid any unwanted physical perturbation that may cause damage to the underlying hardware. In this thesis work, we investigate a machine learning approach that allows robots to anticipate impending physical perturbations from perceptual cues. In contrast to other approaches that require knowledge about sources of perturbation to be encoded before deployment, our method is based on experiential learning. Robots learn to associate visual cues with subsequent physical perturbations and contacts. In turn, these extracted visual cues are then used to predict potential future perturbations acting on the robot. To this end, we introduce a novel deep network architecture which combines multiple sub- networks for dealing with robot dynamics and perceptual input from the environment. We present a self-supervised approach for training the system that does not require any labeling of training data. Extensive experiments in a human-robot interaction task show that a robot can learn to predict physical contact by a human interaction partner without any prior information or labeling. Furthermore, the network is able to successfully predict physical contact from either depth stream input or traditional video input or using both modalities as input.
ContributorsSur, Indranil (Author) / Amor, Heni B (Thesis advisor) / Fainekos, Georgios (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created2017
Description
Graph matching is a fundamental but notoriously difficult problem due to its NP-hard nature, and serves as a cornerstone for a series of applications in machine learning and computer vision, such as image matching, dynamic routing, drug design, to name a few. Although there has been massive previous investigation on

Graph matching is a fundamental but notoriously difficult problem due to its NP-hard nature, and serves as a cornerstone for a series of applications in machine learning and computer vision, such as image matching, dynamic routing, drug design, to name a few. Although there has been massive previous investigation on high-performance graph matching solvers, it still remains a challenging task to tackle the matching problem under real-world scenarios with severe graph uncertainty (e.g., noise, outlier, misleading or ambiguous link).In this dissertation, a main focus is to investigate the essence and propose solutions to graph matching with higher reliability under such uncertainty. To this end, the proposed research was conducted taking into account three perspectives related to reliable graph matching: modeling, optimization and learning. For modeling, graph matching is extended from typical quadratic assignment problem to a more generic mathematical model by introducing a specific family of separable function, achieving higher capacity and reliability. In terms of optimization, a novel high gradient-efficient determinant-based regularization technique is proposed in this research, showing high robustness against outliers. Then learning paradigm for graph matching under intrinsic combinatorial characteristics is explored. First, a study is conducted on the way of filling the gap between discrete problem and its continuous approximation under a deep learning framework. Then this dissertation continues to investigate the necessity of more reliable latent topology of graphs for matching, and propose an effective and flexible framework to obtain it. Coherent findings in this dissertation include theoretical study and several novel algorithms, with rich experiments demonstrating the effectiveness.
ContributorsYu, Tianshu (Author) / Li, Baoxin (Thesis advisor) / Wang, Yalin (Committee member) / Yang, Yezhou (Committee member) / Yang, Yingzhen (Committee member) / Arizona State University (Publisher)
Created2021
168275-Thumbnail Image.png
Description
Graph matching is a fundamental but notoriously difficult problem due to its NP-hard nature, and serves as a cornerstone for a series of applications in machine learning and computer vision, such as image matching, dynamic routing, drug design, to name a few. Although there has been massive previous investigation on

Graph matching is a fundamental but notoriously difficult problem due to its NP-hard nature, and serves as a cornerstone for a series of applications in machine learning and computer vision, such as image matching, dynamic routing, drug design, to name a few. Although there has been massive previous investigation on high-performance graph matching solvers, it still remains a challenging task to tackle the matching problem under real-world scenarios with severe graph uncertainty (e.g., noise, outlier, misleading or ambiguous link).In this dissertation, a main focus is to investigate the essence and propose solutions to graph matching with higher reliability under such uncertainty. To this end, the proposed research was conducted taking into account three perspectives related to reliable graph matching: modeling, optimization and learning. For modeling, graph matching is extended from typical quadratic assignment problem to a more generic mathematical model by introducing a specific family of separable function, achieving higher capacity and reliability. In terms of optimization, a novel high gradient-efficient determinant-based regularization technique is proposed in this research, showing high robustness against outliers. Then learning paradigm for graph matching under intrinsic combinatorial characteristics is explored. First, a study is conducted on the way of filling the gap between discrete problem and its continuous approximation under a deep learning framework. Then this dissertation continues to investigate the necessity of more reliable latent topology of graphs for matching, and propose an effective and flexible framework to obtain it. Coherent findings in this dissertation include theoretical study and several novel algorithms, with rich experiments demonstrating the effectiveness.
ContributorsYu, Tianshu (Author) / Li, Baoxin (Thesis advisor) / Wang, Yalin (Committee member) / Yang, Yezhou (Committee member) / Yang, Yingzhen (Committee member) / Arizona State University (Publisher)
Created2021
168739-Thumbnail Image.png
Description
Visual navigation is a useful and important task for a variety of applications. As the preva­lence of robots increase, there is an increasing need for energy-­efficient navigation methods as well. Many aspects of efficient visual navigation algorithms have been implemented in the lit­erature, but there is a lack of work

Visual navigation is a useful and important task for a variety of applications. As the preva­lence of robots increase, there is an increasing need for energy-­efficient navigation methods as well. Many aspects of efficient visual navigation algorithms have been implemented in the lit­erature, but there is a lack of work on evaluation of the efficiency of the image sensors. In this thesis, two methods are evaluated: adaptive image sensor quantization for traditional camera pipelines as well as new event­-based sensors for low­-power computer vision.The first contribution in this thesis is an evaluation of performing varying levels of sen­sor linear and logarithmic quantization with the task of visual simultaneous localization and mapping (SLAM). This unconventional method can provide efficiency benefits with a trade­ off between accuracy of the task and energy-­efficiency. A new sensor quantization method, gradient­-based quantization, is introduced to improve the accuracy of the task. This method only lowers the bit level of parts of the image that are less likely to be important in the SLAM algorithm since lower bit levels signify better energy­-efficiency, but worse task accuracy. The third contribution is an evaluation of the efficiency and accuracy of event­-based camera inten­sity representations for the task of optical flow. The results of performing a learning based optical flow are provided for each of five different reconstruction methods along with ablation studies. Lastly, the challenges of an event feature­-based SLAM system are presented with re­sults demonstrating the necessity for high quality and high­ resolution event data. The work in this thesis provides studies useful for examining trade­offs for an efficient visual navigation system with traditional and event vision sensors. The results of this thesis also provide multiple directions for future work.
ContributorsChristie, Olivia Catherine (Author) / Jayasuriya, Suren (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created2022