Matching Items (18)
Description
Reinforcement learning (RL) is a powerful methodology for teaching autonomous agents complex behaviors and skills. A critical component in most RL algorithms is the reward function -- a mathematical function that provides numerical estimates for desirable and undesirable states. Typically, the reward function must be hand-designed by a human expert and, as a result, the scope of a robot's autonomy and ability to safely explore and learn in new and unforeseen environments is constrained by the specifics of the designed reward function. In this thesis, I design and implement a stateful collision anticipation model with powerful predictive capability based upon my research on sequential data modeling and modern recurrent neural networks. I also develop deep reinforcement learning methods whose rewards are generated by self-supervised training and intrinsic signals. The main objective is to work towards the development of resilient robots that can learn to anticipate and avoid damaging interactions by combining visual and proprioceptive cues from internal sensors. The introduced solutions are inspired by pain pathways in humans and animals, because such pathways are known to guide decision-making processes and promote self-preservation. A new "robot dodge ball" benchmark is introduced in order to test the validity of the developed algorithms in dynamic environments.
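
To make the role of a hand-designed reward function concrete, here is a minimal sketch of the kind of hand-tuned scoring the abstract refers to; the state variables, thresholds, and penalty values are illustrative assumptions and are not taken from the thesis, whose aim is precisely to replace such manual design with self-supervised, intrinsic signals.

```python
def hand_designed_reward(distance_to_goal, min_obstacle_distance, collided):
    """Toy hand-crafted reward: positive for progress, strongly negative for damage.
    Every constant here is a design decision a human expert must choose and tune."""
    if collided:
        return -100.0                      # large penalty for damaging contact
    reward = -0.1 * distance_to_goal       # shaping term: closer to the goal is better
    if min_obstacle_distance < 0.2:        # soft penalty for skirting obstacles (assumed units: metres)
        reward -= 1.0
    return reward
```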
Contributors: Richardson, Trevor W (Author) / Ben Amor, Heni (Thesis advisor) / Yang, Yezhou (Committee member) / Srivastava, Siddharth (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
In this thesis, a new approach to learning-based planning is presented where critical regions of an environment with low probability measure are learned from a given set of motion plans. Critical regions are learned using convolutional neural networks (CNN) to improve sampling processes for motion planning (MP).

In addition to an identification network, a new sampling-based motion planner, Learn and Link, is introduced. This planner leverages critical regions to overcome the limitations of uniform sampling while still maintaining guarantees of correctness inherent to sampling-based algorithms. Learn and Link is evaluated against planners from the Open Motion Planning Library (OMPL) on an extensive suite of challenging navigation planning problems. This work shows that critical areas of an environment are learnable, and can be used by Learn and Link to solve MP problems with far less planning time than existing sampling-based planners.
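
As a rough illustration of how learned critical regions can bias sampling while keeping the correctness guarantees of sampling-based planners, the sketch below mixes samples drawn from predicted critical regions with uniform samples over the workspace; the function names, the box representation of regions, and the mixing ratio `beta` are hypothetical and are not taken from the Learn and Link planner itself.

```python
import random

def hybrid_sample(critical_regions, bounds, beta=0.5):
    """Draw one configuration: with probability beta sample inside a predicted
    critical region, otherwise sample uniformly over the workspace bounds.
    Keeping a nonzero uniform component preserves the usual probabilistic
    completeness argument for sampling-based planners."""
    if critical_regions and random.random() < beta:
        # critical_regions: list of axis-aligned boxes [(lo, hi), ...] per dimension,
        # e.g. obtained by thresholding a CNN's per-cell criticality scores.
        region = random.choice(critical_regions)
        return [random.uniform(lo, hi) for lo, hi in region]
    return [random.uniform(lo, hi) for lo, hi in bounds]

# Example: 2-D workspace with one predicted critical region (a narrow doorway).
workspace = [(0.0, 10.0), (0.0, 10.0)]
doorway = [[(4.8, 5.2), (2.0, 3.0)]]
samples = [hybrid_sample(doorway, workspace, beta=0.5) for _ in range(100)]
```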
Contributors: Molina, Daniel, M.S (Author) / Srivastava, Siddharth (Thesis advisor) / Li, Baoxin (Committee member) / Zhang, Yu (Committee member) / Arizona State University (Publisher)
Created: 2019
Description
Knowledge Representation (KR) is one of the prominent approaches to Artificial Intelligence (AI) that is concerned with representing knowledge in a form that computer systems can utilize to solve complex problems. Answer Set Programming (ASP), based on the stable model semantics, is a widely-used KR framework that facilitates elegant and efficient representations for many problem domains that require complex reasoning.

However, while ASP is effective on deterministic problem domains, it is not suitable for applications involving quantitative uncertainty, for example, those that require probabilistic reasoning. Furthermore, it is hard to incorporate information that can be statistically induced from data into ASP problem modeling.

This dissertation presents the language LP^MLN, a probabilistic extension of the stable model semantics with the concept of weighted rules, inspired by Markov Logic. An LP^MLN program defines a probability distribution over "soft" stable models, which need not satisfy all rules; the more rules a model satisfies, and the larger their weights, the higher its probability. LP^MLN takes advantage of both ASP and Markov Logic in a single framework, allowing representation of problems that require both logical and probabilistic reasoning in an intuitive and elaboration tolerant way.
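
As a schematic restatement of the weight scheme described above (not a quotation from the dissertation), the unnormalized weight of an interpretation under an LP^MLN program is exponential in the total weight of the rules it satisfies, and probabilities are obtained by normalizing over the soft stable models:

```latex
% Pi_I: the weighted rules of Pi that are satisfied by interpretation I;
% I must be a stable model of Pi_I to receive nonzero weight.
\[
  W_\Pi(I) = \exp\!\Big(\sum_{w:R \,\in\, \Pi_I} w\Big),
  \qquad
  P_\Pi(I) = \frac{W_\Pi(I)}{\sum_{J \in \mathrm{SM}[\Pi]} W_\Pi(J)}
\]
```

In the published LP^MLN semantics, hard rules are handled by giving them a special "infinite" weight, which recovers ordinary ASP as a special case.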

This dissertation establishes formal relations between LP^MLN and several other formalisms, discusses inference and weight learning algorithms under LP^MLN, and presents systems implementing the algorithms. LP^MLN systems can also be used to compute other languages that are translatable into LP^MLN.

The advantage of LP^MLN for probabilistic reasoning is illustrated by a probabilistic extension of the action language BC+, called pBC+, defined as a high-level notation of LP^MLN for describing transition systems. Various probabilistic reasoning tasks about transition systems, especially probabilistic diagnosis, can be modeled in pBC+ and computed using LP^MLN systems. pBC+ is further extended with the notion of utility, through a decision-theoretic extension of LP^MLN, and related to Markov Decision Processes (MDPs) in terms of policy optimization problems. pBC+ can be used to represent (PO)MDPs in a succinct and elaboration tolerant way, which enables planning with (PO)MDP algorithms in action domains whose description requires rich KR constructs, such as recursive definitions and indirect effects of actions.
Contributors: Wang, Yi (Author) / Lee, Joohyung (Thesis advisor) / Baral, Chitta (Committee member) / Kambhampati, Subbarao (Committee member) / Natarajan, Sriraam (Committee member) / Srivastava, Siddharth (Committee member) / Arizona State University (Publisher)
Created: 2019
Description
Recent advancements in external memory based neural networks have shown promise in solving tasks that require precise storage and retrieval of past information. Researchers have applied these models to a wide range of tasks that have algorithmic properties but have not applied them to real-world robotic tasks. In this thesis, we present memory-augmented neural networks that synthesize robot navigation policies which (a) encode long-term temporal dependencies, (b) make decisions in partially observed environments, and (c) quantify the uncertainty inherent in the task. We extract information about the temporal structure of a task via imitation learning from human demonstrations and evaluate the performance of the models on control policies for a robot navigation task. Experiments are performed in partially observed environments in both simulation and the real world.
Contributors: Srivatsav, Nambi (Author) / Ben Amor, Heni (Thesis advisor) / Srivastava, Siddharth (Committee member) / Tong, Hanghang (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
This dissertation introduces and examines Soft Curved Reconfigurable Anisotropic Mechanisms (SCRAMs) as a solution to address actuation, manufacturing, and modeling challenges in the field of soft robotics, with the aim of facilitating the broader implementation of soft robots in various industries. SCRAM systems utilize the curved geometry of thin elastic structures to tackle these challenges in soft robots. SCRAM devices can modify their dynamic behavior by incorporating reconfigurable anisotropic stiffness, thereby enabling tailored locomotion patterns for specific tasks. This approach simplifies the actuation of robots, resulting in lighter, more flexible, cost-effective, and safer soft robotic systems. This dissertation demonstrates the potential of SCRAM devices through several case studies. These studies investigate virtual joints and shape change propagation in tubes, as well as anisotropic dynamic behavior in vibrational soft twisted beams, effectively demonstrating interesting locomotion patterns that are achievable using simple actuation mechanisms. The dissertation also addresses modeling and simulation challenges by introducing a reduced-order modeling approach. This approach enables fast and accurate simulations of soft robots and is compatible with existing rigid body simulators. Additionally, this dissertation investigates the prototyping processes of SCRAM devices and offers a comprehensive framework for the development of these devices. Overall, this dissertation demonstrates the potential of SCRAM devices to overcome actuation, modeling, and manufacturing challenges in soft robotics. The innovative concepts and approaches presented have implications for various industries that require cost-effective, adaptable, and safe robotic systems. SCRAM devices pave the way for the widespread application of soft robots in diverse domains.
Contributors: Jiang, Yuhao (Author) / Aukes, Daniel (Thesis advisor) / Berman, Spring (Committee member) / Lee, Hyunglae (Committee member) / Marvi, Hamidreza (Committee member) / Srivastava, Siddharth (Committee member) / Arizona State University (Publisher)
Created: 2023
Description
Facial expression recognition using convolutional neural networks has been actively researched in the last decade due to its many applications in the human-computer interaction domain. Because convolutional neural networks have an exceptional ability to learn, they outperform methods that use handcrafted features. Though state-of-the-art models achieve high accuracy on lab-controlled images, they still struggle with in-the-wild expressions. Wild expressions are captured in real-world settings and are natural rather than posed. Wild databases pose many challenges, such as occlusion and variations in lighting conditions and head poses. In this work, I address these challenges and propose a new model containing a hybrid convolutional neural network with a Fusion Layer. The Fusion Layer combines knowledge obtained from two different domains for enhanced feature extraction from in-the-wild images. I tested my network on two publicly available in-the-wild datasets, namely RAF-DB and AffectNet. Next, I tested my trained model on the CK+ dataset for a cross-database evaluation study. I show that my model achieves results comparable with state-of-the-art methods, and I argue that it performs well on such datasets because it learns features from two different domains rather than a single domain. Last, I present a real-time facial expression recognition system as part of this work, where images are captured in real time using a laptop camera and passed to the model to obtain a facial expression label. This indicates that the proposed model has low processing time and can produce output almost instantly.
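
As a hedged sketch of the late-fusion idea described above (the actual branch architectures, feature dimensions, fusion operation, and class count in the thesis may differ), features from two domain-specific branches can be concatenated and passed through a fusion layer before classification:

```python
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    """Minimal late-fusion sketch: features from two domain-specific branches are
    concatenated and mapped to expression logits. The 512-dim branches, the
    concatenation step, and the 7-class output are illustrative assumptions."""
    def __init__(self, dim_a=512, dim_b=512, num_classes=7):
        super().__init__()
        self.fusion = nn.Linear(dim_a + dim_b, 256)   # fusion layer over both domains
        self.classifier = nn.Linear(256, num_classes)

    def forward(self, feat_a, feat_b):
        fused = torch.relu(self.fusion(torch.cat([feat_a, feat_b], dim=1)))
        return self.classifier(fused)

# Example with random tensors standing in for the two CNN branches' features.
logits = FusionClassifier()(torch.randn(8, 512), torch.randn(8, 512))
```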
Contributors: Chhabra, Sachin (Author) / Li, Baoxin (Thesis advisor) / Venkateswara, Hemanth (Committee member) / Srivastava, Siddharth (Committee member) / Arizona State University (Publisher)
Created: 2019
Description
Recent breakthroughs in Artificial Intelligence (AI) have brought the dream of developing and deploying complex AI systems that can potentially transform everyday life closer to reality than ever before. However, the growing realization that there might soon be people from all walks of life using and working with these systems has also spurred a lot of interest in ensuring that AI systems can efficiently and effectively work and collaborate with their intended users. Chief among the efforts in this direction has been the pursuit of imbuing these agents with the ability to provide intuitive and useful explanations regarding their decisions and actions to end-users. In this dissertation, I will describe various works that I have done in the area of explaining sequential decision-making problems. Furthermore, I will frame the discussions of my work within a broader framework for understanding and analyzing explainable AI (XAI). My works herein tackle many of the core challenges related to explaining automated decisions to users, including (1) techniques to address asymmetry in knowledge between the user and the system, (2) techniques to address asymmetry in inferential capabilities, and (3) techniques to address vocabulary mismatch. The dissertation will also describe the works I have done in generating interpretable behavior and policy summarization. I will conclude this dissertation by using the framework of human-aware explanation as a lens to analyze and understand the current landscape of explainable planning.
Contributors: Sreedharan, Sarath (Author) / Kambhampati, Subbarao (Thesis advisor) / Kim, Been (Committee member) / Smith, David E (Committee member) / Srivastava, Siddharth (Committee member) / Zhang, Yu (Committee member) / Arizona State University (Publisher)
Created: 2022
Description
With improvements in automation and system capabilities, human responsibilities in those advanced systems can get more complicated; greater situational awareness and performance may be asked of human agents in roles such as fail-safe operators. This phenomenon of automation improvements requiring more from humans in the loop is connected to the well-known "paradox of automation". Unfortunately, humans have cognitive limitations that can constrain a person's performance on a task. If one considers human cognitive limitations when designing solutions or policies for human agents, then better results are possible. The focus of this dissertation is on improving human involvement in planning and execution for Sequential Decision Making (SDM) problems. Existing work already considers incorporating humans into planning and execution in SDM, but with limited consideration for cognitive limitations. The work herein focuses on how to improve human involvement through problems in motion planning, planning interfaces, Markov Decision Processes (MDP), and human-team scheduling. This is done by first discussing the human modeling assumptions currently used in the literature and their shortcomings. Then this dissertation tackles a set of problems by considering problem-specific human cognitive limitations --such as those associated with memory and inference-- as well as by drawing on lessons from fields such as cognitive ergonomics.
Contributors: Gopalakrishnan, Sriram (Author) / Kambhampati, Subbarao (Thesis advisor) / Srivastava, Siddharth (Committee member) / Scheutz, Matthias (Committee member) / Zhang, Yu (Tony) (Committee member) / Arizona State University (Publisher)
Created: 2022
Description
High-order Markov chains are useful in a variety of situations. However, these processes are limited in the complexity of the domains they can model: in complex domains, Markov models can require hundreds of gigabytes of RAM, creating the need for a parsimonious model. In this work, I present the Max Markov Chain (MMC), a robust model for estimating high-order datasets using only first-order parameters. High-order Markov chains (HMCs) and Markov approximations (MTDg) struggle to scale to large state spaces due to the exponentially growing number of parameters required to model these domains. MMC can accurately approximate these models using only first-order parameters, provided the domain fulfills the MMC assumption. MMC naturally has better sample efficiency and the desired spatial and computational advantages over HMCs and approximate HMCs. I present evidence demonstrating the effectiveness of MMC in a variety of domains and compare its performance with HMCs and Markov approximations. Human behavior is inherently complex and challenging to model. Due to the high number of parameters required for traditional Markov models, the excessive computing requirements make real-time human simulation computationally expensive and impractical. I argue that in certain situations the behavior of humans follows that of a sparsely connected Markov model, and in this work I focus on that subset of Markov models: those that are sparsely connected.
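
The storage argument can be made concrete with a quick parameter count: a k-th order chain keeps one distribution over next states for every length-k history, so its table grows exponentially in k, while a first-order model (the parameter budget MMC works within, per the abstract) keeps a single |S| x |S| table. The numbers below are purely illustrative.

```python
def kth_order_params(num_states, k):
    """Free parameters of a k-th order Markov chain over num_states states:
    one (num_states - 1)-dimensional distribution per length-k history."""
    return (num_states ** k) * (num_states - 1)

for k in (1, 2, 3, 5):
    print(k, kth_order_params(1000, k))
# With 1000 states, k = 3 already needs on the order of 10^12 parameters,
# while a first-order model needs only about 10^6.
```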
Contributors: Bucklew, Mitchell (Author) / Zhang, Yu T (Thesis advisor) / Srivastava, Siddharth (Committee member) / Kambhampati, Subbarao (Committee member) / Arizona State University (Publisher)
Created: 2022
Description
This research introduces Roblocks, a user-friendly system for learning Artificial Intelligence (AI) planning concepts using mobile manipulator robots. It uses a visual programming interface based on block-structured programming to make AI planning concepts easier to grasp for those who are new to robotics and AI planning. Users accomplish desired tasks by dynamically populating puzzle-shaped blocks encoding the robot's possible actions, allowing them to carry out tasks like navigation, planning, and manipulation by connecting blocks instead of writing code. Roblocks has two levels. In the first level, users re-arrange a jumbled set of plan actions into the correct order so that a given goal can be achieved. In the second level, they select actions of their choice, but at each step only the actions applicable in the current state are made available to them, thereby pruning the vast number of possible actions and suggesting only the truly feasible and relevant ones. Both levels include a simulation in which the user's plan is executed. Moreover, if the user's plan is invalid or fails to achieve the given goal condition, an explanation for the failure is provided in plain English. This makes it easier for everyone (especially non-roboticists) to understand the cause of the failure.
Contributors: Dave, Chirav (Author) / Srivastava, Siddharth (Thesis advisor) / Hsiao, Ihan (Committee member) / Zhang, Yu (Committee member) / Arizona State University (Publisher)
Created: 2019