Matching Items (6)

Description

Robotic systems are outmatched by the abilities of the human hand to perceive and manipulate the world. Human hands are able to physically interact with the world to perceive, learn, and act to accomplish tasks. Limitations of robotic systems to interact with and manipulate the world diminish their usefulness. In order to advance robot end effectors, specifically artificial hands, rich multimodal tactile sensing is needed. In this work, a multi-articulating, anthropomorphic robot testbed was developed for investigating tactile sensory stimuli during finger-object interactions. The artificial finger is controlled by a tendon-driven remote actuation system that allows for modular control of any tendon-driven end effector and capabilities for both speed and strength. The artificial proprioception system enables direct measurement of joint angles and tendon tensions while temperature, vibration, and skin deformation are provided by a multimodal tactile sensor. Next, attention was focused on real-time artificial perception for decision-making. A robotic system needs to perceive its environment in order to make decisions. Specific actions such as “exploratory procedures” can be employed to classify and characterize object features. Prior work on offline perception was extended to develop an anytime predictive model that returns the probability of having touched a specific feature of an object based on minimally processed sensor data. Developing models for anytime classification of features facilitates real-time action-perception loops. Finally, by combining real-time action-perception with reinforcement learning, a policy was learned to complete a functional contour-following task: closing a deformable ziplock bag. The approach relies only on proprioceptive and localized tactile data. A Contextual Multi-Armed Bandit (C-MAB) reinforcement learning algorithm was implemented to maximize cumulative rewards within a finite time period by balancing exploration versus exploitation of the action space. Performance of the C-MAB learner was compared to a benchmark Q-learner that eventually returns the optimal policy. To assess robustness and generalizability, the learned policy was tested on variations of the original contour-following task. The work presented contributes to the full range of tools necessary to advance the abilities of artificial hands with respect to dexterity, perception, decision-making, and learning.
ContributorsHellman, Randall Blake (Author) / Santos, Veronica J (Thesis advisor) / Artemiadis, Panagiotis K (Committee member) / Berman, Spring (Committee member) / Helms Tillery, Stephen I (Committee member) / Fainekos, Georgios (Committee member) / Arizona State University (Publisher)
Created2016
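
To make the decision-making component of this abstract more concrete, the following is a minimal, hypothetical sketch of a contextual multi-armed bandit learner of the kind described above: discrete tactile contexts, a small discrete action set, and epsilon-greedy balancing of exploration versus exploitation. The context encoding, action names, and reward signal are illustrative assumptions, not the thesis implementation.

```python
# Minimal contextual multi-armed bandit (C-MAB) sketch: epsilon-greedy action
# selection per context, with incremental mean-reward estimates.
import random
from collections import defaultdict

class ContextualBandit:
    def __init__(self, actions, epsilon=0.1):
        self.actions = actions
        self.epsilon = epsilon
        # Running mean reward and pull count per (context, action) pair.
        self.value = defaultdict(float)
        self.count = defaultdict(int)

    def select_action(self, context):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.value[(context, a)])

    def update(self, context, action, reward):
        # Incremental update of the mean reward for this context-action pair.
        key = (context, action)
        self.count[key] += 1
        self.value[key] += (reward - self.value[key]) / self.count[key]

# Example (illustrative): context is a coarse tactile state, actions are small
# fingertip motions along or across the contour being followed.
bandit = ContextualBandit(actions=["move_along", "move_left", "move_right"])
context = "edge_centered"
a = bandit.select_action(context)
bandit.update(context, a, reward=1.0)  # e.g., reward for maintaining contact
```
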
Description

A Graph Neural Network (GNN) is a type of neural network architecture that operates on data consisting of objects and their relationships, which are represented by a graph. Within the graph, nodes represent objects and edges represent associations between those objects. The representation of relationships and correlations between data is unique to graph structures. GNNs exploit this feature of graphs by augmenting both forms of data, individual and relational, and have been designed to allow for communication and sharing of data within each neural network layer. These benefits allow each node to have an enriched perspective, or a better understanding, of its neighbouring nodes and its connections to those nodes. The ability of GNNs to efficiently process high-dimensional node data and multi-faceted relationships among nodes gives them advantages over neural network architectures such as Convolutional Neural Networks (CNNs) that do not implicitly handle relational data. These quintessential characteristics of GNN models make them suitable for solving problems in which the correspondences among input data are needed to produce an accurate and precise representation of these data. GNN frameworks may significantly improve existing communication and control techniques for multi-agent tasks by implicitly representing not only information associated with the individual agents, such as agent position, velocity, and camera data, but also their relationships with one another, such as distances between the agents and their ability to communicate with one another. One such task is a multi-agent navigation problem in which the agents must coordinate with one another in a decentralized manner, using proximity sensors only, to navigate safely to their intended goal positions in the environment without collisions or deadlocks. The contribution of this thesis is the design of an end-to-end decentralized control scheme for multi-agent navigation that utilizes GNNs to prevent inter-agent collisions and deadlocks. The contributions consist of the development, simulation, and performance evaluation of an advantage actor-critic (A2C) reinforcement learning algorithm whose actor and critic networks simultaneously approximate the policy function and value function, respectively. These networks are implemented using GNN frameworks for navigation by groups of 3, 5, 10 and 15 agents in simulated two-dimensional environments. It is observed that in 40% to 50% of the simulation trials, 70% to 80% of the agents reach their goal positions without colliding with other agents or becoming trapped in deadlocks. The model is also compared to a baseline simulation in which actions are chosen randomly for the agents, and it is observed that the model performs notably well for smaller groups of agents.
ContributorsAyalasomayajula, Manaswini (Author) / Berman, Spring (Thesis advisor) / Mian, Sami (Committee member) / Pavlic, Theodore (Committee member) / Arizona State University (Publisher)
Created2022
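
As an illustration of the graph message passing described in this abstract, the sketch below shows a single GNN layer in which each agent (node) combines its own features with the averaged features of its graph neighbors. The feature dimensions, adjacency construction, and weight initialization are illustrative assumptions; the thesis' actor-critic networks are not reproduced here.

```python
# One round of graph message passing over agent nodes, implemented with NumPy.
import numpy as np

def gnn_layer(node_features, adjacency, w_self, w_neigh):
    """Each node combines its own features with the mean of its neighbors'
    features, followed by a ReLU nonlinearity."""
    # Normalize adjacency rows so each node averages over its neighbors.
    degree = adjacency.sum(axis=1, keepdims=True)
    degree[degree == 0] = 1.0
    neighbor_mean = (adjacency / degree) @ node_features
    return np.maximum(0.0, node_features @ w_self + neighbor_mean @ w_neigh)

# Example (illustrative): 5 agents, 4 input features each (x, y, vx, vy),
# 8 hidden units, and a random proximity graph.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))
adj = (rng.random((5, 5)) < 0.4).astype(float)
np.fill_diagonal(adj, 0.0)
h = gnn_layer(x, adj, rng.normal(size=(4, 8)), rng.normal(size=(4, 8)))
# h could then feed actor and critic heads in an A2C training loop.
```
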
Description

This work has improved the quality of the solution to the sparse rewards problem by combining reinforcement learning (RL) with knowledge-rich planning. Classical methods for coping with sparse rewards during reinforcement learning modify the reward landscape so as to better guide the learner. In contrast, this work combines RL with a planner in order to utilize other information about the environment. Because the scope for representing environmental information is limited in RL, this work integrates a model-free learning algorithm – temporal difference (TD) learning – with a Hierarchical Task Network (HTN) planner to accommodate rich environmental information in the algorithm. In the perpetual sparse rewards problem, rewards reemerge after being collected within a fixed interval of time, so there is no well-defined goal state to serve as an exit condition for the problem. Incorporating planning in the learning algorithm not only improves the quality of the solution, but also avoids the ambiguity of defining a profit-maximization goal that would arise if a planning algorithm alone were used to solve this problem. By occasionally invoking the HTN planner, the algorithm nudges the learned policy toward the optimal solution. In this work, I have demonstrated an on-policy algorithm that improves the quality of the solution over vanilla reinforcement learning. The objective of this work has been to assess the capacity of the synthesized algorithm to find optimal policies that maximize rewards while maintaining awareness of the environment and of the presence of other agents in the vicinity.
ContributorsNandan, Swastik (Author) / Pavlic, Theodore (Thesis advisor) / Das, Jnaneshwar (Thesis advisor) / Berman, Spring (Committee member) / Arizona State University (Publisher)
Created2022
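
The following sketch illustrates the combination described in this abstract: an on-policy temporal difference learner whose action selection occasionally defers to a planner's suggestion (a simple heuristic stands in for the HTN planner). The environment interface, toy task, planner heuristic, and hyperparameters are all illustrative assumptions, not the thesis implementation.

```python
# On-policy TD (SARSA-style) learning with occasional planner-suggested actions.
import random
from collections import defaultdict

def td_learning_with_planner(env, planner, episodes=100, alpha=0.1,
                             gamma=0.95, epsilon=0.1, planner_prob=0.2):
    q = defaultdict(float)  # Q(s, a) estimates

    def policy(state):
        # Occasionally defer to the planner's recommended action.
        if random.random() < planner_prob:
            return planner(state)
        # Otherwise act epsilon-greedily with respect to the Q estimates.
        if random.random() < epsilon:
            return random.choice(env.actions)
        return max(env.actions, key=lambda a: q[(state, a)])

    for _ in range(episodes):
        state = env.reset()
        action = policy(state)
        for _ in range(env.horizon):  # perpetual task: fixed-length episodes
            next_state, reward = env.step(state, action)
            next_action = policy(next_state)
            # On-policy TD update toward the sampled next action.
            td_target = reward + gamma * q[(next_state, next_action)]
            q[(state, action)] += alpha * (td_target - q[(state, action)])
            state, action = next_state, next_action
    return q

# Tiny toy environment (illustrative): a 1-D corridor where the reward
# reappears at cell 4 every time it is collected.
class ToyEnv:
    actions = [-1, +1]
    horizon = 50
    def reset(self):
        return 0
    def step(self, state, action):
        next_state = max(0, min(9, state + action))
        reward = 1.0 if next_state == 4 else 0.0
        return next_state, reward

q = td_learning_with_planner(ToyEnv(), planner=lambda s: +1 if s < 4 else -1)
```
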
Description

Multi-segment manipulators and mobile robot collectives are examples of multi-agent robotic systems, in which each segment or robot can be considered an agent. Fundamental motion control problems for such systems include the stabilization of one or more agents to target configurations or trajectories while preventing inter-agent collisions, agent collisions with obstacles, and deadlocks. Despite extensive research on these control problems, there are still challenges in designing controllers that (1) are scalable with the number of agents; (2) have theoretical guarantees on collision-free agent navigation; and (3) can be used when the states of the agents and the environment are only partially observable. Existing centralized and distributed control architectures have limited scalability due to their computational complexity and communication requirements, while decentralized control architectures are often effective only under impractical assumptions that do not hold in real-world implementations. The main objective of this dissertation is to develop and evaluate decentralized approaches for multi-agent motion control that enable agents to use their onboard sensors and computational resources to decide how to move through their environment, with limited or absent inter-agent communication and external supervision. Specifically, control approaches are designed for multi-segment manipulators and mobile robot collectives to achieve position and pose (position and orientation) stabilization, trajectory tracking, and collision and deadlock avoidance. These control approaches are validated in both simulations and physical experiments to show that they can be implemented in real-time while remaining computationally tractable. First, kinematic controllers are proposed for position stabilization and trajectory tracking control of two- or three-dimensional hyper-redundant multi-segment manipulators. Next, robust and gradient-based feedback controllers are presented for individual holonomic and nonholonomic mobile robots that achieve position stabilization, trajectory tracking control, and obstacle avoidance. Then, nonlinear Model Predictive Control methods are developed for collision-free, deadlock-free pose stabilization and trajectory tracking control of multiple nonholonomic mobile robots in known and unknown environments with obstacles, both static and dynamic. Finally, a feedforward proportional-derivative controller is defined for collision-free velocity tracking of a moving ground target by multiple unmanned aerial vehicles.
ContributorsSalimi Lafmejani, Amir (Author) / Berman, Spring (Thesis advisor) / Tsakalis, Konstantinos (Thesis advisor) / Papandreou-Suppappola, Antonia (Committee member) / Marvi, Hamidreza (Committee member) / Arizona State University (Publisher)
Created2022
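
As a rough illustration of receding-horizon control for a nonholonomic robot, the sketch below evaluates constant-input rollouts of a unicycle kinematic model and picks the lowest-cost velocity command. A proper nonlinear Model Predictive Control formulation, as developed in the dissertation, would use an optimization solver and explicit constraints; the costs, bounds, and parameters here are assumptions made only for illustration.

```python
# Sampling-based receding-horizon control of a unicycle-model robot.
import numpy as np

def unicycle_step(state, v, omega, dt=0.1):
    x, y, theta = state
    return np.array([x + v * np.cos(theta) * dt,
                     y + v * np.sin(theta) * dt,
                     theta + omega * dt])

def mpc_control(state, goal, obstacles, horizon=10, samples=200, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    best_cost, best_u = np.inf, (0.0, 0.0)
    for _ in range(samples):
        v = rng.uniform(0.0, 1.0)        # candidate linear velocity
        omega = rng.uniform(-1.5, 1.5)   # candidate angular velocity
        s, cost = state, 0.0
        for _ in range(horizon):
            s = unicycle_step(s, v, omega)
            cost += np.linalg.norm(s[:2] - goal)   # tracking cost
            for obs in obstacles:                   # soft collision penalty
                if np.linalg.norm(s[:2] - obs) < 0.5:
                    cost += 100.0
        if cost < best_cost:
            best_cost, best_u = cost, (v, omega)
    return best_u  # apply only the first control, then re-plan at the next step

state = np.array([0.0, 0.0, 0.0])
v, omega = mpc_control(state, goal=np.array([3.0, 2.0]),
                       obstacles=[np.array([1.5, 1.0])])
```
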
Description

Over the past few decades, there has been an increase in demand for various ground robot applications such as warehouse management, surveillance, mapping, and infrastructure inspection. This steady increase in demand has led to a significant rise in research on nonholonomic differential drive vehicles (DDVs). Although extensive work has been done on developing various control laws for trajectory tracking, point stabilization, formation control, etc., there are still problems and critical questions regarding the design, modeling, and control of DDVs that need to be adequately addressed. In this thesis, three different dynamical models are considered that are formed by varying the input/output parameters of the DDV model. These models are analyzed to understand their stability, bandwidth, input-output coupling, and control design properties. Furthermore, a systematic approach is presented to show the impact of design parameters such as mass, inertia, wheel radius, and center of gravity location on the dynamic and inner-loop (speed) control design properties. Subsequently, extensive simulation and hardware trade studies have been conducted to quantify the impact of design parameters and modeling variations on the performance of outer-loop cruise and position control (along a curve). In addition, detailed guidelines are provided for when a multi-input multi-output (MIMO) control strategy is advisable over a single-input single-output (SISO) control strategy, and when a less stable plant is preferable to a more stable one in order to accommodate performance specifications. Additionally, a multi-robot trajectory tracking implementation based on a receding horizon optimization approach is presented. In most of the optimization-based trajectory tracking approaches found in the literature, only the constraints imposed by the kinematic model are incorporated into the problem. This thesis elaborates on the fundamental problem associated with these methods and presents a systematic approach to understand and quantify when kinematic model-based constraints are sufficient and when dynamic model-based constraints are necessary to obtain good tracking properties. Detailed instructions are given for designing and building the DDV based on performance specifications, and an open-source platform capable of handling high-speed multi-robot research is developed in C++.
ContributorsManne, Sai Sravan (Author) / Rodriguez, Armando A (Thesis advisor) / Si, Jennie (Committee member) / Berman, Spring (Committee member) / Arizona State University (Publisher)
Created2021
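
The sketch below gives the standard differential drive kinematic model that underlies the design-parameter discussion in this abstract, mapping left and right wheel speeds to body velocities through the wheel radius and track width. The numerical values are illustrative assumptions, not those of the thesis hardware.

```python
# Standard differential drive vehicle (DDV) kinematics and a short simulation.
import numpy as np

def ddv_kinematics(omega_left, omega_right, wheel_radius=0.05, track_width=0.3):
    """Return (v, omega): forward speed and yaw rate of the vehicle body."""
    v = wheel_radius * (omega_right + omega_left) / 2.0
    omega = wheel_radius * (omega_right - omega_left) / track_width
    return v, omega

def simulate(pose, omega_left, omega_right, dt=0.01, steps=100, **params):
    """Integrate the unicycle-equivalent kinematics over a short horizon."""
    x, y, theta = pose
    for _ in range(steps):
        v, omega = ddv_kinematics(omega_left, omega_right, **params)
        x += v * np.cos(theta) * dt
        y += v * np.sin(theta) * dt
        theta += omega * dt
    return x, y, theta

# Example: a slightly faster right wheel produces a gentle left turn.
print(simulate((0.0, 0.0, 0.0), omega_left=10.0, omega_right=11.0))
```
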
Description

The objective of this project was to research and experimentally test methods of localization, waypoint following, and actuation for high-speed driving by an autonomous vehicle. This thesis describes the implementation of LiDAR localization techniques, Model Predictive Control waypoint following, and communication for actuation on a 2016 Chevrolet Camaro, Arizona State University's former EcoCAR. The LiDAR localization techniques include the NDT Mapping and Matching algorithms from the open-source autonomous vehicle platform Autoware. The mapping algorithm was supplemented by Google Cartographer due to map-size limitations in Autoware's algorithms. The Model Predictive Control approach to waypoint following and the computer-to-microcontroller-to-actuator communication chain are also described. In addition to this experimental work, the thesis discusses an investigation of alternative approaches for each problem.
ContributorsCopenhaver, Bryce Stone (Author) / Berman, Spring (Thesis director) / Yong, Sze Zheng (Committee member) / Dean, W.P. Carey School of Business (Contributor) / Engineering Programs (Contributor) / Barrett, The Honors College (Contributor)
Created2020-05
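
To illustrate the waypoint-following portion of this project in code, the sketch below uses a simple pure-pursuit steering law rather than the Model Predictive Controller actually implemented in the thesis; the lookahead distance, wheelbase, and waypoints are illustrative assumptions.

```python
# Pure-pursuit-style waypoint following (illustrative substitute for MPC).
import numpy as np

def pure_pursuit_steer(pose, waypoints, lookahead=5.0, wheelbase=2.8):
    """Return a steering angle that drives the vehicle toward the first
    waypoint at least `lookahead` meters ahead of the current position."""
    x, y, yaw = pose
    for wx, wy in waypoints:
        dx, dy = wx - x, wy - y
        if np.hypot(dx, dy) >= lookahead:
            # Transform the target point into the vehicle frame.
            local_x = np.cos(-yaw) * dx - np.sin(-yaw) * dy
            local_y = np.sin(-yaw) * dx + np.cos(-yaw) * dy
            if local_x <= 0:
                continue  # skip waypoints behind the vehicle
            # Pure pursuit curvature and equivalent front-wheel steering angle.
            curvature = 2.0 * local_y / (local_x ** 2 + local_y ** 2)
            return np.arctan(wheelbase * curvature)
    return 0.0  # no suitable waypoint found: hold the wheel straight

waypoints = [(i * 2.0, 0.1 * i ** 2) for i in range(50)]  # gently curving path
steer = pure_pursuit_steer(pose=(0.0, 0.0, 0.0), waypoints=waypoints, lookahead=6.0)
```
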