This collection includes both ASU Theses and Dissertations, submitted by graduate students, and theses from Barrett, The Honors College, submitted by undergraduate students.

Displaying 1 - 7 of 7

Description

Robots are often used in long-duration scenarios, such as on the surface of Mars, where they may need to adapt to environmental changes. Typically, robots have been built specifically for single tasks, such as moving boxes in a warehouse or surveying construction sites. However, there is a modern trend away from human hand-engineering and toward robot learning. To this end, the ideal robot is not engineered, but automatically designed for a specific task. This thesis focuses on robots which learn path-planning algorithms for specific environments. Learning is accomplished via genetic programming. Path-planners are represented as Python code, which is optimized via Pareto evolution. These planners are encouraged to explore curiously and efficiently. This research asks the questions: “How can robots exhibit life-long learning where they adapt to changing environments in a robust way?” and “How can robots learn to be curious?”.
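As a rough illustration of the Pareto evolution described above, the sketch below selects the non-dominated subset of candidate planners scored on two objectives. The planner representation and the objectives (exploration coverage and negated path cost) are placeholders, not the thesis implementation.

```python
# Minimal sketch (not from the thesis) of Pareto selection over candidate
# path-planners scored on two objectives to be maximized.
import random

def dominates(a, b):
    """True if score tuple a is at least as good as b on every objective
    and strictly better on at least one (both objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(population, scores):
    """Return the non-dominated subset of the population."""
    return [cand for i, cand in enumerate(population)
            if not any(dominates(scores[j], scores[i])
                       for j in range(len(population)) if j != i)]

# Toy usage: each "planner" is just a label; scores = (coverage, -path_cost).
population = [f"planner_{i}" for i in range(8)]
scores = [(random.random(), -random.random()) for _ in population]
print(pareto_front(population, scores))
```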

Contributors: Saldyt, Lucas P (Author) / Ben Amor, Heni (Thesis director) / Pavlic, Theodore (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created: 2021-05
Description

A Graph Neural Network (GNN) is a type of neural network architecture that operates on data consisting of objects and their relationships, which are represented by a graph. Within the graph, nodes represent objects and edges represent associations between those objects. The representation of relationships and correlations between data is unique to graph structures. GNNs exploit this feature of graphs by augmenting both forms of data, individual and relational, and have been designed to allow for communication and sharing of data within each neural network layer. These benefits allow each node to have an enriched perspective, or a better understanding, of its neighbouring nodes and its connections to those nodes. The ability of GNNs to efficiently process high-dimensional node data and multi-faceted relationships among nodes gives them advantages over neural network architectures such as Convolutional Neural Networks (CNNs) that do not implicitly handle relational data. These quintessential characteristics of GNN models make them suitable for solving problems in which the correspondences among input data are needed to produce an accurate and precise representation of these data. GNN frameworks may significantly improve existing communication and control techniques for multi-agent tasks by implicitly representing not only information associated with the individual agents, such as agent position, velocity, and camera data, but also their relationships with one another, such as distances between the agents and their ability to communicate with one another. One such task is a multi-agent navigation problem in which the agents must coordinate with one another in a decentralized manner, using proximity sensors only, to navigate safely to their intended goal positions in the environment without collisions or deadlocks. The contribution of this thesis is the design of an end-to-end decentralized control scheme for multi-agent navigation that utilizes GNNs to prevent inter-agent collisions and deadlocks. The contributions consist of the development, simulation, and evaluation of the performance of an advantage actor-critic (A2C) reinforcement learning algorithm that employs actor and critic networks for training that simultaneously approximate the policy function and value function, respectively. These networks are implemented using GNN frameworks for navigation by groups of 3, 5, 10, and 15 agents in simulated two-dimensional environments. It is observed that in 40% to 50% of the simulation trials, between 70% and 80% of the agents reach their goal positions without colliding with other agents or becoming trapped in deadlocks. The model is also compared to a random-run simulation, in which actions are chosen randomly for the agents, and it is observed that the model performs notably well for smaller groups of agents.
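As a hedged illustration of the message passing that underlies GNN layers such as those described above (not the thesis architecture), the sketch below averages each agent's features with those of its proximity-graph neighbors and applies a shared transform; the feature dimensions, weights, and adjacency are toy values.

```python
# Illustrative sketch of one message-passing step of a graph neural network:
# each agent (node) averages its neighbors' features with its own, then
# applies a shared linear transform followed by a ReLU nonlinearity.
import numpy as np

def gnn_layer(node_feats, adjacency, weight):
    """node_feats: (N, F) per-agent features; adjacency: (N, N) 0/1 matrix;
    weight: (F, F_out) parameters shared across all agents."""
    a_hat = adjacency + np.eye(adjacency.shape[0])        # add self-loops
    a_norm = a_hat / a_hat.sum(axis=1, keepdims=True)     # mean aggregation
    return np.maximum(a_norm @ node_feats @ weight, 0.0)  # ReLU

# Toy usage: 5 agents with 4-dimensional features (e.g., position, velocity).
rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 4))
adj = (rng.random((5, 5)) < 0.4).astype(float)
adj = np.maximum(adj, adj.T)      # undirected proximity graph
np.fill_diagonal(adj, 0.0)
out = gnn_layer(feats, adj, rng.normal(size=(4, 8)))
print(out.shape)  # (5, 8)
```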
Contributors: Ayalasomayajula, Manaswini (Author) / Berman, Spring (Thesis advisor) / Mian, Sami (Committee member) / Pavlic, Theodore (Committee member) / Arizona State University (Publisher)
Created: 2022
Description

In recent years, the development of Control Barrier Functions (CBF) has allowed safety guarantees to be placed on nonlinear control affine systems. While powerful as a mathematical tool, CBF implementations on systems with high relative degree constraints can become too computationally intensive for real-time control. Such deployments typically rely on the analysis of a system's symbolic equations of motion, leading to large, platform-specific control programs that do not generalize well. To address this, a more generalized framework is needed. This thesis provides a formulation for second-order CBFs for rigid open kinematic chains. An algorithm for numerically computing the safe control input of a CBF is then introduced based on this formulation. It is shown that this algorithm can be used on a broad category of systems, with specific examples shown for convoy platooning, drone obstacle avoidance, and robotic arms with many degrees of freedom. These examples show up to three-fold improvements in computation time, as well as a two to three order-of-magnitude reduction in program size.
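To make the CBF safety-filter idea concrete, the following sketch is an illustration under simplifying assumptions, not the thesis algorithm: it minimally modifies a nominal control for a single-integrator robot so that a relative-degree-one barrier condition holds. The obstacle, gain, and dynamics are placeholders.

```python
# Hedged sketch of a control barrier function (CBF) safety filter for a
# single-integrator robot x_dot = u. Safety set: h(x) = ||x - obs||^2 - r^2 >= 0.
# The filter minimally modifies a nominal input so that h_dot + alpha*h >= 0.
import numpy as np

def cbf_filter(x, u_nom, obstacle, radius, alpha=1.0):
    h = np.dot(x - obstacle, x - obstacle) - radius**2
    grad_h = 2.0 * (x - obstacle)                     # dh/dx
    # Constraint: grad_h . u >= -alpha * h. Closed-form solution of the
    # one-constraint QP  min ||u - u_nom||^2:
    slack = np.dot(grad_h, u_nom) + alpha * h
    if slack >= 0.0:
        return u_nom                                  # nominal input already safe
    return u_nom - slack * grad_h / np.dot(grad_h, grad_h)

# Toy usage: drive toward the origin while avoiding a disk of radius 1 at (1, 0).
x = np.array([2.0, 0.1])
print(cbf_filter(x, u_nom=-x, obstacle=np.array([1.0, 0.0]), radius=1.0))
```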
Contributors: Pietz, Daniel Johannes (Author) / Fainekos, Georgios (Thesis advisor) / Vrudhula, Sarma (Thesis advisor) / Pedrielli, Giulia (Committee member) / Pavlic, Theodore (Committee member) / Arizona State University (Publisher)
Created: 2022
Description

In certain ant species, groups of ants work together to transport food and materials back to their nests. In some cases, the group exhibits a leader-follower behavior in which a single ant guides the entire group based on its knowledge of the destination. At times, the leader role is occupied temporarily by an ant, only to be replaced when an ant with new information arrives. This kind of behavior can be very useful in uncertain environments where robot teams work together to transport a heavy or bulky payload. The purpose of this research was to study ways to implement this behavior on robot teams.

In this work, I combined existing dynamical models of collective transport in ants to create a stochastic model that describes these behaviors and can be used to control multi-robot systems to perform collective transport. In this model, each agent transitions stochastically between roles based on the force that it senses the other agents are applying to the load. The agent’s motion is governed by a proportional controller that updates its applied force based on the load velocity. I developed agent-based simulations of this model in NetLogo and explored leader-follower scenarios in which agents receive information about the transport destination from a newly informed agent (leader) that joins the team. From these simulations, I derived the mean allocations of agents between “puller” and “lifter” roles and the mean forces applied by the agents throughout the motion.
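A minimal sketch of this kind of agent update is given below; the switching-rate function, controller gain, and time step are illustrative assumptions rather than the values used in the thesis.

```python
# Hedged sketch of one agent's stochastic role update and proportional force
# controller in a collective-transport model (parameters are placeholders).
import random

def step_agent(role, sensed_force, load_velocity, desired_velocity, k_p=0.5, dt=0.1):
    # Probability of switching roles grows with the magnitude of the force the
    # agent senses the rest of the team applying to the load (assumed form).
    p_switch = min(1.0, dt * abs(sensed_force))
    if random.random() < p_switch:
        role = "lifter" if role == "puller" else "puller"
    # Pullers apply a force proportional to the load's velocity error;
    # lifters support the load but apply no pulling force.
    applied_force = k_p * (desired_velocity - load_velocity) if role == "puller" else 0.0
    return role, applied_force

# Toy usage: a single agent over a few time steps.
role, f = "lifter", 0.0
for _ in range(5):
    role, f = step_agent(role, sensed_force=0.3, load_velocity=0.2, desired_velocity=1.0)
    print(role, round(f, 3))
```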

From these simulation results, we show that the mean ratio of lifter to puller populations is approximately 1:1. We also show that agents using the role update procedure based on forces are required to exert less force than agents that select their role based on their position on the load, although both strategies achieve similar transport speeds.
Contributors: Gah, Elikplim (Author) / Berman, Spring M (Thesis advisor, Committee member) / Pavlic, Theodore (Committee member) / Marvi, Hamidreza (Committee member) / Arizona State University (Publisher)
Created: 2020
Description

The world is filled with systems of entities that collaborate in motion, both natural and engineered. These cooperative distributed systems are capable of sophisticated emergent behavior arising from the comparatively simple interactions of their members. A model system for emergent collective behavior is programmable matter, a physical substance capable of autonomously changing its properties in response to user input or environmental stimuli. This dissertation studies distributed and stochastic algorithms that control the local behaviors of individual modules of programmable matter to induce complex collective behavior at the macroscale. It consists of four parts. In the first, the canonical amoebot model of programmable matter is proposed. A key goal of this model is to bring algorithmic theory closer to the physical realities of programmable matter hardware, especially with respect to concurrency and energy distribution. Two protocols are presented that together extend sequential, energy-agnostic algorithms to the more realistic concurrent, energy-constrained setting without sacrificing correctness, assuming the original algorithms satisfy certain conventions. In the second part, stateful distributed algorithms using amoebot memory and communication are presented for leader election, object coating, convex hull formation, and hexagon formation. The first three algorithms are proven to have linear runtimes when assuming a simplified sequential setting. The final algorithm for hexagon formation is instead proven to be correct under unfair asynchronous adversarial activation, the most general of all adversarial activation models. In the third part, distributed algorithms are combined with ideas from statistical physics and Markov chain design to replace algorithm reliance on memory and communication with biased random decisions, gaining inherent self-stabilizing and fault-tolerant properties. Using this stochastic approach, algorithms for compression, shortcut bridging, and separation are designed and analyzed. Finally, a two-pronged approach to "programming" physical ensembles is presented. This approach leverages the physics of local interactions to pair theoretical abstractions of self-organizing particle systems with experimental robot systems of active granular matter that intentionally lack digital computation and communication. By physically embodying the salient features of an algorithm in robot design, the algorithm's theoretical analysis can predict the robot ensemble's behavior. This approach is applied to phototaxing, aggregation, dispersion, and object transport.
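As a rough sketch of the stochastic approach described above, a Metropolis-style biased acceptance rule can be written as follows; the bias parameter and neighbor counts are illustrative assumptions, not the dissertation's full compression algorithm.

```python
# Minimal sketch of the biased random decision used in stochastic compression:
# a particle proposes a random move and accepts it with probability
# bias ** (new_neighbors - old_neighbors), so bias > 1 favors configurations
# in which particles have more neighbors (i.e., the system compresses).
import random

def accept_move(old_neighbors, new_neighbors, bias=4.0):
    """Return True if the proposed move should be accepted."""
    return random.random() < min(1.0, bias ** (new_neighbors - old_neighbors))

# Toy usage: a move that loses two neighbors is accepted only occasionally.
print(accept_move(old_neighbors=3, new_neighbors=1, bias=4.0))
```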
Contributors: Daymude, Joshua (Author) / Richa, Andréa W (Thesis advisor) / Scheideler, Christian (Committee member) / Randall, Dana (Committee member) / Pavlic, Theodore (Committee member) / Gil, Stephanie (Committee member) / Arizona State University (Publisher)
Created: 2021
Description

As technological advancements in silicon, sensors, and actuation continue, the development of robotic swarms is shifting from the domain of science fiction to reality. Many swarm applications, such as environmental monitoring, precision agriculture, disaster response, and lunar prospecting, will require controlling numerous robots with limited capabilities and information to redistribute among multiple states, such as spatial locations or tasks. A scalable control approach is to program the robots with stochastic control policies such that the robot population in each state evolves according to a mean-field model, which is independent of the number and identities of the robots. Using this model, the control policies can be designed to stabilize the swarm to the target distribution. To avoid the need to reprogram the robots for different target distributions, the robot control policies can be defined to depend only on the presence of a “leader” agent, whose control policy is designed to guide the swarm to a particular distribution. This dissertation presents a novel deep reinforcement learning (deep RL) approach to designing control policies that redistribute a swarm as quickly as possible over a strongly connected graph, according to a mean-field model in the form of the discrete-time Kolmogorov forward equation. In the leader-based strategies, the leader determines its next action based on its observations of robot populations and shepherds the swarm over the graph by probabilistically repelling nearby robots. The scalability of this approach with the swarm size is demonstrated with leader control policies that are designed using two tabular Temporal-Difference learning algorithms, trained on a discretization of the swarm distribution. To improve the scalability of the approach with robot population and graph size, control policies for both leader-based and leaderless strategies are designed using an actor-critic deep RL method that is trained on the swarm distribution predicted by the mean-field model. In the leaderless strategy, the robots’ control policies depend only on their local measurements of nearby robot populations. The control approaches are validated for different graph and swarm sizes in numerical simulations, 3D robot simulations, and experiments on a multi-robot testbed.
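A hedged sketch of the discrete-time mean-field update described above appears below; the transition matrix entries are placeholders rather than values from the dissertation.

```python
# Sketch of the mean-field model (discrete-time Kolmogorov forward equation)
# predicting how the swarm's distribution over graph vertices evolves:
# x_{k+1} = P^T x_k, where P[i, j] is the probability that a robot at vertex i
# moves to vertex j in one time step.
import numpy as np

def mean_field_step(x, P):
    """x: (V,) fraction of the swarm at each vertex; P: (V, V) row-stochastic."""
    return P.T @ x

# Toy usage: a 3-vertex strongly connected graph, uniform initial distribution.
P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.6, 0.2],
              [0.0, 0.4, 0.6]])
x = np.array([1 / 3, 1 / 3, 1 / 3])
for _ in range(10):
    x = mean_field_step(x, P)
print(np.round(x, 3))  # approaches the stationary distribution of P
```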
Contributors: Kakish, Zahi Mousa (Author) / Berman, Spring (Thesis advisor) / Yong, Sze Zheng (Committee member) / Marvi, Hamid (Committee member) / Pavlic, Theodore (Committee member) / Pratt, Stephen (Committee member) / Ben Amor, Hani (Committee member) / Arizona State University (Publisher)
Created: 2021
Description

Robot motion and control remains a complex problem, both in general and in the field of machine learning (ML). Without ML approaches, robot controllers are typically designed manually, which can take considerable time, generally requiring accounting for a range of edge cases and often producing models highly constrained to specific tasks. ML can decrease the time it takes to create a model while simultaneously allowing it to operate on a broader range of tasks. The utilization of neural networks to learn from demonstration is, in particular, an approach with growing popularity due to its potential to quickly fit the parameters of a model to mimic training data. Many such neural networks, especially in the realm of transformer-based architectures, act more as planners, taking in an initial context and then generating a sequence from that context one step at a time. Others hybridize the approach, predicting a latent plan and conditioning immediate actions on that plan. Such approaches may limit a model’s ability to interact with a dynamic environment, needing to replan to fully update its understanding of the environmental context. In this thesis, Language-commanded Scene-aware Action Response (LanSAR) is proposed as a reactive transformer-based neural network that makes immediate decisions based on previous actions and environmental changes. Its actions are further conditioned on a language command, serving as a control mechanism while also narrowing the distribution of possible actions around this command. It is shown that LanSAR successfully learns a strong representation of multimodal visual and spatial input and learns reasonable motions in relation to most language commands. It is also shown that LanSAR can struggle with both the accuracy of motions and understanding the specific semantics of language commands.
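As a conceptual sketch only (LanSAR's actual architecture is not reproduced here; the token contents, dimensions, and action head are assumptions), the snippet below shows how an immediate action can be read out from attention over language, vision, and robot-state tokens.

```python
# Illustrative sketch of conditioning an immediate action on language, visual,
# and state tokens via scaled dot-product attention.
import numpy as np

def attention(q, k, v):
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Toy usage: one language token, four vision tokens, one robot-state token.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(6, 16))             # [language; vision x4; state]
query = tokens[-1:]                            # the state token queries the context
context = attention(query, tokens, tokens)     # fused multimodal representation
action = context @ rng.normal(size=(16, 7))    # e.g., a 7-DoF action readout
print(action.shape)  # (1, 7)
```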
Contributors: Hardy, Adam (Author) / Ben Amor, Heni (Thesis advisor) / Srivastava, Siddharth (Committee member) / Pavlic, Theodore (Committee member) / Arizona State University (Publisher)
Created: 2024