Dynamic Potential Fields for Flexible Behavior-based Swarm Control via Reinforcement Learning

Description

In this thesis work, a novel learning approach to solving the problem of controlling a quadcopter (drone) swarm is explored. To deal with large swarm sizes, swarm control is often achieved in a distributed fashion by combining different behaviors such that each behavior implements some desired swarm characteristic, such as avoiding obstacles and staying close to neighbors. One common approach in distributed swarm control uses potential fields. A limitation of this approach is that the potential fields often depend statically on a set of control parameters that are manually specified a priori. This work introduces Dynamic Potential Fields for flexible swarm control. These potential fields are modulated by a set of dynamic control parameters (DCPs) that can change under different environmental situations. Restricting the focus to these DCPs simplifies the learning problem and makes it feasible for practical use. This approach uses soft actor-critic (SAC), where the actor only determines how to modify the DCPs in the current situation, resulting in more flexible swarm control. The results show that the DCP approach allows the drones to better traverse environments with obstacles compared to several state-of-the-art swarm control methods with a fixed set of control parameters. This approach also obtains a higher score on a safety metric commonly used to assess swarm behavior. The approach is compared against a basic reinforcement learning approach to demonstrate faster convergence. Finally, an ablation study is conducted to validate the design of this approach.
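As a minimal illustrative sketch, and not the thesis's implementation, the idea of a potential field whose gains are supplied at run time can be pictured as follows. The function name, the particular attractive and repulsive terms, and the parameters `k_att`, `k_rep`, and `influence_radius` are all assumptions made for illustration; in a DCP-style scheme, a learned policy would choose such gains per situation instead of fixing them a priori.

```python
import math

def potential_field_force(pos, goal, obstacles, k_att, k_rep, influence_radius):
    """Return the 2D force on one agent from a goal attractor and obstacle repulsors.

    k_att and k_rep are hypothetical control parameters; passing different
    values at each step mimics parameters that change with the situation.
    """
    # Attractive term: pulls linearly toward the goal.
    fx = k_att * (goal[0] - pos[0])
    fy = k_att * (goal[1] - pos[1])
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        dist = math.hypot(dx, dy)
        # Obstacles only act inside their influence radius.
        if 0.0 < dist < influence_radius:
            # Textbook repulsive gradient magnitude: k_rep * (1/d - 1/r) / d^2
            mag = k_rep * (1.0 / dist - 1.0 / influence_radius) / dist**2
            # Push directly away from the obstacle.
            fx += mag * dx / dist
            fy += mag * dy / dist
    return fx, fy
```

With an agent at the origin, a goal at (1, 0), and an obstacle at (0.5, 0), lowering `k_rep` from 1.0 to 0.25 (with `k_att = 2.0` and `influence_radius = 1.0`) flips the net force from pointing away from the goal to pointing toward it, which is the kind of behavioral change that fixed parameters cannot provide.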

Contributors: Ferraro, Calvin Shores (Author) / Zhang, Yu (Thesis advisor) / Ben Amor, Hani (Committee member) / Berman, Spring (Committee member) / Arizona State University (Publisher)

Created: 2022

Automated Geoscience with Robotics and Machine Learning: A New Hammer of Rock Detection, Mapping, and Dynamics Analysis

Description

Despite the rapid adoption of robotics and machine learning in industry, their application to scientific studies remains under-explored. Combining industry-driven advances with scientific exploration provides new perspectives and a greater understanding of the planet and its environmental processes. Focusing on rock detection, mapping, and dynamics analysis, I present technical approaches and scientific results of developing robotics and machine learning technologies for geomorphology and seismic hazard analysis. I demonstrate an interdisciplinary research direction to push the frontiers of both robotics and geosciences, with potential translational contributions to commercial applications for hazard monitoring and prospecting. To understand the effects of rocky fault scarp development on rock trait distributions, I present a data-processing pipeline that utilizes unpiloted aerial vehicles (UAVs) and deep learning to segment densely distributed rocks in several orders of magnitude. Quantification and correlation analysis of rock trait distributions demonstrate a statistical approach for geomorphology studies. Fragile geological features such as precariously balanced rocks (PBRs) provide upper-bound ground motion constraints for hazard analysis. I develop complementary offboard and onboard methods for PBR searching and mapping. Using deep learning, the offboard method segments PBRs in point clouds reconstructed from UAV surveys. The onboard method equips a UAV with edge-computing devices and stereo cameras, enabling onboard machine learning for real-time PBR search, detection, and mapping during surveillance. The offboard method provides an efficient solution for finding PBR candidates in existing point clouds, which is useful for field reconnaissance. The onboard method emphasizes mapping individual PBRs for their complete visible surface features, such as basal contacts with pedestals, which are critical geometry for analyzing fragility.
After PBRs are mapped, I investigate PBR dynamics by building a virtual shake robot (VSR) that simulates ground motions to test PBR overturning. The VSR demonstrates that ground motion directions and niches are important factors determining PBR fragility, which were rarely considered in previous studies. The VSR also enables PBR large-displacement studies by tracking a toppled-PBR trajectory, presenting novel methods of rockfall hazard zoning. I also build a physical mini shake robot, which provides a reverse method to validate simulation experiments in the VSR.

Contributors: Chen, Zhiang (Author) / Arrowsmith, Ramon (Thesis advisor) / Das, Jnaneshwar (Thesis advisor) / Bell, James (Committee member) / Berman, Spring (Committee member) / Christensen, Philip (Committee member) / Whipple, Kelin (Committee member) / Arizona State University (Publisher)

Created: 2022

Design of a Graph Neural Network Coupled with an Advantage Actor-Critic Reinforcement Learning Algorithm for Multi-Agent Navigation

Description

A Graph Neural Network (GNN) is a type of neural network architecture that operates on data consisting of objects and their relationships, which are represented by a graph. Within the graph, nodes represent objects and edges represent associations between those objects. The representation of relationships and correlations between data is unique to graph structures. GNNs exploit this feature of graphs by augmenting both forms of data, individual and relational, and have been designed to allow for communication and sharing of data within each neural network layer. These benefits allow each node to have an enriched perspective, or a better understanding, of its neighbouring nodes and its connections to those nodes. The ability of GNNs to efficiently process high-dimensional node data and multi-faceted relationships among nodes gives them advantages over neural network architectures such as Convolutional Neural Networks (CNNs) that do not implicitly handle relational data. These quintessential characteristics of GNN models make them suitable for solving problems in which the correspondences among input data are needed to produce an accurate and precise representation of these data. GNN frameworks may significantly improve existing communication and control techniques for multi-agent tasks by implicitly representing not only information associated with the individual agents, such as agent position, velocity, and camera data, but also their relationships with one another, such as distances between the agents and their ability to communicate with one another. One such task is a multi-agent navigation problem in which the agents must coordinate with one another in a decentralized manner, using proximity sensors only, to navigate safely to their intended goal positions in the environment without collisions or deadlocks. 
The contribution of this thesis is the design of an end-to-end decentralized control scheme for multi-agent navigation that utilizes GNNs to prevent inter-agent collisions and deadlocks. The contributions consist of the development, simulation, and performance evaluation of an advantage actor-critic (A2C) reinforcement learning algorithm that trains actor and critic networks to simultaneously approximate the policy function and value function, respectively. These networks are implemented using GNN frameworks for navigation by groups of 3, 5, 10, and 15 agents in simulated two-dimensional environments. It is observed that in 40% to 50% of the simulation trials, between 70% and 80% of the agents reach their goal positions without colliding with other agents or becoming trapped in deadlocks. The model is also compared to a random-run simulation, in which actions are chosen randomly for the agents, and it is observed that the model performs notably well for smaller groups of agents.
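To make the actor-critic relationship concrete, here is a small sketch of the one-step advantage estimate that A2C-style methods commonly use: the critic's value estimates turn a reward into a signal for the actor. The function name and the default discount factor are illustrative assumptions; the abstract does not state which advantage estimator or hyperparameters the thesis uses.

```python
def one_step_advantage(reward, value_s, value_next, gamma=0.99, done=False):
    """One-step advantage estimate: A = r + gamma * V(s') - V(s).

    value_s and value_next stand in for critic outputs; a positive result
    suggests the action did better than the critic expected.
    """
    # No bootstrapping from the next state once the episode has ended.
    bootstrap = 0.0 if done else gamma * value_next
    return reward + bootstrap - value_s
```

In a typical A2C training step, this quantity would weight the actor's log-probability gradient, while its square would serve as the critic's regression loss.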

Contributors: Ayalasomayajula, Manaswini (Author) / Berman, Spring (Thesis advisor) / Mian, Sami (Committee member) / Pavlic, Theodore (Committee member) / Arizona State University (Publisher)

Created: 2022

AvaCAR

Description

For a system of autonomous vehicles functioning together in a traffic scene, 3D understanding of the participants in the field of view or surroundings is essential for assessing the safe operation of the vehicles involved. This problem can be decomposed into online pose and shape estimation, which has been a core research area of computer vision for over a decade. This work is an add-on to support and improve the joint estimation of the pose and shape of vehicles from monocular cameras. The objective of jointly estimating the vehicle pose and shape online is enabled by an offline reconstruction pipeline. In the offline reconstruction step, an approach to obtain the vehicle 3D shape with labeled keypoints is formulated. This work proposes a multi-view reconstruction pipeline using images and masks that can create an approximate shape of a vehicle, which can be used as a shape prior. Then a 3D model-fitting optimization approach is developed to refine the shape prior using high-quality computer-aided design (CAD) models of vehicles. A dataset of such 3D vehicles with 20 annotated keypoints is prepared and called the AvaCAR dataset. The AvaCAR dataset can be used to estimate vehicle shape and pose without the need to collect the significant amounts of data required for adequate training of a neural network. The online reconstruction can use this synthetic dataset to generate novel viewpoints and simultaneously train a neural network for pose and shape estimation. Most methods in the current literature that use deep neural networks trained to estimate the pose of an object from a single image are inherently biased toward the viewpoint of the images used. This approach aims to address these limitations by providing the online estimation with a shape prior that can generate novel views to account for the viewpoint bias.
The dataset is provided with ground-truth extrinsic parameters and compact vector-based shape representations, which, along with the multi-view dataset, can be used to efficiently train neural networks for vehicle pose and shape estimation. The vehicles in this library are evaluated with standard metrics to ensure that they are capable of aiding online estimation and model-based tracking.

Contributors: DUTTA, PRABAL BIJOY (Author) / Yang, Yezhou (Thesis advisor) / Berman, Spring (Committee member) / Lu, Duo (Committee member) / Arizona State University (Publisher)

Created: 2022

Combining learning with knowledge-rich planning allows for efficient multi-agent solutions to the problem of perpetual sparse rewards

Description

This work has improved the quality of the solution to the sparse rewards problem by combining reinforcement learning (RL) with knowledge-rich planning. Classical methods for coping with sparse rewards during reinforcement learning modify the reward landscape so as to better guide the learner. In contrast, this work combines RL with a planner in order to utilize other information about the environment. As the scope for representing environmental information is limited in RL, this work has combined a model-free learning algorithm, temporal difference (TD) learning, with a Hierarchical Task Network (HTN) planner to accommodate rich environmental information in the algorithm. In the perpetual sparse rewards problem, rewards reemerge within a fixed interval of time after being collected, so there is no well-defined goal state to serve as an exit condition for the problem. Incorporating planning in the learning algorithm not only improves the quality of the solution, but also avoids the ambiguity of specifying a profit-maximization goal when only a planning algorithm is used to solve this problem. By occasionally invoking the HTN planner, the algorithm receives the necessary adjustment toward the optimal solution. In this work, I have demonstrated an on-policy algorithm that improves the quality of the solution over vanilla reinforcement learning. The objective of this work has been to observe the capacity of the synthesized algorithm to find optimal policies that maximize rewards, its awareness of the environment, and its awareness of the presence of other agents in the vicinity.
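For readers unfamiliar with temporal difference learning, the sketch below shows one on-policy TD update in the SARSA form. This is an assumption for illustration only: the abstract says the algorithm is on-policy and TD-based but does not name SARSA, and the function name, state and action labels, learning rate, and discount factor here are hypothetical. The HTN planner is not modeled.

```python
from collections import defaultdict

def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9):
    """Apply one on-policy TD (SARSA) update to the table q in place.

    q maps (state, action) pairs to value estimates. Returns the TD error.
    """
    # Target uses the action actually chosen in the next state (on-policy).
    td_target = reward + gamma * q[(next_state, next_action)]
    td_error = td_target - q[(state, action)]
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * td_error
    return td_error

# Usage: an empty table defaults every estimate to 0.0.
q_table = defaultdict(float)
sarsa_update(q_table, "s0", "collect", 1.0, "s1", "move", alpha=0.5)
```

Because sparse rewards leave most TD errors at zero, updates like this one carry little information on their own, which is the gap that guidance from a planner is meant to fill.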

Contributors: Nandan, Swastik (Author) / Pavlic, Theodore (Thesis advisor) / Das, Jnaneshwar (Thesis advisor) / Berman, Spring (Committee member) / Arizona State University (Publisher)

Created: 2022

Adapting Robotic Systems to User Control

Description

In this work, I propose to bridge the gap between human users and adaptive control of robotic systems. The goal is to enable robots to consider user feedback and adjust their behaviors. A critical challenge with designing such systems is that users are often non-experts, with limited knowledge about the robot's hardware and dynamics. In the domain of human-robot interaction, there exist different modalities for conveying information regarding the desired behavior of the robot, the most commonly used of which are demonstrations and preferences. While it is challenging for non-experts to provide demonstrations of robot behavior, works that consider preferences expressed as trajectory rankings lead to users providing noisy and possibly conflicting information, resulting in slow adaptation or system failures. The end user can be expected to be familiar with the dynamics and how they relate to their desired objectives through repeated interactions with the system. However, due to inadequate knowledge about the system dynamics, it is expected that the user would find it challenging to provide feedback on all dimensions of the system's behavior at all times. Thus, the key innovation of this work is to enable users to provide partial preferences, rather than the completely specified preferences required by traditional methods that learn from user preferences. In particular, I consider partial preferences in the form of preferences over plant dynamic parameters, for which I propose Adaptive User Control (AUC) of robotic systems. I leverage the correlations between the observed and hidden parameter preferences to deal with incompleteness. I use a sparse Gaussian Process Latent Variable Model formulation to learn hidden variables that represent the relationships between the observed and hidden preferences over the system parameters. This model is trained using Stochastic Variational Inference with a distributed loss formulation.
I evaluate AUC in a custom drone-swarm environment and several domains from the DeepMind Control Suite. I compare AUC with state-of-the-art preference-based reinforcement learning methods that are utilized with user preferences. Results show that AUC outperforms the baselines substantially in terms of sample and feedback complexity.

Contributors: Biswas, Upasana (Author) / Zhang, Yu (Thesis advisor) / Kambhampati, Subbarao (Committee member) / Berman, Spring (Committee member) / Liu, Lantao (Committee member) / Arizona State University (Publisher)

Created: 2023

Multi-Agent Control for Collective Construction using Chemical Reaction Network Models

Description

Chemical Reaction Networks (CRNs) provide a useful framework for modeling and controlling large numbers of agents that undergo stochastic transitions between a set of states in a manner similar to chemical compounds. By utilizing CRN models to design agent control policies, some of the computational challenges in the coordination of multi-agent systems can be overcome. In this thesis, a CRN model is developed that defines agent control policies for a multi-agent construction task. The use of surface CRNs to overcome the tradeoff between speed and accuracy of task performance is explained. The computational difficulties involved in coordinating multiple agents to complete collective construction tasks are then discussed. A method for stochastic task and motion planning (TAMP) is proposed to explain how a TAMP solver can be applied with CRNs to coordinate multiple agents. This work defines a collective construction scenario in which a group of noncommunicating agents must rearrange blocks on a discrete domain with obstacles into a predefined target distribution. Four different construction tasks are considered with 10, 20, 30, or 40 blocks, and a simulation of each scenario with 2, 4, 6, or 8 agents is performed. As the number of blocks increases, the construction problem becomes more complex, and a given population of agents requires more time to complete the task. Populations of fewer than 8 agents are unable to solve the 30-block and 40-block problems in the allotted simulation time, suggesting an inflection point for computational feasibility beyond which the solution times for fewer than 8 agents would be expected to increase significantly. For a group of 8 agents, the time to complete the task generally increases as the number of blocks increases, except for the 30-block problem, which has specifications that make the task slightly easier for the agents to complete compared to the 20-block problem.
For the 10-block and 20-block problems, the time to complete the task decreases as the number of agents increases; however, the marginal effect of each additional two agents on this time decreases. This can be explained through the pigeonhole principle: since there are a finite number of states, when the number of agents is greater than the number of available spaces, deadlocks start to occur and the overall solution time is expected to tend to infinity.

Contributors: Kamojjhala, Pranav (Author) / Berman, Spring (Thesis advisor) / Fainekos, Gergios E (Thesis advisor) / Pavlic, Theodore P (Committee member) / Arizona State University (Publisher)

Created: 2022

Segmentation and Classification of Melanoma

Description

A skin lesion is a part of the skin that has an uncommon growth or appearance in comparison with the skin around it. While most are harmless, some can be warnings of skin cancer. Melanoma is the deadliest form of skin cancer, and its early detection in dermoscopic images is crucial and increases the survival rate. The clinical ABCD rule (asymmetry, border irregularity, color variation, and diameter greater than 6 mm) is one of the most widely used methods for early melanoma recognition. However, accurate classification of melanoma is still extremely difficult for reasons that include, but are not limited to, the great visual resemblance between melanoma and non-melanoma skin lesions and the low contrast between the skin and the lesions. There is an ever-growing need for correct and reliable detection of skin cancers. Advances in the field of deep learning make it well suited to the task of automatic detection, and such methods are useful to pathologists, aiding them in terms of efficiency and accuracy. In this thesis, various state-of-the-art deep learning frameworks are used. An analysis of their parameters is performed, and innovative techniques are implemented to address the challenges faced in two tasks, segmentation and classification of skin lesions.
• Segmentation is the task of delineating regions of interest (ROIs). It is used to keep only the ROI and separate it from its background.
• Classification is the task of assigning the image a class, i.e., melanoma (cancer) or nevus (not cancer). A pre-trained model is used and fine-tuned as per the needs of the given problem statement and dataset.
Experimental results show promise, as the implemented techniques reduce the false negative rate, i.e., the neural network is less likely to misclassify a melanoma.

Contributors: Verma, Vivek (Author) / Motsch, Sebastien (Thesis advisor) / Berman, Spring (Thesis advisor) / Zhuang, Houlong (Committee member) / Arizona State University (Publisher)

Created: 2021

Adversarial Machine Learning for Recommendation Systems

Description

Recently, Generative Adversarial Networks (GANs) have been applied to the problem of Cold-Start Recommendation, but the training performance of these models is hampered by the extreme sparsity in warm user purchase behavior. This thesis introduces a novel representation for user vectors that combines user demographics and user preferences, making the model a hybrid system that uses both Collaborative Filtering and Content-Based Recommendation. This system models user purchase behavior using weighted user-product preferences (explicit feedback) rather than binary user-product interactions (implicit feedback). Using this representation, a novel sparse adversarial model, the Sparse ReguLarized Generative Adversarial Network (SRLGAN), is developed for Cold-Start Recommendation. SRLGAN leverages the sparse user-purchase behavior, which ensures training stability and avoids over-fitting on warm users. The performance of SRLGAN is evaluated on two popular datasets, where it demonstrates state-of-the-art results.

Contributors: Shah, Aksheshkumar Ajaykumar (Author) / Venkateswara, Hemanth (Thesis advisor) / Berman, Spring (Thesis advisor) / Ladani, Leila J (Committee member) / Arizona State University (Publisher)

Created: 2022
