TaxiWorld: developing and evaluating solution methods for multi-agent planning domains

Description

TaxiWorld is a MATLAB simulation of a city in which a fleet of taxis operates with the goal of transporting passengers to their destinations. The size of the city, the number of available taxis, and the frequency and general locations of fare appearances can all be set on a scenario-by-scenario basis. The taxis must attempt to service the fares as quickly as possible by picking each one up and carrying it to its drop-off location. The TaxiWorld scenario is formally modeled using both Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) and Multi-agent Markov Decision Processes (MMDPs). The purpose of developing formal models is to learn how to build and use formal Markov models of the kind that can be given to planners to solve for optimal policies in problem domains. However, finding optimal solutions for Dec-POMDPs is NEXP-complete, so an empirical algorithm was also developed as an improvement to the method already in use on the simulator, and the two methods were compared in identical scenarios to determine which is more effective. The empirical method is not optimal; rather, it attempts to account for some of the most important factors in order to achieve an acceptable level of effectiveness while retaining a reasonable level of computational complexity for online solving.

Contributors: White, Christopher (Author) / Kambhampati, Subbarao (Thesis advisor) / Gupta, Sandeep (Committee member) / Varsamopoulos, Georgios (Committee member) / Arizona State University (Publisher)

Created: 2011

RAProp: ranking tweets by exploiting the tweet/user/web ecosystem

Description

The increasing popularity of Twitter makes improved assessment of the trustworthiness and relevance of tweets much more important for search. However, given the limitations on the size of tweets, it is hard to extract measures for ranking from a tweet's content alone. I propose a method of ranking tweets by generating a reputation score for each tweet that is based not just on content, but also on additional information from the Twitter ecosystem, which consists of users, tweets, and the web pages that tweets link to. This information is obtained by modeling the Twitter ecosystem as a three-layer graph. The reputation score is used to power two novel methods of ranking tweets by propagating the reputation over an agreement graph based on the content similarity of tweets. Additionally, I show how the agreement graph helps counter tweet spam. An evaluation of my method on 16 million tweets from the TREC 2011 Microblog Dataset shows that it doubles the precision of baseline Twitter Search and achieves higher precision than the current state-of-the-art method. I present a detailed internal empirical evaluation of RAProp in comparison to several alternative approaches that I propose, as well as an external evaluation in comparison to the current state-of-the-art method.

Contributors: Ravikumar, Srijith (Author) / Kambhampati, Subbarao (Thesis advisor) / Davulcu, Hasan (Committee member) / Liu, Huan (Committee member) / Arizona State University (Publisher)

Created: 2013

Extensions to a unified theory of the cognitive architecture

Description

Building computational models of human problem solving has been a longstanding goal in Artificial Intelligence research. Theories of cognitive architecture have addressed this goal by embedding models of problem solving within them. This thesis presents an extended account of human problem solving and describes its implementation within one such theory of cognitive architecture, ICARUS. The document begins by reviewing the standard theory of problem solving, along with how previous versions of ICARUS have incorporated and expanded on it. Next, it discusses some limitations of the existing mechanism and proposes four extensions that eliminate these limitations, elaborate the framework along interesting dimensions, and bring it into closer alignment with human problem-solving abilities. After this, it presents evaluations on four domains that establish the benefits of these extensions. The results demonstrate the system's generality and its ability to solve problems in various domains. In closing, it outlines related work and notes promising directions for additional research.

Contributors: Trivedi, Nishant (Author) / Langley, Patrick W (Thesis advisor) / VanLehn, Kurt (Committee member) / Kambhampati, Subbarao (Committee member) / Arizona State University (Publisher)

Created: 2011

Generative Models for Trajectory Prediction

Description

Trajectory forecasting is used in many fields, such as vehicle trajectory prediction, stock market price prediction, and human motion prediction. The capability of robots to reason about human behavior is also an important aspect of human-robot interaction. In trajectory prediction for human motion, implicitly learning and reproducing human behavior is the major challenge. This work compares some of the recent advances that take a phenomenological approach to trajectory prediction. It mainly targets generating future events or trajectories based on previous data observed across many time intervals. In particular, this work presents and compares machine learning models that generate various human handwriting trajectories. Although the behavior of every individual is unique, it is still possible to broadly generalize and learn the underlying human behavior from current observations in order to predict future human writing trajectories. This enables a machine or robot to generate future handwriting trajectories given an initial trajectory from the individual, thus helping the person to complete the rest of the letter or curve. This work tests and compares the performance of Conditional Variational Autoencoders and Sinusoidal Representation Network models on handwriting trajectory prediction and reconstruction.

Contributors: Kota, Venkata Anil (Author) / Ben Amor, Hani (Thesis advisor) / Venkateswara, Hemanth Kumar Demakethepalli (Committee member) / Redkar, Sangram (Committee member) / Arizona State University (Publisher)

Created: 2021

Traffic Accident Reconstruction Using Monocular Dashcam Videos

Description

Automated driving systems (ADS) have come a long way since their inception. These systems rely heavily on stochastic deep learning techniques for perception, planning, and prediction, as it is impossible to construct every possible driving scenario to generate driving policies. Moreover, these systems need to be trained and validated extensively on typical and abnormal driving situations before they can be trusted with human life. However, most publicly available driving datasets consist only of typical driving behaviors. On the other hand, there is a plethora of videos available on the internet that capture abnormal driving scenarios, but they are unusable for ADS training or testing because they lack important information such as camera calibration parameters and annotated vehicle trajectories. This thesis proposes a new toolbox, DeepCrashTest-V2, that is capable of reconstructing high-quality simulations from monocular dashcam videos found on the internet. The toolbox not only estimates crucial parameters such as camera calibration, ego-motion, and surrounding road user trajectories, but also creates a virtual world in Car Learning to Act (CARLA) using data from OpenStreetMap to simulate the estimated trajectories. The toolbox is open source and is made available as a Python package on GitHub at https://github.com/C-Aniruddh/deepcrashtest_v2.

Contributors: Chandratre, Aniruddh Vinay (Author) / Fainekos, Georgios (Thesis advisor) / Ben Amor, Hani (Thesis advisor) / Pedrielli, Giulia (Committee member) / Arizona State University (Publisher)

Created: 2022

Autonomous System Control of Multiple Robotic Arms Collaboration via Machine Learning

Description

Multi-arm robotic collaboration means controlling multiple robotic arms so that they work together on the same task. During the collaboration, the agent is required to avoid all possible collisions between the parts of the robotic arms. Thus, incentivizing collaboration and preventing collisions are the two principles that the agent follows during the training process. More and more applications, both in industry and in daily life, now require at least two arms rather than a single arm. A dual-arm robot can meet the needs of many more types of tasks, such as folding clothes at home, making a hamburger on a grill, or picking and placing a product in a warehouse. The applications in this thesis all involve object pushing. The thesis focuses on how to train the agent to push an object as far away as possible. Reinforcement Learning (RL), a type of Machine Learning (ML), is used to train the agent to generate optimal actions. Deep Deterministic Policy Gradient (DDPG) and Hindsight Experience Replay (HER) are the two RL methods used in this thesis.

Contributors: Lin, Steve (Author) / Ben Amor, Hani (Thesis advisor) / Redkar, Sangram (Committee member) / Zhang, Yu (Committee member) / Arizona State University (Publisher)

Created: 2023

Dynamic Potential Fields for Flexible Behavior-based Swarm Control via Reinforcement Learning

Description

In this thesis work, a novel learning approach to solving the problem of controlling a quadcopter (drone) swarm is explored. To deal with large swarm sizes, swarm control is often achieved in a distributed fashion by combining different behaviors such that each behavior implements some desired swarm characteristic, such as avoiding obstacles or staying close to neighbors. One common approach in distributed swarm control uses potential fields. A limitation of this approach is that the potential fields often depend statically on a set of control parameters that are manually specified a priori. This work introduces Dynamic Potential Fields for flexible swarm control. These potential fields are modulated by a set of dynamic control parameters (DCPs) that can change under different environment situations. Because the focus is only on these DCPs, the learning problem is simplified and becomes feasible for practical use. This approach uses soft actor-critic (SAC), where the actor only determines how to modify the DCPs in the current situation, resulting in more flexible swarm control. The results show that the DCP approach allows the drones to better traverse environments with obstacles compared to several state-of-the-art swarm control methods with a fixed set of control parameters. This approach also obtained a higher score on a safety metric commonly used to assess swarm behavior. A basic reinforcement learning approach is compared to demonstrate faster convergence. Finally, an ablation study is conducted to validate the design of this approach.
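
As a rough illustration of the potential-field idea described above, the following minimal sketch computes a velocity command from a classical attractive term and a classical repulsive term. It is not the thesis's implementation: the function name `potential_field_velocity` and the parameters `k_att`, `k_rep`, and `influence` are hypothetical, and the gains here simply stand in for the kind of control parameters that could be fixed by hand or adjusted at run time.

```python
import math


def potential_field_velocity(pos, goal, obstacles,
                             k_att=1.0, k_rep=1.0, influence=2.0):
    """Return a 2-D velocity command (vx, vy) for one agent.

    Hypothetical sketch: k_att and k_rep are the gains of a textbook
    potential field; influence is the obstacle's radius of effect.
    """
    # Attractive term: pulls the agent linearly toward the goal.
    vx = k_att * (goal[0] - pos[0])
    vy = k_att * (goal[1] - pos[1])

    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        dist = math.hypot(dx, dy)
        # Repulsive term: only active inside the influence radius.
        if 0.0 < dist < influence:
            mag = k_rep * (1.0 / dist - 1.0 / influence) / dist ** 2
            vx += mag * dx / dist
            vy += mag * dy / dist
    return vx, vy
```

With static gains, every call uses the same `k_att` and `k_rep`. Passing different gain values on each call is one simple way to picture parameters that change with the situation.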

Contributors: Ferraro, Calvin Shores (Author) / Zhang, Yu (Thesis advisor) / Ben Amor, Hani (Committee member) / Berman, Spring (Committee member) / Arizona State University (Publisher)

Created: 2022

An Approximate Dynamic Programming Framework for Occlusion-Robust Multi-Object Tracking

Description

In this work, the problem of multi-object tracking (MOT) is studied, particularly the challenges that arise from object occlusions. A solution based on a principled approximate dynamic programming approach called ADPTrack is presented. ADPTrack relies on existing MOT solutions and directly improves them. When matching tracks to objects at a particular frame, the proposed approach simulates executions of these existing solutions into future frames to obtain approximate track extensions, from which a comparison of past and future appearance feature information is leveraged to improve overall robustness to occlusion-based error. The proposed solution when applied to the renowned MOT17 dataset empirically demonstrates a 0.7% improvement in the association accuracy (IDF1 metric) over a state-of-the-art baseline that it builds upon while obtaining minor improvements with respect to all other metrics. Moreover, it is shown that this improvement is even more pronounced in scenarios where the camera maintains a fixed position. This implies that the proposed method is effective in addressing MOT issues pertaining to object occlusions.
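
For readers unfamiliar with the IDF1 metric mentioned above, it is commonly defined as the harmonic mean of identification precision and recall, which reduces to 2·IDTP / (2·IDTP + IDFP + IDFN). The short sketch below computes it from already-counted identity matches; the function name `idf1` and the example counts are illustrative only and are not taken from the thesis.

```python
def idf1(idtp, idfp, idfn):
    """Compute IDF1 from identity true positives, false positives,
    and false negatives (counts assumed to be computed elsewhere)."""
    denom = 2 * idtp + idfp + idfn
    # Guard against the degenerate case with no detections at all.
    return 2 * idtp / denom if denom else 0.0


# Illustrative counts only.
score = idf1(idtp=8, idfp=2, idfn=2)
```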

Contributors: Musunuru, Pratyusha (Author) / Bertsekas, Dimitri (Thesis advisor) / Kambhampati, Subbarao (Thesis advisor) / Richa, Andrea (Committee member) / Arizona State University (Publisher)

Created: 2024

Adapting Robotic Systems to User Control

Description

In this work, I propose to bridge the gap between human users and adaptive control of robotic systems. The goal is to enable robots to consider user feedback and adjust their behaviors. A critical challenge in designing such systems is that users are often non-experts, with limited knowledge about the robot's hardware and dynamics. In the domain of human-robot interaction, there exist different modalities for conveying information about the desired behavior of the robot; the most commonly used are demonstrations and preferences. It is challenging for non-experts to provide demonstrations of robot behavior, while works that consider preferences expressed as trajectory rankings lead to users providing noisy and possibly conflicting information, resulting in slow adaptation or system failures. The end user can be expected to become familiar with the dynamics, and how they relate to the desired objectives, through repeated interactions with the system. However, due to inadequate knowledge about the system dynamics, it is expected that the user would find it challenging to provide feedback on all dimensions of the system's behavior at all times. Thus, the key innovation of this work is to enable users to provide partial preferences, instead of the completely specified preferences required by traditional methods that learn from user preferences. In particular, I consider partial preferences in the form of preferences over plant dynamic parameters, for which I propose Adaptive User Control (AUC) of robotic systems. I leverage the correlations between the observed and hidden parameter preferences to deal with incompleteness. I use a sparse Gaussian Process Latent Variable Model formulation to learn hidden variables that represent the relationships between the observed and hidden preferences over the system parameters. This model is trained using Stochastic Variational Inference with a distributed loss formulation.
I evaluate AUC in a custom drone-swarm environment and several domains from the DeepMind Control Suite. I compare AUC with state-of-the-art preference-based reinforcement learning methods that learn from user preferences. Results show that AUC outperforms the baselines substantially in terms of sample and feedback complexity.

Contributors: Biswas, Upasana (Author) / Zhang, Yu (Thesis advisor) / Kambhampati, Subbarao (Committee member) / Berman, Spring (Committee member) / Liu, Lantao (Committee member) / Arizona State University (Publisher)

Created: 2023

QPMeL: Quantum Polar Metric Learning

Description

Deep metric learning has recently shown extremely promising results in the classical data domain, creating well-separated feature spaces. This idea was also adapted to quantum computers via Quantum Metric Learning (QMeL). QMeL consists of a two-step process: a classical model compresses the data to fit into the limited number of qubits, and a Parameterized Quantum Circuit (PQC) is then trained to create better separation in Hilbert space. However, on Noisy Intermediate Scale Quantum (NISQ) devices, QMeL solutions result in high circuit width and depth, both of which limit scalability. The proposed Quantum Polar Metric Learning (QPMeL) uses a classical model to learn the parameters of the polar form of a qubit. A shallow PQC with Ry and Rz gates is then used to create the state, along with a trainable layer of ZZ(θ) gates to learn entanglement. The circuit also computes fidelity via a SWAP test for the proposed Fidelity Triplet Loss function, which is used to train both the classical and quantum components. Compared to QMeL approaches, QPMeL achieves 3x better multi-class separation while using only half the number of gates and half the depth. QPMeL is also shown to outperform classical networks with similar configurations, presenting a promising avenue for future research on fully classical models with quantum loss functions.
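
To make the fidelity-based loss more concrete, the sketch below classically simulates single-qubit states in polar form, their fidelity, the probability that a SWAP test measures 0 on its ancilla, and a triplet-style loss built on fidelity. It is an assumption-laden illustration, not the thesis's circuit or loss: the hinge form `max(0, F(anchor, negative) - F(anchor, positive) + margin)`, the margin value, and all function names are hypothetical, and entangling gates are left out entirely.

```python
import cmath
import math


def qubit_state(theta, phi):
    """Polar form cos(theta/2)|0> + e^{i*phi} sin(theta/2)|1>."""
    return (complex(math.cos(theta / 2)),
            cmath.exp(1j * phi) * math.sin(theta / 2))


def fidelity(a, b):
    """Fidelity |<a|b>|^2 between two pure single-qubit states."""
    overlap = a[0].conjugate() * b[0] + a[1].conjugate() * b[1]
    return abs(overlap) ** 2


def swap_test_p0(f):
    """Probability of measuring 0 on the SWAP-test ancilla for fidelity f."""
    return 0.5 + 0.5 * f


def fidelity_triplet_loss(anchor, positive, negative, margin=0.5):
    """Assumed hinge-style triplet loss: high fidelity to the positive
    and low fidelity to the negative give zero loss."""
    return max(0.0,
               fidelity(anchor, negative) - fidelity(anchor, positive) + margin)
```

In this sketch the loss is zero once the anchor is closer (in fidelity) to the positive than to the negative by at least the margin, which mirrors how a classical triplet loss uses distances.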

Contributors: Sharma, Vinayak (Author) / Shrivastava, Aviral (Thesis advisor) / Jiang, Zilin (Committee member) / Kambhampati, Subbarao (Committee member) / Arizona State University (Publisher)

Created: 2024
