This collection includes most of the ASU Theses and Dissertations from 2011 to the present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about each dissertation or thesis includes degree information, committee members, an abstract, and any supporting data or media.

In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.

Dissertations and theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection, visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.

Displaying 1 - 10 of 28

Description
The retinotopic map, the mapping between visual inputs on the retina and neuronal activation in the brain's visual areas, is one of the central topics in visual neuroscience. For human observers, the map is typically obtained by analyzing functional magnetic resonance imaging (fMRI) signals of cortical responses to slowly moving visual stimuli on the retina. Biological evidence shows that retinotopic mapping is topology-preserving (topological) within each visual region, i.e., it keeps neighboring relationships intact after processing by the human brain. Unfortunately, due to the limited spatial resolution and signal-to-noise ratio of fMRI, state-of-the-art retinotopic maps are not topological. The goal of this work was to model the topology-preserving condition mathematically, fix non-topological retinotopic maps with numerical methods, and improve the quality of retinotopic maps. Imposing the topological condition benefits several applications. With topological retinotopic maps, one may gain better insight into human retinotopic maps, including better quantification of the cortical magnification factor, a more precise description of retinotopic maps, and potentially better examination methods in ophthalmology clinics.
Contributors: Tu, Yanshuai (Author) / Wang, Yalin (Thesis advisor) / Lu, Zhong-Lin (Committee member) / Crook, Sharon (Committee member) / Yang, Yezhou (Committee member) / Zhang, Yu (Committee member) / Arizona State University (Publisher)
Created: 2022

Description
Multi-arm robotic collaboration is the control of multiple robotic arms so that they work together on the same task. During the collaboration, the agent is required to avoid all possible collisions between the parts of the robotic arms. Thus, incentivizing collaboration and preventing collisions are the two principles followed by the agent during the training process. More and more applications, both in industry and in daily life, require at least two arms instead of a single arm. A dual-arm robot meets the needs of many more types of tasks, such as folding clothes at home, making a hamburger on a grill, or picking and placing a product in a warehouse. The applications in this thesis all involve object pushing. This thesis focuses on how to train the agent to push an object as far away as possible. Reinforcement Learning (RL), a type of Machine Learning (ML), is used to train the agent to generate optimal actions. Deep Deterministic Policy Gradient (DDPG) and Hindsight Experience Replay (HER) are the two RL methods used in this thesis.
Contributors: Lin, Steve (Author) / Ben Amor, Hani (Thesis advisor) / Redkar, Sangram (Committee member) / Zhang, Yu (Committee member) / Arizona State University (Publisher)
Created: 2023

Description
In this thesis, a novel learning approach to the problem of controlling a quadcopter (drone) swarm is explored. To deal with large swarm sizes, swarm control is often achieved in a distributed fashion by combining different behaviors such that each behavior implements some desired swarm characteristic, such as avoiding obstacles or staying close to neighbors. One common approach in distributed swarm control uses potential fields. A limitation of this approach is that the potential fields often depend statically on a set of control parameters that are manually specified a priori. This work introduces Dynamic Potential Fields for flexible swarm control. These potential fields are modulated by a set of dynamic control parameters (DCPs) that can change in different environmental situations. Focusing only on these DCPs simplifies the learning problem and makes it feasible for practical use. The approach uses soft actor-critic (SAC), where the actor only determines how to modify the DCPs in the current situation, resulting in more flexible swarm control. The results show that the DCP approach allows the drones to better traverse environments with obstacles compared to several state-of-the-art swarm control methods with a fixed set of control parameters. The approach also obtained a higher safety score, a metric commonly used to assess swarm behavior. A basic reinforcement learning approach is used as a comparison to demonstrate faster convergence. Finally, an ablation study is conducted to validate the design of the approach.
Contributors: Ferraro, Calvin Shores (Author) / Zhang, Yu (Thesis advisor) / Ben Amor, Hani (Committee member) / Berman, Spring (Committee member) / Arizona State University (Publisher)
Created: 2022

Description
Recent breakthroughs in Artificial Intelligence (AI) have brought the dream of developing and deploying complex AI systems that can potentially transform everyday life closer to reality than ever before. However, the growing realization that there might soon be people from all walks of life using and working with these systems has also spurred a lot of interest in ensuring that AI systems can efficiently and effectively work and collaborate with their intended users. Chief among the efforts in this direction has been the pursuit of imbuing these agents with the ability to provide intuitive and useful explanations of their decisions and actions to end users. In this dissertation, I will describe various works that I have done in the area of explaining sequential decision-making problems. Furthermore, I will frame the discussion of my work within a broader framework for understanding and analyzing explainable AI (XAI). My works herein tackle many of the core challenges related to explaining automated decisions to users, including (1) techniques to address asymmetry in knowledge between the user and the system, (2) techniques to address asymmetry in inferential capabilities, and (3) techniques to address vocabulary mismatch. The dissertation will also describe the work I have done on generating interpretable behavior and on policy summarization. I will conclude this dissertation by using the framework of human-aware explanation as a lens to analyze and understand the current landscape of explainable planning.
Contributors: Sreedharan, Sarath (Author) / Kambhampati, Subbarao (Thesis advisor) / Kim, Been (Committee member) / Smith, David E (Committee member) / Srivastava, Siddharth (Committee member) / Zhang, Yu (Committee member) / Arizona State University (Publisher)
Created: 2022

Description
Federated Learning (FL) is envisaged to be a promising solution for collaboratively training a machine learning model while keeping the training data decentralized and private. Instead of sharing raw data with the central entity, the participating client devices share focused updates for aggregation to ensure global convergence of the model. Owing to the shortcomings of manually handcrafted neural network architectures, the research community is striving to develop Neural Architecture Search (NAS) approaches that automatically search for optimal networks that fit the clients' data. Despite the inaccessibility of clients' data in an FL setting, the federated NAS literature has recently made great progress in applying these NAS techniques to an FL setting. However, one of the key bottlenecks of Federated Learning is the cost of communication between clients and the server, and the state-of-the-art federated NAS techniques search for networks with millions of parameters that require several rounds of communication to find the optimal weight parameters. Also, deploying a network with millions of parameters on edge devices (the typical participants in an FL process) is infeasible due to their computational limitations and the increased latency. Thus, this work proposes Weight-Agnostic Federated Neural Architecture Search (WFNAS), a novel evolutionary framework to search for well-performing and minimally connected weight-agnostic network architectures in an FL setting. As the connectivity of the networks themselves is the solution, there is no need for weight training and hyperparameter tuning, which reduces the communication overhead significantly. The experiments indicate a gain of nearly 40% for orthogonal (vertical FL) data distributions compared to local training. This work is the first federated NAS technique in the literature for vertical FL.
Although the experiments are performed in a resource-constrained environment, the aim of this thesis is to show a new direction of research to the FL community.
Contributors: Thakkar, Om (Author) / Bazzi, Rida (Thesis advisor) / Li, Baoxin (Committee member) / Zhang, Yu (Committee member) / Arizona State University (Publisher)
Created: 2021

Description
Natural language plays a crucial role in human-robot interaction, as it is the common ground on which human beings and robots can communicate and understand each other. However, most of the work in natural language and robotics focuses on generating robot actions from a natural language command, which is a unidirectional way of communicating. This work focuses on the other direction of communication, where the approach allows a robot to describe its actions from sampled images and joint sequences from the robot task. The importance of this work is that it utilizes multiple modalities, namely the start and end images from the robot task environment and the joint trajectories of the robot arms. The fusion of different modalities is not just about fusing the data but about knowing what information to extract from which data source, so that the language description represents the state of the manipulator and of the environment in which it is performing the task. Using experimental results from various simulated robot environments, this research demonstrates that utilizing multiple modalities improves the accuracy of the natural language description, and that efficiently fusing the modalities is crucial to generating such descriptions by making the most of the various data sources.
Contributors: KALIRATHINAM, KAMALESH (Author) / Ben Amor, Heni (Thesis advisor) / Phielipp, Mariano (Committee member) / Zhang, Yu (Committee member) / Arizona State University (Publisher)
Created: 2021

Description
In this work, I propose to bridge the gap between human users and adaptive control of robotic systems. The goal is to enable robots to consider user feedback and adjust their behaviors. A critical challenge in designing such systems is that users are often non-experts with limited knowledge about the robot's hardware and dynamics. In the domain of human-robot interaction, there exist different modalities for conveying information about the desired behavior of the robot; the most commonly used are demonstrations and preferences. While it is challenging for non-experts to provide demonstrations of robot behavior, works that consider preferences expressed as trajectory rankings lead to users providing noisy and possibly conflicting information, which results in slow adaptation or system failures. The end user can be expected to be familiar with the dynamics and how they relate to their desired objectives through repeated interactions with the system. However, due to inadequate knowledge about the system dynamics, it is expected that the user would find it challenging to provide feedback on all dimensions of the system's behavior at all times. Thus, the key innovation of this work is to enable users to provide partial preferences instead of the completely specified preferences required by traditional methods that learn from user preferences. In particular, I consider partial preferences in the form of preferences over plant dynamic parameters, for which I propose Adaptive User Control (AUC) of robotic systems. I leverage the correlations between the observed and hidden parameter preferences to deal with incompleteness. I use a sparse Gaussian Process Latent Variable Model formulation to learn hidden variables that represent the relationships between the observed and hidden preferences over the system parameters. This model is trained using Stochastic Variational Inference with a distributed loss formulation.
I evaluate AUC in a custom drone-swarm environment and several domains from the DeepMind Control Suite. I compare AUC with state-of-the-art preference-based reinforcement learning methods that learn from user preferences. Results show that AUC outperforms the baselines substantially in terms of sample and feedback complexity.
Contributors: Biswas, Upasana (Author) / Zhang, Yu (Thesis advisor) / Kambhampati, Subbarao (Committee member) / Berman, Spring (Committee member) / Liu, Lantao (Committee member) / Arizona State University (Publisher)
Created: 2023

Description
In contemporary society, the proliferation of fake identity documents presents a profound menace that permeates various facets of the social fabric. The advent of artificial intelligence coupled with sophisticated printing techniques has significantly exacerbated this issue. The ramifications of counterfeit identity documents extend far beyond the legal infractions and financial losses incurred by victims of identity theft because they pose a severe threat to public safety, national security, and societal trust. Given these multifaceted threats, the imperative to detect and thwart fraudulent identity documents has become paramount. The efficacy of fraud detection tools is contingent upon the availability of extensive identity document datasets for training purposes. However, existing benchmark datasets such as MIDV-500, MIDV-2020, and FMIDV exhibit notable deficiencies such as a limited number of samples, insufficient coverage of various fraud patterns, and occasional alterations in critical personal identifier fields, particularly portrait images. These limitations constrain their effectiveness in training models capable of detecting realistic fraud instances while also safeguarding privacy. This thesis describes research that addresses this gap by proposing a streamlined pipeline for generating synthetic identity documents and introducing the resulting benchmark dataset, named IDNet. IDNet is meticulously crafted to propel advancements in privacy-preserving fraud detection initiatives and comprises 597,900 images of synthetically generated identity documents, amounting to approximately 350 gigabytes of data. These documents are categorized into 20 types, encompassing identity documents from 10 U.S. states and 10 European countries. Additionally, the dataset includes identity documents containing either a single fraud pattern or multiple fraud patterns, to cater to various model training requirements.
Contributors: Nag, Soham (Author) / Zou, Jia (Thesis advisor) / Yang, Yingzhen (Committee member) / Zhang, Yu (Committee member) / Arizona State University (Publisher)
Created: 2024

Description
As robots become increasingly integrated into their environments, they need to learn how to interact with the objects around them. Many of these objects are articulated with multiple degrees of freedom (DoF). Multi-DoF objects have complex joints that require specific manipulation orders, but existing methods only consider objects with a single joint. To capture the joint structure and manipulation sequence of any object, I introduce "Object Kinematic State Machines" (OKSMs), a novel representation that models the kinematic constraints and manipulation sequences of multi-DoF objects. I also present Pokenet, a deep neural network architecture that estimates OKSMs from sequences of point cloud data of human demonstrations. I conduct experiments on both simulated and real-world datasets to validate my approach. First, I evaluate the modeling of multi-DoF objects on a simulated dataset, comparing against the current state-of-the-art method. I then assess Pokenet's real-world usability on a dataset collected in my lab, comprising 5,500 data points across 4 objects. Results show that my method can successfully estimate joint parameters of novel multi-DoF objects with over 25% more accuracy on average than prior methods.
Contributors: GUPTA, ANMOL (Author) / Gopalan, Nakul (Thesis advisor) / Zhang, Yu (Committee member) / Wang, Yalin (Committee member) / Arizona State University (Publisher)
Created: 2024

Description
Recent advances in Artificial Intelligence (AI) have brought AI closer to laypeople than ever before. This leads to a pervasive problem: how would a user ascertain whether an AI system will be safe, reliable, or useful in a given situation? This problem becomes particularly challenging when it is considered that most autonomous systems are not designed by their users; the internal software of these systems may be unavailable or difficult to understand; and the functionality of these systems may even change from initial specifications as a result of learning. To overcome these challenges, this dissertation proposes a paradigm for third-party autonomous assessment of black-box taskable AI systems. The four main desiderata of such assessment systems are: (i) interpretability: generating a description of the AI system's functionality in a language that the target user can understand; (ii) correctness: ensuring that the description of the AI system's workings is accurate; (iii) generalizability: creating a solution approach that works well for different types of AI systems; and (iv) minimal requirements: creating an assessment system that does not place complex requirements on AI systems to support the third-party assessment, since otherwise the manufacturers of AI systems might not support such an assessment. To satisfy these properties, this dissertation presents algorithms and requirements that would enable user-aligned autonomous assessment that helps the user understand the limits of a black-box AI system's safe operability. This dissertation proposes a personalized AI assessment module that discovers the high-level "capabilities" of an AI system with arbitrary internal planning algorithms or policies and learns an accurate symbolic description of these capabilities in terms of concepts that a user understands. Furthermore, the dissertation includes the associated theoretical results and empirical evaluations.
The results show that (i) a primitive query-response interface can enable the development of autonomous assessment modules that can efficiently derive a causally accurate, user-interpretable model of the system's capabilities, and (ii) such descriptions are easier for users to understand and reason with than the agent's primitive actions.
Contributors: Verma, Pulkit (Author) / Srivastava, Siddharth (Thesis advisor) / Cooke, Nancy (Committee member) / Fainekos, Georgios (Committee member) / Zhang, Yu (Committee member) / Arizona State University (Publisher)
Created: 2024