This collection includes both ASU theses and dissertations, submitted by graduate students, and Barrett, The Honors College theses, submitted by undergraduate students.

Displaying 271 - 280 of 293
Description
Bimanual robot manipulation, involving the coordinated control of two robot arms, holds great promise for enhancing the dexterity and efficiency of robotic systems across a wide range of applications, from manufacturing and healthcare to household chores and logistics. However, enabling robots to perform complex bimanual tasks with the same level of skill and adaptability as humans remains a challenging problem. The control of a bimanual robot can be tackled through various methods, such as inverse dynamics control or reinforcement learning, but each of these methods has its own problems. An inverse dynamics controller cannot adapt to a changing environment, whereas reinforcement learning is computationally intensive and may require weeks of training for even simple tasks; moreover, reward formulation for reinforcement learning is often challenging and remains an open research topic. Imitation learning leverages human demonstrations to enable robots to acquire the skills necessary for complex tasks; it can be highly sample-efficient and reduces exploration. Given these advantages, this work explores the application of imitation learning techniques to bridge the gap between human expertise and robotic dexterity in the context of bimanual manipulation. In this thesis, an examination of the implicit behavioral cloning imitation learning algorithm is conducted. Implicit behavioral cloning aims to capture the underlying behavior or policy of the expert by utilizing energy-based models, which frequently demonstrate superior performance compared to explicit behavior cloning policies. The assessment encompasses an investigation of the impact of the quality of expert demonstrations on the efficacy of the acquired policies. Furthermore, computational and performance metrics of diverse training and inference techniques for energy-based models are compared.
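As a concrete illustration of the inference step in implicit behavioral cloning, the sketch below recovers an action as the approximate minimizer of an energy function via derivative-free sampling. The quadratic toy energy, the sampling schedule, and all parameter values are illustrative assumptions, not the models trained in the thesis.

```python
import numpy as np

def energy(obs, actions, w):
    """Toy quadratic energy: lowest where action ≈ obs @ w."""
    return np.sum((actions - obs @ w) ** 2, axis=-1)

def infer_action(obs, w, n_samples=512, n_iters=3, sigma=0.3, seed=0):
    """Derivative-free EBM inference: sample actions, weight by exp(-E),
    resample around low-energy candidates, and shrink the noise scale."""
    rng = np.random.default_rng(seed)
    dim = w.shape[1]
    actions = rng.uniform(-2.0, 2.0, size=(n_samples, dim))
    for _ in range(n_iters):
        e = energy(obs, actions, w)
        p = np.exp(-(e - e.min()))       # softmax over negative energy
        p /= p.sum()
        idx = rng.choice(n_samples, size=n_samples, p=p)
        actions = actions[idx] + rng.normal(0.0, sigma, size=(n_samples, dim))
        sigma *= 0.5                      # anneal the perturbation scale
    return actions[np.argmin(energy(obs, actions, w))]
```

This sampling-and-annealing loop is one common way to approximate the argmin of a learned energy at inference time without gradients.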
Contributors: Rayavarapu, Ravi Swaroop (Author) / Amor, Heni Ben (Thesis advisor) / Gopalan, Nakul (Committee member) / Senanayake, Ransalu (Committee member) / Arizona State University (Publisher)
Created: 2023
Description
Walking and mobility are essential aspects of our daily lives, enabling us to engage in various activities. Gait disorders and impaired mobility are widespread challenges faced by older adults and people with neurological injuries, as these conditions can significantly impact their quality of life, leading to a loss of independence and an increased risk of mortality. In response to these challenges, rehabilitation and assistive robotics have emerged as promising alternatives to conventional gait therapy, offering potential solutions that are less labor-intensive and less costly. Despite numerous advances in wearable lower-limb robotics, their current applicability remains confined to laboratory settings. To expand their utility to broader gait impairments and daily living conditions, there is a pressing need for more intelligent robot controllers. In this dissertation, these challenges are tackled from two perspectives. First, to improve the robot's understanding of human motion and intentions, which is crucial for assistive robot control, a robust human locomotion estimation technique is presented, focusing on measuring trunk motion. Employing an invariant extended Kalman filtering method that takes sensor misplacement into account, improved convergence properties over existing methods are shown for different locomotion modes. Second, to enhance safe and effective robot-aided gait training, this dissertation proposes to learn directly from physical therapists' demonstrations of manual gait assistance in post-stroke rehabilitation. Lower-limb kinematics of patients and the assistive force applied by therapists to the patient's leg are measured using a wearable sensing system that includes a custom-made force sensing array. The collected data are then used to characterize a therapist's strategies.
Preliminary analysis indicates that knee extension and weight-shifting play pivotal roles in shaping a therapist's assistance strategies, which are then incorporated into a virtual impedance model that effectively captures high-level therapist behaviors throughout a complete training session. Furthermore, to introduce safety constraints in the design of such controllers, a safety-critical learning framework is explored through theoretical analysis and simulations. A safety filter incorporating an online iterative learning component is introduced to provide robust safety guarantees for robotic gait assistance and training, addressing challenges such as stochasticity and the absence of a known prior dynamic model.
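A virtual impedance model of the kind described above can be sketched as a stiffness-damping law on the kinematic error, with the output saturated for safety. The scalar joint treatment, the gains, and the force limit below are illustrative assumptions, not the dissertation's fitted therapist model.

```python
import numpy as np

def impedance_assist(q, dq, q_ref, dq_ref, K=40.0, D=4.0, f_max=25.0):
    """Virtual impedance law: assistive force from position and velocity
    error relative to a reference trajectory, saturated at ±f_max."""
    f = K * (q_ref - q) + D * (dq_ref - dq)
    return float(np.clip(f, -f_max, f_max))
```

For example, a 0.2 rad knee-angle deficit with matched velocity yields a corrective force of K × 0.2 under this sketch, clipped if it exceeds the saturation limit.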
Contributors: Rezayat Sorkhabadi, Seyed Mostafa (Author) / Zhang, Wenlong (Thesis advisor) / Berman, Spring (Committee member) / Lee, Hyunglae (Committee member) / Marvi, Hamid (Committee member) / Sugar, Thomas (Committee member) / Arizona State University (Publisher)
Created: 2023
Description
This dissertation focuses on reinforcement learning (RL) controller design aiming for real-life applications in continuous state and control problems. It involves three major research investigations spanning design, analysis, implementation, and evaluation. The application case addresses automatically configuring robotic prosthesis impedance parameters. Major contributions of the dissertation include the following. 1) An “echo control” using the intact knee profile as the target is designed to overcome the limitation of a designer-prescribed robotic knee profile. 2) Collaborative multiagent reinforcement learning (cMARL) is proposed to directly take human influence into account in the robot control design. 3) A phased actor in actor-critic (PAAC) reinforcement learning method is developed to reduce learning variance in RL. The design of the “echo control” is based on a new formulation of direct heuristic dynamic programming (dHDP) for tracking control of a robotic knee prosthesis to mimic the intact knee profile. A systematic simulation of the proposed control is provided using a human-robot system simulation in OpenSim. The tracking controller is then tested on able-bodied and amputee subjects. This is the first real-time human testing of RL tracking control of a robotic knee to mirror the profile of an intact knee. cMARL is a new solution framework for the human-prosthesis collaboration (HPC) problem. This is the first attempt at considering human influence on human-robot walking in the presence of a reinforcement-learning-controlled lower-limb prosthesis. Results show that treating the human and robot as coupled, collaborating agents and using an estimated human adaptation in the robot control design help improve human walking performance. The above studies have demonstrated the great potential of RL control in solving continuous problems.

To solve more complex real-life tasks with multiple control inputs and a high-dimensional state space, high variance, low data efficiency, slow learning, and even instability are major roadblocks to be addressed. A novel PAAC method is proposed to improve learning performance in policy gradient RL by accounting for both the Q value and the TD error in actor updates. Systematic and comprehensive demonstrations show its effectiveness through qualitative analysis and quantitative evaluation in the DeepMind Control Suite.
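The two quantities the PAAC idea combines, the Q value and the one-step TD error, can be made concrete with a small sketch. The blend below is a hypothetical stand-in for the actual phased actor update, with all coefficients chosen for illustration only.

```python
def td_error(r, v_s, v_next, gamma=0.99, done=False):
    """One-step temporal-difference error: δ = r + γ·V(s') − V(s)."""
    target = r + (0.0 if done else gamma * v_next)
    return target - v_s

def blended_actor_signal(q_value, delta, beta=0.5):
    """Hypothetical mixture of a Q-value term and a TD-error term for an
    actor update, loosely mirroring the idea of phasing between signals."""
    return beta * q_value + (1.0 - beta) * delta
```

In practice, the relative weighting between the value-based and error-based terms is what would be phased over the course of training.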
Contributors: Wu, Ruofan (Author) / Si, Jennie (Thesis advisor) / Huang, He (Committee member) / Santello, Marco (Committee member) / Papandreou-Suppappola, Antonia (Committee member) / Arizona State University (Publisher)
Created: 2023
Description
The aim of this thesis is to study adaptive controllers in the context of a Proportional Integral Derivative (PID) controller. The PID controller is tuned via loop-shaping techniques to ensure desired robustness and performance characteristics with respect to a target loop shape. This work addresses two problems. Consider a system controlled by an adaptive PID controller. In the absence or under lack of excitation, the system or controller parameters may drift to an arbitrary system (which may or may not be stable). Once the system receives sufficient excitation, two questions arise. First, how quickly can the system recover to the target system, and in the process of recovery, how large are the transient overshoots and what factors affect the recovery of the drifted system? Second, continuous online adaptation of the controller may not always be necessary (or economical); is there a means to monitor the performance of the current controller and determine, via robustness conditions, whether to continue with the same controller or reject it and adapt to a new one? Hence, this work is concerned with robust performance monitoring and recovery of an adaptive PID control system that has drifted to another system in the absence of sufficient excitation or under excessive noise.
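For reference, the underlying PID law can be sketched in discrete time; the gains and sample period below are illustrative, and the loop-shaping tuning and adaptation logic studied in the thesis sit on top of a controller of this basic form.

```python
class PID:
    """Discrete PID controller: u = Kp·e + Ki·Σ e·Δt + Kd·Δe/Δt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_e = None   # no derivative term on the first sample

    def update(self, e):
        self.integral += e * self.dt
        d = 0.0 if self.prev_e is None else (e - self.prev_e) / self.dt
        self.prev_e = e
        return self.kp * e + self.ki * self.integral + self.kd * d
```

An adaptive scheme would adjust `kp`, `ki`, and `kd` online, which is exactly where the drift and recovery questions above arise.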
Contributors: Iyer, Kaushik (Author) / Tsakalis, Konstantinos (Thesis advisor) / Arenz, Christian (Committee member) / Redkar, Sangram (Committee member) / Arizona State University (Publisher)
Created: 2024
Description
The purpose of this dissertation is to explore how humans experience relationships with machines such as love and sex dolls and robots. This study places a particular emphasis on in-depth, rich, and holistic understanding of people’s lived experiences in the context of human-machine relationships and draws on human-machine communication scholarship by examining media evocation perspectives, the role of illusions, and the topic of care. Therefore, this study uses a funneled serial interview design employing three waves of semi-structured interviews (N = 47) with 29 love and sex doll owners and users. Utilizing a phronetic iterative qualitative data analysis approach coupled with metaphor analysis, the findings of this study reveal how participants experience dolls as evocative objects and quasi-others. Moreover, the findings illustrate how participants actively construct and (re)negotiate authenticity in their human-machine relationships, driven by a cyclical process between doll characteristics (agency and presence) and doll owner characteristics (imagination and identity extension) that results in an illusion of being cared for. This study extends previous scholarship by: 1) showcasing a new type of mute machine, namely humanoid mute relational machines; 2) adding empirical evidence to the largely theoretical work on dolls and doll owners; 3) adding empirical evidence to and extending media evocation perspectives by illustrating the suitability of participant metaphors for understanding machines’ evocative nature; and 4) proposing an integrative model of care and illusions that lays the foundation for a new relational interaction illusion model to be examined in future research. This study also discusses practical implications for doll owners, the public, and doll developers.
Contributors: Dehnert, Marco (Author) / Sharabi, Liesel L. (Thesis advisor) / Tracy, Sarah J. (Thesis advisor) / Edwards, Autumn P. (Committee member) / Arizona State University (Publisher)
Created: 2024
Description
Multi-agent reinforcement learning (MARL) plays a pivotal role in artificial intelligence by facilitating the learning process in complex environments inhabited by multiple entities. This thesis explores the integration of learning high-level knowledge through reward machines (RMs) with MARL to effectively manage non-Markovian reward functions in non-cooperative stochastic games. Reward machines offer a sophisticated way to model the temporal structure of rewards, thereby providing an enhanced representation of agent decision-making processes. A novel algorithm, JIRP-SG, is introduced, enabling agents to concurrently learn RMs and optimize their best-response policies while navigating the intricate temporal dependencies present in non-cooperative settings. This approach employs automata learning to iteratively acquire RMs and utilizes the Lemke-Howson method to update the Q-functions, aiming for a Nash equilibrium. It is demonstrated that the introduced method reliably converges, accurately encoding the reward functions and achieving the optimal best-response policy for each agent over time. The effectiveness of the proposed approach is validated through case studies, including a Pacman Game scenario and a Factory Assembly scenario, illustrating its superior performance compared to baseline methods. Additionally, the impact of batch size on learning performance is examined, revealing that a diligent agent employing smaller batches can surpass the performance of an agent using larger batches, which fails to summarize experiences as effectively.
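To make the reward machine concept concrete, the sketch below implements a minimal RM: a finite automaton that transitions on event labels and emits a reward per transition, so a non-Markovian task ("get the key, then open the door") becomes Markovian in the joint (environment, RM-state) space. This toy class and its example task are illustrative, not the learned machines from the thesis.

```python
class RewardMachine:
    """Minimal reward machine: finite states, transitions on event labels,
    and a reward emitted per transition (toy model, not the JIRP-SG learner)."""

    def __init__(self, delta, rewards, u0=0):
        self.delta = delta        # (state, label) -> next state
        self.rewards = rewards    # (state, label) -> reward
        self.u = u0               # current RM state

    def step(self, label):
        """Consume one event label; return the reward and advance the state."""
        r = self.rewards.get((self.u, label), 0.0)
        self.u = self.delta.get((self.u, label), self.u)
        return r
```

Under this sketch, opening the door before collecting the key yields no reward, because the machine only pays out on the key-then-door transition sequence.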
Contributors: Kim, Hyohun (Author) / Xu, Zhe (Thesis advisor) / Lee, Hyunglae (Committee member) / Berman, Spring (Committee member) / Arizona State University (Publisher)
Created: 2024
Description
Robot motion and control remains a complex problem, both in general and in the field of machine learning (ML). Without ML approaches, robot controllers are typically designed manually, which can take considerable time, generally requires accounting for a range of edge cases, and often produces models highly constrained to specific tasks. ML can decrease the time it takes to create a model while simultaneously allowing it to operate on a broader range of tasks. The utilization of neural networks to learn from demonstration is, in particular, an approach with growing popularity due to its potential to quickly fit the parameters of a model to mimic training data. Many such neural networks, especially transformer-based architectures, act more as planners, taking in an initial context and then generating a sequence from that context one step at a time. Others hybridize the approach, predicting a latent plan and conditioning immediate actions on that plan. Such approaches may limit a model’s ability to interact with a dynamic environment, needing to replan to fully update its understanding of the environmental context. In this thesis, Language-commanded Scene-aware Action Response (LanSAR) is proposed as a reactive transformer-based neural network that makes immediate decisions based on previous actions and environmental changes. Its actions are further conditioned on a language command, serving as a control mechanism while also narrowing the distribution of possible actions around this command. It is shown that LanSAR successfully learns a strong representation of multimodal visual and spatial input, and learns reasonable motions in relation to most language commands. It is also shown that LanSAR can struggle with both the accuracy of motions and understanding the specific semantics of language commands.
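The reactive, language-conditioned decision step described above can be sketched at its simplest: a single-step policy that maps the current observation, the previous action, and a language-command embedding to the next action. The linear map below is a toy stand-in for the transformer, and all dimensions and names are assumptions for illustration.

```python
import numpy as np

def reactive_policy(obs, prev_action, lang_emb, W, b):
    """Single-step reactive policy: the next action is computed from the
    current observation, the previous action, and a language-command
    embedding (a toy linear stand-in for a transformer backbone)."""
    x = np.concatenate([obs, prev_action, lang_emb])
    return np.tanh(W @ x + b)   # bounded action output
```

The key property mirrored here is that each action is recomputed from fresh environmental input rather than drawn from a fixed pre-generated plan.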
Contributors: Hardy, Adam (Author) / Ben Amor, Heni (Thesis advisor) / Srivastava, Siddharth (Committee member) / Pavlic, Theodore (Committee member) / Arizona State University (Publisher)
Created: 2024
Description
Shape memory alloys (SMAs) are a class of smart materials that can recover their predetermined shape when subjected to an appropriate thermal cycle. This unique property makes SMAs attractive for actuator applications, where the material’s phase transformation can be used to generate controlled motion or force. The actuator design leverages the one-way shape memory effect of NiTi (nickel-titanium) alloy wire, which contracts upon heating and recovers its original length when cooled. A bias spring opposes the SMA wire contraction, enabling a cyclical actuation motion. Thermal actuation is achieved through Joule heating by passing an electric current through the SMA wire. This thesis presents the design of a compact, lightweight SMA-based actuator providing controlled and precise motion for various engineering applications. A soft actuator design is also presented that exploits the response of the SMA to trigger intrinsically mono-stable shape reconfiguration. The proposed class of soft actuators performs bending actuation by selectively activating the SMA. The transition sequences were optimized via geometric parameterization and energy-based criteria. The reconfigured structure is capable of arbitrary bending, as reported here. The proposed class of robots has shown promise as a fast actuator or shape-reconfigurable structure, which will bring new capabilities to future long-duration missions in space or undersea, as well as to bio-inspired robotics.
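The Joule-heating actuation principle can be sketched with a standard lumped-parameter thermal model: electrical power heats the wire while convection cools it, and actuation begins once the transformation temperature is reached. All parameter values below (resistance, thermal mass, convection coefficient, activation temperature) are illustrative assumptions, not measurements from the thesis.

```python
def time_to_activation(current, R=10.0, hA=0.02, mc=0.5, T_amb=25.0,
                       T_act=70.0, dt=0.01, t_max=60.0):
    """Lumped-parameter Joule-heating model of an SMA wire:
        m·c·dT/dt = I²·R − h·A·(T − T_amb)
    Forward-Euler integration; returns the time (s) at which the wire
    reaches the activation temperature T_act, or None if it never does."""
    T, t = T_amb, 0.0
    while t < t_max:
        T += dt * (current ** 2 * R - hA * (T - T_amb)) / mc
        t += dt
        if T >= T_act:
            return t
    return None
```

A useful sanity check on such a model: the steady-state temperature rise is I²R / (hA), so a current too small to push that rise past the transformation threshold never actuates the wire, no matter how long it is driven.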
Contributors: Shankar, Kaushik (Author) / Ma, Leixin (Thesis advisor) / Berman, Spring (Committee member) / Marvi, Hamidreza (Committee member) / Arizona State University (Publisher)
Created: 2024
Description
Electric vehicles (EVs) have emerged as a promising solution to reduce greenhouse gas emissions and dependency on fossil fuels in the transportation sector. However, limited battery capacity remains a significant challenge, impacting range and overall performance. This thesis explores the application of Nonlinear Model Predictive Control (NMPC) techniques to optimize energy management in EVs. The study begins with a comprehensive review of existing literature on EV energy optimization strategies and NMPC methodologies. Subsequently, a detailed model of the EV's dynamics, including the battery, motor, and vehicle dynamics, is developed to formulate the optimization problem. The NMPC controller is designed to dynamically adjust the power distribution among different vehicle components, such as the motor, battery, and regenerative braking system, while considering constraints such as battery state-of-charge, vehicle speed, and road conditions. Simulation studies are conducted to evaluate the performance of the proposed NMPC-based energy optimization strategy under various driving scenarios and compare it with conventional control strategies. The results demonstrate that NMPC offers superior performance in terms of energy efficiency, range extension, and overall vehicle dynamics. The findings of this research contribute to the advancement of energy optimization techniques for EVs, paving the way for more efficient and sustainable transportation systems in the future.
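The receding-horizon logic at the heart of NMPC can be illustrated with a deliberately tiny sketch: enumerate short control sequences over a toy longitudinal/battery model, score each by tracking error and drive energy, and apply only the first move of the best sequence. A real NMPC solver uses gradient-based nonlinear optimization over a much richer model; everything below, including the regeneration efficiency and cost weights, is an illustrative assumption.

```python
import itertools

def nmpc_step(soc, v, v_ref, horizon=3, controls=(-1.0, 0.0, 1.0),
              dt=1.0, cap=100.0, eta_regen=0.6):
    """Toy receding-horizon controller: enumerate control sequences over a
    short horizon, simulate a crude vehicle/battery model, and return the
    first control of the lowest-cost sequence."""
    best_u, best_cost = 0.0, float("inf")
    for seq in itertools.product(controls, repeat=horizon):
        s, vel, cost = soc, v, 0.0
        for u in seq:
            vel = max(vel + u * dt, 0.0)          # simple longitudinal model
            p = u * vel                           # >0 drive power, <0 regen
            s -= dt * (p if p > 0 else eta_regen * p) / cap
            cost += (vel - v_ref) ** 2 + 0.1 * max(p, 0.0)  # tracking + energy
            cost += 1e6 if s < 0 else 0.0         # battery depletion penalty
        if cost < best_cost:
            best_cost, best_u = cost, seq[0]
    return best_u
```

Only the first control is applied; at the next time step the whole optimization is re-run from the new state, which is what makes the scheme a receding horizon.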
Contributors: Gangwar, Harsh (Author) / Chen, Yan (Thesis advisor) / Zhao, Junfeng (Committee member) / Suo, Dajiang (Committee member) / Arizona State University (Publisher)
Created: 2024
Description
The field of unmanned aerial vehicle (UAV) navigation has been moving toward collision-inclusive path planning, yet little work has considered what a UAV is colliding with and whether it should collide at all. Therefore, there is a need for a framework that allows a UAV to consider what is around it and find the best collision candidate. The following work presents such a framework, which considers what an object is and the properties associated with it. Specifically, it considers an object’s material and monetary value to decide whether the object is a good collision candidate. This information is then published on a binary occupancy map that contains the objects’ sizes and locations with respect to the current position of the UAV. The intent is that the generated binary occupancy map can be used with a path planner to decide what the UAV should collide with. The framework was designed to be as modular as possible and to work with conventional UAVs that have some degree of crash resistance incorporated into their design. The framework was tested by using it to identify various objects as collision candidates or not, and then carrying out collisions with some of the objects to test the framework’s accuracy. The purpose of this research was to further the field of collision-inclusive path planning by allowing UAVs to know, in a way, what they intend to collide with and decide whether they should, in order to make safer and more efficient collisions.
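The occupancy-map idea can be sketched as follows: each detected object is scored by its material and monetary value, and only acceptable collision candidates are marked on a binary grid for the downstream planner. The material scores, value threshold, and dictionary-based object format are hypothetical choices for illustration, not the thesis's actual criteria.

```python
import numpy as np

# Hypothetical per-material collision desirability (higher = safer to hit).
MATERIAL_SCORE = {"foam": 0.9, "cardboard": 0.8, "wood": 0.4,
                  "glass": 0.0, "metal": 0.1}

def collision_map(objects, shape=(20, 20), value_limit=50.0, score_min=0.5):
    """Mark grid cells occupied only by objects judged acceptable to collide
    with, based on material softness and monetary value thresholds."""
    grid = np.zeros(shape, dtype=bool)
    for obj in objects:
        ok = (MATERIAL_SCORE.get(obj["material"], 0.0) >= score_min
              and obj["value"] <= value_limit)
        if ok:
            r0, c0, r1, c1 = obj["bbox"]   # object footprint in grid cells
            grid[r0:r1, c0:c1] = True
    return grid
```

A path planner consuming this map would then treat `True` cells as permissible collision targets and everything else as space to avoid.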
Contributors: Molnar, Madelyn Helena (Author) / Zhang, Wenlong (Thesis advisor) / Sugar, Thomas (Committee member) / Guo, Shenghan (Committee member) / Arizona State University (Publisher)
Created: 2024