Repairing Neural Networks with Safety Assurances for Robot Learning

Description

Autonomous systems powered by Artificial Neural Networks (NNs) have shown remarkable capabilities in performing complex tasks that are difficult to formally specify. However, ensuring the safety, reliability, and trustworthiness of these NN-based systems remains a significant challenge, especially when they encounter inputs that fall outside the distribution of their training data. In robot learning applications, such as lower-leg prostheses, even well-trained policies can exhibit unsafe behaviors when faced with unforeseen or adversarial inputs, potentially leading to harmful outcomes. Addressing these safety concerns is crucial for the adoption and deployment of autonomous systems in real-world, safety-critical environments. To address these challenges, this dissertation presents a neural network repair framework aimed at enhancing safety in robot learning applications. First, a novel layer-wise repair method utilizing Mixed-Integer Quadratic Programming (MIQP) is introduced that enables targeted adjustments to specific layers of a neural network to satisfy predefined safety constraints without altering the network’s structure. Second, the practical effectiveness of the proposed methods is demonstrated through extensive experiments on safety-critical assistive devices, particularly lower-leg prostheses, to ensure the generation of safe and reliable neural policies. Third, the integration of predictive models is explored to enforce implicit safety constraints, allowing for anticipation and mitigation of unsafe behaviors through a two-step supervised learning approach that combines behavioral cloning with neural network repair. By addressing these areas, this dissertation advances the state-of-the-art in neural network repair for robot learning. The outcome of this work promotes the development of robust and secure autonomous systems capable of operating safely in unpredictable and dynamic real-world environments.
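To make the repair idea concrete, the sketch below poses a toy version of the problem as a convex program: adjust the weights of a single (final) linear layer as little as possible while forcing the network's outputs on a handful of known counterexample inputs below a safety bound. This is only a minimal illustration using cvxpy, with invented dimensions and thresholds; the dissertation's MIQP formulation, which can also repair interior layers, is not reproduced here.

```python
import numpy as np
import cvxpy as cp

# Toy last-layer repair as a convex QP. The dissertation's method uses MIQP
# to handle interior ReLU layers; here only the final linear layer is freed,
# so the problem stays convex. All sizes and bounds below are illustrative.
rng = np.random.default_rng(0)
hidden_dim, out_dim = 8, 2
W0 = rng.normal(size=(out_dim, hidden_dim))   # trained last-layer weights
b0 = rng.normal(size=out_dim)                 # trained last-layer bias
phi = rng.normal(size=(5, hidden_dim))        # penultimate-layer activations of
                                              # 5 unsafe counterexample inputs
y_max = 1.0                                   # hypothetical safety bound

W = cp.Variable((out_dim, hidden_dim))
b = cp.Variable(out_dim)
# Stay as close as possible to the original parameters...
objective = cp.Minimize(cp.sum_squares(W - W0) + cp.sum_squares(b - b0))
# ...while forcing every counterexample output below the safety bound.
constraints = [W @ phi[i] + b <= y_max for i in range(phi.shape[0])]
cp.Problem(objective, constraints).solve()
print("repaired last-layer weights:\n", W.value)
```

Restricting the repair to the last layer keeps the problem convex; freeing an interior layer reintroduces the ReLU nonlinearity, which is where a mixed-integer encoding becomes necessary.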

Details

Contributors
Date Created
2024
Topical Subject
Resource Type
Language
  • eng
Note
  • Partial requirement for: Ph.D., Arizona State University, 2024
  • Field of study: Computer Science

Additional Information

English
Extent
  • 98 pages
Open Access
Peer-reviewed

Intelligent Visual Signaling for Mixed Reality based Human Robot Interaction

Description

Augmented and mixed-reality technologies are increasingly recognized as pivotal enablers of human-robot collaboration, offering intuitive visual signals that enhance communication and task execution. Despite their potential, the effective integration and optimization of these visual cues in collaborative environments remain underexplored. This thesis addresses these gaps through comprehensive studies on the design and implementation of innovative communication frameworks for human-robot interaction. Initially, this research identifies and empirically evaluates effective visual signals for human-robot collaboration. Through a comparative analysis of static and dynamic cues within a collaborative object sorting task and using information-theoretic approaches, the influence of these cues on human behavior is quantified. The results demonstrate that a strategic combination of visual signals can significantly enhance task efficiency and reduce cognitive load. Further advancing this field, the thesis introduces SiSCo, a novel framework employing Large Language Models (LLMs) to dynamically generate context-aware visual cues. Experimental validation shows that SiSCo enhances communication within human-robot teams, improving team efficiency by approximately 60% over conventional natural language signals and significantly reducing cognitive strain, as measured by NASA-TLX metrics. Committed to community development, the implementation and associated resources of SiSCo are made openly accessible, underscoring the approach of blending empirical research, immersive technologies, and computational innovation to advance human-robot collaboration. Building upon this foundational work, the thesis proposes IMMRSY, an immersive mixed-reality system designed to enrich human-robot interactions in the domain of robot learning. This system facilitates intuitive virtual manipulation and promotes efficient, cost-effective data collection methodologies, setting new standards for immersive interaction in robotic systems.
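As a rough illustration of the LLM-driven signaling idea (not SiSCo's actual prompt, schema, or model), the snippet below builds a prompt asking a language model for a structured visual-cue specification and parses the reply defensively; the field names and the fallback cue are hypothetical.

```python
import json

def build_signal_prompt(task_context: str, robot_intent: str) -> str:
    """Assemble a prompt asking an LLM for a structured visual-cue spec.

    Hypothetical illustration of the kind of pipeline SiSCo describes;
    the actual prompt wording and cue schema are not reproduced here.
    """
    return (
        "You generate visual signals for a mixed-reality headset worn by a "
        "human teammate of a robot.\n"
        f"Task context: {task_context}\n"
        f"Robot intent: {robot_intent}\n"
        'Respond with JSON: {"shape": ..., "color": ..., '
        '"anchor": ..., "animation": ...}'
    )

def parse_signal(llm_response: str) -> dict:
    # Fall back to a neutral cue if the model returns malformed JSON.
    try:
        return json.loads(llm_response)
    except json.JSONDecodeError:
        return {"shape": "arrow", "color": "yellow",
                "anchor": "target_object", "animation": "pulse"}
```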

Details

Contributors
Date Created
2024
Resource Type
Language
  • eng
Note
  • Partial requirement for: Ph.D., Arizona State University, 2024
  • Field of study: Electrical Engineering

Additional Information

English
Extent
  • 136 pages
Open Access
Peer-reviewed

Robot Learning via Deep State-space Model

Description

Robot learning aims to enable robots to acquire new skills and adapt to their environment through advanced learning algorithms. As embodiments of AI, robots continue to face the challenges of precisely estimating their state across varied environments and executing actions based on these state estimates. Although many approaches focus on developing end-to-end models and policies, they often lack explainability and do not effectively integrate algorithmic priors to understand the underlying robot models. This thesis addresses the challenges of robot learning through the application of state-space models, demonstrating their efficacy in representing a wide range of robotic systems within a differentiable Bayesian framework that integrates states, observations, and actions. It establishes that foundational state-space models possess the adaptability to be learned through data-driven approaches, enabling robots to accurately estimate their states from environmental interactions and to use these estimated states to execute more complex tasks. Additionally, the thesis shows that state-space modeling can be effectively applied in multimodal settings by learning latent state representations for sensor fusion. Furthermore, it demonstrates that state-space models can be utilized to impose conditions on robot policy networks, thereby enhancing their performance and consistency. The practical implications of deep state-space models are evaluated across a variety of robot manipulation tasks in both simulated and real-world environments, including pick-and-place operations and manipulation in dynamic contexts. The state estimation methods are also applied to soft robot systems, which present significant modeling challenges. In the final part, the thesis discusses the connection between robot learning and foundation models, exploring whether state-space agents based on large language models (LLMs) serve as a more conducive reasoning framework for robot learning. It further explores the use of foundation models to enhance data quality, demonstrating improved success rates for robot policy networks with enriched task context.
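A minimal sketch of what a differentiable state-space model can look like is given below: a linear-Gaussian model whose transition and observation matrices are learnable PyTorch parameters, so the Kalman-style predict/update step can be trained end-to-end. Dimensions, noise covariances, and the linearity assumption are all simplifications relative to the deep state-space models described above.

```python
import torch
import torch.nn as nn

class LinearSSM(nn.Module):
    """Toy differentiable linear-Gaussian state-space model (illustrative only)."""
    def __init__(self, state_dim=4, obs_dim=2, act_dim=1):
        super().__init__()
        self.A = nn.Parameter(torch.eye(state_dim))             # transition
        self.B = nn.Parameter(torch.zeros(state_dim, act_dim))  # control
        self.C = nn.Parameter(0.1 * torch.randn(obs_dim, state_dim))  # observation
        self.Q = 1e-2 * torch.eye(state_dim)   # process noise (fixed here)
        self.R = 1e-1 * torch.eye(obs_dim)     # observation noise (fixed here)

    def step(self, x, P, u, y):
        # Predict with the learned dynamics, then correct with the observation.
        x_pred = self.A @ x + self.B @ u
        P_pred = self.A @ P @ self.A.T + self.Q
        S = self.C @ P_pred @ self.C.T + self.R
        K = P_pred @ self.C.T @ torch.linalg.inv(S)   # Kalman gain
        x_new = x_pred + K @ (y - self.C @ x_pred)
        P_new = (torch.eye(P.shape[0]) - K @ self.C) @ P_pred
        return x_new, P_new

# One filtering step on dummy data; a training loop would backpropagate a
# loss on x through A, B, and C.
model = LinearSSM()
x, P = model.step(torch.zeros(4), torch.eye(4), torch.zeros(1), torch.zeros(2))
```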

Details

Contributors
Date Created
2024
Resource Type
Language
  • eng
Note
  • Partial requirement for: Ph.D., Arizona State University, 2024
  • Field of study: Computer Science

Additional Information

English
Extent
  • 160 pages
Open Access
Peer-reviewed

LanSAR – Language-commanded Scene-aware Action Response

Description

Robot motion and control remains a complex problem both in general and in the field of machine learning (ML). Without ML approaches, robot controllers are typically designed manually, which can take considerable time, generally requiring accounting for a range of edge cases and often producing models highly constrained to specific tasks. ML can decrease the time it takes to create a model while simultaneously allowing it to operate on a broader range of tasks. The utilization of neural networks to learn from demonstration is, in particular, an approach with growing popularity due to its potential to quickly fit the parameters of a model to mimic training data. Many such neural networks, especially in the realm of transformer-based architectures, act more as planners, taking in an initial context and then generating a sequence from that context one step at a time. Others hybridize the approach, predicting a latent plan and conditioning immediate actions on that plan. Such approaches may limit a model’s ability to interact with a dynamic environment, needing to replan to fully update its understanding of the environmental context. In this thesis, Language-commanded Scene-aware Action Response (LanSAR) is proposed as a reactive transformer-based neural network that makes immediate decisions based on previous actions and environmental changes. Its actions are further conditioned on a language command, serving as a control mechanism while also narrowing the distribution of possible actions around this command. It is shown that LanSAR successfully learns a strong representation of multimodal visual and spatial input, and learns reasonable motions in relation to most language commands. It is also shown that LanSAR can struggle with both the accuracy of motions and understanding the specific semantics of language commands.
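The sketch below illustrates the reactive, language-conditioned decision step in a generic way: at every timestep the policy re-encodes the language command, the current observation, and the previous action, and emits the next action. The token layout, dimensions, and transformer configuration are invented for the example and are not LanSAR's actual architecture.

```python
import torch
import torch.nn as nn

class ReactivePolicy(nn.Module):
    """Toy sketch of a reactive, language-conditioned action head.

    Illustrates re-deciding the next action at every step from the current
    observation, the previous action, and a fixed command embedding, rather
    than committing to a long open-loop plan.
    """
    def __init__(self, obs_dim=64, act_dim=7, lang_dim=64, d_model=128):
        super().__init__()
        self.obs_proj = nn.Linear(obs_dim, d_model)
        self.act_proj = nn.Linear(act_dim, d_model)
        self.lang_proj = nn.Linear(lang_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, act_dim)

    def forward(self, obs, prev_act, lang):
        # One token each for the command, the current observation,
        # and the previous action.
        tokens = torch.stack([self.lang_proj(lang),
                              self.obs_proj(obs),
                              self.act_proj(prev_act)], dim=1)
        h = self.encoder(tokens)
        return self.head(h[:, -1])   # next action, recomputed every step
```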

Details

Contributors
Date Created
2024
Resource Type
Language
  • eng
Note
  • Partial requirement for: M.S., Arizona State University, 2024
  • Field of study: Computer Science

Additional Information

English
Extent
  • 61 pages
Open Access
Peer-reviewed

Learning Predictive Models for Assisted Human Biomechanics

Description

This dissertation explores the use of artificial intelligence and machine learning techniques for the development of controllers for fully-powered robotic prosthetics. The aim of the research is to enable prosthetics to predict future states and control biomechanical properties in both linear and nonlinear fashions, with a particular focus on ergonomics. The research is motivated by the need to provide amputees with prosthetic devices that not only replicate the functionality of the missing limb, but also offer a high level of comfort and usability. Traditional prosthetic devices lack the sophistication to adjust to a user’s movement patterns and can cause discomfort and pain over time. The proposed solution involves the development of machine learning-based controllers that can learn from user movements and adjust the prosthetic device’s movements accordingly. The research involves a combination of simulation and real-world testing to evaluate the effectiveness of the proposed approach. The simulation involves the creation of a model of the prosthetic device and the use of machine learning algorithms to train controllers that predict future states and control biomechanical properties. The real-world testing involves the use of human subjects wearing the prosthetic device to evaluate its performance and usability. The research focuses on two main areas: the prediction of future states and the control of biomechanical properties. The prediction of future states involves the development of machine learning algorithms that can analyze a user’s movements and predict the next movements with a high degree of accuracy. The control of biomechanical properties involves the development of algorithms that can adjust the prosthetic device’s movements to ensure maximum comfort and usability for the user. The results of the research show that the use of artificial intelligence and machine learning techniques can significantly improve the performance and usability of prosthetic devices. The machine learning-based controllers developed in this research are capable of predicting future states and adjusting the prosthetic device’s movements in real-time, leading to a significant improvement in ergonomics and usability. Overall, this dissertation provides a comprehensive analysis of the use of artificial intelligence and machine learning techniques for the development of controllers for fully-powered robotic prosthetics.
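As a minimal, hypothetical stand-in for such predictive models, the snippet below fits a linear k-step-ahead predictor that maps a short window of past joint states to a future state; the window length, horizon, and linear model choice are illustrative and much simpler than the learned controllers described above.

```python
import numpy as np

def fit_gait_predictor(states, window=10, horizon=5):
    """Fit a linear horizon-step-ahead predictor on windows of joint states.

    `states` is a (T, d) array of recorded joint angles/velocities; the
    model maps the last `window` samples to the state `horizon` steps
    ahead. Purely illustrative; all sizes are invented.
    """
    T, d = states.shape
    X, Y = [], []
    for t in range(window, T - horizon):
        X.append(states[t - window:t].ravel())   # flattened history window
        Y.append(states[t + horizon])            # future state to predict
    X, Y = np.asarray(X), np.asarray(Y)
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)    # least-squares fit
    return W

def predict_future(W, recent_states):
    # recent_states: the most recent (window, d) block of joint states.
    return recent_states.ravel() @ W
```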

Details

Contributors
Date Created
2023
Resource Type
Language
  • eng
Note
  • Partial requirement for: Ph.D., Arizona State University, 2023
  • Field of study: Electrical Engineering

Additional Information

English
Extent
  • 163 pages
Open Access
Peer-reviewed

Towards Reliable Semantic Vision

Description

Models that learn from data are widely and rapidly being deployed today for real-world use, and have become an integral and embedded part of human lives. While these technological advances are exciting and impactful, such data-driven computer vision systems often fail in inscrutable ways. This dissertation seeks to study and improve the reliability of machine learning models from several perspectives including the development of robust training algorithms to mitigate the risks of such failures, construction of new datasets that provide a new perspective on capabilities of vision models, and the design of evaluation metrics for re-calibrating the perception of performance improvements. I will first address distribution shift in image classification with the following contributions: (1) two methods for improving the robustness of image classifiers to distribution shift by leveraging the classifier's failures into an adversarial data transformation pipeline guided by domain knowledge, (2) an interpolation-based technique for flagging out-of-distribution samples, and (3) an intriguing trade-off between distributional and adversarial robustness resulting from data modification strategies. I will then explore reliability considerations for semantic vision models that learn from both visual and natural language data; I will discuss how logical and semantic sentence transformations affect the performance of vision-language models and my contributions towards developing knowledge-guided learning algorithms to mitigate these failures. Finally, I will describe the effort towards building and evaluating complex reasoning capabilities of vision-language models towards the long-term goal of robust and reliable computer vision models that can communicate, collaborate, and reason with humans.
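As a simple point of reference for the out-of-distribution flagging problem, the snippet below shows a generic distance-based score computed in a feature space: samples far from their nearest training features are flagged. This common baseline is offered only as an illustration; it is not the interpolation-based technique contributed in the dissertation, and the threshold and k are hypothetical.

```python
import numpy as np

def ood_score(test_feats, train_feats, k=10):
    """Generic k-nearest-neighbor distance OOD score (illustrative baseline).

    A test sample far from its k nearest training features in embedding
    space receives a high score and can be flagged as out-of-distribution.
    """
    d = np.linalg.norm(test_feats[:, None, :] - train_feats[None, :, :], axis=-1)
    knn = np.sort(d, axis=1)[:, :k]    # distances to k nearest training features
    return knn.mean(axis=1)            # higher = more likely OOD

def flag_ood(test_feats, train_feats, threshold, k=10):
    # Threshold would be calibrated on held-out in-distribution data.
    return ood_score(test_feats, train_feats, k) > threshold
```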

Details

Contributors
Date Created
2023
Resource Type
Language
  • eng
Note
  • Partial requirement for: Ph.D., Arizona State University, 2023
  • Field of study: Computer Science

Additional Information

English
Extent
  • 306 pages
Open Access
Peer-reviewed

Generating Natural Language Descriptions from Multimodal Data Traces of Robot Behavior

Description

Natural language plays a crucial role in human-robot interaction, as it is the common ground where human beings and robots can communicate and understand each other. However, most work at the intersection of natural language and robotics focuses on generating robot actions from a natural language command, which is a unidirectional form of communication. This work addresses the other direction, in which a robot describes its own actions from sampled images and joint sequences recorded during the task. A key aspect of this work is its use of multiple modalities: the start and end images of the robot task environment and the joint trajectories of the robot arms. Fusing different modalities is not merely a matter of combining data; it requires knowing what information to extract from which source so that the language description captures the state of the manipulator and the environment in which it performs the task. Experimental results across various simulated robot environments demonstrate that using multiple modalities improves the accuracy of the natural language descriptions, and that fusing the modalities efficiently, so that most of the available data sources are exploited, is crucial to generating such descriptions.
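The snippet below sketches one plausible fusion scheme under these assumptions: the start and end images are embedded with a shared projection, the joint trajectory is summarized by a recurrent encoder, and the three embeddings are combined into a single context vector for a downstream caption decoder. Encoders, dimensions, and the decoder interface are invented for illustration and are not the thesis's exact model.

```python
import torch
import torch.nn as nn

class MultimodalFusion(nn.Module):
    """Toy fusion of start/end image features with a joint-trajectory encoding.

    Each modality is embedded separately and then combined into one context
    vector that a language decoder could condition on. Illustrative only.
    """
    def __init__(self, img_feat_dim=512, joint_dim=7, hidden=256):
        super().__init__()
        self.img_enc = nn.Linear(img_feat_dim, hidden)   # shared image projection
        self.traj_enc = nn.GRU(joint_dim, hidden, batch_first=True)
        self.fuse = nn.Linear(3 * hidden, hidden)

    def forward(self, img_start, img_end, joint_traj):
        # joint_traj: (batch, T, joint_dim); keep the final GRU hidden state.
        _, h = self.traj_enc(joint_traj)
        ctx = torch.cat([self.img_enc(img_start),
                         self.img_enc(img_end),
                         h[-1]], dim=-1)
        return torch.tanh(self.fuse(ctx))   # context vector for a caption decoder
```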

Details

Contributors
Date Created
2021
Topical Subject
Resource Type
Language
  • eng
Note
  • Partial requirement for: M.S., Arizona State University, 2021
  • Field of study: Computer Science

Additional Information

English
Extent
  • 38 pages
Open Access
Peer-reviewed

Multimodal Robot Learning for Grasping and Manipulation

Description

Enabling robots to physically engage with their environment in a safe and efficient manner is an essential step towards human-robot interaction. To date, robots usually operate as pre-programmed workers that blindly execute tasks in highly structured environments crafted by skilled engineers. Changing the robots’ behavior to cover new duties or handle variability is an expensive, complex, and time-consuming process. However, with the advent of more complex sensors and algorithms, overcoming these limitations becomes within reach. This work proposes innovations in artificial intelligence, language understanding, and multimodal integration to enable next-generation grasping and manipulation capabilities in autonomous robots. The underlying thesis is that multimodal observations and instructions can drastically expand the responsiveness and dexterity of robot manipulators. Natural language, in particular, can be used to enable intuitive, bidirectional communication between a human user and the machine. To this end, this work presents a system that learns context-aware robot control policies from multimodal human demonstrations. Among the main contributions presented are techniques for (a) collecting demonstrations in an efficient and intuitive fashion, (b) methods for leveraging physical contact with the environment and objects, (c) the incorporation of natural language to understand context, and (d) the generation of robust robot control policies. The presented approach and systems are evaluated in multiple grasping and manipulation settings ranging from dexterous manipulation to pick-and-place, as well as contact-rich bimanual insertion tasks. Moreover, the usability of these innovations, especially when utilizing human task demonstrations and communication interfaces, is evaluated in several human-subject studies.
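At its core, learning control policies from demonstrations can be sketched as a supervised regression loop over (observation, action) pairs, as below. The sketch assumes the multimodal inputs have already been encoded into a single observation tensor and uses a plain mean-squared-error imitation loss; the contact-aware and language-conditioned components described above are not reproduced.

```python
import torch
import torch.nn as nn

def behavior_cloning(policy, demos, epochs=10, lr=1e-3):
    """Minimal behavioral-cloning loop over multimodal demonstrations.

    `demos` is assumed to be an iterable of (observation, action) tensor
    pairs in which the observation already concatenates the modalities
    (vision features, force/torque, language embedding, ...). Generic
    learning-from-demonstration sketch, not the dissertation's system.
    """
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for obs, act in demos:
            opt.zero_grad()
            loss = loss_fn(policy(obs), act)   # imitate the demonstrated action
            loss.backward()
            opt.step()
    return policy
```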

Details

Contributors
Date Created
2021
Resource Type
Language
  • eng
Note
  • Partial requirement for: Ph.D., Arizona State University, 2021
  • Field of study: Computer Science

Additional Information

English
Extent
  • 140 pages
Open Access
Peer-reviewed

Probabilistic Imitation Learning for Spatiotemporal Human-Robot Interaction

Description

Imitation learning is a promising methodology for teaching robots how to physically interact and collaborate with human partners. However, successful interaction requires complex coordination in time and space, i.e., knowing what to do as well as when to do it. This dissertation introduces Bayesian Interaction Primitives, a probabilistic imitation learning framework which establishes a conceptual and theoretical relationship between human-robot interaction (HRI) and simultaneous localization and mapping. In particular, it is established that HRI can be viewed through the lens of recursive filtering in time and space. In turn, this relationship allows one to leverage techniques from an existing, mature field and develop a powerful new formulation which enables multimodal spatiotemporal inference in collaborative settings involving two or more agents. Through the development of exact and approximate variations of this method, it is shown in this work that it is possible to learn complex real-world interactions in a wide variety of settings, including tasks such as handshaking, cooperative manipulation, catching, hugging, and more.
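The recursive-filtering view can be illustrated with a single, simplified update over interaction-primitive basis weights: observing the human partner at the current interaction phase refines the joint estimate of the weights, and with it the robot's predicted response. The sketch below is a plain linear-Gaussian (Kalman) update with a user-supplied basis function; phase estimation, nonlinearity, and the multimodal extensions of Bayesian Interaction Primitives are omitted.

```python
import numpy as np

def bip_filter_step(w_mean, w_cov, phase, obs, basis_fn, obs_noise=1e-2):
    """One recursive-filtering update over interaction-primitive weights.

    Simplified linear-Gaussian sketch: `basis_fn(phase)` returns the
    observation matrix mapping basis weights to the observed partner
    degrees of freedom at the current phase. Noise levels and the basis
    are illustrative placeholders.
    """
    H = basis_fn(phase)                              # observation matrix at this phase
    S = H @ w_cov @ H.T + obs_noise * np.eye(H.shape[0])
    K = w_cov @ H.T @ np.linalg.inv(S)               # Kalman gain
    w_mean = w_mean + K @ (obs - H @ w_mean)         # correct the weight estimate
    w_cov = (np.eye(len(w_mean)) - K @ H) @ w_cov
    return w_mean, w_cov
```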

Details

Contributors
Date Created
2021
Resource Type
Language
  • eng
Note
  • Partial requirement for: Ph.D., Arizona State University, 2021
  • Field of study: Computer Science

Additional Information

English
Extent
  • 165 pages
Open Access
Peer-reviewed

Safe and Robust Cooperative Algorithm for Connected Autonomous Vehicles

Description

Autonomous Vehicles (AVs) have the potential to significantly transform transportation. AVs are expected to make transportation safer by avoiding accidents caused by human error. When AVs become connected, they can exchange information with the infrastructure or with other Connected Autonomous Vehicles (CAVs) to plan their future motion efficiently, thereby increasing road throughput and reducing energy consumption. Cooperative algorithms for CAVs will not be deployed in real life unless they are proved to be safe, robust, and resilient to different failure models. Since intersections are crucial areas where most accidents happen, this dissertation first focuses on making existing intersection management algorithms safe and resilient against network and computation delays, bounded model mismatches and external disturbances, and the existence of a rogue vehicle. Then, a generic algorithm for conflict resolution and cooperation of CAVs is proposed that ensures the safety of vehicles even when other vehicles suddenly change their plan. The proposed approach can also detect deadlock situations among CAVs and resolve them through a negotiation process. A testbed consisting of 1/10th-scale model CAVs is built to evaluate the proposed algorithms, and a simulator is developed to perform tests at a larger scale. Results from the conducted experiments indicate the robustness and resilience of the proposed approaches.
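Deadlock detection among waiting vehicles can be illustrated with a standard cycle search on a wait-for graph, as sketched below; a cycle means none of the vehicles in it can proceed until negotiation breaks the tie. This is a generic illustration of the detection step only, with hypothetical vehicle ids; the dissertation's negotiation-based resolution and safety guarantees are not reproduced.

```python
def find_deadlock(wait_for):
    """Detect a deadlock cycle in a wait-for graph of CAVs.

    `wait_for` maps each vehicle id to the set of vehicles whose conflict
    zones it is waiting on. Returns a cycle of vehicle ids if one exists,
    otherwise None. Standard depth-first cycle check, for illustration.
    """
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in wait_for}
    stack = []

    def dfs(v):
        color[v] = GRAY
        stack.append(v)
        for u in wait_for.get(v, ()):
            if color.get(u, WHITE) == GRAY:          # back edge -> cycle found
                return stack[stack.index(u):] + [u]
            if color.get(u, WHITE) == WHITE:
                cycle = dfs(u)
                if cycle:
                    return cycle
        color[v] = BLACK
        stack.pop()
        return None

    for v in list(wait_for):
        if color[v] == WHITE:
            cycle = dfs(v)
            if cycle:
                return cycle
    return None

# Example: three vehicles each waiting on the next -> deadlock cycle reported.
print(find_deadlock({"cav1": {"cav2"}, "cav2": {"cav3"}, "cav3": {"cav1"}}))
```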

Details

Contributors
Date Created
2021
Topical Subject
Resource Type
Language
  • eng
Note
  • Partial requirement for: Ph.D., Arizona State University, 2021
  • Field of study: Computer Engineering

Additional Information

English
Extent
  • 180 pages
Open Access
Peer-reviewed