Robots are often used in long-duration scenarios, such as on the surface of Mars, where they may need to adapt to environmental changes. Typically, robots have been built for single tasks, such as moving boxes in a warehouse or surveying construction sites. However, there is a modern trend away from human hand-engineering and toward robot learning. To this end, the ideal robot is not engineered, but automatically designed for a specific task. This thesis focuses on robots that learn path-planning algorithms for specific environments. Learning is accomplished via genetic programming: path planners are represented as Python code, which is optimized via Pareto evolution, and the resulting planners are encouraged to explore curiously and efficiently. This research asks two questions: “How can robots exhibit life-long learning, adapting to changing environments in a robust way?” and “How can robots learn to be curious?”
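The Pareto evolution mentioned above can be illustrated with a minimal sketch, not taken from the thesis: given candidate planners scored on two objectives (here hypothetically, exploration coverage and path efficiency, both maximized), selection keeps only the non-dominated set. The planner names and scores are made up.

```python
# Hypothetical sketch of Pareto selection over two objectives.
# Each candidate carries a (coverage, efficiency) score tuple.

def dominates(a, b):
    """True if score tuple `a` Pareto-dominates `b` (maximizing both)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(population):
    """Keep candidates whose scores are not dominated by any other candidate."""
    return [p for p in population
            if not any(dominates(q["score"], p["score"])
                       for q in population if q is not p)]

population = [
    {"name": "planner_a", "score": (0.9, 0.4)},   # explores widely, less efficient
    {"name": "planner_b", "score": (0.6, 0.8)},   # efficient, explores less
    {"name": "planner_c", "score": (0.5, 0.3)},   # dominated by planner_b
]
survivors = pareto_front(population)               # planner_a and planner_b
```

In an evolutionary loop, the surviving front would seed the next generation of mutated planner programs.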
In recent years, the development of Control Barrier Functions (CBFs) has allowed safety guarantees to be placed on nonlinear control-affine systems. While powerful as a mathematical tool, CBF implementations on systems with high relative-degree constraints can become too computationally intensive for real-time control. Such deployments typically rely on the analysis of a system's symbolic equations of motion, leading to large, platform-specific control programs that do not generalize well. To address this, a more generalized framework is needed. This thesis provides a formulation of second-order CBFs for rigid open kinematic chains. An algorithm for numerically computing the safe control input of a CBF is then introduced based on this formulation. It is shown that this algorithm can be used on a broad category of systems, with specific examples for convoy platooning, drone obstacle avoidance, and robotic arms with many degrees of freedom. These examples show up to three-fold improvements in computation time, as well as a two-to-three order-of-magnitude reduction in program size.
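The core CBF idea can be sketched in its simplest form, which is not the thesis algorithm: with a scalar input and a single affine safety constraint a·u + b ≥ 0 (a and b derived from the barrier function and dynamics), the safe input is the minimal modification of the desired input. Real CBF controllers solve a quadratic program over many constraints; the numbers below are illustrative placeholders.

```python
# Minimal sketch: enforce a single affine barrier constraint a*u + b >= 0
# on a scalar control input by projecting the desired input onto the safe set.

def cbf_safe_input(u_des, a, b):
    """Minimally modify u_des so that a*u + b >= 0 (assumes a != 0)."""
    if a * u_des + b >= 0.0:
        return u_des        # desired input is already safe
    return -b / a           # otherwise project onto the constraint boundary

# Example with made-up constraint coefficients:
u = cbf_safe_input(u_des=2.0, a=1.0, b=-1.0)            # 2.0 is safe, kept
u_clamped = cbf_safe_input(u_des=-3.0, a=1.0, b=-1.0)   # clamped to the boundary
```

For one constraint this closed-form projection coincides with the QP solution; the numerical algorithm in the thesis is what makes the multi-constraint, high-relative-degree case tractable.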
Automated driving systems (ADS) have come a long way since their inception. These systems rely heavily on stochastic deep learning techniques for perception, planning, and prediction, as it is impossible to construct every possible driving scenario to generate driving policies. Moreover, these systems need to be trained and validated extensively on both typical and abnormal driving situations before they can be trusted with human life. However, most publicly available driving datasets consist only of typical driving behaviors. On the other hand, there is a plethora of videos available on the internet that capture abnormal driving scenarios, but they are unusable for ADS training or testing as they lack important information such as camera calibration parameters and annotated vehicle trajectories. This thesis proposes a new toolbox, DeepCrashTest-V2, that is capable of reconstructing high-quality simulations from monocular dashcam videos found on the internet. The toolbox not only estimates crucial parameters such as camera calibration, ego-motion, and surrounding road-user trajectories, but also creates a virtual world in Car Learning to Act (CARLA) using data from OpenStreetMap to simulate the estimated trajectories. The toolbox is open source and is made available in the form of a Python package on GitHub at https://github.com/C-Aniruddh/deepcrashtest_v2.
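Why camera calibration matters here can be seen from the pinhole camera model: recovering metric trajectories from a dashcam requires inverting this projection, which is impossible without the intrinsics. A minimal sketch with assumed (made-up) focal lengths and principal point:

```python
# Pinhole projection sketch: map a 3D point in the camera frame to pixels.
# fx, fy: focal lengths in pixels; cx, cy: principal point (all assumed values).

def project(point_3d, fx, fy, cx, cy):
    """Project a camera-frame point (x, y, z), z > 0, to pixel coordinates."""
    x, y, z = point_3d
    return (fx * x / z + cx, fy * y / z + cy)

# A vehicle 2 m to the right, 0.5 m below camera level, 20 m ahead:
u, v = project((2.0, 0.5, 20.0), fx=1000.0, fy=1000.0, cx=640.0, cy=360.0)
```

Trajectory estimation runs this map in reverse: given pixel tracks and estimated intrinsics and ego-motion, it solves for the 3D positions.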
This dissertation explores the use of artificial intelligence and machine learning techniques for the development of controllers for fully-powered robotic prosthetics. The aim of the research is to enable prosthetics to predict future states and control biomechanical properties in both linear and nonlinear fashions, with a particular focus on ergonomics.
The research is motivated by the need to provide amputees with prosthetic devices that not only replicate the functionality of the missing limb, but also offer a high level of comfort and usability. Traditional prosthetic devices lack the sophistication to adjust to a user’s movement patterns and can cause discomfort and pain over time. The proposed solution involves the development of machine learning-based controllers that can learn from user movements and adjust the prosthetic device’s movements accordingly.
The research involves a combination of simulation and real-world testing to evaluate the effectiveness of the proposed approach. The simulation involves the creation of a model of the prosthetic device and the use of machine learning algorithms to train controllers that predict future states and control biomechanical properties. The real-world testing involves human subjects wearing the prosthetic device to evaluate its performance and usability.
The research focuses on two main areas: the prediction of future states and the control of biomechanical properties. The former involves the development of machine learning algorithms that can analyze a user’s movements and predict the next movements with a high degree of accuracy. The latter involves the development of algorithms that can adjust the prosthetic device’s movements to ensure maximum comfort and usability for the user.
The results show that the use of artificial intelligence and machine learning techniques can significantly improve the performance and usability of prosthetic devices. The machine learning-based controllers developed in this research are capable of predicting future states and adjusting the prosthetic device’s movements in real time, leading to a significant improvement in ergonomics and usability. Overall, this dissertation provides a comprehensive analysis of the use of artificial intelligence and machine learning techniques for the development of controllers for fully-powered robotic prosthetics.
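The "predict future states" idea can be sketched in its simplest linear form, which is not the dissertation's learned controllers: an autoregressive predictor that extrapolates the next sample of a joint signal from recent samples. The signal name, sample values, and weights are illustrative placeholders.

```python
# Toy linear autoregressive prediction: the next value is a weighted
# sum of the most recent samples. A learned controller would fit the
# weights (or a nonlinear model) from user movement data.

def predict_next(history, weights):
    """Weighted sum of the last len(weights) samples (oldest weight first)."""
    recent = history[-len(weights):]
    return sum(w * x for w, x in zip(weights, recent))

knee_angle = [10.0, 12.0, 14.0]   # degrees, sampled over time (made-up data)
# Weights [-1, 2] implement simple linear extrapolation: 2*x[t] - x[t-1].
next_angle = predict_next(knee_angle, weights=[-1.0, 2.0])   # predicts 16.0
```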
Natural language plays a crucial role in human-robot interaction, as it is the common ground where human beings and robots can communicate and understand each other. However, most work at the intersection of natural language and robotics focuses on generating robot actions from a natural language command, which is a unidirectional form of communication. This work focuses on the other direction of communication, allowing a robot to describe its actions from sampled images and joint sequences from the robot task. The importance of this work is that it utilizes multiple modalities: the start and end images from the robot task environment and the joint trajectories of the robot arms. Fusing different modalities is not just about combining the data, but about knowing what information to extract from which data sources, so that the language description represents the state of the manipulator and the environment in which it is performing the task. From experimental results across various simulated robot environments, this research demonstrates that utilizing multiple modalities improves the accuracy of the natural language descriptions, and that efficiently fusing the modalities is crucial to generating such descriptions.
Enabling robots to physically engage with their environment in a safe and efficient manner is an essential step toward human-robot interaction. To date, robots usually operate as pre-programmed workers that blindly execute tasks in highly structured environments crafted by skilled engineers. Changing a robot’s behavior to cover new duties or handle variability is an expensive, complex, and time-consuming process. However, with the advent of more capable sensors and algorithms, overcoming these limitations is within reach. This work proposes innovations in artificial intelligence, language understanding, and multimodal integration to enable next-generation grasping and manipulation capabilities in autonomous robots. The underlying thesis is that multimodal observations and instructions can drastically expand the responsiveness and dexterity of robot manipulators. Natural language, in particular, can be used to enable intuitive, bidirectional communication between a human user and the machine. To this end, this work presents a system that learns context-aware robot control policies from multimodal human demonstrations. The main contributions include techniques for (a) collecting demonstrations in an efficient and intuitive fashion, (b) leveraging physical contact with the environment and objects, (c) incorporating natural language to understand context, and (d) generating robust robot control policies. The presented approach and systems are evaluated in multiple grasping and manipulation settings, ranging from dexterous manipulation to pick-and-place, as well as contact-rich bimanual insertion tasks. Moreover, the usability of these innovations, especially when utilizing human task demonstrations and communication interfaces, is evaluated in several human-subject studies.
Autonomous Vehicles (AVs) are inevitable entities in future mobility systems that demand safety and adaptability as two critical factors in replacing or assisting human drivers. Safety arises in defining, standardizing, quantifying, and monitoring requirements for all autonomous components. Adaptability, on the other hand, involves efficient handling of uncertainty and inconsistencies in models and data. First, I address safety by presenting a search-based test-case generation framework that can be used in training and testing deep-learning components of AVs. Next, to address adaptability, I propose a framework based on multi-valued linear temporal logic syntax and semantics that allows autonomous agents to perform model checking on systems with uncertainties. The search-based test-case generation framework provides safety assurance guarantees by formalizing and monitoring Responsibility Sensitive Safety (RSS) rules. I use the RSS rules in signal temporal logic as qualification specifications for monitoring and screening the quality of generated test-drive scenarios. Furthermore, to extend the expressivity of existing temporal-based formal languages, I propose a new spatio-temporal perception logic that enables formalizing qualification specifications for perception systems. Altogether, my test-generation framework can be used for reasoning about the quality of perception, prediction, and decision-making components in AVs. Finally, my efforts resulted in publicly available software. One component is an offline monitoring algorithm, based on the proposed logic, that reasons about the quality of perception systems. The other is an optimal planner (model checker) that accepts mission specifications and model descriptions in the form of multi-valued logic and multi-valued sets, respectively. My monitoring framework is distributed with the publicly available S-TaLiRo and Sim-ATAV tools.
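The flavor of offline signal temporal logic monitoring can be shown with a minimal sketch, far simpler than the tools named above: the robustness of the requirement "always (d ≥ d_min)" over a sampled trace is the worst-case margin. The trace and threshold below are made up.

```python
# Robustness of G(d >= d_min) over a finite sampled trace:
# positive -> the requirement held with that margin over the whole trace;
# negative -> it was violated by that amount at the worst sample.

def robustness_always(trace, d_min):
    """Worst-case margin of the predicate d >= d_min over the trace."""
    return min(d - d_min for d in trace)

headway = [12.0, 9.5, 7.2, 8.1]              # e.g. distances to a lead vehicle (m)
rho = robustness_always(headway, d_min=5.0)  # worst margin: 7.2 - 5.0 = 2.2
```

Search-based test generation uses such robustness values as a fitness signal, steering scenario parameters toward traces where the margin approaches or crosses zero.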
Imitation learning is a promising methodology for teaching robots how to physically interact and collaborate with human partners. However, successful interaction requires complex coordination in time and space, i.e., knowing what to do as well as when to do it. This dissertation introduces Bayesian Interaction Primitives, a probabilistic imitation learning framework which establishes a conceptual and theoretical relationship between human-robot interaction (HRI) and simultaneous localization and mapping. In particular, it is established that HRI can be viewed through the lens of recursive filtering in time and space. In turn, this relationship allows one to leverage techniques from an existing, mature field and develop a powerful new formulation which enables multimodal spatiotemporal inference in collaborative settings involving two or more agents. Through the development of exact and approximate variations of this method, it is shown in this work that it is possible to learn complex real-world interactions in a wide variety of settings, including tasks such as handshaking, cooperative manipulation, catching, hugging, and more.
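The recursive-filtering view can be illustrated with a toy example, far simpler than Bayesian Interaction Primitives: a one-dimensional Kalman filter tracking a single latent quantity (here, hypothetically, the phase of an interaction) from noisy observations. BIP filters over a much richer state (phase plus basis-function weights); the noise values and observations below are arbitrary.

```python
# One predict/update cycle of a scalar Kalman filter for a random-walk state.

def kalman_step(mean, var, z, process_var, obs_var):
    """Return the posterior (mean, var) after observing z."""
    # Predict: the state drifts, so uncertainty grows.
    var_pred = var + process_var
    # Update: blend the prediction with the observation z.
    k = var_pred / (var_pred + obs_var)   # Kalman gain
    mean_new = mean + k * (z - mean)
    var_new = (1.0 - k) * var_pred
    return mean_new, var_new

mean, var = 0.0, 1.0                      # broad prior over the phase
for z in [0.2, 0.35, 0.55]:               # noisy phase observations
    mean, var = kalman_step(mean, var, z, process_var=0.01, obs_var=0.1)
# After three observations the estimate has moved toward the data
# and the variance has shrunk well below the prior.
```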
Autonomous Vehicles (AVs) have the potential to significantly evolve transportation. AVs are expected to make transportation safer by avoiding accidents that happen due to human errors. When AVs become connected, they can exchange information with the infrastructure or other Connected Autonomous Vehicles (CAVs) to efficiently plan their future motion and, therefore, increase road throughput and reduce energy consumption. Cooperative algorithms for CAVs will not be deployed in real life unless they are proven to be safe, robust, and resilient to different failure models. Since intersections are crucial areas where most accidents happen, this dissertation first focuses on making existing intersection management algorithms safe and resilient against network and computation-time delays, bounded model mismatches and external disturbances, and the existence of a rogue vehicle. Then, a generic algorithm for conflict resolution and cooperation of CAVs is proposed that ensures the safety of vehicles even when other vehicles suddenly change their plans. The proposed approach can also detect deadlock situations among CAVs and resolve them through a negotiation process. A testbed consisting of 1/10th-scale model CAVs was built to evaluate the proposed algorithms. In addition, a simulator was developed to perform tests at a large scale. Results from the conducted experiments indicate the robustness and resilience of the proposed approaches.
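A hypothetical illustration of the deadlock-detection idea (not the dissertation's algorithm): represent which vehicle is yielding to which as a "wait-for" graph and check it for a cycle, since a circular wait means no vehicle can proceed. The vehicle IDs are made up, and the negotiation that resolves the deadlock is beyond this sketch.

```python
# Deadlock detection on a wait-for graph given as {vehicle: vehicle it yields to}.
# A cycle in this graph is a circular wait: every vehicle in it is blocked.

def has_deadlock(wait_for):
    """Return True if the wait-for graph contains a cycle."""
    for start in wait_for:
        seen, node = set(), start
        while node in wait_for:          # follow the chain of blockers
            if node in seen:
                return True              # revisited a vehicle -> circular wait
            seen.add(node)
            node = wait_for[node]
    return False

# Three vehicles each yielding to the next around an intersection:
deadlocked = has_deadlock({"v1": "v2", "v2": "v3", "v3": "v1"})   # True
clear = has_deadlock({"v1": "v2", "v2": "v3"})                    # False
```

Once a cycle is found, one vehicle in it must be chosen (e.g. by negotiation) to replan, breaking the circular wait.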
Robot motion and control remains a complex problem, both in general and in the field of machine learning (ML). Without ML approaches, robot controllers are typically designed manually, which can take considerable time, generally requires accounting for a range of edge cases, and often produces models highly constrained to specific tasks. ML can decrease the time it takes to create a model while simultaneously allowing it to operate on a broader range of tasks. The use of neural networks to learn from demonstration is, in particular, an approach with growing popularity due to its potential to quickly fit the parameters of a model to mimic training data.
Many such neural networks, especially transformer-based architectures, act more as planners, taking in an initial context and then generating a sequence from that context one step at a time. Others hybridize the approach, predicting a latent plan and conditioning immediate actions on that plan. Such approaches may limit a model’s ability to interact with a dynamic environment, since it needs to replan to fully update its understanding of the environmental context. In this thesis, Language-commanded Scene-aware Action Response (LanSAR) is proposed as a reactive transformer-based neural network that makes immediate decisions based on previous actions and environmental changes. Its actions are further conditioned on a language command, which serves as a control mechanism while also narrowing the distribution of possible actions around this command. It is shown that LanSAR successfully learns a strong representation of multimodal visual and spatial input and learns reasonable motions in relation to most language commands. It is also shown that LanSAR can struggle with both the accuracy of motions and understanding the specific semantics of language commands.