This collection includes both ASU Theses and Dissertations, submitted by graduate students, and Barrett, The Honors College theses, submitted by undergraduate students.


Description
Bimanual robot manipulation, involving the coordinated control of two robot arms, holds great promise for enhancing the dexterity and efficiency of robotic systems across a wide range of applications, from manufacturing and healthcare to household chores and logistics. However, enabling robots to perform complex bimanual tasks with the same level of skill and adaptability as humans remains a challenging problem. The control of a bimanual robot can be tackled through various methods, such as inverse dynamics control or reinforcement learning, but each of these methods has its own drawbacks. An inverse dynamics controller cannot adapt to a changing environment, whereas reinforcement learning is computationally intensive, may require weeks of training for even simple tasks, and depends on reward formulations that are often challenging to design and remain an open research topic. Imitation learning leverages human demonstrations to enable robots to acquire the skills necessary for complex tasks; it can be highly sample-efficient and reduces exploration. Given these advantages, this thesis explores the application of imitation learning techniques to bridge the gap between human expertise and robotic dexterity in the context of bimanual manipulation. Specifically, it examines the implicit behavioral cloning algorithm, which captures the underlying behavior or policy of the expert using energy-based models; such models frequently outperform explicit behavior cloning policies. The assessment encompasses an investigation of how the quality of expert demonstrations affects the efficacy of the learned policies, and compares the computational and performance metrics of diverse training and inference techniques for energy-based models.
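As a rough illustration of the implicit behavioral cloning idea described above, the sketch below trains an energy-based model over observation-action pairs with an InfoNCE-style loss and performs derivative-free inference by sampling candidate actions and taking the lowest-energy one. It assumes PyTorch; the names (EnergyModel, infonce_loss, infer_action) and the uniform action sampling are illustrative stand-ins, not the thesis's actual implementation.

```python
# Minimal sketch of implicit behavioral cloning with an energy-based model,
# assuming a dataset of (observation, expert_action) pairs with actions in [0, 1].
import torch
import torch.nn as nn

class EnergyModel(nn.Module):
    """Maps an (observation, action) pair to a scalar energy."""
    def __init__(self, obs_dim, act_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)

def infonce_loss(model, obs, expert_act, num_negatives=64):
    """Treat the expert action as the positive, uniform samples as negatives."""
    neg = torch.rand(obs.shape[0], num_negatives, expert_act.shape[-1])
    acts = torch.cat([expert_act.unsqueeze(1), neg], dim=1)    # (B, 1+N, act_dim)
    obs_rep = obs.unsqueeze(1).expand(-1, acts.shape[1], -1)
    energies = model(obs_rep, acts)                            # (B, 1+N)
    # Lower energy = more likely: softmax over negated energies, expert at index 0.
    targets = torch.zeros(obs.shape[0], dtype=torch.long)
    return nn.functional.cross_entropy(-energies, targets)

@torch.no_grad()
def infer_action(model, obs, num_samples=1024):
    """Derivative-free inference: sample candidates, return the lowest-energy one."""
    act_dim = model.net[0].in_features - obs.shape[-1]
    cands = torch.rand(num_samples, act_dim)
    energies = model(obs.expand(num_samples, -1), cands)
    return cands[energies.argmin()]
```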
Contributors: Rayavarapu, Ravi Swaroop (Author) / Amor, Heni Ben (Thesis advisor) / Gopalan, Nakul (Committee member) / Senanayake, Ransalu (Committee member) / Arizona State University (Publisher)
Created: 2023
Description
As robots become increasingly integrated into human environments, they need to learn how to interact with the objects around them. Many of these objects are articulated with multiple degrees of freedom (DoF). Multi-DoF objects have complex joints that require specific manipulation orders, but existing methods only consider objects with a single joint. To capture the joint structure and manipulation sequence of any object, I introduce Object Kinematic State Machines (OKSMs), a novel representation that models the kinematic constraints and manipulation sequences of multi-DoF objects. I also present Pokenet, a deep neural network architecture that estimates OKSMs from sequences of point cloud data of human demonstrations. I conduct experiments on both simulated and real-world datasets to validate my approach. First, I evaluate the modeling of multi-DoF objects on a simulated dataset, comparing against the current state-of-the-art method. I then assess Pokenet's real-world usability on a dataset collected in my lab, comprising 5,500 data points across four objects. Results show that my method estimates the joint parameters of novel multi-DoF objects over 25% more accurately on average than prior methods.
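To make the state-machine idea concrete, here is a minimal sketch of an OKSM-like structure: joints with types and parameters, plus a prerequisite relation that encodes manipulation order. The schema and field names are assumptions for exposition, not the thesis's actual data structures.

```python
# Illustrative OKSM-style representation: joint parameters + manipulation ordering.
from dataclasses import dataclass, field

@dataclass
class Joint:
    name: str
    joint_type: str              # "revolute" or "prismatic"
    axis: tuple                  # unit direction of the joint axis
    origin: tuple                # a point on the joint axis
    limits: tuple = (0.0, 1.0)   # joint travel range

@dataclass
class OKSM:
    joints: dict = field(default_factory=dict)
    prerequisites: dict = field(default_factory=dict)  # joint -> joints moved first

    def add_joint(self, joint, after=()):
        self.joints[joint.name] = joint
        self.prerequisites[joint.name] = list(after)

    def manipulation_order(self):
        """Topologically sort joints so prerequisites are actuated first."""
        order, visited = [], set()
        def visit(name):
            if name in visited:
                return
            visited.add(name)
            for dep in self.prerequisites[name]:
                visit(dep)
            order.append(name)
        for name in self.joints:
            visit(name)
        return order

# Example: a cabinet door must swing open before the inner drawer can slide out.
cabinet = OKSM()
cabinet.add_joint(Joint("door", "revolute", (0, 0, 1), (0.4, 0.0, 0.0)))
cabinet.add_joint(Joint("drawer", "prismatic", (1, 0, 0), (0.0, 0.2, 0.0)), after=["door"])
print(cabinet.manipulation_order())  # ['door', 'drawer']
```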
Contributors: Gupta, Anmol (Author) / Gopalan, Nakul (Thesis advisor) / Zhang, Yu (Committee member) / Wang, Yalin (Committee member) / Arizona State University (Publisher)
Created: 2024
Description
Manipulator motion planning has conventionally been solved using sampling- and optimization-based algorithms that are agnostic to embodiment and environment configurations. However, these algorithms plan on a fixed environment representation approximated using shape primitives, and hence struggle to find solutions for cluttered and dynamic environments. Furthermore, they fail to produce solutions for complex unstructured environments under real-time bounds. Neural Motion Planners (NMPs) are an appealing alternative to algorithmic approaches, as they can leverage parallel computing for planning while incorporating arbitrary environmental constraints directly from raw sensor observations. Contemporary NMPs successfully transfer to different environment variations; however, they fail to generalize across embodiments. This thesis proposes "AnyNMP", a generalist motion planning policy for zero-shot transfer across different robotic manipulators and environments. The policy is conditioned on a semantically segmented 3D point cloud representation of the workspace, enabling implicit sim2real transfer. In the proposed approach, templates are formulated for manipulator kinematics, and ground truth motion plans are collected for over 3 million procedurally sampled robots in randomized environments. The planning pipeline consists of a state validation model for differentiable collision detection and a sampling-based planner for motion generation. AnyNMP has been validated on five different commercially available manipulators and demonstrates successful cross-embodiment planning, achieving an 80% average success rate on baseline benchmarks.
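A hedged sketch of how the two pipeline stages could fit together: a learned state validator queried inside a simple sampling-based (RRT-style) planner. The validator below is a toy stand-in for a trained network conditioned on the segmented workspace point cloud, and all function names are assumptions for exposition.

```python
# RRT-style planner that defers collision checking to a (stubbed) learned validator.
import numpy as np

def neural_state_valid(q, pointcloud):
    """Placeholder for a learned collision check; True if q is collision-free.
    A real model would score (q, pointcloud) with a trained network."""
    return np.linalg.norm(q) < np.pi  # toy rule standing in for the network

def plan(start, goal, pointcloud, step=0.1, max_iters=5000,
         rng=np.random.default_rng(0)):
    """Grow a tree of validated configurations toward the goal."""
    nodes, parents = [np.asarray(start, float)], {0: None}
    goal = np.asarray(goal, float)
    for _ in range(max_iters):
        target = goal if rng.random() < 0.1 else rng.uniform(-np.pi, np.pi, len(start))
        i = min(range(len(nodes)), key=lambda k: np.linalg.norm(nodes[k] - target))
        direction = target - nodes[i]
        q_new = nodes[i] + step * direction / (np.linalg.norm(direction) + 1e-9)
        if not neural_state_valid(q_new, pointcloud):
            continue
        nodes.append(q_new)
        parents[len(nodes) - 1] = i
        if np.linalg.norm(q_new - goal) < step:  # reached goal region: backtrack
            path, j = [], len(nodes) - 1
            while j is not None:
                path.append(nodes[j])
                j = parents[j]
            return path[::-1]
    return None
```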
Contributors: Rath, Prabin Kumar (Author) / Gopalan, Nakul (Thesis advisor) / Yu, Hongbin (Thesis advisor) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created: 2024
Description
Learning longer-horizon tasks is challenging with techniques such as reinforcement learning and behavior cloning. Previous approaches have split these long tasks into shorter tasks that are easier to learn by using statistical change point detection methods. However, classical change point detection methods function only with low-dimensional robot trajectory data, not with high-dimensional inputs such as vision. In this thesis, I split long-horizon tasks, represented as trajectories, into short-horizon sub-tasks under the supervision of language. These shorter-horizon tasks can then be learned using conventional behavior cloning approaches. I compare techniques from the video moment retrieval problem against change point detection on high-dimensional robot trajectory data. The proposed moment-retrieval-based approach shows a more than 30% improvement in mean average precision (mAP) for identifying trajectory sub-tasks with language guidance compared to without it. Several ablations are performed to understand the effects of domain randomization, sample complexity, views, and sim-to-real transfer of this method. The data ablation shows that a 42.01 mAP can be achieved with just 100 labeled trajectories, demonstrating the sample efficiency of this approach. Further, behavior cloning models trained on the segmented trajectories outperform a single model trained on the whole trajectory by up to 20%.
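To illustrate the moment-retrieval framing, the toy sketch below scores every (start, end) span of a trajectory against a language embedding and returns the best match. The span encoder and embeddings are stand-ins for trained encoders, not the thesis's models.

```python
# Moment-retrieval-style sub-task segmentation over a feature trajectory.
import numpy as np

def embed_span(traj, s, e):
    """Stand-in span encoder: mean-pool the trajectory features in [s, e)."""
    return traj[s:e].mean(axis=0)

def retrieve_moment(traj, lang_emb, min_len=5):
    """Return the (start, end) span whose pooled features best match the instruction."""
    best, best_score = None, -np.inf
    for s in range(len(traj) - min_len):
        for e in range(s + min_len, len(traj) + 1):
            v = embed_span(traj, s, e)
            score = v @ lang_emb / (np.linalg.norm(v) * np.linalg.norm(lang_emb) + 1e-9)
            if score > best_score:
                best, best_score = (s, e), score
    return best, best_score

# Toy usage: spans overlapping steps 10-20 should score highest.
traj = np.random.default_rng(0).normal(size=(40, 16))  # 40 steps of 16-dim features
lang = traj[10:20].mean(axis=0)                        # pretend instruction embedding
print(retrieve_moment(traj, lang))
```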
Contributors: Raj, Divyanshu (Author) / Gopalan, Nakul (Thesis advisor) / Baral, Chitta (Committee member) / Senanayake, Ransalu (Committee member) / Arizona State University (Publisher)
Created: 2024
Description
Grasping objects in a general household setting is a dexterous task: high compliance is needed to generate a grasp that leads to grasp closure. Standard 6 degree-of-freedom (DoF) manipulators with parallel grippers are naturally incapable of such dexterity. This renders many objects in household settings difficult to grasp, as the manipulator cannot access readily available antipodal (planar) grasps. In such scenarios, one must either use a high-DoF end effector to learn this compliance or change the initial configuration of the object to expose an antipodal grasp. This thesis proposes a pipeline that uses the extrinsic forces present in the environment to make up for this lack of compliance. The proposed method: i) takes the point cloud input from the environment and creates a search space of all available object poses, from which a grasp score network identifies the best graspable pose; ii) learns how to approach an object and generates an appropriate set of motor primitives that converts the current ungraspable pose into a graspable one; and iii) runs a naive grasp detection network to verify the proposed method and subsequently grasp the initially ungraspable object. By integrating these components, objects that were initially ungraspable with a standard grasp detection model, DexNet, are no longer ungraspable.
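The three stages could be organized roughly as below. Every callable (grasp score network, primitive policy, grasp detector) is a stand-in stub, and the names and signatures are assumptions rather than the thesis's actual interfaces.

```python
# Schematic three-stage extrinsic-dexterity grasp pipeline with stubbed models.
def best_graspable_pose(pointcloud, candidate_poses, grasp_score_net):
    """Stage i: score every candidate object pose; keep the most graspable one."""
    return max(candidate_poses, key=lambda pose: grasp_score_net(pointcloud, pose))

def reorient(current_pose, target_pose, primitive_policy):
    """Stage ii: produce motor primitives (e.g. pushes against the environment)
    that move the object from its ungraspable pose toward the target pose."""
    return primitive_policy(current_pose, target_pose)

def grasp_if_detected(pointcloud, grasp_detector, execute):
    """Stage iii: re-run a grasp detector on the reoriented object and execute."""
    grasp = grasp_detector(pointcloud)
    if grasp is not None:
        execute(grasp)
    return grasp

# Toy usage with stand-in models:
poses = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.57)]
score = lambda pc, pose: -abs(pose[2] - 1.57)   # pretend the tilted pose scores best
print(best_graspable_pose(None, poses, score))  # -> (0.0, 0.0, 1.57)
```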
Contributors: Sah, Anant (Author) / Gopalan, Nakul (Thesis advisor) / Zhang, Wenlong (Committee member) / Senanayake, Ransalu (Committee member) / Arizona State University (Publisher)
Created: 2024