Pose Estimation with Convolutional Neural Networks

Description

Convolutional neural networks boast a myriad of applications in artificial intelligence, but one of the most common uses for such networks is image feature extraction. The ability of convolutional layers to extract and combine data features for the purpose of image analysis can be leveraged for pose estimation on an object: detecting the presence and attitude of corners and edges allows a convolutional neural network to identify how an object is positioned. This task can help a robot grasp an object correctly, or track an object more accurately in 3D space. However, the effectiveness of pose estimation may change based on properties of the object; the pose of a complex object, where complexity is determined by factors such as internal occlusions and similar faces, can be difficult to resolve.
This thesis is part of a collaboration between ASU’s Interactive Robotics Laboratory and NASA’s Jet Propulsion Laboratory. In this thesis, the training pipeline from Sharma’s paper “Pose Estimation for Non-Cooperative Spacecraft Rendezvous Using Convolutional Neural Networks” was modified to perform pose estimation on a complex object - specifically, a segment of a hollow truss. After initial attempts to replicate the architecture used in the paper and train solely on synthetic images, a combination of synthetic dataset generation and transfer learning on an ImageNet-pretrained AlexNet model was implemented to mitigate the difficulty of gathering large amounts of real-world data. Experimentation with pose estimation accuracy and hyperparameters of the model resulted in gradual test accuracy improvement, and future work is suggested to improve pose estimation for complex objects with some form of rotational symmetry.

Contributors: Dsouza, Susanna Roshini (Author) / Ben Amor, Hani (Thesis director) / Maneparambil, Kailasnath (Committee member) / Computer Science and Engineering Program (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Barrett, The Honors College (Contributor)

Created: 2019-05

Beyond Deep Learning: Synthesizing Navigation Programs using Neural Turing Machines

Description

This thesis aims to improve neural control policies for self-driving cars. State-of-the-art navigation software for self-driving cars is based on deep neural networks trained on a dataset of past driving experience in various situations. With previous methods, the car can only make decisions based on short-term memory. To address this problem, we propose using a Neural Turing Machine (NTM) framework to add long-term memory to the system. We evaluated this approach on a palindrome task, where the network learned to create palindromes with 100% accuracy. Since the NTM structure proves useful, we aim to use it in driving scenarios to improve the navigation safety and accuracy of a simulated autonomous car.

Contributors: Martin, Sarah (Author) / Ben Amor, Hani (Thesis director) / Fainekos, Georgios (Committee member) / Barrett, The Honors College (Contributor)

Created: 2018-05

3D Printed Robotic Arm

Description

For those interested in the field of robotics, there are not many options to get your hands on a physical robot without paying a steep price. This is why the folks at BCN3D Technologies decided to design a fully open-source 3D-printable robotic arm. Their goal was to reduce the barrier to entry for the field of robotics and make it far more accessible for people around the world. For our honors thesis, we chose to take the design from BCN3D and attempt to build their robot, to see how accessible the design truly is. Although their designs were not perfect and we were forced to make some adjustments to the 3D files, overall the work put forth by the people at BCN3D was extremely useful in successfully building a robotic arm that can be programmed with ease.

Contributors: Cohn, Riley (Co-author) / Petty, Charles (Co-author) / Ben Amor, Hani (Thesis director) / Yong, Sze Zheng (Committee member) / Computer Science and Engineering Program (Contributor) / Mechanical and Aerospace Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2017-12

Fielding an Autonomous Cobot in a University Environment: Engineering and Evaluation

Description

Many researchers aspire to create robotics systems that assist humans in common office tasks, especially by taking over delivery and messaging tasks. For meaningful interactions to take place, a mobile robot must be able to identify the humans it interacts with and communicate successfully with them. It must also be able to successfully navigate the office environment. While mobile robots are well suited for navigating and interacting with elements inside a deterministic office environment, interacting with human beings in an office environment remains a challenge because of the limited amount of cost-efficient compute power onboard the robot. In this work, I propose the use of remote cloud services to offload intensive interaction tasks. I detail the interactions required in an office environment and discuss the challenges faced when implementing a human-robot interaction platform in a stochastic office environment. I also experiment with cloud services for facial recognition, speech recognition, and environment navigation and discuss my results. As part of my thesis, I have integrated a human-robot interaction system that uses cloud APIs into a mobile robot, enabling it to navigate the office environment, identify humans within the environment, and communicate with these humans.

Contributors: DSouza, Daniel Anand (Author) / Kambhampati, Subbarao (Thesis director) / Zhang, Yu (Committee member) / Computer Science and Engineering Program (Contributor) / School of Computing, Informatics, and Decision Systems Engineering (Contributor) / Barrett, The Honors College (Contributor)

Created: 2017-05

Generative Models for Trajectory Prediction

Description

Trajectory forecasting is used in many fields, such as vehicle trajectory prediction, stock market price prediction, and human motion prediction. The ability of robots to reason about human behavior is also an important aspect of human-robot interaction. In trajectory prediction for human motion, the major challenge is the implicit learning and reproduction of human behavior. This work compares some of the recent advances that take a phenomenological approach to trajectory prediction.
The work mainly targets generating future events or trajectories based on previous data observed across many time intervals. In particular, it presents and compares machine learning models that generate various human handwriting trajectories. Although the behavior of every individual is unique, it is still possible to broadly generalize and learn the underlying human behavior from current observations to predict future handwriting trajectories. This enables a machine or robot to generate future handwriting trajectories given an initial trajectory from the individual, thus helping the person complete the rest of the letter or curve. This work tests and compares the performance of Conditional Variational Autoencoders and Sinusoidal Representation Network models on handwriting trajectory prediction and reconstruction.

Contributors: Kota, Venkata Anil (Author) / Ben Amor, Hani (Thesis advisor) / Venkateswara, Hemanth Kumar Demakethepalli (Committee member) / Redkar, Sangram (Committee member) / Arizona State University (Publisher)

Created: 2021

Learning Complex Behaviors from Simple Ones: An analysis of Behavior-based Modular Design for RL Agents

Description

Traditional Reinforcement Learning (RL) learns policies with respect to the reward available from the environment, but sometimes learning in a complex domain requires wisdom that comes from a wide range of experience. In behavior-based robotics, it is observed that a complex behavior can be described by a combination of simpler behaviors. It is tempting to apply a similar idea, so that simpler behaviors are combined in a meaningful way to produce the desired complex behavior. Such an approach would enable faster learning and modular design of behaviors. Complex behaviors can be combined with other behaviors to create even more advanced behaviors, resulting in a rich set of possibilities. As in RL, the combined behavior can keep evolving by interacting with the environment. The only requirement of this method is to specify a reasonable set of simple behaviors. In this research, I present an algorithm that combines behaviors such that the resulting behavior has characteristics of each individual behavior. This approach is inspired by behavior-based robotics, such as the subsumption architecture and motor schema-based design. The combination algorithm outputs n weights to combine behaviors linearly. The weights are state dependent and change dynamically at every step in an episode. This idea is tested on discrete and continuous environments like OpenAI’s “Lunar Lander” and “Biped Walker”. Results are compared with related domains like Multi-objective RL, Hierarchical RL, Transfer learning, and basic RL. It is observed that the combination of behaviors is a novel way of learning which helps the agent achieve the required characteristics. Because a combination is learned for a given state, the agent is able to learn faster than with other similar approaches. The agent demonstrates characteristics of multiple behaviors, which helps it learn and adapt to the environment.
Future directions are also suggested as possible extensions to this research.
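The linear, state-dependent combination described in this abstract can be illustrated with a minimal sketch. This is not the thesis's algorithm: the behaviors `hover` and `center`, the two-element state, and the softmax-based `example_weights` function are all hypothetical stand-ins chosen for illustration, and in the thesis the weights are learned rather than hand-written.

```python
import math

def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def combine_behaviors(state, behaviors, weight_fn):
    """Blend the actions of n behaviors using n state-dependent weights."""
    weights = weight_fn(state)                 # one weight per behavior
    actions = [b(state) for b in behaviors]    # one action vector per behavior
    dim = len(actions[0])
    return [sum(w * a[i] for w, a in zip(weights, actions)) for i in range(dim)]

# Hypothetical simple behaviors; each maps a state (altitude, drift)
# to an action [main_thrust, side_thrust].
def hover(state):
    altitude, _ = state
    return [1.0 - altitude, 0.0]

def center(state):
    _, drift = state
    return [0.0, -drift]

# Hypothetical weight function (an assumption, not from the thesis):
# favor "center" when drift is large and "hover" when altitude is low.
def example_weights(state):
    altitude, drift = state
    return softmax([-altitude, abs(drift)])

if __name__ == "__main__":
    state = (0.5, 2.0)
    print(example_weights(state))
    print(combine_behaviors(state, [hover, center], example_weights))
```

Because the weights are recomputed from the state at every step, the blend can shift between behaviors within a single episode, which is the property the abstract emphasizes.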

Contributors: Vora, Kevin Jatin (Author) / Zhang, Yu (Thesis advisor) / Yang, Yezhou (Committee member) / Praharaj, Sarbeswar (Committee member) / Arizona State University (Publisher)

Created: 2021

Topology Processing of Retinotopic Maps

Description

The retinotopic map, the map between visual inputs on the retina and neuronal activation in the brain's visual areas, is one of the central topics in visual neuroscience. For human observers, the map is typically obtained by analyzing functional magnetic resonance imaging (fMRI) signals of cortical responses to slowly moving visual stimuli on the retina. Biological evidence shows that retinotopic mapping is topology-preserving, or topological (i.e., neighboring relationships are preserved after processing by the human brain), within each visual region. Unfortunately, due to the limited spatial resolution and signal-to-noise ratio of fMRI, state-of-the-art retinotopic maps are not topological. This work models the topology-preserving condition mathematically, fixes non-topological retinotopic maps with numerical methods, and improves the quality of retinotopic maps. Imposing the topological condition benefits several applications. With topological retinotopic maps, one may gain better insight into human retinotopic maps, including better quantification of the cortical magnification factor, more precise descriptions of retinotopic maps, and potentially better examination methods in ophthalmology clinics.

Contributors: Tu, Yanshuai (Author) / Wang, Yalin (Thesis advisor) / Lu, Zhong-Lin (Committee member) / Crook, Sharon (Committee member) / Yang, Yezhou (Committee member) / Zhang, Yu (Committee member) / Arizona State University (Publisher)

Created: 2022

Traffic Accident Reconstruction Using Monocular Dashcam Videos

Description

Automated driving systems (ADS) have come a long way since their inception. It is clear that these systems rely heavily on stochastic deep learning techniques for perception, planning, and prediction, as it is impossible to construct every possible driving scenario to generate driving policies. Moreover, these systems need to be trained and validated extensively on typical and abnormal driving situations before they can be trusted with human life. However, most publicly available driving datasets consist only of typical driving behaviors. On the other hand, there is a plethora of videos available on the internet that capture abnormal driving scenarios, but they are unusable for ADS training or testing because they lack important information such as camera calibration parameters and annotated vehicle trajectories. This thesis proposes a new toolbox, DeepCrashTest-V2, that is capable of reconstructing high-quality simulations from monocular dashcam videos found on the internet. The toolbox not only estimates crucial parameters such as camera calibration, ego-motion, and surrounding road user trajectories but also creates a virtual world in Car Learning to Act (CARLA) using data from OpenStreetMap to simulate the estimated trajectories. The toolbox is open-source and is made available in the form of a Python package on GitHub at https://github.com/C-Aniruddh/deepcrashtest_v2.

Contributors: Chandratre, Aniruddh Vinay (Author) / Fainekos, Georgios (Thesis advisor) / Ben Amor, Hani (Thesis advisor) / Pedrielli, Giulia (Committee member) / Arizona State University (Publisher)

Created: 2022

Autonomous System Control of Multiple Robotic Arms Collaboration via Machine Learning

Description

Multiple robotic arm collaboration is the problem of controlling multiple robotic arms so that they work together on the same task. During the collaboration, the agent is required to avoid all possible collisions between the parts of the robotic arms. Thus, incentivizing collaboration and preventing collisions are the two principles the agent follows during the training process. More and more applications, both in industry and in daily life, require at least two arms rather than a single arm. A dual-arm robot can handle a much wider range of tasks, such as folding clothes at home, making a hamburger on a grill, or picking and placing a product in a warehouse. The applications in this thesis all involve object pushing: the thesis focuses on training the agent to push an object as far away as possible. Reinforcement Learning (RL), a type of Machine Learning (ML), is used to train the agent to generate optimal actions. Deep Deterministic Policy Gradient (DDPG) and Hindsight Experience Replay (HER) are the two RL methods used in this thesis.

Contributors: Lin, Steve (Author) / Ben Amor, Hani (Thesis advisor) / Redkar, Sangram (Committee member) / Zhang, Yu (Committee member) / Arizona State University (Publisher)

Created: 2023

Machine Learning for Hardware-Constrained Wireless Communication Systems

Description

Millimeter wave (mmWave) and massive multiple-input multiple-output (MIMO) systems are intrinsic components of 5G and beyond. These systems rely on using beamforming codebooks for both initial access and data transmission. Current beam codebooks, however, are not optimized for the given deployment, which can sometimes incur noticeable performance loss. To address these problems, in this dissertation, three novel machine learning (ML) based frameworks for site-specific analog beam codebook design are proposed. In the first framework, two special neural network-based architectures are designed for learning environment- and hardware-aware beam codebooks through supervised and self-supervised learning, respectively. To avoid explicitly estimating the channels, in the second framework, a deep reinforcement learning-based architecture is developed. The proposed solution significantly relaxes the system requirements and is particularly interesting in scenarios where channel acquisition is challenging. Building upon it, in the third framework, a sample-efficient online reinforcement learning-based beam codebook design algorithm is developed that learns how to shape the beam patterns to null the interfering directions, without requiring any coordination with the interferers. In the last part of the dissertation, the proposed beamforming framework is further extended to tackle the beam focusing problem in near-field wideband systems. Specifically, the developed solution can achieve beam focusing without knowing the user position and can account for unknown and non-uniform array geometry. All the frameworks are numerically evaluated, and the simulation results highlight their potential for learning site-specific codebooks that adapt to the deployment. Furthermore, a hardware proof-of-concept prototype based on mmWave phased arrays is built and used to evaluate the developed online beam learning solutions in realistic scenarios.
The learned beam patterns, measured in an anechoic chamber, show the performance gains of the developed framework. Together, these results highlight a promising ML-based beam/codebook optimization direction for practical and hardware-constrained mmWave and terahertz systems.
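For readers unfamiliar with how a beam codebook is used, the following is a minimal sketch of conventional beam selection from a fixed, non-learned codebook, the kind of baseline that site-specific codebooks aim to improve on. It is not the dissertation's method. The half-wavelength uniform linear array, the single-path channel, and every function name below are assumptions made for illustration.

```python
import cmath
import math

def steering_vector(num_antennas, angle_rad):
    """Array response of a half-wavelength-spaced uniform linear array."""
    return [cmath.exp(1j * math.pi * n * math.sin(angle_rad))
            for n in range(num_antennas)]

def make_codebook(num_antennas, angles_rad):
    """One phase-only (constant-modulus) unit-norm beam per steering angle."""
    scale = 1.0 / math.sqrt(num_antennas)
    return [[scale * a for a in steering_vector(num_antennas, angle)]
            for angle in angles_rad]

def beamforming_gain(beam, channel):
    """Return |w^H h|^2 for beam w and channel h."""
    inner = sum(w.conjugate() * h for w, h in zip(beam, channel))
    return abs(inner) ** 2

def best_beam(codebook, channel):
    """Exhaustive search: pick the beam with the highest gain."""
    gains = [beamforming_gain(w, channel) for w in codebook]
    index = max(range(len(codebook)), key=gains.__getitem__)
    return index, gains[index]

if __name__ == "__main__":
    angles = [math.radians(d) for d in (-60, -30, 0, 30, 60)]
    codebook = make_codebook(8, angles)
    # Assumed single-path channel arriving from 30 degrees.
    channel = steering_vector(8, math.radians(30))
    print(best_beam(codebook, channel))
```

A fixed grid of steering angles like this ignores the deployment's actual channel statistics and hardware impairments, which is the gap the abstract says learned codebooks are meant to close.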

Contributors: Zhang, Yu (Author) / Alkhateeb, Ahmed AA (Thesis advisor) / Tepedelenlioglu, Cihan CT (Committee member) / Bliss, Daniel DB (Committee member) / Dasarathy, Gautam GD (Committee member) / Arizona State University (Publisher)

Created: 2023