Filtering by
- All Subjects: Machine Learning
- All Subjects: Human-Robot Interaction
- All Subjects: Reinforcement Learning
- Creators: Yang, Yezhou
- Member of: Theses and Dissertations
to collaborate to perform a task, it becomes essential for a robot to be aware of multiple
agents working in its work environment. A robot must also learn to adapt to
different agents in the workspace and conduct its interaction based on the presence
of these agents. A theoretical framework was introduced which performs interaction
learning from demonstrations in a two-agent work environment, and it is called
Interaction Primitives.
This document is an in-depth description of the new state of the art Python
Framework for Interaction Primitives between two agents in a single as well as multiple
task work environment and extension of the original framework in a work environment
with multiple agents doing a single task. The original theory of Interaction
Primitives has been extended to create a framework which will capture correlation
between more than two agents while performing a single task. The new state of the
art Python framework is an intuitive, generic, easy to install and easy to use python
library which can be applied to use the Interaction Primitives framework in a work
environment. This library was tested in simulated environments and controlled laboratory
environment. The results and benchmarks of this library are available in the
related sections of this document.
The feature extraction processes can be categorized into three groups. The first group contains processes that are hand-crafted for a specific task. Hand-engineering features requires the knowledge of domain experts and manual labor. However, the feature extraction process is interpretable and explainable. Next group contains the latent-feature extraction processes. While the original feature lies in a high-dimensional space, the relevant factors for a task often lie on a lower dimensional manifold. The latent-feature extraction employs hidden variables to expose the underlying data properties that cannot be directly measured from the input. Latent features seek a specific structure such as sparsity or low-rank into the derived representation through sophisticated optimization techniques. The last category is that of deep features. These are obtained by passing raw input data with minimal pre-processing through a deep network. Its parameters are computed by iteratively minimizing a task-based loss.
In this dissertation, I present four pieces of work where I create and learn suitable data representations. The first task employs hand-crafted features to perform clinically-relevant retrieval of diabetic retinopathy images. The second task uses latent features to perform content-adaptive image enhancement. The third task ranks a pair of images based on their aestheticism. The goal of the last task is to capture localized image artifacts in small datasets with patch-level labels. For both these tasks, I propose novel deep architectures and show significant improvement over the previous state-of-art approaches. A suitable combination of feature representations augmented with an appropriate learning approach can increase performance for most visual computing tasks.
Content detection on handwritten documents assigns a particular class to a homogeneous portion of the document. To complete this task, a set of handwritten solutions was digitally collected from middle school students located in two different geographical regions in 2017 and 2018. This research discusses the methods to collect, pre-process and detect content type in the collected handwritten documents. A total of 4049 documents were extracted in the form of image, and json format; and were labelled using an object labelling software with tags being text, math, diagram, cross out, table, graph, tick mark, arrow, and doodle. The labelled images were fed to the Tensorflow’s object detection API to learn a neural network model. We show our results from two neural networks models, Faster Region-based Convolutional Neural Network (Faster R-CNN) and Single Shot detection model (SSD).
Machine learning has a near infinite number of applications, of which the potential has yet to have been fully harnessed and realized. This thesis will outline two departments that machine learning can be utilized in, and demonstrate the execution of one methodology in each department. The first department that will be described is self-play in video games, where a neural model will be researched and described that will teach a computer to complete a level of Super Mario World (1990) on its own. The neural model in question was inspired by the academic paper “Evolving Neural Networks through Augmenting Topologies”, which was written by Kenneth O. Stanley and Risto Miikkulainen of University of Texas at Austin. The model that will actually be described is from YouTuber SethBling of the California Institute of Technology. The second department that will be described is cybersecurity, where an algorithm is described from the academic paper “Process Based Volatile Memory Forensics for Ransomware Detection”, written by Asad Arfeen, Muhammad Asim Khan, Obad Zafar, and Usama Ahsan. This algorithm utilizes Python and the Volatility framework to detect malicious software in an infected system.