Study of Knowledge Transfer Techniques For Deep Learning on Edge Devices

Description

With the emergence of the edge computing paradigm, many applications such as image recognition and augmented reality need to perform machine learning (ML) and artificial intelligence (AI) tasks on edge devices. Most AI and ML models are large and computationally heavy, whereas edge devices usually have limited computational and storage resources. Such models can be compressed and reduced to fit on edge devices, but they may lose capability and may not generalize or perform as well as large models. Recent work has used knowledge transfer techniques to transfer information from a large network (termed the teacher) to a small one (termed the student) in order to improve the performance of the latter. This approach seems promising for learning on edge devices, but a thorough investigation of its effectiveness is lacking.

The purpose of this work is to provide an extensive study on the performance (both in terms of accuracy and convergence speed) of knowledge transfer, considering different student-teacher architectures, datasets and different techniques for transferring knowledge from teacher to student.

A good performance improvement is obtained by transferring knowledge from both the intermediate layers and the last layer of the teacher to a shallower student. Other architectures and transfer techniques do not fare as well, and some even have a negative impact on performance. For example, a smaller and shorter network trained with knowledge transfer on Caltech 101 achieved a significant accuracy improvement of 7.36% and converged 16 times faster than the same network trained without knowledge transfer. On the other hand, a smaller network that is thinner than the teacher network performed worse, with an accuracy drop of 9.48% on Caltech 101, even with knowledge transfer.
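
The combination of last-layer and intermediate-layer transfer described above can be sketched as a single training loss. This is a minimal illustration, not the thesis's implementation: the abstract does not state which loss functions were used, so the softened-output term, the mean-squared "hint" term, and the names and default values of `temperature`, `alpha`, and `beta` are all assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature gives softer targets."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs):
    """Cross-entropy of the predicted distribution against a target distribution."""
    return -sum(t * math.log(p) for t, p in zip(target_probs, predicted_probs) if t > 0)


def hint_loss(teacher_features, student_features):
    """Mean squared error between intermediate-layer activations of equal length."""
    pairs = list(zip(teacher_features, student_features))
    return sum((t - s) ** 2 for t, s in pairs) / len(pairs)


def transfer_loss(student_logits, teacher_logits, label,
                  student_features, teacher_features,
                  temperature=4.0, alpha=0.5, beta=0.1):
    """Assumed weighting of three terms; alpha, beta, temperature are illustrative."""
    # Hard-label term: ordinary cross-entropy against the true class.
    hard = -math.log(softmax(student_logits)[label])
    # Last-layer term: match the teacher's softened output distribution.
    soft = cross_entropy(softmax(teacher_logits, temperature),
                         softmax(student_logits, temperature))
    # Intermediate-layer term: pull student activations toward the teacher's.
    hint = hint_loss(teacher_features, student_features)
    return alpha * hard + (1 - alpha) * soft + beta * hint
```

In this sketch a student whose outputs and intermediate activations agree with the teacher receives a lower loss than one that disagrees, which is the signal that lets a small network learn from a large one. It also assumes the teacher and student feature vectors have the same length; when they differ, some projection between them would be needed.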

Contributors: Sistla, Ragini (Author) / Zhao, Ming (Thesis advisor, Committee member) / Li, Baoxin (Committee member) / Tong, Hanghang (Committee member) / Arizona State University (Publisher)

Created: 2018

Data Management Behind Machine Learning

Description

This thesis explores the functionality of a single-layer artificial neural network through a simple housing price classification example, while considering its impact from a data management perspective at both the software and hardware level. To begin, the universally accepted model of an artificial neuron is broken down into its key components and analyzed for functionality by relating it back to its biological counterpart. The role of a neuron is then described in the context of a neural network, with equal emphasis placed on how a single neuron is trained and how an entire network is trained. Using supervised learning, the neural network is trained on three main factors for housing price classification: a house's total number of rooms, number of bathrooms, and square footage. Once trained with most of the generated data set, the network is tested for accuracy by introducing the remainder of the data set and observing how closely its computed output for each set of inputs matches the target value. From a programming perspective, the artificial neuron is implemented in C so that it is more closely tied to the operating system, which makes the profiler data collected during the program's execution more precise. The program is designed to break down each stage of the neuron's training process into distinct functions. In addition, the struct data type is used as the underlying data structure, both to represent the neuron and to hold the neuron's training and test data. Once the neuron is fully trained, its test results are graphed to show how well it learned from its sample training set. Finally, the profiler data is analyzed to describe how the program operated from a data management perspective at the software and hardware level.
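
The training process for a single neuron on the three housing inputs can be sketched as follows. The thesis implements its neuron in C; this Python version is only an illustration, and the sample data, the scaling constants, the sigmoid activation, and the learning rate are all assumptions rather than details taken from the thesis.

```python
import math

# Hypothetical samples: (rooms, bathrooms, square feet) -> 1 if "high price", else 0.
SAMPLES = [
    ((3, 1, 900), 0), ((4, 2, 1200), 0), ((5, 2, 1500), 0),
    ((7, 3, 2600), 1), ((8, 3, 2800), 1), ((9, 4, 3500), 1),
]
SCALE = (10.0, 5.0, 4000.0)  # assumed divisors that bring each input to roughly 0..1


def normalize(features):
    """Scale raw inputs so that no single feature dominates the weighted sum."""
    return [value / scale for value, scale in zip(features, SCALE)]


def predict(weights, bias, features):
    """Weighted sum of the inputs passed through a sigmoid activation."""
    total = bias + sum(w * x for w, x in zip(weights, normalize(features)))
    return 1.0 / (1.0 + math.exp(-total))


def train(samples, epochs=2000, learning_rate=0.5):
    """Supervised training: nudge weights to shrink the output error per sample."""
    weights, bias = [0.0, 0.0, 0.0], 0.0
    for _ in range(epochs):
        for features, target in samples:
            error = predict(weights, bias, features) - target
            inputs = normalize(features)
            weights = [w - learning_rate * error * x for w, x in zip(weights, inputs)]
            bias -= learning_rate * error
    return weights, bias
```

Calling `weights, bias = train(SAMPLES)` and then `predict(weights, bias, (8, 3, 2800))` gives an output between 0 and 1 that can be compared with the target value, mirroring the test stage described above. A real experiment would hold out part of the data for testing instead of reusing the training samples.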

Contributors: Richards, Nicholas Giovanni (Author) / Miller, Phillip (Thesis director) / Meuth, Ryan (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2018-05

Learning the Initial Lexicon in Translating Natural Language to Formal Language

Description

The objective of this research is to determine an approach for automating the learning of the initial lexicon used in translating natural language sentences to their formal knowledge representations based on lambda-calculus expressions. Using a universal knowledge representation and its associated parser, this research attempts to use word alignment techniques to align natural language sentences to the linearized parses of their associated knowledge representations in order to learn the meanings of individual words. The work includes proposing and analyzing an approach that can be used to learn some of the initial lexicon.

Contributors: Baldwin, Amy Lynn (Author) / Baral, Chitta (Thesis director) / Vo, Nguyen (Committee member) / Industrial, Systems (Contributor) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2015-05

Raspberry Pi Radio: Programming a Multiple Source Music Player

Description

The purpose of this project was to program a Raspberry Pi to play music both from local storage on the Pi and from internet radio stations such as Pandora. The Pi also needed to play various file formats, such as mp3 and FLAC. Finally, the project was to be driven by a mobile app running on a smartphone or tablet. To achieve this, a client-server design was employed in which the Raspberry Pi acts as the server and the mobile app is the client. The server functionality was achieved with a Python script that listens on a socket and calls various executables that handle the different music formats. The client functionality was achieved by programming an Android app in Java that sends encoded commands to the server; the server decodes each command and begins playing the music it dictates. The designs for both the client and the server are easily extensible, so future modifications to the project are easy to make.
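
The server side of this design can be sketched as a socket loop that decodes one command per connection and hands the file to a format-specific player. The abstract does not give the command encoding or the executables used, so the `PLAY <path>` text format, the port number, and the player names `mp3-player` and `flac-player` below are hypothetical placeholders.

```python
import os
import socket
import subprocess

# Hypothetical executables; the project's actual players are not named in the abstract.
PLAYERS = {".mp3": "mp3-player", ".flac": "flac-player"}


def build_player_command(message):
    """Decode an assumed 'PLAY <path>' command into an argument list for a player."""
    action, _, argument = message.strip().partition(" ")
    if action != "PLAY" or not argument:
        raise ValueError("unsupported command")
    extension = os.path.splitext(argument)[1].lower()
    player = PLAYERS.get(extension)
    if player is None:
        raise ValueError("unsupported format: " + extension)
    return [player, argument]


def serve(host="0.0.0.0", port=5000):
    """Listen on a socket and start a player for each valid command received."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind((host, port))
        server.listen()
        while True:
            connection, _address = server.accept()
            with connection:
                message = connection.recv(1024).decode("utf-8")
                try:
                    argv = build_player_command(message)
                except ValueError as exc:
                    connection.sendall(f"ERR {exc}\n".encode("utf-8"))
                    continue
                subprocess.Popen(argv)  # hand playback to the external executable
                connection.sendall(b"OK\n")
```

Keeping the decoding step in its own function is one way to get the extensibility the abstract mentions: supporting another format in this sketch only means adding an entry to `PLAYERS`.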

Contributors: Storto, Michael Olson (Author) / Burger, Kevin (Thesis director) / Meuth, Ryan (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2015-05

Software for Agent-Based Computational Economics

Description

Agent-based modeling has been used in computer science to simulate complex phenomena. The introduction of agent-based models into the field of economics (Agent-Based Computational Economics, ACE) is not new; however, work on making model environments simpler to design for individuals without a background in computer science or computer engineering is a constantly evolving topic. The issue is a trade-off between how much is handled by the framework and how much control the modeler has, as well as what tools exist to let the user develop insights from the behavior of the model. The solutions examined in this thesis are the construction of a simplified grammar for model construction, the design of an economics-based library to assist in ACE modeling, and examples of how to construct interactive models.

Contributors: Anderson, Brandon David (Author) / Bazzi, Rida (Thesis director) / Kuminoff, Nicolai (Committee member) / Roberts, Nancy (Committee member) / Computer Science and Engineering Program (Contributor) / Economics Program in CLAS (Contributor) / Barrett, The Honors College (Contributor)

Created: 2016-05

Data Driven Game Theoretic Cyber Threat Mitigation

Description

Penetration testing is regarded as the gold standard for understanding how well an organization can withstand sophisticated cyber-attacks. However, the recent prevalence of markets specializing in zero-day exploits on the darknet makes exploits widely available to potential attackers. The cost associated with these sophisticated kits generally precludes penetration testers from simply obtaining such exploits, so an alternative approach is needed to understand which exploits an attacker will most likely purchase and how to defend against them. In this paper, we introduce a data-driven security game framework to model an attacker and provide policy recommendations to the defender. In addition to providing a formal framework and algorithms to develop strategies, we present experimental results from applying our framework, for various system configurations, to real-world exploit market data actively mined from the darknet.

Contributors: Robertson, John James (Author) / Shakarian, Paulo (Thesis director) / Doupe, Adam (Committee member) / Electrical Engineering Program (Contributor) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2016-05

Learning Generalized Heuristics Using Deep Neural Networks

Description

Classical planning is a field of Artificial Intelligence concerned with allowing autonomous agents to make reasonable decisions in complex environments. This work investigates the application of deep learning and planning techniques, with the aim of constructing generalized plans capable of solving multiple problem instances. We construct a Deep Neural Network that, given an abstract problem state, predicts both (i) the best action to be taken from that state and (ii) the generalized “role” of the object being manipulated. The neural network was tested on two classical planning domains: the blocks world domain and the logistics domain. Results indicate that neural networks are capable of making such predictions with high accuracy, indicating a promising new framework for approaching generalized planning problems.

Contributors: Nakhleh, Julia Blair (Author) / Srivastava, Siddharth (Thesis director) / Fainekos, Georgios (Committee member) / Computer Science and Engineering Program (Contributor) / School of International Letters and Cultures (Contributor) / Barrett, The Honors College (Contributor)

Created: 2019-05

Prescription Information Extraction from Electronic Health Records using BiLSTM-CRF and Word Embeddings

Description

Medical records are increasingly kept in the form of electronic health records (EHRs), with a significant amount of patient data recorded as unstructured natural language text. Consequently, being able to extract and use the clinical data present in these records is an important step in furthering clinical care. One important aspect of these records is the presence of prescription information. Existing techniques for extracting prescription information (which includes medication names, dosages, frequencies, reasons for taking, and mode of administration) from unstructured text have focused on rule- and classifier-based methods. While state-of-the-art systems can be effective in extracting many types of information, they require significant effort to develop hand-crafted rules and conduct effective feature engineering. This paper presents a bidirectional LSTM with a CRF tagging model, initialized with precomputed word embeddings, for extracting prescription information from sentences without requiring significant feature engineering. The experimental results, run on the i2b2 2009 dataset, achieve a macro F1 score of 0.8562 and scores above 0.9449 on four of the six categories, indicating significant potential for this model.
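
For readers unfamiliar with the macro F1 measure reported above, the sketch below shows how such a score is typically computed: an F1 score is calculated per category from true-positive, false-positive, and false-negative counts, and the per-category scores are averaged with equal weight. The category names and counts in the usage note are made up for illustration and are not results from the paper.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall for one category."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(counts_by_category):
    """Unweighted mean of per-category F1; each value is a (tp, fp, fn) tuple."""
    scores = [f1_score(*counts) for counts in counts_by_category.values()]
    return sum(scores) / len(scores)
```

For example, `macro_f1({"medication": (8, 2, 2), "dosage": (5, 5, 0)})` averages a per-category F1 of 0.8 with one of about 0.667. Because every category counts equally, a weak score on a rare category pulls the macro average down, which is consistent with an overall score below the per-category scores reported for the strongest categories.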

Contributors: Rawal, Samarth Chetan (Author) / Baral, Chitta (Thesis director) / Anwar, Saadat (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2018-05

HA-MRA: A Human-Aware Multi-Robot Architecture

Description

This thesis describes a multi-robot architecture which allows teams of robots to work with humans to complete tasks. The multi-agent architecture was built using Robot Operating System and Python. This architecture was designed modularly, allowing the use of different planners and robots. The system automatically replans when robots connect or disconnect. The system was demonstrated on two real robots, a Fetch and a PeopleBot, by conducting a surveillance task on the fifth floor of the Computer Science building at Arizona State University. The next part of the system includes extensions for teaming with humans. An Android application was created to serve as the interface between the system and human teammates. This application provides a way for the system to communicate with humans in the loop. In addition, it sends location information of the human teammates to the system so that goal recognition can be performed. This goal recognition allows the generation of human-aware plans. This capability was demonstrated in a mock search and rescue scenario using the Fetch to locate a missing teammate.

Contributors: Saba, Gabriel Christer (Author) / Kambhampati, Subbarao (Thesis director) / Doupé, Adam (Committee member) / Chakraborti, Tathagata (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2017-05

Immersion of Choice

Description

My creative project centers on the question, "How can the audience's choices influence dancers' improvisation?" This dance work seeks to redefine the relationship between audience and performers by integrating audience, technology, and movement in real time. The topic draws on the fields of Computer Science and Dance. To answer my main question, I need to explore how to connect the theory of Computer Science and the fundamentals of a web application with the elements of dance improvisation. This topic interests me because it combines two fields of study that do not seem related. However, I find that when I am coding a web application, I can insert blocks of code. This relates to dance improvisation, where I have a movement vocabulary and can insert different moves based on the context. The idea of gathering data from an audience in real time also interests me. I find that data is most useful when a story can be deduced from it, and I am intrigued by how dance can be used to create and tell a story about the data that is collected. The main goals of my Creative Project are to learn the skills needed to develop a web application, using the knowledge and theory I am acquiring through Computer Science, and to learn the skills needed to produce a performance piece. My objective for the overall project is to create an audience-interactive experience that presents choices for dancers and creates a connection between two completely different fields: Computer Science and Dance. The audience will enter their answers to preset questions via an online voting application. The stage background screen will show the results for each question as percentages in a chart. The dancers will then serve as a live interpretation of these results.
This Creative Project will serve as a gateway between the work that has been cultivated in my studies and the real world. The methods involve exploring movement qualities in improvisation, communicating with my cast about what worked best for the transitions between each section of the piece, and testing the web applications. I learned the importance of having structure within improvisational movement for the purpose of choreography. The significance of structure is that it provides direction, clarity, and a sense of unification for the dancers. I also learned the basics of the Python programming language in order to develop the two real-time web applications. The significance of learning Python is that I will be able to add it to my set of programming languages, build on my knowledge of Computer Science, and develop more real-world applications in the future.
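
The step of turning audience answers into the percentages shown on the stage screen can be sketched in a few lines of Python. This is only an illustration of the tallying idea; the function name and the answer labels in the usage note are hypothetical, and the project's actual web applications are not described in enough detail to reproduce.

```python
from collections import Counter


def vote_percentages(votes):
    """Map each answer to its share of all votes, as a percentage."""
    counts = Counter(votes)
    total = sum(counts.values())
    if total == 0:
        return {}  # no votes yet, so there is nothing to chart
    return {choice: 100.0 * count / total for choice, count in counts.items()}
```

For instance, `vote_percentages(["fast", "slow", "fast", "fast"])` gives 75 percent for "fast" and 25 percent for "slow", values that a chart on the background screen could display for the dancers to interpret.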

Contributors: Ngai, Courtney Taylor (Author) / Britt, Melissa (Thesis director) / Standley, Eileen (Committee member) / Computer Science and Engineering Program (Contributor) / School of Film, Dance and Theatre (Contributor) / Barrett, The Honors College (Contributor)

Created: 2018-05
