Matching Items (50)
Description
The performance of most of the visual computing tasks depends on the quality of the features extracted from the raw data. Insightful feature representation increases the performance of many learning algorithms by exposing the underlying explanatory factors of the output for the unobserved input. A good representation should also handle anomalies in the data such as missing samples and noisy input caused by the undesired, external factors of variation. It should also reduce the data redundancy. Over the years, many feature extraction processes have been invented to produce good representations of raw images and videos.

The feature extraction processes can be categorized into three groups. The first group contains processes that are hand-crafted for a specific task. Hand-engineering features requires the knowledge of domain experts and manual labor. However, the resulting feature extraction process is interpretable and explainable. The next group contains the latent-feature extraction processes. While the original features lie in a high-dimensional space, the relevant factors for a task often lie on a lower-dimensional manifold. Latent-feature extraction employs hidden variables to expose the underlying data properties that cannot be directly measured from the input. It imposes a specific structure, such as sparsity or low rank, on the derived representation through sophisticated optimization techniques. The last category is that of deep features. These are obtained by passing raw input data with minimal pre-processing through a deep network, whose parameters are computed by iteratively minimizing a task-based loss.

In this dissertation, I present four pieces of work where I create and learn suitable data representations. The first task employs hand-crafted features to perform clinically relevant retrieval of diabetic retinopathy images. The second task uses latent features to perform content-adaptive image enhancement. The third task ranks a pair of images based on their aesthetics. The goal of the last task is to capture localized image artifacts in small datasets with patch-level labels. For these last two tasks, I propose novel deep architectures and show significant improvement over the previous state-of-the-art approaches. A suitable combination of feature representations augmented with an appropriate learning approach can increase performance for most visual computing tasks.
Contributors: Chandakkar, Parag Shridhar (Author) / Li, Baoxin (Thesis advisor) / Yang, Yezhou (Committee member) / Turaga, Pavan (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)
Created: 2017
Description
Handwritten documents have gained popularity in various domains including education and business. A key task in analyzing a complex document is to distinguish between various content types such as text, math, graphics, tables and so on. For example, a region of the document containing a mathematical expression would be labelled math. This differentiation makes it possible to run specific recognition tasks depending on the content type. We hypothesize that the recognition accuracy of the subsequent tasks, such as textual, math, and shape recognition, will increase, further leading to a better analysis of the document.

Content detection on handwritten documents assigns a particular class to a homogeneous portion of the document. To complete this task, a set of handwritten solutions was digitally collected from middle school students located in two different geographical regions in 2017 and 2018. This research discusses the methods to collect, pre-process and detect content type in the collected handwritten documents. A total of 4049 documents were extracted as images and JSON files and were labelled using object-labelling software with the tags text, math, diagram, cross out, table, graph, tick mark, arrow, and doodle. The labelled images were fed to TensorFlow's object detection API to train a neural network model. We show our results from two neural network models, the Faster Region-based Convolutional Neural Network (Faster R-CNN) and the Single Shot Detector (SSD).
Contributors: Faizaan, Shaik Mohammed (Author) / VanLehn, Kurt (Thesis advisor) / Cheema, Salman Shaukat (Thesis advisor) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
Reasoning about the activities of cyber threat actors is critical to defend against cyber attacks. However, this task is difficult for a variety of reasons. In simple terms, it is difficult to determine who the attacker is, what the attacker's goals are, and how they will carry out their attacks. These three questions essentially entail understanding the attacker’s use of deception, the capabilities available, and the intent of launching the attack. These three issues are highly inter-related. If an adversary can hide their intent, they can better deceive a defender. If an adversary’s capabilities are not well understood, then determining what their goals are becomes difficult, as the defender is uncertain whether they have the necessary tools to accomplish them. However, the understanding of these aspects is also mutually supportive. If we have a clear picture of capabilities, intent can better be deciphered. If we understand intent and capabilities, a defender may be able to see through deception schemes.

In this dissertation, I present three pieces of work to tackle these questions to obtain a better understanding of cyber threats. First, we introduce a new reasoning framework to address deception. We evaluate the framework by building a dataset from the DEFCON capture-the-flag exercise to identify the person or group responsible for a cyber attack. We demonstrate that the framework not only handles cases of deception but also provides transparent decision making in identifying the threat actor. The second task uses a cognitive learning model to determine the intent, that is, the goals of the threat actor on the target system. The third task looks at understanding the capabilities of threat actors to target systems by identifying at-risk systems from hacker discussions on darkweb websites. To accomplish this task, we gather discussions from more than 300 darkweb websites relating to malicious hacking.
Contributors: Nunes, Eric (Author) / Shakarian, Paulo (Thesis advisor) / Ahn, Gail-Joon (Committee member) / Baral, Chitta (Committee member) / Cooke, Nancy J. (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
Reinforcement learning (RL) is a powerful methodology for teaching autonomous agents complex behaviors and skills. A critical component in most RL algorithms is the reward function, a mathematical function that provides numerical estimates for desirable and undesirable states. Typically, the reward function must be hand-designed by a human expert and, as a result, the scope of a robot's autonomy and its ability to safely explore and learn in new and unforeseen environments are constrained by the specifics of the designed reward function. In this thesis, I design and implement a stateful collision anticipation model with powerful predictive capability based upon my research on sequential data modeling and modern recurrent neural networks. I also develop deep reinforcement learning methods whose rewards are generated by self-supervised training and intrinsic signals. The main objective is to work towards the development of resilient robots that can learn to anticipate and avoid damaging interactions by combining visual and proprioceptive cues from internal sensors. The introduced solutions are inspired by pain pathways in humans and animals, because such pathways are known to guide decision-making processes and promote self-preservation. A new "robot dodge ball" benchmark is introduced in order to test the validity of the developed algorithms in dynamic environments.
Contributors: Richardson, Trevor W (Author) / Ben Amor, Heni (Thesis advisor) / Yang, Yezhou (Committee member) / Srivastava, Siddharth (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
Object localization is used to determine the location of a device, an important aspect of applications ranging from autonomous driving to augmented reality. Commonly used localization techniques include global positioning systems (GPS), simultaneous localization and mapping (SLAM), and positional tracking, but all of these methodologies have drawbacks, especially in high-traffic indoor or urban environments. Using recent improvements in the field of machine learning, this project proposes a new method of localization that uses networks with several wireless transceivers and can be implemented without heavy computational loads or high costs. This project aims to build a proof-of-concept prototype and demonstrate that the proposed technique is feasible and accurate.

Modern communication networks depend heavily upon an estimate of the communication channel, which represents the distortions that a transmitted signal undergoes as it moves towards a receiver. A channel can become quite complicated due to signal reflections, delays, and other undesirable effects and, as a result, varies significantly from one location to another. This localization system seeks to take advantage of this distinctness by feeding channel information into a machine learning algorithm, which will be trained to associate channels with their respective locations. A device in need of localization would then only need to calculate a channel estimate and pass it to this algorithm to obtain its location.

As an additional step, the effect of location noise is investigated in this report. Once the localization system described above shows promising results, the team demonstrates that the system is robust to noise on its location labels. In doing so, the team shows that this system could be implemented in a continued learning environment, in which some user agents report their estimated (noisy) location over a wireless communication network, so that the model can be deployed without extensive data collection prior to release.
Contributors: Chang, Roger (Co-author) / Kann, Trevor (Co-author) / Alkhateeb, Ahmed (Thesis director) / Bliss, Daniel (Committee member) / Electrical Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created: 2020-05
Description
At present, the vast majority of human subjects with neurological disease are still diagnosed through in-person assessments and qualitative analysis of patient data. In this paper, we propose to use Topological Data Analysis (TDA) together with machine learning tools to automate the process of Parkinson’s disease classification and severity assessment. An automated, stable, and accurate method to evaluate Parkinson’s would be significant in streamlining diagnoses of patients and providing families more time for corrective measures. We propose a methodology which incorporates TDA into the analysis of Parkinson’s disease postural-shift data through the representation of persistence images. The topology of a system has proven to be invariant to small changes in the data, and topological features have been shown to perform well in discrimination tasks. The contributions of the paper are twofold. We propose a method to 1) distinguish healthy subjects from those afflicted by the disease and 2) assess the severity of the disease. We explore the use of the proposed method in an application involving a Parkinson’s disease dataset composed of healthy-elderly, healthy-young and Parkinson’s disease patients.
Contributors: Rahman, Farhan Nadir (Co-author) / Nawar, Afra (Co-author) / Turaga, Pavan (Thesis director) / Krishnamurthi, Narayanan (Committee member) / Electrical Engineering Program (Contributor) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created: 2020-05
Description
Epilepsy affects numerous people around the world and is characterized by recurring seizures, which motivates efforts to predict them so that precautionary measures may be employed. One promising algorithm extracts spatiotemporal correlation-based features from intracranial electroencephalography signals for use with support vector machines. The robustness of this methodology is tested through a sensitivity analysis. Doing so also provides insight into how to construct more effective feature vectors.
Contributors: Ma, Owen (Author) / Bliss, Daniel (Thesis director) / Berisha, Visar (Committee member) / Barrett, The Honors College (Contributor) / Electrical Engineering Program (Contributor)
Created: 2015-05
Description
This paper introduces a wireless, reconfigurable “button-type” pressure sensor system that uses machine learning for gait analysis applications. The pressure sensor system consists of an array of independent button-type pressure sensing units interfaced with a remote computer. Each pressure sensing unit contains pressure-sensitive resistors, readout electronics, and a wireless Bluetooth module, which are assembled within a footprint of 40 × 25 × 6 mm³. The small-footprint, low-profile sensors are populated onto a shoe insole, like buttons, to collect temporal pressure data. The pressure sensing unit measures pressures up to 2,000 kPa while maintaining an error under 10%. The reconfigurable pressure sensor array reduces the total power consumption of the system by 50%, allowing an extended period of operation of up to 82.5 hours. A robust machine learning program identifies the optimal pressure sensing units in any given configuration with an accuracy of up to 98%.
Contributors: Booth, Jayden Charles (Author) / Chae, Junseok (Thesis director) / Chen, Ang (Committee member) / Electrical Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created: 2018-12
Description
Medical records are increasingly being recorded in the form of electronic health records (EHRs), with a significant amount of patient data recorded as unstructured natural language text. Consequently, being able to extract and utilize clinical data present within these records is an important step in furthering clinical care. One important aspect within these records is the presence of prescription information. Existing techniques for extracting prescription information (which includes medication names, dosages, frequencies, reasons for taking, and mode of administration) from unstructured text have focused on the application of rule- and classifier-based methods. While state-of-the-art systems can be effective in extracting many types of information, they require significant effort to develop hand-crafted rules and conduct effective feature engineering. This paper presents the use of a bidirectional LSTM-CRF tagging model, initialized with precomputed word embeddings, for extracting prescription information from sentences without requiring significant feature engineering. The experimental results, run on the i2b2 2009 dataset, achieve a macro-averaged F1 score of 0.8562, and scores above 0.9449 on four of the six categories, indicating significant potential for this model.
Contributors: Rawal, Samarth Chetan (Author) / Baral, Chitta (Thesis director) / Anwar, Saadat (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created: 2018-05
Description
In the field of machine learning, reinforcement learning stands out for its ability to explore approaches to complex, high-dimensional problems that outperform even expert humans. For robotic locomotion tasks, reinforcement learning provides an approach to solving them without the need for unique controllers. In this thesis, two reinforcement learning algorithms, Deep Deterministic Policy Gradient and Group Factor Policy Search, are compared in the bipedal walking environment provided by OpenAI Gym. The algorithms are evaluated on their performance in the environment and their sample efficiency.
Contributors: McDonald, Dax (Author) / Ben Amor, Heni (Thesis director) / Yang, Yezhou (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)
Created: 2018-12