Search Content

Designing an AI-driven System at Scale for Detection of Abusive Head Trauma using Domain Modeling

Description

Traumatic injuries are the leading cause of death in children under 18, with head trauma being the leading cause of death in children below 5. A large but unknown number of traumatic injuries are non-accidental, i.e. inflicted. The lack of sensitivity and specificity required to diagnose Abusive Head Trauma (AHT)…

Traumatic injuries are the leading cause of death in children under 18, with head trauma being the leading cause of death in children below 5. A large but unknown number of traumatic injuries are non-accidental, i.e. inflicted. The lack of sensitivity and specificity required to diagnose Abusive Head Trauma (AHT) from radiological studies results in putting the children at risk of re-injury and death. Modern Deep Learning techniques can be utilized to detect Abusive Head Trauma using Computer Tomography (CT) scans. Training models using these techniques are only a part of building AI-driven Computer-Aided Diagnostic systems. There are challenges in deploying the models to make them highly available and scalable.

The thesis models the domain of Abusive Head Trauma using Deep Learning techniques and builds an AI-driven System at scale using best Software Engineering Practices. It has been done in collaboration with Phoenix Children Hospital (PCH). The thesis breaks down AHT into sub-domains of Medical Knowledge, Data Collection, Data Pre-processing, Image Generation, Image Classification, Building APIs, Containers and Kubernetes. Data Collection and Pre-processing were done at PCH with the help of trauma researchers and radiologists. Experiments are run using Deep Learning models such as DCGAN (for Image Generation), Pretrained 2D and custom 3D CNN classifiers for the classification tasks. The trained models are exposed as APIs using the Flask web framework, contained using Docker and deployed on a Kubernetes cluster.

The results are analyzed based on the accuracy of the models, the feasibility of their implementation as APIs and load testing the Kubernetes cluster. They suggest the need for Data Annotation at the Slice level for CT scans and an increase in the Data Collection process. Load Testing reveals the auto-scalability feature of the cluster to serve a high number of requests.

ContributorsVikram, Aditya (Author) / Sanchez, Javier Gonzalez (Thesis advisor) / Gaffar, Ashraf (Thesis advisor) / Findler, Michael (Committee member) / Arizona State University (Publisher)

Created2020

Generalized Domain Adaptation for Visual Domains

Description

Humans have a great ability to recognize objects in different environments irrespective of their variations. However, the same does not apply to machine learning models which are unable to generalize to images of objects from different domains. The generalization of these models to new data is constrained by the domain…

Humans have a great ability to recognize objects in different environments irrespective of their variations. However, the same does not apply to machine learning models which are unable to generalize to images of objects from different domains. The generalization of these models to new data is constrained by the domain gap. Many factors such as image background, image resolution, color, camera perspective and variations in the objects are responsible for the domain gap between the training data (source domain) and testing data (target domain). Domain adaptation algorithms aim to overcome the domain gap between the source and target domains and learn robust models that can perform well across both the domains.

This thesis provides solutions for the standard problem of unsupervised domain adaptation (UDA) and the more generic problem of generalized domain adaptation (GDA). The contributions of this thesis are as follows. (1) Certain and Consistent Domain Adaptation model for closed-set unsupervised domain adaptation by aligning the features of the source and target domain using deep neural networks. (2) A multi-adversarial deep learning model for generalized domain adaptation. (3) A gating model that detects out-of-distribution samples for generalized domain adaptation.

The models were tested across multiple computer vision datasets for domain adaptation.

The dissertation concludes with a discussion on the proposed approaches and future directions for research in closed set and generalized domain adaptation.

ContributorsNagabandi, Bhadrinath (Author) / Panchanathan, Sethuraman (Thesis advisor) / Venkateswara, Hemanth (Thesis advisor) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)

Created2020

Diversifying Relevant Search Results from Social Media Using Community Contributed Images

Description

Availability of affordable image and video capturing devices as well as rapid development of social networking and content sharing websites has led to the creation of new type of content, Social Media. Any system serving the end user’s query search request should not only take the relevant images into consideration…

Availability of affordable image and video capturing devices as well as rapid development of social networking and content sharing websites has led to the creation of new type of content, Social Media. Any system serving the end user’s query search request should not only take the relevant images into consideration but they also need to be divergent for a well-rounded description of a query. As a result, the automated optimization of image retrieval results that are also divergent becomes exceedingly important.

The main focus of this thesis is to use visual description of a landmark by choosing the most diverse pictures that best describe all the details of the queried location from community-contributed datasets. For this, an end-to-end framework has been built, to retrieve relevant results that are also diverse. Different retrieval re-ranking and diversification strategies are evaluated to find a balance between relevance and diversification. Clustering techniques are employed to improve divergence. A unique fusion approach has been adopted to overcome the dilemma of selecting an appropriate clustering technique and the corresponding parameters, given a set of data to be investigated. Extensive experiments have been conducted on the Flickr Div150Cred dataset that has 30 different landmark locations. The results obtained are promising when evaluated on metrics for relevance and diversification.

ContributorsKalakota, Vaibhav Reddy (Author) / Bansal, Ajay (Thesis advisor) / Bansal, Srividya (Committee member) / Findler, Michael (Committee member) / Arizona State University (Publisher)

Created2020

Neural Network Architecture with External Memory and Domain-aware Weight Switching Mechanism

Description

Humans have an excellent ability to analyze and process information from multiple domains. They also possess the ability to apply the same decision-making process when the situation is familiar with their previous experience.

Inspired by human's ability to remember past experiences and apply the same when a similar situation occurs,…

Humans have an excellent ability to analyze and process information from multiple domains. They also possess the ability to apply the same decision-making process when the situation is familiar with their previous experience.

Inspired by human's ability to remember past experiences and apply the same when a similar situation occurs, the research community has attempted to augment memory with Neural Network to store the previously learned information. Together with this, the community has also developed mechanisms to perform domain-specific weight switching to handle multiple domains using a single model. Notably, the two research fields work independently, and the goal of this dissertation is to combine their capabilities.

This dissertation introduces a Neural Network module augmented with two external memories, one allowing the network to read and write the information and another to perform domain-specific weight switching. Two learning tasks are proposed in this work to investigate the model performance - solving mathematics operations sequence and action based on color sequence identification. A wide range of experiments with these two tasks verify the model's learning capabilities.

ContributorsPatel, Deep Chittranjan (Author) / Ben Amor, Hani (Thesis advisor) / Banerjee, Ayan (Committee member) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)

Created2020

Simultaneous Material Microstructure Classification and Discovery using Acoustic Emission Signals

Description

Acoustic emission (AE) signals have been widely employed for tracking material properties and structural characteristics. In this study, the aim is to analyze the AE signals gathered during a scanning probe lithography process to classify the known microstructure types and discover unknown surface microstructures/anomalies. To achieve this, a Hidden Markov…

Acoustic emission (AE) signals have been widely employed for tracking material properties and structural characteristics. In this study, the aim is to analyze the AE signals gathered during a scanning probe lithography process to classify the known microstructure types and discover unknown surface microstructures/anomalies. To achieve this, a Hidden Markov Model is developed to consider the temporal dependency of the high-resolution AE data. Furthermore, the posterior classification probability and the negative likelihood score for microstructure classification and discovery are computed. Subsequently, a diagnostic procedure to identify the dominant AE frequencies that were used to track the microstructural characteristics is presented. In addition, machine learning methods such as KNN, Naive Bayes, and Logistic Regression classifiers are applied. Finally, the proposed approach applied to identify the surface microstructures of additively manufactured Ti-6Al-4V and show that it not only achieved a high classification accuracy (e.g., more than 90\%) but also correctly identified the microstructural anomalies that may be subjected to further investigation to discover new material phases/properties.

ContributorsSun, Huifeng (Author) / Yan, Hao (Thesis advisor) / Fricks, John (Thesis advisor) / Cheng, Dan (Committee member) / Arizona State University (Publisher)

Created2020

Protecting User Privacy with Social Media Data and Mining

Description

The pervasive use of the Web has connected billions of people all around the globe and enabled them to obtain information at their fingertips. This results in tremendous amounts of user-generated data which makes users traceable and vulnerable to privacy leakage attacks. In general, there are two types of privacy…

The pervasive use of the Web has connected billions of people all around the globe and enabled them to obtain information at their fingertips. This results in tremendous amounts of user-generated data which makes users traceable and vulnerable to privacy leakage attacks. In general, there are two types of privacy leakage attacks for user-generated data, i.e., identity disclosure and private-attribute disclosure attacks. These attacks put users at potential risks ranging from persecution by governments to targeted frauds. Therefore, it is necessary for users to be able to safeguard their privacy without leaving their unnecessary traces of online activities. However, privacy protection comes at the cost of utility loss defined as the loss in quality of personalized services users receive. The reason is that this information of traces is crucial for online vendors to provide personalized services and a lack of it would result in deteriorating utility. This leads to a dilemma of privacy and utility.

Protecting users' privacy while preserving utility for user-generated data is a challenging task. The reason is that users generate different types of data such as Web browsing histories, user-item interactions, and textual information. This data is heterogeneous, unstructured, noisy, and inherently different from relational and tabular data and thus requires quantifying users' privacy and utility in each context separately. In this dissertation, I investigate four aspects of protecting user privacy for user-generated data. First, a novel adversarial technique is introduced to assay privacy risks in heterogeneous user-generated data. Second, a novel framework is proposed to boost users' privacy while retaining high utility for Web browsing histories. Third, a privacy-aware recommendation system is developed to protect privacy w.r.t. the rich user-item interaction data by recommending relevant and privacy-preserving items. Fourth, a privacy-preserving framework for text representation learning is presented to safeguard user-generated textual data as it can reveal private information.

ContributorsBeigi, Ghazaleh (Author) / Liu, Huan (Thesis advisor) / Kambhampati, Subbarao (Committee member) / Tong, Hanghang (Committee member) / Eliassi-Rad, Tina (Committee member) / Arizona State University (Publisher)

Created2020

Understanding Propagation of Malicious Information Online

Description

The recent proliferation of online platforms has not only revolutionized the way people communicate and acquire information but has also led to propagation of malicious information (e.g., online human trafficking, spread of misinformation, etc.). Propagation of such information occurs at unprecedented scale that could ultimately pose imminent societal-significant threats to…

The recent proliferation of online platforms has not only revolutionized the way people communicate and acquire information but has also led to propagation of malicious information (e.g., online human trafficking, spread of misinformation, etc.). Propagation of such information occurs at unprecedented scale that could ultimately pose imminent societal-significant threats to the public. To better understand the behavior and impact of the malicious actors and counter their activity, social media authorities need to deploy certain capabilities to reduce their threats. Due to the large volume of this data and limited manpower, the burden usually falls to automatic approaches to identify these malicious activities. However, this is a subtle task facing online platforms due to several challenges: (1) malicious users have strong incentives to disguise themselves as normal users (e.g., intentional misspellings, camouflaging, etc.), (2) malicious users are high likely to be key users in making harmful messages go viral and thus need to be detected at their early life span to stop their threats from reaching a vast audience, and (3) available data for training automatic approaches for detecting malicious users, are usually either highly imbalanced (i.e., higher number of normal users than malicious users) or comprise insufficient labeled data.

To address the above mentioned challenges, in this dissertation I investigate the propagation of online malicious information from two broad perspectives: (1) content posted by users and (2) information cascades formed by resharing mechanisms in social media. More specifically, first, non-parametric and semi-supervised learning algorithms are introduced to discern potential patterns of human trafficking activities that are of high interest to law enforcement. Second, a time-decay causality-based framework is introduced for early detection of “Pathogenic Social Media (PSM)” accounts (e.g., terrorist supporters). Third, due to the lack of sufficient annotated data for training PSM detection approaches, a semi-supervised causal framework is proposed that utilizes causal-related attributes from unlabeled instances to compensate for the lack of enough labeled data. Fourth, a feature-driven approach for PSM detection is introduced that leverages different sets of attributes from users’ causal activities, account-level and content-related information as well as those from URLs shared by users.

ContributorsAlvari, Hamidreza (Author) / Shakarian, Paulo (Thesis advisor) / Davulcu, Hasan (Committee member) / Tong, Hanghang (Committee member) / Ruston, Scott (Committee member) / Arizona State University (Publisher)

Created2020

Monocular depth estimation with edge-based constraints and active learning

Description

The ubiquity of single camera systems in society has made improving monocular depth estimation a topic of increasing interest in the broader computer vision community. Inspired by recent work in sparse-to-dense depth estimation, this thesis focuses on sparse patterns generated from feature detection based algorithms as opposed to regular grid…

The ubiquity of single camera systems in society has made improving monocular depth estimation a topic of increasing interest in the broader computer vision community. Inspired by recent work in sparse-to-dense depth estimation, this thesis focuses on sparse patterns generated from feature detection based algorithms as opposed to regular grid sparse patterns used by previous work. This work focuses on using these feature-based sparse patterns to generate additional depth information by interpolating regions between clusters of samples that are in close proximity to each other. These interpolated sparse depths are used to enforce additional constraints on the network’s predictions. In addition to the improved depth prediction performance observed from incorporating the sparse sample information in the network compared to pure RGB-based methods, the experiments show that actively retraining a network on a small number of samples that deviate most from the interpolated sparse depths leads to better depth prediction overall.

This thesis also introduces a new metric, titled Edge, to quantify model performance in regions of an image that show the highest change in ground truth depth values along either the x-axis or the y-axis. Existing metrics in depth estimation like Root Mean Square Error(RMSE) and Mean Absolute Error(MAE) quantify model performance across the entire image and don’t focus on specific regions of an image that are hard to predict. To this end, the proposed Edge metric focuses specifically on these hard to classify regions. The experiments also show that using the Edge metric as a small addition to existing loss functions like L1 loss in current state-of-the-art methods leads to vastly improved performance in these hard to classify regions, while also improving performance across the board in every other metric.

ContributorsRai, Anshul (Author) / Yang, Yezhou (Thesis advisor) / Zhang, Wenlong (Committee member) / Liang, Jianming (Committee member) / Arizona State University (Publisher)

Created2019

Analyzing the impact of renewable generation on the locational marginal price (LMP) forecast for California ISO

Description

Accurate forecasting of electricity prices has been a key factor for bidding strategies in the electricity markets. The increase in renewable generation due to large scale PV and wind deployment in California has led to an increase in day-ahead and real-time price volatility. This has also led to prices going…

Accurate forecasting of electricity prices has been a key factor for bidding strategies in the electricity markets. The increase in renewable generation due to large scale PV and wind deployment in California has led to an increase in day-ahead and real-time price volatility. This has also led to prices going negative due to the supply-demand imbalance caused by excess renewable generation during instances of low demand. This research focuses on applying machine learning models to analyze the impact of renewable generation on the hourly locational marginal prices (LMPs) for California Independent System Operator (CAISO). Historical data involving the load, renewable generation from solar and wind, fuel prices, aggregated generation outages is extracted and collected together in a dataset and used as features to train different machine learning models. Tree- based machine learning models such as Extra Trees, Gradient Boost, Extreme Gradient Boost (XGBoost) as well as models based on neural networks such as Long short term memory networks (LSTMs) are implemented for price forecasting. The focus is to capture the best relation between the features and the target LMP variable and determine the weight of every feature in determining the price.

The impact of renewable generation on LMP forecasting is determined for several different days in 2018. It is seen that the prices are impacted significantly by solar and wind generation and it ranks second in terms of impact after the electric load. The results of this research propose a method to evaluate the impact of several parameters on the day-ahead price forecast and would be useful for the grid operators to evaluate the parameters that could significantly impact the day-ahead price prediction and which parameters with low impact could be ignored to avoid an error in the forecast.

ContributorsVad, Chinmay (Author) / Honsberg, C. (Christiana B.) (Thesis advisor) / King, Richard R. (Committee member) / Kurtz, Sarah (Committee member) / Arizona State University (Publisher)

Created2019

Asymmetric Error Control for Classification in Medical Disease Diagnosis

Description

In classification applications, such as medical disease diagnosis, the cost of one type of error (false negative) could greatly outweigh the other (false positive) enabling the need of asymmetric error control. Due to this unique nature of the problem, traditional machine learning techniques, even with much improved accuracy, may not…

In classification applications, such as medical disease diagnosis, the cost of one type of error (false negative) could greatly outweigh the other (false positive) enabling the need of asymmetric error control. Due to this unique nature of the problem, traditional machine learning techniques, even with much improved accuracy, may not be ideal as they do not provide a way to control the false negatives below a certain threshold. To address this need, a classification algorithm that can provide asymmetric error control is proposed. The theoretical foundation for this algorithm is based on Neyman-Pearson (NP) Lemma and it is complemented with sample splitting and order statistics to pick a threshold that enables an upper bound on the number of false negatives. Additionally, this classifier addresses the imbalance of the data, which is common in medical datasets, by using Hellinger distance as the splitting criterion. This eliminates the need of sampling methods, which add complexity and the need for parameter selection. This approach is used to create a novel tree-based classifier that enables asymmetric error control. Applications, such as prediction of the severity of cardiac arrhythmia, require classification over multiple classes. The NP oracle inequalities for binary classes are not immediately applicable for the multiclass NP classification, leading to a multi-step procedure proposed in this dissertation to extend the algorithm in the context of multiple classes. This classifier is used in predicting various forms of cardiac disease for both binary and multi-class classification problems with not only comparable accuracy metrics but also with full control over the number of false negatives. Moreover, this research allows us to pick the threshold for the classifier in a data adaptive way. This dissertation also shows that this methodology can be extended to non medical applications that require classification with asymmetric error control.

ContributorsBokhari, Wasif (Author) / Bansal, Ajay (Thesis advisor) / Zhang, Yu (Committee member) / Yang, Yezhou (Committee member) / Bahadur, Faisal (Committee member) / Arizona State University (Publisher)

Created2021

Filtering by