Computer Vision Methods for Urinary Tract Infection Diagnostics

Description

Antibiotic resistance is a serious threat to mankind. As bacteria become resistant to multiple antibiotics, many common antibiotics will soon become ineffective. The inefficiency of current diagnostic methods is an important cause of antibiotic resistance: because these methods are relatively slow, treatment plans are often based on the physician's experience rather than on test results, and therefore have a high chance of being inaccurate or suboptimal. This creates a need for faster, point-of-care (POC) methods that can provide results in a few hours. Motivated by recent advances in computer vision methods, three projects have been developed for bacteria identification and antibiotic susceptibility tests (AST), with the goal of speeding up the diagnostic process. The first two projects focus on obtaining features from optical microscopy, such as bacteria shape and motion patterns, to distinguish active and inactive cells. The results show their potential as novel methods for AST, as they obtain results within a window of 30 min to 3 hours, a much faster time frame than the gold standard approach based on cell culture, which takes at least half a day to complete. The last project focuses on the identification task, combining large volume light scattering microscopy (LVM) and deep learning to distinguish bacteria from urine particles. The developed setup is suitable for point-of-care applications, as a large volume can be viewed at a time, avoiding the need for cell culturing or enrichment. This is a significant gain compared to cell culturing methods. The accuracy of the deep learning system is higher than chance and outperforms a traditional machine learning system by up to 20%.

Contributors: Iriya, Rafael (Author) / Turaga, Pavan (Thesis advisor) / Wang, Shaopeng (Committee member) / Grys, Thomas (Committee member) / Zhang, Yanchao (Committee member) / Arizona State University (Publisher)

Created: 2020

On Density and Noise Challenges in Tensor-Based Data Analytics

Description

Many real-world problems, such as model- and data-driven computer simulation analysis, social and collaborative network analysis, brain data analysis, and so on, benefit from jointly modeling and analyzing the underlying patterns associated with complex, multi-relational data. Tensor decomposition is an ideal mathematical tool for this joint modeling, due to its simultaneous analysis of such multi-relational data, which is made possible by the data's multidimensional, array-based nature. A major challenge in tensor decomposition lies with its computational and space complexity, especially for dense datasets. While the process is comparatively faster for sparse tensors, decomposition is still a major bottleneck for many applications. The tensor decomposition process results in dense (hence, large) intermediate results, even when the input tensor is sparse (or small). Noise is another challenge for most data mining techniques, and many tensor decomposition schemes are sensitive to noisy datasets; this is an inevitable problem for real-world data, which can lead to false conclusions. In this dissertation, I develop innovative tensor decomposition algorithms for mining both sparse and dense multi-relational data in a noise-resistant way. I present novel, scalable, parallelizable tensor decomposition algorithms, specifically tuned to be effective for dense, noisy tensors, and which maintain the quality of the resulting analysis. Furthermore, I present results on multi-relational data applications focusing on model- and data-driven computer simulation analysis, as well as social network and web mining, which demonstrate the effectiveness of these tensor decompositions.

Contributors: Li, Xinsheng (Author) / Candan, Kasim S (Thesis advisor) / Davulcu, Hasan (Committee member) / Sapino, Maria L (Committee member) / Tong, Hanghang (Committee member) / Arizona State University (Publisher)

Created: 2019

Safety Enhanced Designs in UAS Risk Monitoring and Collision Resolution

Description

Collision-free path planning is a major challenge in managing unmanned aerial vehicle (UAV) fleets, especially in uncertain environments. This work considers the design of UAV routing policies using multi-agent reinforcement learning and proposes a Multi-resolution, Multi-agent, Mean-field reinforcement learning algorithm, named 3M-RL, for flight planning, where multiple vehicles need to avoid collisions with each other while moving towards their destinations. In this system, each UAV makes decisions based on local observations and does not communicate with other UAVs. The algorithm trains a routing policy using an Actor-Critic neural network with multi-resolution observations, including detailed local information and aggregated global information based on mean-field. The algorithm tackles the curse-of-dimensionality problem in multi-agent reinforcement learning and provides a scalable solution. The proposed algorithm is tested in different complex scenarios in both 2D and 3D space, and the simulation results show that 3M-RL results in good routing policies. As a complement, dynamic data communication between UAVs and a control center is also studied. The control center needs to monitor the safety state of each UAV in the system in real time, and the transition of the risk level is modeled as a Markov process. Given limited communication bandwidth, it is impossible for the control center to communicate with all UAVs at the same time. A dynamic learning problem with limited communication bandwidth is therefore also discussed, where the objective is to minimize the total information entropy in real-time risk level tracking. The simulations demonstrate that the algorithm outperforms policies such as a round-robin policy.
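
The risk-tracking objective in this abstract can be illustrated with a small sketch. It assumes a hypothetical two-level risk model (low, high) with a made-up transition matrix, and it uses a simple greedy rule (poll the UAV whose risk belief has the highest entropy) as a stand-in baseline; it is not the learning algorithm developed in the thesis.

```python
import math

def propagate(belief, transition):
    """One Markov step: new_belief[j] = sum_i belief[i] * transition[i][j]."""
    n = len(belief)
    return [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]

def entropy(belief):
    """Shannon entropy in bits; zero-probability states contribute nothing."""
    return -sum(p * math.log2(p) for p in belief if p > 0)

def pick_uav_to_poll(beliefs):
    """Greedy baseline (an assumption): poll the UAV with the most uncertain belief."""
    return max(range(len(beliefs)), key=lambda k: entropy(beliefs[k]))

# Hypothetical transition matrix over risk levels (low, high); values are illustrative.
TRANSITION = [[0.9, 0.1],
              [0.2, 0.8]]

# UAV 0 was just observed at low risk; UAV 1 has not been observed recently.
beliefs = [[1.0, 0.0], [0.5, 0.5]]
beliefs = [propagate(b, TRANSITION) for b in beliefs]
next_uav = pick_uav_to_poll(beliefs)
```

With only one communication slot per step, the controller in this sketch spends it where the total entropy can be reduced the most, whereas a round-robin policy would ignore the current beliefs.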

Contributors: Wang, Weichang (Author) / Ying, Lei (Thesis advisor) / Liu, Yongming (Thesis advisor) / Zhang, Junshan (Committee member) / Zhang, Yanchao (Committee member) / Arizona State University (Publisher)

Created: 2021

Optimization of Block-based Tensor Decompositions through Sub-Tensor Impact Graphs and Applications to Dynamicity in Data and User Focus

Description

Tensors are commonly used for representing multi-dimensional data, such as Web graphs, sensor streams, and social networks. As a consequence of the increase in the use of tensors, tensor decomposition operations began to form the basis for many data analysis and knowledge discovery tasks, from clustering, trend detection, and anomaly detection to correlation analysis [31, 38]. It is well known that Singular Value Decomposition (SVD) [9] is used to extract latent semantics from matrix data. Applying the same idea to tensors, which have more than two modes, is called tensor decomposition. The two most popular tensor decomposition algorithms are the Tucker [54] and the CP [19] decompositions. Intuitively, they both generalize SVD to tensors. However, one key problem with tensor decomposition is its computational complexity, which may cause a system bottleneck. Therefore, two-phase block-centric CP tensor decomposition (2PCP) was proposed to partition the tensor into small sub-tensors, execute the sub-tensor decompositions in parallel, and combine the factors from each sub-tensor into the final decomposition factors through an iterative refinement process. I proposed the Sub-tensor Impact Graph (SIG) to account for inaccuracy propagation among sub-tensors and to measure the impact of the decomposition of each sub-tensor on the decompositions of the others. Based on SIG, I proposed several optimization strategies to optimize 2PCP's phase-2 refinement process. Furthermore, I applied SIG and these optimization strategies to data focus, data evolution, and focus shifting in tensor analysis. Personalized Tensor Decomposition (PTD) is proposed to account for the user's focus, given the observation that in many applications the user may have a focus of interest, i.e., a part of the data for which the user needs high accuracy, while beyond this area of focus accuracy may not be as critical. PTD takes as input one or more areas of focus and performs the decomposition in such a way that, when reconstructed, the accuracy of the tensor is boosted for these areas of focus. A related challenge of data evolution in tensor analytics is incremental tensor decomposition, since re-computing the whole tensor decomposition with each update causes high computational costs and incurs large memory overheads, especially for applications where data evolves over time and the tensor-based analysis results need to be continuously maintained. To avoid re-decomposition, I propose a two-phase block-incremental CP-based tensor decomposition technique, BICP, that efficiently and effectively maintains tensor decomposition results in the presence of dynamically evolving tensor data. I further extend the research to shifts in user focus. User focus may change over time as the data evolves. Although PTD is efficient, re-computation for each user preference update can be the bottleneck for the system. Therefore, I propose dynamic evolving user focus tensor decomposition, which reuses the existing decomposition result to improve the efficiency of evolving user focus block decomposition.
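
As a minimal illustration of the CP model this abstract builds on, the sketch below reconstructs a 3-mode tensor from rank-R factor matrices using the standard CP formula, and measures reconstruction error either over the whole tensor or over a chosen set of cells. The function names, the toy factors, and the "region" argument are assumptions made for illustration; the sketch does not implement 2PCP, SIG, PTD, or BICP.

```python
def cp_reconstruct(A, B, C):
    """Rebuild a 3-mode tensor from CP factors:
    X[i][j][k] = sum over r of A[i][r] * B[j][r] * C[k][r]."""
    rank = len(A[0])
    return [[[sum(A[i][r] * B[j][r] * C[k][r] for r in range(rank))
              for k in range(len(C))]
             for j in range(len(B))]
            for i in range(len(A))]

def squared_error(X, Y, region=None):
    """Sum of squared differences, optionally restricted to (i, j, k) cells in region."""
    if region is None:
        region = [(i, j, k)
                  for i in range(len(X))
                  for j in range(len(X[0]))
                  for k in range(len(X[0][0]))]
    return sum((X[i][j][k] - Y[i][j][k]) ** 2 for i, j, k in region)

# Toy rank-2 factors for a 2 x 2 x 2 tensor (illustrative values only).
A = [[1, 0], [0, 1]]
B = [[1, 2], [3, 4]]
C = [[1, 1], [2, 0]]
X = cp_reconstruct(A, B, C)
```

Restricting the error to a region mirrors the idea of a user focus: a decomposition can be judged by how well it reconstructs the cells the user cares about rather than the whole tensor.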

Contributors: Huang, Shengyu (Author) / Candan, K. Selcuk (Thesis advisor) / Davulcu, Hasan (Committee member) / Sapino, Maria Luisa (Committee member) / Tong, Hanghang (Committee member) / Zou, Jia (Committee member) / Arizona State University (Publisher)

Created: 2021

Distributed RDF Storage and Querying Using In-Memory Processing Engine

Description

The proliferation of semantic data in the form of RDF (Resource Description Framework) triples demands efficient, scalable, and distributed storage along with a highly available and fault-tolerant parallel processing strategy. There are three open issues with distributed RDF data management systems that are not well addressed together in existing work. The first is querying efficiency, the second is that solutions are optimized for certain types of query patterns and don’t necessarily work well for all types, and the third is the cost of pre-processing. Therefore, the rapid growth of RDF data raises the need for an efficient partitioning strategy over distributed data management systems that improves SPARQL (SPARQL Protocol and RDF Query Language) query performance regardless of the query's pattern shape, with minimal pre-processing overhead. In this context, the first contribution of this work is a distributed RDF data partitioning schema called 3CStore that extends the existing VP (Vertical Partitioning) approach by using a subset of triples from the VP tables based on different join correlations. This approach speeds up queries at the cost of additional pre-processing overhead. To address this, a relational partitioning schema called VPExp was developed by splitting predicates based on explicit type information of objects. This approach yields a significant query performance gain only for the specific type of query where the object is bound to a value for a particular predicate. To achieve efficient query performance on a wide range of query patterns, an improved solution is proposed that extends the existing Property Table approach to a Subset-Property Table and combines it with the VP approach. Further investigation of distributed RDF processing and querying systems based on typical use cases led to a novel relational partitioning schema called PTP (Property Table Partitioning) that further partitions the whole Property Table by unique property to minimize query input size and join operations during query evaluation. Finally, an RDF data management system based on the SPARQL-over-SQL approach, called S3QLRDF, is developed that generates the optimal query execution plan using statistics of PTP tables to provide efficient SPARQL query processing on a distributed system.
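
The baseline VP (Vertical Partitioning) layout mentioned above can be sketched in a few lines: triples are grouped into one two-column (subject, object) table per predicate, and a query touching two predicates becomes a join between two of those tables. The function names and sample triples below are hypothetical, and the sketch runs in memory on a single machine; it does not reproduce 3CStore, VPExp, PTP, or S3QLRDF.

```python
from collections import defaultdict

def vertical_partition(triples):
    """Group (subject, predicate, object) triples into one (subject, object) table per predicate."""
    tables = defaultdict(list)
    for s, p, o in triples:
        tables[p].append((s, o))
    return dict(tables)

def subject_join(tables, p1, p2):
    """Subject-subject join of two VP tables: rows (s, o1, o2) for subjects having both predicates."""
    by_subject = defaultdict(list)
    for s, o in tables.get(p2, []):
        by_subject[s].append(o)
    return [(s, o1, o2)
            for s, o1 in tables.get(p1, [])
            for o2 in by_subject.get(s, [])]

# Hypothetical sample data.
triples = [
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:name", "Bob"),
]
tables = vertical_partition(triples)
rows = subject_join(tables, "ex:name", "ex:knows")
```

A query only reads the tables for the predicates it mentions, which is why VP reduces input size; the join step is the part that the partitioning schemas in the abstract aim to make cheaper.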

Contributors: Hassan, P M Mahmudul Mahmudul (Author) / Bansal, Srividya (Thesis advisor) / Bansal, Ajay (Committee member) / Davulcu, Hasan (Committee member) / Sarwat Abdelghany Aly Elsayed, Mohamed (Committee member) / Arizona State University (Publisher)

Created: 2021

Selego: Robust Variate Selection for Accurate Time Series Forecasting and its Application in Fault Detection

Description

The need for effective forecasting models for multi-variate time series has been underlined by the integration of sensory technologies into essential applications such as building energy optimization, flight monitoring, and health monitoring. To meet this requirement, time series prediction techniques have been expanded from uni-variate to multi-variate. However, due to the extended models’ poor ability to capture the intrinsic relationships among variates, naïve extensions of prediction approaches result in an unwanted rise in the cost of model learning and, more critically, a significant loss in model performance. While recurrent models like Long Short-Term Memory (LSTM) and Recurrent Neural Network (RNN) are designed to capture the temporal intricacies in data, their performance can soon deteriorate. First, I claim in this thesis that (a) by exploiting temporal alignments of variates to quantify the importance of the recorded variates in relation to a target variate, one can build a more accurate forecasting model. I also argue that (b) traditional time series similarity/distance functions, such as Dynamic Time Warping (DTW), which require that variates have similar absolute patterns, are fundamentally ill-suited for this purpose, and that one should instead quantify temporal correlation in terms of temporal alignments of key “events” impacting these series, rather than series similarity. Further, I propose that (c) while learning a temporal model with recurrence-based techniques (such as RNN and LSTM, even when leveraging attention strategies) is challenging and expensive, better results can be obtained by coupling simpler CNNs with an adaptive variate selection strategy. Putting these together, I introduce a novel Selego framework for variate selection based on these arguments, and I experimentally evaluate the performance of the proposed approach on various forecasting models, such as LSTM, RNN, and CNN, for different top-X% variates and different forecasting lead times, on multiple real-world data sets. Experiments demonstrate that the proposed framework can reduce the number of recorded variates required to train predictive models by 90-98% while also increasing accuracy. Finally, I present a fault onset detection technique that leverages the precise baseline forecasting models trained using the Selego framework. The proposed Selego-enabled Fault Detection Framework (FDF-Selego) has been experimentally evaluated within the context of detecting the onset of faults in a building's Heating, Ventilation, and Air Conditioning (HVAC) system.
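
Argument (b) above can be made concrete with a textbook Dynamic Time Warping implementation. The sketch below is the standard dynamic-programming formulation with absolute difference as the local cost; the toy series are made up for illustration, and the code is not part of the Selego framework.

```python
def dtw_distance(x, y):
    """Classic DTW: minimum cumulative |x[i] - y[j]| cost over all monotone alignments."""
    inf = float("inf")
    n, m = len(x), len(y)
    # cost[i][j] is the best cost of aligning the first i points of x with the first j of y.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # x[i-1] repeats a match with y
                                 cost[i][j - 1],      # y[j-1] repeats a match with x
                                 cost[i - 1][j - 1])  # both series advance
    return cost[n][m]

# Same shape, stretched in time: DTW absorbs the stretch.
stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])

# Same event timing, different absolute level: DTW reports a large distance.
offset = dtw_distance([0, 0, 1, 0], [10, 10, 11, 10])
```

Here `stretched` is 0 while `offset` is 40, even though the two series in the second pair spike at exactly the same time step. This is the sense in which a distance based on absolute patterns can miss an alignment of events.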

Contributors: Tiwaskar, Manoj (Author) / Candan, K. Selcuk (Thesis advisor) / Sapino, Maria Luisa (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)

Created: 2021

Tragedy Plus Time: Capturing Human Unintended Activities from Weakly-Labeled Videos

Description

In videos that contain actions performed unintentionally, agents do not achieve their desired goals. In such videos, it is challenging for computer vision systems to understand high-level concepts such as goal-directed behavior. On the other hand, from a very early age, humans are able to understand the relation between an agent and their ultimate goal even if the action gets disrupted or unintentional effects occur. Inculcating this ability in artificially intelligent agents would make them better social learners by not just learning from their own mistakes, i.e., reinforcement learning, but also learning from others' mistakes. For example, this could greatly reduce the search space for artificially intelligent agents for finding the correct action sequence when trying to achieve a new goal, since they would be able to learn from others what not to do as well as how/when actions result in undesired outcomes. To evaluate the ability of deep learning models to perform this task, the Weakly Augmented Oops (W-Oops) dataset is proposed, built upon the Oops dataset. W-Oops consists of 2,100 unintentional human action videos, with 44 goal-directed and 33 unintentional video-level activity labels collected through human annotations. Inspired by previous methods on tasks such as weakly supervised action localization, which show promise for achieving good localization results without ground truth segment annotations, this paper proposes a weakly supervised algorithm for localizing the goal-directed as well as the unintentional temporal region of a video using only video-level labels. In particular, an attention-mechanism-based strategy is employed that predicts the temporal regions which contribute the most to a classification task, leveraging solely video-level labels. Meanwhile, our designed overlap regularization allows the model to focus on distinct portions of the video for inferring the goal-directed and unintentional activity, while guaranteeing their temporal ordering. Extensive quantitative experiments verify the validity of our localization method.

Contributors: Chakravarthy, Arnav (Author) / Yang, Yezhou (Thesis advisor) / Davulcu, Hasan (Committee member) / Pavlic, Theodore (Committee member) / Arizona State University (Publisher)

Created: 2021

Efficient routing and resource sharing mechanisms for hybrid optical-wireless access networks

Description

The integration of passive optical networks (PONs) and wireless mesh networks (WMNs) into Fiber-Wireless (FiWi) networks has recently emerged as a promising strategy for providing flexible network services at relatively high transmission rates. This work investigates the effectiveness of localized routing that prioritizes transmissions over the local gateway to the optical network and avoids wireless packet transmissions in radio zones that do not contain the packet source or destination. Existing routing schemes for FiWi networks consider mainly hop-count and delay metrics over a flat WMN node topology and do not specifically prioritize the local network structure. The combination of clustered and localized routing (CluLoR) performs better in terms of throughput-delay than routing schemes that are based on minimum hop-count and do not consider traffic localization. Subsequently, this work also investigates the packet delays when relatively low-rate traffic that has traversed a wireless network is mixed with conventional high-rate PON-only traffic. A range of different FiWi network architectures with different dynamic bandwidth allocation (DBA) mechanisms is considered. The grouping of the optical network units (ONUs) in the double-phase polling (DPP) DBA mechanism in long-range (on the order of 100 km) FiWi networks is closely examined, and a novel grouping by cycle length (GCL) strategy that achieves favorable packet delay performance is introduced. Finally, this work proposes a novel backhaul network architecture based on a Smart Gateway (Sm-GW) between the small cell base stations (e.g., LTE eNBs) and the conventional backhaul gateways, e.g., the LTE Serving/Packet Gateway (S/P-GW). The Sm-GW accommodates a flexible number of small cells while reducing the infrastructure requirements at the S-GW of the LTE backhaul. In contrast to existing methods, the proposed Sm-GW incorporates scheduling mechanisms to achieve network fairness while sharing the resources among all the connected small cell base stations.

Contributors: Dashti, Yousef (Author) / Reisslein, Martin (Thesis advisor) / Zhang, Yanchao (Committee member) / Fowler, John (Committee member) / Seeling, Patrick (Committee member) / Arizona State University (Publisher)

Created: 2016

Visual analytics for spatiotemporal cluster analysis

Description

Traditionally, visualization is one of the most important and commonly used methods of generating insight into large-scale data. Particularly for spatiotemporal data, the translation of such data into a visual form allows users to quickly see patterns, explore summaries, and relate domain knowledge about underlying geographical phenomena that would not be apparent in tabular form. However, several critical challenges arise when visualizing and exploring these large spatiotemporal datasets. While the underlying geographical component of the data lends itself well to univariate visualization in the form of traditional cartographic representations (e.g., choropleth, isopleth, dasymetric maps), as the data becomes multivariate, cartographic representations become more complex. To simplify the visual representations, analytical methods such as clustering and feature extraction are often applied as part of the classification phase. The automatic classification can then be rendered onto a map; however, one common issue in data classification is that items near a classification boundary are often mislabeled.

This thesis explores methods to augment the automated spatial classification by utilizing interactive machine learning as part of the cluster creation step. First, this thesis explores the design space for spatiotemporal analysis through the development of a comprehensive data wrangling and exploratory data analysis platform. Second, this system is augmented with a novel method for evaluating the visual impact of edge cases for multivariate geographic projections. Finally, system features and functionality are demonstrated through a series of case studies, with key features including similarity analysis, multivariate clustering, and novel visual support for cluster comparison.

Contributors: Zhang, Yifan (Author) / Maciejewski, Ross (Thesis advisor) / Mack, Elizabeth (Committee member) / Liu, Huan (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)

Created: 2016

Improved, scalable, and personalized context recovery system: E-TweetSense

Description

Browsing Twitter users, or browsers, often find it increasingly cumbersome to attach meaning to tweets that are displayed on their timeline as they follow more and more users or pages. The tweets being browsed are created by Twitter users called originators, and are of some significance to the browser, who has chosen to subscribe to the originator's tweets by following the originator. Although hashtags are used to tag tweets in an effort to attach context to them, many tweets do not have a hashtag. Such tweets are called orphan tweets, and they adversely affect the experience of a browser.

A hashtag is a type of label or meta-data tag used in social networks and micro-blogging services which makes it easier for users to find messages with a specific theme or content. The context of a tweet can be defined as a set of one or more hashtags. Users often do not use hashtags to tag their tweets. This leads to the problem of missing context for tweets. To address the problem of missing hashtags, a statistical method was proposed which predicts most likely hashtags based on the social circle of an originator.

In this thesis, we propose to improve on the existing context recovery system by selectively limiting the candidate set of hashtags to those derived from the intimate circle of the originator rather than from every user in the originator's social network. This reduces computation, increases the speed of prediction, and scales the system to originators with large social networks, while still preserving most of the accuracy of the predictions. We also propose to derive the candidate hashtags not only from the social network of the originator but also from the content of the tweet. We further propose to learn personalized statistical models according to the adoption patterns of different originators. This helps not only in identifying the personalized candidate set of hashtags based on the social circle and the content of the tweets but also in customizing the hashtag adoption pattern to the originator of the tweet.

Contributors: Mallapura Umamaheshwar, Tejas (Author) / Kambhampati, Subbarao (Thesis advisor) / Liu, Huan (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)

Created: 2015

ASU Electronic Theses and Dissertations
