Search Content

Methodologies in Predictive Visual Analytics

Description

Predictive analytics embraces an extensive area of techniques from statistical modeling to machine learning to data mining and is applied in business intelligence, public health, disaster management and response, and many other fields. To date, visualization has been broadly used to support tasks in the predictive analytics pipeline under the…

Predictive analytics embraces an extensive area of techniques from statistical modeling to machine learning to data mining and is applied in business intelligence, public health, disaster management and response, and many other fields. To date, visualization has been broadly used to support tasks in the predictive analytics pipeline under the underlying assumption that a human-in-the-loop can aid the analysis by integrating domain knowledge that might not be broadly captured by the system. Primary uses of visualization in the predictive analytics pipeline have focused on data cleaning, exploratory analysis, and diagnostics. More recently, numerous visual analytics systems for feature selection, incremental learning, and various prediction tasks have been proposed to support the growing use of complex models, agent-specific optimization, and comprehensive model comparison and result exploration. Such work is being driven by advances in interactive machine learning and the desire of end-users to understand and engage with the modeling process. However, despite the numerous and promising applications of visual analytics to predictive analytics tasks, work to assess the effectiveness of predictive visual analytics is lacking.

This thesis studies the current methodologies in predictive visual analytics. It first defines the scope of predictive analytics and presents a predictive visual analytics (PVA) pipeline. Following the proposed pipeline, a predictive visual analytics framework is developed to be used to explore under what circumstances a human-in-the-loop prediction process is most effective. This framework combines sentiment analysis, feature selection mechanisms, similarity comparisons and model cross-validation through a variety of interactive visualizations to support analysts in model building and prediction. To test the proposed framework, an instantiation for movie box-office prediction is developed and evaluated. Results from small-scale user studies are presented and discussed, and a generalized user study is carried out to assess the role of predictive visual analytics under a movie box-office prediction scenario.

ContributorsLu, Yafeng (Author) / Maciejewski, Ross (Thesis advisor) / Cooke, Nancy J. (Committee member) / Liu, Huan (Committee member) / He, Jingrui (Committee member) / Arizona State University (Publisher)

Created2017

Computer Science Education: A Game to Teach Children about Programming

Description

Computational thinking, the fundamental way of thinking in computer science, including information sourcing and problem solving behind programming, is considered vital to children who live in a digital era. Most of current educational games designed to teach children about coding either rely on external curricular materials or are too complicated…

Computational thinking, the fundamental way of thinking in computer science, including information sourcing and problem solving behind programming, is considered vital to children who live in a digital era. Most of current educational games designed to teach children about coding either rely on external curricular materials or are too complicated to work well with young children. In this thesis project, Guardy, an iOS tower defense game, was developed to help children over 8 years old learn about and practice using basic concepts in programming. The game is built with the SpriteKit, a graphics rendering and animation infrastructure in Apple’s integrated development environment Xcode. It simplifies switching among different game scenes and animating game sprites in the development. In a typical game, a sequence of operations is arranged by players to destroy incoming enemy minions. Basic coding concepts like looping, sequencing, conditionals, and classification are integrated in different levels. In later levels, players are required to type in commands and put them in an order to keep playing the game. To reduce the difficulty of the usability testing, a method combining questionnaires and observation was conducted with two groups of college students who either have no programming experience or are familiar with coding. The results show that Guardy has the potential to help children learn programming and practice computational thinking.

ContributorsWang, Xiaoxiao (Author) / Nelson, Brian C. (Thesis advisor) / Turaga, Pavan (Committee member) / Walker, Erin (Committee member) / Arizona State University (Publisher)

Created2017

Scaling Up Large-scale Sparse Learning and Its Application to Medical Imaging

Description

Large-scale $\ell_1$-regularized loss minimization problems arise in high-dimensional applications such as compressed sensing and high-dimensional supervised learning, including classification and regression problems. In many applications, it remains challenging to apply the sparse learning model to large-scale problems that have massive data samples with high-dimensional features. One popular and promising strategy…

Large-scale $\ell_1$-regularized loss minimization problems arise in high-dimensional applications such as compressed sensing and high-dimensional supervised learning, including classification and regression problems. In many applications, it remains challenging to apply the sparse learning model to large-scale problems that have massive data samples with high-dimensional features. One popular and promising strategy is to scaling up the optimization problem in parallel. Parallel solvers run multiple cores on a shared memory system or a distributed environment to speed up the computation, while the practical usage is limited by the huge dimension in the feature space and synchronization problems.

In this dissertation, I carry out the research along the direction with particular focuses on scaling up the optimization of sparse learning for supervised and unsupervised learning problems. For the supervised learning, I firstly propose an asynchronous parallel solver to optimize the large-scale sparse learning model in a multithreading environment. Moreover, I propose a distributed framework to conduct the learning process when the dataset is distributed stored among different machines. Then the proposed model is further extended to the studies of risk genetic factors for Alzheimer's Disease (AD) among different research institutions, integrating a group feature selection framework to rank the top risk SNPs for AD. For the unsupervised learning problem, I propose a highly efficient solver, termed Stochastic Coordinate Coding (SCC), scaling up the optimization of dictionary learning and sparse coding problems. The common issue for the medical imaging research is that the longitudinal features of patients among different time points are beneficial to study together. To further improve the dictionary learning model, I propose a multi-task dictionary learning method, learning the different task simultaneously and utilizing shared and individual dictionary to encode both consistent and changing imaging features.

ContributorsLi, Qingyang (Author) / Ye, Jieping (Thesis advisor) / Xue, Guoliang (Thesis advisor) / He, Jingrui (Committee member) / Wang, Yalin (Committee member) / Li, Jing (Committee member) / Arizona State University (Publisher)

Created2017

Doppler Lidar Vector Retrievals and Atmospheric Data Visualization in Mixed/Augmented Reality

Description

Environmental remote sensing has seen rapid growth in the recent years and Doppler wind lidars have gained popularity primarily due to their non-intrusive, high spatial and temporal measurement capabilities. While lidar applications early on, relied on the radial velocity measurements alone, most of the practical applications in wind farm…

Environmental remote sensing has seen rapid growth in the recent years and Doppler wind lidars have gained popularity primarily due to their non-intrusive, high spatial and temporal measurement capabilities. While lidar applications early on, relied on the radial velocity measurements alone, most of the practical applications in wind farm control and short term wind prediction require knowledge of the vector wind field. Over the past couple of years, multiple works on lidars have explored three primary methods of retrieving wind vectors viz., using homogeneous windfield assumption, computationally extensive variational methods and the use of multiple Doppler lidars.

Building on prior research, the current three-part study, first demonstrates the capabilities of single and dual Doppler lidar retrievals in capturing downslope windstorm-type flows occurring at Arizona’s Barringer Meteor Crater as a part of the METCRAX II field experiment. Next, to address the need for a reliable and computationally efficient vector retrieval for adaptive wind farm control applications, a novel 2D vector retrieval based on a variational formulation was developed and applied on lidar scans from an offshore wind farm and validated with data from a cup and vane anemometer installed on a nearby research platform. Finally, a novel data visualization technique using Mixed Reality (MR)/ Augmented Reality (AR) technology is presented to visualize data from atmospheric sensors. MR is an environment in which the user's visual perception of the real world is enhanced with live, interactive, computer generated sensory input (in this case, data from atmospheric sensors like Doppler lidars). A methodology using modern game development platforms is presented and demonstrated with lidar retrieved wind fields. In the current study, the possibility of using this technology to visualize data from atmospheric sensors in mixed reality is explored and demonstrated with lidar retrieved wind fields as well as a few earth science datasets for education and outreach activities.

ContributorsCherukuru, Nihanth Wagmi (Author) / Calhoun, Ronald (Thesis advisor) / Newsom, Rob (Committee member) / Huang, Huei Ping (Committee member) / Chen, Kangping (Committee member) / Dahm, Werner (Committee member) / Arizona State University (Publisher)

Created2017

Hybrid Multiresolution Simulation & Model Checking: Network-On-Chip Systems

Description

Designers employ a variety of modeling theories and methodologies to create functional models of discrete network systems. These dynamical models are evaluated using verification and validation techniques throughout incremental design stages. Models created for these systems should directly represent their growing complexity with respect to composition and heterogeneity. Similar to…

Designers employ a variety of modeling theories and methodologies to create functional models of discrete network systems. These dynamical models are evaluated using verification and validation techniques throughout incremental design stages. Models created for these systems should directly represent their growing complexity with respect to composition and heterogeneity. Similar to software engineering practices, incremental model design is required for complex system design. As a result, models at early increments are significantly simpler relative to real systems. While experimenting (verification or validation) on models at early increments are computationally less demanding, the results of these experiments are less trustworthy and less rewarding. At any increment of design, a set of tools and technique are required for controlling the complexity of models and experimentation.

A complex system such as Network-on-Chip (NoC) may benefit from incremental design stages. Current design methods for NoC rely on multiple models developed using various modeling frameworks. It is useful to develop frameworks that can formalize the relationships among these models. Fine-grain models are derived using their coarse-grain counterparts. Moreover, validation and verification capability at various design stages enabled through disciplined model conversion is very beneficial.

In this research, Multiresolution Modeling (MRM) is used for system level design of NoC. MRM aids in creating a family of models at different levels of scale and complexity with well-formed relationships. In addition, a variant of the Discrete Event System Specification (DEVS) formalism is proposed which supports model checking. Hierarchical models of Network-on-Chip components may be created at different resolutions while each model can be validated using discrete-event simulation and verified via state exploration. System property expressions are defined in the DEVS language and developed as Transducers which can be applied seamlessly for model checking and simulation purposes.

Multiresolution Modeling with verification and validation capabilities of this framework complement one another. MRM manages the scale and complexity of models which in turn can reduces V&V time and effort and conversely the V&V helps ensure correctness of models at multiple resolutions. This framework is realized through extending the DEVS-Suite simulator and its applicability demonstrated for exemplar NoC models.

ContributorsGholami, Soroosh (Author) / Sarjoughian, Hessam S. (Thesis advisor) / Fainekos, Georgios (Committee member) / Ogras, Umit Y. (Committee member) / Shrivastava, Aviral (Committee member) / Arizona State University (Publisher)

Created2017

Threats, Countermeasures, and Research Trends for BLE-based IoT Devices

Description

The Internet of Things has conjured up a storm in the technology world by providing novel methods to connect, exchange, aggregate, and monitor data across a system of inter-related devices and entities. Of the myriad technologies that aid in the functioning of these IoT devices, Bluetooth Low Energy also known…

The Internet of Things has conjured up a storm in the technology world by providing novel methods to connect, exchange, aggregate, and monitor data across a system of inter-related devices and entities. Of the myriad technologies that aid in the functioning of these IoT devices, Bluetooth Low Energy also known as BLE plays a major role in establishing inter-connectivity amongst these devices. This thesis aims to provide a background on BLE, the type of attacks that could occur in an IoT setting, the possible defenses that are available to prevent the occurrence of such attacks, and a discussion on the research trends that hold great promise in presenting seamless solutions to integrate IoT devices across different industry verticals.

ContributorsPammi, Ashwath Anand (Author) / Doupe, Adam (Thesis advisor) / Dasgupta, Partha (Committee member) / Ahn, Gail Joon (Committee member) / Arizona State University (Publisher)

Created2017

Perturbation Robust Representations of Topological Persistence Diagrams

Description

Topological methods for data analysis present opportunities for enforcing certain invariances of broad interest in computer vision: including view-point in activity analysis, articulation in shape analysis, and measurement invariance in non-linear dynamical modeling. The increasing success of these methods is attributed to the complementary information that topology provides, as well…

Topological methods for data analysis present opportunities for enforcing certain invariances of broad interest in computer vision: including view-point in activity analysis, articulation in shape analysis, and measurement invariance in non-linear dynamical modeling. The increasing success of these methods is attributed to the complementary information that topology provides, as well as availability of tools for computing topological summaries such as persistence diagrams. However, persistence diagrams are multi-sets of points and hence it is not straightforward to fuse them with features used for contemporary machine learning tools like deep-nets. In this paper theoretically well-grounded approaches to develop novel perturbation robust topological representations are presented, with the long-term view of making them amenable to fusion with contemporary learning architectures. The proposed representation lives on a Grassmann manifold and hence can be efficiently used in machine learning pipelines.

The proposed representation.The efficacy of the proposed descriptor was explored on three applications: view-invariant activity analysis, 3D shape analysis, and non-linear dynamical modeling. Favorable results in both high-level recognition performance and improved performance in reduction of time-complexity when compared to other baseline methods are obtained.

ContributorsThopalli, Kowshik (Author) / Turaga, Pavan Kumar (Thesis advisor) / Papandreou-Suppappola, Antonia (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)

Created2017

A Framework for Interactive Geospatial Map Cleaning using GPS Trajectories

Description

A volunteered geographic information system, e.g., OpenStreetMap (OSM), collects data from volunteers to generate geospatial maps. To keep the map consistent, volunteers are expected to perform the tedious task of updating the underlying geospatial data at regular intervals. Such a map curation step takes time and considerable human effort. In…

A volunteered geographic information system, e.g., OpenStreetMap (OSM), collects data from volunteers to generate geospatial maps. To keep the map consistent, volunteers are expected to perform the tedious task of updating the underlying geospatial data at regular intervals. Such a map curation step takes time and considerable human effort. In this thesis, we propose a framework that improves the process of updating geospatial maps by automatically identifying road changes from user-generated GPS traces. Since GPS traces can be sparse and noisy, the proposed framework validates the map changes with the users before propagating them to a publishable version of the map. The proposed framework achieves up to four times faster map matching performance than the state-of-the-art algorithms with only 0.1-0.3% accuracy loss.

ContributorsVementala, Nikhil (Author) / Papotti, Paolo (Thesis advisor) / Sarwat, Mohamed (Thesis advisor) / Kasim, Selçuk Candan (Committee member) / Arizona State University (Publisher)

Created2017

Forensic Methods and Tools for Web Environments

Description

The Web is one of the most exciting and dynamic areas of development in today’s technology. However, with such activity, innovation, and ubiquity have come a set of new challenges for digital forensic examiners, making their jobs even more difficult. For examiners to become as effective with evidence from the…

The Web is one of the most exciting and dynamic areas of development in today’s technology. However, with such activity, innovation, and ubiquity have come a set of new challenges for digital forensic examiners, making their jobs even more difficult. For examiners to become as effective with evidence from the Web as they currently are with more traditional evidence, they need (1) methods that guide them to know how to approach this new type of evidence and (2) tools that accommodate web environments’ unique characteristics.

In this dissertation, I present my research to alleviate the difficulties forensic examiners currently face with respect to evidence originating from web environments. First, I introduce a framework for web environment forensics, which elaborates on and addresses the key challenges examiners face and outlines a method for how to approach web-based evidence. Next, I describe my work to identify extensions installed on encrypted web thin clients using only a sound understanding of these systems’ inner workings and the metadata of the encrypted files. Finally, I discuss my approach to reconstructing the timeline of events on encrypted web thin clients by using service provider APIs as a proxy for directly analyzing the device. In each of these research areas, I also introduce structured formats that I customized to accommodate the unique features of the evidence sources while also facilitating tool interoperability and information sharing.

ContributorsMabey, Michael Kent (Author) / Ahn, Gail-Joon (Thesis advisor) / Doupe, Adam (Thesis advisor) / Yau, Stephen S. (Committee member) / Lee, Joohyung (Committee member) / Zhao, Ziming (Committee member) / Arizona State University (Publisher)

Created2017

HEF: A Hardware-Assisted Security Evaluation Framework

Description

Hardware-Assisted Security (HAS) is an emerging technology that addresses the shortcomings of software-based virtualized environment. There are two major weaknesses of software-based virtualization that HAS attempts to address - performance overhead and security issues. Performance overhead caused by software-based virtualization is due to the use of additional software layer (i.e.,…

Hardware-Assisted Security (HAS) is an emerging technology that addresses the shortcomings of software-based virtualized environment. There are two major weaknesses of software-based virtualization that HAS attempts to address - performance overhead and security issues. Performance overhead caused by software-based virtualization is due to the use of additional software layer (i.e., hypervisor). Since the performance is highly related to efficiency of processing data and providing services, reducing performance overhead is one of the major concerns in data centers and enterprise networks. Software-based virtualization also imposes additional security issues in the virtualized environments. To resolve those issues, HAS is developed to offload security functions from application layer to a dedicated hardware, thereby achieving almost bare-metal performance and enhanced security. As a result, HAS gained

more popularity and the number of studies regarding efficiency of the technology is increasing.

However, there exists no attempt to our knowledge that provides a generic test mechanism that is universally applicable to all HAS devices. Preparing such a testbed for each specific HAS device is a time-consuming and costly task for hardware manufacturers and network administrators. Therefore, we try to address the demands of hardware vendors and researchers for a generic testbed that can evaluate both performance and security functions of the HAS-enabled systems.

In this thesis, the HAS device evaluation framework (HEF) is defined for hardware vendors, network administrators, and researchers to measure performance of the system with HAS devices. HEF provides a generic test environments for a given HAS device by providing generic test metrics and evaluation mechanisms. HEF is also designed to take user-defined test metrics and test cases to support various hardware. The framework performs the entire process in an automated fashion, and thus it requires no user intervention. Finally, the efficacy of HEF is demonstrated by performing a case study using Intel QuickAssist Technology (QAT) adapter, which is a dedicated PCI express device for cryptographic tasks.

ContributorsKyung, Sukwha (Author) / Ahn, Gail-Joon (Thesis advisor) / Doupe, Adam (Committee member) / Zhao, Ziming (Committee member) / Arizona State University (Publisher)

Created2017

Filtering by