Matching Items (8)

Visual Analytic Tools for Geo-Genealogy and Geo-Demographics

Description

This work explores the development of a visual analytics tool for geodemographic exploration in an online environment. We mine 78 million records from the United States white pages, link the location data to demographic data (specifically income) from the United States Census Bureau, and allow users to interactively compare distributions of names with regard to spatial location similarity and income. To enable interactive similarity exploration, we investigate methods of pre-processing the data as well as performing on-the-fly lookups. As data becomes larger and more complex, the development of appropriate data storage and analytics solutions becomes even more critical for enabling online visualization. We discuss implementation challenges, design decisions, and directions for future work.
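
As an illustration of the kind of name-distribution comparison described above, the sketch below computes the cosine similarity between two surnames' per-county frequency distributions. The record layout, column names (surname, county_fips), and toy data are assumptions for illustration, not the tool's actual schema or pre-processing pipeline.

```python
# Hypothetical sketch: comparing two surnames by the similarity of their
# spatial (per-county) distributions. Data layout is illustrative only.
import numpy as np
import pandas as pd

def name_distribution(records: pd.DataFrame, surname: str) -> pd.Series:
    """Normalized count of a surname per county (keyed by FIPS code)."""
    counts = (records.loc[records["surname"] == surname]
                     .groupby("county_fips").size())
    return counts / counts.sum()

def spatial_similarity(records: pd.DataFrame, name_a: str, name_b: str) -> float:
    """Cosine similarity between the two surnames' county distributions."""
    a = name_distribution(records, name_a)
    b = name_distribution(records, name_b)
    a, b = a.align(b, fill_value=0.0)          # union of counties
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy frame standing in for the white-pages records:
records = pd.DataFrame({
    "surname":     ["Smith", "Smith", "Schmidt", "Schmidt", "Smith"],
    "county_fips": ["04013", "04013", "04013",   "17031",   "17031"],
})
print(spatial_similarity(records, "Smith", "Schmidt"))
```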

Date Created
  • 2014-05

Visual Analytics Methods for Exploring Geographically Networked Phenomena

Description

The connections between different entities define different kinds of networks, and many such networked phenomena are influenced by their underlying geographical relationships. By integrating network and geospatial analysis, this work aims to extract information about interaction topologies and their relationships to the underlying geographical constructs. In recent decades, much work has been done on analyzing the dynamics of spatial networks; however, many challenges remain in this field. First, the development of social media and transportation technologies has greatly reshaped the topologies of communication between different geographical regions. Second, the distance metrics used in spatial analysis should be enriched with the underlying network information to develop accurate models.
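
To make the second point concrete, the sketch below contrasts straight-line distance with a distance computed over an underlying infrastructure network. The tiny road graph, coordinates, and weights are invented placeholders, not data or methods from the thesis.

```python
# Illustrative sketch: network-enriched distance vs. Euclidean distance.
import heapq
import math

def euclidean(p, q):
    return math.dist(p, q)

def network_distance(graph, source, target):
    """Dijkstra shortest-path distance over a weighted adjacency dict."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > dist.get(node, math.inf):
            continue                      # stale queue entry
        for neighbor, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(neighbor, math.inf):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return math.inf

# Two regions that are close "as the crow flies" but far apart on the
# transportation network (e.g., separated by a river with one bridge).
coords = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (0.5, 5.0)}
roads = {"A": {"C": 5.02}, "C": {"A": 5.02, "B": 5.02}, "B": {"C": 5.02}}
print(euclidean(coords["A"], coords["B"]))   # ~1.0
print(network_distance(roads, "A", "B"))     # ~10.04
```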

Visual analytics provides methods for data exploration, pattern recognition, and knowledge discovery. However, despite the long history of geovisualization and network visual analytics, little work has been done to develop visual analytics tools that focus specifically on geographically networked phenomena. This thesis develops a variety of visualization methods to present data values and geospatial network relationships, which enable users to interactively explore the data. Users can investigate connections in both virtual and geospatial networks, and the underlying geographical context can be used to improve knowledge discovery. The focus of this thesis is on social media analysis and geographical hotspot optimization. A framework is proposed for social network analysis to unveil the links between social media interactions and their underlying networked geospatial phenomena. This is combined with a novel hotspot approach that improves hotspot identification and boundary detection using networks extracted from urban infrastructure. Several real-world problems have been analyzed using the proposed visual analytics frameworks. These studies and experiments show that visual analytics methods can help analysts explore such data from multiple perspectives and support the knowledge discovery process.

Date Created
  • 2017

Visual analytics methodologies on causality analysis

Description

Causality analysis is the process of identifying cause-effect relationships among variables. This process is challenging because causal relationships cannot be tested solely on the basis of statistical indicators: additional information is always needed to reduce the ambiguity caused by factors beyond those covered by the statistical test. Traditionally, controlled experiments are carried out to identify causal relationships, but recently there has been growing interest in causality analysis with observational data due to the increasing availability of data and tools. This type of analysis often involves automatic algorithms that extract causal relations from large amounts of data and relies on expert judgment to scrutinize and verify those relations. Over-reliance on these automatic algorithms is dangerous because models trained on observational data are susceptible to biases that can be difficult to spot even with expert oversight.

Visualization has proven effective at bridging the gap between human experts and statistical models by enabling interactive exploration and manipulation of the data and models. This thesis develops a visual analytics framework to support the interaction between human experts and automatic models in causality analysis. Three case studies were conducted to demonstrate the application of the framework, showcasing feature engineering, insight generation, correlation analysis, and causality inspection.
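
A minimal sketch of why statistical indicators alone are ambiguous: two variables can be strongly correlated solely because of a shared driver, and the association vanishes once that driver is conditioned on. The synthetic variables below are purely illustrative and are not from the thesis's case studies.

```python
# Synthetic example: raw correlation between x and y is spurious because
# both are driven by a common cause z; partial correlation reveals this.
import numpy as np

def partial_corr(x, y, z):
    """Correlation of x and y after regressing out z from both."""
    Z = np.column_stack([np.ones_like(z), z])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

rng = np.random.default_rng(0)
z = rng.normal(size=5000)             # hidden common cause
x = 2 * z + rng.normal(size=5000)     # x is driven by z
y = -3 * z + rng.normal(size=5000)    # y is driven by z as well

print(np.corrcoef(x, y)[0, 1])   # strong raw correlation (spurious)
print(partial_corr(x, y, z))     # near zero once z is conditioned on
```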

Date Created
  • 2019

The role of teamwork in predicting movie earnings

Description

Intelligence analysts' work has become progressively more complex due to increasing security threats and data availability. To study "big data" exploration within the intelligence domain, the intelligence analyst's task was abstracted and replicated in a laboratory (controlled) environment. Participants used a computer interface and a movie database to predict the opening-weekend gross earnings of three pre-selected movies. The data consisted of Twitter tweets and predictive models, displayed in various formats such as graphs, charts, and text, which participants used to make their predictions. It was expected that teams (a team being a group whose members have different specialties and work interdependently) would outperform individuals and groups; that is, teams would be significantly better at predicting "Opening Weekend Gross" than individuals or groups. Results indicated that teams outperformed individuals and groups on the first prediction, underperformed on the second prediction, and performed better than individuals on the third prediction (but not better than groups). Insights and future directions are discussed.
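
One common way to compare prediction quality across conditions is mean absolute percentage error against the actual opening-weekend gross. The sketch below uses invented placeholder numbers, not the study's data, purely to show the comparison.

```python
# Hedged sketch with placeholder values (not the experiment's results).
def mape(actual, predicted):
    """Mean absolute percentage error over a set of movie predictions."""
    return 100 * sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)

actual = [52.0, 18.5, 33.0]                     # $M, placeholder values
predictions = {
    "individuals": [40.0, 25.0, 20.0],
    "groups":      [47.0, 21.0, 38.0],
    "teams":       [50.0, 30.0, 30.0],
}
for condition, preds in predictions.items():
    print(f"{condition}: MAPE = {mape(actual, preds):.1f}%")
```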

Date Created
  • 2016

Bridging cyber and physical programming classes: an application of semantic visual analytics for programming exams

Description

With the advent of Massive Open Online Courses (MOOCs), educators have the opportunity to collect data from students and use it to derive insightful information about them. Specifically, for programming-based courses, the ability to identify the areas or topics that need more attention from students can be of immense help. However, most traditional, non-virtual classes lack the ability to uncover such information, which could serve as feedback on the effectiveness of teaching. In the majority of schools, paper exams and assignments provide the only form of assessment for measuring students' success in achieving the course objectives, and the overall grade they yield does not necessarily present a complete picture of a student's strengths and weaknesses. In part, this can be addressed by incorporating research-based technology into classrooms to obtain real-time updates on students' progress, but introducing technology for real-time, class-wide engagement involves a considerable academic and financial investment, which hinders its adoption and, with it, the ideal technology-enabled classroom. With increasing class sizes, it is becoming impossible for teachers to keep persistent track of their students' progress and to provide personalized feedback. Can we provide technology support without adding more burden to the existing pedagogical approach? How can we enable semantic enrichment of exams that translates to students' understanding of the topics taught in class? Can we provide feedback to students that goes beyond numbers and reveals the areas that need their focus?

In this research, I focus on bringing insightful analysis to paper exams through a less intrusive learning analytics approach that fits generic classrooms with minimal technology introduction. Specifically, the work focuses on the automatic indexing of programming exam questions with ontological semantics, and on designing and evaluating a novel semantic visual analytics suite for in-depth course monitoring. By visualizing the semantic information to illustrate the areas that need a student's focus and enabling teachers to see class-level progress, the system provides richer feedback to both sides for improvement.
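
The core idea of ontological indexing can be sketched as mapping each exam question to the concepts whose indicator keywords it mentions. The concept list and keywords below are invented placeholders, not the ontology or indexing method actually used in the thesis.

```python
# Hedged sketch: naive keyword-based indexing of a programming exam question
# against a toy ontology fragment (concept -> indicator keywords).
from typing import Dict, List

ONTOLOGY: Dict[str, List[str]] = {
    "loops":     ["for", "while", "iterate", "loop"],
    "recursion": ["recursive", "recursion", "base case"],
    "arrays":    ["array", "index", "element"],
    "pointers":  ["pointer", "dereference", "address"],
}

def index_question(question: str) -> List[str]:
    """Return the ontology concepts whose keywords appear in the question."""
    text = question.lower()
    return [concept for concept, keywords in ONTOLOGY.items()
            if any(kw in text for kw in keywords)]

question = ("Write a recursive function that sums the elements of an "
            "array without using a loop.")
print(index_question(question))   # ['loops', 'recursion', 'arrays']
```

A real indexer would need tokenization and disambiguation rather than substring matching, but the sketch shows how question-level concept tags can feed class-level visual summaries.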

Date Created
  • 2016

A visual analytics based decision support methodology for evaluating low energy building design alternatives

Description

The ability to design high-performance buildings has acquired great importance in recent years due to numerous federal, societal, and environmental initiatives. However, this endeavor is far more demanding than conventional design in terms of designer expertise and time. It requires a new level of synergy between automated performance prediction and the human capability to perceive, evaluate, and ultimately select a suitable solution. While performance prediction can be highly automated through the use of computers, performance evaluation cannot, unless it is with respect to a single criterion. The need to address multi-criteria requirements makes it more valuable for a designer to know the "latitude" or "degrees of freedom" available in changing certain design variables while achieving preset criteria such as energy performance, life-cycle cost, and environmental impact. This requirement can be met by a decision support framework based on near-optimal "satisficing" rather than purely optimal decision-making techniques. Currently, such a comprehensive design framework is lacking, which is the basis for undertaking this research.

The primary objective of this research is to facilitate a complementary relationship between designers and computers for Multi-Criterion Decision Making (MCDM) during high-performance building design. It is based on the application of Monte Carlo approaches to create a database of solutions using deterministic whole-building energy simulations, along with data mining methods to rank variable importance and reduce the multi-dimensionality of the problem. A novel interactive visualization approach is then proposed that uses regression-based models to show dynamically how varying these important variables affects the multiple criteria, while providing a visual range or band of variation for the different design parameters. The MCDM process has been incorporated into an alternative methodology for high-performance building design referred to as the Visual Analytics based Decision Support Methodology (VADSM). VADSM is envisioned to be most useful during the conceptual and early design performance modeling stages by providing a set of potential solutions that can be analyzed further for final design selection. The proposed methodology can be used for new building design synthesis as well as for evaluating retrofits and operational deficiencies in existing buildings.
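
The general Monte Carlo/surrogate pattern described above can be sketched as follows. The design variables, their ranges, the stand-in "simulation," and the use of standardized regression coefficients for importance ranking are assumptions for illustration, not the thesis's actual models or simulation engine.

```python
# Hedged sketch: sample candidate designs, run a stand-in "simulation",
# fit a linear surrogate, and rank variables by standardized coefficients.
import numpy as np

rng = np.random.default_rng(42)
n = 2000

# Sample candidate designs over plausible (made-up) variable ranges.
designs = {
    "window_to_wall_ratio": rng.uniform(0.1, 0.6, n),
    "insulation_r_value":   rng.uniform(10, 40, n),
    "cooling_setpoint_c":   rng.uniform(22, 27, n),
}
X = np.column_stack(list(designs.values()))

# Placeholder for a deterministic whole-building energy simulation.
energy = (120 * X[:, 0] - 1.5 * X[:, 1] - 4.0 * X[:, 2]
          + rng.normal(0, 5, n) + 200)

# Fit a linear surrogate on standardized inputs; coefficient magnitude
# serves as a simple variable-importance ranking.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
coef, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), Xs]), energy, rcond=None)
importance = dict(zip(designs, np.abs(coef[1:])))
for name, score in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.1f}")
```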

Date Created
  • 2013

Visual analytics for spatiotemporal cluster analysis

Description

Traditionally, visualization is one of the most important and commonly used methods of generating insight into large-scale data. Particularly for spatiotemporal data, the translation of such data into a visual form allows users to quickly see patterns, explore summaries, and relate domain knowledge about underlying geographical phenomena that would not be apparent in tabular form. However, several critical challenges arise when visualizing and exploring these large spatiotemporal datasets. While the underlying geographical component of the data lends itself well to univariate visualization in the form of traditional cartographic representations (e.g., choropleth, isopleth, and dasymetric maps), as the data becomes multivariate, cartographic representations become more complex. To simplify the visual representations, analytical methods such as clustering and feature extraction are often applied as part of the classification phase. The automatic classification can then be rendered onto a map; however, one common issue in data classification is that items near a classification boundary are often mislabeled.
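
One way to make the boundary-mislabeling issue concrete is to flag points whose distances to their nearest and second-nearest cluster centroids are almost equal; these are the candidates most worth surfacing to an analyst. The sketch below uses k-means (via scikit-learn) and synthetic data as stand-ins for the system's actual clustering and interaction.

```python
# Hypothetical sketch: ranking points by how ambiguous their cluster
# assignment is (small margin between nearest and second-nearest centroid).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc, 0.8, size=(200, 2)) for loc in (-2, 2)])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
dists = np.sort(km.transform(X), axis=1)      # distances to each centroid
margin = dists[:, 1] - dists[:, 0]            # small margin = near boundary

boundary_idx = np.argsort(margin)[:10]        # ten most ambiguous points
print(km.labels_[boundary_idx])
print(X[boundary_idx])
```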

This thesis explores methods to augment the automated spatial classification by utilizing interactive machine learning as part of the cluster creation step. First, this thesis explores the design space for spatiotemporal analysis through the development of a comprehensive data wrangling and exploratory data analysis platform. Second, this system is augmented with a novel method for evaluating the visual impact of edge cases for multivariate geographic projections. Finally, system features and functionality are demonstrated through a series of case studies, with key features including similarity analysis, multivariate clustering, and novel visual support for cluster comparison.

Date Created
  • 2016

Methodologies in Predictive Visual Analytics

Description

Predictive analytics embraces an extensive range of techniques, from statistical modeling to machine learning to data mining, and is applied in business intelligence, public health, disaster management and response, and many other fields. To date, visualization has been widely used to support tasks in the predictive analytics pipeline under the assumption that a human in the loop can aid the analysis by integrating domain knowledge that might not be fully captured by the system. Primary uses of visualization in the predictive analytics pipeline have focused on data cleaning, exploratory analysis, and diagnostics. More recently, numerous visual analytics systems for feature selection, incremental learning, and various prediction tasks have been proposed to support the growing use of complex models, agent-specific optimization, and comprehensive model comparison and result exploration. Such work is being driven by advances in interactive machine learning and the desire of end users to understand and engage with the modeling process. However, despite the numerous and promising applications of visual analytics to predictive analytics tasks, work to assess the effectiveness of predictive visual analytics is lacking.

This thesis studies current methodologies in predictive visual analytics. It first defines the scope of predictive analytics and presents a predictive visual analytics (PVA) pipeline. Following the proposed pipeline, a predictive visual analytics framework is developed and used to explore under what circumstances a human-in-the-loop prediction process is most effective. This framework combines sentiment analysis, feature selection mechanisms, similarity comparisons, and model cross-validation through a variety of interactive visualizations to support analysts in model building and prediction. To test the proposed framework, an instantiation for movie box-office prediction is developed and evaluated. Results from small-scale user studies are presented and discussed, and a generalized user study is carried out to assess the role of predictive visual analytics in a movie box-office prediction scenario.
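
As a rough sketch of the feature-selection and cross-validation loop such a framework makes interactive, the snippet below fits a model to synthetic stand-ins for social-media and release features. The feature names, data, target, and model choice are assumptions for illustration only, not the framework's actual components.

```python
# Hedged sketch: cross-validate a box-office model and inspect which
# (made-up) features matter most.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 300
features = {
    "tweet_volume":    rng.lognormal(10, 1, n),
    "tweet_sentiment": rng.uniform(-1, 1, n),
    "screen_count":    rng.integers(500, 4000, n).astype(float),
    "budget_musd":     rng.uniform(5, 200, n),
}
X = np.column_stack(list(features.values()))
# Placeholder target: opening-weekend gross as a noisy function of the inputs.
y = (0.002 * X[:, 0] + 8 * X[:, 1] + 0.01 * X[:, 2] + 0.3 * X[:, 3]
     + rng.normal(0, 5, n))

model = RandomForestRegressor(n_estimators=200, random_state=0)
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print("cross-validated R^2:", scores.mean().round(3))

model.fit(X, y)
for name, imp in sorted(zip(features, model.feature_importances_),
                        key=lambda kv: -kv[1]):
    print(f"{name}: {imp:.2f}")
```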

Date Created
  • 2017