Essays on space-time interaction tests

Description

Researchers across a variety of fields are often interested in determining whether data are random in nature or exhibit patterning that may be the result of some alternative and potentially more interesting process. This dissertation explores a family of statistical methods, namely space-time interaction tests, designed to detect structure within three-dimensional event data. These tests, widely employed in spatial epidemiology, criminology, ecology and beyond, are used to identify synergistic interaction across the spatial and temporal dimensions of a series of events. Exploration is needed to better understand these methods and determine how their results may be affected by data quality problems commonly encountered in their implementation; specifically, how inaccuracy and/or uncertainty in the input data analyzed by the methods may impact subsequent results. Additionally, known shortcomings of the methods must be ameliorated. The contributions of this dissertation are twofold: it develops a more complete understanding of how input data quality problems impact the results of a number of global and local tests of space-time interaction, and it formulates an improved version of one global test which accounts for the previously identified problem of population shift bias. A series of simulation experiments reveals that the global tests of space-time interaction explored here are dramatically affected by the aforementioned deficiencies in the quality of the input data. It is shown that in some cases a conservative degree of these common data problems can completely obscure evidence of space-time interaction, and in others create it where it does not exist. Conversely, a local metric of space-time interaction examined here demonstrates a surprising robustness in the face of these same deficiencies. This local metric is revealed to be only minimally affected by the inaccuracies and incompleteness introduced in these experiments. Finally, enhancements to one of the global tests are presented which solve the problem of population shift bias associated with the test and better contextualize and visualize its results, thereby enhancing its utility for practitioners.
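For readers unfamiliar with this family of methods, the following is a minimal sketch of one classic global space-time interaction test, the Knox test. The abstract does not say which tests the dissertation studies, so the choice of Knox, the function names, and the thresholds `delta` and `tau` are illustrative assumptions rather than the author's implementation.

```python
import math
import random


def knox_statistic(points, times, delta, tau):
    """Count event pairs close in both space (<= delta) and time (<= tau)."""
    n = len(points)
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            near_space = math.dist(points[i], points[j]) <= delta
            near_time = abs(times[i] - times[j]) <= tau
            if near_space and near_time:
                count += 1
    return count


def knox_permutation_test(points, times, delta, tau, permutations=999, seed=0):
    """Pseudo p-value from shuffling event times across fixed locations.

    Shuffling breaks any link between where and when events occur,
    which is the null hypothesis of no space-time interaction.
    """
    rng = random.Random(seed)
    observed = knox_statistic(points, times, delta, tau)
    shuffled = list(times)
    at_least_as_large = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if knox_statistic(points, shuffled, delta, tau) >= observed:
            at_least_as_large += 1
    p_value = (at_least_as_large + 1) / (permutations + 1)
    return observed, p_value
```

Because the statistic depends only on pairwise distances and time gaps, errors in recorded locations or dates can move pairs across the `delta` and `tau` thresholds, which is one plausible way the data quality problems described above could alter a test result.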

Contributors: Malizia, Nicholas (Author) / Anselin, Luc (Thesis advisor) / Murray, Alan (Committee member) / Rey, Sergio (Committee member) / Arizona State University (Publisher)

Created: 2013

How does built environment affect cycling?: evidence from the whole California 2010-2012

Description

The literature has identified a link between the built environment and non-motorized transport. This study aims to contribute to the existing literature on the effects of the built environment on cycling by examining the case of the whole State of California. Physical built environment features are classified into six groups: 1) local density, 2) diversity of land use, 3) road connectivity, 4) bike route length, 5) green space, and 6) job accessibility. Cycling trips in one week for all children, school children, adults and employed adults are investigated separately. The regression analysis shows that cycling trips are significantly associated with some features of the built environment when many socio-demographic factors are taken into account. Street intersections and bike route length tend to increase bicycle use. These effects are well aligned with the literature. Moreover, both local and regional job accessibility variables are statistically significant in the two adult models. However, residential density always has a significant negative effect on cycling trips, a finding that needs further research to confirm. There is also a gap in the literature on how green space affects cycling, but the results of this study are too unclear to fill it. Through elasticity analysis, this study concludes that street intersections are the most powerful predictor of cycling trips. From another perspective, the effects of the built environment on cycling at the workplace (or school) are distinguished from those at home. This study implies that a wide range of measures is available for planners to control vehicle travel by raising cycling levels in California.

Contributors: Wang, Kailai, M.U.E.P (Author) / Salon, Deborah (Thesis advisor) / Rey, Sergio (Committee member) / Li, Wenwen (Committee member) / Arizona State University (Publisher)

Created: 2015

Deriving an obstacle-avoiding shortest path in continuous space: a spatial approach

Description

The shortest path between two locations is important for spatial analysis, location modeling, and wayfinding tasks. Depending on permissible movement and availability of data, the shortest path is either derived from a pre-defined transportation network or constructed in continuous space. However, continuous space movement adds substantial complexity to identifying the shortest path, as the influence of obstacles has to be considered to avoid errors and biases in a derived path. This obstacle-avoiding shortest path in continuous space has been referred to as the Euclidean shortest path (ESP) and has attracted the attention of many researchers. It has been proven that constructing a graph is an effective approach to limit the infinite search options associated with continuous space, reducing the problem to a finite set of potential paths. To date, various methods have been developed for ESP derivation. However, their computational efficiency is limited due to fundamental limitations in graph construction. In this research, a novel algorithm is developed for efficient identification of a graph guaranteed to contain the ESP. This new approach is referred to as the convexpath algorithm, and exploits spatial knowledge and GIS functionality to efficiently construct a graph. The convexpath algorithm utilizes the notion of a convex hull to simultaneously identify relevant obstacles and construct the graph. Additionally, a spatial filtering technique based on intermediate shortest paths enhances the intelligent identification of relevant obstacles. Empirical applications show that the convexpath algorithm is able to construct a graph and derive the ESP with significantly improved efficiency compared to visibility and local visibility graph approaches. Furthermore, to boost the performance of convexpath in big data environments, a parallelization approach is proposed and applied to exploit computationally intensive spatial operations of convexpath. Multicore CPU parallelization demonstrates a noticeable efficiency gain over the sequential convexpath. Finally, spatial representation and approximation issues associated with raster-based approximation of the ESP are assessed. This dissertation provides a comprehensive treatment of the ESP, and details an important approach for deriving an optimal ESP in real time.

Contributors: Hong, Insu (Author) / Murray, Alan T. (Thesis advisor) / Kuby, Micheal (Committee member) / Rey, Sergio (Committee member) / Arizona State University (Publisher)

Created: 2015

An exploratory toolkit for examining residential movement patterns at a micro scale

Description

Change of residence is a commonly occurring event in urban areas. It reflects how people interact with the social or physical environment. Thus, by exploring the movement patterns of residential changes, geographers and other scholars hope to learn more about the reasons and impacts associated with residential mobility, and to better understand how humans and the environment mutually interact. This is especially meaningful if exploration is based on micro scale movements, since residential changes within a city or a county reflect how the urban structure and community composition interact. Local differentiation, as an inevitable feature among movements at different places, can best be examined based on data at the micro scale. Such work is meaningful, but appropriate approaches for assessment and evaluation have been lacking. The majority of traditional methods concentrate on aggregate movement data at a national scale. So, in order to facilitate research examining movement patterns from a mass of individual residential changes at a micro scale, a toolkit, implemented by computational programming, is introduced in this dissertation to integrate both exploratory and confirmatory methods. This toolkit also employs a creative method to explore the spatial autocorrelation of residential movements, reflecting the local effects involved in this social event. The effectiveness and efficiency of this toolkit are examined through a concrete application involving 2,363 residential movements in Franklin County, Ohio.

Contributors: Liu, Yin (Author) / Murray, Alan (Thesis advisor) / Rey, Sergio (Committee member) / Wentz, Elizabeth (Committee member) / Arizona State University (Publisher)

Created: 2012

Policy and Place: A Spatial Data Science Framework for Research and Decision-Making

Description

A major challenge in health-related policy and program evaluation research is attributing underlying causal relationships where complicated processes may exist in natural or quasi-experimental settings. Spatial interaction and heterogeneity between units at individual or group levels can violate both components of the Stable-Unit-Treatment-Value-Assumption (SUTVA) that are core to the counterfactual framework, making treatment effects difficult to assess. New approaches are needed in health studies to develop spatially dynamic causal modeling methods to both derive insights from data that are sensitive to spatial differences and dependencies, and also be able to rely on a more robust, dynamic technical infrastructure needed for decision-making. To address this gap with a focus on causal applications theoretically, methodologically and technologically, I (1) develop a theoretical spatial framework (within single-level panel econometric methodology) that extends existing theories and methods of causal inference, which tend to ignore spatial dynamics; (2) demonstrate how this spatial framework can be applied in empirical research; and (3) implement a new spatial infrastructure framework that integrates and manages the required data for health systems evaluation.

The new spatially explicit counterfactual framework considers how spatial effects impact treatment choice, treatment variation, and treatment effects. To illustrate this new methodological framework, I first replicate a classic quasi-experimental study that evaluates the effect of drinking age policy on mortality in the United States from 1970 to 1984, and further extend it with a spatial perspective. In another example, I evaluate food access dynamics in Chicago from 2007 to 2014 by implementing advanced spatial analytics that better account for the complex patterns of food access, and a quasi-experimental research design to distill the impact of the Great Recession on the foodscape. Inference interpretation is sensitive to both research design framing and the underlying processes that drive geographically distributed relationships. Finally, I advance a new Spatial Data Science Infrastructure to integrate and manage data in dynamic, open environments for public health systems research and decision-making. I demonstrate an infrastructure prototype in a final case study, developed in collaboration with health department officials and community organizations.

Contributors: Kolak, Marynia Aniela (Author) / Anselin, Luc (Thesis advisor) / Rey, Sergio (Committee member) / Koschinsky, Julia (Committee member) / Maciejewski, Ross (Committee member) / Arizona State University (Publisher)

Created: 2017

A Data-driven, High-performance and Intelligent CyberInfrastructure to Advance Spatial Sciences

Description

In the field of Geographic Information Science (GIScience), we have witnessed an unprecedented data deluge brought about by the rapid advancement of high-resolution data observing technologies. For example, with the advancement of Earth Observation (EO) technologies, a massive amount of EO data, including remote sensing data and other sensor observation data about earthquakes, climate, oceans, hydrology, volcanoes, glaciers, etc., is being collected on a daily basis by a wide range of organizations. In addition to the observation data, human-generated data including microblogs, photos, consumption records, evaluations, unstructured webpages and other Volunteered Geographical Information (VGI) are incessantly generated and shared on the Internet.

Meanwhile, the emerging cyberinfrastructure rapidly increases our capacity for handling such massive data with regard to data collection and management, data integration and interoperability, data transmission and visualization, high-performance computing, etc. Cyberinfrastructure (CI) consists of computing systems, data storage systems, advanced instruments and data repositories, visualization environments, and people, all linked together by software and high-performance networks to improve research productivity and enable breakthroughs that are not otherwise possible.

The Geospatial CI (GCI, or CyberGIS), as the synthesis of CI and GIScience, has inherent advantages in enabling computationally intensive spatial analysis and modeling (SAM) and collaborative geospatial problem solving and decision making.

This dissertation is dedicated to addressing several critical issues and improving the performance of existing methodologies and systems in the field of CyberGIS. My dissertation will include three parts. The first part is focused on developing methodologies to help public researchers efficiently and effectively find appropriate open geospatial datasets from millions of records provided by thousands of organizations scattered around the world. Machine learning and semantic search methods will be utilized in this research. The second part develops an interoperable and replicable geoprocessing service by synthesizing the high-performance computing (HPC) environment, the core spatial statistics/analysis algorithms from the widely adopted open-source Python package, the Python Spatial Analysis Library (PySAL), and rich datasets acquired from the first part. The third part is dedicated to studying optimization strategies for feature data transmission and visualization. This study is intended to solve the performance issue in large feature data transmission through the Internet and visualization on the client (browser) side.

Taken together, the three parts constitute an endeavor towards the methodological improvement and implementation practice of the data-driven, high-performance and intelligent CI to advance spatial sciences.

Contributors: Shao, Hu (Author) / Li, Wenwen (Thesis advisor) / Rey, Sergio (Thesis advisor) / Maciejewski, Ross (Committee member) / Arizona State University (Publisher)

Created: 2018

Underutilized Spaces and Marginal Lands for Sustainable Land Use: A Multi-Scale Analysis

Description

Drawn from a trio of manuscripts, this dissertation evaluates the sustainability contributions and implications of deploying underutilized spaces for alternative uses at multiple scales: urban, regional and continental. The first paper considers the use of underutilized spaces at the urban scale for urban agriculture (UA) to meet local sustainability goals in Phoenix, Arizona. Through a data-driven analysis, it demonstrates UA can meet 90% of annual demand for fresh produce, supply local produce in all food deserts, reduce areas underserved by public parks by 60%, and displace >50,000 tons of carbon-dioxide emissions from buildings.

The second paper considers marginal agricultural land use for bioenergy crop cultivation to meet future liquid fuels demand from cellulosic biofuels sustainably and profitably. At a wholesale fuel price of $4 per gallon-of-gasoline-equivalent, 30 to 90.7 billion gallons of cellulosic biofuels can be supplied by converting 22 to 79.3 million hectares of marginal lands in the Eastern United States (U.S.). Displacing marginal croplands (9.4-13.7 million hectares) reduces stress on water resources by preserving soil moisture. This displacement is comparable to existing land use for first-generation biofuels, limiting food supply impacts. Coupled modeling reveals positive hydroclimate feedback on bioenergy crop yields that moderates the land footprint.

The third paper examines the sustainability implications of expanding use of marginal lands for corn cultivation in the Western Corn Belt, a commercially important and environmentally sensitive U.S. region. Corn cultivation on lower quality lands, which tend to overlap with marginal agricultural lands, is shown to be nearly three times more sensitive to changes in crop prices. Therefore, corn cultivation disproportionately expanded into these lands following price spikes.

Underutilized spaces can contribute towards sustainability at small and large scales in a complementary fashion. While supplying fresh produce locally and delivering other benefits in terms of energy use and public health, UA can also reduce pressures on croplands and complement non-urban food production. This complementarity can help diversify agricultural land use for meeting other goals, like supplying biofuels. However, understanding the role of market forces and economic linkages is critical to anticipate any unintended consequences due to such re-organization of land use.

Contributors: ULUDERE ARAGON, Nazli Zeynep (Author) / Georgescu, Matei (Thesis advisor) / Hanemann, William M (Committee member) / Parker, Nathan C. (Committee member) / Rey, Sergio (Committee member) / Arizona State University (Publisher)

Created: 2020

Spatial Regression and Gaussian Process BART

Description

Spatial regression is one of the central topics in spatial statistics. Based on the goals, interpretation or prediction, spatial regression models can be classified into two categories, linear mixed regression models and nonlinear regression models. This dissertation explored these models and their real world applications. New methods and models were proposed to overcome the challenges in practice. There are three major parts in the dissertation.

In the first part, nonlinear regression models were embedded into a multistage workflow to predict the spatial abundance of reef fish species in the Gulf of Mexico. There were two challenges: zero-inflated data and out-of-sample prediction. The methods and models in the workflow could effectively handle the zero-inflated sampling data without strong assumptions. Three strategies were proposed to solve the out-of-sample prediction problem. The results and discussions showed that the nonlinear prediction had the advantages of high accuracy, low bias, and good performance across multiple resolutions.

In the second part, a two-stage spatial regression model was proposed for analyzing soil carbon stock (SOC) data. In the first stage, a spatial linear mixed model captured the linear and stationary effects. In the second stage, a generalized additive model was used to explain the nonlinear and nonstationary effects. The results illustrated that the two-stage model had good interpretability in understanding the effect of covariates, while keeping a high prediction accuracy that is competitive with popular machine learning models such as random forest, xgboost and support vector machine.

A new nonlinear regression model, Gaussian process BART (Bayesian additive regression tree), was proposed in the third part. Combining the advantages of both BART and the Gaussian process, the model could capture the nonlinear effects of both observed and latent covariates. To develop the model, first, the traditional BART was generalized to accommodate correlated errors. Then, the failure of likelihood-based Markov chain Monte Carlo (MCMC) in parameter estimation was discussed. Based on the idea of analysis of variation, back comparing and tuning range were proposed to tackle this failure. Finally, the effectiveness of the new model was examined by experiments on both simulated and real data.

Contributors: Lu, Xuetao (Author) / McCulloch, Robert (Thesis advisor) / Hahn, Paul (Committee member) / Lan, Shiwei (Committee member) / Zhou, Shuang (Committee member) / Saul, Steven (Committee member) / Arizona State University (Publisher)

Created: 2020

Multiscale Geographically Weighted Regression: Computation, Inference, and Application

Description

Geographically Weighted Regression (GWR) has been broadly used in various fields to model spatially non-stationary relationships. Classic GWR is considered a single-scale model that is based on one bandwidth parameter which controls the amount of distance-decay in weighting neighboring data around each location. The single bandwidth in GWR assumes that processes (relationships between the response variable and the predictor variables) all operate at the same scale. However, this poses a limitation in modeling potentially multi-scale processes, which are more often seen in the real world. For example, the measured ambient temperature of a location is affected by the built environment, regional weather and global warming, all of which operate at different scales. A recent advancement to GWR termed Multiscale GWR (MGWR) removes the single bandwidth assumption and allows the bandwidths for each covariate to vary. This results in each parameter surface being allowed to have a different degree of spatial variation, reflecting variation across covariate-specific processes. In this way, MGWR has the capability to differentiate local, regional and global processes by using varying bandwidths for covariates. Additionally, bandwidths in MGWR become explicit indicators of the scale at which various processes operate. The proposed dissertation covers three perspectives centering on MGWR: Computation; Inference; and Application. The first component focuses on addressing computational issues in MGWR to allow MGWR models to be calibrated more efficiently and to be applied to large datasets. The second component aims to statistically differentiate the spatial scales at which different processes operate by quantifying the uncertainty associated with each bandwidth obtained from MGWR. In the third component, an empirical study will be conducted to model the changing relationships between county-level socio-economic factors and voter preferences in the 2008-2016 United States presidential elections using MGWR.
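The role of the bandwidth in distance-decay weighting can be illustrated with a small sketch. It assumes a fixed Gaussian kernel, one commonly used choice in GWR; the abstract does not state which kernel the dissertation uses, and the function name `gaussian_weights` is hypothetical.

```python
import math


def gaussian_weights(focal, locations, bandwidth):
    """Weight each observation by its distance from the focal location.

    Assumed kernel: w = exp(-0.5 * (d / bandwidth) ** 2).
    A small bandwidth gives a sharply local fit; a large one
    approaches equal weights, i.e. a global regression.
    """
    return [
        math.exp(-0.5 * (math.dist(focal, loc) / bandwidth) ** 2)
        for loc in locations
    ]


# Two observations: one at the focal point, one 5 units away.
sites = [(0.0, 0.0), (3.0, 4.0)]
local_w = gaussian_weights((0.0, 0.0), sites, bandwidth=1.0)
regional_w = gaussian_weights((0.0, 0.0), sites, bandwidth=10.0)
```

In classic GWR a single bandwidth of this kind is shared by every covariate, whereas in MGWR each covariate would get its own bandwidth and therefore its own set of weights.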

Contributors: Li, Ziqi (Author) / Fotheringham, A. Stewart (Thesis advisor) / Goodchild, Michael F. (Committee member) / Li, Wenwen (Committee member) / Arizona State University (Publisher)

Created: 2020