Essays on space-time interaction tests

Description

Researchers across a variety of fields are often interested in determining if data are of a random nature or if they exhibit patterning which may be the result of some alternative and potentially more interesting process. This dissertation explores a family of statistical methods, i.e. space-time interaction tests, designed to detect structure within three-dimensional event data. These tests, widely employed in the fields of spatial epidemiology, criminology, ecology and beyond, are used to identify synergistic interaction across the spatial and temporal dimensions of a series of events. Exploration is needed to better understand these methods and determine how their results may be affected by data quality problems commonly encountered in their implementation; specifically, how inaccuracy and/or uncertainty in the input data analyzed by the methods may impact subsequent results. Additionally, known shortcomings of the methods must be ameliorated. The contributions of this dissertation are twofold: it develops a more complete understanding of how input data quality problems impact the results of a number of global and local tests of space-time interaction, and it formulates an improved version of one global test which accounts for the previously identified problem of population shift bias. A series of simulation experiments reveals the global tests of space-time interaction explored here to be dramatically affected by the aforementioned deficiencies in the quality of the input data. It is shown that in some cases a conservative degree of these common data problems can completely obscure evidence of space-time interaction, and in others create it where it does not exist. Conversely, a local metric of space-time interaction examined here demonstrates a surprising robustness in the face of these same deficiencies. This local metric is revealed to be only minimally affected by the inaccuracies and incompleteness introduced in these experiments. Finally, enhancements to one of the global tests are presented which solve the problem of population shift bias associated with the test and better contextualize and visualize its results, thereby enhancing its utility for practitioners.

Contributors: Malizia, Nicholas (Author) / Anselin, Luc (Thesis advisor) / Murray, Alan (Committee member) / Rey, Sergio (Committee member) / Arizona State University (Publisher)

Created: 2013

Addressing geographic uncertainty in spatial optimization

Description

There exist many facets of error and uncertainty in digital spatial information. As error or uncertainty will not likely ever be completely eliminated, a better understanding of its impacts is necessary. Spatial analytical approaches, in particular, must somehow address data quality issues. This can range from evaluating impacts of potential data uncertainty in planning processes that make use of methods to devising methods that explicitly account for error/uncertainty. To date, little has been done to structure methods accounting for error. This research focuses on developing methods to address geographic data uncertainty in spatial optimization. An integrated approach that characterizes uncertainty impacts by constructing and solving a new multi-objective model that explicitly incorporates facets of data uncertainty is developed. Empirical findings illustrate that the proposed approaches can be applied to evaluate the impacts of data uncertainty with statistical confidence, which moves beyond popular practices of simulating errors in data. Spatial uncertainty impacts are evaluated in two contexts: harvest scheduling and sex offender residency. Owing to the integration of spatial uncertainty, the detailed multi-objective models are more complex and computationally challenging to solve. As a result, a new multi-objective evolutionary algorithm is developed to address the computational challenges posed. The proposed algorithm incorporates problem-specific spatial knowledge to significantly enhance the capability of the evolutionary algorithm for solving the model.

Contributors: Wei, Ran (Author) / Murray, Alan T. (Thesis advisor) / Anselin, Luc (Committee member) / Rey, Sergio J (Committee member) / Mack, Elizabeth A. (Committee member) / Arizona State University (Publisher)

Created: 2013

Intermetropolitan networks of co-invention in American biotechnology

Description

Regional differences of inventive activity and economic growth are important in economic geography. These differences are generally explained by the theory of localized knowledge spillovers, which argues that geographical proximity among economic actors fosters invention and innovation. However, knowledge production involves an increasing number of actors connecting to non-local partners. The space of knowledge flows is not tightly bounded in a given territory, but functions as a network-based system where knowledge flows circulate around alignments of actors in different and distant places. The purpose of this dissertation is to understand the dynamics of network aspects of knowledge flows in American biotechnology. The first research task assesses both spatial and network-based dependencies of biotechnology co-invention across 150 large U.S. metropolitan areas over four decades (1979, 1989, 1999, and 2009). An integrated methodology including both spatial and social network analyses is explicitly applied and compared. Results show that network-based proximity better defines the U.S. biotechnology co-invention urban system in recent years. Co-patenting relationships of major biotechnology centers have demonstrated national and regional association since the 1990s. Associations retain features of spatial proximity, especially in some Midwestern and Northeastern cities, but these are no longer the strongest features affecting co-inventive links. The second research task examines how biotechnology knowledge flows circulate over space by focusing on the structural properties of intermetropolitan co-invention networks. All analyses in this task are conducted using social network analysis. Evidence shows that the architecture of the U.S. co-invention networks reveals a trend toward more organized structures and less fragmentation over the four years of analysis. Metropolitan areas are increasingly interconnected into a large networked environment. Knowledge flows are less likely to be controlled by a small number of intermediaries. San Francisco, New York, Boston, and San Diego monopolize the central positions of the intermetropolitan co-invention network as major American biotechnology concentrations. The overall network-based system comes close to a relational core/periphery structure where core metropolitan areas are strongly connected to one another and to some peripheral areas. Peripheral metropolitan areas are loosely connected to, or even disconnected from, each other. This dissertation provides empirical evidence to support the argument that technological collaboration reveals a network-based system associated with different or even distant geographical places, which is somewhat different from the conventional theory of localized knowledge spillovers that once dominated understanding of the role of geography in technological advance.

Contributors: Lee, Der-Shiuan (Author) / Ó Huallacháin, Breandán (Thesis advisor) / Anselin, Luc (Committee member) / Kuby, Michael (Committee member) / Lobo, Jose (Committee member) / Arizona State University (Publisher)

Created: 2011

The centralization index as a measure of local spatial segregation

Description

Decades ago in the U.S., clear lines delineated which neighborhoods were acceptable for certain people and which were not. Techniques such as steering and biased mortgage practices continue to perpetuate a segregated outcome for many residents. In contrast, ethnic enclaves and age-restricted communities are viewed as voluntary segregation based on cultural and social amenities. The diverse causes of segregation are not just region-wide characteristics, but can vary within a region. Local segregation analysis aims to uncover this local variation, and hence open the door to policy solutions not visible at the global scale. The centralization index, originally introduced as a global measure of segregation focused on spatial concentration of two population groups relative to a region's urban center, has lost relevancy in recent decades as regions have become polycentric, and the index's magnitude is sensitive to the particular point chosen as the center. These attributes, which make it a poor global measure, are leveraged here to repurpose the index as a local measure. The index's ability to differentiate minority from majority segregation, and its focus on a particular location within a region, make it an ideal local segregation index. Based on the local centralization index for two groups, a local multigroup variation is defined, and a local space-time redistribution index is presented capturing change in concentration of a single population group over two time periods. Permutation-based inference approaches are used to test the statistical significance of measured index values. Applications to the Phoenix, Arizona metropolitan area show persistent cores of black and white segregation over the years 1990, 2000 and 2010, and a trend of white segregated neighborhoods increasing at a faster rate than black. An analysis of the Phoenix area's recently opened light rail system shows that its 28 stations are located in areas of significant white, black and Hispanic segregation, and there is a clear concentration of renters over owners around most stations. There is little indication of statistically significant change in segregation or population concentration around the stations, indicating a lack of near-term impact of light rail on the region's overall demographics.

Contributors: Folch, David C. (Author) / Rey, Sergio J (Thesis advisor) / Anselin, Luc (Committee member) / Murray, Alan T. (Committee member) / Arizona State University (Publisher)

Created: 2012

Spatializing partisan gerrymandering forensics: local measures and spatial specifications

Description

Gerrymandering is a central problem for many representative democracies. Formally, gerrymandering is the manipulation of spatial boundaries to provide political advantage to a particular group (Warf, 2006). The term often refers to political district design, where the boundaries of political districts are “unnaturally” manipulated by redistricting officials to generate durable advantages for one group or party. Since free and fair elections are possibly the critical part of representative democracy, it is important for this cresting tide to have scientifically validated tools. This dissertation supports a current wave of reform by developing a general inferential technique to “localize” inferential bias measures, generating a new type of district-level score. The new method relies on the statistical intuition behind jackknife methods to construct relative local indicators. I find that existing statewide indicators of partisan bias can be localized using this technique, providing an estimate of how strongly a district impacts statewide partisan bias over an entire decade. When compared to measures of shape compactness (a common gerrymandering detection statistic), I find that weirdly-shaped districts have no consistent relationship with impact in many states during the 2000 and 2010 redistricting plans. To ensure that this work is valid, I examine existing seats-votes modeling strategies and develop a novel method for constructing seats-votes curves. I find that, while the empirical structure of electoral swing shows significant spatial dependence (even in the face of spatial heterogeneity), existing seats-votes specifications are more robust than anticipated to spatial dependence. Centrally, this dissertation contributes to the much larger social aim to resist electoral manipulation: that individuals and organizations suffer no undue burden on political access from partisan gerrymandering.
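The jackknife intuition mentioned in the abstract can be illustrated with a leave-one-out calculation. The sketch below is not the dissertation's method: the statistic (`mean_median_gap`), the function names, and the vote shares are all hypothetical stand-ins chosen only to show how dropping one district at a time yields a district-level impact score for a statewide measure.

```python
from statistics import mean, median

def mean_median_gap(shares):
    """Hypothetical statewide asymmetry statistic: mean minus median vote share."""
    return mean(shares) - median(shares)

def leave_one_out_impacts(shares, statistic=mean_median_gap):
    """Jackknife-style impact: full statistic minus the statistic without district i."""
    full = statistic(shares)
    return [
        full - statistic(shares[:i] + shares[i + 1:])
        for i in range(len(shares))
    ]

# Made-up district vote shares for one party (illustrative only).
shares = [0.35, 0.40, 0.45, 0.48, 0.80]
impacts = leave_one_out_impacts(shares)

# The district whose removal changes the statewide statistic the most.
most_influential = max(range(len(shares)), key=lambda i: abs(impacts[i]))
print(most_influential, round(impacts[most_influential], 3))
```

With these made-up shares, the heavily lopsided last district has the largest impact, which is the kind of district-level signal a localized bias score is meant to surface.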

Contributors: Wolf, Levi (Author) / Rey, Sergio J (Thesis advisor) / Anselin, Luc (Committee member) / Fotheringham, A. Stewart (Committee member) / Tam Cho, Wendy K (Committee member) / Arizona State University (Publisher)

Created: 2017

Policy and Place: A Spatial Data Science Framework for Research and Decision-Making

Description

A major challenge in health-related policy and program evaluation research is attributing underlying causal relationships where complicated processes may exist in natural or quasi-experimental settings. Spatial interaction and heterogeneity between units at individual or group levels can violate both components of the Stable-Unit-Treatment-Value-Assumption (SUTVA) that are core to the counterfactual framework, making treatment effects difficult to assess. New approaches are needed in health studies to develop spatially dynamic causal modeling methods to both derive insights from data that are sensitive to spatial differences and dependencies, and also be able to rely on a more robust, dynamic technical infrastructure needed for decision-making. To address this gap with a focus on causal applications theoretically, methodologically and technologically, I (1) develop a theoretical spatial framework (within single-level panel econometric methodology) that extends existing theories and methods of causal inference, which tend to ignore spatial dynamics; (2) demonstrate how this spatial framework can be applied in empirical research; and (3) implement a new spatial infrastructure framework that integrates and manages the required data for health systems evaluation.

The new spatially explicit counterfactual framework considers how spatial effects impact treatment choice, treatment variation, and treatment effects. To illustrate this new methodological framework, I first replicate a classic quasi-experimental study that evaluates the effect of drinking age policy on mortality in the United States from 1970 to 1984, and further extend it with a spatial perspective. In another example, I evaluate food access dynamics in Chicago from 2007 to 2014 by implementing advanced spatial analytics that better account for the complex patterns of food access, and quasi-experimental research design to distill the impact of the Great Recession on the foodscape. Inference interpretation is sensitive to both research design framing and underlying processes that drive geographically distributed relationships. Finally, I advance a new Spatial Data Science Infrastructure to integrate and manage data in dynamic, open environments for public health systems research and decision-making. I demonstrate an infrastructure prototype in a final case study, developed in collaboration with health department officials and community organizations.

Contributors: Kolak, Marynia Aniela (Author) / Anselin, Luc (Thesis advisor) / Rey, Sergio (Committee member) / Koschinsky, Julia (Committee member) / Maciejewski, Ross (Committee member) / Arizona State University (Publisher)

Created: 2017

Spatial Regression and Gaussian Process BART

Description

Spatial regression is one of the central topics in spatial statistics. Based on the goals, interpretation or prediction, spatial regression models can be classified into two categories, linear mixed regression models and nonlinear regression models. This dissertation explored these models and their real world applications. New methods and models were proposed to overcome the challenges in practice. There are three major parts in the dissertation.

In the first part, nonlinear regression models were embedded into a multistage workflow to predict the spatial abundance of reef fish species in the Gulf of Mexico. There were two challenges: zero-inflated data and out-of-sample prediction. The methods and models in the workflow could effectively handle the zero-inflated sampling data without strong assumptions. Three strategies were proposed to solve the out-of-sample prediction problem. The results and discussions showed that the nonlinear prediction had the advantages of high accuracy, low bias, and good performance across multiple resolutions.

In the second part, a two-stage spatial regression model was proposed for analyzing soil carbon stock (SOC) data. In the first stage, a spatial linear mixed model captured the linear and stationary effects. In the second stage, a generalized additive model was used to explain the nonlinear and nonstationary effects. The results illustrated that the two-stage model had good interpretability in understanding the effect of covariates, while keeping high prediction accuracy that is competitive with popular machine learning models such as random forest, xgboost and support vector machine.

A new nonlinear regression model, Gaussian process BART (Bayesian additive regression tree), was proposed in the third part. Combining the advantages of both BART and Gaussian processes, the model could capture the nonlinear effects of both observed and latent covariates. To develop the model, first, the traditional BART was generalized to accommodate correlated errors. Then, the failure of likelihood-based Markov chain Monte Carlo (MCMC) in parameter estimation was discussed. Based on the idea of analysis of variation, two techniques, back comparing and tuning range, were proposed to tackle this failure. Finally, the effectiveness of the new model was examined by experiments on both simulated and real data.

Contributors: Lu, Xuetao (Author) / McCulloch, Robert (Thesis advisor) / Hahn, Paul (Committee member) / Lan, Shiwei (Committee member) / Zhou, Shuang (Committee member) / Saul, Steven (Committee member) / Arizona State University (Publisher)

Created: 2020

Multiscale Geographically Weighted Regression: Computation, Inference, and Application

Description

Geographically Weighted Regression (GWR) has been broadly used in various fields to model spatially non-stationary relationships. Classic GWR is considered a single-scale model that is based on one bandwidth parameter which controls the amount of distance-decay in weighting neighboring data around each location. The single bandwidth in GWR assumes that processes (relationships between the response variable and the predictor variables) all operate at the same scale. However, this poses a limitation in modeling potentially multi-scale processes, which are more often seen in the real world. For example, the measured ambient temperature of a location is affected by the built environment, regional weather and global warming, all of which operate at different scales. A recent advancement to GWR termed Multiscale GWR (MGWR) removes the single bandwidth assumption and allows the bandwidths for each covariate to vary. This results in each parameter surface being allowed to have a different degree of spatial variation, reflecting variation across covariate-specific processes. In this way, MGWR has the capability to differentiate local, regional and global processes by using varying bandwidths for covariates. Additionally, bandwidths in MGWR become explicit indicators of the scale at which various processes operate. The proposed dissertation covers three perspectives centering on MGWR: Computation; Inference; and Application. The first component focuses on addressing computational issues in MGWR to allow MGWR models to be calibrated more efficiently and to be applied on large datasets. The second component aims to statistically differentiate the spatial scales at which different processes operate by quantifying the uncertainty associated with each bandwidth obtained from MGWR. In the third component, an empirical study will be conducted to model the changing relationships between county-level socio-economic factors and voter preferences in the 2008-2016 United States presidential elections using MGWR.
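The role of the bandwidth as a distance-decay control can be shown with a small sketch. This is not the dissertation's implementation and not a full GWR calibration: it assumes a Gaussian kernel (one common choice) and reduces the local regression to an intercept-only weighted mean; the function names and numbers are hypothetical.

```python
import math

def gaussian_weights(distances, bandwidth):
    """Gaussian distance-decay weights: 1 at distance 0, shrinking with distance."""
    return [math.exp(-0.5 * (d / bandwidth) ** 2) for d in distances]

def local_weighted_mean(values, distances, bandwidth):
    """Intercept-only local estimate: kernel-weighted mean of nearby observations."""
    weights = gaussian_weights(distances, bandwidth)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Made-up observations and their distances from one focal location.
values = [10.0, 12.0, 30.0]
distances = [0.0, 1.0, 10.0]

# Small bandwidth: only near neighbors matter (a "local" process).
narrow = local_weighted_mean(values, distances, bandwidth=1.0)
# Large bandwidth: all observations count almost equally (a "global" process).
wide = local_weighted_mean(values, distances, bandwidth=100.0)
print(round(narrow, 2), round(wide, 2))
```

With a small bandwidth the distant observation is effectively ignored, while with a large bandwidth the estimate approaches the plain average. Letting each covariate have its own bandwidth, as MGWR does, allows this behavior to differ from one process to another.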

Contributors: Li, Ziqi (Author) / Fotheringham, A. Stewart (Thesis advisor) / Goodchild, Michael F. (Committee member) / Li, Wenwen (Committee member) / Arizona State University (Publisher)

Created: 2020
