Advances in Local Multiscale Modeling in a Regression Framework

Description

Embedded within the regression framework, local models can estimate conditioned relationships between observed spatial phenomena and hypothesized explanatory variables and help infer the intangible spatial processes that contribute to the observed spatial patterns. Rather than investigating averaged characteristics corresponding to processes over space as global models do, these models estimate a surface of spatially varying parameters with a value for each location. Additionally, some models, such as variants within the Geographically Weighted Regression (GWR) framework, also estimate a parameter that represents the spatial scale across which the processes vary, reflecting the inherent heterogeneity of the estimated surfaces. Since different processes tend to operate at unique spatial scales, some extensions to local models, such as Multiscale GWR (MGWR), estimate unique scales of association for each predictor in a model and generate significantly more information on the nature of geographic processes than their predecessors. However, developments within the realm of local models are fairly nascent, and hence an understanding of their correct application, as well as recognition of their true potential in exploring fundamental spatial science issues, is under-developed. The techniques within these frameworks are also currently limited, thus restricting the kinds of data that can be analyzed using these models. Therefore, the goal of this dissertation is to advance techniques within local multiscale modeling, specifically by coining new diagnostics, exploring their novel application in understanding long-standing issues concerning spatial scale, and expanding the tool base to allow their use in wider empirical applications. This goal is realized through three distinct research objectives over four chapters, followed by a discussion on the future of the developments within local multiscale modeling.
A correct understanding of the capability and promise of local multiscale models and expanding the fields where they can be employed will not only enhance geographical research by strengthening the intuition of the nature of geographic processes, but will also exemplify the importance and need for using such tools bringing quantitative spatial science to the fore.

Contributors: Sachdeva, Mehak (Author) / Fotheringham, A. Stewart (Thesis advisor) / Goodchild, Michael Frank (Committee member) / Kedron, Peter (Committee member) / Wolf, Levi John (Committee member) / Arizona State University (Publisher)

Created: 2022

Issues in the Distribution Dynamics Approach to the Analysis of Regional Economic Growth and Convergence: Spatial Effects and Small Samples

Description

In the study of regional economic growth and convergence, the distribution dynamics approach, which interrogates the evolution of the cross-sectional distribution as a whole and is concerned with both the external and internal dynamics of the distribution, has received wide usage. However, many methodological issues remain to be resolved before valid inferences and conclusions can be drawn from empirical research. Among them, spatial effects, including spatial heterogeneity and spatial dependence, invalidate the assumption of independent and identical distributions underlying the conventional maximum likelihood techniques, while the small samples available in regional settings call into question the use of asymptotic properties. This dissertation comprises three papers targeted at addressing these two issues. The first paper investigates whether the conventional regional income mobility estimators are still suitable in the presence of spatial dependence and/or a small sample. It is approached through a series of Monte Carlo experiments, which require the proposal of a novel data generating process (DGP) capable of generating spatially dependent time series. The second paper moves to the statistical tests for detecting specific forms of spatial (spatiotemporal) effects in the discrete Markov chain model, investigating their robustness to the alternative spatial effect, sensitivity to discretization granularity, and properties in small sample settings. The third paper proposes discrete kernel estimators with cross-validated bandwidths as an alternative to maximum likelihood estimators in small sample settings. It is demonstrated that discrete kernel estimators offer improved performance when the sample size is small.
Taken together, the three papers constitute an endeavor to relax the restrictive assumptions of spatial independence and spatial homogeneity, and to demonstrate the difference between the small sample and asymptotic properties of conventionally adopted maximum likelihood estimators, working towards a more valid inferential framework for the distribution dynamics approach to the study of regional economic growth and convergence.
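
As a rough illustration of the discrete Markov chain model mentioned in this abstract, the sketch below computes the conventional maximum likelihood estimate of a transition matrix by counting class-to-class moves and normalizing each row. It is a generic textbook estimator, not the dissertation's code; the function name `transition_matrix` and the input layout (regions in rows, time periods in columns, incomes already discretized into integer classes) are assumptions made for the example.

```python
import numpy as np

def transition_matrix(classes, k):
    """ML estimate of a k-state transition matrix.

    classes: integer array of shape (n_regions, n_periods) holding the
             income class (0..k-1) of each region in each period.
    """
    counts = np.zeros((k, k))
    for t in range(classes.shape[1] - 1):
        # Count every observed move from class at t to class at t + 1.
        np.add.at(counts, (classes[:, t], classes[:, t + 1]), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no observed departures are left as zeros rather than guessed.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
```

With few regions or short time series, many cells of `counts` are small or empty, which is the small sample concern the abstract raises about relying on asymptotic properties.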

Contributors: Kang, Wei (Author) / Rey, Sergio (Thesis advisor) / Fotheringham, A. Stewart (Committee member) / Ye, Xinyue (Committee member) / Arizona State University (Publisher)

Created: 2018

GeoAI-enhanced Techniques to Support Geographical Knowledge Discovery from Big Geospatial Data

Description

Big data that contain geo-referenced attributes have significantly reformed the way that we process and analyze geospatial data. Compared with the expected benefits of a data-rich environment, more data have not always contributed to more accurate analysis. “Big but valueless” has become a critical concern to the community of GIScience and data-driven geography. As a highly utilized GeoAI technique, deep learning models designed for processing geospatial data integrate powerful computing hardware and deep neural networks into various dimensions of geography to effectively discover the representation of data. However, limitations of these deep learning models have also been reported; for example, people may have to spend much time preparing training data to implement a deep learning model. The objective of this dissertation research is to promote state-of-the-art deep learning models in discovering the representation, value and hidden knowledge of GIS and remote sensing data, through three research approaches. The first methodological framework uses convolutional neural network (CNN)-powered shape classification to unify multifarious shadow shapes into a limited number of representative shadow patterns for efficient shadow-based building height estimation. The second research focus integrates semantic analysis into a framework of various state-of-the-art CNNs to support human-level understanding of map content. The final research approach of this dissertation focuses on normalizing geospatial domain knowledge to promote the transferability of a CNN model to land-use/land-cover classification. This research reports a method designed to discover detailed land-use/land-cover types that might be challenging for a state-of-the-art CNN model that previously performed well on land-cover classification only.

Contributors: Zhou, Xiran (Author) / Li, Wenwen (Thesis advisor) / Myint, Soe Win (Committee member) / Arundel, Samantha Thompson (Committee member) / Arizona State University (Publisher)

Created: 2019

A taxonomy of parallel vector spatial analysis algorithms

Description

Nearly 25 years ago, parallel computing techniques were first applied to vector spatial analysis methods. This initial research was driven by the desire to reduce computing times in order to support scaling to larger problem sets. Since this initial work, rapid technological advancement has driven the availability of High Performance Computing (HPC) resources, in the form of multi-core desktop computers, distributed geographic information processing systems (e.g., computational grids), and single-site HPC clusters. In step with increases in computational resources, significant advancements in the capabilities to capture and store large quantities of spatially enabled data have been realized. A key component to utilizing vast data quantities in HPC environments, scalable algorithms, has failed to keep pace. The National Science Foundation has identified the lack of scalable algorithms in codified frameworks as an essential research product. Fulfillment of this goal is challenging given the lack of a codified theoretical framework mapping atomic numeric operations from the spatial analysis stack to parallel programming paradigms, the diversity in vernacular utilized by research groups, the propensity for implementations to tightly couple to underlying hardware, and the general difficulty in realizing scalable parallel algorithms. This dissertation develops a taxonomy of parallel vector spatial analysis algorithms with classification being defined by root mathematical operation and communication pattern, a computational dwarf. Six computational dwarfs are identified, three being drawn directly from an existing parallel computing taxonomy and three being created to capture characteristics unique to spatial analysis algorithms. The taxonomy provides a high-level classification decoupled from low-level implementation details such as hardware, communication protocols, implementation language, decomposition method, or file input and output.
By taking a high-level approach, implementation specifics are broadly proposed, breadth of coverage is achieved, and extensibility is ensured. The taxonomy both informs and is informed by five case studies implemented across multiple, divergent hardware environments. A major contribution of this dissertation is a theoretical framework to support the future development of concrete parallel vector spatial analysis frameworks through the identification of computational dwarfs and, by extension, successful implementation strategies.

Contributors: Laura, Jason (Author) / Rey, Sergio J. (Thesis advisor) / Anselin, Luc (Committee member) / Wang, Shaowen (Committee member) / Li, Wenwen (Committee member) / Arizona State University (Publisher)

Created: 2015

Regional economic inequality analysis : a comparative study of the United States and China

Description

Economic inequality is always presented as how economic metrics vary amongst individuals in a group, amongst groups in a population, or amongst some regions. Economic inequality can substantially impact the social environment, socioeconomics, and human living standards. Since economic inequality always plays an important role in our social environment, its study has attracted much attention from scholars in various research fields, such as development economics, sociology and political science. On the other hand, economic inequality can result from many factors, phenomena, and complex procedures, including policy, ethnicity, education, and globalization. However, the spatial dimension in economic inequality research did not draw much attention from scholars until the early 2000s. Spatial dependency plays a key role in economic inequality analysis. Spatial econometric methods do not merely reflect the characteristics of the data. More importantly, they also respect and quantify the spatial effects in economic inequality. As mentioned above, although regional economic inequality has started to attract scholars' attention in both the economics and regional science domains, corresponding methodologies to examine such regional inequality remain in their preliminary phase and need substantial further exploration. My thesis aims to contribute to the body of knowledge on method development in support of economic inequality studies by exploring the feasibility of a set of new analytical methods for regional inequality analysis. These methods include Theil's T statistic, geographical rank Markov and new methods applying graph theory.
The thesis will also leverage these methods to compare inequality in China and the US, two large economic entities in the world: the US has a long history of economic development and a corresponding evolution of inequality, while China has seen rapid economic development and consequently high variation in economic inequality.

Contributors: Wang, Sizhe (Author) / Rey, Sergio J (Thesis advisor) / Li, Wenwen (Committee member) / Salon, Deborah (Committee member) / Arizona State University (Publisher)

Created: 2016

Improving species distribution models with bias correction and geographically weighted regression: tests of virtual species and past and present distributions in North American deserts

Description

This work investigates the effects of non-random sampling on our understanding of species distributions and their niches. In its most general form, bias is systematic error that can obscure interpretation of analytical results by skewing samples away from the average condition of the system they represent. Here I use species distribution modelling (SDM), virtual species, and multiscale geographically weighted regression (MGWR) to explore how sampling bias can alter our perception of broad patterns of biodiversity by distorting spatial predictions of habitat, a key characteristic in biogeographic studies. I use three separate case studies to explore: 1) How methods to account for sampling bias in species distribution modeling may alter estimates of species distributions and species-environment relationships, 2) How accounting for sampling bias in fossil data may change our understanding of paleo-distributions and interpretation of niche stability through time (i.e. niche conservation), and 3) How a novel use of MGWR can account for environmental sampling bias to reveal landscape patterns of local niche differences among proximal, but non-overlapping sister taxa. Broadly, my work shows that sampling bias present in commonly used federated global biodiversity observations is more than enough to degrade model performance of spatial predictions and niche characteristics. Measures commonly used to account for this bias can negate much loss, but only in certain conditions, and did not improve the ability to correctly identify explanatory variables or recreate species-environment relationships. Paleo-distributions calibrated on biased fossil records were improved with the use of a novel method to directly estimate the biased sampling distribution, which can be generalized to finer time slices for further paleontological studies. 
Finally, I show how a novel coupling of SDM and MGWR can illuminate local differences in niche separation that more closely match landscape genotypic variability in the two North American desert tortoise species than does their current taxonomic delineation.

Contributors: Inman, Richard (Author) / Franklin, Janet (Thesis advisor) / Fotheringham, A. Stewart (Committee member) / Dorn, Ronald (Committee member) / Arizona State University (Publisher)

Created: 2018

Multiscale Geographically Weighted Regression: Computation, Inference, and Application

Description

Geographically Weighted Regression (GWR) has been broadly used in various fields to model spatially non-stationary relationships. Classic GWR is considered a single-scale model that is based on one bandwidth parameter, which controls the amount of distance-decay in weighting neighboring data around each location. The single bandwidth in GWR assumes that processes (relationships between the response variable and the predictor variables) all operate at the same scale. However, this poses a limitation in modeling potentially multi-scale processes, which are more often seen in the real world. For example, the measured ambient temperature of a location is affected by the built environment, regional weather and global warming, all of which operate at different scales. A recent advancement to GWR termed Multiscale GWR (MGWR) removes the single bandwidth assumption and allows the bandwidths for each covariate to vary. This results in each parameter surface being allowed to have a different degree of spatial variation, reflecting variation across covariate-specific processes. In this way, MGWR has the capability to differentiate local, regional and global processes by using varying bandwidths for covariates. Additionally, bandwidths in MGWR become explicit indicators of the scale at which various processes operate. The proposed dissertation covers three perspectives centering on MGWR: Computation; Inference; and Application. The first component focuses on addressing computational issues in MGWR to allow MGWR models to be calibrated more efficiently and to be applied on large datasets. The second component aims to statistically differentiate the spatial scales at which different processes operate by quantifying the uncertainty associated with each bandwidth obtained from MGWR. In the third component, an empirical study will be conducted to model the changing relationships between county-level socio-economic factors and voter preferences in the 2008-2016 United States presidential elections using MGWR.
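
To make the role of the single bandwidth concrete, here is a minimal sketch of classic single-bandwidth GWR as a weighted least squares fit repeated at every location. It is an illustration under stated assumptions rather than the dissertation's implementation: the Gaussian kernel, the fixed (distance-based) bandwidth, and the function names `gaussian_weights` and `gwr_fit` are all choices made for this example.

```python
import numpy as np

def gaussian_weights(coords, i, bandwidth):
    """Distance-decay weights of all observations relative to location i
    (Gaussian kernel assumed; other kernels are possible)."""
    d = np.linalg.norm(coords - coords[i], axis=1)
    return np.exp(-0.5 * (d / bandwidth) ** 2)

def gwr_fit(coords, X, y, bandwidth):
    """Return an (n, k + 1) array of local coefficients, one row per location.

    coords: (n, 2) locations, X: (n, k) predictors, y: (n,) response.
    """
    n = X.shape[0]
    Xd = np.column_stack([np.ones(n), X])  # prepend an intercept column
    betas = np.empty((n, Xd.shape[1]))
    for i in range(n):
        w = gaussian_weights(coords, i, bandwidth)
        XtW = Xd.T * w  # equivalent to Xd.T @ diag(w)
        # Weighted least squares at location i: (X'WX) beta = X'Wy
        betas[i] = np.linalg.solve(XtW @ Xd, XtW @ y)
    return betas
```

A small bandwidth lets nearby observations dominate each local fit, so the coefficient surfaces vary sharply; a very large bandwidth pushes every local fit toward the global regression. Because one bandwidth is shared by all columns of `X`, every surface gets the same degree of smoothing, which is the assumption MGWR relaxes by allowing a separate bandwidth per covariate.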

Contributors: Li, Ziqi (Author) / Fotheringham, A. Stewart (Thesis advisor) / Goodchild, Michael F. (Committee member) / Li, Wenwen (Committee member) / Arizona State University (Publisher)

Created: 2020

Developing Data-Driven Methods for Movement Pattern Analysis using Geographic Context

Description

Movement data are essential to understanding how geographic context influences movement patterns in urban areas. Owing to the growth in ubiquitous data collection platforms like smartphones, fitness trackers, and health monitoring apps, researchers are now able to collect movement data at increasingly fine spatial and temporal resolution. Despite the surge in volumes of fine-grained movement data, there is a gap in the availability of quantitative and analytical tools to extract actionable insights from such big datasets and tease out the role of context in movement pattern analysis. As cities aim to be safer and healthier, policymakers require methods to generate efficient strategies for urban planning utilizing high-frequency movement data to make targeted decisions for infrastructure investments without compromising the safety of their residents. The objective of this Ph.D. dissertation is to develop quantitative methods that combine big spatial-temporal data from crowdsourced platforms with geographic context to analyze movement patterns over space and time. Knowledge about the role of context can help in assessing why changes in movement patterns occur and how those changes are affected by the immediate natural and built environment. In this dissertation I contribute to the rapidly expanding body of quantitative movement pattern analysis research by 1) developing a bias-correction framework for improving the representativeness of crowdsourced movement data by modeling bias with training data and geographical variables, 2) understanding spatial-temporal changes in movement patterns at different periods and how context influences those changes by generating hourly and monthly change maps in bicycle ridership patterns, and 3) quantifying the variation in accuracy and generalizability of transportation mode detection models using GPS (Global Positioning System) data upon adding geographic context.
Using statistical models, supervised classification algorithms, and functional data analysis approaches I develop modeling frameworks that address each of the research objectives. The results are presented as street-level maps and predictive models which are reproducible in nature. The methods developed in this dissertation can serve as analytical tools by policymakers to plan infrastructure changes and facilitate data collection efforts that represent movement patterns for all ages and abilities.

Contributors: Roy, Avipsa (Author) / Nelson, Trisalyn A. (Thesis advisor) / Kedron, Peter J. (Committee member) / Li, Wenwen (Committee member) / Arizona State University (Publisher)

Created: 2021

Spatial Regression and Gaussian Process BART

Description

Spatial regression is one of the central topics in spatial statistics. Based on the goals, interpretation or prediction, spatial regression models can be classified into two categories, linear mixed regression models and nonlinear regression models. This dissertation explored these models and their real world applications. New methods and models were proposed to overcome the challenges in practice. There are three major parts in the dissertation.

In the first part, nonlinear regression models were embedded into a multistage workflow to predict the spatial abundance of reef fish species in the Gulf of Mexico. There were two challenges: zero-inflated data and out-of-sample prediction. The methods and models in the workflow could effectively handle the zero-inflated sampling data without strong assumptions. Three strategies were proposed to solve the out-of-sample prediction problem. The results and discussions showed that the nonlinear prediction had the advantages of high accuracy, low bias, and good performance across multiple resolutions.

In the second part, a two-stage spatial regression model was proposed for analyzing soil carbon stock (SOC) data. In the first stage, a spatial linear mixed model captured the linear and stationary effects. In the second stage, a generalized additive model was used to explain the nonlinear and nonstationary effects. The results illustrated that the two-stage model offered good interpretability in understanding the effect of covariates while maintaining high prediction accuracy, competitive with popular machine learning models such as random forest, XGBoost, and support vector machines.

A new nonlinear regression model, Gaussian process BART (Bayesian additive regression tree), was proposed in the third part. Combining the advantages of both BART and Gaussian processes, the model could capture the nonlinear effects of both observed and latent covariates. To develop the model, the traditional BART was first generalized to accommodate correlated errors. Then, the failure of likelihood-based Markov chain Monte Carlo (MCMC) in parameter estimation was discussed. Based on the idea of analysis of variation, two strategies, back comparing and tuning range, were proposed to tackle this failure. Finally, the effectiveness of the new model was examined by experiments on both simulated and real data.

Contributors: Lu, Xuetao (Author) / McCulloch, Robert (Thesis advisor) / Hahn, Paul (Committee member) / Lan, Shiwei (Committee member) / Zhou, Shuang (Committee member) / Saul, Steven (Committee member) / Arizona State University (Publisher)

Created: 2020

A spatial statistical framework for evaluating landscape pattern and its impacts on the urban thermal environment

Description

Urban growth, from regional sprawl to global urbanization, is the most rapid, drastic, and irreversible form of human modification to the natural environment. Extensive land cover modifications during urban growth have altered the local energy balance, making the city warmer than its surrounding rural environment, a phenomenon known as an urban heat island (UHI). How are the seasonal and diurnal surface temperatures related to the land surface characteristics, and what land cover types and/or patterns are desirable for ameliorating climate in a fast-growing desert city? This dissertation scrutinizes these questions and seeks to address them using a combination of satellite remote sensing, geographical information science, and spatial statistical modeling techniques.

This dissertation includes two main parts. The first part proposes to employ the continuous, pixel-based landscape gradient models in comparison to the discrete, patch-based mosaic models and evaluates model efficiency in two empirical contexts: urban landscape pattern mapping and land cover dynamics monitoring. The second part formalizes a novel statistical model called spatially filtered ridge regression (SFRR) that ensures accurate and stable statistical estimation despite the existence of multicollinearity and the inherent spatial effect.

Results highlight the strong potential of local indicators of spatial dependence in landscape pattern mapping across various geographical scales. This is based on evidence from a sequence of exploratory comparative analyses and a time series study of land cover dynamics over Phoenix, AZ. The newly proposed SFRR method is capable of producing reliable estimates when analyzing statistical relationships involving geographic data and highly correlated predictor variables. An empirical application of the SFRR over Phoenix suggests that urban cooling can be achieved not only by altering the land cover abundance, but also by optimizing the spatial arrangements of urban land cover features. Considering the limited water supply, rapid urban expansion, and the continuously warming climate, judicious design and planning of urban land cover features is of increasing importance for conserving resources and enhancing quality of life.

Contributors: Fan, Chao (Author) / Myint, Soe W (Thesis advisor) / Li, Wenwen (Committee member) / Rey, Sergio J (Committee member) / Arizona State University (Publisher)

Created: 2016