Filtering by
- All Subjects: Geographic Information Science and Geodesy
- All Subjects: Bayesian statistics
- All Subjects: spatial statistics
- Genre: Academic theses
- Creators: Li, Wenwen
- Creators: Zhou, Shuang
Choropleth maps are a common form of online cartographic visualization. They reveal patterns in spatial distributions of a variable by associating colors with data values measured at areal units. Although this capability of pattern revelation has popularized the use of choropleth maps, existing methods for their online delivery are limited in supporting dynamic map generation from large areal data. This limitation has become increasingly problematic in online choropleth mapping as access to small area statistics, such as high-resolution census data and real-time aggregates of geospatial data streams, has never been easier due to advances in geospatial web technologies. The current literature shows that the challenge of large areal data can be mitigated through tiled maps where pre-processed map data are hierarchically partitioned into tiny rectangular images or map chunks for efficient data transmission. Various approaches have emerged lately to enable this tile-based choropleth mapping, yet little empirical evidence exists on their ability to handle spatial data with large numbers of areal units, thus complicating technical decision making in the development of online choropleth mapping applications. To fill this knowledge gap, this dissertation study conducts a scalability evaluation of three tile-based methods discussed in the literature: raster, scalable vector graphics (SVG), and HTML5 Canvas. For the evaluation, the study develops two test applications, generates map tiles from five different boundaries of the United States, and measures the response times of the applications under multiple test operations. While specific to the experimental setups of the study, the evaluation results show that the raster method scales better across various types of user interaction than the other methods. Empirical evidence also points to the superior scalability of Canvas to SVG in dynamic rendering of vector tiles, but not necessarily for partial updates of the tiles. These findings indicate that the raster method is better suited for dynamic choropleth rendering from large areal data, while Canvas would be more suitable than SVG when such rendering frequently involves complete updates of vector shapes.
model spatially non-stationary relationships. Classic GWR is considered as a single-scale model that is based on one bandwidth parameter which controls the amount of distance-decay in weighting neighboring data around each location. The single bandwidth in GWR assumes that processes (relationships between the response variable and the predictor variables) all operate at the same scale. However, this posits a limitation in modeling potentially multi-scale processes which are more often seen in the real world. For example, the measured ambient temperature of a location is affected by the built environment, regional weather and global warming, all of which operate at different scales. A recent advancement to GWR termed Multiscale GWR (MGWR) removes the single bandwidth assumption and allows the bandwidths for each covariate to vary. This results in each parameter surface being allowed to have a different degree of spatial variation, reflecting variation across covariate-specific processes. In this way, MGWR has the capability to differentiate local, regional and global processes by using varying bandwidths for covariates. Additionally, bandwidths in MGWR become explicit indicators of the scale at various processes operate. The proposed dissertation covers three perspectives centering on MGWR: Computation; Inference; and Application. The first component focuses on addressing computational issues in MGWR to allow MGWR models to be calibrated more efficiently and to be applied on large datasets. The second component aims to statistically differentiate the spatial scales at which different processes operate by quantifying the uncertainty associated with each bandwidth obtained from MGWR. In the third component, an empirical study will be conducted to model the changing relationships between county-level socio-economic factors and voter preferences in the 2008-2016 United States presidential elections using MGWR.
In the first part, nonlinear regression models were embedded into a multistage workflow to predict the spatial abundance of reef fish species in the Gulf of Mexico. There were two challenges, zero-inflated data and out of sample prediction. The methods and models in the workflow could effectively handle the zero-inflated sampling data without strong assumptions. Three strategies were proposed to solve the out of sample prediction problem. The results and discussions showed that the nonlinear prediction had the advantages of high accuracy, low bias and well-performed in multi-resolution.
In the second part, a two-stage spatial regression model was proposed for analyzing soil carbon stock (SOC) data. In the first stage, there was a spatial linear mixed model that captured the linear and stationary effects. In the second stage, a generalized additive model was used to explain the nonlinear and nonstationary effects. The results illustrated that the two-stage model had good interpretability in understanding the effect of covariates, meanwhile, it kept high prediction accuracy which is competitive to the popular machine learning models, like, random forest, xgboost and support vector machine.
A new nonlinear regression model, Gaussian process BART (Bayesian additive regression tree), was proposed in the third part. Combining advantages in both BART and Gaussian process, the model could capture the nonlinear effects of both observed and latent covariates. To develop the model, first, the traditional BART was generalized to accommodate correlated errors. Then, the failure of likelihood based Markov chain Monte Carlo (MCMC) in parameter estimating was discussed. Based on the idea of analysis of variation, back comparing and tuning range, were proposed to tackle this failure. Finally, effectiveness of the new model was examined by experiments on both simulation and real data.
This dissertation includes two main parts. The first part proposes to employ the continuous, pixel-based landscape gradient models in comparison to the discrete, patch-based mosaic models and evaluates model efficiency in two empirical contexts: urban landscape pattern mapping and land cover dynamics monitoring. The second part formalizes a novel statistical model called spatially filtered ridge regression (SFRR) that ensures accurate and stable statistical estimation despite the existence of multicollinearity and the inherent spatial effect.
Results highlight the strong potential of local indicators of spatial dependence in landscape pattern mapping across various geographical scales. This is based on evidence from a sequence of exploratory comparative analyses and a time series study of land cover dynamics over Phoenix, AZ. The newly proposed SFRR method is capable of producing reliable estimates when analyzing statistical relationships involving geographic data and highly correlated predictor variables. An empirical application of the SFRR over Phoenix suggests that urban cooling can be achieved not only by altering the land cover abundance, but also by optimizing the spatial arrangements of urban land cover features. Considering the limited water supply, rapid urban expansion, and the continuously warming climate, judicious design and planning of urban land cover features is of increasing importance for conserving resources and enhancing quality of life.