Matching Items (6)

Spatializing partisan gerrymandering forensics: local measures and spatial specifications

Description

Gerrymandering is a central problem for many representative democracies. Formally, gerrymandering is the manipulation of spatial boundaries to provide political advantage to a particular group (Warf, 2006). The term often refers to political district design, where the boundaries of political districts are “unnaturally” manipulated by redistricting officials to generate durable advantages for one group or party. Since free and fair elections are arguably the critical part of representative democracy, it is important for the current wave of reform to have scientifically validated tools. This dissertation supports that reform by developing a general inferential technique to “localize” inferential bias measures, generating a new type of district-level score. The new method relies on the statistical intuition behind jackknife methods to construct relative local indicators. I find that existing statewide indicators of partisan bias can be localized using this technique, providing an estimate of how strongly a district impacts statewide partisan bias over an entire decade. When compared to measures of shape compactness (a common gerrymandering detection statistic), I find that weirdly shaped districts have no consistent relationship with impact in many states during the 2000 and 2010 redistricting plans. To ensure that this work is valid, I examine existing seats-votes modeling strategies and develop a novel method for constructing seats-votes curves. I find that, while the empirical structure of electoral swing shows significant spatial dependence (even in the face of spatial heterogeneity), existing seats-votes specifications are more robust than anticipated to spatial dependence. Centrally, this dissertation contributes to the much larger social aim of resisting electoral manipulation: that individuals and organizations suffer no undue burden on political access from partisan gerrymandering.
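
The jackknife intuition mentioned in this abstract can be illustrated with a minimal sketch. It is not the dissertation's method: the statewide statistic used here (the mean-median gap of district vote shares) is an assumed stand-in for a partisan bias measure, and the function names and vote shares are hypothetical. The idea shown is only the leave-one-out step: recompute the statewide statistic without each district and record how much it changes.

```python
from statistics import mean, median


def mean_median_gap(shares):
    """Assumed stand-in for a statewide bias measure: mean minus median share."""
    return mean(shares) - median(shares)


def jackknife_impacts(shares, statistic=mean_median_gap):
    """Leave-one-out impact of each district on a statewide statistic.

    impact[i] = statistic(all districts) - statistic(all districts except i)
    """
    full = statistic(shares)
    return [
        full - statistic(shares[:i] + shares[i + 1:])
        for i in range(len(shares))
    ]


# Hypothetical party vote shares in five districts.
shares = [0.35, 0.40, 0.45, 0.55, 0.80]
impacts = jackknife_impacts(shares)
most_influential = max(range(len(shares)), key=lambda i: abs(impacts[i]))
```

In this toy example the lopsided fifth district moves the statewide statistic the most when it is dropped, which is the kind of district-level score the abstract describes in general terms.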

Date Created
2017

Spatio-temporal statistical modeling: climate impacts due to bioenergy crop expansion

Description

Large-scale cultivation of perennial bioenergy crops (e.g., miscanthus and switchgrass) offers unique opportunities to mitigate climate change through avoided fossil fuel use and associated greenhouse gas reduction. Although conversion of existing agriculturally intensive lands (e.g., maize and soy) to perennial bioenergy cropping systems has been shown to reduce near-surface temperatures, unintended consequences on natural water resources via depletion of soil moisture may offset these benefits. In an effort to cross-fertilize the disciplines of physics-based modeling and spatio-temporal statistics, this dissertation investigates three topics, aiming to provide a novel quantification and robust justification of the hydroclimatic impacts associated with bioenergy crop expansion. Topic 1 quantifies the hydroclimatic impacts associated with perennial bioenergy crop expansion over the contiguous United States using the Weather Research and Forecasting Model (WRF) dynamically coupled to a land surface model (LSM). A suite of continuous (2000–09) medium-range resolution (20-km grid spacing) ensemble-based simulations is conducted. Hovmöller and Taylor diagrams are utilized to evaluate simulated temperature and precipitation. In addition, modified Mann-Kendall trend tests and sieve-bootstrap trend tests are performed to evaluate the statistical significance of trends in soil moisture differences. Finally, this research reveals potential hot spots of suitable deployment and regions to avoid. Topic 2 presents spatio-temporal Bayesian models that quantify the robustness of control simulation bias, as well as biofuel impacts, using three spatio-temporal correlation structures. A hierarchical model with spatially varying intercepts and slopes displays satisfactory performance in capturing spatio-temporal associations. Simulated temperature impacts due to perennial bioenergy crop expansion are robust to physics parameterization schemes.
Topic 3 further focuses on the accuracy and efficiency of spatio-temporal statistical modeling for large datasets. An ensemble of spatio-temporal eigenvector filtering algorithms (hereafter: STEF) is proposed to account for the spatio-temporal autocorrelation structure of the data while taking into account spatial confounding. Monte Carlo experiments are conducted. This method is then used to quantify the robustness of simulated hydroclimatic impacts associated with bioenergy crops to alternative physics parameterizations. Results are evaluated against those obtained from three alternative Bayesian spatio-temporal specifications.

Date Created
2018

Issues in the Distribution Dynamics Approach to the Analysis of Regional Economic Growth and Convergence: Spatial Effects and Small Samples

Description

In the study of regional economic growth and convergence, the distribution dynamics approach, which interrogates the evolution of the cross-sectional distribution as a whole and is concerned with both the external and internal dynamics of the distribution, has received wide usage. However, many methodological issues remain to be resolved before valid inferences and conclusions can be drawn from empirical research. Among them, spatial effects, including spatial heterogeneity and spatial dependence, invalidate the assumption of independent and identical distributions underlying the conventional maximum likelihood techniques, while the small samples typically available in regional settings call into question the use of asymptotic properties. This dissertation comprises three papers targeted at addressing these two issues. The first paper investigates whether the conventional regional income mobility estimators are still suitable in the presence of spatial dependence and/or a small sample. It is approached through a series of Monte Carlo experiments which require the proposal of a novel data generating process (DGP) capable of generating spatially dependent time series. The second paper moves to the statistical tests for detecting specific forms of spatial (spatiotemporal) effects in the discrete Markov chain model, investigating their robustness to the alternative spatial effect, sensitivity to discretization granularity, and properties in small sample settings. The third paper proposes discrete kernel estimators with cross-validated bandwidths as an alternative to maximum likelihood estimators in small sample settings. It is demonstrated that discrete kernel estimators offer improved performance when the sample size is small.
Taken together, the three papers constitute an endeavor to relax the restrictive assumptions of spatial independence and spatial homogeneity, and to demonstrate the difference between the small sample and asymptotic properties of conventionally adopted maximum likelihood estimators, moving towards a more valid inferential framework for the distribution dynamics approach to the study of regional economic growth and convergence.
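
For readers unfamiliar with the discrete Markov chain model referred to above, the following is a minimal sketch of the conventional maximum likelihood estimator of a transition matrix: count the observed moves between income classes and normalize each row. The function name, the three-class discretization, and the toy sequences are assumptions for illustration and are not taken from the dissertation.

```python
def transition_mle(sequences, n_states):
    """Maximum likelihood estimate of a first-order Markov transition matrix.

    sequences: one list of class labels (ints in 0..n_states-1) per region,
    ordered in time. Rows with no observed departures are left as zeros.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for seq in sequences:
        # Each consecutive pair (a, b) is one observed transition a -> b.
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / total if total else 0.0 for c in row])
    return probs


# Hypothetical income classes (0 = low, 1 = middle, 2 = high) for three regions.
regions = [[0, 0, 1, 1], [1, 2, 2, 2], [0, 1, 2, 2]]
P = transition_mle(regions, 3)
```

With only three regions and four periods, each row of `P` rests on three observed transitions, which illustrates why small samples make reliance on asymptotic properties questionable.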

Date Created
2018

Multiscale Geographically Weighted Regression: Computation, Inference, and Application

Description

Geographically Weighted Regression (GWR) has been broadly used in various fields to model spatially non-stationary relationships. Classic GWR is considered a single-scale model that is based on one bandwidth parameter which controls the amount of distance-decay in weighting neighboring data around each location. The single bandwidth in GWR assumes that processes (relationships between the response variable and the predictor variables) all operate at the same scale. However, this poses a limitation in modeling potentially multi-scale processes, which are more often seen in the real world. For example, the measured ambient temperature of a location is affected by the built environment, regional weather and global warming, all of which operate at different scales. A recent advancement to GWR termed Multiscale GWR (MGWR) removes the single bandwidth assumption and allows the bandwidths for each covariate to vary. This results in each parameter surface being allowed to have a different degree of spatial variation, reflecting variation across covariate-specific processes. In this way, MGWR has the capability to differentiate local, regional and global processes by using varying bandwidths for covariates. Additionally, bandwidths in MGWR become explicit indicators of the scale at which various processes operate. The proposed dissertation covers three perspectives centering on MGWR: Computation; Inference; and Application. The first component focuses on addressing computational issues in MGWR to allow MGWR models to be calibrated more efficiently and to be applied on large datasets. The second component aims to statistically differentiate the spatial scales at which different processes operate by quantifying the uncertainty associated with each bandwidth obtained from MGWR. In the third component, an empirical study will be conducted to model the changing relationships between county-level socio-economic factors and voter preferences in the 2008-2016 United States presidential elections using MGWR.
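
The role of the bandwidth can be made concrete with a small sketch of distance-decay weighting. A Gaussian kernel is assumed here as one common choice; the abstract does not say which kernel the dissertation uses, and the function name and coordinates are hypothetical.

```python
import math


def gaussian_weights(focal, points, bandwidth):
    """Gaussian distance-decay weights of each point relative to a focal location."""
    weights = []
    for x, y in points:
        d = math.hypot(x - focal[0], y - focal[1])  # Euclidean distance
        weights.append(math.exp(-0.5 * (d / bandwidth) ** 2))
    return weights


# Three hypothetical observation locations; the first is the focal point itself.
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0)]
local = gaussian_weights((0.0, 0.0), points, bandwidth=1.0)      # small bandwidth
regional = gaussian_weights((0.0, 0.0), points, bandwidth=10.0)  # large bandwidth
```

With the small bandwidth, the observation five units away receives almost no weight, so the fitted relationship is highly local. With the large bandwidth, all three observations are weighted nearly equally, approaching a global fit. In a multiscale setting, each covariate would get its own bandwidth in place of the single value shown here.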

Date Created
2020

Spatial Mortality Modeling in Actuarial Science

Description

Modeling human survivorship is a core area of research within the actuarial community. With life insurance policies and annuity products as dominant financial instruments which depend on future mortality rates, there is a risk that observed human mortality experiences will differ from those projected when the products are sold. From an insurer’s portfolio perspective, to curb this risk, it is imperative that models of human survivorship are constantly being updated and equipped to accurately gauge and forecast mortality rates. At present, the majority of actuarial research in mortality modeling involves factor-based approaches which operate at a global scale, placing little attention on the determinants and interpretable risk factors of mortality, specifically from a spatial perspective. With an abundance of research being performed in the field of spatial statistics and greater accessibility to localized mortality data, there is a clear opportunity to extend the existing body of mortality literature towards the spatial domain. It is the objective of this dissertation to introduce these new statistical approaches to equip the field of actuarial science to incorporate geographic space into the mortality modeling context.

First, this dissertation evaluates the underlying spatial patterns of mortality across the United States, and introduces a spatial filtering methodology to generate latent spatial patterns which capture the essence of these mortality rates in space. Second, local modeling techniques are illustrated, and a multiscale geographically weighted regression (MGWR) model is generated to describe the variation of mortality rates across space in an interpretable manner which allows for the investigation of the presence of spatial variability in the determinants of mortality. Third, techniques for updating traditional mortality models are introduced, culminating in the development of a model which addresses the relationship between space, economic growth, and mortality. It is through these applications that this dissertation demonstrates the utility in updating actuarial mortality models from a spatial perspective.

Date Created
2020

Spatial Regression and Gaussian Process BART

Description

Spatial regression is one of the central topics in spatial statistics. Depending on the goal, interpretation or prediction, spatial regression models can be classified into two categories: linear mixed regression models and nonlinear regression models. This dissertation explored these models and their real-world applications. New methods and models were proposed to overcome the challenges in practice. There are three major parts in the dissertation.

In the first part, nonlinear regression models were embedded into a multistage workflow to predict the spatial abundance of reef fish species in the Gulf of Mexico. There were two challenges: zero-inflated data and out-of-sample prediction. The methods and models in the workflow could effectively handle the zero-inflated sampling data without strong assumptions. Three strategies were proposed to solve the out-of-sample prediction problem. The results and discussions showed that the nonlinear prediction had the advantages of high accuracy, low bias, and good performance across multiple resolutions.

In the second part, a two-stage spatial regression model was proposed for analyzing soil carbon stock (SOC) data. In the first stage, a spatial linear mixed model captured the linear and stationary effects. In the second stage, a generalized additive model was used to explain the nonlinear and nonstationary effects. The results illustrated that the two-stage model had good interpretability in understanding the effect of covariates while keeping high prediction accuracy, competitive with popular machine learning models such as random forest, XGBoost, and support vector machines.

A new nonlinear regression model, Gaussian process BART (Bayesian additive regression tree), was proposed in the third part. Combining the advantages of both BART and Gaussian processes, the model could capture the nonlinear effects of both observed and latent covariates. To develop the model, first, the traditional BART was generalized to accommodate correlated errors. Then, the failure of likelihood-based Markov chain Monte Carlo (MCMC) in parameter estimation was discussed. Based on the idea of analysis of variation, two remedies, back comparing and tuning range, were proposed to tackle this failure. Finally, the effectiveness of the new model was examined by experiments on both simulated and real data.

Date Created
2020