Matching Items (3)

Spatial Regression and Gaussian Process BART

Description

Spatial regression is one of the central topics in spatial statistics. Depending on whether the goal is interpretation or prediction, spatial regression models can be classified into two categories: linear mixed regression models and nonlinear regression models. This dissertation explored these models and their real-world applications. New methods and models were proposed to overcome challenges that arise in practice. The dissertation has three major parts.

In the first part, nonlinear regression models were embedded into a multistage workflow to predict the spatial abundance of reef fish species in the Gulf of Mexico. There were two challenges: zero-inflated data and out-of-sample prediction. The methods and models in the workflow could effectively handle the zero-inflated sampling data without strong assumptions. Three strategies were proposed to solve the out-of-sample prediction problem. The results and discussion showed that the nonlinear prediction offered high accuracy and low bias, and performed well at multiple resolutions.

In the second part, a two-stage spatial regression model was proposed for analyzing soil carbon stock (SOC) data. The first stage was a spatial linear mixed model that captured the linear and stationary effects. In the second stage, a generalized additive model was used to explain the nonlinear and nonstationary effects. The results illustrated that the two-stage model offered good interpretability of covariate effects while maintaining prediction accuracy competitive with popular machine learning models such as random forest, XGBoost, and support vector machines.
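The two-stage idea can be illustrated with a minimal sketch, under loud assumptions: the data are synthetic and one-dimensional, ordinary least squares stands in for the spatial linear mixed model (no spatial random effect is modeled), and a Gaussian kernel smoother stands in for the generalized additive model. All function names and parameters here are hypothetical and are not taken from the dissertation.

```python
import math
import random


def fit_line(xs, ys):
    """Stage 1 stand-in: ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def kernel_smooth(xs, values, x0, bandwidth):
    """Stage 2 stand-in: Gaussian-kernel weighted average of `values` near x0."""
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def fit_two_stage(xs, ys, bandwidth=0.2):
    """Fit the linear stage, then smooth its residuals; return coefficients and a predictor."""
    intercept, slope = fit_line(xs, ys)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

    def predict(x0):
        linear_part = intercept + slope * x0
        nonlinear_part = kernel_smooth(xs, residuals, x0, bandwidth)
        return linear_part + nonlinear_part

    return (intercept, slope), predict


def mean_squared_error(predictions, ys):
    return sum((p - y) ** 2 for p, y in zip(predictions, ys)) / len(ys)


# Synthetic data (an assumption): a linear trend plus a smooth nonlinear term and noise.
rng = random.Random(0)
xs = [0.06 * i for i in range(100)]
ys = [1.0 + 2.0 * x + math.sin(3.0 * x) + rng.gauss(0.0, 0.1) for x in xs]

(intercept, slope), predict = fit_two_stage(xs, ys)
mse_linear = mean_squared_error([intercept + slope * x for x in xs], ys)
mse_two_stage = mean_squared_error([predict(x) for x in xs], ys)
print("stage 1 only:", mse_linear, "both stages:", mse_two_stage)
```

The linear coefficients stay directly interpretable, while the second stage absorbs structure the linear stage cannot express, which is the trade-off the abstract describes.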

A new nonlinear regression model, Gaussian process BART (Bayesian additive regression trees), was proposed in the third part. Combining the advantages of both BART and Gaussian processes, the model could capture the nonlinear effects of both observed and latent covariates. To develop the model, the traditional BART was first generalized to accommodate correlated errors. Then, the failure of likelihood-based Markov chain Monte Carlo (MCMC) in parameter estimation was discussed. Based on the idea of analysis of variation, two techniques, back comparing and tuning range, were proposed to tackle this failure. Finally, the effectiveness of the new model was examined by experiments on both simulated and real data.

Date Created
  • 2020

Bayesian networks and gaussian mixture models in multi-dimensional data analysis with application to religion-conflict data

Description

This thesis examines the application of statistical signal processing approaches to data arising from surveys intended to measure psychological and sociological phenomena underpinning human social dynamics. The use of signal processing methods for analysis of signals arising from measurement of social, biological, and other non-traditional phenomena has been an important and growing area of signal processing research over the past decade. Here, we explore the application of statistical modeling and signal processing concepts to data obtained from the Global Group Relations Project, specifically to understand and quantify the effects and interactions of social psychological factors related to intergroup conflicts. We use Bayesian networks to specify prospective models of conditional dependence. The networks relate social psychological factors to conflict variables and are represented as directed acyclic graphs, while the significant interactions are modeled as conditional probabilities. Since the data are sparse and multi-dimensional, we fit Gaussian mixture models (GMMs) to the data to estimate the conditional probabilities of interest. The parameters of the GMMs are estimated using the expectation-maximization (EM) algorithm. However, the EM algorithm may suffer from over-fitting because of the high dimensionality and limited number of observations in this data set. Therefore, the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) are used for GMM order estimation. To assist intuitive understanding of the interactions between social variables and intergroup conflicts, we introduce a color-based visualization scheme in which the intensities of colors are proportional to the observed conditional probabilities.
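The combination of EM fitting and AIC/BIC order estimation can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis code: the data are synthetic and one-dimensional (the survey data are multi-dimensional), initialization uses evenly spaced quantiles, and a variance floor is added to keep components from collapsing. The function names `fit_gmm_1d` and `information_criteria` are hypothetical.

```python
import math
import random


def mixture_loglik(xs, weights, means, variances):
    """Log-likelihood of the data under a 1-D Gaussian mixture."""
    total = 0.0
    for x in xs:
        density = sum(
            w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
            for w, m, v in zip(weights, means, variances)
        )
        total += math.log(max(density, 1e-300))  # guard against underflow
    return total


def fit_gmm_1d(xs, k, iters=100, var_floor=1e-3):
    """Fit a k-component 1-D GMM with EM; return weights, means, variances, log-likelihood."""
    n = len(xs)
    ordered = sorted(xs)
    # Deterministic initialization (an assumption): means at evenly spaced quantiles.
    means = [ordered[int((2 * j + 1) * n / (2 * k))] for j in range(k)]
    grand_mean = sum(xs) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in xs) / n, var_floor)
    variances = [grand_var] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in xs:
            dens = [
                w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            norm = max(sum(dens), 1e-300)
            resp.append([d / norm for d in dens])
        # M-step: re-estimate weights, means, and variances from responsibilities.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, var_floor)
    return weights, means, variances, mixture_loglik(xs, weights, means, variances)


def information_criteria(xs, k):
    """Return (AIC, BIC) for a fitted k-component model; lower is better."""
    *_, loglik = fit_gmm_1d(xs, k)
    n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
    aic = 2.0 * n_params - 2.0 * loglik
    bic = n_params * math.log(len(xs)) - 2.0 * loglik
    return aic, bic


# Synthetic two-cluster data (an assumption made for illustration only).
rng = random.Random(0)
data = [rng.gauss(-5.0, 1.0) for _ in range(100)] + [rng.gauss(5.0, 1.0) for _ in range(100)]
scores = {k: information_criteria(data, k) for k in (1, 2, 3)}
best_k = min(scores, key=lambda k: scores[k][1])  # order with the lowest BIC
print("BIC by order:", {k: round(v[1], 1) for k, v in scores.items()}, "chosen:", best_k)
```

Both criteria penalize the log-likelihood by the number of free parameters, which is what counters the over-fitting tendency of EM when observations are limited; BIC applies the heavier penalty as the sample size grows.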

Date Created
  • 2012

Optimum corona ring design for high voltage compact transmission lines using Gaussian process model

Description

Electric utilities are exploring new technologies to cope with the increase in electricity demand and to raise the power transfer capabilities of transmission lines. Compact transmission lines and high phase order systems are among the techniques that enhance the power transfer capability of transmission lines without requiring any additional right-of-way. This research work investigates the impact of compacting high voltage transmission lines and high phase order systems on the surface electric field of composite insulators, a key factor in the service performance of insulators. The electric field analysis was done using COULOMB 9.0, a 3D software package that uses a numerical analysis technique based on the Boundary Element Method (BEM). 3D models of various types of standard transmission towers used at the 230 kV, 345 kV and 500 kV levels were built with different insulator configurations and numbers of circuits. Standard tower configuration models were compacted by reducing the clearance from live parts in steps of 10%. It was found that the standard tower configuration can be compacted to 30% without violating the minimum safety clearance mandated by NESC standards. The study shows that the surface electric field on insulators for a few of the compact structures exceeded the maximum allowable limit even when corona rings were installed. As part of this study, a Gaussian process model based optimization program was developed to find the optimum corona ring dimensions that limit the electric field to within stipulated values. The optimization program provides the dimensions of the corona ring and its placement from the high voltage end for a given dry arc length of insulator and system voltage. JMP, a statistical computer package, and AMPL, a computer language widely used for optimization, were used for the optimization program. The results obtained from the optimization program validated the industrial standards.
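The general pattern of Gaussian process based optimization, fitting a surrogate to a handful of expensive simulation runs and then optimizing the surrogate, can be sketched as below. This is an illustration only: the thesis used JMP and AMPL, whereas this sketch uses plain Python, a single design variable, a fixed RBF kernel, and a hypothetical `toy_objective` standing in for a field simulation. None of the numbers relate to real corona ring designs.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def solve_cholesky(lower, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(lower[i][k] * y[k] for k in range(i))) / lower[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(lower[k][i] * x[k] for k in range(i + 1, n))) / lower[i][i]
    return x


def fit_gp(xs, ys, jitter=1e-8):
    """Return the GP posterior mean function given training inputs and outputs."""
    mean_y = sum(ys) / len(ys)  # constant prior mean (an assumption)
    gram = [
        [rbf(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(xs)]
        for i, a in enumerate(xs)
    ]
    alpha = solve_cholesky(cholesky(gram), [y - mean_y for y in ys])

    def posterior_mean(x0):
        return mean_y + sum(a * rbf(x0, x) for a, x in zip(alpha, xs))

    return posterior_mean


def toy_objective(x):
    """Hypothetical stand-in for an expensive simulation output to be minimized."""
    return (x - 2.0) ** 2


# A few "simulation runs" at chosen design points, then a cheap search on the surrogate.
train_x = [0.0, 0.8, 1.6, 2.4, 3.2, 4.0]
train_y = [toy_objective(x) for x in train_x]
surrogate = fit_gp(train_x, train_y)
grid = [0.01 * i for i in range(401)]
best_x = min(grid, key=surrogate)
print("surrogate minimizer on the grid:", best_x)
```

The appeal of this design is that the costly simulator is called only at the training points, while the search itself runs on the inexpensive surrogate; a practical version would also use the GP's predictive variance and enforce the field limit as a constraint.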

Date Created
  • 2012