Bayesian networks and Gaussian mixture models in multi-dimensional data analysis with application to religion-conflict data
This thesis examines the application of statistical signal processing approaches to data arising from surveys intended to measure psychological and sociological phenomena underpinning human social dynamics. The use of signal processing methods to analyze signals arising from the measurement of social, biological, and other non-traditional phenomena has been an important and growing area of signal processing research over the past decade. Here, we explore the application of statistical modeling and signal processing concepts to data obtained from the Global Group Relations Project, specifically to understand and quantify the effects and interactions of social psychological factors related to intergroup conflicts. We use Bayesian networks to specify prospective models of conditional dependence. The networks relating social psychological factors to conflict variables are represented as directed acyclic graphs, and the significant interactions are modeled as conditional probabilities. Since the data are sparse and multi-dimensional, we fit Gaussian mixture models (GMMs) to the data to estimate the conditional probabilities of interest. The parameters of the GMMs are estimated using the expectation-maximization (EM) algorithm. However, the EM algorithm may overfit because of the high dimensionality and the limited number of observations in this data set. Therefore, the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) are used for GMM order estimation. To assist intuitive understanding of the interactions between the social variables and the intergroup conflicts, we introduce a color-based visualization scheme. In this scheme, the intensity of each color is proportional to the corresponding conditional probability.
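The order-estimation step summarized above can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes one-dimensional data for brevity (the survey data are multi-dimensional), uses synthetic samples rather than Global Group Relations Project data, and the function names, the quantile-based initialization, and the variance floor `var_floor` are all illustrative choices. It fits GMMs of increasing order by EM and compares them with AIC = 2p - 2 ln L and BIC = p ln n - 2 ln L, where p = 3k - 1 is the number of free parameters of a k-component one-dimensional mixture.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_pdf(x, weights, means, variances):
    """Density of a univariate Gaussian mixture at x."""
    return sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))


def fit_gmm_1d(data, k, n_iter=200, var_floor=1e-3):
    """Fit a k-component 1-D GMM by EM.

    Returns (weights, means, variances, log_likelihood).
    var_floor is an assumed regularizer that keeps components from collapsing.
    """
    n = len(data)
    ordered = sorted(data)
    # Deterministic initialization (an assumption): means at evenly spaced quantiles.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(data) / n
    total_var = sum((x - grand_mean) ** 2 for x in data) / n
    variances = [max(total_var, var_floor)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each sample.
        resp = []
        for x in data:
            joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([p / total for p in joint] if total > 0 else [1.0 / k] * k)
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, var_floor)
    loglik = sum(math.log(max(mixture_pdf(x, weights, means, variances), 1e-300)) for x in data)
    return weights, means, variances, loglik


def information_criteria(loglik, k, n):
    """Return (AIC, BIC) for a k-component 1-D GMM fitted to n samples."""
    p = 3 * k - 1  # (k - 1) weights + k means + k variances
    return 2 * p - 2 * loglik, p * math.log(n) - 2 * loglik


def select_order(data, max_k=5):
    """Fit GMMs of order 1..max_k and return the orders minimizing AIC and BIC."""
    scores = {}
    for k in range(1, max_k + 1):
        *_, loglik = fit_gmm_1d(data, k)
        scores[k] = information_criteria(loglik, k, len(data))
    best_aic = min(scores, key=lambda k: scores[k][0])
    best_bic = min(scores, key=lambda k: scores[k][1])
    return best_aic, best_bic, scores


if __name__ == "__main__":
    # Synthetic two-cluster sample, used only to exercise the code.
    rng = random.Random(0)
    sample = [rng.gauss(-3.0, 0.5) for _ in range(60)] + [rng.gauss(3.0, 0.5) for _ in range(60)]
    aic_k, bic_k, table = select_order(sample, max_k=4)
    print("order chosen by AIC:", aic_k, "| order chosen by BIC:", bic_k)
    for k, (aic, bic) in table.items():
        print(k, round(aic, 1), round(bic, 1))
```

Because the BIC penalty per parameter, ln n, exceeds the AIC penalty of 2 once n is at least 8, BIC never selects a higher order than AIC in this sketch, which is one reason the two criteria are often reported together when observations are limited.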