Matching Items (23)
Description
Optimal experimental design for generalized linear models is often done using a pseudo-Bayesian approach that integrates the design criterion across a prior distribution on the parameter values. This approach ignores the lack of utility of certain models contained in the prior, and a case is demonstrated where the heavy focus on such hopeless models results in a design with poor performance and with wild swings in coverage probabilities for Wald-type confidence intervals. Design construction using a utility-based approach is shown to result in much more stable coverage probabilities in the area of greatest concern.

The pseudo-Bayesian approach can be applied to the problem of optimal design construction under dependent observations. Often, correlation between observations exists due to restrictions on randomization. Several techniques for optimal design construction are proposed in the case of the conditional response distribution being a natural exponential family member but with a normally distributed block effect. The reviewed pseudo-Bayesian approach is compared to an approach based on substituting the marginal likelihood with the joint likelihood and an approach based on projections of the score function (often called quasi-likelihood). These approaches are compared for several models with normal, Poisson, and binomial conditional response distributions via the true determinant of the expected Fisher information matrix where the dispersion of the random blocks is considered a nuisance parameter. A case study using the developed methods is performed.

The joint and quasi-likelihood methods are then extended to address the case when the magnitude of random block dispersion is of concern. Again, a simulation study over several models is performed, followed by a case study when the conditional response distribution is a Poisson distribution.
Contributors: Hassler, Edgar (Author) / Montgomery, Douglas C. (Thesis advisor) / Silvestrini, Rachel T. (Thesis advisor) / Borror, Connie M. (Committee member) / Pan, Rong (Committee member) / Arizona State University (Publisher)
Created: 2015
Description
The primary objective in time series analysis is forecasting. Raw data often exhibits nonstationary behavior: trends, seasonal cycles, and heteroskedasticity. After data is transformed to a weakly stationary process, autoregressive moving average (ARMA) models may capture the remaining temporal dynamics to improve forecasting. Estimation of ARMA can be performed through regressing current values on previous realizations and proxy innovations. The classic paradigm fails when dynamics are nonlinear; in this case, parametric, regime-switching specifications model changes in level, ARMA dynamics, and volatility, using a finite number of latent states. If the states can be identified using past endogenous or exogenous information, a threshold autoregressive (TAR) or logistic smooth transition autoregressive (LSTAR) model may simplify complex nonlinear associations to conditional weakly stationary processes. For ARMA, TAR, and STAR, order parameters quantify the extent to which past information is associated with the future. Unfortunately, even if model orders are known a priori, the possibility of over-fitting can lead to sub-optimal forecasting performance. By intentionally overestimating these orders, a linear representation of the full model is exploited and Bayesian regularization can be used to achieve sparsity. Global-local shrinkage priors for AR, MA, and exogenous coefficients are adopted to pull posterior means toward 0 without over-shrinking relevant effects. This dissertation introduces, evaluates, and compares Bayesian techniques that automatically perform model selection and coefficient estimation of ARMA, TAR, and STAR models. Multiple Monte Carlo experiments illustrate the accuracy of these methods in finding the "true" data generating process. Practical applications demonstrate their efficacy in forecasting.
Contributors: Giacomazzo, Mario (Author) / Kamarianakis, Yiannis (Thesis advisor) / Reiser, Mark R. (Committee member) / McCulloch, Robert (Committee member) / Hahn, Richard (Committee member) / Fricks, John (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
Dynamic Bayesian networks (DBNs; Reye, 2004) are a promising tool for modeling student proficiency under rich measurement scenarios (Reichenberg, in press). These scenarios often present assessment conditions far more complex than what is seen with more traditional assessments and require assessment arguments and psychometric models capable of integrating those complexities. Unfortunately, DBNs remain understudied and their psychometric properties relatively unknown. If the apparent strengths of DBNs are to be leveraged, then the body of literature surrounding their properties and use needs to be expanded upon. To this end, the current work aimed at exploring the properties of DBNs under a variety of realistic psychometric conditions. A two-phase Monte Carlo simulation study was conducted in order to evaluate parameter recovery for DBNs using maximum likelihood estimation with the Netica software package. Phase 1 included a limited number of conditions and was exploratory in nature while Phase 2 included a larger and more targeted complement of conditions. Manipulated factors included sample size, measurement quality, test length, and the number of measurement occasions. Results suggested that measurement quality has the most prominent impact on estimation quality with more distinct performance categories yielding better estimation. While increasing sample size tended to improve estimation, there were a limited number of conditions under which greater sample size led to more estimation bias. An exploration of this phenomenon is included. From a practical perspective, parameter recovery appeared to be sufficient with samples as low as N = 400 as long as measurement quality was not poor and at least three items were present at each measurement occasion. Tests consisting of only a single item required exceptional measurement quality in order to adequately recover model parameters.
The study was somewhat limited due to potentially software-specific issues as well as a non-comprehensive collection of experimental conditions. Further research should replicate and potentially expand the current work using other software packages, including exploring alternate estimation methods (e.g., Markov chain Monte Carlo).
Contributors: Reichenberg, Raymond E (Author) / Levy, Roy (Thesis advisor) / Eggum-Wilkens, Natalie (Thesis advisor) / Iida, Masumi (Committee member) / DeLay, Dawn (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
Pipeline infrastructure forms a vital aspect of the United States economy and standard of living. A majority of the current pipeline systems were installed in the early 1900s and often lack a reliable database reporting the mechanical properties and information about manufacturing and installation, thereby raising a concern for their safety and integrity. Estimating the strength and toughness of aging pipes without interrupting transmission and operations thus becomes important. The state-of-the-art techniques tend to focus on the single modality deterministic estimation of pipe strength and do not account for inhomogeneity and uncertainties; many others appear to rely on destructive means. These gaps provide an impetus for novel methods to better characterize the pipe material properties. The focus of this study is the design of a Bayesian Network information fusion model for the prediction of accurate probabilistic pipe strength and consequently the maximum allowable operating pressure. A multimodal diagnosis is performed by assessing the mechanical property variation within the pipe in terms of material property measurements, such as microstructure, composition, hardness and other mechanical properties through experimental analysis, which are then integrated with the Bayesian network model that uses a Markov chain Monte Carlo (MCMC) algorithm. Prototype testing is carried out for model verification, validation and demonstration, and data training of the model is employed to obtain a more accurate measure of the probabilistic pipe strength. With a view to providing a holistic measure of material performance in service, the fatigue properties of the pipe steel are investigated. The variation in the fatigue crack growth rate (da/dN) along the direction of the pipe wall thickness is studied in relation to the microstructure and the material constants for the crack growth have been reported.
A combination of imaging and composition analysis is incorporated to study the fracture surface of the fatigue specimen. Finally, some well-known statistical inference models are employed for prediction of manufacturing process parameters for steel pipelines. The adaptability of the small datasets for the accuracy of the prediction outcomes is discussed and the models are compared for their performance.
Contributors: Dahire, Sonam (Author) / Liu, Yongming (Thesis advisor) / Jiao, Yang (Committee member) / Ren, Yi (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
In this work, I present a Bayesian inference computational framework for the analysis of widefield microscopy data that addresses three challenges: (1) counting and localizing stationary fluorescent molecules; (2) inferring a spatially-dependent effective fluorescence profile that describes the spatially-varying rate at which fluorescent molecules emit subsequently-detected photons (due to different illumination intensities or different local environments); and (3) inferring the camera gain. My general theoretical framework utilizes the Bayesian nonparametric Gaussian and beta-Bernoulli processes with a Markov chain Monte Carlo sampling scheme, which I further specify and implement for Total Internal Reflection Fluorescence (TIRF) microscopy data, benchmarking the method on synthetic data. These three frameworks are self-contained, and can be used concurrently so that the fluorescence profile and emitter locations are both considered unknown and, under some conditions, learned simultaneously. The framework I present is flexible and may be adapted to accommodate the inference of other parameters, such as emission photophysical kinetics and the trajectories of moving molecules. My TIRF-specific implementation may find use in the study of structures on cell membranes, or in studying local sample properties that affect fluorescent molecule photon emission rates.
Contributors: Wallgren, Ross (Author) / Presse, Steve (Thesis advisor) / Armbruster, Hans (Thesis advisor) / McCulloch, Robert (Committee member) / Arizona State University (Publisher)
Created: 2019
Description
Bayesian Additive Regression Trees (BART) is a non-parametric Bayesian model that often outperforms other popular predictive models in terms of out-of-sample error. This thesis studies a modified version of BART called Accelerated Bayesian Additive Regression Trees (XBART). The study consists of simulation and real data experiments comparing XBART to other leading algorithms, including BART. The results show that XBART maintains BART’s predictive power while reducing its computation time. The thesis also describes the development of a Python package implementing XBART.
Contributors: Yalov, Saar (Author) / Hahn, P. Richard (Thesis advisor) / McCulloch, Robert (Committee member) / Kao, Ming-Hung (Committee member) / Arizona State University (Publisher)
Created: 2019
Description
Bayesian networks are powerful tools in system reliability assessment due to their flexibility in modeling the reliability structure of complex systems. This dissertation develops Bayesian network models for system reliability analysis through the use of Bayesian inference techniques.

Bayesian networks generalize fault trees by allowing components and subsystems to be related by conditional probabilities instead of deterministic relationships; thus, they provide analytical advantages when the failure structure is not well understood, especially during the product design stage. In order to tackle this problem, one needs to utilize auxiliary information such as the reliability information from similar products and domain expertise. For this purpose, a Bayesian network approach is proposed to incorporate data from functional analysis and parent products. The functions with low reliability and their impact on other functions in the network are identified, so that design changes can be suggested for system reliability improvement.

A complex system does not necessarily have all components being monitored at the same time, causing another challenge in the reliability assessment problem. Sometimes there are a limited number of sensors deployed in the system to monitor the states of some components or subsystems, but not all of them. Data simultaneously collected from multiple sensors on the same system are analyzed using a Bayesian network approach, and the conditional probabilities of the network are estimated by combining failure information and expert opinions at both system and component levels. Several data scenarios with discrete, continuous and hybrid data (both discrete and continuous data) are analyzed. Posterior distributions of the reliability parameters of the system and components are assessed using simultaneous data.

Finally, a Bayesian framework is proposed to incorporate different sources of prior information and reconcile these different sources, including expert opinions and component information, in order to form a prior distribution for the system. Incorporating expert opinion in the form of pseudo-observations substantially simplifies statistical modeling, as opposed to the pooling techniques and supra Bayesian methods used for combining prior distributions in the literature.

The methods proposed are demonstrated with several case studies.
Contributors: Yontay, Petek (Author) / Pan, Rong (Thesis advisor) / Montgomery, Douglas C. (Committee member) / Shunk, Dan L. (Committee member) / Du, Xiaoping (Committee member) / Arizona State University (Publisher)
Created: 2016
Description
Accurate data analysis and interpretation of results may be influenced by many potential factors. The factors of interest in the current work are the chosen analysis model(s), the presence of missing data, and the type(s) of data collected. If analysis models are used which a) do not accurately capture the structure of relationships in the data such as clustered/hierarchical data, b) do not allow or control for missing values present in the data, or c) do not accurately compensate for different data types such as categorical data, then the assumptions associated with the model have not been met and the results of the analysis may be inaccurate. In the presence of clustered/nested data, hierarchical linear modeling or multilevel modeling (MLM; Raudenbush & Bryk, 2002) has the ability to predict outcomes for each level of analysis and across multiple levels (accounting for relationships between levels) providing a significant advantage over single-level analyses. When multilevel data contain missingness, multilevel multiple imputation (MLMI) techniques may be used to model both the missingness and the clustered nature of the data. With categorical multilevel data with missingness, categorical MLMI must be used. Two such routines for MLMI with continuous and categorical data were explored with missing at random (MAR) data: a formal Bayesian imputation and analysis routine in JAGS (R/JAGS) and a common MLM procedure of imputation via Bayesian estimation in BLImP with frequentist analysis of the multilevel model in Mplus (BLImP/Mplus). Manipulated variables included interclass correlations, number of clusters, and the rate of missingness. Results showed that with continuous data, R/JAGS returned more accurate parameter estimates than BLImP/Mplus for almost all parameters of interest across levels of the manipulated variables. Both R/JAGS and BLImP/Mplus encountered convergence issues and returned inaccurate parameter estimates when imputing and analyzing dichotomous data. Follow-up studies showed that JAGS and BLImP returned similar imputed datasets but the choice of analysis software for MLM impacted the recovery of accurate parameter estimates. Implications of these findings and recommendations for further research will be discussed.
Contributors: Kunze, Katie L (Author) / Levy, Roy (Thesis advisor) / Enders, Craig K. (Committee member) / Thompson, Marilyn S (Committee member) / Arizona State University (Publisher)
Created: 2016
Description
Biological and biomedical measurements, when adequately analyzed and processed, can be used to impart quantitative diagnosis during primary health care consultation to improve patient adherence to recommended treatments. For example, analyzing neural recordings from neurostimulators implanted in patients with neurological disorders can be used by a physician to adjust detrimental stimulation parameters to improve treatment. As another example, biosequences, such as sequences from peptide microarrays obtained from a biological sample, can potentially provide pre-symptomatic diagnosis for infectious diseases when processed to associate antibodies to specific pathogens or infectious agents. This work proposes advanced statistical signal processing and machine learning methodologies to assess neurostimulation from neural recordings and to extract diagnostic information from biosequences.

For locating specific cognitive and behavioral information in different regions of the brain, neural recordings are processed using sequential Bayesian filtering methods to detect and estimate both the number of neural sources and their corresponding parameters. Time-frequency based feature selection algorithms are combined with adaptive machine learning approaches to suppress physiological and non-physiological artifacts present in neural recordings. Adaptive processing and unsupervised clustering methods applied to neural recordings are also used to suppress neurostimulation artifacts and classify between various behavior tasks to assess the level of neurostimulation in patients.

For pathogen detection and identification, random peptide sequences and their properties are first uniquely mapped to highly-localized signals and their corresponding parameters in the time-frequency plane. Time-frequency signal processing methods are then applied to estimate antigenic determinants or epitope candidates for detecting and identifying potential pathogens.
Contributors: Maurer, Alexander Joseph (Author) / Papandreou-Suppappola, Antonia (Thesis advisor) / Bliss, Daniel (Committee member) / Chakrabarti, Chaitali (Committee member) / Kovvali, Narayan (Committee member) / Arizona State University (Publisher)
Created: 2016
Description
Statistical mediation analysis has been widely used in the social sciences in order to examine the indirect effects of an independent variable on a dependent variable. The statistical properties of the single mediator model with manifest and latent variables have been studied using simulation studies. However, the single mediator model with latent variables in the Bayesian framework with various accurate and inaccurate priors for structural and measurement model parameters has yet to be evaluated in a statistical simulation. This dissertation outlines the steps in the estimation of a single mediator model with latent variables as a Bayesian structural equation model (SEM). A Monte Carlo study is carried out in order to examine the statistical properties of point and interval summaries for the mediated effect in the Bayesian latent variable single mediator model with prior distributions with varying degrees of accuracy and informativeness. Bayesian methods with diffuse priors have statistical properties as good as those of Maximum Likelihood (ML) and the distribution of the product. With accurate informative priors, Bayesian methods can increase power up to 25% and decrease interval width up to 24%. With inaccurate informative priors, the point summaries of the mediated effect are more biased than ML estimates, and the bias is higher if the inaccuracy occurs in priors for structural parameters than in priors for measurement model parameters. Findings from the Monte Carlo study are generalizable to Bayesian analyses with priors of the same distributional forms that have comparable amounts of (in)accuracy and informativeness to priors evaluated in the Monte Carlo study.
Contributors: Miočević, Milica (Author) / Mackinnon, David P. (Thesis advisor) / Levy, Roy (Thesis advisor) / Grimm, Kevin (Committee member) / West, Stephen G. (Committee member) / Arizona State University (Publisher)
Created: 2017