Matching Items (23)

390-Thumbnail Image.png

'Use it or Lose It' Professional Judgment: Educational Evaluation and Bayesian Reasoning

Description

This paper presents a Bayesian framework for evaluative classification. Current education policy debates center on arguments about whether and how to use student test score data in school and personnel

This paper presents a Bayesian framework for evaluative classification. Current education policy debates center on arguments about whether and how to use student test score data in school and personnel evaluation. Proponents of such use argue that refusing to use data violates both the public’s need to hold schools accountable when they use taxpayer dollars and students’ right to educational opportunities. Opponents of formulaic use of test-score data argue that most standardized test data is susceptible to fatal technical flaws, is a partial picture of student achievement, and leads to behavior that corrupts the measures.

A Bayesian perspective on summative ordinal classification is a possible framework for combining quantitative outcome data for students with the qualitative types of evaluation that critics of high-stakes testing advocate. This paper describes the key characteristics of a Bayesian perspective on classification, describes a method to translate a naïve Bayesian classifier into a point-based system for evaluation, and draws conclusions from the comparison on the construction of algorithmic (including point-based) systems that could capture the political and practical benefits of a Bayesian approach. The most important practical conclusion is that point-based systems with fixed components and weights cannot capture the dynamic and political benefits of a reciprocal relationship between professional judgment and quantitative student outcome data.

Contributors

Agent

Created

Date Created
  • 2009

157274-Thumbnail Image.png

A study of accelerated Bayesian additive regression trees

Description

Bayesian Additive Regression Trees (BART) is a non-parametric Bayesian model

that often outperforms other popular predictive models in terms of out-of-sample error. This thesis studies a modified version of BART called

Bayesian Additive Regression Trees (BART) is a non-parametric Bayesian model

that often outperforms other popular predictive models in terms of out-of-sample error. This thesis studies a modified version of BART called Accelerated Bayesian Additive Regression Trees (XBART). The study consists of simulation and real data experiments comparing XBART to other leading algorithms, including BART. The results show that XBART maintains BART’s predictive power while reducing its computation time. The thesis also describes the development of a Python package implementing XBART.

Contributors

Agent

Created

Date Created
  • 2019

151465-Thumbnail Image.png

Adaptive parameter estimation, modeling and patient-specific classification of electrocardiogram signals

Description

Adaptive processing and classification of electrocardiogram (ECG) signals are important in eliminating the strenuous process of manually annotating ECG recordings for clinical use. Such algorithms require robust models whose parameters

Adaptive processing and classification of electrocardiogram (ECG) signals are important in eliminating the strenuous process of manually annotating ECG recordings for clinical use. Such algorithms require robust models whose parameters can adequately describe the ECG signals. Although different dynamic statistical models describing ECG signals currently exist, they depend considerably on a priori information and user-specified model parameters. Also, ECG beat morphologies, which vary greatly across patients and disease states, cannot be uniquely characterized by a single model. In this work, sequential Bayesian based methods are used to appropriately model and adaptively select the corresponding model parameters of ECG signals. An adaptive framework based on a sequential Bayesian tracking method is proposed to adaptively select the cardiac parameters that minimize the estimation error, thus precluding the need for pre-processing. Simulations using real ECG data from the online Physionet database demonstrate the improvement in performance of the proposed algorithm in accurately estimating critical heart disease parameters. In addition, two new approaches to ECG modeling are presented using the interacting multiple model and the sequential Markov chain Monte Carlo technique with adaptive model selection. Both these methods can adaptively choose between different models for various ECG beat morphologies without requiring prior ECG information, as demonstrated by using real ECG signals. A supervised Bayesian maximum-likelihood (ML) based classifier uses the estimated model parameters to classify different types of cardiac arrhythmias. However, the non-availability of sufficient amounts of representative training data and the large inter-patient variability pose a challenge to the existing supervised learning algorithms, resulting in a poor classification performance. In addition, recently developed unsupervised learning methods require a priori knowledge on the number of diseases to cluster the ECG data, which often evolves over time. In order to address these issues, an adaptive learning ECG classification method that uses Dirichlet process Gaussian mixture models is proposed. This approach does not place any restriction on the number of disease classes, nor does it require any training data. This algorithm is adapted to be patient-specific by labeling or identifying the generated mixtures using the Bayesian ML method, assuming the availability of labeled training data.

Contributors

Agent

Created

Date Created
  • 2012

154080-Thumbnail Image.png

Bayesian D-optimal design issues and optimal design construction methods for generalized linear models with random blocks

Description

Optimal experimental design for generalized linear models is often done using a pseudo-Bayesian approach that integrates the design criterion across a prior distribution on the parameter values. This approach

Optimal experimental design for generalized linear models is often done using a pseudo-Bayesian approach that integrates the design criterion across a prior distribution on the parameter values. This approach ignores the lack of utility of certain models contained in the prior, and a case is demonstrated where the heavy focus on such hopeless models results in a design with poor performance and with wild swings in coverage probabilities for Wald-type confidence intervals. Design construction using a utility-based approach is shown to result in much more stable coverage probabilities in the area of greatest concern.

The pseudo-Bayesian approach can be applied to the problem of optimal design construction under dependent observations. Often, correlation between observations exists due to restrictions on randomization. Several techniques for optimal design construction are proposed in the case of the conditional response distribution being a natural exponential family member but with a normally distributed block effect . The reviewed pseudo-Bayesian approach is compared to an approach based on substituting the marginal likelihood with the joint likelihood and an approach based on projections of the score function (often called quasi-likelihood). These approaches are compared for several models with normal, Poisson, and binomial conditional response distributions via the true determinant of the expected Fisher information matrix where the dispersion of the random blocks is considered a nuisance parameter. A case study using the developed methods is performed.

The joint and quasi-likelihood methods are then extended to address the case when the magnitude of random block dispersion is of concern. Again, a simulation study over several models is performed, followed by a case study when the conditional response distribution is a Poisson distribution.

Contributors

Agent

Created

Date Created
  • 2015

152985-Thumbnail Image.png

Obtaining accurate estimates of the mediated effect with and without prior information

Description

Research methods based on the frequentist philosophy use prior information in a priori power calculations and when determining the necessary sample size for the detection of an effect, but not

Research methods based on the frequentist philosophy use prior information in a priori power calculations and when determining the necessary sample size for the detection of an effect, but not in statistical analyses. Bayesian methods incorporate prior knowledge into the statistical analysis in the form of a prior distribution. When prior information about a relationship is available, the estimates obtained could differ drastically depending on the choice of Bayesian or frequentist method. Study 1 in this project compared the performance of five methods for obtaining interval estimates of the mediated effect in terms of coverage, Type I error rate, empirical power, interval imbalance, and interval width at N = 20, 40, 60, 100 and 500. In Study 1, Bayesian methods with informative prior distributions performed almost identically to Bayesian methods with diffuse prior distributions, and had more power than normal theory confidence limits, lower Type I error rates than the percentile bootstrap, and coverage, interval width, and imbalance comparable to normal theory, percentile bootstrap, and the bias-corrected bootstrap confidence limits. Study 2 evaluated if a Bayesian method with true parameter values as prior information outperforms the other methods. The findings indicate that with true values of parameters as the prior information, Bayesian credibility intervals with informative prior distributions have more power, less imbalance, and narrower intervals than Bayesian credibility intervals with diffuse prior distributions, normal theory, percentile bootstrap, and bias-corrected bootstrap confidence limits. Study 3 examined how much power increases when increasing the precision of the prior distribution by a factor of ten for either the action or the conceptual path in mediation analysis. Power generally increases with increases in precision but there are many sample size and parameter value combinations where precision increases by a factor of 10 do not lead to substantial increases in power.

Contributors

Agent

Created

Date Created
  • 2014

154594-Thumbnail Image.png

A Bayesian network approach to early reliability assessment of complex systems

Description

Bayesian networks are powerful tools in system reliability assessment due to their flexibility in modeling the reliability structure of complex systems. This dissertation develops Bayesian network models for system reliability

Bayesian networks are powerful tools in system reliability assessment due to their flexibility in modeling the reliability structure of complex systems. This dissertation develops Bayesian network models for system reliability analysis through the use of Bayesian inference techniques.

Bayesian networks generalize fault trees by allowing components and subsystems to be related by conditional probabilities instead of deterministic relationships; thus, they provide analytical advantages to the situation when the failure structure is not well understood, especially during the product design stage. In order to tackle this problem, one needs to utilize auxiliary information such as the reliability information from similar products and domain expertise. For this purpose, a Bayesian network approach is proposed to incorporate data from functional analysis and parent products. The functions with low reliability and their impact on other functions in the network are identified, so that design changes can be suggested for system reliability improvement.

A complex system does not necessarily have all components being monitored at the same time, causing another challenge in the reliability assessment problem. Sometimes there are a limited number of sensors deployed in the system to monitor the states of some components or subsystems, but not all of them. Data simultaneously collected from multiple sensors on the same system are analyzed using a Bayesian network approach, and the conditional probabilities of the network are estimated by combining failure information and expert opinions at both system and component levels. Several data scenarios with discrete, continuous and hybrid data (both discrete and continuous data) are analyzed. Posterior distributions of the reliability parameters of the system and components are assessed using simultaneous data.

Finally, a Bayesian framework is proposed to incorporate different sources of prior information and reconcile these different sources, including expert opinions and component information, in order to form a prior distribution for the system. Incorporating expert opinion in the form of pseudo-observations substantially simplifies statistical modeling, as opposed to the pooling techniques and supra Bayesian methods used for combining prior distributions in the literature.

The methods proposed are demonstrated with several case studies.

Contributors

Agent

Created

Date Created
  • 2016

153263-Thumbnail Image.png

The application of Bayesian networks in system reliability

Description

In this paper, a literature review is presented on the application of Bayesian networks applied in system reliability analysis. It is shown that Bayesian networks have become a popular modeling

In this paper, a literature review is presented on the application of Bayesian networks applied in system reliability analysis. It is shown that Bayesian networks have become a popular modeling framework for system reliability analysis due to the benefits that Bayesian networks have the capability and flexibility to model complex systems, update the probability according to evidences and give a straightforward and compact graphical representation. Research on approaches for Bayesian network learning and inference are summarized. Two groups of models with multistate nodes were developed for scenarios from constant to continuous time to apply and contrast Bayesian networks with classical fault tree method. The expanded model discretized the continuous variables and provided failure related probability distribution over time.

Contributors

Agent

Created

Date Created
  • 2014

157748-Thumbnail Image.png

Bayesian nonparametric modeling and inference for multiple object tracking

Description

The problem of multiple object tracking seeks to jointly estimate the time-varying cardinality and trajectory of each object. There are numerous challenges that are encountered in tracking multiple objects including

The problem of multiple object tracking seeks to jointly estimate the time-varying cardinality and trajectory of each object. There are numerous challenges that are encountered in tracking multiple objects including a time-varying number of measurements, under varying constraints, and environmental conditions. In this thesis, the proposed statistical methods integrate the use of physical-based models with Bayesian nonparametric methods to address the main challenges in a tracking problem. In particular, Bayesian nonparametric methods are exploited to efficiently and robustly infer object identity and learn time-dependent cardinality; together with Bayesian inference methods, they are also used to associate measurements to objects and estimate the trajectory of objects. These methods differ from the current methods to the core as the existing methods are mainly based on random finite set theory.

The first contribution proposes dependent nonparametric models such as the dependent Dirichlet process and the dependent Pitman-Yor process to capture the inherent time-dependency in the problem at hand. These processes are used as priors for object state distributions to learn dependent information between previous and current time steps. Markov chain Monte Carlo sampling methods exploit the learned information to sample from posterior distributions and update the estimated object parameters.

The second contribution proposes a novel, robust, and fast nonparametric approach based on a diffusion process over infinite random trees to infer information on object cardinality and trajectory. This method follows the hierarchy induced by objects entering and leaving a scene and the time-dependency between unknown object parameters. Markov chain Monte Carlo sampling methods integrate the prior distributions over the infinite random trees with time-dependent diffusion processes to update object states.

The third contribution develops the use of hierarchical models to form a prior for statistically dependent measurements in a single object tracking setup. Dependency among the sensor measurements provides extra information which is incorporated to achieve the optimal tracking performance. The hierarchical Dirichlet process as a prior provides the required flexibility to do inference. Bayesian tracker is integrated with the hierarchical Dirichlet process prior to accurately estimate the object trajectory.

The fourth contribution proposes an approach to model both the multiple dependent objects and multiple dependent measurements. This approach integrates the dependent Dirichlet process modeling over the dependent object with the hierarchical Dirichlet process modeling of the measurements to fully capture the dependency among both object and measurements. Bayesian nonparametric models can successfully associate each measurement to the corresponding object and exploit dependency among them to more accurately infer the trajectory of objects. Markov chain Monte Carlo methods amalgamate the dependent Dirichlet process with the hierarchical Dirichlet process to infer the object identity and object cardinality.

Simulations are exploited to demonstrate the improvement in multiple object tracking performance when compared to approaches that are developed based on random finite set theory.

Contributors

Agent

Created

Date Created
  • 2019

155670-Thumbnail Image.png

Statistical properties of the single mediator model with latent variables in the bayesian framework

Description

Statistical mediation analysis has been widely used in the social sciences in order to examine the indirect effects of an independent variable on a dependent variable. The statistical properties of

Statistical mediation analysis has been widely used in the social sciences in order to examine the indirect effects of an independent variable on a dependent variable. The statistical properties of the single mediator model with manifest and latent variables have been studied using simulation studies. However, the single mediator model with latent variables in the Bayesian framework with various accurate and inaccurate priors for structural and measurement model parameters has yet to be evaluated in a statistical simulation. This dissertation outlines the steps in the estimation of a single mediator model with latent variables as a Bayesian structural equation model (SEM). A Monte Carlo study is carried out in order to examine the statistical properties of point and interval summaries for the mediated effect in the Bayesian latent variable single mediator model with prior distributions with varying degrees of accuracy and informativeness. Bayesian methods with diffuse priors have equally good statistical properties as Maximum Likelihood (ML) and the distribution of the product. With accurate informative priors Bayesian methods can increase power up to 25% and decrease interval width up to 24%. With inaccurate informative priors the point summaries of the mediated effect are more biased than ML estimates, and the bias is higher if the inaccuracy occurs in priors for structural parameters than in priors for measurement model parameters. Findings from the Monte Carlo study are generalizable to Bayesian analyses with priors of the same distributional forms that have comparable amounts of (in)accuracy and informativeness to priors evaluated in the Monte Carlo study.

Contributors

Agent

Created

Date Created
  • 2017

154967-Thumbnail Image.png

Use of Bayesian filtering and adaptive learning methods to improve the detection and estimation of pathological and neurological disorders

Description

Biological and biomedical measurements, when adequately analyzed and processed, can be used to impart quantitative diagnosis during primary health care consultation to improve patient adherence to recommended treatments. For example,

Biological and biomedical measurements, when adequately analyzed and processed, can be used to impart quantitative diagnosis during primary health care consultation to improve patient adherence to recommended treatments. For example, analyzing neural recordings from neurostimulators implanted in patients with neurological disorders can be used by a physician to adjust detrimental stimulation parameters to improve treatment. As another example, biosequences, such as sequences from peptide microarrays obtained from a biological sample, can potentially provide pre-symptomatic diagnosis for infectious diseases when processed to associate antibodies to specific pathogens or infectious agents. This work proposes advanced statistical signal processing and machine learning methodologies to assess neurostimulation from neural recordings and to extract diagnostic information from biosequences.

For locating specific cognitive and behavioral information in different regions of the brain, neural recordings are processed using sequential Bayesian filtering methods to detect and estimate both the number of neural sources and their corresponding parameters. Time-frequency based feature selection algorithms are combined with adaptive machine learning approaches to suppress physiological and non-physiological artifacts present in neural recordings. Adaptive processing and unsupervised clustering methods applied to neural recordings are also used to suppress neurostimulation artifacts and classify between various behavior tasks to assess the level of neurostimulation in patients.

For pathogen detection and identification, random peptide sequences and their properties are first uniquely mapped to highly-localized signals and their corresponding parameters in the time-frequency plane. Time-frequency signal processing methods are then applied to estimate antigenic determinants or epitope candidates for detecting and identifying potential pathogens.

Contributors

Agent

Created

Date Created
  • 2016