Matching Items (5)

A Bayesian Approach to Single-Photon Single-Molecule FRET data

Description

Single-molecule FRET experiments are important for studying processes that happen on the molecular scale. By using pulsed illumination and collecting single photons, it is possible to use information gained from the fluorescence lifetimes of the chromophores in the FRET pair to obtain more accurate estimates of the underlying FRET rate, which in turn is used to determine the distance between the chromophores of the FRET pair. In this paper, we outline a method that uses Bayesian inference to learn parameter values for a model informed by the physics of an immobilized single-molecule FRET experiment. This method is unique in that it combines a rigorous treatment of the photophysics of the FRET pair with a nonparametric treatment of the molecular conformational state space, allowing the method to learn not just the relevant photophysical rates (such as relaxation rates and FRET rates), but also the number of molecular conformational states.
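As a textbook-level illustration of the physics this method builds on (not the dissertation's inference procedure itself), the FRET efficiency can be read off the donor lifetime and inverted through the Förster relation to recover a distance. The lifetimes and Förster radius below are made-up values:

```python
def fret_efficiency_from_lifetimes(tau_da, tau_d):
    """FRET efficiency from the donor lifetime with (tau_da) and without (tau_d) the acceptor."""
    return 1.0 - tau_da / tau_d

def distance_from_efficiency(E, R0):
    """Invert the Foerster relation E = 1 / (1 + (r/R0)**6) for the distance r."""
    return R0 * ((1.0 - E) / E) ** (1.0 / 6.0)

# Illustrative numbers: donor lifetime drops from 4.0 ns to 1.0 ns near the acceptor.
E = fret_efficiency_from_lifetimes(1.0, 4.0)   # efficiency 0.75
r = distance_from_efficiency(E, R0=5.0)        # distance in the same units as R0
```

The Bayesian machinery in the paper sits on top of relations like these, turning photon-by-photon lifetime information into posterior estimates of the rates.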

Contributors

Date Created
  • 2021-05

Cost-Sensitive Selective Classification and its Applications to Online Fraud Management

Description

Fraud is the use of deception for illegal gain by hiding the true nature of an activity. Organizations lose around $3.7 trillion in revenue worldwide to financial crimes and fraud, and the effects reach all levels of society. In this dissertation, I focus on credit card fraud in online transactions. Every online transaction carries a fraud risk, and it is the merchant's liability to detect and stop fraudulent transactions. Merchants use various mechanisms to prevent and manage fraud, such as automated fraud detection systems and manual transaction reviews by expert fraud analysts. Most proposed solutions focus on fraud detection accuracy and ignore financial considerations; the highly effective manual review process is also overlooked. First, I propose the Profit Optimizing Neural Risk Manager (PONRM), a selective classifier that (a) constitutes an optimal collaboration between machine learning models and human expertise under industrial constraints, and (b) is cost- and profit-sensitive. I suggest directions for characterizing fraudulent behavior and assessing the risk of a transaction, and I show that my framework outperforms cost-sensitive and cost-insensitive baselines on three real-world merchant datasets. While PONRM works with many supervised learners and obtains convincing results, using probability outputs directly from the trained model can pose problems, especially in deep learning, where softmax output is not a true uncertainty measure. This phenomenon, together with the wide and rapid adoption of deep learning by practitioners, has had unintended consequences, as in the infamous case of Google Photos' racist image recognition algorithm, and it necessitates a quantified uncertainty for each prediction.

There have been recent efforts toward quantifying uncertainty in conventional deep learning methods (e.g., dropout as a Bayesian approximation); however, their optimal use in decision making is often overlooked and understudied. Thus, I present a mixed-integer programming framework for selective classification, called MIPSC, that investigates and combines model uncertainty and the predictive mean to identify optimal classification and rejection regions. I also extend this framework to cost-sensitive settings (MIPCSC), focus on the critical real-world problem of online fraud management, and show that my approach significantly outperforms industry-standard methods in real-world settings.
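A toy sketch of the cost-sensitive selective idea: route each transaction to whichever action has the lowest expected cost, where "reject" means sending it to a human analyst for a fixed review fee. This is not PONRM or MIPCSC themselves; the cost figures and decision rule are illustrative:

```python
def selective_decision(p_fraud, cost_fp, cost_fn, review_cost):
    """Route a transaction to auto-approve, auto-decline, or manual review by
    picking the action with the lowest expected cost (illustrative costs)."""
    cost_approve = p_fraud * cost_fn          # approving a fraud incurs the fraud loss
    cost_decline = (1 - p_fraud) * cost_fp    # declining a good customer loses profit
    costs = {"approve": cost_approve, "decline": cost_decline, "review": review_cost}
    return min(costs, key=costs.get)

# Low-risk transactions are approved, ambiguous ones reviewed, high-risk ones declined.
decisions = [selective_decision(p, cost_fp=5.0, cost_fn=100.0, review_cost=2.0)
             for p in (0.001, 0.05, 0.9)]
```

The dissertation's contribution is in making such regions optimal under industrial constraints (e.g., a bounded review capacity) and in using a real uncertainty measure rather than a raw softmax probability for `p_fraud`.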

Contributors

Date Created
  • 2019

A Bayesian network approach to early reliability assessment of complex systems

Description

Bayesian networks are powerful tools in system reliability assessment due to their flexibility in modeling the reliability structure of complex systems. This dissertation develops Bayesian network models for system reliability analysis through the use of Bayesian inference techniques.

Bayesian networks generalize fault trees by allowing components and subsystems to be related by conditional probabilities instead of deterministic relationships; this gives them an analytical advantage in situations where the failure structure is not well understood, especially during the product design stage. Tackling this problem requires auxiliary information, such as reliability information from similar products and domain expertise. For this purpose, a Bayesian network approach is proposed that incorporates data from functional analysis and parent products. The functions with low reliability, and their impact on other functions in the network, are identified so that design changes can be suggested to improve system reliability.
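A minimal sketch of how a Bayesian network relaxes a fault tree's deterministic gates: a two-component system whose failure depends on the component states only through conditional probabilities, marginalized by enumeration. All numbers are illustrative, not from the dissertation:

```python
from itertools import product

# Marginal failure probabilities of the two components.
p_fail = {"A": 0.1, "B": 0.2}

# P(system fails | A state, B state), states coded 0 = working, 1 = failed.
# A fault-tree OR gate would put 1.0 whenever either component has failed;
# here the relationship is probabilistic ("noisy OR"-like).
p_sys_given = {(0, 0): 0.01, (0, 1): 0.7, (1, 0): 0.8, (1, 1): 0.99}

def system_failure_probability():
    """Marginalize over all component-state combinations."""
    total = 0.0
    for a, b in product((0, 1), repeat=2):
        pa = p_fail["A"] if a else 1 - p_fail["A"]
        pb = p_fail["B"] if b else 1 - p_fail["B"]
        total += pa * pb * p_sys_given[(a, b)]
    return total

p_system = system_failure_probability()
```

In the dissertation, the conditional probability tables themselves are learned from auxiliary data (parent products, functional analysis, expert opinion) rather than fixed as above.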

A complex system does not necessarily have all of its components monitored at the same time, which poses another challenge for reliability assessment. Sometimes a limited number of sensors are deployed in the system to monitor the states of some components or subsystems, but not all of them. Data simultaneously collected from multiple sensors on the same system are analyzed using a Bayesian network approach, and the conditional probabilities of the network are estimated by combining failure information and expert opinions at both the system and component levels. Several data scenarios with discrete, continuous, and hybrid data (both discrete and continuous) are analyzed. Posterior distributions of the reliability parameters of the system and components are assessed using the simultaneous data.

Finally, a Bayesian framework is proposed to incorporate and reconcile different sources of prior information, including expert opinions and component information, in order to form a prior distribution for the system. Incorporating expert opinion in the form of pseudo-observations substantially simplifies statistical modeling, as opposed to the pooling techniques and supra-Bayesian methods used for combining prior distributions in the literature.
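The pseudo-observation idea can be sketched with a conjugate Beta-Binomial model: an expert's opinion about a component's reliability enters as pseudo-successes and pseudo-failures that are simply added to the observed counts. The numbers below are illustrative:

```python
# Expert believes reliability is about 0.9, with confidence worth ~20 trials:
expert_mean, expert_n = 0.9, 20
alpha0 = expert_mean * expert_n        # 18 pseudo-successes
beta0 = (1 - expert_mean) * expert_n   # 2 pseudo-failures

# Observed test data: 45 successes out of 50 trials.
successes, trials = 45, 50

# Conjugate Beta update: pseudo-observations and real observations just add.
alpha_post = alpha0 + successes
beta_post = beta0 + (trials - successes)
posterior_mean = alpha_post / (alpha_post + beta_post)
```

This is why pseudo-observations are so much simpler than pooling several expert-supplied prior distributions: each opinion collapses to two numbers that combine with the data by addition.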

The methods proposed are demonstrated with several case studies.

Contributors

Date Created
  • 2016

Spatial genetic structure under limited dispersal: theory, methods and consequences of isolation-by-distance

Description

Isolation-by-distance is a specific type of spatial genetic structure that arises when parent-offspring dispersal is limited. Many natural populations exhibit localized dispersal, and as a result, individuals that are geographically near each other tend to have greater genetic similarity than individuals that are farther apart. It is important to identify isolation-by-distance because it can impact the statistical analysis of population samples and it can help us better understand evolutionary dynamics. For this dissertation I investigated several aspects of isolation-by-distance. First, I looked at how the shape of the dispersal distribution affects the observed pattern of isolation-by-distance. If, as theory predicts, the shape of the distribution has little effect, then it is more practical to model isolation-by-distance using a simple dispersal distribution rather than replicating the complexities of more realistic distributions. I therefore developed an efficient algorithm to simulate dispersal based on a simple triangular distribution, and using a simulation, I confirmed that the pattern of isolation-by-distance was similar to that under more realistic distributions. Second, I developed a Bayesian method to quantify isolation-by-distance from genetic data by estimating Wright's neighborhood size parameter. I analyzed the performance of this method using simulated data and a microsatellite data set from two populations of maritime pine, and I found that the neighborhood size estimates had good coverage and low error. Finally, one of the major consequences of isolation-by-distance is an increase in inbreeding. Plants are often particularly susceptible to inbreeding, and as a result they have evolved many inbreeding avoidance mechanisms. Using a simulation, I determined which mechanisms are most successful at preventing the inbreeding associated with isolation-by-distance.
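Sampling from a symmetric triangular dispersal kernel has a closed-form inverse CDF, which is one way such a simulation can draw offsets efficiently. This is a generic sketch, not necessarily the dissertation's algorithm, and the half-width is illustrative (Python's standard library also provides random.triangular; the inverse transform is spelled out here to show the algorithm):

```python
import math
import random

def triangular_dispersal(d, rng=random):
    """Draw a parent-offspring dispersal offset from a symmetric triangular
    distribution on [-d, d] with its peak at 0, by inverse-transform sampling."""
    u = rng.random()
    if u < 0.5:
        return -d + d * math.sqrt(2 * u)        # left half of the triangle
    return d - d * math.sqrt(2 * (1 - u))       # right half, by symmetry

rng = random.Random(42)
offsets = [triangular_dispersal(10.0, rng) for _ in range(10000)]
mean_offset = sum(offsets) / len(offsets)  # near 0 for a symmetric kernel
```

Each draw costs one uniform sample and one square root, which is what makes large individual-based dispersal simulations cheap.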

Contributors

Date Created
  • 2015

Applied meta-analysis of lead-free solder reliability

Description

This thesis presents a meta-analysis of lead-free solder reliability. Qualitative analyses of the failure modes of lead-free solder under different stress tests, including drop, bend, thermal, and vibration tests, are discussed. The main cause of failure of lead-free solder is fatigue cracking, and the speed at which the initial crack propagates differs across test conditions and solder materials. A quantitative analysis of the fatigue behavior of SAC lead-free solder under a thermal preconditioning process is conducted. This thesis presents a method for predicting the failure life of solder alloys by building a Weibull regression model. The failure life of solder on a circuit board is assumed to follow a Weibull distribution, and different materials and test conditions affect the distribution through its shape and scale parameters. The approach is to regress these parameters on the test conditions as predictors, based on Bayesian inference. In building the regression models, prior distributions are generated from previous studies, and Markov chain Monte Carlo (MCMC) is run in the WinBUGS environment.
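A compact sketch of the same idea in plain Python rather than WinBUGS: the log of the Weibull scale is made linear in a coded test condition, and a random-walk Metropolis sampler explores the posterior (flat priors and a fixed shape parameter, for brevity). The failure times and coefficients are made up for illustration:

```python
import math
import random

def weibull_loglik(times, x, shape, b0, b1):
    """Log-likelihood of failure times under a Weibull model whose log-scale
    is linear in a covariate x (e.g., a coded test condition)."""
    ll = 0.0
    for t, xi in zip(times, x):
        scale = math.exp(b0 + b1 * xi)
        z = t / scale
        ll += math.log(shape / scale) + (shape - 1) * math.log(z) - z ** shape
    return ll

# Illustrative failure times under two coded test conditions (x = 0 vs x = 1);
# the x = 1 condition is harsher, so its failure lives are shorter.
times = [120.0, 95.0, 150.0, 60.0, 45.0, 80.0]
x = [0, 0, 0, 1, 1, 1]

# Random-walk Metropolis on (b0, b1) with the shape fixed at 2.0 and flat priors.
rng = random.Random(0)
b0, b1 = 5.0, 0.0
cur = weibull_loglik(times, x, 2.0, b0, b1)
samples = []
for _ in range(5000):
    prop0, prop1 = b0 + rng.gauss(0, 0.1), b1 + rng.gauss(0, 0.1)
    new = weibull_loglik(times, x, 2.0, prop0, prop1)
    if math.log(rng.random()) < new - cur:   # Metropolis accept/reject
        b0, b1, cur = prop0, prop1, new
    samples.append((b0, b1))

# Posterior mean of the condition effect after a burn-in of 1000 iterations;
# it should be negative, since the harsher condition shortens failure life.
b1_mean = sum(s[1] for s in samples[1000:]) / len(samples[1000:])
```

In the thesis this role is played by WinBUGS, which additionally handles informative priors from earlier studies and samples the shape parameter rather than fixing it.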

Contributors

Date Created
  • 2014