Hierarchical Sequential Event Prediction and Translation from Aviation Accident Report Data

Description

Sequential event prediction or sequential pattern mining is a well-studied topic in the literature. In many real-world scenarios, data is released sequentially, and repetitive patterns in event sequences are believed to make future events predictable. For example, many companies build recommender systems that predict the next product a user is likely to buy from their purchase history. Healthcare systems discover relationships among patients’ sequential symptoms to mitigate the adverse effects of a treatment (drugs or surgery). Modern engineering systems, such as aviation, distributed computing, and energy systems, diagnose failure event logs and take prompt action to avoid disaster when a similar failure pattern occurs. In this dissertation, I specifically focus on building a scalable algorithm for event prediction and extraction in the aviation domain. Understanding accident events is a major safety concern in the aviation system. A flight accident is often caused by a sequence of failure events. Accurate modeling of the failure event sequence and how it leads to the final accident is important for aviation safety. This work aims to study the relationships within the failure event sequence and to evaluate the risk of the final accident according to these failure events. I address three major challenges. (1) Modeling sequential events with hierarchical structure: I aim to improve prediction accuracy by taking advantage of the multi-level or hierarchical representation of these rare events. Specifically, I propose a sequential Encoder-Decoder framework with a hierarchical embedding representation of the events. (2) Lack of high-quality and consistent event log data: To acquire more accurate event data from aviation accident reports, I convert the problem into a multi-label classification task. An attention-based Bidirectional Encoder Representations from Transformers (BERT) model is developed to achieve good performance and interpretability. (3) Ontology-based event extraction: To extract detailed events, I propose to solve the problem as a hierarchical classification task. I improve model performance by incorporating an event ontology. By solving these three challenges, I provide a framework to extract events from narrative reports and estimate the risk level of aviation accidents through event sequence modeling.

Contributors: Zhao, Xinyu (Author) / Yan, Hao (Thesis advisor) / Liu, Yongming (Committee member) / Ju, Feng (Committee member) / Iquebal, Ashif (Committee member) / Arizona State University (Publisher)

Created: 2022

Software Tools for Design, Simulation, and Characterization of DNA and RNA Nanostructures

Description

Nucleic acid nanotechnology is a field of nanoscale engineering where the sequences of deoxyribonucleic acid (DNA) and ribonucleic acid (RNA) molecules are carefully designed to create self-assembled nanostructures with higher spatial resolution than is available to top-down fabrication methods. In the 40-year history of the field, the structures created have scaled from small tile-like structures constructed from a few hundred individual nucleotides to micron-scale structures assembled from millions of nucleotides using the technique of “DNA origami”. One of the key drivers of advancement in any modern engineering field is the parallel development of software that facilitates the design of components and performs in silico simulation of the target structure to determine its structural properties and dynamic behavior and to identify defects. For nucleic acid nanotechnology, CaDNAno and oxDNA are the most popular choices for design and simulation, respectively. In this dissertation I present my work on the oxDNA software ecosystem, including an analysis toolkit, a web-based graphical interface, and a new molecular visualization tool that doubles as a free-form design editor and covers some of the weaknesses of CaDNAno’s lattice-based design paradigm. Finally, as a demonstration of the utility of these new tools, I show oxDNA simulation and subsequent analysis of a nanoscale leaf-spring engine capable of converting chemical energy into dynamic motion. oxDNA simulations were used to investigate the effects of design choices on the behavior of the system and to rationalize experimental results.

Contributors: Poppleton, Erik (Author) / Sulc, Petr (Thesis advisor) / Yan, Hao (Committee member) / Forrest, Stephanie (Committee member) / Stephanopoulos, Nicholas (Committee member) / Arizona State University (Publisher)

Created: 2022

Novel Computational Algorithms for Imaging Biomarker Identification

Description

Over the past few decades, medical imaging has become important in medicine for disease diagnosis, prognosis, treatment assessment, and health monitoring. As medical imaging has progressed, imaging biomarkers have been rapidly developed for early diagnosis and staging of disease. Detecting and segmenting objects in images are often the first steps in the quantitative measurement of these biomarkers. While large objects can often be automatically or semi-automatically delineated, segmenting small objects (blobs) is challenging. The small objects of particular interest in this dissertation are glomeruli in kidney magnetic resonance (MR) images. This problem has unique challenges. First, glomeruli are extremely small and closely resemble image noise. Second, glomeruli are extremely numerous (e.g., over 1 million in a human kidney), and their intensity distribution is heterogeneous. Third, a large portion of glomeruli overlap or touch in images. The goal of this dissertation is to develop computational algorithms to identify and discover glomeruli-related imaging biomarkers. The first phase develops a U-net combined with a Hessian-based Difference of Gaussians (UH-DoG) blob detector. Incorporating deep learning alleviates the over-detection issue of Hessian analysis. Next, as an extension of UH-DoG, a small blob detector using Bi-Threshold Constrained Adaptive Scales (BTCAS) is proposed. Deep learning is treated as a prior for the Difference of Gaussians (DoG) to improve its efficiency. By adopting BTCAS, the under-segmentation issue of deep learning is addressed. The second phase develops a denoising convexity-consistent Blob Generative Adversarial Network (BlobGAN). BlobGAN achieves high denoising performance and selectively denoises the image without affecting the blobs. These detectors are validated on datasets of 2D fluorescent images, 3D synthetic images, and 3D MR images (18 mice, 3 humans) and are shown to outperform the competing detectors. In the last phase, a Fréchet Descriptors Distance based Coreset approach (FDD-Coreset) is proposed to accelerate BlobGAN’s training. Experiments show that BlobGAN trained on FDD-Coreset not only significantly reduces training time but also achieves higher denoising performance and maintains approximately the same blob identification performance as training on the entire dataset.
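For readers unfamiliar with the Difference of Gaussians idea the abstract builds on, the following is a minimal sketch of a plain 2D DoG blob response in NumPy. It is not the dissertation's UH-DoG or BTCAS detector (no U-net, no Hessian analysis); the function names, the synthetic image, and the scale ratio of 1.6 are illustrative assumptions.

```python
import numpy as np


def gaussian_kernel(sigma):
    """1D Gaussian kernel truncated at 3 sigma and normalized to sum to 1."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(image, sigma):
    """Separable Gaussian blur: filter rows, then columns (zero padding at edges)."""
    kernel = gaussian_kernel(sigma)
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, image)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, rows)


def dog_response(image, sigma, ratio=1.6):
    """Narrow blur minus wide blur; bright blobs produce positive peaks."""
    return gaussian_blur(image, sigma) - gaussian_blur(image, ratio * sigma)


def make_blob_image(size=41, center=(20, 12), blob_sigma=2.0):
    """Hypothetical test image: one bright Gaussian blob on a dark background."""
    yy, xx = np.mgrid[0:size, 0:size]
    dist_sq = (yy - center[0]) ** 2 + (xx - center[1]) ** 2
    return np.exp(-dist_sq / (2 * blob_sigma ** 2))


image = make_blob_image()
response = dog_response(image, sigma=2.0)
# Location of the strongest DoG response, as (row, column).
peak = tuple(int(i) for i in np.unravel_index(np.argmax(response), response.shape))
```

In this toy setting the strongest response sits at the blob center. On real MR images a plain DoG of this kind tends to fire on noise as well, which is the over-detection problem the abstract says the deep learning component is meant to alleviate.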

Contributors: Xu, Yanzhe (Author) / Wu, Teresa (Thesis advisor) / Iquebal, Ashif (Committee member) / Yan, Hao (Committee member) / Beeman, Scott (Committee member) / Arizona State University (Publisher)

Created: 2022

Novel Deep Learning Models for Medical Imaging Analysis

Description

Deep learning is a sub-field of machine learning in which models are developed to imitate the workings of the human brain in processing data and creating patterns for decision making. This dissertation focuses on developing deep learning models for medical imaging analysis across different modalities and tasks, including detection, segmentation, and classification. Imaging modalities including digital mammography (DM), magnetic resonance imaging (MRI), positron emission tomography (PET), and computed tomography (CT) are studied for various medical applications. The first phase of the research develops a novel shallow-deep convolutional neural network (SD-CNN) model for improved breast cancer diagnosis. This model takes one type of medical image as input and synthesizes different modalities as additional feature sources; both the original image and the synthetic image are used for feature generation. The proposed architecture is validated in the application of breast cancer diagnosis and shown to outperform the competing models. Motivated by the success of the first phase, the second phase focuses on improving medical image synthesis performance with an advanced deep learning architecture. A new architecture named deep residual inception encoder-decoder network (RIED-Net) is proposed. RIED-Net has the advantages of preserving pixel-level information and transferring features across modalities. The applicability of RIED-Net is validated in breast cancer diagnosis and Alzheimer’s disease (AD) staging. Recognizing that medical imaging research often involves multiple inter-related tasks, namely detection, segmentation, and classification, the third phase of the research develops a multi-task deep learning model. Specifically, a feature transfer enabled multi-task deep learning model (FT-MTL-Net) is proposed to transfer high-resolution features from the segmentation task to the low-resolution feature-based classification task. The application of FT-MTL-Net to breast cancer detection, segmentation, and classification using DM images is studied. As a continuing effort to explore transfer learning in deep models for medical applications, the last phase develops a deep learning model that transfers both features and knowledge from a pre-training age prediction task to the new domain of predicting conversion from mild cognitive impairment (MCI) to AD. It is validated in the application of predicting MCI patients’ conversion to AD with 3D MRI images.

Contributors: Gao, Fei (Author) / Wu, Teresa (Thesis advisor) / Li, Jing (Committee member) / Yan, Hao (Committee member) / Patel, Bhavika (Committee member) / Arizona State University (Publisher)

Created: 2019

Stochastic models of patient access management in healthcare

Description

This dissertation addresses access management problems that occur in both emergency and outpatient clinics with the objective of allocating the available resources to improve performance measures by considering the trade-offs. Two main settings are considered for estimating patient willingness-to-wait (WtW) behavior for outpatient appointments with statistical analyses of data: allocation of the limited booking horizon to patients of different priorities by using time windows in an outpatient setting considering patient behavior, and allocation of hospital beds to admitted Emergency Department (ED) patients. For each chapter, a different approach based on the problem context is developed, and its performance is analyzed with analytical and simulation models. Real hospital data is used in the analyses to provide evidence that the methodologies introduced are beneficial in addressing real-life problems and that real improvements are achievable with the suggested policies.

This dissertation starts by studying an outpatient clinic context to develop an effective resource allocation mechanism that can improve patient access to clinic appointments. I first identify patient behavior in terms of willingness to wait for an outpatient appointment. Two statistical models are developed to estimate the patient WtW distribution by using data on booked appointments and appointment requests. Several analyses are conducted on simulated data to assess the effectiveness and accuracy of the estimations.

Then, this dissertation introduces a time-window-based policy that utilizes patient behavior to improve access by using appointment delay as a lever. The policy allocates the available capacity to patients of different priorities by dividing the booking horizon into time intervals that each priority group can use, which strategically delays lower-priority patients.

Finally, patient routing between the ED and inpatient units is studied to improve patient access to hospital beds. The strategy that captures the trade-off between patient safety and quality of care is characterized as a threshold type. Simulation experiments built on real data collected from a hospital illustrate the improvement achievable by implementing a strategy that considers this trade-off.

Contributors: Kilinc, Derya (Author) / Gel, Esma (Thesis advisor) / Pasupathy, Kalyan (Committee member) / Sefair, Jorge (Committee member) / Sir, Mustafa (Committee member) / Yan, Hao (Committee member) / Arizona State University (Publisher)

Created: 2019

Novel Semi-Supervised Learning Models to Balance Data Inclusivity and Usability in Healthcare Applications

Description

Semi-supervised learning (SSL) is a sub-field of statistical machine learning that is useful for problems with only a few labeled instances that have both predictor (X) and target (Y) information and an abundance of unlabeled instances that have only predictor (X) information. SSL harnesses the target information available in the limited labeled data, as well as the information in the abundant unlabeled data, to build strong predictive models. However, not all the included information is useful. For example, some features may correspond to noise, and including them will hurt predictive model performance. Additionally, some instances may not be as relevant to model building, and their inclusion will increase training time and potentially hurt model performance. The objective of this research is to develop novel SSL models that balance data inclusivity and usability. My dissertation research focuses on applications of SSL in healthcare, driven by problems in brain cancer radiomics, migraine imaging, and Parkinson’s Disease telemonitoring.

The first topic introduces an integration of machine learning (ML) and a mechanistic model (PI) to develop an SSL model applied to predicting cell density of glioblastoma brain cancer using multi-parametric medical images. The proposed ML-PI hybrid model integrates imaging information from unbiopsied regions of the brain as well as underlying biological knowledge from the mechanistic model to predict spatial tumor density in the brain.

The second topic develops a multi-modality imaging-based diagnostic decision support system (MMI-DDS). MMI-DDS consists of modality-wise principal components analysis to incorporate imaging features at different aggregation levels (e.g., voxel-wise, connectivity-based, etc.), a constrained particle swarm optimization (cPSO) feature selection algorithm, and a clinical utility engine that utilizes inverse operators on chosen principal components for white-box classification models.

The final topic develops a new SSL regression model with integrated feature and instance selection called s2SSL (with “s2” referring to selection in two different ways: feature and instance). s2SSL integrates cPSO feature selection and graph-based instance selection to simultaneously choose the optimal features and instances and build accurate models for continuous prediction. s2SSL was applied to smartphone-based telemonitoring of Parkinson’s Disease patients.

Contributors: Gaw, Nathan (Author) / Li, Jing (Thesis advisor) / Wu, Teresa (Committee member) / Yan, Hao (Committee member) / Hu, Leland (Committee member) / Arizona State University (Publisher)

Created: 2019

Real-time Analysis and Control for Smart Manufacturing Systems

Description

Recent advances in manufacturing systems, such as advanced embedded sensing, big data analytics, IoT, and robotics, promise a paradigm shift in the manufacturing industry towards smart manufacturing systems. Real-time data that reflects machine conditions and the production system’s operating performance is typically available in many industries, such as automotive, semiconductor, and food production. However, a major research gap still exists in how to use this real-time data to evaluate and predict production system performance and to facilitate timely decision making and production control on the factory floor. To tackle these challenges, this dissertation takes an integrated analytical approach that combines data analytics, stochastic modeling, and decision making under uncertainty to solve practical manufacturing problems.

Specifically, this research considers the machine degradation process. It has been shown that machines working in different operating states may break down in different probabilistic manners. In addition, machines working in worse operating states are more likely to fail, causing more frequent downtime and reducing system throughput. However, there is still a lack of analytical methods to quantify the potential impact of machine condition degradation on overall system performance and thereby facilitate operational decision making on the factory floor. To address these issues, this dissertation considers a serial production line with finite buffers and multiple machines that follow a Markovian degradation process. An integrated model based on the aggregation method is built to quantify the overall system performance and its interactions with the machine condition process. Moreover, system properties are investigated to analyze the influence of system parameters on system performance. In addition, three types of bottlenecks are defined, and their corresponding indicators are derived to provide guidelines for improving system performance. These methods provide quantitative tools for modeling, analyzing, and improving manufacturing systems in which machine condition degradation and productivity are coupled, given the real-time signals.

Contributors: Kang, Yunyi (Author) / Ju, Feng (Thesis advisor) / Pedrielli, Giulia (Committee member) / Wu, Teresa (Committee member) / Yan, Hao (Committee member) / Arizona State University (Publisher)

Created: 2020

Bayesian-Entropy Method for Probabilistic Diagnostics and Prognostics of Engineering Systems

Description

Information exists in various forms, and better utilization of the available information can benefit system awareness and response predictions. The focus of this dissertation is on the fusion of different types of information using the Bayesian-Entropy method. The Maximum Entropy method in information theory introduces a unique way of handling information in the form of constraints. The Bayesian-Entropy (BE) principle is proposed to integrate Bayes’ theorem and the Maximum Entropy method to encode extra information. The posterior distribution in the Bayesian-Entropy method has a Bayesian part that handles point observation data and an Entropy part that encodes constraints, such as statistical moment information, range information, and general functions between variables. The proposed method is then extended to its network format as the Bayesian Entropy Network (BEN), which serves as a generalized information fusion tool for diagnostics, prognostics, and surrogate modeling.

The proposed BEN is demonstrated and validated with extensive engineering applications. The BEN method is first demonstrated for damage diagnostics of gas pipelines and metal/composite plates. Both empirical knowledge and physics models are integrated with direct observations to improve diagnostic accuracy and to reduce the number of training samples. Next, the BEN is demonstrated in prognostics and safety assessment in the air traffic management system. Various information types, such as human concepts, variable correlation functions, physical constraints, and tendency data, are fused in BEN to enhance safety assessment and risk prediction in the National Airspace System (NAS). Following this, the BE principle is applied in surrogate modeling. Multiple algorithms based on different types of information encoding, such as Bayesian-Entropy Linear Regression (BELR), Bayesian-Entropy Semiparametric Gaussian Process (BESGP), and Bayesian-Entropy Gaussian Process (BEGP), are proposed and demonstrated on numerical toy problems and practical engineering analyses. The results show that the major benefits are superior prediction/extrapolation performance and a significant reduction in training samples achieved by using additional physics/knowledge as constraints. The proposed BEN offers a systematic and rigorous way to incorporate various information sources. Several major conclusions are drawn based on the proposed study.
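The idea of a posterior with a Bayesian part for point observations and an Entropy part for constraints can be illustrated on a one-dimensional grid. The sketch below assumes one standard way of encoding a mean (moment) constraint, namely multiplying the Bayesian posterior by an exponential factor exp(lambda * theta) and choosing lambda so the constraint holds; this exponential-tilting form, the Gaussian prior and likelihood, the data values, and all names are illustrative assumptions rather than the dissertation's formulation.

```python
import numpy as np

# Grid over the unknown parameter theta (range chosen for illustration).
theta = np.linspace(-5.0, 5.0, 2001)


def normalize(weights):
    """Scale non-negative grid weights so they sum to 1."""
    return weights / weights.sum()


# Bayesian part: N(0, 1) prior and unit-variance Gaussian likelihood.
prior = normalize(np.exp(-0.5 * theta ** 2))
data = np.array([1.2, 0.8, 1.5])  # hypothetical point observations
log_lik = -0.5 * ((data[:, None] - theta[None, :]) ** 2).sum(axis=0)
bayes_posterior = normalize(prior * np.exp(log_lik - log_lik.max()))


def tilted(posterior, lam):
    """Entropy part (assumed form): reweight the posterior by exp(lam * theta)."""
    return normalize(posterior * np.exp(lam * theta))


def solve_lambda(posterior, target_mean, lo=-20.0, hi=20.0, iters=100):
    """Bisection on lam; the tilted mean increases monotonically with lam."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (tilted(posterior, mid) * theta).sum() < target_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


bayes_mean = float((bayes_posterior * theta).sum())
lam = solve_lambda(bayes_posterior, target_mean=1.0)
be_mean = float((tilted(bayes_posterior, lam) * theta).sum())
```

With these numbers the purely Bayesian posterior mean is about 0.875, and the constrained posterior is shifted so that its mean matches the imposed value of 1.0. The multiplier `lam` plays the role of a Lagrange multiplier for the moment constraint.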

Contributors: Wang, Yuhao (Author) / Liu, Yongming (Thesis advisor) / Chattopadhyay, Aditi (Committee member) / Mignolet, Marc (Committee member) / Yan, Hao (Committee member) / Ren, Yi (Committee member) / Arizona State University (Publisher)

Created: 2020

RNA Aptamer-Based Systems for Pathogen Detection and Biomolecule Synthesis

Description

RNA aptamers adopt tertiary structures that enable them to bind to specific ligands. This capability has enabled aptamers to be used for a variety of diagnostic, therapeutic, and regulatory applications. This dissertation focuses on the use of RNA aptamers in two biological applications: (1) nucleic acid diagnostic assays and (2) scaffolding of enzymatic pathways. First, sensors for detecting arbitrary target RNAs based on the fluorogenic RNA aptamer Broccoli are designed and validated. Studies of three different sensor designs reveal that toehold-initiated Broccoli-based aptasensors provide the lowest signal leakage in the absence of the target RNA and the highest signal intensity in its presence. This toehold-initiated design is used for developing aptasensors targeting pathogens. Diagnostic assays for detecting pathogen nucleic acids are implemented by integrating Broccoli-based aptasensors with isothermal amplification methods. When coupled with recombinase polymerase amplification (RPA), aptasensors enable detection of synthetic valley fever DNA down to concentrations of 2 fM. Integration of Broccoli-based aptasensors with nucleic acid sequence-based amplification (NASBA) enables as few as 120 copies/mL of synthetic dengue RNA to be detected in reactions taking less than three hours. Moreover, the aptasensor-NASBA assay successfully detects dengue RNA in clinical samples. Second, RNA scaffolds containing peptide-binding RNA aptamers are employed for programming the synthesis of nonribosomal peptides (NRPs). Using the NRP enterobactin pathway as a model, RNA scaffolds are developed to direct the assembly of the enzymes entE, entB, and entF from E. coli, along with the aryl-carrier protein dhbB from B. subtilis. These scaffolds employ X-shaped RNA motifs from bacteriophage packaging motors, kissing loop interactions from HIV, and peptide-binding RNA aptamers to position peptide-modified NRP enzymes. The resulting RNA scaffolds, functionalized with different aptamers, are designed and evaluated for in vitro production of enterobactin. The best RNA scaffold provides a 418% increase in enterobactin production compared with the system in the absence of the RNA scaffold. Moreover, the chimeric scaffold, with E. coli and B. subtilis enzymes, reaches approximately 56% of the activity of the wild-type enzyme assembly. The studies presented in this dissertation will be helpful for the future development of nucleic acid-based assays and for controlling protein interactions in NRP biosynthesis.

Contributors: Tang, Anli (Author) / Green, Alexander (Thesis advisor) / Yan, Hao (Committee member) / Woodbury, Neal (Committee member) / Arizona State University (Publisher)

Created: 2020

Queueing Network Models for Performance Evaluation of Dynamic Multi-Product Manufacturing Systems

Description

Modern manufacturing systems are part of a complex supply chain where customer preferences are constantly evolving. The rapidly evolving market demands that manufacturing organizations be increasingly agile and flexible. Medium-term capacity planning for manufacturing systems employs queueing network models based on stationary demand assumptions. However, these stationary demand assumptions are not very practical for rapidly evolving supply chains. Nonstationary demand processes provide a reasonable framework to capture the time-varying nature of modern markets. The analysis of queues and queueing networks with time-varying parameters is mathematically intractable. This dissertation addresses the problem of performance evaluation of dynamic multi-product manufacturing systems modeled as time-varying queueing networks with multiple customer classes (product types). Heuristics that draw upon existing steady-state queueing results are proposed to provide computationally efficient approximations for such systems.

This dissertation considers two key aspects of dynamic multi-product manufacturing systems, namely performance evaluation and optimal server resource allocation. First, the performance evaluation of systems with infinite queueing room and a first-come, first-served service discipline is considered. Second, systems with finite queueing room and priorities between product types are considered. Finally, the optimal server allocation problem is addressed in the context of dynamic multi-product manufacturing systems. The performance estimates developed in the earlier part of the dissertation are leveraged in a simulated annealing framework to obtain server resource allocations.
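As a rough illustration of how steady-state queueing results can be reused for time-varying demand, the sketch below applies a pointwise stationary approximation: at each time period it plugs the current arrival rate into the M/M/1 steady-state formula for the mean number in system, L = rho / (1 - rho). This is a textbook heuristic chosen for illustration, not necessarily the heuristic developed in the dissertation; the single-server, single-class setting and all names and rates are assumptions.

```python
def mm1_mean_number_in_system(arrival_rate, service_rate):
    """Steady-state mean number in an M/M/1 system: rho / (1 - rho)."""
    rho = arrival_rate / service_rate
    if rho >= 1.0:
        raise ValueError("steady state requires arrival_rate < service_rate")
    return rho / (1.0 - rho)


def pointwise_stationary_estimates(arrival_rates, service_rate):
    """Apply the steady-state formula independently to each period's arrival rate."""
    return [mm1_mean_number_in_system(lam, service_rate) for lam in arrival_rates]


# Hypothetical demand profile (jobs per hour) with a fixed service rate of 4 per hour.
demand_profile = [1.0, 2.0, 3.0]
estimates = pointwise_stationary_estimates(demand_profile, service_rate=4.0)
```

The approximation ignores how congestion carries over from one period to the next, so it is most reasonable when arrival rates change slowly relative to service times.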

Contributors: Jampani Hanumantha, Girish (Author) / Askin, Ronald (Thesis advisor) / Ju, Feng (Committee member) / Yan, Hao (Committee member) / Mirchandani, Pitu (Committee member) / Arizona State University (Publisher)

Created: 2020