Matching Items (5)

System complexity reduction via feature selection

Description

This dissertation transforms a set of system complexity reduction problems to feature selection problems. Three systems are considered: classification based on association rules, network structure learning, and time series classification. Furthermore, two variable importance measures are proposed to reduce the feature selection bias in tree models.

Associative classifiers can achieve high accuracy, but the combination of many rules is difficult to interpret. Rule condition subset selection (RCSS) methods for associative classification are considered. RCSS aims to prune the rule conditions into a subset via feature selection. The subset then can be summarized into rule-based classifiers. Experiments show that classifiers after RCSS can substantially improve the classification interpretability without loss of accuracy.

An ensemble feature selection method is proposed to learn Markov blankets for either discrete or continuous networks (without linear, Gaussian assumptions). The method is compared to a Bayesian local structure learning algorithm and to alternative feature selection methods in the causal structure learning problem.

Feature selection is also used to enhance the interpretability of time series classification. Existing time series classification algorithms (such as nearest-neighbor with dynamic time warping measures) are accurate but difficult to interpret. This research leverages the time-ordering of the data to extract features, and generates an effective and efficient classifier referred to as a time series forest (TSF). The computational complexity of TSF is only linear in the length of time series, and interpretable features can be extracted. These features can be further reduced, and summarized for even better interpretability.

Lastly, two variable importance measures are proposed to reduce the feature selection bias in tree-based ensemble models. It is well known that bias can occur when predictor attributes have different numbers of values. Two methods are proposed to solve the bias problem. One uses an out-of-bag sampling method called OOBForest, and the other, based on the new concept of a partial permutation test, is called a pForest. Experimental results show the existing methods are not always reliable for multi-valued predictors, while the proposed methods have advantages.
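
To make the time series forest idea above more concrete, the following is a minimal sketch of interval-based feature extraction. It assumes that each interval of a series is summarized by its mean, standard deviation, and slope; the abstract does not list the features used, so the function name `interval_features` and the choice of statistics are illustrative assumptions, not the dissertation's implementation.

```python
import math


def interval_features(series, start, end):
    """Summarize series[start:end] (end exclusive) by mean, std, and slope.

    Assumption: these three statistics stand in for the "interpretable
    features" mentioned in the abstract; the actual feature set may differ.
    """
    window = series[start:end]
    n = len(window)
    if n < 2:
        raise ValueError("interval must contain at least two points")

    mean = sum(window) / n
    # Population standard deviation of the values in the interval.
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / n)

    # Least-squares slope of the values against their time index 0..n-1.
    t_mean = (n - 1) / 2
    denom = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (v - mean) for t, v in enumerate(window)) / denom
    return mean, std, slope
```

Each feature is tied to a specific time interval, which is one way features of this kind can remain interpretable: a large slope over a given interval points to where in the series the classes differ.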

Date Created
2011

Surgical instrument reprocessing in a hospital setting analyzed with statistical process control and data mining techniques

Description

In a healthcare setting, the Sterile Processing Department (SPD) provides ancillary services to the Operating Room (OR), Emergency Room, Labor & Delivery, and off-site clinics. SPD's function is to reprocess reusable surgical instruments and return them to their home departments. The management of surgical instruments and medical devices can impact patient safety and hospital revenue. Any time instrumentation or devices are not available or are not fit for use, patient safety and revenue can be negatively impacted. One step of the instrument reprocessing cycle is sterilization. Steam sterilization is the sterilization method used for the majority of surgical instruments and is preferred to immediate-use steam sterilization (IUSS) because terminally sterilized items can be stored until needed. IUSS items must be used promptly and cannot be stored for later use. IUSS is intended for emergency situations and not as a regular course of action. Unfortunately, IUSS is used to compensate for inadequate inventory levels, scheduling conflicts, and miscommunications. If IUSS is viewed as an adverse event, then monitoring IUSS incidences can help healthcare organizations meet patient safety goals and financial goals along with aiding in process improvement efforts. This work recommends applying statistical process control methods to IUSS incidents and illustrates the use of control charts for IUSS occurrences through a case study and an analysis of control charts for data from a health care provider. Furthermore, this work considers the application of data mining methods to IUSS occurrences and presents a representative example of data mining applied to these occurrences. This extends the application of statistical process control and data mining in healthcare applications.
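
As an illustration of how a control chart could flag unusual IUSS activity, here is a minimal sketch of a p-chart with 3-sigma limits. The abstract does not say which chart type was used, so the p-chart, the function names, and the idea of tracking IUSS loads as a fraction of all sterilization loads per period are assumptions made for this example.

```python
import math


def p_chart_limits(counts, sizes):
    """Return the center line and per-period (LCL, UCL) pairs for a p-chart.

    counts[i] is the number of IUSS loads in period i and sizes[i] is the
    total number of sterilization loads in that period (both hypothetical).
    """
    if not counts or len(counts) != len(sizes):
        raise ValueError("counts and sizes must be non-empty and equal length")

    p_bar = sum(counts) / sum(sizes)  # overall IUSS proportion (center line)
    limits = []
    for n in sizes:
        sigma = math.sqrt(p_bar * (1 - p_bar) / n)
        # Proportions cannot fall outside [0, 1], so clip the limits.
        limits.append((max(0.0, p_bar - 3 * sigma), min(1.0, p_bar + 3 * sigma)))
    return p_bar, limits


def out_of_control(counts, sizes):
    """Return indices of periods whose IUSS proportion is outside the limits."""
    _, limits = p_chart_limits(counts, sizes)
    return [
        i
        for i, (c, n) in enumerate(zip(counts, sizes))
        if not limits[i][0] <= c / n <= limits[i][1]
    ]
```

A period flagged by `out_of_control` would be a candidate for investigation, for example to check whether inventory shortages or scheduling conflicts drove the increase.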

Date Created
2014

Distinct feature learning and nonlinear variation pattern discovery using regularized autoencoders

Description

Feature learning and the discovery of nonlinear variation patterns in high-dimensional data are important tasks in many problem domains, such as imaging, streaming data from sensors, and manufacturing. This dissertation presents several methods for learning and visualizing nonlinear variation in high-dimensional data. First, an automated method for discovering nonlinear variation patterns using deep learning autoencoders is proposed. The approach provides a functional mapping from a low-dimensional representation to the original spatially-dense data that is both interpretable and efficient with respect to preserving information. Experimental results indicate that deep learning autoencoders outperform manifold learning and principal component analysis in reproducing the original data from the learned variation sources.

A key issue in using autoencoders for nonlinear variation pattern discovery is to encourage the learning of solutions where each feature represents a unique variation source, which we define as distinct features. This problem of learning distinct features is also referred to as disentangling factors of variation in the representation learning literature. The remainder of this dissertation highlights and provides solutions for this important problem.

An alternating autoencoder training method is presented and a new measure motivated by orthogonal loadings in linear models is proposed to quantify feature distinctness in the nonlinear models. Simulated point cloud data and handwritten digit images illustrate that standard training methods for autoencoders consistently mix the true variation sources in the learned low-dimensional representation, whereas the alternating method produces solutions with more distinct patterns.

Finally, a new regularization method for learning distinct nonlinear features using autoencoders is proposed. Motivated in part by the properties of linear solutions, a series of learning constraints is implemented via regularization penalties during stochastic gradient descent training. These include the orthogonality of tangent vectors to the manifold, the correlation between learned features, and the distributions of the learned features. This regularized learning approach yields low-dimensional representations that can be better interpreted and used to identify the true sources of variation impacting a high-dimensional feature space. Experimental results demonstrate the effectiveness of this method for nonlinear variation pattern discovery on both simulated and real data sets.
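
One of the constraints named above, the correlation between learned features, can be sketched as a penalty term. The version below sums the squared Pearson correlations over all pairs of features; this specific form and the name `correlation_penalty` are assumptions for illustration, since the abstract does not give the dissertation's formula. In training, a term like this would be added to the reconstruction loss with a weighting coefficient.

```python
import math


def correlation_penalty(features):
    """Sum of squared pairwise Pearson correlations between feature columns.

    features is a list of rows, one per sample, each holding the k values of
    the learned low-dimensional representation. The penalty is 0 when all
    feature pairs are uncorrelated and grows as features become redundant.
    """
    n = len(features)
    k = len(features[0])
    columns = [[row[j] for row in features] for j in range(k)]
    means = [sum(col) / n for col in columns]
    centered = [[v - m for v in col] for col, m in zip(columns, means)]
    norms = [math.sqrt(sum(v * v for v in col)) for col in centered]

    penalty = 0.0
    for a in range(k):
        for b in range(a + 1, k):
            if norms[a] == 0.0 or norms[b] == 0.0:
                continue  # a constant feature has no defined correlation
            dot = sum(x * y for x, y in zip(centered[a], centered[b]))
            r = dot / (norms[a] * norms[b])
            penalty += r * r
    return penalty
```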

Date Created
2016

Cognitive Computing for Decision Support

Description

The Cognitive Decision Support (CDS) model is proposed. The model is widely applicable and scales to realistic, complex decision problems based on adaptive learning. The utility of a decision is discussed and four types of decisions associated with the CDS model are identified. The CDS model is designed to learn decision utilities. Data enrichment is introduced to promote the effectiveness of learning. Grouping is introduced for large-scale decision learning. Introspection and adjustment are presented for adaptive learning. Triage recommendation is incorporated to indicate the trustworthiness of suggested decisions.

The CDS model and methodologies are integrated into an architecture using concepts from cognitive computing. The proposed architecture is implemented with an example use case in inventory management.

Reinforcement learning (RL) is discussed as an alternative, generalized adaptive learning engine for the CDS system to handle the complexity of many problems with unknown environments. An adaptive state dimension with context that can increase with newly available information is discussed. Several enhanced components for RL which are critical for complex use cases are integrated. Deep Q networks are embedded with the adaptive learning methodologies and applied to an example supply chain management problem on capacity planning.

A new approach using Ito stochastic processes is proposed as a more generalized method to generate non-stationary demands in various patterns that can be used in decision problems. The proposed method generates demands with varying non-stationary patterns, including trend, cyclical, seasonal, and irregular patterns. Conventional approaches are identified as special cases of the proposed method. Demands are illustrated in realistic settings for various decision models. Various statistical criteria are applied to filter the generated demands. The method is applied to a real-world example.
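
The kind of demand generation described above can be sketched with an Euler-Maruyama discretization of a simple Ito process, dX = mu(t) dt + sigma dW. The drift used here (a constant trend plus the derivative of a sinusoidal seasonal term), the parameter names, and the default values are all assumptions chosen for illustration; the dissertation's actual processes and filtering criteria are not given in the abstract.

```python
import math
import random


def simulate_demand(n_steps, dt=1.0, x0=100.0, trend=0.5,
                    season_amp=10.0, period=12.0, sigma=2.0, seed=0):
    """Simulate a non-stationary demand path with Euler-Maruyama steps.

    Hypothetical model: drift = trend + d/dt[season_amp * sin(2*pi*t/period)],
    with constant volatility sigma. Returns n_steps + 1 demand values.
    """
    rng = random.Random(seed)  # seeded for reproducible paths
    x = x0
    path = [x0]
    for i in range(n_steps):
        t = i * dt
        seasonal = season_amp * (2 * math.pi / period) * math.cos(2 * math.pi * t / period)
        drift = trend + seasonal
        # Euler-Maruyama update: deterministic drift plus Gaussian increment.
        x += drift * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Report demand as non-negative; the latent state x is left unclipped.
        path.append(max(x, 0.0))
    return path
```

Setting `season_amp=0.0` and `sigma=0.0` reduces the path to a pure linear trend, which illustrates how a conventional deterministic demand pattern can appear as a special case of a stochastic formulation.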

Date Created
2020

Fine Mapping Functional Noncoding Genetic Elements Via Machine Learning

Description

All biological processes, such as cell growth, cell differentiation, development, and aging, require a series of steps that are characterized by gene regulation. Studies have shown that gene regulation is the key to various traits and diseases. Various factors affect gene regulation, including genetic signals, epigenetic tracks, and genetic variants. Deciphering and cataloging these functional genetic elements in the non-coding regions of the genome is one of the biggest challenges in precision medicine and genetic research. This thesis presents two different approaches to identifying these elements: TreeMap and DeepCORE. The first approach involves identifying putative causal genetic variants in cis-eQTL, accounting for multisite effects and genetic linkage at a locus. TreeMap performs an organized search for individual and multiple causal variants using a tree-guided nested machine learning method. DeepCORE, on the other hand, explores novel deep learning techniques that model the relationship between genetic, epigenetic, and transcriptional patterns across tissues and cell lines and identify co-operative regulatory elements that affect gene regulation. These two methods are believed to be the link for genotype-phenotype association and a necessary step to explaining various complex diseases and missing heritability.

Date Created
2020