A visual analytics based decision support methodology for evaluating low energy building design alternatives

Description

The ability to design high performance buildings has acquired great importance in recent years due to numerous federal, societal and environmental initiatives. However, this endeavor is much more demanding in terms of designer expertise and time. It requires a whole new level of synergy between automated performance prediction and the human capabilities to perceive, evaluate and ultimately select a suitable solution. While performance prediction can be highly automated through the use of computers, performance evaluation cannot, unless it is with respect to a single criterion. The need to address multi-criteria requirements makes it more valuable for a designer to know the "latitude" or "degrees of freedom" available in changing certain design variables while achieving preset criteria such as energy performance, life cycle cost, and environmental impacts. This requirement can be met by a decision support framework based on near-optimal "satisficing" as opposed to purely optimal decision making techniques. Currently, such a comprehensive design framework is lacking, which is the basis for undertaking this research. The primary objective of this research is to facilitate a complementary relationship between designers and computers for Multi-Criterion Decision Making (MCDM) during high performance building design. It is based on the application of Monte Carlo approaches to create a database of solutions using deterministic whole building energy simulations, along with data mining methods to rank variable importance and reduce the multi-dimensionality of the problem. A novel interactive visualization approach is then proposed which uses regression based models to create dynamic interplays of how varying these important variables affects the multiple criteria, while providing a visual range or band of variation of the different design parameters.

The MCDM process has been incorporated into an alternative methodology for high performance building design referred to as Visual Analytics based Decision Support Methodology (VADSM). VADSM is envisioned to be most useful during the conceptual and early design performance modeling stages by providing a set of potential solutions that can be analyzed further for final design selection. The proposed methodology can be used for new building design synthesis as well as evaluation of retrofits and operational deficiencies in existing buildings.

Contributors: Dutta, Ranojoy (Author) / Reddy, T Agami (Thesis advisor) / Runger, George C. (Committee member) / Addison, Marlin S. (Committee member) / Arizona State University (Publisher)

Created: 2013

Surgical instrument reprocessing in a hospital setting analyzed with statistical process control and data mining techniques

Description

In a healthcare setting, the Sterile Processing Department (SPD) provides ancillary services to the Operating Room (OR), Emergency Room, Labor & Delivery, and off-site clinics. SPD's function is to reprocess reusable surgical instruments and return them to their home departments. The management of surgical instruments and medical devices can impact patient safety and hospital revenue. Any time instrumentation or devices are not available or are not fit for use, patient safety and revenue can be negatively impacted. One step of the instrument reprocessing cycle is sterilization. Steam sterilization is the sterilization method used for the majority of surgical instruments and is preferred to immediate use steam sterilization (IUSS) because terminally sterilized items can be stored until needed. IUSS items must be used promptly and cannot be stored for later use. IUSS is intended for emergency situations and not as a regular course of action. Unfortunately, IUSS is used to compensate for inadequate inventory levels, scheduling conflicts, and miscommunications. If IUSS is viewed as an adverse event, then monitoring IUSS incidences can help healthcare organizations meet patient safety goals and financial goals along with aiding in process improvement efforts. This work recommends applying statistical process control methods to IUSS incidents and illustrates the use of control charts for IUSS occurrences through a case study and analysis of the control charts for data from a health care provider. Furthermore, this work considers the application of data mining methods to IUSS occurrences and presents a representative example of data mining applied to the IUSS occurrences. This extends the application of statistical process control and data mining in healthcare applications.
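
As a rough illustration of monitoring IUSS occurrences with a control chart, the sketch below computes c-chart limits for event counts per period. The abstract does not say which chart type the thesis uses; the c-chart is assumed here only because it is a common choice for count data, and the function names and the weekly counts are hypothetical.

```python
import math


def c_chart_limits(counts):
    """Return (LCL, center line, UCL) of a c-chart for event counts per period.

    Assumes the counts are roughly Poisson, so the standard deviation is
    estimated as the square root of the mean count.
    """
    c_bar = sum(counts) / len(counts)
    sigma = math.sqrt(c_bar)
    lcl = max(0.0, c_bar - 3.0 * sigma)  # a count cannot fall below zero
    ucl = c_bar + 3.0 * sigma
    return lcl, c_bar, ucl


def out_of_control(counts):
    """Indices of periods whose count falls outside the 3-sigma limits."""
    lcl, _, ucl = c_chart_limits(counts)
    return [i for i, c in enumerate(counts) if c < lcl or c > ucl]


# Hypothetical weekly IUSS counts, invented purely for illustration.
weekly_iuss = [3, 5, 4, 2, 6, 4, 3, 5, 4, 20]
flagged = out_of_control(weekly_iuss)
```

With these made-up counts only the final week exceeds the upper limit. In practice the limits would normally be estimated from an in-control baseline period rather than from data that already contain the signal.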

Contributors: Weart, Gail (Author) / Runger, George C. (Thesis advisor) / Li, Jing (Committee member) / Shunk, Dan (Committee member) / Arizona State University (Publisher)

Created: 2014

System complexity reduction via feature selection

Description

This dissertation transforms a set of system complexity reduction problems to feature selection problems. Three systems are considered: classification based on association rules, network structure learning, and time series classification. Furthermore, two variable importance measures are proposed to reduce the feature selection bias in tree models. Associative classifiers can achieve high accuracy, but the combination of many rules is difficult to interpret. Rule condition subset selection (RCSS) methods for associative classification are considered. RCSS aims to prune the rule conditions into a subset via feature selection. The subset then can be summarized into rule-based classifiers. Experiments show that classifiers after RCSS can substantially improve the classification interpretability without loss of accuracy. An ensemble feature selection method is proposed to learn Markov blankets for either discrete or continuous networks (without linear, Gaussian assumptions). The method is compared to a Bayesian local structure learning algorithm and to alternative feature selection methods in the causal structure learning problem. Feature selection is also used to enhance the interpretability of time series classification. Existing time series classification algorithms (such as nearest-neighbor with dynamic time warping measures) are accurate but difficult to interpret. This research leverages the time-ordering of the data to extract features, and generates an effective and efficient classifier referred to as a time series forest (TSF). The computational complexity of TSF is only linear in the length of time series, and interpretable features can be extracted. These features can be further reduced, and summarized for even better interpretability. Lastly, two variable importance measures are proposed to reduce the feature selection bias in tree-based ensemble models. It is well known that bias can occur when predictor attributes have different numbers of values. Two methods are proposed to solve the bias problem. One uses an out-of-bag sampling method called OOBForest, and the other, based on the new concept of a partial permutation test, is called a pForest. Experimental results show the existing methods are not always reliable for multi-valued predictors, while the proposed methods have advantages.

Contributors: Deng, Houtao (Author) / Runger, George C. (Thesis advisor) / Lohr, Sharon L (Committee member) / Pan, Rong (Committee member) / Zhang, Muhong (Committee member) / Arizona State University (Publisher)

Created: 2011

Matching supply and demand using dynamic quotation strategies

Description

Today's competitive markets force companies to constantly engage in the complex task of managing their demand. In make-to-order manufacturing or service systems, the demand for a product is shaped by price and lead times, where high price and lead time quotes ensure profitability for the supplier, but discourage customers from placing orders. Low price and lead times, on the other hand, generally result in high demand, but do not necessarily ensure profitability. The price and lead time quotation problem considers the trade-off between offering high and low prices and lead times. Recent practices in make-to-order manufacturing companies reveal the importance of dynamic quotation strategies, under which the price and lead time quotes flexibly change depending on the status of the system. In this dissertation, the objective is to model a make-to-order manufacturing system and explore various aspects of dynamic quotation strategies such as the behavior of optimal price and lead time decisions, the impact of customer preferences on optimal decisions, the benefits of employing dynamic quotation in comparison to simpler quotation strategies, and the benefits of coordinating price and lead time decisions. I first consider a manufacturer that receives demand from spot purchasers (who are quoted dynamic price and lead times), as well as from contract customers who have agreements with the manufacturer with fixed price and lead time terms. I analyze how customer preferences affect the optimal price and lead time decisions, the benefits of dynamic quotation, and the optimal mix of spot purchasers and contract customers. These analyses necessitate the computation of the expected tardiness of customer orders at the moment a customer enters the system. Hence, in the second part of the dissertation, I develop methodologies to compute the expected tardiness in multi-class priority queues. For the trivial single class case, a closed formulation is obtained. For the more complex multi-class case, numerical inverse Laplace transformation algorithms are developed. In the last part of the dissertation, I model a decentralized system with two components. The marketing department determines the price quotes with the objective of maximizing revenues, and the manufacturing department determines the lead time quotes to minimize lateness costs. I discuss the benefits of coordinating price and lead time decisions, and develop an incentivization scheme to reduce the negative impacts of lack of coordination.
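
To give a feel for what a closed-form expected tardiness can look like in a single-class setting, the sketch below assumes a single-server, first-come-first-served queue with exponential service at rate `mu`. These modeling assumptions, and the names `expected_tardiness`, `n_ahead`, and `due`, are illustrative and are not taken from the dissertation. Under these assumptions an order that finds `n_ahead` orders in the system has an Erlang(n_ahead + 1, mu) flow time T, and E[max(T - due, 0)] equals (1/mu) times the sum over j = 0..n_ahead of P(Poisson(mu * due) <= j).

```python
import math


def expected_tardiness(n_ahead, mu, due):
    """E[max(T - due, 0)] for T ~ Erlang(n_ahead + 1, mu).

    Illustrative assumptions: single server, FCFS, exponential service with
    rate mu, and n_ahead orders already in the system (including the one in
    service, which is valid because the exponential is memoryless).
    """
    k = n_ahead + 1          # service completions needed before the order leaves
    a = mu * due
    term = math.exp(-a)      # Poisson(a) probability of 0 events
    cdf = 0.0
    total = 0.0
    for j in range(k):
        cdf += term          # now P(Poisson(a) <= j)
        total += cdf
        term *= a / (j + 1)  # advance to the Poisson probability of j + 1
    return total / mu


# Example: three orders ahead, service rate 2 per day, quoted lead time 1.5 days.
tardiness = expected_tardiness(n_ahead=3, mu=2.0, due=1.5)
```

Two sanity checks follow directly from the formula: with a zero lead time the result is the mean flow time (n_ahead + 1) / mu, and with an empty system it reduces to exp(-mu * due) / mu.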

Contributors: Hafizoglu, Ahmet Baykal (Author) / Gel, Esma S (Thesis advisor) / Villalobos, Jesus R (Committee member) / Mirchandani, Pitu (Committee member) / Keskinocak, Pinar (Committee member) / Runger, George C. (Committee member) / Arizona State University (Publisher)

Created: 2012

Optimal design of experiments for functional responses

Description

Functional or dynamic responses are prevalent in experiments in the fields of engineering, medicine, and the sciences, but proposals for optimal designs are still sparse for this type of response. Experiments with dynamic responses result in multiple responses taken over a spectrum variable, so the design matrix for a dynamic response has a more complicated structure. In the literature, the optimal design problem for some functional responses has been solved using genetic algorithms (GA) and approximate design methods. The goal of this dissertation is to develop fast computer algorithms for calculating exact D-optimal designs.

First, we demonstrated how the traditional exchange methods could be improved to generate a computationally efficient algorithm for finding G-optimal designs. The proposed two-stage algorithm, which is called the cCEA, uses a clustering-based approach to restrict the set of possible candidates for PEA, and then improves the G-efficiency using CEA.

The second major contribution of this dissertation is the development of fast algorithms for constructing D-optimal designs that determine the optimal sequence of stimuli in fMRI studies. The update formula for the determinant of the information matrix was improved by exploiting the sparseness of the information matrix, leading to faster computation times. The proposed algorithm outperforms the genetic algorithm with respect to computational efficiency and D-efficiency.

The third contribution is a study of optimal experimental designs for more general functional response models. First, the B-spline system is proposed as the non-parametric smoother of the response function, and an algorithm is developed to determine D-optimal sampling points of a spectrum variable. Second, we proposed a two-step algorithm for finding the optimal design for both sampling points and experimental settings. In the first step, the matrix of experimental settings is held fixed while the algorithm optimizes the determinant of the information matrix for a mixed effects model to find the optimal sampling times. In the second step, the optimal sampling times obtained from the first step are held fixed while the algorithm iterates on the information matrix to find the optimal experimental settings. The designs constructed by this approach yield superior performance over other designs found in the literature.
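
The D-criterion behind these algorithms is the determinant of the information matrix. The toy sketch below is far simpler than the functional-response and mixed-effects models in the dissertation: it assumes a straight-line model y = b0 + b1*x, for which the information matrix X'X has determinant n*sum(x^2) - (sum(x))^2, and it finds the best exact four-run design by brute force over a small candidate grid. The function names and the grid are illustrative assumptions.

```python
from itertools import combinations_with_replacement


def d_criterion(xs):
    """det(X'X) for the straight-line model with design points xs."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    return n * sxx - sx * sx


def best_exact_design(candidates, n_runs):
    """Exhaustively search all n_runs-point designs drawn from the candidates."""
    return max(combinations_with_replacement(candidates, n_runs), key=d_criterion)


grid = [-1.0, -0.5, 0.0, 0.5, 1.0]   # hypothetical candidate points on [-1, 1]
design = best_exact_design(grid, 4)  # two runs at each end of the interval
```

Exhaustive search is only workable for tiny problems like this one, which is why exchange-type algorithms and efficient determinant updates matter for realistic design spaces.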

Contributors: Saleh, Moein (Author) / Pan, Rong (Thesis advisor) / Montgomery, Douglas C. (Committee member) / Runger, George C. (Committee member) / Kao, Ming-Hung (Committee member) / Arizona State University (Publisher)

Created: 2015

A multi-sensor data fusion approach for real-time lane-based traffic estimation

Description

Modern intelligent transportation systems (ITS) make driving more efficient, easier, and safer. Knowledge of real-time traffic conditions is a critical input for operating ITS. Real-time freeway traffic state estimation approaches have been used to quantify traffic conditions given a limited amount of data collected by traffic sensors. Currently, almost all real-time estimation methods have been developed for estimating laterally aggregated traffic conditions in a roadway segment using link-based models which assume homogeneous conditions across multiple lanes. However, with new advances and applications of ITS, knowledge of lane-based traffic conditions, in which the differences among lanes are recognized, is becoming important. In addition, most of the current real-time freeway traffic estimators consider only data from loop detectors. This dissertation develops a bi-level data fusion approach using heterogeneous multi-sensor measurements to estimate real-time lane-based freeway traffic conditions, which integrates a link-level model-based estimator and a lane-level data-driven estimator.

Macroscopic traffic flow models describe the evolution of aggregated traffic characteristics over time and space, which are required by model-based traffic estimation approaches. Since the current first-order Lagrangian macroscopic traffic flow model has some unrealistic implicit assumptions (e.g., infinite acceleration), a second-order Lagrangian macroscopic traffic flow model has been developed by incorporating drivers’ anticipation and reaction delay. A multi-sensor extended Kalman filter (MEKF) algorithm has been developed to combine heterogeneous measurements from multiple sources. An MEKF-based traffic estimator, explicitly using the developed second-order traffic flow model and measurements from loop detectors as well as GPS trajectories for given fractions of vehicles, has been proposed which gives real-time link-level traffic estimates in the bi-level estimation system.
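
The core idea of combining measurements from several sensors in a Kalman filter can be sketched with a scalar, linear measurement update applied once per sensor. This is a deliberate simplification: the dissertation's MEKF is an extended filter driven by a second-order traffic flow model, whereas the sketch below assumes the state is a single speed value that each sensor observes directly. The function names and the numbers are hypothetical.

```python
def kalman_update(x, p, z, r):
    """One scalar Kalman measurement update.

    x, p: prior estimate and its variance
    z, r: measurement and its noise variance
    """
    k = p / (p + r)            # Kalman gain: how much to trust the measurement
    x_new = x + k * (z - x)    # move the estimate toward the measurement
    p_new = (1.0 - k) * p      # uncertainty shrinks after every update
    return x_new, p_new


def fuse(x_prior, p_prior, measurements):
    """Sequentially fuse (value, variance) pairs from independent sensors."""
    x, p = x_prior, p_prior
    for z, r in measurements:
        x, p = kalman_update(x, p, z, r)
    return x, p


# Hypothetical link speed (km/h): model prior, then a loop detector and a GPS probe.
speed, variance = fuse(50.0, 100.0, [(60.0, 25.0), (55.0, 20.0)])
```

Processing the sensors one after another gives the same result as a single joint update when their noise is independent, which makes it straightforward to handle sensors that report at different rates.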

The lane-level estimation in the bi-level data fusion system uses the link-level estimates as priors and adopts a data-driven approach to obtain lane-based estimates, where now heterogeneous multi-sensor measurements are combined using parallel spatial-temporal filters.

Experimental analysis shows that the second-order model can more realistically reproduce real world traffic flow patterns (e.g., stop-and-go waves). The MEKF-based link-level estimator exhibits more accurate results than the estimator that uses only a single data source. Evaluation of the lane-level estimator demonstrates that the proposed new bi-level multi-sensor data fusion system can provide very good estimates of real-time lane-based traffic conditions.

Contributors: Zhou, Zhuoyang (Author) / Mirchandani, Pitu (Thesis advisor) / Askin, Ronald (Committee member) / Runger, George C. (Committee member) / Zhou, Xuesong (Committee member) / Arizona State University (Publisher)

Created: 2015

Distinct feature learning and nonlinear variation pattern discovery using regularized autoencoders

Description

Feature learning and the discovery of nonlinear variation patterns in high-dimensional data are important tasks in many problem domains, such as imaging, streaming data from sensors, and manufacturing. This dissertation presents several methods for learning and visualizing nonlinear variation in high-dimensional data. First, an automated method for discovering nonlinear variation patterns using deep learning autoencoders is proposed. The approach provides a functional mapping from a low-dimensional representation to the original spatially-dense data that is both interpretable and efficient with respect to preserving information. Experimental results indicate that deep learning autoencoders outperform manifold learning and principal component analysis in reproducing the original data from the learned variation sources.

A key issue in using autoencoders for nonlinear variation pattern discovery is to encourage the learning of solutions where each feature represents a unique variation source, which we define as distinct features. This problem of learning distinct features is also referred to as disentangling factors of variation in the representation learning literature. The remainder of this dissertation highlights and provides solutions for this important problem.

An alternating autoencoder training method is presented and a new measure motivated by orthogonal loadings in linear models is proposed to quantify feature distinctness in the nonlinear models. Simulated point cloud data and handwritten digit images illustrate that standard training methods for autoencoders consistently mix the true variation sources in the learned low-dimensional representation, whereas the alternating method produces solutions with more distinct patterns.

Finally, a new regularization method for learning distinct nonlinear features using autoencoders is proposed. Motivated in part by the properties of linear solutions, a series of learning constraints is implemented via regularization penalties during stochastic gradient descent training. These include the orthogonality of tangent vectors to the manifold, the correlation between learned features, and the distributions of the learned features. This regularized learning approach yields low-dimensional representations which can be better interpreted and used to identify the true sources of variation impacting a high-dimensional feature space. Experimental results demonstrate the effectiveness of this method for nonlinear variation pattern discovery on both simulated and real data sets.

Contributors: Howard, Phillip (Author) / Runger, George C. (Thesis advisor) / Montgomery, Douglas C. (Committee member) / Mirchandani, Pitu (Committee member) / Apley, Daniel (Committee member) / Arizona State University (Publisher)

Created: 2016

Cognitive Computing for Decision Support

Description

The Cognitive Decision Support (CDS) model is proposed. The model is widely applicable and scales to realistic, complex decision problems based on adaptive learning. The utility of a decision is discussed and four types of decisions associated with the CDS model are identified. The CDS model is designed to learn decision utilities. Data enrichment is introduced to promote the effectiveness of learning. Grouping is introduced for large-scale decision learning. Introspection and adjustment are presented for adaptive learning. Triage recommendation is incorporated to indicate the trustworthiness of suggested decisions.

The CDS model and methodologies are integrated into an architecture using concepts from cognitive computing. The proposed architecture is implemented with an example use case in inventory management.

Reinforcement learning (RL) is discussed as an alternative, generalized adaptive learning engine for the CDS system to handle the complexity of many problems with unknown environments. An adaptive state dimension with context that can increase with newly available information is discussed. Several enhanced components for RL which are critical for complex use cases are integrated. Deep Q networks are embedded with the adaptive learning methodologies and applied to an example supply chain management problem on capacity planning.

A new approach using Ito stochastic processes is proposed as a more generalized method to generate non-stationary demands in various patterns that can be used in decision problems. The proposed method generates demands with varying non-stationary patterns, including trend, cyclical, seasonal, and irregular patterns. Conventional approaches are identified as special cases of the proposed method. Demands are illustrated in realistic settings for various decision models. Various statistical criteria are applied to filter the generated demands. The method is applied to a real-world example.
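
A generic way to draw sample paths from an Ito process dX = mu(t, X) dt + sigma(t, X) dW is the Euler-Maruyama scheme, sketched below. This is a standard discretization, not the dissertation's specific generator; the function name `simulate_demand`, the truncation at zero, and the example drift and diffusion functions are illustrative assumptions.

```python
import math
import random


def simulate_demand(x0, drift, diffusion, n_steps, dt=1.0, seed=0):
    """Euler-Maruyama path of dX = drift(t, X) dt + diffusion(t, X) dW.

    drift and diffusion are caller-supplied functions of (t, x). Values are
    truncated at zero because demand cannot be negative (an assumption made
    for this sketch).
    """
    rng = random.Random(seed)
    path = [x0]
    for k in range(n_steps):
        t = k * dt
        x = path[-1]
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        x_next = x + drift(t, x) * dt + diffusion(t, x) * dw
        path.append(max(x_next, 0.0))
    return path


# Hypothetical weekly demand: linear trend plus a 52-week seasonal swing and noise.
def trend_and_season(t, x):
    return 0.5 + 3.0 * math.cos(2.0 * math.pi * t / 52.0)


def constant_noise(t, x):
    return 4.0


demand = simulate_demand(100.0, trend_and_season, constant_noise, n_steps=104)
```

Choosing different drift and diffusion functions changes the pattern: a constant drift gives a pure trend, a periodic drift gives seasonality, and a diffusion proportional to x gives noise that scales with the demand level.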

Contributors: Kee, Seho (Author) / Runger, George C. (Thesis advisor) / Escobedo, Adolfo (Committee member) / Gel, Esma (Committee member) / Janakiram, Mani (Committee member) / Rogers, Dale (Committee member) / Arizona State University (Publisher)

Created: 2020

Fine Mapping Functional Noncoding Genetic Elements Via Machine Learning

Description

All biological processes, such as cell growth, cell differentiation, development, and aging, require a series of steps that are characterized by gene regulation. Studies have shown that gene regulation is the key to various traits and diseases. Various factors affect gene regulation, including genetic signals, epigenetic tracks, and genetic variants. Deciphering and cataloging these functional genetic elements in the non-coding regions of the genome is one of the biggest challenges in precision medicine and genetic research. This thesis presents two different approaches to identifying these elements: TreeMap and DeepCORE. The first approach involves identifying putative causal genetic variants in cis-eQTL accounting for multisite effects and genetic linkage at a locus. TreeMap performs an organized search for individual and multiple causal variants using a tree guided nested machine learning method. DeepCORE, on the other hand, explores novel deep learning techniques that model the relationship between genetic, epigenetic and transcriptional patterns across tissues and cell lines and identify co-operative regulatory elements that affect gene regulation. These two methods are believed to be the link for genotype-phenotype association and a necessary step to explaining various complex diseases and missing heritability.

Contributors: Chandrashekar, Pramod Bharadwaj (Author) / Liu, Li (Thesis advisor) / Runger, George C. (Committee member) / Dinu, Valentin (Committee member) / Arizona State University (Publisher)

Created: 2020
