Matching Items (98)

Description
The theme of this work is the development of fast numerical algorithms for sparse optimization and their applications in medical imaging and source localization using sensor array processing. Owing to the recently proposed theory of Compressive Sensing (CS), the $\ell_1$ minimization problem has attracted increasing attention for its ability to exploit sparsity. Traditional interior-point methods encounter computational difficulties when solving CS applications. In the first part of this work, a fast algorithm based on the augmented Lagrangian method is proposed for solving the large-scale TV-$\ell_1$ regularized inverse problem. Specifically, by taking advantage of the separable structure, the original problem can be approximated by a sum of simple functions with closed-form solutions. A preconditioner for solving the block Toeplitz with Toeplitz block (BTTB) linear system is proposed to accelerate the computation. An in-depth discussion of the rate of convergence and the optimal parameter selection criteria is given. Numerical experiments test the performance and the robustness of the proposed algorithm over a wide range of parameter values. Applications of the algorithm in magnetic resonance (MR) imaging and a comparison with other existing methods are included. The second part of this work applies the TV-$\ell_1$ model to source localization using sensor arrays. The array output is reformulated as a sparse waveform via an over-complete basis, and the properties of the $\ell_p$-norm in detecting sparsity are studied. An algorithm is proposed for minimizing the resulting non-convex problem. Numerical experiments show that the proposed algorithm, with the aid of the $\ell_p$-norm, can resolve closely distributed sources with higher accuracy than other existing methods.
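The closed-form subproblem solutions referenced above are typically shrinkage (soft-thresholding) operators. A minimal sketch of that idea, applied to a plain $\ell_1$-regularized least-squares problem via augmented-Lagrangian (ADMM-style) splitting rather than the dissertation's full TV-$\ell_1$ model; all variable names and parameter values are illustrative assumptions:

```python
import numpy as np

def soft_threshold(x, tau):
    """Closed-form minimizer of tau*||z||_1 + 0.5*||z - x||^2,
    the separable l1 subproblem produced by variable splitting."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

rng = np.random.default_rng(0)
x_true = np.zeros(100)
x_true[:5] = 1.0                       # sparse ground truth
A = rng.standard_normal((40, 100))
b = A @ x_true + 0.01 * rng.standard_normal(40)

lam, beta = 0.1, 1.0                   # regularization / penalty weights (assumed)
u = np.zeros(100)                      # signal estimate
z = np.zeros(100)                      # auxiliary split variable
y = np.zeros(100)                      # Lagrange multiplier

for _ in range(200):
    # u-subproblem: regularized least squares (quadratic, solved exactly)
    u = np.linalg.solve(A.T @ A + beta * np.eye(100),
                        A.T @ b + beta * z - y)
    # z-subproblem: separable, solved exactly by soft-thresholding
    z = soft_threshold(u + y / beta, lam / beta)
    # multiplier (dual) update
    y = y + beta * (u - z)
```

The splitting is what makes the non-smooth term cheap: the coupled problem becomes an alternation between a linear solve and an elementwise shrinkage.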
Contributors: Shen, Wei (Author) / Mittelmann, Hans D. (Thesis advisor) / Renaut, Rosemary A. (Committee member) / Jackiewicz, Zdzislaw (Committee member) / Gelb, Anne (Committee member) / Ringhofer, Christian (Committee member) / Arizona State University (Publisher)
Created: 2011
Description
The technology expansion seen in the last decade for genomics research has permitted the generation of large-scale data sources pertaining to molecular biological assays, genomics, proteomics, transcriptomics, and other modern omics catalogs. New methods to analyze, integrate, and visualize these data types are essential to unveil relevant disease mechanisms. Toward these objectives, this research focuses on data integration in two scenarios: (1) transcriptomic, proteomic, and functional information, and (2) real-time sensor-based measurements motivated by single-cell technology. To assess relationships between protein abundance, transcriptomic, and functional data, a nonlinear model was explored at static and temporal levels. Highlights of this work include the successful integration of these heterogeneous data sources through the stochastic gradient boosted tree approach and the resulting improvement in predictability. Through the development of an innovative validation subroutine based on a permutation approach and the use of external information (i.e., operons), the lack of a priori knowledge for undetected proteins was overcome. The integrative methodologies allowed undetected proteins to be identified for Desulfovibrio vulgaris and Shewanella oneidensis for further biological exploration in laboratories toward finding functional relationships. In an effort to better understand diseases such as cancer at different developmental stages, the Microscale Life Science Center headquartered at Arizona State University is pursuing single-cell studies by developing novel technologies. This research developed and applied a statistical framework that tackled the following challenges: random noise, heterogeneous dynamic systems with multiple states, and understanding cell behavior within and across different Barrett's esophageal epithelial cell lines using oxygen consumption curves. These curves were characterized with good empirical fit by nonlinear models with simple structures, which allowed a large number of features to be extracted. Application of a supervised classification model to these features and the integration of experimental factors allowed subtle patterns among different cell types to be identified and visualized through multidimensional scaling. Motivated by the challenges of analyzing real-time measurements, we further explored a unique two-dimensional representation of multiple time series using a wavelet approach, which showed promising results toward less complex approximations. The benefits of external information for improving the image representation were also explored.
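A minimal sketch of the stochastic gradient boosted tree approach referenced above, using scikit-learn on synthetic stand-in data (the real inputs would be transcriptomic and functional features with protein abundance as the target); all data and parameter choices here are illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Hypothetical stand-in: rows are genes, columns are transcriptomic and
# functional features, and the target is a measured protein abundance.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 20))
y = np.tanh(X[:, 0]) * X[:, 1] + 0.1 * rng.standard_normal(500)  # nonlinear signal

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# subsample < 1.0 fits each tree on a random fraction of the training
# data, which is the "stochastic" element of stochastic gradient boosting.
model = GradientBoostingRegressor(n_estimators=300, max_depth=3,
                                  learning_rate=0.05, subsample=0.7,
                                  random_state=0)
model.fit(X_tr, y_tr)
print("held-out R^2:", model.score(X_te, y_te))
```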
Contributors: Torres Garcia, Wandaliz (Author) / Meldrum, Deirdre R. (Thesis advisor) / Runger, George C. (Thesis advisor) / Gel, Esma S. (Committee member) / Li, Jing (Committee member) / Zhang, Weiwen (Committee member) / Arizona State University (Publisher)
Created: 2011
Description
This dissertation transforms a set of system complexity reduction problems into feature selection problems. Three systems are considered: classification based on association rules, network structure learning, and time series classification. Furthermore, two variable importance measures are proposed to reduce feature selection bias in tree models. Associative classifiers can achieve high accuracy, but the combination of many rules is difficult to interpret. Rule condition subset selection (RCSS) methods for associative classification are considered. RCSS prunes the rule conditions into a subset via feature selection; the subset can then be summarized into rule-based classifiers. Experiments show that RCSS can substantially improve classification interpretability without loss of accuracy. An ensemble feature selection method is proposed to learn Markov blankets for either discrete or continuous networks (without linear or Gaussian assumptions). The method is compared to a Bayesian local structure learning algorithm and to alternative feature selection methods on the causal structure learning problem. Feature selection is also used to enhance the interpretability of time series classification. Existing time series classification algorithms (such as nearest-neighbor with dynamic time warping measures) are accurate but difficult to interpret. This research leverages the time-ordering of the data to extract features and generates an effective and efficient classifier referred to as a time series forest (TSF). The computational complexity of TSF is only linear in the length of the time series, and interpretable features can be extracted. These features can be further reduced and summarized for even better interpretability. Lastly, two variable importance measures are proposed to reduce feature selection bias in tree-based ensemble models. It is well known that bias can occur when predictor attributes have different numbers of values. Two methods are proposed to solve the bias problem: one uses an out-of-bag sampling method, called OOBForest, and the other is based on the new concept of a partial permutation test, called pForest. Experimental results show that the existing methods are not always reliable for multi-valued predictors, while the proposed methods have advantages.
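The bias problem and a permutation-based remedy can be sketched as follows. This is not the dissertation's OOBForest or pForest; it contrasts scikit-learn's impurity-based importances with standard permutation importance on a hypothetical dataset where a many-valued attribute is pure noise:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Hypothetical data: one truly predictive variable and one irrelevant
# 32-valued attribute, which impurity-based importance tends to inflate.
rng = np.random.default_rng(2)
n = 1000
x_signal = rng.standard_normal(n)
x_noise = rng.integers(0, 32, n).astype(float)
y = (x_signal + 0.5 * rng.standard_normal(n) > 0).astype(int)
X = np.column_stack([x_signal, x_noise])

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
print("impurity importances:   ", forest.feature_importances_)

# Permutation importance: shuffle one column at a time and measure the
# drop in accuracy, which stays near zero for the irrelevant attribute.
perm = permutation_importance(forest, X, y, n_repeats=20, random_state=0)
print("permutation importances:", perm.importances_mean)
```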
Contributors: Deng, Houtao (Author) / Runger, George C. (Thesis advisor) / Lohr, Sharon L. (Committee member) / Pan, Rong (Committee member) / Zhang, Muhong (Committee member) / Arizona State University (Publisher)
Created: 2011
Description
Overcrowding of Emergency Departments (EDs) puts the safety of patients at risk. Decision makers implement Ambulance Diversion (AD) as a way to relieve congestion and ensure timely treatment delivery. However, ineffective design of AD policies reduces accessibility to emergency care, and adverse events may arise. The objective of this dissertation is to propose methods to design and analyze effective AD policies that consider performance measures related to patient safety. First, a simulation-based methodology is proposed to evaluate the mean performance and variability of single-factor AD policies in a single-hospital environment, considering the trade-off between average waiting time and the percentage of time spent on diversion. Regression equations are proposed to obtain parameters of AD policies that yield a desired performance level. The results suggest that policies based on the total number of patients waiting are more consistent and provide high precision in predicting policy performance. Then, a Markov Decision Process model is proposed to obtain the optimal AD policy, assuming that information on the time to start treatment in a neighboring hospital is available. The model is designed to minimize the long-run average tardiness per patient, where tardiness is defined as the time patients wait beyond a safety threshold before starting treatment. Theoretical and computational analyses show that there exists an optimal policy of threshold type, and that diversion can be a good alternative for decreasing tardiness when ambulance patients cause excessive congestion in the ED. Furthermore, implementing the AD policies in a simulation model that relaxes several of the assumptions suggests that the model provides consistent policies under multiple scenarios. Finally, a genetic algorithm is combined with simulation to design effective policies for multiple hospitals simultaneously, with the objective of minimizing the time patients spend in non-value-added activities, including transportation, waiting, and boarding in the ED. Moreover, the AD policies are combined with simple ambulance destination policies to create ambulance flow control mechanisms. Results show that effective ambulance management can significantly reduce the time patients must wait to receive an appropriate level of care.
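A toy illustration of a threshold-type policy and the waiting-time/diversion trade-off discussed above. This is a deliberately simplified single-server queue, not the dissertation's simulation or MDP model; all rates and the diversion rule are illustrative assumptions:

```python
import random

def simulate_ed(threshold, n_arrivals=50_000, seed=0):
    """Single-server ED sketch: ambulance arrivals are diverted whenever
    the number of patients in the ED is at or above `threshold`;
    walk-ins are never diverted. Returns (mean wait, diverted fraction)."""
    rng = random.Random(seed)
    t, free_at = 0.0, 0.0
    in_system = []               # departure times of patients present
    waits, diverted = [], 0
    for _ in range(n_arrivals):
        t += rng.expovariate(0.9)          # combined arrival rate (assumed)
        in_system = [d for d in in_system if d > t]
        if rng.random() < 0.3 and len(in_system) >= threshold:
            diverted += 1                  # ambulance sent elsewhere
            continue
        start = max(t, free_at)            # FIFO service
        free_at = start + rng.expovariate(1.0)
        in_system.append(free_at)
        waits.append(start - t)
    return sum(waits) / len(waits), diverted / n_arrivals

# Raising the threshold trades less diversion for longer waits.
for k in (3, 6, 12):
    w, d = simulate_ed(k)
    print(f"threshold={k:2d}  mean wait={w:5.2f}  diverted={d:.1%}")
```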
Contributors: Ramirez Nafarrate, Adrian (Author) / Fowler, John W. (Thesis advisor) / Wu, Teresa (Thesis advisor) / Gel, Esma S. (Committee member) / Limon, Jorge (Committee member) / Arizona State University (Publisher)
Created: 2011
Description
With the increase in computing power and availability of data, there has never been a greater need to understand data and make decisions from it. Traditional statistical techniques may not be adequate to handle the size of today's data or the complexities of the information hidden within it. Thus, knowledge discovery through machine learning is necessary to better understand the information in data. In this dissertation, we explore the topics of asymmetric loss and asymmetric data in machine learning and propose new algorithms as solutions to some of the problems in these areas. We also study variable selection for matched data sets and propose a solution for when the matched data exhibit non-linearity. The research is divided into three parts. The first part addresses the problem of asymmetric loss. An asymmetric support vector machine (aSVM) is proposed to predict specific classes with high accuracy; it is shown to produce higher precision than a regular SVM. The second part addresses asymmetric data sets, where variables are predictive for only a subset of the classes. An Asymmetric Random Forest (ARF) is proposed to detect these kinds of variables. The third part explores variable selection for matched data sets. A Matched Random Forest (MRF) is proposed to find variables that can distinguish case from control without the restrictions that exist in linear models; MRF detects such variables even in the presence of interactions and qualitative variables.
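A rough sketch of the asymmetric-loss idea. This is not the dissertation's aSVM formulation; it uses scikit-learn's standard cost-sensitive SVM, where unequal class weights penalize one type of misclassification more heavily, trading recall for precision on the class of interest. Data and weight values are illustrative assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic two-class problem with class 1 as the class of interest.
X, y = make_classification(n_samples=2000, weights=[0.8, 0.2], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for w in (1, 3, 10):
    # Weighting class-0 errors more makes false positives on class 1
    # costlier, pushing the boundary toward higher class-1 precision.
    clf = SVC(class_weight={0: w, 1: 1}).fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    print(f"class-0 error weight={w:2d}  "
          f"precision={precision_score(y_te, pred, zero_division=0):.3f}  "
          f"recall={recall_score(y_te, pred):.3f}")
```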
Contributors: Koh, Derek (Author) / Runger, George C. (Thesis advisor) / Wu, Tong (Committee member) / Pan, Rong (Committee member) / Cesta, John (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
With the rapid development of mobile sensing technologies such as GPS, RFID, and smartphone sensors, capturing position data in the form of trajectories has become easy. Moving object trajectory analysis is a growing area of interest owing to its applications in domains such as marketing, security, and traffic monitoring and management. To better understand movement behaviors from raw mobility data, this doctoral work provides analytic models for analyzing trajectory data. As a first contribution, a model is developed to detect changes in trajectories over time. If the taxis moving in a city are viewed as sensors that provide real-time information about city traffic, a change in these trajectories over time can reveal that the road network has changed. To detect changes, trajectories are modeled with a Hidden Markov Model (HMM). A modified training algorithm for parameter estimation in HMMs, called m-BaumWelch, is used to develop likelihood estimates under assumed changes and to detect changes in trajectory data over time. Data from vehicles are used to test the method for change detection. Secondly, sequential pattern mining is used to develop a model to detect changes in the frequent patterns occurring in trajectory data. The aim is to answer two questions: Are the frequent patterns still frequent in the new data? If so, has the time interval distribution in the pattern changed? Two approaches are considered for change detection: a frequency-based approach and a distribution-based approach. The methods are illustrated with vehicle trajectory data. Finally, a model is developed for clustering and outlier detection in semantic trajectories. A challenge with clustering semantic trajectories is that both numeric and categorical attributes are present; another is that trajectories can be of different lengths and can have missing values. A tree-based ensemble is used to address these problems, and the approach is extended to outlier detection in semantic trajectories.
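A rough sketch of HMM likelihood-based change detection on trajectory-like data. It uses standard Baum-Welch training via the third-party hmmlearn package rather than the dissertation's m-BaumWelch variant, and the one-dimensional synthetic "routes" are illustrative assumptions:

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM  # assumed third-party dependency

rng = np.random.default_rng(3)

def noisy_route(means, n=300):
    """Hypothetical 1-D stand-in for trajectory observations: emissions
    drawn around a random sequence of route 'states'."""
    states = rng.integers(0, len(means), n)
    return (np.array(means)[states] + 0.3 * rng.standard_normal(n)).reshape(-1, 1)

baseline = noisy_route([0.0, 2.0])    # historical trajectories
unchanged = noisy_route([0.0, 2.0])   # new data, same road network
changed = noisy_route([0.0, 5.0])     # new data after a simulated change

model = GaussianHMM(n_components=2, random_state=0).fit(baseline)

# Per-observation log-likelihood under the baseline model: a sharp drop
# flags a possible change in the underlying movement pattern.
for name, X in [("unchanged", unchanged), ("changed", changed)]:
    print(f"{name}: {model.score(X) / len(X):.3f}")
```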
Contributors: Kondaveeti, Anirudh (Author) / Runger, George C. (Thesis advisor) / Mirchandani, Pitu (Committee member) / Pan, Rong (Committee member) / Maciejewski, Ross (Committee member) / Arizona State University (Publisher)
Created: 2012
Description
Human fingertips contain thousands of specialized mechanoreceptors that enable effortless physical interactions with the environment. Haptic perception capabilities enable grasp and manipulation in the absence of visual feedback, as when reaching into one's pocket or wrapping a belt around oneself. Unfortunately, state-of-the-art artificial tactile sensors and processing algorithms are no match for their biological counterparts. Tactile sensors must not only meet stringent practical specifications for everyday use, but their signals must also be processed and interpreted within hundreds of milliseconds. Control of artificial manipulators, ranging from prosthetic hands to bomb defusal robots, requires a constant reliance on visual feedback that is not entirely practical. To address this, we conducted three studies aimed at advancing artificial haptic intelligence. First, we developed a novel, robust, microfluidic tactile sensor skin capable of measuring normal forces on flat or curved surfaces, such as a fingertip. The sensor consists of microchannels in an elastomer filled with a liquid metal alloy. The fluid serves as both electrical interconnects and tunable capacitive sensing units, and enables functionality despite substantial deformation. The second study investigated the use of a commercially available, multimodal tactile sensor (BioTac sensor, SynTouch) to characterize edge orientation with respect to a body-fixed reference frame, such as a fingertip. Trained on data from a robot testbed, a support vector regression model was developed to relate haptic exploration actions to perception of edge orientation. The model performed comparably to humans in estimating edge orientation. Finally, the robot testbed was used to perceive small, finger-sized geometric features. The efficiency and accuracy of different haptic exploratory procedures and supervised learning models were assessed for estimating feature properties such as type (bump, pit), order of curvature (flat, conical, spherical), and size. This study highlights the importance of tactile sensing in situations where other modalities fail, such as when the finger itself blocks the line of sight. Insights from this work could be used to advance tactile sensor technology and haptic intelligence for artificial manipulators that improve quality of life, such as prosthetic hands and wheelchair-mounted robotic hands.
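A minimal sketch of the support-vector-regression step, mapping tactile features to edge orientation. The synthetic feature generator below merely stands in for real BioTac signals; all names and parameters are illustrative assumptions:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Hypothetical stand-in for tactile features extracted during an
# exploratory stroke; the target is edge orientation in degrees.
rng = np.random.default_rng(4)
angle = rng.uniform(-90, 90, 600)
features = np.column_stack([np.sin(np.radians(angle)),
                            np.cos(np.radians(angle))])
features += 0.05 * rng.standard_normal(features.shape)  # sensor noise

X_tr, X_te, y_tr, y_te = train_test_split(features, angle, random_state=0)
reg = SVR(kernel="rbf", C=10.0).fit(X_tr, y_tr)
print("mean absolute error (deg):",
      mean_absolute_error(y_te, reg.predict(X_te)))
```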
Contributors: Ponce Wong, Ruben Dario (Author) / Santos, Veronica J. (Thesis advisor) / Artemiadis, Panagiotis K. (Committee member) / Helms Tillery, Stephen I. (Committee member) / Posner, Jonathan D. (Committee member) / Runger, George C. (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Identifying important variation patterns is a key step toward identifying the root causes of process variability. This gives rise to a number of challenges. First, the variation patterns might be non-linear in the measured variables, while the existing research literature has focused on linear relationships. Second, it is important to remove noise from the dataset in order to visualize the true nature of the underlying patterns. Third, in addition to visualizing the pattern (the preimage), it is also essential to understand the relevant features that define the process variation pattern. This dissertation addresses these challenges. A base kernel principal component analysis (KPCA) algorithm transforms the measurements to a high-dimensional feature space, where non-linear patterns in the original measurements can be handled through linear methods. However, the principal component subspace in feature space might not be well estimated (especially from noisy training data). An ensemble procedure is constructed in which the final preimage is estimated as the average over bagged samples drawn from the original dataset, attenuating noise in the kernel subspace estimation. This improves the robustness of any base KPCA algorithm. In a second method, successive iterations of denoising a convex combination of the training data and the corresponding denoised preimage are used to produce a more accurate estimate of the actual denoised preimage for noisy training data. The number of primary eigenvectors chosen in each iteration is also decreased at a constant rate, and an efficient stopping criterion is used to limit the number of iterations. A feature selection procedure for KPCA is constructed to find the set of relevant features from noisy training data. Data points are projected onto sparse random vectors; pairs of such projections are then matched, and the differences in variation patterns within pairs are used to identify the relevant features. This approach provides robustness to irrelevant features by calculating the final variation pattern from an ensemble of feature subsets. Experiments are conducted using several simulated as well as real-life data sets, and the proposed methods show significant improvement over competitive methods.
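A small sketch of the bagged KPCA-denoising idea: average the preimages obtained from subspaces estimated on bootstrap samples. It relies on scikit-learn's built-in (ridge-regression) preimage approximation rather than the dissertation's estimator, and the data, kernel, and parameters are illustrative assumptions:

```python
import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(5)

# Noisy nonlinear pattern: points near a half-circle plus Gaussian noise.
theta = rng.uniform(0, np.pi, 300)
X = np.column_stack([np.cos(theta), np.sin(theta)])
X += 0.15 * rng.standard_normal(X.shape)

def denoise(train, query, n_components=2, gamma=2.0):
    """Project onto the kernel principal subspace and map back to input
    space (the preimage), using sklearn's learned inverse transform."""
    kpca = KernelPCA(n_components=n_components, kernel="rbf", gamma=gamma,
                     fit_inverse_transform=True).fit(train)
    return kpca.inverse_transform(kpca.transform(query))

# Bagged ensemble: estimate the subspace on bootstrap samples and average
# the resulting preimages to attenuate noise in any single estimate.
preimages = [denoise(X[rng.integers(0, len(X), len(X))], X)
             for _ in range(25)]
X_denoised = np.mean(preimages, axis=0)
print(X_denoised.shape)  # (300, 2): one denoised preimage per point
```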
Contributors: Sahu, Anshuman (Author) / Runger, George C. (Thesis advisor) / Wu, Teresa (Committee member) / Pan, Rong (Committee member) / Maciejewski, Ross (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Today's competitive markets force companies to constantly engage in the complex task of managing their demand. In make-to-order manufacturing or service systems, the demand for a product is shaped by price and lead times: high price and lead time quotes ensure profitability for the supplier but discourage customers from placing orders, while low prices and lead times generally result in high demand but do not necessarily ensure profitability. The price and lead time quotation problem considers the trade-off between offering high and low prices and lead times. Recent practices in make-to-order manufacturing companies reveal the importance of dynamic quotation strategies, under which the price and lead time quotes flexibly change depending on the status of the system. In this dissertation, the objective is to model a make-to-order manufacturing system and explore various aspects of dynamic quotation strategies, such as the behavior of optimal price and lead time decisions, the impact of customer preferences on optimal decisions, the benefits of employing dynamic quotation in comparison to simpler quotation strategies, and the benefits of coordinating price and lead time decisions. I first consider a manufacturer that receives demand from spot purchasers (who are quoted dynamic prices and lead times), as well as from contract customers who have agreements with the manufacturer with fixed price and lead time terms. I analyze how customer preferences affect the optimal price and lead time decisions, the benefits of dynamic quotation, and the optimal mix of spot purchasers and contract customers. These analyses necessitate computing the expected tardiness of customer orders at the moment a customer enters the system. Hence, in the second part of the dissertation, I develop methodologies to compute the expected tardiness in multi-class priority queues. For the single-class case, a closed-form expression is obtained; for the more complex multi-class case, numerical inverse Laplace transformation algorithms are developed. In the last part of the dissertation, I model a decentralized system with two components: the marketing department determines the price quotes with the objective of maximizing revenues, and the manufacturing department determines the lead time quotes to minimize lateness costs. I discuss the benefits of coordinating price and lead time decisions, and develop an incentivization scheme to reduce the negative impacts of the lack of coordination.
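For intuition on the tardiness computation, consider the simplest single-class setting. Under M/M/1 assumptions (not necessarily those of the dissertation), the FIFO queueing delay W satisfies P(W > t) = ρ e^{-(μ-λ)t}, so E[(W − τ)+] has a closed form; the sketch below checks it against a quick Monte Carlo simulation with illustrative parameters:

```python
import math
import random

def expected_tardiness_mm1(lam, mu, tau):
    """Closed-form E[(W - tau)^+] for the FIFO M/M/1 queueing delay W,
    from the tail integral of P(W > t) = rho * exp(-(mu - lam) * t).
    The multi-class priority case needs numerical Laplace inversion."""
    rho = lam / mu
    return rho * math.exp(-(mu - lam) * tau) / (mu - lam)

lam, mu, tau = 0.8, 1.0, 2.0            # arrival rate, service rate, threshold
rng = random.Random(0)

# Monte Carlo check via the Lindley recursion for a single-server queue.
t, free_at, tardiness = 0.0, 0.0, []
for _ in range(200_000):
    t += rng.expovariate(lam)           # next arrival time
    wait = max(0.0, free_at - t)        # queueing delay of this order
    tardiness.append(max(0.0, wait - tau))
    free_at = max(t, free_at) + rng.expovariate(mu)

print("closed form :", expected_tardiness_mm1(lam, mu, tau))
print("simulation  :", sum(tardiness) / len(tardiness))
```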
Contributors: Hafizoglu, Ahmet Baykal (Author) / Gel, Esma S. (Thesis advisor) / Villalobos, Jesus R. (Committee member) / Mirchandani, Pitu (Committee member) / Keskinocak, Pinar (Committee member) / Runger, George C. (Committee member) / Arizona State University (Publisher)
Created: 2012
Description
Supply chains are increasingly complex as companies branch out into newer products and markets. In many cases, multiple products with moderate differences in performance and price compete for the same unit of demand. Simultaneous occurrences of multiple scenarios (competitive, disruptive, regulatory, economic, etc.), coupled with business decisions (pricing, product introduction, etc.), can drastically change demand structures within a short period of time. Furthermore, product obsolescence and cannibalization are real concerns due to short product life cycles. Analytical tools that can handle this complexity are important for quantifying the impact of business scenarios and decisions on supply chain performance. Traditional analysis methods struggle in this environment, where large, complex datasets with hundreds of features are becoming the norm in supply chains. We present an empirical analysis framework, termed Scenario Trees, that provides a novel representation for impulse and delayed scenario events and a direction for modeling multivariate constrained responses. Among potential learners, supervised learners and feature extraction strategies based on tree-based ensembles are employed to extract the most impactful scenarios and predict their outcomes on metrics at different levels of the product hierarchy. These models provide accurate predictions in modeling environments characterized by incomplete datasets due to product substitution, missing values, outliers, redundant features, mixed variables, and nonlinear interaction effects. Graphical model summaries are generated to aid model understanding. Models in complex environments benefit from feature selection methods that extract non-redundant feature subsets from the data; additional model simplification can be achieved by extracting the specific levels/values that contribute to variable importance. We propose and evaluate new analytical methods to address this problem of feature value selection and study their comparative performance using simulated datasets. We show that supply chain surveillance can be structured as a feature value selection problem. For situations such as new product introduction, a bottom-up approach to scenario analysis is designed using an agent-based simulation and data mining framework. This simulation engine incorporates utility theory, discrete choice models, and diffusion theory, and acts as a test bed for enacting different business scenarios. We demonstrate the use of machine learning algorithms to analyze scenarios and generate graphical summaries to aid decision making.
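A toy illustration of the diffusion-theory component of such an agent-based test bed. This minimal Bass-style model is not the dissertation's simulation engine, which also layers in utility theory and discrete choice models; the agent count and the p, q parameters are illustrative assumptions:

```python
import random

def bass_diffusion(n_agents=10_000, p=0.03, q=0.38, steps=30, seed=0):
    """Agent-based Bass-style diffusion: each period a non-adopter adopts
    with probability p (innovation) + q * adopted_fraction (imitation)."""
    rng = random.Random(seed)
    adopted, path = 0, []
    for _ in range(steps):
        prob = p + q * adopted / n_agents
        adopted += sum(rng.random() < prob for _ in range(n_agents - adopted))
        path.append(adopted)
    return path

# Scenario comparison: stronger word-of-mouth (q) accelerates adoption,
# the kind of what-if such a scenario test bed would enact.
for q in (0.2, 0.4):
    print(f"q={q}: adopters per period {bass_diffusion(q=q, steps=10)}")
```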
Contributors: Shinde, Amit (Author) / Runger, George C. (Thesis advisor) / Montgomery, Douglas C. (Committee member) / Villalobos, Rene (Committee member) / Janakiram, Mani (Committee member) / Arizona State University (Publisher)
Created: 2012