Production scheduling and system configuration for capacitated flow lines with application in the semiconductor backend process

Description

A good production schedule in a semiconductor back-end facility is critical for the on time delivery of customer orders. Compared to the front-end process that is dominated by re-entrant product flows, the back-end process is linear and therefore more suitable for scheduling. However, the production scheduling of the back-end process is still very difficult due to the wide product mix, large number of parallel machines, product family related setups, machine-product qualification, and weekly demand consisting of thousands of lots. In this research, a novel mixed-integer-linear-programming (MILP) model is proposed for the batch production scheduling of a semiconductor back-end facility. In the MILP formulation, the manufacturing process is modeled as a flexible flow line with bottleneck stages, unrelated parallel machines, product family related sequence-independent setups, and product-machine qualification considerations. However, this MILP formulation is difficult to solve for real size problem instances. In a semiconductor back-end facility, production scheduling usually needs to be done every day while considering updated demand forecast for a medium term planning horizon. Due to the limitation on the solvable size of the MILP model, a deterministic scheduling system (DSS), consisting of an optimizer and a scheduler, is proposed to provide sub-optimal solutions in a short time for real size problem instances. The optimizer generates a tentative production plan. Then the scheduler sequences each lot on each individual machine according to the tentative production plan and scheduling rules. Customized factory rules and additional resource constraints are included in the DSS, such as preventive maintenance schedule, setup crew availability, and carrier limitations. Small problem instances are randomly generated to compare the performances of the MILP model and the deterministic scheduling system. 
Then experimental design is applied to understand the behavior of the DSS and identify the best configuration of the DSS under different demand scenarios. Product-machine qualification decisions have long-term and significant impact on production scheduling. A robust product-machine qualification matrix is critical for meeting demand when demand quantity or mix varies. In the second part of this research, a stochastic mixed integer programming model is proposed to balance the tradeoff between current machine qualification costs and future backorder costs with uncertain demand. The L-shaped method and acceleration techniques are proposed to solve the stochastic model. Computational results are provided to compare the performance of different solution methods.

Contributors: Fu, Mengying (Author) / Askin, Ronald G. (Thesis advisor) / Zhang, Muhong (Thesis advisor) / Fowler, John W (Committee member) / Pan, Rong (Committee member) / Sen, Arunabha (Committee member) / Arizona State University (Publisher)

Created: 2011

Integrative analyses of diverse biological data sources

Description

The technology expansion seen in the last decade for genomics research has permitted the generation of large-scale data sources pertaining to molecular biological assays, genomics, proteomics, transcriptomics and other modern omics catalogs. New methods to analyze, integrate and visualize these data types are essential to unveil relevant disease mechanisms. Towards these objectives, this research focuses on data integration within two scenarios: (1) transcriptomic, proteomic and functional information and (2) real-time sensor-based measurements motivated by single-cell technology. To assess relationships between protein abundance, transcriptomic and functional data, a nonlinear model was explored at static and temporal levels. The successful integration of these heterogeneous data sources through the stochastic gradient boosted tree approach and its improved predictability are some highlights of this work. Through the development of an innovative validation subroutine based on a permutation approach and the use of external information (i.e., operons), lack of a priori knowledge for undetected proteins was overcome. The integrative methodologies allowed for the identification of undetected proteins for Desulfovibrio vulgaris and Shewanella oneidensis for further biological exploration in laboratories towards finding functional relationships. In an effort to better understand diseases such as cancer at different developmental stages, the Microscale Life Science Center headquartered at the Arizona State University is pursuing single-cell studies by developing novel technologies. This research arranged and applied a statistical framework that tackled the following challenges: random noise, heterogeneous dynamic systems with multiple states, and understanding cell behavior within and across different Barrett's esophageal epithelial cell lines using oxygen consumption curves. 
These curves were characterized with good empirical fit using nonlinear models with simple structures which allowed extraction of a large number of features. Application of a supervised classification model to these features and the integration of experimental factors allowed for identification of subtle patterns among different cell types visualized through multidimensional scaling. Motivated by the challenges of analyzing real-time measurements, we further explored a unique two-dimensional representation of multiple time series using a wavelet approach which showcased promising results towards less complex approximations. Also, the benefits of external information were explored to improve the image representation.

Contributors: Torres Garcia, Wandaliz (Author) / Meldrum, Deirdre R. (Thesis advisor) / Runger, George C. (Thesis advisor) / Gel, Esma S. (Committee member) / Li, Jing (Committee member) / Zhang, Weiwen (Committee member) / Arizona State University (Publisher)

Created: 2011

System complexity reduction via feature selection

Description

This dissertation transforms a set of system complexity reduction problems to feature selection problems. Three systems are considered: classification based on association rules, network structure learning, and time series classification. Furthermore, two variable importance measures are proposed to reduce the feature selection bias in tree models. Associative classifiers can achieve high accuracy, but the combination of many rules is difficult to interpret. Rule condition subset selection (RCSS) methods for associative classification are considered. RCSS aims to prune the rule conditions into a subset via feature selection. The subset then can be summarized into rule-based classifiers. Experiments show that classifiers after RCSS can substantially improve the classification interpretability without loss of accuracy. An ensemble feature selection method is proposed to learn Markov blankets for either discrete or continuous networks (without linear, Gaussian assumptions). The method is compared to a Bayesian local structure learning algorithm and to alternative feature selection methods in the causal structure learning problem. Feature selection is also used to enhance the interpretability of time series classification. Existing time series classification algorithms (such as nearest-neighbor with dynamic time warping measures) are accurate but difficult to interpret. This research leverages the time-ordering of the data to extract features, and generates an effective and efficient classifier referred to as a time series forest (TSF). The computational complexity of TSF is only linear in the length of time series, and interpretable features can be extracted. These features can be further reduced, and summarized for even better interpretability. Lastly, two variable importance measures are proposed to reduce the feature selection bias in tree-based ensemble models. It is well known that bias can occur when predictor attributes have different numbers of values. 
Two methods are proposed to solve the bias problem. One uses an out-of-bag sampling method called OOBForest, and the other, based on the new concept of a partial permutation test, is called a pForest. Experimental results show the existing methods are not always reliable for multi-valued predictors, while the proposed methods have advantages.

Contributors: Deng, Houtao (Author) / Runger, George C. (Thesis advisor) / Lohr, Sharon L (Committee member) / Pan, Rong (Committee member) / Zhang, Muhong (Committee member) / Arizona State University (Publisher)

Created: 2011

Opportunistic fresh-produce commercialization under two-market disintegration

Description

This thesis develops a low-investment marketing strategy that allows low-to-mid-level farmers to extend their commercialization reach by strategically sending containers of fresh produce items to secondary markets that present temporary arbitrage opportunities. The methodology aims at identifying time windows of opportunity in which the price differential between two markets creates an arbitrage opportunity for a transaction; a transaction involves buying a fresh produce item at a base market, and then shipping and selling it at the secondary market price. A decision-making tool is developed that gauges the individual arbitrage opportunities and determines the specific price differential (or threshold level) that is most beneficial to the farmer under particular market conditions. For this purpose, two approaches are developed: a pragmatic approach that uses historic price information of the products in order to find the optimal price differential that maximizes earnings, and a theoretical one, which optimizes an expected profit model of the shipments to identify this optimal threshold. This thesis also develops risk management strategies that further reduce profit variability during a particular two-market transaction. In this case, financial engineering concepts are used to determine a shipment configuration strategy that minimizes the overall variability of the profits. For this, a Markowitz model is developed to determine the weight assigned to each component of a particular shipment. Based on the results of the analysis, it is deemed possible to formulate a shipment policy that not only increases the farmer's commercialization reach, but also produces profitable operations. In general, the observed rates of return under the pragmatic and theoretical approaches ranged between 0.072 and 0.616 within important two-market structures.
Secondly, it is demonstrated that the level of return and risk can be manipulated by varying the strictness of the shipping policy to meet the overall objectives of the decision-maker. Finally, it was found that one can minimize the risk of a particular two-market transaction by strategically grouping the product shipments.
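
As a rough illustration of the Markowitz step described above, the following sketch computes minimum-variance weights for a shipment that mixes two products. It is not the thesis's model: the two-product restriction, the closed-form solution, the clipping of weights to the range [0, 1], the function names, and the return figures used to exercise it are all assumptions made for illustration.

```python
from statistics import mean, variance


def covariance(xs, ys):
    """Sample covariance of two equally long return series."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def min_variance_weights(returns_a, returns_b):
    """Return (w_a, w_b), the shares of products A and B in a shipment
    that minimize the variance of the combined return.

    Uses the two-asset closed form
        w_a = (var_b - cov_ab) / (var_a + var_b - 2 * cov_ab)
    and clips w_a to [0, 1], since a shipment cannot hold a negative
    quantity of a product (an assumption of this sketch).
    """
    var_a = variance(returns_a)
    var_b = variance(returns_b)
    cov_ab = covariance(returns_a, returns_b)
    denom = var_a + var_b - 2 * cov_ab
    if denom == 0:
        # Both series move identically; any split has the same variance.
        return 0.5, 0.5
    w_a = (var_b - cov_ab) / denom
    w_a = min(1.0, max(0.0, w_a))
    return w_a, 1.0 - w_a
```

When the two return series move in opposite directions, the weights tilt toward the less volatile product, and the combined shipment can have a much lower variance than either product shipped alone.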

Contributors: Flores, Hector M (Author) / Villalobos, Rene (Thesis advisor) / Runger, George C. (Committee member) / Maltz, Arnold (Committee member) / Arizona State University (Publisher)

Created: 2011

A pairwise comparison matrix framework for large-scale decision making

Description

A Pairwise Comparison Matrix (PCM) is used to compute the relative priorities of criteria or alternatives and is an integral component of widely applied decision-making tools: the Analytic Hierarchy Process (AHP) and its generalized form, the Analytic Network Process (ANP). However, a PCM suffers from several issues limiting its application to large-scale decision problems, specifically: (1) the curse of dimensionality, that is, a large number of pairwise comparisons need to be elicited from a decision maker (DM), and (2) inconsistent and (3) imprecise preferences may be obtained due to the limited cognitive power of DMs. This dissertation proposes a PCM Framework for Large-Scale Decisions to address these limitations in three phases as follows. The first phase proposes a binary integer program (BIP) to intelligently decompose a PCM into several mutually exclusive subsets using interdependence scores. As a result, the number of pairwise comparisons is reduced and the consistency of the PCM is improved. Since the subsets are disjoint, the most independent pivot element is identified to connect all subsets. This is done to derive the global weights of the elements from the original PCM. The proposed BIP is applied to both AHP and ANP methodologies. However, it is noted that the optimal number of subsets is provided subjectively by the DM and hence is subject to biases and judgement errors. The second phase proposes a trade-off PCM decomposition methodology to decompose a PCM into a number of optimally identified subsets. A BIP is proposed to balance (1) the time savings from reducing pairwise comparisons, (2) the level of PCM inconsistency, and (3) the accuracy of the weights. The proposed methodology is applied to the AHP to demonstrate its advantages and is compared to established methodologies. In the third phase, a beta distribution is proposed to generalize a wide variety of imprecise pairwise comparison distributions via a method of moments methodology.
A Non-Linear Programming model is then developed that calculates PCM element weights that simultaneously maximize the preferences of the DM and minimize the inconsistency. Comparison experiments are conducted using datasets collected from literature to validate the proposed methodology.
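
For readers unfamiliar with how a PCM yields priorities, the following sketch derives weights from a small PCM and measures its inconsistency. It does not implement the dissertation's BIP or Non-Linear Programming models: it uses the row geometric mean, a common approximation to the principal-eigenvector weights in AHP, together with the standard consistency index CI = (lambda_max - n) / (n - 1). The function names and the example matrix are assumptions chosen for illustration.

```python
import math


def priority_weights(pcm):
    """Approximate AHP priority weights by the row geometric mean.

    pcm[i][j] states how strongly element i is preferred to element j,
    with pcm[j][i] expected to be 1 / pcm[i][j].
    """
    n = len(pcm)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pcm]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_index(pcm, weights):
    """Consistency index CI = (lambda_max - n) / (n - 1).

    lambda_max is estimated as the average of (A w)_i / w_i.
    CI is 0 for a perfectly consistent matrix and grows with inconsistency.
    """
    n = len(pcm)
    ratios = [
        sum(pcm[i][j] * weights[j] for j in range(n)) / weights[i]
        for i in range(n)
    ]
    lambda_max = sum(ratios) / n
    return (lambda_max - n) / (n - 1)


# A perfectly consistent 3x3 PCM built from the weights 0.6, 0.3, 0.1.
example_pcm = [
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]
example_weights = priority_weights(example_pcm)
example_ci = consistency_index(example_pcm, example_weights)
```

A full PCM over n elements requires n(n - 1)/2 elicited comparisons, which is why decomposing it into smaller subsets reduces the burden on the DM.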

Contributors: Jalao, Eugene Rex Lazaro (Author) / Shunk, Dan L. (Thesis advisor) / Wu, Teresa (Thesis advisor) / Askin, Ronald G. (Committee member) / Goul, Kenneth M (Committee member) / Arizona State University (Publisher)

Created: 2013

Learning from asymmetric models and matched pairs

Description

With the increase in computing power and availability of data, there has never been a greater need to understand data and make decisions from it. Traditional statistical techniques may not be adequate to handle the size of today's data or the complexities of the information hidden within the data. Thus, knowledge discovery by machine learning techniques is necessary if we want to better understand information from data. In this dissertation, we explore the topics of asymmetric loss and asymmetric data in machine learning and propose new algorithms as solutions to some of the problems in these topics. We also study variable selection for matched data sets and propose a solution when there is non-linearity in the matched data. The research is divided into three parts. The first part addresses the problem of asymmetric loss. A proposed asymmetric support vector machine (aSVM) is used to predict specific classes with high accuracy. aSVM was shown to produce higher precision than a regular SVM. The second part addresses asymmetric data sets where variables are only predictive for a subset of the predictor classes. Asymmetric Random Forest (ARF) was proposed to detect these kinds of variables. The third part explores variable selection for matched data sets. Matched Random Forest (MRF) was proposed to find variables that are able to distinguish case and control without the restrictions that exist in linear models. MRF detects variables that are able to distinguish case and control even in the presence of interaction and qualitative variables.

Contributors: Koh, Derek (Author) / Runger, George C. (Thesis advisor) / Wu, Tong (Committee member) / Pan, Rong (Committee member) / Cesta, John (Committee member) / Arizona State University (Publisher)

Created: 2013

Spatio-temporal data mining to detect changes and clusters in trajectories

Description

With the rapid development of mobile sensing technologies like GPS, RFID, sensors in smartphones, etc., capturing position data in the form of trajectories has become easy. Moving object trajectory analysis is a growing area of interest these days owing to its applications in various domains such as marketing, security, traffic monitoring and management, etc. To better understand movement behaviors from the raw mobility data, this doctoral work provides analytic models for analyzing trajectory data. As a first contribution, a model is developed to detect changes in trajectories with time. If the taxis moving in a city are viewed as sensors that provide real time information of the traffic in the city, a change in these trajectories with time can reveal that the road network has changed. To detect changes, trajectories are modeled with a Hidden Markov Model (HMM). A modified training algorithm, for parameter estimation in HMM, called m-BaumWelch, is used to develop likelihood estimates under assumed changes and used to detect changes in trajectory data with time. Data from vehicles are used to test the method for change detection. Secondly, sequential pattern mining is used to develop a model to detect changes in frequent patterns occurring in trajectory data. The aim is to answer two questions: Are the frequent patterns still frequent in the new data? If they are frequent, has the time interval distribution in the pattern changed? Two different approaches are considered for change detection, frequency-based approach and distribution-based approach. The methods are illustrated with vehicle trajectory data. Finally, a model is developed for clustering and outlier detection in semantic trajectories. A challenge with clustering semantic trajectories is that both numeric and categorical attributes are present. Another problem to be addressed while clustering is that trajectories can be of different lengths and also have missing values. 
A tree-based ensemble is used to address these problems. The approach is extended to outlier detection in semantic trajectories.
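
The likelihood-based change detection described above rests on scoring an observation sequence under an HMM. The following sketch shows that scoring step with the standard scaled forward algorithm. It is not the m-BaumWelch training procedure from the dissertation, which is not reproduced here; the function name, the discrete-emission setup, and the idea of comparing log-likelihoods under two candidate models are assumptions made for illustration.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a discrete observation sequence under an HMM.

    obs   : list of observation indices (e.g. discretized road segments)
    start : start[s] is the probability of beginning in state s
    trans : trans[r][s] is the probability of moving from state r to s
    emit  : emit[s][o] is the probability of emitting symbol o in state s

    Uses the forward algorithm with per-step rescaling so that long
    trajectories do not underflow.
    """
    n_states = len(start)
    log_lik = 0.0
    alpha = []
    for t, symbol in enumerate(obs):
        if t == 0:
            alpha = [start[s] * emit[s][symbol] for s in range(n_states)]
        else:
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states))
                * emit[s][symbol]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            # The sequence is impossible under this model.
            return float("-inf")
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik
```

One way to use this, under the assumptions above, is to score new trajectories under a model fitted to older data and under a model fitted to recent data; a large gap between the two log-likelihoods suggests that movement behavior has changed.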

Contributors: Kondaveeti, Anirudh (Author) / Runger, George C. (Thesis advisor) / Mirchandani, Pitu (Committee member) / Pan, Rong (Committee member) / Maciejewski, Ross (Committee member) / Arizona State University (Publisher)

Created: 2012

An agent-based optimization framework for engineered complex adaptive systems with application to demand response in electricity markets

Description

The main objective of this research is to develop an integrated method to study emergent behavior and consequences of evolution and adaptation in engineered complex adaptive systems (ECASs). A multi-layer conceptual framework and modeling approach including behavioral and structural aspects is provided to describe the structure of a class of engineered complex systems and predict their future adaptive patterns. The approach allows the examination of complexity in the structure and the behavior of components as a result of their connections and in relation to their environment. This research describes and uses the major differences between natural complex adaptive systems (CASs) and artificial/engineered CASs to build a framework and platform for ECASs. While this framework focuses on the critical factors of an engineered system, it also enables one to synthetically employ engineering and mathematical models to analyze and measure complexity in such systems. In this way, concepts of complex systems science are adapted to management science and system of systems engineering. In particular, an integrated consumer-based optimization and agent-based modeling (ABM) platform is presented that enables managers to predict and partially control patterns of behaviors in ECASs. Demonstrated on the U.S. electricity markets, ABM is integrated with normative and subjective decision behavior recommended by the U.S. Department of Energy (DOE) and Federal Energy Regulatory Commission (FERC). The approach integrates social networks, social science, complexity theory, and diffusion theory. Furthermore, it makes a unique and significant contribution in exploring and representing concrete managerial insights for ECASs and offering new optimized actions and modeling paradigms in agent-based simulation.

Contributors: Haghnevis, Moeed (Author) / Askin, Ronald G. (Thesis advisor) / Armbruster, Dieter (Thesis advisor) / Mirchandani, Pitu (Committee member) / Wu, Tong (Committee member) / Hedman, Kory (Committee member) / Arizona State University (Publisher)

Created: 2013

Non-linear variation patterns and kernel preimages

Description

Identifying important variation patterns is a key step to identifying root causes of process variability. This gives rise to a number of challenges. First, the variation patterns might be non-linear in the measured variables, while the existing research literature has focused on linear relationships. Second, it is important to remove noise from the dataset in order to visualize the true nature of the underlying patterns. Third, in addition to visualizing the pattern (preimage), it is also essential to understand the relevant features that define the process variation pattern. This dissertation considers these variation challenges. A base kernel principal component analysis (KPCA) algorithm transforms the measurements to a high-dimensional feature space where non-linear patterns in the original measurement can be handled through linear methods. However, the principal component subspace in feature space might not be well estimated (especially from noisy training data). An ensemble procedure is constructed where the final preimage is estimated as the average from bagged samples drawn from the original dataset to attenuate noise in kernel subspace estimation. This improves the robustness of any base KPCA algorithm. In a second method, successive iterations of denoising a convex combination of the training data and the corresponding denoised preimage are used to produce a more accurate estimate of the actual denoised preimage for noisy training data. The number of primary eigenvectors chosen in each iteration is also decreased at a constant rate. An efficient stopping rule criterion is used to reduce the number of iterations. A feature selection procedure for KPCA is constructed to find the set of relevant features from noisy training data. Data points are projected onto sparse random vectors. Pairs of such projections are then matched, and the differences in variation patterns within pairs are used to identify the relevant features. 
This approach provides robustness to irrelevant features by calculating the final variation pattern from an ensemble of feature subsets. Experiments are conducted using several simulated as well as real-life data sets. The proposed methods show significant improvement over the competitive methods.

Contributors: Sahu, Anshuman (Author) / Runger, George C. (Thesis advisor) / Wu, Teresa (Committee member) / Pan, Rong (Committee member) / Maciejewski, Ross (Committee member) / Arizona State University (Publisher)

Created: 2013

Adaptive operation decisions for a system of smart buildings

Description

Buildings (approximately half commercial and half residential) consume over 70% of the electricity among all the consumption units in the United States. Buildings are also responsible for approximately 40% of CO2 emissions, which is more than any other industry sector. As a result, the smart building initiative, which aims not only to manage electrical consumption in an efficient way but also to reduce the damaging effect of greenhouse gases on the environment, has been launched. Another important technology being promoted by government agencies is the smart grid, which manages energy usage across a wide range of buildings in an effort to reduce cost and increase reliability and transparency. Although a great deal of effort has been devoted to these two initiatives, either by exploring smart grid designs or by developing technologies for smart buildings, research on how smart buildings and the smart grid can coordinate to use energy more efficiently is currently lacking. In this dissertation, a "system-of-system" approach is employed to develop an integrated building model which consists of a number of buildings (a building cluster) interacting with the smart grid. The buildings can function both as energy consumption units and as energy generation/storage units. Memetic Algorithm (MA) and Particle Swarm Optimization (PSO) based decision frameworks are developed for building operation decisions. In addition, the Particle Filter (PF) is explored as a means for fusing online sensor and meter data so that adaptive decisions can be made in response to a dynamic environment. The dissertation is divided into three inter-connected research components. First, an integrated building energy model including building consumption, storage, and generation sub-systems for the building cluster is developed. Then a bi-level Memetic Algorithm (MA) based decentralized decision framework is developed to identify the Pareto optimal operation strategies for the building cluster.
The Pareto solutions not only enable multiple dimensional tradeoff analysis, but also provide valuable insight for determining pricing mechanisms and power grid capacity. Secondly, a multi-objective PSO based decision framework is developed to reduce the computational effort of the MA based decision framework without sacrificing accuracy. With the improved performance, the decision time scale could be refined to make it suitable for hourly operation decisions. Finally, by integrating the multi-objective PSO based decision framework with PF, an adaptive framework is developed for adaptive operation decisions for the smart building cluster. The adaptive framework not only enables the development of a high-fidelity decision model but also enables the building cluster to respond to the dynamics and uncertainties inherent in the system.

Contributors: Hu, Mengqi (Author) / Wu, Teresa (Thesis advisor) / Weir, Jeffery (Thesis advisor) / Wen, Jin (Committee member) / Fowler, John (Committee member) / Shunk, Dan (Committee member) / Arizona State University (Publisher)

Created: 2012
