Matching Items (15)

System complexity reduction via feature selection

Description

This dissertation transforms a set of system complexity reduction problems to feature selection problems. Three systems are considered: classification based on association rules, network structure learning, and time series classification. Furthermore, two variable importance measures are proposed to reduce the feature selection bias in tree models. Associative classifiers can achieve high accuracy, but the combination of many rules is difficult to interpret. Rule condition subset selection (RCSS) methods for associative classification are considered. RCSS aims to prune the rule conditions into a subset via feature selection. The subset then can be summarized into rule-based classifiers. Experiments show that classifiers after RCSS can substantially improve the classification interpretability without loss of accuracy. An ensemble feature selection method is proposed to learn Markov blankets for either discrete or continuous networks (without linear, Gaussian assumptions). The method is compared to a Bayesian local structure learning algorithm and to alternative feature selection methods in the causal structure learning problem. Feature selection is also used to enhance the interpretability of time series classification. Existing time series classification algorithms (such as nearest-neighbor with dynamic time warping measures) are accurate but difficult to interpret. This research leverages the time-ordering of the data to extract features, and generates an effective and efficient classifier referred to as a time series forest (TSF). The computational complexity of TSF is only linear in the length of time series, and interpretable features can be extracted. These features can be further reduced, and summarized for even better interpretability. Lastly, two variable importance measures are proposed to reduce the feature selection bias in tree-based ensemble models. It is well known that bias can occur when predictor attributes have different numbers of values. 
Two methods are proposed to solve the bias problem. One uses an out-of-bag sampling method called OOBForest, and the other, based on the new concept of a partial permutation test, is called a pForest. Experimental results show the existing methods are not always reliable for multi-valued predictors, while the proposed methods have advantages.

Date Created
2011

Production scheduling and system configuration for capacitated flow lines with application in the semiconductor backend process

Description

A good production schedule in a semiconductor back-end facility is critical for the on-time delivery of customer orders. Compared to the front-end process that is dominated by re-entrant product flows, the back-end process is linear and therefore more suitable for scheduling. However, the production scheduling of the back-end process is still very difficult due to the wide product mix, large number of parallel machines, product-family-related setups, machine-product qualification, and weekly demand consisting of thousands of lots. In this research, a novel mixed-integer linear programming (MILP) model is proposed for the batch production scheduling of a semiconductor back-end facility. In the MILP formulation, the manufacturing process is modeled as a flexible flow line with bottleneck stages, unrelated parallel machines, product-family-related sequence-independent setups, and product-machine qualification considerations. However, this MILP formulation is difficult to solve for real-size problem instances. In a semiconductor back-end facility, production scheduling usually needs to be done every day while considering updated demand forecasts for a medium-term planning horizon. Due to the limitation on the solvable size of the MILP model, a deterministic scheduling system (DSS), consisting of an optimizer and a scheduler, is proposed to provide sub-optimal solutions in a short time for real-size problem instances. The optimizer generates a tentative production plan. Then the scheduler sequences each lot on each individual machine according to the tentative production plan and scheduling rules. Customized factory rules and additional resource constraints are included in the DSS, such as preventive maintenance schedules, setup crew availability, and carrier limitations. Small problem instances are randomly generated to compare the performances of the MILP model and the deterministic scheduling system.
Then experimental design is applied to understand the behavior of the DSS and identify the best configuration of the DSS under different demand scenarios. Product-machine qualification decisions have a long-term and significant impact on production scheduling. A robust product-machine qualification matrix is critical for meeting demand when demand quantity or mix varies. In the second part of this research, a stochastic mixed integer programming model is proposed to balance the tradeoff between current machine qualification costs and future backorder costs with uncertain demand. The L-shaped method and acceleration techniques are proposed to solve the stochastic model. Computational results are provided to compare the performance of different solution methods.

Date Created
2011

A P-value based approach for phase II profile monitoring

Description

A P-value based method is proposed for statistical monitoring of various types of profiles in phase II. The performance of the proposed method is evaluated by the average run length criterion under various shifts in the intercept, slope and error standard deviation of the model. In our proposed approach, P-values are computed at each level within a sample. If at least one of the P-values is less than a pre-specified significance level, the chart signals out-of-control. The primary advantage of our approach is that only one control chart is required to monitor several parameters simultaneously: the intercept, slope(s), and the error standard deviation. A comprehensive comparison of the proposed method and the existing KMW-Shewhart method for monitoring linear profiles is conducted. In addition, the effect that the number of observations within a sample has on the performance of the proposed method is investigated. The proposed method was also compared to the T^2 method discussed in Kang and Albin (2000) for multivariate, polynomial, and nonlinear profiles. A simulation study shows that overall the proposed P-value method performs satisfactorily for different profile types.
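
The signalling rule in this description can be sketched in a few lines of Python. This is an illustration only, not the thesis's procedure: it assumes a simple linear profile with a known in-control intercept, slope, and error standard deviation, and it assumes each P-value comes from a two-sided normal test on the standardized residual at that level. The names `two_sided_p_value` and `profile_signals` and the default `alpha` are hypothetical.

```python
import math


def two_sided_p_value(z):
    """Two-sided P-value of a standard normal statistic z."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def profile_signals(xs, ys, intercept, slope, sigma, alpha=0.005):
    """Return (signal, p_values) for one sampled profile.

    Assumption: every observation is compared with the in-control line
    y = intercept + slope * x, with known error standard deviation sigma.
    The chart signals when at least one P-value falls below alpha.
    """
    p_values = [
        two_sided_p_value((y - (intercept + slope * x)) / sigma)
        for x, y in zip(xs, ys)
    ]
    return any(p < alpha for p in p_values), p_values
```

A single chart of the smallest P-value per sample is then enough to flag a shift in any of the monitored parameters, which is the advantage the description claims.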

Date Created
2013

Routing and scheduling of electric and alternative-fuel vehicles

Description

Vehicles powered by electricity and alternative fuels are becoming a more popular form of transportation since they have less of an environmental impact than standard gasoline vehicles. Unfortunately, their success is currently inhibited by the sparseness of locations where the vehicles can refuel as well as the fact that many of the vehicles have a shorter range than those powered by gasoline. These factors together create "range anxiety" in drivers, which causes the drivers to worry about the utility of alternative-fuel and electric vehicles and makes them less likely to purchase these vehicles. For the new vehicle technologies to thrive it is critical that range anxiety is minimized and performance is increased as much as possible through proper routing and scheduling. In the case of long distance trips taken by individual vehicles, the routes must be chosen such that the vehicles take the shortest routes while not running out of fuel on the trip. When many vehicles are to be routed during the day, if the refueling stations have limited capacity then care must be taken to avoid having too many vehicles arrive at the stations at any time. If the vehicles that will need to be routed in the future are unknown then this problem is stochastic. For fleets of vehicles serving scheduled operations, switching to alternative fuels requires ensuring the schedules do not cause the vehicles to run out of fuel. This is especially problematic since the locations where the vehicles may refuel are limited due to the technology being new. This dissertation covers three related optimization problems: routing a single electric or alternative-fuel vehicle on a long distance trip, routing many electric vehicles in a network where the stations have limited capacity and the arrivals into the system are stochastic, and scheduling fleets of electric or alternative-fuel vehicles with limited locations to refuel. 
Different algorithms are proposed to solve each of the three problems, of which some are exact and some are heuristic. The algorithms are tested on both random data and data relating to the State of Arizona.
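
To make the first of the three problems concrete, the sketch below routes a single vehicle along the shortest path that never exceeds its range. It is not one of the dissertation's algorithms. It is a plain Dijkstra search over (node, remaining range) states, under the assumptions that the vehicle starts with a full tank or charge and that refueling at a station is instantaneous and always restores the full range. The function name and the graph encoding are hypothetical.

```python
import heapq


def shortest_feasible_route(graph, stations, source, target, max_range):
    """Length of the shortest source-target route that never runs out of fuel.

    graph:     {node: [(neighbor, distance), ...]} with non-negative distances
    stations:  set of nodes where the vehicle refuels to max_range (assumed)
    Returns None when no feasible route exists.
    """
    best = {(source, max_range): 0}
    heap = [(0, source, max_range)]
    while heap:
        dist, node, fuel = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get((node, fuel), float("inf")):
            continue  # stale heap entry
        for neighbor, length in graph.get(node, []):
            if length > fuel:
                continue  # cannot reach this neighbor on the remaining range
            remaining = max_range if neighbor in stations else fuel - length
            new_dist = dist + length
            if new_dist < best.get((neighbor, remaining), float("inf")):
                best[(neighbor, remaining)] = new_dist
                heapq.heappush(heap, (new_dist, neighbor, remaining))
    return None
```

With a short range, the search accepts a longer route through a station over a direct route that would strand the vehicle, which is the trade-off the description points to.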

Date Created
2014

Fix-and-optimize heuristic and MP-based approaches for capacitated lot sizing problem with setup carryover, setup splitting and backlogging

Description

In this thesis, a single-level, multi-item capacitated lot sizing problem with setup carryover, setup splitting and backlogging is investigated. This problem typically arises in the tactical and operational planning stage, where the optimal production quantities and sequencing are determined for all the products in the planning horizon. Although capacitated lot sizing problems with many different features have been investigated by researchers, the simultaneous consideration of setup carryover and setup splitting is relatively new. This consideration helps reduce costs and produce feasible production schedules. Setup carryover allows the production setup to be continued between two adjacent periods without incurring extra setup costs and setup times. Setup splitting permits the setup to be partially finished in one period and continued in the next period, utilizing the capacity more efficiently and removing infeasibility from the production schedule.

The approach proceeds as follows. First, the simple plant location formulation is adopted to reformulate the original model. Furthermore, an extended formulation is developed by redefining the idle period constraints to make the formulation tighter. Then, for the purpose of evaluating the quality of the heuristic solutions, three types of valid inequalities are added to the model. A fix-and-optimize heuristic with two-stage product decomposition and period decomposition strategies is proposed to solve the formulation. This generic heuristic solves for a small portion of the binary variables and all the continuous variables rapidly in each subproblem. In addition, the case with demand backlogging is incorporated to demonstrate that adding assumptions to the basic formulation does not require completely altering the heuristic.

The contribution of this thesis includes several aspects. The computational results show the capability, flexibility and effectiveness of the approaches. The average optimality gap is 6% for data without backlogging and 8% for data with backlogging. In addition, when backlogging is not allowed, the performance of the fix-and-optimize heuristic is stable regardless of period length, which is an advantage when using the approach to plan longer production schedules. Furthermore, the performance of the proposed solution approaches is analyzed so that later research on similar topics can compare results obtained with different solution strategies.

Date Created
2015

Product design optimization under epistemic uncertainty

Description

This dissertation addresses product design optimization, including reliability-based design optimization (RBDO) and robust design, under epistemic uncertainty. It is divided into four major components as outlined below. Firstly, a comprehensive study of uncertainties is performed, in which sources of uncertainty are listed and categorized, and their impacts are discussed. Epistemic uncertainty, which is due to lack of knowledge and can be reduced by taking more observations, is of interest. In particular, the strategies to address epistemic uncertainties due to implicit constraint functions are discussed. Secondly, a sequential sampling strategy to improve RBDO under an implicit constraint function is developed. In modern engineering design, an RBDO task is often performed by a computer simulation program, which can be treated as a black box, as its analytical function is implicit. An efficient sampling strategy for learning the probabilistic constraint function under the design optimization framework is presented. The method is a sequential experimentation around the approximate most probable point (MPP) at each step of the optimization process. It is compared with the methods of MPP-based sampling, lifted surrogate function, and non-sequential random sampling. Thirdly, a particle splitting-based reliability analysis approach is developed in design optimization. In reliability analysis, traditional simulation methods such as Monte Carlo simulation may provide accurate results, but are often accompanied by high computational cost. To increase the efficiency, particle splitting is integrated into RBDO. It is an improvement of subset simulation with multiple particles to enhance the diversity and stability of simulation samples. This method is further extended to address problems with multiple probabilistic constraints and compared with the MPP-based methods. 
Finally, a reliability-based robust design optimization (RBRDO) framework is provided to integrate the consideration of design reliability and design robustness simultaneously. The quality loss objective in robust design and the production cost in RBDO are used together to formulate a multi-objective optimization problem. With the epistemic uncertainty from the implicit performance function, the sequential sampling strategy is extended to RBRDO, and a combined metamodel is proposed to tackle both controllable variables and uncontrollable variables. The solution is a Pareto frontier, in contrast to the single optimal solution in RBDO.
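
The cost of the traditional simulation baseline mentioned in this description is easy to see in a crude Monte Carlo estimator of a failure probability. The sketch below is that baseline only. It is not the particle splitting or sequential sampling method of the dissertation. The limit-state convention (failure when the function is negative), the function names, and the capacity/load example with its normal distributions are all assumptions made for illustration.

```python
import random


def estimate_failure_probability(limit_state, draw_inputs, n_samples=100_000, seed=0):
    """Crude Monte Carlo estimate of P(limit_state(inputs) < 0).

    limit_state: black-box performance function; negative values mean failure
    draw_inputs: callable taking a random.Random and returning one input tuple
    """
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_samples):
        if limit_state(*draw_inputs(rng)) < 0.0:
            failures += 1
    return failures / n_samples


# Hypothetical example: a part fails when the load exceeds its capacity.
def draw_capacity_and_load(rng):
    return rng.gauss(10.0, 1.0), rng.gauss(7.0, 1.0)


def capacity_minus_load(capacity, load):
    return capacity - load
```

Every sample costs one evaluation of the black-box function, and rare failures need many samples before the estimate stabilizes. That is the computational burden that variance-reduction schemes such as subset simulation aim to lower.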

Date Created
2012

Single machine scheduling: comparison of MIP formulations and heuristics for interfering job sets

Description

This research studies the computational performance of four different mixed integer programming (MIP) formulations for single machine scheduling problems with varying complexity. These formulations are based on (1) start and completion time variables, (2) time index variables, (3) linear ordering variables and (4) assignment and positional date variables. The objective functions that are studied in this paper are total weighted completion time, maximum lateness, number of tardy jobs and total weighted tardiness. Based on the computational results, discussion and recommendations are made on which MIP formulation might work best for these problems. The performances of these formulations very much depend on the objective function, number of jobs and the sum of the processing times of all the jobs. Two sets of inequalities are presented that can be used to improve the performance of the formulation with assignment and positional date variables. Further, this research is extended to single machine bicriteria scheduling problems in which jobs belong to either of two different disjoint sets, each set having its own performance measure. These problems have been referred to as interfering job sets in the scheduling literature and have also been called multi-agent scheduling, where each agent's objective function is to be minimized. In the first single machine interfering job sets problem (P1), the criteria of minimizing total completion time and number of tardy jobs for the two sets of jobs are studied. A Forward SPT-EDD heuristic is presented that attempts to generate a set of non-dominated solutions. The complexity of this specific problem is NP-hard. The computational efficiency of the heuristic is compared against the pseudo-polynomial algorithm proposed by Ng et al. [2006]. In the second single machine interfering job sets problem (P2), the criteria of minimizing total weighted completion time and maximum lateness are studied. 
This is an established NP-hard problem for which a Forward WSPT-EDD heuristic is presented that attempts to generate a set of supported points, and the solution quality is compared with MIP formulations. For both of these problems, all jobs are available at time zero and the jobs are not allowed to be preempted.
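
The heuristic names refer to two classical single machine dispatching rules: weighted shortest processing time (WSPT), which minimizes total weighted completion time, and earliest due date (EDD), which minimizes maximum lateness. The sketch below shows only those two rules and their objective values. It is not the Forward SPT-EDD or Forward WSPT-EDD heuristic from the thesis, and the job encoding with keys `p`, `w`, and `d` is a hypothetical choice.

```python
def wspt_order(jobs):
    """Sequence jobs by non-decreasing processing time / weight (Smith's rule)."""
    return sorted(jobs, key=lambda job: job["p"] / job["w"])


def edd_order(jobs):
    """Sequence jobs by non-decreasing due date."""
    return sorted(jobs, key=lambda job: job["d"])


def total_weighted_completion_time(sequence):
    """Sum of weight * completion time, with all jobs available at time zero."""
    clock, total = 0, 0
    for job in sequence:
        clock += job["p"]
        total += job["w"] * clock
    return total


def max_lateness(sequence):
    """Largest value of completion time minus due date over the sequence."""
    clock, worst = 0, float("-inf")
    for job in sequence:
        clock += job["p"]
        worst = max(worst, clock - job["d"])
    return worst


# Hypothetical three-job instance: p = processing time, w = weight, d = due date.
jobs = [
    {"name": "A", "p": 3, "w": 1, "d": 4},
    {"name": "B", "p": 1, "w": 2, "d": 6},
    {"name": "C", "p": 2, "w": 2, "d": 3},
]
```

On this instance WSPT gives the order B, C, A and EDD gives C, A, B, so the two criteria pull the sequence in different directions. That conflict is what makes the interfering job sets problems bicriteria.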

Date Created
2012

Optimization of surgery delivery systems

Description

Optimization of surgical operations is a challenging managerial problem for surgical suite directors. This dissertation presents modeling and solution techniques for operating room (OR) planning and scheduling problems. First, several sequencing and patient appointment time setting heuristics are proposed for scheduling an Outpatient Procedure Center. A discrete event simulation model is used to evaluate how scheduling heuristics perform with respect to the competing criteria of expected patient waiting time and expected surgical suite overtime for a single day compared to current practice. Next, a bi-criteria Genetic Algorithm is used to determine if better solutions can be obtained for this single day scheduling problem. The efficacy of the bi-criteria Genetic Algorithm, when surgeries are allowed to be moved to other days, is investigated. Numerical experiments based on real data from a large health care provider are presented. The analysis provides insight into the best scheduling heuristics, and the tradeoff between patient and health care provider based criteria. Second, a multi-stage stochastic mixed integer programming formulation for the allocation of surgeries to ORs over a finite planning horizon is studied. The demand for surgery and surgical duration are random variables. The objective is to minimize two competing criteria: expected surgery cancellations and OR overtime. A decomposition method, Progressive Hedging, is implemented to find near optimal surgery plans. Finally, properties of the model are discussed and methods are proposed to improve the performance of the algorithm based on the special structure of the model. It is found simple rules can improve schedules used in practice. Sequencing surgeries from the longest to shortest mean duration causes high expected overtime, and should be avoided, while sequencing from the shortest to longest mean duration performed quite well in our experiments. 
Expending greater computational effort with more sophisticated optimization methods does not lead to substantial improvements. However, controlling daily procedure mix may achieve substantial improvements in performance. A novel stochastic programming model for a dynamic surgery planning problem is proposed in the dissertation. The efficacy of the progressive hedging algorithm is investigated. It is found that there is a significant correlation between the performance of the algorithm and the type and number of scenario bundles in a problem instance. The computational time spent to solve scenario subproblems is among the most significant factors that impact the performance of the algorithm. The quality of the solutions can be improved by detecting and preventing cyclical behaviors.

Date Created
2010

Locating counting sensors in traffic network to estimate origin-destination volumes

Description

Improving the quality of Origin-Destination (OD) demand estimates increases the effectiveness of design, evaluation and implementation of traffic planning and management systems. The associated bilevel Sensor Location Flow-Estimation problem considers two important research questions: (1) how to compute the best estimates of the flows of interest by using anticipated data from given candidate sensor locations; and (2) how to decide on the optimum subset of links where sensors should be located. In this dissertation, a decision framework is developed to optimally locate sensors and obtain high quality OD volume estimates in vehicular traffic networks. The framework includes a traffic assignment model to load the OD traffic volumes on routes in a known choice set, a sensor location model to decide on which subset of links to locate counting sensors to observe traffic volumes, and an estimation model to obtain best estimates of OD or route flow volumes. The dissertation first addresses the deterministic route flow estimation problem given a priori knowledge of route flows and their uncertainties. Two procedures are developed to locate "perfect" and "noisy" sensors respectively. Next, it addresses a stochastic route flow estimation problem. A hierarchical linear Bayesian model is developed, where the real route flows are assumed to be generated from a Multivariate Normal distribution with two parameters: "mean" and "variance-covariance matrix". The prior knowledge for the "mean" parameter is described by a probability distribution. When the "variance-covariance matrix" parameter is assumed known, a Bayesian A-optimal design is developed. When the "variance-covariance matrix" parameter is unknown, a Markov Chain Monte Carlo approach is used to estimate the a posteriori quantities. In all the sensor location models, the objective is the maximization of the reduction in the variances of the distribution of the estimates of the OD volume. 
The developed models are compared with other available models in the literature. The comparison shows that the developed models perform better than the available models.

Date Created
2013

Surgical instrument reprocessing in a hospital setting analyzed with statistical process control and data mining techniques

Description

In a healthcare setting, the Sterile Processing Department (SPD) provides ancillary services to the Operating Room (OR), Emergency Room, Labor & Delivery, and off-site clinics. SPD's function is to reprocess reusable surgical instruments and return them to their home departments. The management of surgical instruments and medical devices can impact patient safety and hospital revenue. Any time instrumentation or devices are not available or are not fit for use, patient safety and revenue can be negatively impacted. One step of the instrument reprocessing cycle is sterilization. Steam sterilization is the sterilization method used for the majority of surgical instruments and is preferred to immediate-use steam sterilization (IUSS) because terminally sterilized items can be stored until needed. IUSS items must be used promptly and cannot be stored for later use. IUSS is intended for emergency situations and not as a regular course of action. Unfortunately, IUSS is used to compensate for inadequate inventory levels, scheduling conflicts, and miscommunications. If IUSS is viewed as an adverse event, then monitoring IUSS incidences can help healthcare organizations meet patient safety goals and financial goals along with aiding in process improvement efforts. This work recommends applying statistical process control methods to IUSS incidents and illustrates the use of control charts for IUSS occurrences through a case study and analysis of the control charts for data from a health care provider. Furthermore, this work considers the application of data mining methods to IUSS occurrences and presents a representative example. This extends the application of statistical process control and data mining in healthcare.
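
As an illustration of how a control chart could be applied to IUSS occurrences, the sketch below computes three-sigma limits for a c-chart of event counts per period. The description does not say which chart type the thesis uses, so the c-chart (which assumes roughly Poisson counts over equal-length periods) is an assumption here, as are the function names.

```python
import math


def c_chart_limits(baseline_counts):
    """Return (lcl, center, ucl) for a c-chart built from baseline counts.

    Assumption: counts per period are approximately Poisson, so the
    standard deviation is the square root of the mean count.
    """
    center = sum(baseline_counts) / len(baseline_counts)
    spread = 3.0 * math.sqrt(center)
    return max(0.0, center - spread), center, center + spread


def out_of_control_periods(counts, limits):
    """Indices of periods whose count falls outside the control limits."""
    lcl, _, ucl = limits
    return [i for i, count in enumerate(counts) if count < lcl or count > ucl]
```

A period flagged by `out_of_control_periods` would then prompt a look at the causes the description lists, such as inventory shortfalls or scheduling conflicts.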

Date Created
2014