Matching Items (57)

129323-Thumbnail Image.png

Simulation-Based Bayesian Optimal ALT Designs for Model Discrimination

Description

Accelerated life test (ALT) planning in Bayesian framework is studied in this paper with a focus of differentiating competing acceleration models, when there is uncertainty as to whether the relationship between log mean life and the stress variable is linear

Accelerated life test (ALT) planning in Bayesian framework is studied in this paper with a focus of differentiating competing acceleration models, when there is uncertainty as to whether the relationship between log mean life and the stress variable is linear or exhibits some curvature. The proposed criterion is based on the Hellinger distance measure between predictive distributions. The optimal stress-factor setup and unit allocation are determined at three stress levels subject to test-lab equipment and test-duration constraints. Optimal designs are validated by their recovery rates, where the true, data-generating, model is selected under the DIC (Deviance Information Criterion) model selection rule, and by comparing their performance with other test plans. Results show that the proposed optimal design method has the advantage of substantially increasing a test plan׳s ability to distinguish among competing ALT models, thus providing better guidance as to which model is appropriate for the follow-on testing phase in the experiment.

Contributors

Agent

Created

Date Created
2015-02-01

129553-Thumbnail Image.png

Bayesian Analysis for Step-Stress Accelerated Life Testing Using Weibull Proportional Hazard Model

Description

In this paper, we present a Bayesian analysis for the Weibull proportional hazard (PH) model used in step-stress accelerated life testings. The key mathematical and graphical difference between the Weibull cumulative exposure (CE) model and the PH model is illustrated.

In this paper, we present a Bayesian analysis for the Weibull proportional hazard (PH) model used in step-stress accelerated life testings. The key mathematical and graphical difference between the Weibull cumulative exposure (CE) model and the PH model is illustrated. Compared with the CE model, the PH model provides more flexibility in fitting step-stress testing data and has the attractive mathematical properties of being desirable in the Bayesian framework. A Markov chain Monte Carlo algorithm with adaptive rejection sampling technique is used for posterior inference. We demonstrate the performance of this method on both simulated and real datasets.

Contributors

Agent

Created

Date Created
2014-08-01

149723-Thumbnail Image.png

System complexity reduction via feature selection

Description

This dissertation transforms a set of system complexity reduction problems to feature selection problems. Three systems are considered: classification based on association rules, network structure learning, and time series classification. Furthermore, two variable importance measures are proposed to reduce the

This dissertation transforms a set of system complexity reduction problems to feature selection problems. Three systems are considered: classification based on association rules, network structure learning, and time series classification. Furthermore, two variable importance measures are proposed to reduce the feature selection bias in tree models. Associative classifiers can achieve high accuracy, but the combination of many rules is difficult to interpret. Rule condition subset selection (RCSS) methods for associative classification are considered. RCSS aims to prune the rule conditions into a subset via feature selection. The subset then can be summarized into rule-based classifiers. Experiments show that classifiers after RCSS can substantially improve the classification interpretability without loss of accuracy. An ensemble feature selection method is proposed to learn Markov blankets for either discrete or continuous networks (without linear, Gaussian assumptions). The method is compared to a Bayesian local structure learning algorithm and to alternative feature selection methods in the causal structure learning problem. Feature selection is also used to enhance the interpretability of time series classification. Existing time series classification algorithms (such as nearest-neighbor with dynamic time warping measures) are accurate but difficult to interpret. This research leverages the time-ordering of the data to extract features, and generates an effective and efficient classifier referred to as a time series forest (TSF). The computational complexity of TSF is only linear in the length of time series, and interpretable features can be extracted. These features can be further reduced, and summarized for even better interpretability. Lastly, two variable importance measures are proposed to reduce the feature selection bias in tree-based ensemble models. It is well known that bias can occur when predictor attributes have different numbers of values. Two methods are proposed to solve the bias problem. One uses an out-of-bag sampling method called OOBForest, and the other, based on the new concept of a partial permutation test, is called a pForest. Experimental results show the existing methods are not always reliable for multi-valued predictors, while the proposed methods have advantages.

Contributors

Agent

Created

Date Created
2011

149754-Thumbnail Image.png

Production scheduling and system configuration for capacitated flow lines with application in the semiconductor backend process

Description

A good production schedule in a semiconductor back-end facility is critical for the on time delivery of customer orders. Compared to the front-end process that is dominated by re-entrant product flows, the back-end process is linear and therefore more suitable

A good production schedule in a semiconductor back-end facility is critical for the on time delivery of customer orders. Compared to the front-end process that is dominated by re-entrant product flows, the back-end process is linear and therefore more suitable for scheduling. However, the production scheduling of the back-end process is still very difficult due to the wide product mix, large number of parallel machines, product family related setups, machine-product qualification, and weekly demand consisting of thousands of lots. In this research, a novel mixed-integer-linear-programming (MILP) model is proposed for the batch production scheduling of a semiconductor back-end facility. In the MILP formulation, the manufacturing process is modeled as a flexible flow line with bottleneck stages, unrelated parallel machines, product family related sequence-independent setups, and product-machine qualification considerations. However, this MILP formulation is difficult to solve for real size problem instances. In a semiconductor back-end facility, production scheduling usually needs to be done every day while considering updated demand forecast for a medium term planning horizon. Due to the limitation on the solvable size of the MILP model, a deterministic scheduling system (DSS), consisting of an optimizer and a scheduler, is proposed to provide sub-optimal solutions in a short time for real size problem instances. The optimizer generates a tentative production plan. Then the scheduler sequences each lot on each individual machine according to the tentative production plan and scheduling rules. Customized factory rules and additional resource constraints are included in the DSS, such as preventive maintenance schedule, setup crew availability, and carrier limitations. Small problem instances are randomly generated to compare the performances of the MILP model and the deterministic scheduling system. Then experimental design is applied to understand the behavior of the DSS and identify the best configuration of the DSS under different demand scenarios. Product-machine qualification decisions have long-term and significant impact on production scheduling. A robust product-machine qualification matrix is critical for meeting demand when demand quantity or mix varies. In the second part of this research, a stochastic mixed integer programming model is proposed to balance the tradeoff between current machine qualification costs and future backorder costs with uncertain demand. The L-shaped method and acceleration techniques are proposed to solve the stochastic model. Computational results are provided to compare the performance of different solution methods.

Contributors

Agent

Created

Date Created
2011

152223-Thumbnail Image.png

Optimal experimental design for accelerated life testing and design evaluation

Description

Nowadays product reliability becomes the top concern of the manufacturers and customers always prefer the products with good performances under long period. In order to estimate the lifetime of the product, accelerated life testing (ALT) is introduced because most of

Nowadays product reliability becomes the top concern of the manufacturers and customers always prefer the products with good performances under long period. In order to estimate the lifetime of the product, accelerated life testing (ALT) is introduced because most of the products can last years even decades. Much research has been done in the ALT area and optimal design for ALT is a major topic. This dissertation consists of three main studies. First, a methodology of finding optimal design for ALT with right censoring and interval censoring have been developed and it employs the proportional hazard (PH) model and generalized linear model (GLM) to simplify the computational process. A sensitivity study is also given to show the effects brought by parameters to the designs. Second, an extended version of I-optimal design for ALT is discussed and then a dual-objective design criterion is defined and showed with several examples. Also in order to evaluate different candidate designs, several graphical tools are developed. Finally, when there are more than one models available, different model checking designs are discussed.

Contributors

Agent

Created

Date Created
2013

153063-Thumbnail Image.png

Holistic learning for multi-target and network monitoring problems

Description

Technological advances have enabled the generation and collection of various data from complex systems, thus, creating ample opportunity to integrate knowledge in many decision making applications. This dissertation introduces holistic learning as the integration of a comprehensive set of relationships

Technological advances have enabled the generation and collection of various data from complex systems, thus, creating ample opportunity to integrate knowledge in many decision making applications. This dissertation introduces holistic learning as the integration of a comprehensive set of relationships that are used towards the learning objective. The holistic view of the problem allows for richer learning from data and, thereby, improves decision making.

The first topic of this dissertation is the prediction of several target attributes using a common set of predictor attributes. In a holistic learning approach, the relationships between target attributes are embedded into the learning algorithm created in this dissertation. Specifically, a novel tree based ensemble that leverages the relationships between target attributes towards constructing a diverse, yet strong, model is proposed. The method is justified through its connection to existing methods and experimental evaluations on synthetic and real data.

The second topic pertains to monitoring complex systems that are modeled as networks. Such systems present a rich set of attributes and relationships for which holistic learning is important. In social networks, for example, in addition to friendship ties, various attributes concerning the users' gender, age, topic of messages, time of messages, etc. are collected. A restricted form of monitoring fails to take the relationships of multiple attributes into account, whereas the holistic view embeds such relationships in the monitoring methods. The focus is on the difficult task to detect a change that might only impact a small subset of the network and only occur in a sub-region of the high-dimensional space of the network attributes. One contribution is a monitoring algorithm based on a network statistical model. Another contribution is a transactional model that transforms the task into an expedient structure for machine learning, along with a generalizable algorithm to monitor the attributed network. A learning step in this algorithm adapts to changes that may only be local to sub-regions (with a broader potential for other learning tasks). Diagnostic tools to interpret the change are provided. This robust, generalizable, holistic monitoring method is elaborated on synthetic and real networks.

Contributors

Agent

Created

Date Created
2014

151203-Thumbnail Image.png

The development of a validated clinically meaningful endpoint for the evaluation of tear film stability as a measure of ocular surface protection for use in the diagnosis and evaluation of dry eye disease

Description

This dissertation presents methods for the evaluation of ocular surface protection during natural blink function. The evaluation of ocular surface protection is especially important in the diagnosis of dry eye and the evaluation of dry eye severity in clinical trials.

This dissertation presents methods for the evaluation of ocular surface protection during natural blink function. The evaluation of ocular surface protection is especially important in the diagnosis of dry eye and the evaluation of dry eye severity in clinical trials. Dry eye is a highly prevalent disease affecting vast numbers (between 11% and 22%) of an aging population. There is only one approved therapy with limited efficacy, which results in a huge unmet need. The reason so few drugs have reached approval is a lack of a recognized therapeutic pathway with reproducible endpoints. While the interplay between blink function and ocular surface protection has long been recognized, all currently used evaluation techniques have addressed blink function in isolation from tear film stability, the gold standard of which is Tear Film Break-Up Time (TFBUT). In the first part of this research a manual technique of calculating ocular surface protection during natural blink function through the use of video analysis is developed and evaluated for it's ability to differentiate between dry eye and normal subjects, the results are compared with that of TFBUT. In the second part of this research the technique is improved in precision and automated through the use of video analysis algorithms. This software, called the OPI 2.0 System, is evaluated for accuracy and precision, and comparisons are made between the OPI 2.0 System and other currently recognized dry eye diagnostic techniques (e.g. TFBUT). In the third part of this research the OPI 2.0 System is deployed for use in the evaluation of subjects before, immediately after and 30 minutes after exposure to a controlled adverse environment (CAE), once again the results are compared and contrasted against commonly used dry eye endpoints. The results demonstrate that the evaluation of ocular surface protection using the OPI 2.0 System offers superior accuracy to the current standard, TFBUT.

Contributors

Agent

Created

Date Created
2012

151511-Thumbnail Image.png

Learning from asymmetric models and matched pairs

Description

With the increase in computing power and availability of data, there has never been a greater need to understand data and make decisions from it. Traditional statistical techniques may not be adequate to handle the size of today's data or

With the increase in computing power and availability of data, there has never been a greater need to understand data and make decisions from it. Traditional statistical techniques may not be adequate to handle the size of today's data or the complexities of the information hidden within the data. Thus knowledge discovery by machine learning techniques is necessary if we want to better understand information from data. In this dissertation, we explore the topics of asymmetric loss and asymmetric data in machine learning and propose new algorithms as solutions to some of the problems in these topics. We also studied variable selection of matched data sets and proposed a solution when there is non-linearity in the matched data. The research is divided into three parts. The first part addresses the problem of asymmetric loss. A proposed asymmetric support vector machine (aSVM) is used to predict specific classes with high accuracy. aSVM was shown to produce higher precision than a regular SVM. The second part addresses asymmetric data sets where variables are only predictive for a subset of the predictor classes. Asymmetric Random Forest (ARF) was proposed to detect these kinds of variables. The third part explores variable selection for matched data sets. Matched Random Forest (MRF) was proposed to find variables that are able to distinguish case and control without the restrictions that exists in linear models. MRF detects variables that are able to distinguish case and control even in the presence of interaction and qualitative variables.

Contributors

Agent

Created

Date Created
2013

150659-Thumbnail Image.png

Product design optimization under epistemic uncertainty

Description

This dissertation is to address product design optimization including reliability-based design optimization (RBDO) and robust design with epistemic uncertainty. It is divided into four major components as outlined below. Firstly, a comprehensive study of uncertainties is performed, in which sources

This dissertation is to address product design optimization including reliability-based design optimization (RBDO) and robust design with epistemic uncertainty. It is divided into four major components as outlined below. Firstly, a comprehensive study of uncertainties is performed, in which sources of uncertainty are listed, categorized and the impacts are discussed. Epistemic uncertainty is of interest, which is due to lack of knowledge and can be reduced by taking more observations. In particular, the strategies to address epistemic uncertainties due to implicit constraint function are discussed. Secondly, a sequential sampling strategy to improve RBDO under implicit constraint function is developed. In modern engineering design, an RBDO task is often performed by a computer simulation program, which can be treated as a black box, as its analytical function is implicit. An efficient sampling strategy on learning the probabilistic constraint function under the design optimization framework is presented. The method is a sequential experimentation around the approximate most probable point (MPP) at each step of optimization process. It is compared with the methods of MPP-based sampling, lifted surrogate function, and non-sequential random sampling. Thirdly, a particle splitting-based reliability analysis approach is developed in design optimization. In reliability analysis, traditional simulation methods such as Monte Carlo simulation may provide accurate results, but are often accompanied with high computational cost. To increase the efficiency, particle splitting is integrated into RBDO. It is an improvement of subset simulation with multiple particles to enhance the diversity and stability of simulation samples. This method is further extended to address problems with multiple probabilistic constraints and compared with the MPP-based methods. Finally, a reliability-based robust design optimization (RBRDO) framework is provided to integrate the consideration of design reliability and design robustness simultaneously. The quality loss objective in robust design, considered together with the production cost in RBDO, are used formulate a multi-objective optimization problem. With the epistemic uncertainty from implicit performance function, the sequential sampling strategy is extended to RBRDO, and a combined metamodel is proposed to tackle both controllable variables and uncontrollable variables. The solution is a Pareto frontier, compared with a single optimal solution in RBDO.

Contributors

Agent

Created

Date Created
2012

149613-Thumbnail Image.png

Semiconductor yield modeling using generalized linear models

Description

Yield is a key process performance characteristic in the capital-intensive semiconductor fabrication process. In an industry where machines cost millions of dollars and cycle times are a number of months, predicting and optimizing yield are critical to process improvement,

Yield is a key process performance characteristic in the capital-intensive semiconductor fabrication process. In an industry where machines cost millions of dollars and cycle times are a number of months, predicting and optimizing yield are critical to process improvement, customer satisfaction, and financial success. Semiconductor yield modeling is essential to identifying processing issues, improving quality, and meeting customer demand in the industry. However, the complicated fabrication process, the massive amount of data collected, and the number of models available make yield modeling a complex and challenging task. This work presents modeling strategies to forecast yield using generalized linear models (GLMs) based on defect metrology data. The research is divided into three main parts. First, the data integration and aggregation necessary for model building are described, and GLMs are constructed for yield forecasting. This technique yields results at both the die and the wafer levels, outperforms existing models found in the literature based on prediction errors, and identifies significant factors that can drive process improvement. This method also allows the nested structure of the process to be considered in the model, improving predictive capabilities and violating fewer assumptions. To account for the random sampling typically used in fabrication, the work is extended by using generalized linear mixed models (GLMMs) and a larger dataset to show the differences between batch-specific and population-averaged models in this application and how they compare to GLMs. These results show some additional improvements in forecasting abilities under certain conditions and show the differences between the significant effects identified in the GLM and GLMM models. The effects of link functions and sample size are also examined at the die and wafer levels. The third part of this research describes a methodology for integrating classification and regression trees (CART) with GLMs. This technique uses the terminal nodes identified in the classification tree to add predictors to a GLM. This method enables the model to consider important interaction terms in a simpler way than with the GLM alone, and provides valuable insight into the fabrication process through the combination of the tree structure and the statistical analysis of the GLM.

Contributors

Agent

Created

Date Created
2011