Matching Items (23)
Description
The majority of research in experimental design has, to date, been focused on designs when there is only one type of response variable under consideration. In a decision-making process, however, relying on only one objective or criterion can lead to oversimplified, sub-optimal decisions that ignore important considerations. Incorporating multiple, and likely competing, objectives is critical during the decision-making process in order to balance the tradeoffs of all potential solutions. Consequently, the problem of constructing a design for an experiment when multiple types of responses are of interest does not have a clear answer, particularly when the response variables have different distributions. Responses with different distributions have different requirements of the design.

Computer-generated optimal designs are popular design choices for less standard scenarios where classical designs are not ideal. This work presents a new approach to experimental designs for dual-response systems. The normal, binomial, and Poisson distributions are considered for the potential responses. Using the D-criterion for the linear model and the Bayesian D-criterion for the nonlinear models, a weighted criterion is implemented in a coordinate-exchange algorithm. The designs are evaluated and compared across different weights. The sensitivity of the designs to the priors supplied in the Bayesian D-criterion is explored in the third chapter of this work.

The final section of this work presents a method for a decision-making process involving multiple objectives. There are situations where a decision-maker is interested in several optimal solutions, not just one. These types of decision processes fall into one of two scenarios: 1) wanting to identify the best N solutions to accomplish a goal or specific task, or 2) evaluating a decision based on several primary quantitative objectives along with secondary qualitative priorities. Design of experiment selection often involves the second scenario where the goal is to identify several contending solutions using the primary quantitative objectives, and then use the secondary qualitative objectives to guide the final decision. Layered Pareto Fronts can help identify a richer class of contenders to examine more closely. The method is illustrated with a supersaturated screening design example.
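
As a rough illustration of the layering idea, the sketch below peels off successive non-dominated sets from a list of candidate solutions. It assumes every objective is to be minimized; the function names and the toy candidate values are hypothetical and are not taken from the dissertation.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly better on one.

    Assumption: all objectives are minimized.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def layered_pareto_fronts(points: List[Sequence[float]], n_layers: int) -> List[List[int]]:
    """Return up to n_layers lists of indices; layer 0 is the usual Pareto front.

    Each later layer is the Pareto front of what remains after removing
    all earlier layers.
    """
    remaining = list(range(len(points)))
    layers: List[List[int]] = []
    while remaining and len(layers) < n_layers:
        front = [
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        ]
        layers.append(front)
        remaining = [i for i in remaining if i not in front]
    return layers


# Hypothetical candidates scored on two objectives (smaller is better).
candidates = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0), (3.0, 4.0), (5.0, 5.0)]
layers = layered_pareto_fronts(candidates, 2)
```

Here the first layer holds the three mutually non-dominated candidates, and the second layer holds the best of the rest, which a decision-maker could then screen against secondary qualitative priorities.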
Contributors: Burke, Sarah Ellen (Author) / Montgomery, Douglas C. (Thesis advisor) / Borror, Connie M. (Thesis advisor) / Anderson-Cook, Christine M. (Committee member) / Pan, Rong (Committee member) / Silvestrini, Rachel (Committee member) / Arizona State University (Publisher)
Created: 2016
Description
This dissertation proposes a new set of analytical methods for high dimensional physiological sensors. The methodologies developed in this work were motivated by problems in learning science, but also apply to numerous disciplines where high dimensional signals are present. In the education field, more data is now available from traditional sources and there is an important need for analytical methods to translate this data into improved learning. Affective computing, the study of new techniques for developing systems that recognize and model human emotions, is integrating different physiological signals such as electroencephalogram (EEG) and electromyogram (EMG) to detect and model emotions, which can later be used to improve these learning systems.

The first contribution proposes an event-crossover (ECO) methodology to analyze performance in learning environments. The methodology is relevant to studies where it is desired to evaluate the relationships between sentinel events in a learning environment and a physiological measurement which is provided in real time.

The second contribution introduces analytical methods to study relationships between multi-dimensional physiological signals and sentinel events in a learning environment. The methodology proposed learns physiological patterns in the form of node activations near time of events using different statistical techniques.

The third contribution addresses the challenge of performance prediction from physiological signals. Features from the sensors which could be computed early in the learning activity were developed for input to a machine learning model. The objective is to predict success or failure of the student in the learning environment early in the activity. EEG was used as the physiological signal to train a pattern recognition algorithm in order to derive meta affective states.

The last contribution introduces a methodology to predict a learner's performance using Bayes Belief Networks (BBNs). Posterior probabilities of latent nodes are used as inputs to a predictive model in real time as evidence is accumulated in the BBN.

The methodology was applied to data streams from a video game and from a Damage Control Simulator which were used to predict and quantify performance. The proposed methods provide cognitive scientists with new tools to analyze subjects in learning environments.
Contributors: Lujan Moreno, Gustavo A. (Author) / Runger, George C. (Thesis advisor) / Atkinson, Robert K (Thesis advisor) / Montgomery, Douglas C. (Committee member) / Villalobos, Rene (Committee member) / Arizona State University (Publisher)
Created: 2017
Description
In accelerated life tests (ALTs), complete randomization is hardly achievable because of economic and engineering constraints. Typical experimental protocols such as subsampling or random blocks in ALTs result in a grouped structure, which leads to correlated lifetime observations. In this dissertation, a generalized linear mixed model (GLMM) approach is proposed to analyze ALT data and find the optimal ALT design while accounting for heterogeneous group effects.

Two types of ALTs are demonstrated for data analysis. First, constant-stress ALT (CSALT) data with a Weibull failure time distribution are modeled by a GLMM. The marginal likelihood of the observations is approximated by a quadrature rule, and the maximum likelihood (ML) estimation method is applied in an iterative fashion to estimate the unknown parameters, including the variance component of the random effect. Second, step-stress ALT (SSALT) data with random group effects are analyzed in a similar manner, but with an assumption of exponentially distributed failure times in each stress step. Two parameter estimation methods, from the frequentist and Bayesian points of view, are applied, and they are compared with other traditional models through a simulation study and a real example of heterogeneous SSALT data. The proposed random effect model shows superiority in terms of reducing bias and variance in the estimation of the life-stress relationship.

The GLMM approach is particularly useful for the optimal experimental design of ALTs while taking the random group effects into account. Specifically, planning ALTs under a nested design structure with random test chamber effects is studied. A greedy two-phase approach shows that different test chamber assignments to stress conditions substantially affect the estimation of unknown parameters. Then, the D-optimal test plan with two test chambers is constructed by applying the quasi-likelihood approach. Lastly, the optimal ALT planning is extended to the case of multiple sources of random effects so that the crossed design structure is also considered, along with the nested structure.
Contributors: Seo, Kangwon (Author) / Pan, Rong (Thesis advisor) / Montgomery, Douglas C. (Committee member) / Villalobos, J. Rene (Committee member) / Rigdon, Steven E (Committee member) / Arizona State University (Publisher)
Created: 2017
Description
In mixture-process variable experiments, it is common that the number of runs is greater than in mixture-only or process-variable experiments. These experiments have to estimate the parameters from the mixture components, process variables, and interactions of both variables. In some of these experiments there are variables that are hard to change or cannot be controlled under normal operating conditions. These situations often prohibit complete randomization of the experimental runs due to practical and economic considerations. Furthermore, the process variables can be categorized into two types: variables that are controllable and directly affect the response, and variables that are uncontrollable and primarily affect the variability of the response. These uncontrollable variables are called noise factors and are assumed to be controllable in a laboratory environment for the purpose of conducting experiments. The model containing both noise variables and control factors can be used to determine settings for the control factors that make the response "robust" to the variability transmitted from the noise factors. These types of experiments can be analyzed with a model for the mean response and a model for the slope of the response within a split-plot structure. When considering the experimental designs, low prediction variances for the mean and slope models are desirable. Methods for mixture-process variable designs with noise variables under restricted randomization are demonstrated, and some mixture-process variable designs that are robust to the coefficients of interaction with noise variables are evaluated using fraction of design space plots with respect to their prediction variance properties. Finally, the G-optimal design that minimizes the maximum prediction variance over the entire design region is created using a genetic algorithm.
Contributors: Cho, Tae Yeon (Author) / Montgomery, Douglas C. (Thesis advisor) / Borror, Connie M. (Thesis advisor) / Shunk, Dan L. (Committee member) / Gel, Esma S (Committee member) / Kulahci, Murat (Committee member) / Arizona State University (Publisher)
Created: 2010
Description
Public health surveillance is a special case of the general problem where counts (or rates) of events are monitored for changes. Modern data complements event counts with many additional measurements (such as geographic, demographic, and others) that comprise high-dimensional covariates. This leads to an important challenge to detect a change that only occurs within a region, initially unspecified, defined by these covariates. Current methods are typically limited to spatial and/or temporal covariate information and often fail to use all the information available in modern data that can be paramount in unveiling these subtle changes. Additional complexities associated with modern health data that are often not accounted for by traditional methods include: covariates of mixed type, missing values, and high-order interactions among covariates. This work proposes a transform of public health surveillance to supervised learning, so that an appropriate learner can inherently address all the complexities described previously. At the same time, quantitative measures from the learner can be used to define signal criteria to detect changes in rates of events. A Feature Selection (FS) method is used to identify covariates that contribute to a model and to generate a signal. A measure of statistical significance is included to control false alarms. An alternative Percentile method identifies the specific cases that lead to changes using class probability estimates from tree-based ensembles. This second method is intended to be less computationally intensive and significantly simpler to implement. Finally, a third method labeled Rule-Based Feature Value Selection (RBFVS) is proposed for identifying the specific regions in high-dimensional space where the changes are occurring. Results on simulated examples are used to compare the FS method and the Percentile method. Note this work emphasizes the application of the proposed methods on public health surveillance. 
Nonetheless, these methods can easily be extended to a variety of applications where counts (or rates) of events are monitored for changes. Such problems commonly occur in domains such as manufacturing, economics, environmental systems, engineering, as well as in public health.
Contributors: Davila, Saylisse (Author) / Runger, George C. (Thesis advisor) / Montgomery, Douglas C. (Committee member) / Young, Dennis (Committee member) / Gel, Esma (Committee member) / Arizona State University (Publisher)
Created: 2010
Description
Yield is a key process performance characteristic in the capital-intensive semiconductor fabrication process. In an industry where machines cost millions of dollars and cycle times are a number of months, predicting and optimizing yield are critical to process improvement, customer satisfaction, and financial success. Semiconductor yield modeling is essential to identifying processing issues, improving quality, and meeting customer demand in the industry. However, the complicated fabrication process, the massive amount of data collected, and the number of models available make yield modeling a complex and challenging task. This work presents modeling strategies to forecast yield using generalized linear models (GLMs) based on defect metrology data. The research is divided into three main parts. First, the data integration and aggregation necessary for model building are described, and GLMs are constructed for yield forecasting. This technique yields results at both the die and the wafer levels, outperforms existing models found in the literature based on prediction errors, and identifies significant factors that can drive process improvement. This method also allows the nested structure of the process to be considered in the model, improving predictive capabilities and violating fewer assumptions. To account for the random sampling typically used in fabrication, the work is extended by using generalized linear mixed models (GLMMs) and a larger dataset to show the differences between batch-specific and population-averaged models in this application and how they compare to GLMs. These results show some additional improvements in forecasting abilities under certain conditions and show the differences between the significant effects identified in the GLM and GLMM models. The effects of link functions and sample size are also examined at the die and wafer levels. 
The third part of this research describes a methodology for integrating classification and regression trees (CART) with GLMs. This technique uses the terminal nodes identified in the classification tree to add predictors to a GLM. This method enables the model to consider important interaction terms in a simpler way than with the GLM alone, and provides valuable insight into the fabrication process through the combination of the tree structure and the statistical analysis of the GLM.
Contributors: Krueger, Dana Cheree (Author) / Montgomery, Douglas C. (Thesis advisor) / Fowler, John (Committee member) / Pan, Rong (Committee member) / Pfund, Michele (Committee member) / Arizona State University (Publisher)
Created: 2011
Description
Mixture experiments are useful when the interest is in determining how changes in the proportions of the experimental components affect the response. This research focuses on the modeling and design of mixture experiments when the response is categorical, namely binary or ordinal. Data from mixture experiments are characterized by the perfect collinearity of the experimental components, resulting in model matrices that are singular and parameters that are inestimable under likelihood estimation procedures. To alleviate problems with estimation, this research proposes the reparameterization of two nonlinear models for ordinal data: the proportional-odds model with a logistic link and the stereotype model. A study involving subjective ordinal responses from a mixture experiment demonstrates that the stereotype model reveals useful information about the relationship between mixture components and the ordinality of the response, which the proportional-odds model fails to detect.

The second half of this research deals with the construction of exact D-optimal designs for binary and ordinal responses. For both types, the base models fall under the class of Generalized Linear Models (GLMs) with a logistic link. First, the properties of the exact D-optimal mixture designs for binary responses are investigated. It is shown that standard mixture designs and designs proposed for normal-theory responses are poor surrogates for the true D-optimal designs. In contrast with the D-optimal designs for normal-theory responses, which locate support points at the boundaries of the mixture region, exact D-optimal designs for GLMs tend to locate support points in regions of uncertainty. Alternate D-optimal designs for binary responses with high D-efficiencies are proposed by utilizing information about these regions.

The Mixture Exchange Algorithm (MEA), a search heuristic tailored to the construction of efficient mixture designs with GLM-type responses, is proposed. MEA introduces a new and efficient updating formula that lessens the computational expense of calculating the D-criterion for multi-categorical response systems, such as ordinal response models. MEA computationally outperforms comparable search heuristics by several orders of magnitude. Further, its computational expense increases at a slower rate of growth with increasing problem size. Finally, local and robust D-optimal designs for ordinal-response mixture systems are constructed using MEA, investigated, and shown to have high D-efficiency performance.
Contributors: Mancenido, Michelle V (Author) / Montgomery, Douglas C. (Thesis advisor) / Pan, Rong (Thesis advisor) / Borror, Connie M. (Committee member) / Shunk, Dan L. (Committee member) / Arizona State University (Publisher)
Created: 2016
Description
The Partition of Variance (POV) method is a simple way to identify large sources of variation in manufacturing systems. This method identifies the variance by estimating the variance of the means (between variance) and the mean of the variances (within variance). The project shows that the method correctly identifies the variance source when compared to the ANOVA method. Although the variance estimators deteriorate when varying degrees of non-normality are introduced through simulation, the POV method is shown to be a more stable measure of variance in the aggregate. The POV method also provides non-negative, stable estimates for interaction when compared to the ANOVA method. The POV method is shown to be more stable, particularly in low sample size situations. Based on these findings, it is suggested that the POV is not a replacement for more complex analysis methods, but rather a supplement to them. POV is ideal for preliminary analysis due to the ease of implementation, the simplicity of interpretation, and the lack of dependency on statistical analysis packages or statistical knowledge.
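
The between/within split described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses the ordinary sample variance (n - 1 divisor) for both pieces, which the abstract does not specify, and the function name and toy measurements are hypothetical.

```python
from statistics import mean, variance
from typing import List, Tuple


def partition_of_variance(groups: List[List[float]]) -> Tuple[float, float]:
    """Return (between, within) for a list of measurement groups.

    between: variance of the group means.
    within:  mean of the group variances.
    Assumption: sample variances (n - 1 divisor) are used throughout.
    """
    group_means = [mean(g) for g in groups]
    between = variance(group_means)
    within = mean(variance(g) for g in groups)
    return between, within


# Hypothetical data: three lots, two measurements per lot.
groups = [[10.0, 12.0], [20.0, 22.0], [30.0, 32.0]]
between, within = partition_of_variance(groups)
```

In this toy data the lot means differ far more than the measurements inside each lot, so the between component is much larger than the within component, pointing to lot-to-lot variation as the dominant source.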
Contributors: Little, David John (Author) / Borror, Connie (Thesis advisor) / Montgomery, Douglas C. (Committee member) / Broatch, Jennifer (Committee member) / Arizona State University (Publisher)
Created: 2015
Description
Optimal design theory provides a general framework for the construction of experimental designs for categorical responses. For a binary response, where the possible result is one of two outcomes, the logistic regression model is widely used to relate a set of experimental factors with the probability of a positive (or negative) outcome. This research investigates and proposes alternative designs to alleviate the problem of separation in small-sample D-optimal designs for the logistic regression model. Separation causes the non-existence of maximum likelihood parameter estimates and presents a serious problem for model fitting purposes.

First, it is shown that exact, multi-factor D-optimal designs for the logistic regression model can be susceptible to separation. Several logistic regression models are specified, and exact D-optimal designs of fixed sizes are constructed for each model. Sets of simulated response data are generated to estimate the probability of separation in each design. This study shows through simulation that small-sample D-optimal designs are prone to separation and that separation risk depends on the specified model. Additionally, it is demonstrated that exact designs of equal size constructed for the same models may have significantly different chances of encountering separation.
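
A heavily simplified sketch of this kind of simulation is given below. It assumes a single factor with an intercept, where separation can be checked by comparing the ranges of the factor for the two response classes; the multi-factor designs studied in the dissertation would need a more general check. The function names, the example design, and the parameter values are all hypothetical.

```python
import math
import random
from typing import List, Sequence


def is_separated(x: Sequence[float], y: Sequence[int]) -> bool:
    """Check complete or quasi-complete separation for one factor plus intercept.

    Separation occurs when all responses are identical, or when a single
    threshold on x splits the 0s from the 1s (ties at the threshold allowed).
    """
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    if not x1 or not x0:
        return True
    return max(x0) <= min(x1) or max(x1) <= min(x0)


def separation_probability(design: List[float], beta0: float, beta1: float,
                           n_sims: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate of the chance that simulated responses are separated."""
    rng = random.Random(seed)
    # Success probability at each design point under the assumed logistic model.
    probs = [1.0 / (1.0 + math.exp(-(beta0 + beta1 * xi))) for xi in design]
    hits = 0
    for _ in range(n_sims):
        y = [1 if rng.random() < p else 0 for p in probs]
        hits += is_separated(design, y)
    return hits / n_sims


# Hypothetical six-run, two-point design and assumed parameter values.
design = [-1.0, -1.0, -1.0, 1.0, 1.0, 1.0]
estimate = separation_probability(design, beta0=0.0, beta1=1.0)
```

Repeating the estimate for different designs of the same size, or for different assumed parameter values, gives a rough way to compare separation risk.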

The second portion of this research establishes an effective strategy for augmentation, where additional design runs are judiciously added to eliminate separation that has occurred in an initial design. A simulation study is used to demonstrate that augmenting runs in regions of maximum prediction variance (MPV), where the predicted probability of either response category is 50%, most reliably eliminates separation. However, it is also shown that MPV augmentation tends to yield augmented designs with lower D-efficiencies.

The final portion of this research proposes a novel compound optimality criterion, DMP, that is used to construct locally optimal and robust compromise designs. A two-phase coordinate exchange algorithm is implemented to construct exact locally DMP-optimal designs. To address design dependence issues, a maximin strategy is proposed for designating a robust DMP-optimal design. A case study demonstrates that the maximin DMP-optimal design maintains comparable D-efficiencies to a corresponding Bayesian D-optimal design while offering significantly improved separation performance.
Contributors: Park, Anson Robert (Author) / Montgomery, Douglas C. (Thesis advisor) / Mancenido, Michelle V (Thesis advisor) / Escobedo, Adolfo R. (Committee member) / Pan, Rong (Committee member) / Arizona State University (Publisher)
Created: 2019
Description
Nonregular designs are a preferable alternative to regular resolution four designs because they avoid confounding two-factor interactions. As a result, nonregular designs can estimate and identify a few active two-factor interactions. However, due to the sometimes complex alias structure of nonregular designs, standard screening strategies can fail to identify all active effects. In this research, two-level nonregular screening designs with orthogonal main effects are discussed. By utilizing knowledge of the alias structure, a design-based model selection process for analyzing nonregular designs is proposed.

The Aliased Informed Model Selection (AIMS) strategy is a design-specific approach that is compared to three generic model selection methods: stepwise regression, the least absolute shrinkage and selection operator (LASSO), and the Dantzig selector. The AIMS approach substantially increases the power to detect active main effects and two-factor interactions versus the aforementioned generic methodologies. This research identifies design-specific model spaces: sets of models that have strong heredity, are all estimable, and exhibit no model confounding. These spaces are then used in the AIMS method along with design-specific aliasing rules for model selection decisions. Model spaces and alias rules are identified for three designs: the 16-run no-confounding 6-, 7-, and 8-factor designs. The designs are demonstrated with several examples as well as simulations to show the superiority of AIMS in model selection.

A final piece of the research provides a method for augmenting no-confounding designs based on model spaces and maximum average D-efficiency. Several augmented designs are provided for different situations. A final simulation with the augmented designs shows strong results for augmenting with four additional runs if time and resources permit.
Contributors: Metcalfe, Carly E (Author) / Montgomery, Douglas C. (Thesis advisor) / Jones, Bradley (Committee member) / Pan, Rong (Committee member) / Pedrielli, Giulia (Committee member) / Arizona State University (Publisher)
Created: 2020