Matching Items (21)
Filtering by

Clear all filters

150547-Thumbnail Image.png
Description
This dissertation presents methods for addressing research problems that currently can only adequately be solved using Quality Reliability Engineering (QRE) approaches especially accelerated life testing (ALT) of electronic printed wiring boards with applications to avionics circuit boards. The methods presented in this research are generally applicable to circuit boards, but

This dissertation presents methods for addressing research problems that currently can only adequately be solved using Quality Reliability Engineering (QRE) approaches especially accelerated life testing (ALT) of electronic printed wiring boards with applications to avionics circuit boards. The methods presented in this research are generally applicable to circuit boards, but the data generated and their analysis is for high performance avionics. Avionics equipment typically requires 20 years expected life by aircraft equipment manufacturers and therefore ALT is the only practical way of performing life test estimates. Both thermal and vibration ALT induced failure are performed and analyzed to resolve industry questions relating to the introduction of lead-free solder product and processes into high reliability avionics. In chapter 2, thermal ALT using an industry standard failure machine implementing Interconnect Stress Test (IST) that simulates circuit board life data is compared to real production failure data by likelihood ratio tests to arrive at a mechanical theory. This mechanical theory results in a statistically equivalent energy bound such that failure distributions below a specific energy level are considered to be from the same distribution thus allowing testers to quantify parameter setting in IST prior to life testing. In chapter 3, vibration ALT comparing tin-lead and lead-free circuit board solder designs involves the use of the likelihood ratio (LR) test to assess both complete failure data and S-N curves to present methods for analyzing data. Failure data is analyzed using Regression and two-way analysis of variance (ANOVA) and reconciled with the LR test results that indicating that a costly aging pre-process may be eliminated in certain cases. In chapter 4, vibration ALT for side-by-side tin-lead and lead-free solder black box designs are life tested. Commercial models from strain data do not exist at the low levels associated with life testing and need to be developed because testing performed and presented here indicate that both tin-lead and lead-free solders are similar. In addition, earlier failures due to vibration like connector failure modes will occur before solder interconnect failures.
ContributorsJuarez, Joseph Moses (Author) / Montgomery, Douglas C. (Thesis advisor) / Borror, Connie M. (Thesis advisor) / Gel, Esma (Committee member) / Mignolet, Marc (Committee member) / Pan, Rong (Committee member) / Arizona State University (Publisher)
Created2012
154080-Thumbnail Image.png
Description
Optimal experimental design for generalized linear models is often done using a pseudo-Bayesian approach that integrates the design criterion across a prior distribution on the parameter values. This approach ignores the lack of utility of certain models contained in the prior, and a case is demonstrated where the heavy

Optimal experimental design for generalized linear models is often done using a pseudo-Bayesian approach that integrates the design criterion across a prior distribution on the parameter values. This approach ignores the lack of utility of certain models contained in the prior, and a case is demonstrated where the heavy focus on such hopeless models results in a design with poor performance and with wild swings in coverage probabilities for Wald-type confidence intervals. Design construction using a utility-based approach is shown to result in much more stable coverage probabilities in the area of greatest concern.

The pseudo-Bayesian approach can be applied to the problem of optimal design construction under dependent observations. Often, correlation between observations exists due to restrictions on randomization. Several techniques for optimal design construction are proposed in the case of the conditional response distribution being a natural exponential family member but with a normally distributed block effect . The reviewed pseudo-Bayesian approach is compared to an approach based on substituting the marginal likelihood with the joint likelihood and an approach based on projections of the score function (often called quasi-likelihood). These approaches are compared for several models with normal, Poisson, and binomial conditional response distributions via the true determinant of the expected Fisher information matrix where the dispersion of the random blocks is considered a nuisance parameter. A case study using the developed methods is performed.

The joint and quasi-likelihood methods are then extended to address the case when the magnitude of random block dispersion is of concern. Again, a simulation study over several models is performed, followed by a case study when the conditional response distribution is a Poisson distribution.
ContributorsHassler, Edgar (Author) / Montgomery, Douglas C. (Thesis advisor) / Silvestrini, Rachel T. (Thesis advisor) / Borror, Connie M. (Committee member) / Pan, Rong (Committee member) / Arizona State University (Publisher)
Created2015
155978-Thumbnail Image.png
Description
Though the likelihood is a useful tool for obtaining estimates of regression parameters, it is not readily available in the fit of hierarchical binary data models. The correlated observations negate the opportunity to have a joint likelihood when fitting hierarchical logistic regression models. Through conditional likelihood, inferences for the regression

Though the likelihood is a useful tool for obtaining estimates of regression parameters, it is not readily available in the fit of hierarchical binary data models. The correlated observations negate the opportunity to have a joint likelihood when fitting hierarchical logistic regression models. Through conditional likelihood, inferences for the regression and covariance parameters as well as the intraclass correlation coefficients are usually obtained. In those cases, I have resorted to use of Laplace approximation and large sample theory approach for point and interval estimates such as Wald-type confidence intervals and profile likelihood confidence intervals. These methods rely on distributional assumptions and large sample theory. However, when dealing with small hierarchical datasets they often result in severe bias or non-convergence. I present a generalized quasi-likelihood approach and a generalized method of moments approach; both do not rely on any distributional assumptions but only moments of response. As an alternative to the typical large sample theory approach, I present bootstrapping hierarchical logistic regression models which provides more accurate interval estimates for small binary hierarchical data. These models substitute computations as an alternative to the traditional Wald-type and profile likelihood confidence intervals. I use a latent variable approach with a new split bootstrap method for estimating intraclass correlation coefficients when analyzing binary data obtained from a three-level hierarchical structure. It is especially useful with small sample size and easily expanded to multilevel. Comparisons are made to existing approaches through both theoretical justification and simulation studies. Further, I demonstrate my findings through an analysis of three numerical examples, one based on cancer in remission data, one related to the China’s antibiotic abuse study, and a third related to teacher effectiveness in schools from a state of southwest US.
ContributorsWang, Bei (Author) / Wilson, Jeffrey R (Thesis advisor) / Kamarianakis, Ioannis (Committee member) / Reiser, Mark R. (Committee member) / St Louis, Robert (Committee member) / Zheng, Yi (Committee member) / Arizona State University (Publisher)
Created2017
156371-Thumbnail Image.png
Description
Generalized Linear Models (GLMs) are widely used for modeling responses with non-normal error distributions. When the values of the covariates in such models are controllable, finding an optimal (or at least efficient) design could greatly facilitate the work of collecting and analyzing data. In fact, many theoretical results are obtained

Generalized Linear Models (GLMs) are widely used for modeling responses with non-normal error distributions. When the values of the covariates in such models are controllable, finding an optimal (or at least efficient) design could greatly facilitate the work of collecting and analyzing data. In fact, many theoretical results are obtained on a case-by-case basis, while in other situations, researchers also rely heavily on computational tools for design selection.

Three topics are investigated in this dissertation with each one focusing on one type of GLMs. Topic I considers GLMs with factorial effects and one continuous covariate. Factors can have interactions among each other and there is no restriction on the possible values of the continuous covariate. The locally D-optimal design structures for such models are identified and results for obtaining smaller optimal designs using orthogonal arrays (OAs) are presented. Topic II considers GLMs with multiple covariates under the assumptions that all but one covariate are bounded within specified intervals and interaction effects among those bounded covariates may also exist. An explicit formula for D-optimal designs is derived and OA-based smaller D-optimal designs for models with one or two two-factor interactions are also constructed. Topic III considers multiple-covariate logistic models. All covariates are nonnegative and there is no interaction among them. Two types of D-optimal design structures are identified and their global D-optimality is proved using the celebrated equivalence theorem.
ContributorsWang, Zhongsheng (Author) / Stufken, John (Thesis advisor) / Kamarianakis, Ioannis (Committee member) / Kao, Ming-Hung (Committee member) / Reiser, Mark R. (Committee member) / Zheng, Yi (Committee member) / Arizona State University (Publisher)
Created2018
154894-Thumbnail Image.png
Description
The majority of research in experimental design has, to date, been focused on designs when there is only one type of response variable under consideration. In a decision-making process, however, relying on only one objective or criterion can lead to oversimplified, sub-optimal decisions that ignore important considerations. Incorporating multiple, and

The majority of research in experimental design has, to date, been focused on designs when there is only one type of response variable under consideration. In a decision-making process, however, relying on only one objective or criterion can lead to oversimplified, sub-optimal decisions that ignore important considerations. Incorporating multiple, and likely competing, objectives is critical during the decision-making process in order to balance the tradeoffs of all potential solutions. Consequently, the problem of constructing a design for an experiment when multiple types of responses are of interest does not have a clear answer, particularly when the response variables have different distributions. Responses with different distributions have different requirements of the design.

Computer-generated optimal designs are popular design choices for less standard scenarios where classical designs are not ideal. This work presents a new approach to experimental designs for dual-response systems. The normal, binomial, and Poisson distributions are considered for the potential responses. Using the D-criterion for the linear model and the Bayesian D-criterion for the nonlinear models, a weighted criterion is implemented in a coordinate-exchange algorithm. The designs are evaluated and compared across different weights. The sensitivity of the designs to the priors supplied in the Bayesian D-criterion is explored in the third chapter of this work.

The final section of this work presents a method for a decision-making process involving multiple objectives. There are situations where a decision-maker is interested in several optimal solutions, not just one. These types of decision processes fall into one of two scenarios: 1) wanting to identify the best N solutions to accomplish a goal or specific task, or 2) evaluating a decision based on several primary quantitative objectives along with secondary qualitative priorities. Design of experiment selection often involves the second scenario where the goal is to identify several contending solutions using the primary quantitative objectives, and then use the secondary qualitative objectives to guide the final decision. Layered Pareto Fronts can help identify a richer class of contenders to examine more closely. The method is illustrated with a supersaturated screening design example.
ContributorsBurke, Sarah Ellen (Author) / Montgomery, Douglas C. (Thesis advisor) / Borror, Connie M. (Thesis advisor) / Anderson-Cook, Christine M. (Committee member) / Pan, Rong (Committee member) / Silvestrini, Rachel (Committee member) / Arizona State University (Publisher)
Created2016
155445-Thumbnail Image.png
Description
The Pearson and likelihood ratio statistics are commonly used to test goodness-of-fit for models applied to data from a multinomial distribution. When data are from a table formed by cross-classification of a large number of variables, the common statistics may have low power and inaccurate Type I error level due

The Pearson and likelihood ratio statistics are commonly used to test goodness-of-fit for models applied to data from a multinomial distribution. When data are from a table formed by cross-classification of a large number of variables, the common statistics may have low power and inaccurate Type I error level due to sparseness in the cells of the table. The GFfit statistic can be used to examine model fit in subtables. It is proposed to assess model fit by using a new version of GFfit statistic based on orthogonal components of Pearson chi-square as a diagnostic to examine the fit on two-way subtables. However, due to variables with a large number of categories and small sample size, even the GFfit statistic may have low power and inaccurate Type I error level due to sparseness in the two-way subtable. In this dissertation, the theoretical power and empirical power of the GFfit statistic are studied. A method based on subsets of orthogonal components for the GFfit statistic on the subtables is developed to improve the performance of the GFfit statistic. Simulation results for power and type I error rate for several different cases along with comparisons to other diagnostics are presented.
ContributorsZhu, Junfei (Author) / Reiser, Mark R. (Thesis advisor) / Stufken, John (Committee member) / Zheng, Yi (Committee member) / St Louis, Robert (Committee member) / Kao, Ming-Hung (Committee member) / Arizona State University (Publisher)
Created2017
155598-Thumbnail Image.png
Description
This article proposes a new information-based subdata selection (IBOSS) algorithm, Squared Scaled Distance Algorithm (SSDA). It is based on the invariance of the determinant of the information matrix under orthogonal transformations, especially rotations. Extensive simulation results show that the new IBOSS algorithm retains nice asymptotic properties of IBOSS and gives

This article proposes a new information-based subdata selection (IBOSS) algorithm, Squared Scaled Distance Algorithm (SSDA). It is based on the invariance of the determinant of the information matrix under orthogonal transformations, especially rotations. Extensive simulation results show that the new IBOSS algorithm retains nice asymptotic properties of IBOSS and gives a larger determinant of the subdata information matrix. It has the same order of time complexity as the D-optimal IBOSS algorithm. However, it exploits the advantages of vectorized calculation avoiding for loops and is approximately 6 times as fast as the D-optimal IBOSS algorithm in R. The robustness of SSDA is studied from three aspects: nonorthogonality, including interaction terms and variable misspecification. A new accurate variable selection algorithm is proposed to help the implementation of IBOSS algorithms when a large number of variables are present with sparse important variables among them. Aggregating random subsample results, this variable selection algorithm is much more accurate than the LASSO method using full data. Since the time complexity is associated with the number of variables only, it is also very computationally efficient if the number of variables is fixed as n increases and not massively large. More importantly, using subsamples it solves the problem that full data cannot be stored in the memory when a data set is too large.
ContributorsZheng, Yi (Author) / Stufken, John (Thesis advisor) / Reiser, Mark R. (Committee member) / McCulloch, Robert (Committee member) / Arizona State University (Publisher)
Created2017
149476-Thumbnail Image.png
Description
In mixture-process variable experiments, it is common that the number of runs is greater than in mixture-only or process-variable experiments. These experiments have to estimate the parameters from the mixture components, process variables, and interactions of both variables. In some of these experiments there are variables that are hard to

In mixture-process variable experiments, it is common that the number of runs is greater than in mixture-only or process-variable experiments. These experiments have to estimate the parameters from the mixture components, process variables, and interactions of both variables. In some of these experiments there are variables that are hard to change or cannot be controlled under normal operating conditions. These situations often prohibit a complete randomization for the experimental runs due to practical and economical considerations. Furthermore, the process variables can be categorized into two types: variables that are controllable and directly affect the response, and variables that are uncontrollable and primarily affect the variability of the response. These uncontrollable variables are called noise factors and assumed controllable in a laboratory environment for the purpose of conducting experiments. The model containing both noise variables and control factors can be used to determine factor settings for the control factor that makes the response "robust" to the variability transmitted from the noise factors. These types of experiments can be analyzed in a model for the mean response and a model for the slope of the response within a split-plot structure. When considering the experimental designs, low prediction variances for the mean and slope model are desirable. The methods for the mixture-process variable designs with noise variables considering a restricted randomization are demonstrated and some mixture-process variable designs that are robust to the coefficients of interaction with noise variables are evaluated using fraction design space plots with the respect to the prediction variance properties. Finally, the G-optimal design that minimizes the maximum prediction variance over the entire design region is created using a genetic algorithm.
ContributorsCho, Tae Yeon (Author) / Montgomery, Douglas C. (Thesis advisor) / Borror, Connie M. (Thesis advisor) / Shunk, Dan L. (Committee member) / Gel, Esma S (Committee member) / Kulahci, Murat (Committee member) / Arizona State University (Publisher)
Created2010
189356-Thumbnail Image.png
Description
This dissertation comprises two projects: (i) Multiple testing of local maxima for detection of peaks and change points with non-stationary noise, and (ii) Height distributions of critical points of smooth isotropic Gaussian fields: computations, simulations and asymptotics. The first project introduces a topological multiple testing method for one-dimensional domains to

This dissertation comprises two projects: (i) Multiple testing of local maxima for detection of peaks and change points with non-stationary noise, and (ii) Height distributions of critical points of smooth isotropic Gaussian fields: computations, simulations and asymptotics. The first project introduces a topological multiple testing method for one-dimensional domains to detect signals in the presence of non-stationary Gaussian noise. The approach involves conducting tests at local maxima based on two observation conditions: (i) the noise is smooth with unit variance and (ii) the noise is not smooth where kernel smoothing is applied to increase the signal-to-noise ratio (SNR). The smoothed signals are then standardized, which ensures that the variance of the new sequence's noise becomes one, making it possible to calculate $p$-values for all local maxima using random field theory. Assuming unimodal true signals with finite support and non-stationary Gaussian noise that can be repeatedly observed. The algorithm introduced in this work, demonstrates asymptotic strong control of the False Discovery Rate (FDR) and power consistency as the number of sequence repetitions and signal strength increase. Simulations indicate that FDR levels can also be controlled under non-asymptotic conditions with finite repetitions. The application of this algorithm to change point detection also guarantees FDR control and power consistency. The second project focuses on investigating the explicit and asymptotic height densities of critical points of smooth isotropic Gaussian random fields on both Euclidean space and spheres.The formulae are based on characterizing the distribution of the Hessian of the Gaussian field using the Gaussian orthogonally invariant (GOI) matrices and the Gaussian orthogonal ensemble (GOE) matrices, which are special cases of GOI matrices. However, as the dimension increases, calculating explicit formulae becomes computationally challenging. The project includes two simulation methods for these distributions. Additionally, asymptotic distributions are obtained by utilizing the asymptotic distribution of the eigenvalues (excluding the maximum eigenvalues) of the GOE matrix for large dimensions. However, when it comes to the maximum eigenvalue, the Tracy-Widom distribution is utilized. Simulation results demonstrate the close approximation between the asymptotic distribution and the real distribution when $N$ is sufficiently large.
Contributorsgu, shuang (Author) / Cheng, Dan (Thesis advisor) / Lopes, Hedibert (Committee member) / Fricks, John (Committee member) / Lan, Shiwei (Committee member) / Zheng, Yi (Committee member) / Arizona State University (Publisher)
Created2023
187808-Thumbnail Image.png
Description
This dissertation covers several topics in machine learning and causal inference. First, the question of “feature selection,” a common byproduct of regularized machine learning methods, is investigated theoretically in the context of treatment effect estimation. This involves a detailed review and extension of frameworks for estimating causal effects and in-depth

This dissertation covers several topics in machine learning and causal inference. First, the question of “feature selection,” a common byproduct of regularized machine learning methods, is investigated theoretically in the context of treatment effect estimation. This involves a detailed review and extension of frameworks for estimating causal effects and in-depth theoretical study. Next, various computational approaches to estimating causal effects with machine learning methods are compared with these theoretical desiderata in mind. Several improvements to current methods for causal machine learning are identified and compelling angles for further study are pinpointed. Finally, a common method used for “explaining” predictions of machine learning algorithms, SHAP, is evaluated critically through a statistical lens.
ContributorsHerren, Andrew (Author) / Hahn, P Richard (Thesis advisor) / Kao, Ming-Hung (Committee member) / Lopes, Hedibert (Committee member) / McCulloch, Robert (Committee member) / Zhou, Shuang (Committee member) / Arizona State University (Publisher)
Created2023