Matching Items (36)
Description

This thesis studies the fund allocation processes that different U.S. states adopted to reduce the impact of the Covid-19 virus. The funding methodologies of seven states were compared against each state's case count. The study also developed a physical distancing index based on three significant attributes, which was then compared with expenditure and case counts to support decision making.
A regression model was developed to analyze how each state's case counts compared against the model's predictions and the risk index.

Contributors: Jaisinghani, Shaurya (Author) / Mirchandani, Pitu (Thesis director) / Clough, Michael (Committee member) / McCarville, Daniel R. (Committee member) / Industrial, Systems & Operations Engineering Prgm (Contributor) / Department of Information Systems (Contributor) / Barrett, The Honors College (Contributor)
Created: 2021-05
Description
Product reliability has become a top concern for manufacturers, and customers prefer products that perform well over long periods. Because most products can last years or even decades, accelerated life testing (ALT) is used to estimate product lifetime. Much research has been done in the ALT area, and optimal design for ALT is a major topic. This dissertation consists of three main studies. First, a methodology for finding optimal ALT designs with right censoring and interval censoring is developed; it employs the proportional hazards (PH) model and the generalized linear model (GLM) to simplify the computation. A sensitivity study shows the effects of the parameters on the designs. Second, an extended version of the I-optimal design for ALT is discussed, and a dual-objective design criterion is defined and illustrated with several examples. Several graphical tools are also developed to evaluate candidate designs. Finally, for cases where more than one model is available, different model-checking designs are discussed.
Contributors: Yang, Tao (Author) / Pan, Rong (Thesis advisor) / Montgomery, Douglas C. (Committee member) / Borror, Connie (Committee member) / Rigdon, Steve (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
With the increase in computing power and availability of data, there has never been a greater need to understand data and make decisions from it. Traditional statistical techniques may not be adequate to handle the size of today's data or the complexities of the information hidden within the data. Thus, knowledge discovery by machine learning techniques is necessary if we want to better understand information from data. In this dissertation, we explore the topics of asymmetric loss and asymmetric data in machine learning and propose new algorithms as solutions to some of the problems in these topics. We also study variable selection for matched data sets and propose a solution for cases where there is non-linearity in the matched data. The research is divided into three parts. The first part addresses the problem of asymmetric loss. A proposed asymmetric support vector machine (aSVM) is used to predict specific classes with high accuracy. aSVM was shown to produce higher precision than a regular SVM. The second part addresses asymmetric data sets, where variables are predictive for only a subset of the predictor classes. Asymmetric Random Forest (ARF) was proposed to detect these kinds of variables. The third part explores variable selection for matched data sets. Matched Random Forest (MRF) was proposed to find variables that are able to distinguish case and control without the restrictions that exist in linear models. MRF detects such variables even in the presence of interactions and qualitative variables.
Contributors: Koh, Derek (Author) / Runger, George C. (Thesis advisor) / Wu, Tong (Committee member) / Pan, Rong (Committee member) / Cesta, John (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
This dissertation presents methods for addressing research problems that currently can be adequately solved only with Quality Reliability Engineering (QRE) approaches, especially accelerated life testing (ALT) of electronic printed wiring boards, with applications to avionics circuit boards. The methods presented in this research are generally applicable to circuit boards, but the data generated and their analysis are for high-performance avionics. Aircraft equipment manufacturers typically require an expected life of 20 years for avionics equipment, so ALT is the only practical way of performing life test estimates. Both thermal and vibration ALT-induced failures are generated and analyzed to resolve industry questions relating to the introduction of lead-free solder products and processes into high-reliability avionics. In chapter 2, thermal ALT using an industry-standard failure machine implementing the Interconnect Stress Test (IST), which simulates circuit board life data, is compared to real production failure data by likelihood ratio tests to arrive at a mechanical theory. This mechanical theory results in a statistically equivalent energy bound such that failure distributions below a specific energy level are considered to be from the same distribution, thus allowing testers to quantify parameter settings in IST prior to life testing. In chapter 3, vibration ALT comparing tin-lead and lead-free circuit board solder designs uses the likelihood ratio (LR) test to assess both complete failure data and S-N curves and to present methods for analyzing the data. Failure data are analyzed using regression and two-way analysis of variance (ANOVA) and reconciled with the LR test results, which indicate that a costly aging pre-process may be eliminated in certain cases. In chapter 4, side-by-side tin-lead and lead-free solder black box designs are life tested under vibration ALT. Commercial models from strain data do not exist at the low levels associated with life testing and need to be developed, because the testing performed and presented here indicates that tin-lead and lead-free solders are similar. In addition, earlier failures due to vibration, such as connector failure modes, will occur before solder interconnect failures.
Contributors: Juarez, Joseph Moses (Author) / Montgomery, Douglas C. (Thesis advisor) / Borror, Connie M. (Thesis advisor) / Gel, Esma (Committee member) / Mignolet, Marc (Committee member) / Pan, Rong (Committee member) / Arizona State University (Publisher)
Created: 2012
Description
Temporal data are increasingly prevalent and important in analytics. Time series (TS) data are chronological sequences of observations and an important class of temporal data. Fields such as medicine, finance, learning science and multimedia naturally generate TS data. Each series provides a high-dimensional data vector that challenges the learning of the relevant patterns. This dissertation proposes TS representations and methods for supervised TS analysis. The approaches combine new representations that handle translations and dilations of patterns with bag-of-features strategies and tree-based ensemble learning. This provides flexibility in handling time-warped patterns in a computationally efficient way. The ensemble learners provide a classification framework that can handle high-dimensional feature spaces, multiple classes and interactions between features. The proposed representations are useful for classification and interpretation of TS data of varying complexity. The first contribution handles the problem of time warping with a feature-based approach. An interval selection and local feature extraction strategy is proposed to learn a bag-of-features representation. This is distinctly different from common similarity-based time warping and allows additional features (such as pattern location) to be easily integrated into the models. The learners can account for the temporal information through the recursive partitioning method. The second contribution focuses on the comprehensibility of the models. A new representation is integrated with local feature importance measures from tree-based ensembles to diagnose and interpret time intervals that are important to the model. Multivariate time series (MTS) are especially challenging because the input consists of a collection of TS, and both features within TS and interactions between TS can be important to models.
Another contribution uses a different representation to produce computationally efficient strategies that learn a symbolic representation for MTS. Relationships between the multiple TS, as well as nominal and missing values, are handled with tree-based learners. Applications such as speech recognition, medical diagnosis and gesture recognition are used to illustrate the methods. Experimental results show that the TS representations and methods provide better results than competitive methods on a comprehensive collection of benchmark datasets. Moreover, the proposed approaches naturally provide solutions to similarity analysis, predictive pattern discovery and feature selection.
Contributors: Baydogan, Mustafa Gokce (Author) / Runger, George C. (Thesis advisor) / Atkinson, Robert (Committee member) / Gel, Esma (Committee member) / Pan, Rong (Committee member) / Arizona State University (Publisher)
Created: 2012
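As a rough illustration of the interval-based local feature extraction mentioned in the time series abstract above, the sketch below splits one series into fixed-length intervals and summarizes each one. It is not the dissertation's algorithm: the fixed, non-overlapping intervals, the choice of mean, variance and slope as features, and the names `slope` and `interval_features` are all assumptions made for this example.

```python
from statistics import mean, pvariance


def slope(values):
    """Least-squares slope of values against their index 0..n-1."""
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def interval_features(series, interval_length):
    """Summarize each non-overlapping interval of a series.

    Assumption: fixed-length intervals; the interval start is kept so
    that pattern location is available as an extra feature.
    """
    features = []
    last_start = len(series) - interval_length
    for start in range(0, last_start + 1, interval_length):
        window = series[start:start + interval_length]
        features.append({
            "start": start,
            "mean": mean(window),
            "variance": pvariance(window),
            "slope": slope(window),
        })
    return features


rows = interval_features([0, 1, 2, 3, 4, 4, 4, 4], interval_length=4)
for row in rows:
    print(row)
```

Each row could then be passed to a tree-based learner as one instance in a bag of features; that step is omitted here.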
Description
League of Legends is a Multiplayer Online Battle Arena (MOBA) game. In a typical MOBA game, two teams of five players, each controlling a character (champion), try to take the other team's base as quickly as possible. With about 70 million players, League of Legends is currently number one in the digital entertainment industry, with $1.63 billion of revenue in 2015. This research focuses on the niche "Jungler" role across different tiers of players in League of Legends. I uncovered differences in player strategy that may explain the achievement of high rank, using data aggregation through Riot Games' API, data slicing with time-sensitive data, random sampling, clustering by tiers, graphical techniques to display the clusters, distribution analysis and, finally, a comprehensive factor analysis of the data's implications.
Contributors: Poon, Alex (Author) / Clark, Joseph (Thesis director) / Simon, Alan (Committee member) / Department of Information Systems (Contributor) / Department of Management (Contributor) / Barrett, The Honors College (Contributor)
Created: 2016-05
Description
The object of the present study is to examine methods by which the company can optimize its costs for third-party suppliers who oversee other third-party trade labor. The third parties in the scope of this study are suspected of overstaffing their workforce and thus overcharging the company. We introduce a complex spreadsheet model that proposes a proper project staffing level based on key qualitative variables and statistics. Using the model outputs, the thesis team proposes a headcount solution for the company and identifies problem areas to focus on going forward. All sources of information come from company proprietary and confidential documents.
Contributors: Loo, Andrew (Co-author) / Brennan, Michael (Co-author) / Sheiner, Alexander (Co-author) / Hertzel, Michael (Thesis director) / Simonson, Mark (Committee member) / Barrett, The Honors College (Contributor) / Department of Information Systems (Contributor) / Department of Finance (Contributor) / Department of Supply Chain Management (Contributor) / WPC Graduate Programs (Contributor) / School of Accountancy (Contributor)
Created: 2014-05
Description
Yield is a key process performance characteristic in the capital-intensive semiconductor fabrication process. In an industry where machines cost millions of dollars and cycle times are a number of months, predicting and optimizing yield are critical to process improvement, customer satisfaction, and financial success. Semiconductor yield modeling is essential to identifying processing issues, improving quality, and meeting customer demand in the industry. However, the complicated fabrication process, the massive amount of data collected, and the number of models available make yield modeling a complex and challenging task. This work presents modeling strategies to forecast yield using generalized linear models (GLMs) based on defect metrology data. The research is divided into three main parts. First, the data integration and aggregation necessary for model building are described, and GLMs are constructed for yield forecasting. This technique yields results at both the die and the wafer levels, outperforms existing models found in the literature based on prediction errors, and identifies significant factors that can drive process improvement. This method also allows the nested structure of the process to be considered in the model, improving predictive capabilities and violating fewer assumptions. To account for the random sampling typically used in fabrication, the work is extended by using generalized linear mixed models (GLMMs) and a larger dataset to show the differences between batch-specific and population-averaged models in this application and how they compare to GLMs. These results show some additional improvements in forecasting abilities under certain conditions and show the differences between the significant effects identified in the GLM and GLMM models. The effects of link functions and sample size are also examined at the die and wafer levels. 
The third part of this research describes a methodology for integrating classification and regression trees (CART) with GLMs. This technique uses the terminal nodes identified in the classification tree to add predictors to a GLM. This method enables the model to consider important interaction terms in a simpler way than with the GLM alone, and provides valuable insight into the fabrication process through the combination of the tree structure and the statistical analysis of the GLM.
Contributors: Krueger, Dana Cheree (Author) / Montgomery, Douglas C. (Thesis advisor) / Fowler, John (Committee member) / Pan, Rong (Committee member) / Pfund, Michele (Committee member) / Arizona State University (Publisher)
Created: 2011
Description
Over the course of six months, we have worked in partnership with Arizona State University and a leading producer of semiconductor chips in the United States market (referred to as the "Company"), lending our skills in finance, statistics, model building, and external insight. We attempt to design models that help predict how much time it takes to implement a cost-saving project. These projects had previously been considered only on the merit of cost savings, but with an added dimension of time, we hope to forecast time according to a number of variables. With such a forecast, we can then apply it to an expense project prioritization model which relates time and cost savings together, compares many different projects simultaneously, and returns a series of present value calculations over different ranges of time. The goal is twofold: assist with an accurate prediction of a project's time to implementation, and provide a basis to compare different projects based on their present values, ultimately helping to reduce the Company's manufacturing costs and improve gross margins. We believe this approach, and the research found toward this goal, is most valuable for the Company. Two coaches from the Company have provided assistance and clarified our questions when necessary throughout our research. In this paper, we begin by defining the problem, setting an objective, and establishing a checklist to monitor our progress. Next, our attention shifts to the data: making observations, trimming the dataset, framing and scoping the variables to be used for the analysis portion of the paper. Before creating a hypothesis, we perform a preliminary statistical analysis of certain individual variables to enrich our variable selection process. After the hypothesis, we run multiple linear regressions with project duration as the dependent variable. After regression analysis and a test for robustness, we shift our focus to an intuitive model based on rules of thumb. 
We relate these models to an expense project prioritization tool developed using Microsoft Excel software. Our deliverables to the Company come in the form of (1) a rules of thumb intuitive model and (2) an expense project prioritization tool.
Contributors: Al-Assi, Hashim (Co-author) / Chiang, Robert (Co-author) / Liu, Andrew (Co-author) / Ludwick, David (Co-author) / Simonson, Mark (Thesis director) / Hertzel, Michael (Committee member) / Barrett, The Honors College (Contributor) / Department of Information Systems (Contributor) / Department of Finance (Contributor) / Department of Economics (Contributor) / Department of Supply Chain Management (Contributor) / School of Accountancy (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Mechanical and Aerospace Engineering Program (Contributor) / WPC Graduate Programs (Contributor)
Created: 2015-05
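The idea in the abstract above of relating time to implementation and cost savings through present value can be sketched as follows. This is only an illustration, not the thesis team's Excel tool: the monthly discounting, the assumption that savings start the month after implementation finishes, and the names `present_value` and `rank_projects` are all hypothetical.

```python
def present_value(annual_savings, annual_rate, months_to_implement, horizon_years):
    """Discounted value of savings that begin only after implementation.

    Assumptions: savings accrue evenly each month, start the month after
    implementation ends, and are discounted at a constant annual rate.
    """
    monthly_rate = (1 + annual_rate) ** (1 / 12) - 1
    monthly_savings = annual_savings / 12
    total = 0.0
    for month in range(1, horizon_years * 12 + 1):
        if month > months_to_implement:
            total += monthly_savings / (1 + monthly_rate) ** month
    return total


def rank_projects(projects, annual_rate, horizon_years):
    """Return project names ordered from highest to lowest present value."""
    scored = [
        (present_value(p["annual_savings"], annual_rate,
                       p["months_to_implement"], horizon_years), p["name"])
        for p in projects
    ]
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical projects: B saves more per year but takes far longer.
projects = [
    {"name": "A", "annual_savings": 100_000, "months_to_implement": 3},
    {"name": "B", "annual_savings": 120_000, "months_to_implement": 18},
]
for years in (1, 2, 5):
    print(years, rank_projects(projects, annual_rate=0.10, horizon_years=years))
```

Evaluating several horizons shows how a slower project with larger savings can move up the ranking as the time range grows.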
Description
In baseball, a starting pitcher has historically been a more durable pitcher capable of lasting long into games without tiring. For the entire history of Major League Baseball, these pitchers have been expected to last 6 innings or more into a game before being replaced. However, with advances in statistics and sabermetrics and their gradual acceptance by professional coaches, the role of the starting pitcher is beginning to change. Teams are experimenting with replacing starters sooner, challenging the traditional role of the starting pitcher. The goal of this study is to use statistical analyses to determine whether there is an exact point at which a team would benefit from replacing a starting or relief pitcher with another pitcher. We use stepwise logistic regression to predict the likelihood of a team scoring a run, given the current game situation, depending on whether or not a substitution is made.
Contributors: Buckley, Nicholas J (Author) / Samara, Marko (Thesis director) / Lanchier, Nicolas (Committee member) / School of Mathematical and Statistical Sciences (Contributor) / Department of Information Systems (Contributor) / Barrett, The Honors College (Contributor)
Created: 2019-05