Matching Items (160)
Description
Product reliability has become a top concern for manufacturers, and customers prefer products that perform well over long periods. Because most products can last for years or even decades, accelerated life testing (ALT) is used to estimate product lifetime. Much research has been done in the ALT area, and optimal design for ALT is a major topic. This dissertation consists of three main studies. First, a methodology for finding optimal designs for ALT with right censoring and interval censoring is developed; it employs the proportional hazards (PH) model and the generalized linear model (GLM) to simplify the computational process. A sensitivity study is also given to show the effects of the parameters on the designs. Second, an extended version of the I-optimal design for ALT is discussed, and a dual-objective design criterion is then defined and demonstrated with several examples. In addition, several graphical tools are developed to evaluate different candidate designs. Finally, for the case where more than one model is available, different model-checking designs are discussed.
Contributors: Yang, Tao (Author) / Pan, Rong (Thesis advisor) / Montgomery, Douglas C. (Committee member) / Borror, Connie (Committee member) / Rigdon, Steve (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Nonregular screening designs can be an economical alternative to traditional resolution IV 2^(k-p) fractional factorials. Recently, 16-run nonregular designs, referred to as no-confounding designs, were introduced in the literature. These designs have the property that no pair of main effect (ME) and two-factor interaction (2FI) estimates is completely confounded. In this dissertation, orthogonal arrays were evaluated with many popular design-ranking criteria in order to identify optimal 20-run and 24-run no-confounding designs. Monte Carlo simulation was used to empirically assess the model-fitting effectiveness of the recommended no-confounding designs. The results of the simulation demonstrated that these new designs, particularly the 24-run designs, are successful at detecting active effects over 95% of the time given sufficient model effect sparsity. The final chapter presents a screening design selection methodology, based on decision trees, to aid in the selection of a screening design from a list of published options. The methodology determines which of a candidate set of screening designs has the lowest expected experimental cost.
Contributors: Stone, Brian (Author) / Montgomery, Douglas C. (Thesis advisor) / Silvestrini, Rachel T. (Committee member) / Fowler, John W (Committee member) / Borror, Connie M. (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
With the advent of social media (such as Twitter and Facebook), people share their opinions and sentiments and press their ideologies on others as never before. Even people who are otherwise socially inactive would like to share their thoughts on current affairs by tweeting and sharing news feeds with their friends and acquaintances. In this thesis, we chose Twitter as our main data platform to analyze shifts and movements of 27 political organizations in Indonesia. So far, we have collected over 30 million tweets and 150,000 news articles from RSS feeds of the corresponding organizations for our analysis. For Twitter data extraction, we developed a multi-threaded application that extracts, cleans, and stores millions of tweets matching our keywords from the Twitter Streaming API. For keyword extraction, we used topics and perspectives that were extracted using n-gram techniques and later approved by our social scientists. After the data is extracted, we aggregate the tweet contents belonging to each user on a weekly basis. Finally, we applied linear and logistic regression using SLEP, an open-source sparse learning package, to compute a weekly score for each user and map the user to one of the 27 organizations on a radical or counter-radical scale. Since we map users to organizations on a weekly basis, we are able to track users' behavior and important new events that triggered shifts among users between organizations. This thesis can be further extended to identify topic-specific and organization-specific influential users, and new users from other social media platforms such as Facebook and YouTube can easily be mapped to existing organizations on a radical or counter-radical scale.
Contributors: Poornachandran, Sathishkumar (Author) / Davulcu, Hasan (Thesis advisor) / Sen, Arunabha (Committee member) / Woodward, Mark (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
This dissertation explores different methodologies for combining two popular design paradigms in the field of computer experiments. Space-filling designs are commonly used to ensure good coverage of the design space, but they may not have good properties when it comes to model fitting. Optimal designs traditionally perform very well in terms of model fitting, particularly when a polynomial is intended, but can result in problematic replication in the case of insignificant factors. By bringing these two design types together, positive properties of each can be retained while mitigating potential weaknesses. Hybrid space-filling designs, generated as Latin hypercubes augmented with I-optimal points, are compared to designs of each contributing component. A second design type called a bridge design is also evaluated, which further integrates the disparate design types. Bridge designs are the result of a Latin hypercube undergoing coordinate exchange to reach constrained D-optimality, ensuring that there is zero replication of factors in any one-dimensional projection. Lastly, bridge designs were augmented with I-optimal points with two goals in mind. Augmentation with candidate points generated assuming the same underlying analysis model serves to reduce the prediction variance without greatly compromising the space-filling property of the design, while augmentation with candidate points generated assuming a different underlying analysis model can greatly reduce the impact of model misspecification during the design phase. Each of these composite designs is compared to pure space-filling and optimal designs. They typically outperform pure space-filling designs in terms of prediction variance and alphabetic efficiency, while remaining comparable to pure optimal designs at small sample sizes. This justifies them as excellent candidates for initial experimentation.
Contributors: Kennedy, Kathryn (Author) / Montgomery, Douglas C. (Thesis advisor) / Johnson, Rachel T. (Thesis advisor) / Fowler, John W (Committee member) / Borror, Connie M. (Committee member) / Arizona State University (Publisher)
Created: 2013
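As a rough illustration of the Latin hypercube starting point mentioned in the Kennedy abstract above, the sketch below generates a random Latin hypercube and checks that no level repeats in any one-dimensional projection. It is not the dissertation's algorithm: the function names `latin_hypercube` and `has_no_1d_replication` are hypothetical, and the coordinate exchange and I-optimal augmentation steps are not shown.

```python
import random


def latin_hypercube(n_runs, n_factors, seed=0):
    """Return n_runs points in [0, 1)^n_factors, one point per stratum in each factor."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_factors):
        strata = list(range(n_runs))
        rng.shuffle(strata)  # random pairing of strata across factors
        # Place each point uniformly at random inside its stratum.
        columns.append([(s + rng.random()) / n_runs for s in strata])
    # Transpose the list of columns into a list of design points (rows).
    return [tuple(col[i] for col in columns) for i in range(n_runs)]


def has_no_1d_replication(design):
    """True if every factor has exactly one point in each of the n equal-width strata."""
    n_runs = len(design)
    n_factors = len(design[0])
    for j in range(n_factors):
        strata = sorted(int(point[j] * n_runs) for point in design)
        if strata != list(range(n_runs)):
            return False
    return True


design = latin_hypercube(n_runs=8, n_factors=3)
```

Because each factor receives exactly one point per stratum, projecting the design onto any single factor still yields eight distinct levels, which is the property the abstract describes as zero replication in one-dimensional projections.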
Description
During the initial stages of experimentation, there are usually a large number of factors to be investigated. Fractional factorial (2^(k-p)) designs are particularly useful during this initial phase of experimental work. These experiments, often referred to as screening experiments, help reduce the large number of factors to a smaller set. The 16-run regular fractional factorial designs for six, seven, and eight factors are in common use. These designs allow clear estimation of all main effects when the three-factor and higher-order interactions are negligible, but all two-factor interactions are aliased with each other, making estimation of these effects problematic without additional runs. Alternatively, certain nonregular designs called no-confounding (NC) designs by Jones and Montgomery (Jones & Montgomery, Alternatives to resolution IV screening designs in 16 runs, 2010) partially confound the main effects with the two-factor interactions but do not completely confound any two-factor interactions with each other. The NC designs are useful for independently estimating main effects and two-factor interactions without additional runs. While several methods have been suggested for the analysis of data from nonregular designs, stepwise regression is familiar to practitioners, available in commercial software, and widely used in practice. Given that an NC design has been run, the performance of stepwise regression for model selection is unknown. In this dissertation I present a comprehensive simulation study evaluating stepwise regression for analyzing both regular fractional factorial and NC designs. Next, the projection properties of the six-, seven-, and eight-factor NC designs are studied. Studying these projection properties supports the development of methods for analyzing the designs. Lastly, the designs and projection properties of 9- to 14-factor NC designs onto three and four factors are presented. Certain recommendations are made on analysis methods for these designs as well.
Contributors: Shinde, Shilpa (Author) / Montgomery, Douglas C. (Thesis advisor) / Borror, Connie (Committee member) / Fowler, John (Committee member) / Jones, Bradley (Committee member) / Arizona State University (Publisher)
Created: 2012
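The aliasing of two-factor interactions in a regular 16-run design, as described in the Shinde abstract above, can be made concrete with a small sketch. It builds a six-factor 2^(6-2) design and groups the two-factor interactions whose columns are identical. The generators E = ABC and F = BCD are an assumption chosen for illustration; the abstract does not say which generators the dissertation uses.

```python
from itertools import combinations, product

# Base factors A-D take every +/-1 combination: 2^4 = 16 runs.
runs = []
for a, b, c, d in product((-1, 1), repeat=4):
    e = a * b * c  # generator E = ABC (assumed for illustration)
    f = b * c * d  # generator F = BCD (assumed for illustration)
    runs.append({"A": a, "B": b, "C": c, "D": d, "E": e, "F": f})


def interaction_column(x, y):
    """Column of the two-factor interaction between factors x and y."""
    return tuple(run[x] * run[y] for run in runs)


# Group two-factor interactions whose columns are identical (completely aliased).
alias_groups = {}
for x, y in combinations("ABCDEF", 2):
    alias_groups.setdefault(interaction_column(x, y), []).append(x + y)
aliased = sorted(group for group in alias_groups.values() if len(group) > 1)
```

Under these generators every one of the 15 two-factor interactions shares its column with at least one other, so the 16 runs alone cannot separate the members of a group such as AB and CE. This is the limitation that the NC designs are meant to avoid.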
Description
With the rapid development of mobile sensing technologies such as GPS, RFID, and smartphone sensors, capturing position data in the form of trajectories has become easy. Moving-object trajectory analysis is a growing area of interest owing to its applications in domains such as marketing, security, and traffic monitoring and management. To better understand movement behaviors from raw mobility data, this doctoral work provides analytic models for analyzing trajectory data. As a first contribution, a model is developed to detect changes in trajectories over time. If the taxis moving in a city are viewed as sensors that provide real-time information about traffic in the city, a change in their trajectories over time can reveal that the road network has changed. To detect changes, trajectories are modeled with a Hidden Markov Model (HMM). A modified training algorithm for HMM parameter estimation, called m-BaumWelch, is used to develop likelihood estimates under assumed changes, which are then used to detect changes in trajectory data over time. Data from vehicles are used to test the change detection method. Secondly, sequential pattern mining is used to develop a model to detect changes in frequent patterns occurring in trajectory data. The aim is to answer two questions: Are the frequent patterns still frequent in the new data? If they are frequent, has the time interval distribution in the pattern changed? Two approaches to change detection are considered: a frequency-based approach and a distribution-based approach. The methods are illustrated with vehicle trajectory data. Finally, a model is developed for clustering and outlier detection in semantic trajectories. A challenge with clustering semantic trajectories is that both numeric and categorical attributes are present. Another problem to be addressed while clustering is that trajectories can be of different lengths and can have missing values. A tree-based ensemble is used to address these problems. The approach is extended to outlier detection in semantic trajectories.
Contributors: Kondaveeti, Anirudh (Author) / Runger, George C. (Thesis advisor) / Mirchandani, Pitu (Committee member) / Pan, Rong (Committee member) / Maciejewski, Ross (Committee member) / Arizona State University (Publisher)
Created: 2012
Description
The pay-as-you-go economic model of cloud computing increases the visibility, traceability, and verifiability of software costs. Application developers must understand how their software uses resources when running in the cloud in order to stay within budgeted costs and/or produce expected profits. Cloud computing's unique economic model also leads naturally to an earn-as-you-go profit model for many cloud-based applications. These applications can benefit from low-level analyses for cost optimization and verification. Testing cloud applications to ensure they meet monetary cost objectives has not been well explored in the current literature. When considering revenues and costs for cloud applications, the resource economic model can be scaled down to the transaction level in order to associate source code with costs incurred while running in the cloud. Both static and dynamic analysis techniques can be developed and applied to understand how and where cloud applications incur costs. Such analyses can help optimize (i.e., minimize) costs and verify that they stay within expected tolerances. An adaptation of Worst Case Execution Time (WCET) analysis is presented here to statically determine worst-case monetary costs of cloud applications. This analysis is used to produce an algorithm for determining control flow paths within an application that can exceed a given cost threshold. The corresponding results are used to identify path sections that contribute most to cost excess. A hybrid approach for determining cost excesses is also presented; it consists mostly of dynamic measurements but also incorporates calculations based on the static analysis approach. This approach uses operational profiles to increase the precision and usefulness of the calculations.
Contributors: Buell, Kevin, Ph.D (Author) / Collofello, James (Thesis advisor) / Davulcu, Hasan (Committee member) / Lindquist, Timothy (Committee member) / Sen, Arunabha (Committee member) / Arizona State University (Publisher)
Created: 2012
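To give a feel for the kind of static analysis the Buell abstract above describes, here is a minimal sketch that finds control flow paths whose total monetary cost exceeds a threshold. Everything in it is an assumption made for illustration: the graph `CFG`, the per-node prices in `COST`, and the function names are hypothetical, the graph is acyclic (a real WCET-style analysis would also need loop bounds), and it is not the dissertation's algorithm.

```python
from functools import lru_cache

# Hypothetical control-flow graph: node -> successor nodes (acyclic for simplicity).
CFG = {
    "entry": ["read_db", "cache_hit"],
    "read_db": ["render"],
    "cache_hit": ["render"],
    "render": ["exit"],
    "exit": [],
}
# Hypothetical monetary cost (in micro-dollars) charged each time a node executes.
COST = {"entry": 0, "read_db": 40, "cache_hit": 2, "render": 5, "exit": 0}


@lru_cache(maxsize=None)
def worst_case_cost(node):
    """Maximum total cost of any path from node (inclusive) to a terminal node."""
    successors = CFG[node]
    if not successors:
        return COST[node]
    return COST[node] + max(worst_case_cost(s) for s in successors)


def paths_exceeding(node, threshold, prefix=(), spent=0):
    """Yield (path, cost) for every complete path whose cost exceeds threshold.

    spent is the cost accumulated before node executes.
    """
    # Prune: even the most expensive continuation cannot exceed the threshold.
    if spent + worst_case_cost(node) <= threshold:
        return
    path = prefix + (node,)
    spent += COST[node]
    if not CFG[node]:
        yield path, spent
        return
    for successor in CFG[node]:
        yield from paths_exceeding(successor, threshold, path, spent)


over_budget = list(paths_exceeding("entry", threshold=20))
```

With these made-up prices, only the path through `read_db` exceeds the threshold of 20, and `read_db` is the section that contributes most to the excess. The memoized worst-case bound lets the search skip the `cache_hit` branch without enumerating it.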
Description
Communication networks, both wired and wireless, are expected to have a certain level of fault-tolerance capability. These networks are also expected to ensure a graceful degradation in performance when some of the network components fail. Traditional studies on fault tolerance in communication networks, for the most part, make no assumptions regarding the location of node/link faults, i.e., the faulty nodes and links may be close to each other or far from each other. However, in many real-life scenarios, there is a strong spatial correlation among the faulty nodes and links. Such failures are often encountered in disaster situations, e.g., natural calamities or enemy attacks. In the presence of such region-based faults, many traditional network analysis and fault-tolerance metrics that are valid under non-spatially correlated faults are no longer applicable. Accordingly, the main thrust of this research is the design and analysis of robust networks in the presence of such region-based faults. One important finding of this research is that if some prior knowledge is available on the maximum size of the region that might be affected by a region-based fault, this knowledge can be used effectively for resource-efficient design of networks. It is shown in this dissertation that in some scenarios, effective use of this knowledge may result in substantial savings in transmission power in wireless networks. In this dissertation, the impact of region-based faults on the connectivity of wireless networks is studied, and a new metric, region-based connectivity, is proposed to measure the fault-tolerance capability of a network. In addition, novel metrics, such as the region-based component decomposition number (RBCDN) and region-based largest component size (RBLCS), are proposed to capture the network state when a region-based fault disconnects the network. Finally, this dissertation presents efficient resource allocation techniques that ensure tolerance against region-based faults in distributed file storage networks and data center networks.
Contributors: Banerjee, Sujogya (Author) / Sen, Arunabha (Thesis advisor) / Xue, Guoliang (Committee member) / Richa, Andrea (Committee member) / Hurlbert, Glenn (Committee member) / Arizona State University (Publisher)
Created: 2013
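The effect of a region-based fault, as discussed in the Banerjee abstract above, can be sketched by removing every node inside one fault region and inspecting what remains. The abstract does not give formal definitions of RBCDN or RBLCS, so the sketch only reports, for a single hypothetical circular fault region, the number of surviving components and the size of the largest one. The network, the coordinates, and the function name are all assumptions, and links are treated as failing only when an endpoint fails, which is a simplification.

```python
import math

# Hypothetical network: node -> (x, y) position, plus undirected links.
POSITIONS = {"a": (0, 0), "b": (1, 0), "c": (2, 0), "d": (3, 0), "e": (2, 1)}
LINKS = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e")]


def surviving_components(positions, links, center, radius):
    """Connected components left after all nodes within radius of center fail."""
    alive = {n for n, p in positions.items() if math.dist(p, center) > radius}
    neighbors = {n: set() for n in alive}
    for u, v in links:
        if u in alive and v in alive:  # a link survives only if both ends survive
            neighbors[u].add(v)
            neighbors[v].add(u)
    components, seen = [], set()
    for start in sorted(alive):
        if start in seen:
            continue
        # Depth-first search collects one component.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        components.append(component)
    return components


parts = surviving_components(POSITIONS, LINKS, center=(2, 0), radius=0.5)
component_count = len(parts)
largest_size = max(len(c) for c in parts)
```

In this toy network a small fault centered on node `c` splits the four surviving nodes into three components, the largest containing two nodes. Repeating the computation over all candidate regions would be one way to summarize a network's behavior under region-based faults.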
Description
With the increase in computing power and availability of data, there has never been a greater need to understand data and make decisions from it. Traditional statistical techniques may not be adequate to handle the size of today's data or the complexities of the information hidden within the data. Thus, knowledge discovery by machine learning techniques is necessary if we want to better understand information from data. In this dissertation, we explore the topics of asymmetric loss and asymmetric data in machine learning and propose new algorithms as solutions to some of the problems in these topics. We also study variable selection for matched data sets and propose a solution for the case where there is non-linearity in the matched data. The research is divided into three parts. The first part addresses the problem of asymmetric loss. A proposed asymmetric support vector machine (aSVM) is used to predict specific classes with high accuracy. aSVM was shown to produce higher precision than a regular SVM. The second part addresses asymmetric data sets where variables are only predictive for a subset of the predictor classes. Asymmetric Random Forest (ARF) was proposed to detect these kinds of variables. The third part explores variable selection for matched data sets. Matched Random Forest (MRF) was proposed to find variables that are able to distinguish case and control without the restrictions that exist in linear models. MRF detects variables that are able to distinguish case and control even in the presence of interaction and qualitative variables.
Contributors: Koh, Derek (Author) / Runger, George C. (Thesis advisor) / Wu, Tong (Committee member) / Pan, Rong (Committee member) / Cesta, John (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Data mining is increasing in importance in solving a variety of industry problems. Our initiative involves estimating resource requirements by skill set for future projects by mining and analyzing actual resource consumption data from past projects in the semiconductor industry. To achieve this goal, we face difficulties such as relevant consumption data stored in different formats and insufficient data about project attributes to interpret the consumption data. Our first goal is to clean the historical data and organize it into meaningful structures for analysis. Once the data preprocessing is completed, data mining techniques such as clustering are applied to find projects that involve resources with similar skill sets and that are of similar complexity and size. This results in "resource utilization templates" for groups of related projects from a resource consumption perspective. Then the project characteristics that generate this diversity in headcounts and skill sets are identified. These characteristics are not currently contained in the database and are elicited from the managers of historical projects. This represents an opportunity to improve the usefulness of the data collection system for the future. The ultimate goal is to match the product technical features with the resource requirements of past projects as a model to forecast resource requirements by skill set for future projects. The forecasting model is developed using linear regression with cross-validation of the training data, because past project executions are relatively few in number. Acceptable levels of forecast accuracy are achieved relative to human experts' results, and the tool is applied to forecast the resource demand of some future projects.
Contributors: Bhattacharya, Indrani (Author) / Sen, Arunabha (Thesis advisor) / Kempf, Karl G. (Thesis advisor) / Liu, Huan (Committee member) / Arizona State University (Publisher)
Created: 2013