Matching Items (3)
Description
Polar ice masses can be valuable indicators of trends in global climate. In an effort to better understand the dynamics of Arctic ice, this project analyzes sea ice concentration anomaly data collected over gridded regions (cells) and builds graphs based upon high correlations between cells. These graphs offer the opportunity to use metrics such as clustering coefficients and connected components to isolate representative trends in ice masses. Based upon this analysis, the structure of sea ice graphs differs at a statistically significant level from random graphs, and several regions show erratically decreasing trends in sea ice concentration.
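The graph construction described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the linking rule (Pearson correlation at or above a fixed `threshold`), the function names, and the default cutoff of 0.9 are all assumptions made for illustration, and the abstract does not state which correlation measure or cutoff was used.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return sum(a * b for a, b in zip(dx, dy)) / denom if denom else 0.0


def correlation_graph(series, threshold=0.9):
    """Link two cells when their anomaly series correlate at or above threshold.

    `series` maps a cell label to its list of anomaly values.
    The threshold rule is an assumption, not taken from the thesis.
    """
    adj = {cell: set() for cell in series}
    for a, b in combinations(series, 2):
        if pearson(series[a], series[b]) >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def connected_components(adj):
    """Return the connected components of the graph as a list of sets."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def clustering_coefficient(adj, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))
```

Given a mapping from cell labels to anomaly series, `correlation_graph` returns adjacency sets. `connected_components` then groups cells whose anomalies move together, and `clustering_coefficient` measures how tightly a cell's neighbors are linked to one another.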
Contributors: Wallace-Patterson, Chloe Rae (Author) / Syrotiuk, Violet (Thesis director) / Colbourn, Charles (Committee member) / Montgomery, Douglas (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Computer Science and Engineering Program (Contributor)
Created: 2013-05
Description
Complex systems appear when interaction among system components creates emergent behavior that is difficult to predict from component properties. The growth of the Internet of Things (IoT) and embedded technology has increased complexity across several sectors (e.g., automotive, aerospace, agriculture, city infrastructures, home technologies, healthcare) where the paradigm of cyber-physical systems (CPSs) has become a standard. While CPS enables unprecedented capabilities, it raises new challenges in system design, certification, control, and verification. When optimizing system performance, computationally expensive simulation tools are often required, and search algorithms that sequentially interrogate a simulator to learn promising solutions are in great demand. Algorithms of this class are black-box optimization techniques. However, the generality that makes black-box optimization desirable also causes computational efficiency difficulties when it is applied to real problems. This thesis focuses on Bayesian optimization, a prominent black-box optimization family, and proposes new principles, translated into implementable algorithms, to scale Bayesian optimization to highly expensive, large-scale problems. Four problem contexts are studied and approaches are proposed for practically applying Bayesian optimization concepts, namely: (1) increasing sample efficiency of a highly expensive simulator in the presence of other sources of information, where multi-fidelity optimization is used to leverage complementary information sources; (2) accelerating global optimization in the presence of local searches by avoiding over-exploitation with adaptive restart behavior; (3) scaling optimization to high-dimensional input spaces by integrating game-theoretic mechanisms with traditional techniques; (4) accelerating optimization by embedding function structure when the reward function is a minimum of several functions.
In the first context, this thesis produces two multi-fidelity algorithms, one sample-driven and one model-driven, which are implemented to optimize a serial production line; in the second context, the Stochastic Optimization with Adaptive Restart (SOAR) framework is produced and analyzed with multiple applications to CPS falsification problems; in the third context, the Bayesian optimization with sample fictitious play (BOFiP) algorithm is developed with an implementation in high-dimensional neural network training; in the last problem context, the minimum surrogate optimization (MSO) framework is produced and combined with both Bayesian optimization and the SOAR framework, with applications in simultaneous falsification of multiple CPS requirements.
Contributors: Mathesen, Logan (Author) / Pedrielli, Giulia (Thesis advisor) / Candan, Kasim (Committee member) / Fainekos, Georgios (Committee member) / Gel, Esma (Committee member) / Montgomery, Douglas (Committee member) / Zabinsky, Zelda (Committee member) / Arizona State University (Publisher)
Created: 2021
Description
Matching or stratification is commonly used in observational studies to remove bias due to confounding variables. Analyzing matched data sets requires specific methods that handle dependency among observations within a stratum. Also, modern studies often include hundreds or thousands of variables. Traditional methods for matched data sets are challenged by high-dimensional settings, mixed-type variables (numerical and categorical), and nonlinear and interaction effects. Furthermore, machine learning research for such structured data is quite limited. This dissertation addresses this important gap and proposes machine learning models for identifying informative variables from high-dimensional matched data sets. The first part of this dissertation proposes a machine learning model to identify informative variables from high-dimensional matched case-control data sets. The outcome of interest in this study design is binary (case or control), and each stratum is assumed to have one unit from each outcome level. The proposed method, referred to as Matched Forest (MF), is effective for a large number of variables and for identifying interaction effects. The second part of this dissertation proposes three enhancements of the MF algorithm. First, a regularization framework is proposed to improve variable selection performance in excessively high-dimensional settings. Second, a classification method is proposed to classify unlabeled pairs of data. Third, two metrics are proposed to estimate the effects of important variables identified by MF. The third part proposes a machine learning model based on neural networks to identify important variables from a more generalized matched case-control data set, where each stratum has one unit from the case outcome level and more than one unit from the control outcome level. This method, referred to as Matched Neural Network (MNN), performs better than current algorithms at identifying variables with interaction effects.
Lastly, a generalized machine learning model is proposed to identify informative variables from high-dimensional matched data sets where the outcome has more than two levels. This method outperforms existing algorithms in the literature in identifying variables with complex nonlinear and interaction effects.
Contributors: Shomal Zadeh, Nooshin (Author) / Runger, George (Thesis advisor) / Montgomery, Douglas (Committee member) / Shinde, Shilpa (Committee member) / Escobedo, Adolfo (Committee member) / Arizona State University (Publisher)
Created: 2021