Matching Items (2)
Description
Monitoring a system for deviations from standard or reference behavior is essential for many data-driven tasks. Whether it is monitoring sensor data or the interactions between system elements, such as edges in a path or transactions in a network, the goal is to detect significant changes from a reference. As technological advancements allow for more data to be collected from systems, monitoring approaches should evolve to accommodate the greater collection of high-dimensional data and complex system settings. This dissertation introduces system-level models for monitoring tasks characterized by changes in a subset of system components, utilizing component-level information and relationships. A change may only affect a portion of the data or system (partial change). The first three parts of this dissertation present applications and methods for detecting partial changes. The first part introduces a methodology for partial change detection in a simple, univariate setting. Changes are detected with posterior probabilities and statistical mixture models that allow only a fraction of the data to change. The second and third parts of this dissertation center around monitoring more complex multivariate systems modeled through networks. The goal is to detect partial changes in the underlying network attributes and topology. The contributions of the second and third parts are two non-parametric system-level monitoring techniques that consider relationships between network elements. The first algorithm, Supervised Network Monitoring (SNetM), leverages Graph Neural Networks and transforms the problem into supervised learning. The other algorithm, Supervised Network Monitoring for Partial Temporal Inhomogeneity (SNetMP), generates a network embedding and then transforms the problem into supervised learning. Finally, both SNetM and SNetMP construct measures and transform them into pseudo-probabilities to be monitored for changes.
The last topic addresses predicting and monitoring system-level delays on paths in a transportation/delivery system. For each item, the risk of delay is quantified. Machine learning is used to build a system-level model for delay risk that integrates edge models, given the information available (such as environmental conditions) on the edges of a path. The outputs can then be used in a system-wide monitoring framework, and the items most at risk are identified for potential corrective actions.
Contributors: Kasaei Roodsari, Maziar (Author) / Runger, George (Thesis advisor) / Escobedo, Adolfo (Committee member) / Pan, Rong (Committee member) / Shinde, Amit (Committee member) / Arizona State University (Publisher)
Created: 2021
Description
Matching or stratification is commonly used in observational studies to remove bias due to confounding variables. Analyzing matched data sets requires specific methods that handle dependency among observations within a stratum. Also, modern studies often include hundreds or thousands of variables. Traditional methods for matched data sets are challenged by high-dimensional settings, mixed-type variables (numerical and categorical), and nonlinear and interaction effects. Furthermore, machine learning research for such structured data is quite limited. This dissertation addresses this important gap and proposes machine learning models for identifying informative variables from high-dimensional matched data sets. The first part of this dissertation proposes a machine learning model to identify informative variables from high-dimensional matched case-control data sets. The outcome of interest in this study design is binary (case or control), and each stratum is assumed to have one unit from each outcome level. The proposed method, referred to as Matched Forest (MF), is effective for a large number of variables and for identifying interaction effects. The second part of this dissertation proposes three enhancements of the MF algorithm. First, a regularization framework is proposed to improve variable selection performance in excessively high-dimensional settings. Second, a classification method is proposed to classify unlabeled pairs of data. Third, two metrics are proposed to estimate the effects of important variables identified by MF. The third part proposes a machine learning model based on Neural Networks to identify important variables from a more generalized matched case-control data set where each stratum has one unit from the case outcome level and more than one unit from the control outcome level. This method, referred to as Matched Neural Network (MNN), performs better than current algorithms at identifying variables with interaction effects.
Lastly, a generalized machine learning model is proposed to identify informative variables from high-dimensional matched data sets where the outcome has more than two levels. This method outperforms existing algorithms in the literature in identifying variables with complex nonlinear and interaction effects.
Contributors: Shomal Zadeh, Nooshin (Author) / Runger, George (Thesis advisor) / Montgomery, Douglas (Committee member) / Shinde, Shilpa (Committee member) / Escobedo, Adolfo (Committee member) / Arizona State University (Publisher)
Created: 2021