Search Content

Real-time power system topology monitoring supported by synchrophasor measurements

Description

ABSTRACT

This dissertation introduces a real-time topology monitoring scheme for power systems intended to provide enhanced situational awareness during major system disturbances. The topology monitoring scheme requires accurate real-time topology information to be effective. This scheme is supported by advances in transmission line outage detection based on data-mining phasor measurement unit…

ABSTRACT

This dissertation introduces a real-time topology monitoring scheme for power systems intended to provide enhanced situational awareness during major system disturbances. The topology monitoring scheme requires accurate real-time topology information to be effective. This scheme is supported by advances in transmission line outage detection based on data-mining phasor measurement unit (PMU) measurements.

A network flow analysis scheme is proposed to track changes in user defined minimal cut sets within the system. This work introduces a new algorithm used to update a previous network flow solution after the loss of a single system branch. The proposed new algorithm provides a significantly decreased solution time that is desired in a real- time environment. This method of topology monitoring can provide system operators with visual indications of potential problems in the system caused by changes in topology.

This work also presents a method of determining all singleton cut sets within a given network topology called the one line remaining (OLR) algorithm. During operation, if a singleton cut set exists, then the system cannot withstand the loss of any one line and still remain connected. The OLR algorithm activates after the loss of a transmission line and determines if any singleton cut sets were created. These cut sets are found using properties of power transfer distribution factors and minimal cut sets.

The topology analysis algorithms proposed in this work are supported by line outage detection using PMU measurements aimed at providing accurate real-time topology information. This process uses a decision tree (DT) based data-mining approach to characterize a lost tie line in simulation. The trained DT is then used to analyze PMU measurements to detect line outages. The trained decision tree was applied to real PMU measurements to detect the loss of a 500 kV line and had no misclassifications.

The work presented has the objective of enhancing situational awareness during significant system disturbances in real time. This dissertation presents all parts of the proposed topology monitoring scheme and justifies and validates the methodology using a real system event.

ContributorsWerho, Trevor Nelson (Author) / Vittal, Vijay (Thesis advisor) / Heydt, Gerald (Committee member) / Hedman, Kory (Committee member) / Karady, George G. (Committee member) / Arizona State University (Publisher)

Created2015

A study of accelerated Bayesian additive regression trees

Description

Bayesian Additive Regression Trees (BART) is a non-parametric Bayesian model

that often outperforms other popular predictive models in terms of out-of-sample error. This thesis studies a modified version of BART called Accelerated Bayesian Additive Regression Trees (XBART). The study consists of simulation and real data experiments comparing XBART to other leading…

Bayesian Additive Regression Trees (BART) is a non-parametric Bayesian model

that often outperforms other popular predictive models in terms of out-of-sample error. This thesis studies a modified version of BART called Accelerated Bayesian Additive Regression Trees (XBART). The study consists of simulation and real data experiments comparing XBART to other leading algorithms, including BART. The results show that XBART maintains BART’s predictive power while reducing its computation time. The thesis also describes the development of a Python package implementing XBART.

ContributorsYalov, Saar (Author) / Hahn, P. Richard (Thesis advisor) / McCulloch, Robert (Committee member) / Kao, Ming-Hung (Committee member) / Arizona State University (Publisher)

Created2019

Malware Analysis and Classification Framework: Detecting Financial Malware Using Machine Learning Techniques

Description

Malware that perform identity theft or steal bank credentials are becoming increasingly common and can cause millions of dollars of damage annually. A large area of research focus is the automated detection and removal of such malware, due to their large impact on millions of people each year. Such a…

Malware that perform identity theft or steal bank credentials are becoming increasingly common and can cause millions of dollars of damage annually. A large area of research focus is the automated detection and removal of such malware, due to their large impact on millions of people each year. Such a detector will be beneficial to any industry that is regularly the target of malware, such as the financial sector. Typical detection approaches such as those found in commercial anti-malware software include signature-based scanning, in which malware executables are identified based on a unique signature or fingerprint developed for that malware. However, as malware authors continue to modify and obfuscate their malware, heuristic detection is increasingly popular, in which the behaviors of the malware are identified and patterns recognized. We explore a malware analysis and classification framework using machine learning to train classifiers to distinguish between malware and benign programs based upon their features and behaviors. Using both decision tree learning and support vector machines as classifier models, we obtained overall classification accuracies of around 80%. Due to limitations primarily including the usage of a small data set, our approach may not be suitable for practical classification of malware and benign programs, as evident by a high error rate.

ContributorsAnwar, Sajid (Co-author) / Chan, Tsz (Co-author) / Ahn, Gail-Joon (Thesis director) / Zhao, Ziming (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created2016-05

Missing Data in Conditional Inference Trees

Description

Decision trees is a machine learning technique that searches the predictor space for the variable and observed value that leads to the best prediction when the data are split into two nodes based on the variable and splitting value. Conditional Inference Trees (CTREEs) is a non-parametric class of decision trees…

Decision trees is a machine learning technique that searches the predictor space for the variable and observed value that leads to the best prediction when the data are split into two nodes based on the variable and splitting value. Conditional Inference Trees (CTREEs) is a non-parametric class of decision trees that uses statistical theory in order to select variables for splitting. Missing data can be problematic in decision trees because of an inability to place an observation with a missing value into a node based on the chosen splitting variable. Moreover, missing data can alter the selection process because of its inability to place observations with missing values. Simple missing data approaches (e.g., deletion, majority rule, and surrogate split) have been implemented in decision tree algorithms; however, more sophisticated missing data techniques have not been thoroughly examined. In addition to these approaches, this dissertation proposed a modified multiple imputation approach to handling missing data in CTREEs. A simulation was conducted to compare this approach with simple missing data approaches as well as single imputation and a multiple imputation with prediction averaging. Results revealed that simple approaches (i.e., majority rule, treat missing as its own category, and listwise deletion) were effective in handling missing data in CTREEs. The modified multiple imputation approach did not perform very well against simple approaches in most conditions, but this approach did seem best suited for small sample sizes and extreme missingness situations.

ContributorsManapat, Danielle Marie (Author) / Grimm, Kevin J (Thesis advisor) / Edwards, Michael C (Thesis advisor) / McNeish, Daniel (Committee member) / Anderson, Samantha F (Committee member) / Arizona State University (Publisher)

Created2023

Machine Learning for the Design of Screening Tests: General Principles and Applications in Criminology and Digital Medicine

Description

This dissertation explores applications of machine learning methods in service of the design of screening tests, which are ubiquitous in applications from social work, to criminology, to healthcare. In the first part, a novel Bayesian decision theory framework is presented for designing tree-based adaptive tests. On an application to youth…

This dissertation explores applications of machine learning methods in service of the design of screening tests, which are ubiquitous in applications from social work, to criminology, to healthcare. In the first part, a novel Bayesian decision theory framework is presented for designing tree-based adaptive tests. On an application to youth delinquency in Honduras, the method produces a 15-item instrument that is almost as accurate as a full-length 150+ item test. The framework includes specific considerations for the context in which the test will be administered, and provides uncertainty quantification around the trade-offs of shortening lengthy tests. In the second part, classification complexity is explored via theoretical and empirical results from statistical learning theory, information theory, and empirical data complexity measures. A simulation study that explicitly controls two key aspects of classification complexity is performed to relate the theoretical and empirical approaches. Throughout, a unified language and notation that formalizes classification complexity is developed; this same notation is used in subsequent chapters to discuss classification complexity in the context of a speech-based screening test. In the final part, the relative merits of task and feature engineering when designing a speech-based cognitive screening test are explored. Through an extensive classification analysis on a clinical speech dataset from patients with normal cognition and Alzheimer’s disease, the speech elicitation task is shown to have a large impact on test accuracy; carefully performed task and feature engineering are required for best results. A new framework for objectively quantifying speech elicitation tasks is introduced, and two methods are proposed for automatically extracting insights into the aspects of the speech elicitation task that are driving classification performance. The dissertation closes with recommendations for how to evaluate the obtained insights and use them to guide future design of speech-based screening tests.

ContributorsKrantsevich, Chelsea (Author) / Hahn, P. Richard (Thesis advisor) / Berisha, Visar (Committee member) / Lopes, Hedibert (Committee member) / Renaut, Rosemary (Committee member) / Zheng, Yi (Committee member) / Arizona State University (Publisher)

Created2023

Computational Challenges in BART Modeling: Extrapolation, Classification, and Causal Inference

Description

This dissertation centers on Bayesian Additive Regression Trees (BART) and Accelerated BART (XBART) and presents a series of models that tackle extrapolation, classification, and causal inference challenges. To improve extrapolation in tree-based models, I propose a method called local Gaussian Process (GP) that combines Gaussian process regression with trained BART…

This dissertation centers on Bayesian Additive Regression Trees (BART) and Accelerated BART (XBART) and presents a series of models that tackle extrapolation, classification, and causal inference challenges. To improve extrapolation in tree-based models, I propose a method called local Gaussian Process (GP) that combines Gaussian process regression with trained BART trees. This allows for extrapolation based on the most relevant data points and covariate variables determined by the trees' structure. The local GP technique is extended to the Bayesian causal forest (BCF) models to address the positivity violation issue in causal inference. Additionally, I introduce the LongBet model to estimate time-varying, heterogeneous treatment effects in panel data. Furthermore, I present a Poisson-based model, with a modified likelihood for XBART for the multi-class classification problem.

ContributorsWang, Meijia (Author) / Hahn, Paul (Thesis advisor) / He, Jingyu (Committee member) / Lan, Shiwei (Committee member) / McCulloch, Robert (Committee member) / Zhou, Shuang (Committee member) / Arizona State University (Publisher)

Created2024

Novel Learning-Based Task Schedulers for Domain-Specific SoCs

Description

This Master’s thesis includes the design, integration on-chip, and evaluation of a set of imitation learning (IL)-based scheduling policies: deep neural network (DNN)and decision tree (DT). We first developed IL-based scheduling policies for heterogeneous systems-on-chips (SoCs). Then, we tested these policies using a system-level domain-specific system-on-chip simulation framework [11]. Finally,…

This Master’s thesis includes the design, integration on-chip, and evaluation of a set of imitation learning (IL)-based scheduling policies: deep neural network (DNN)and decision tree (DT). We first developed IL-based scheduling policies for heterogeneous systems-on-chips (SoCs). Then, we tested these policies using a system-level domain-specific system-on-chip simulation framework [11]. Finally, we transformed them into efficient code using a cloud engine [1] and implemented on a user-space emulation framework [61] on a Unix-based SoC. IL is one area of machine learning (ML) and a useful method to train artificial intelligence (AI) models by imitating the decisions of an expert or Oracle that knows the optimal solution. This thesis's primary focus is to adapt an ML model to work on-chip and optimize the resource allocation for a set of domain-specific wireless and radar systems applications. Evaluation results with four streaming applications from wireless communications and radar domains show how the proposed IL-based scheduler approximates an offline Oracle expert with more than 97% accuracy and 1.20× faster execution time. The models have been implemented as an add-on, making it easy to port to other SoCs.

ContributorsHolt, Conrad Mestres (Author) / Ogras, Umit Y. (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Akoglu, Ali (Committee member) / Arizona State University (Publisher)

Created2020