A Machine Learning Framework for Power System Event Identification via Modal Analysis of Phasor Measurement Unit Data

Description

Event identification is increasingly recognized as crucial for enhancing the reliability, security, and stability of the electric power system. With the growing deployment of Phasor Measurement Units (PMUs) and advancements in data science, this dissertation explores the potential of data-driven event identification through machine learning classification techniques. In the first part of this dissertation, using measurements from multiple PMUs, I propose to identify events by extracting features based on modal dynamics. I combine such traditional physics-based feature extraction methods with machine learning to distinguish different event types. Using the obtained set of features, I investigate the performance of two well-known classification models, namely, logistic regression (LR) and support vector machines (SVM), to identify generation loss and line trip events in two datasets. The first dataset is obtained from simulated events in the Texas 2000-bus synthetic grid. The second is a proprietary dataset with labeled events obtained from a large utility in the USA. My results indicate that the proposed framework is promising for identifying the two types of events in the supervised setting. In the second part of the dissertation, I use semi-supervised learning techniques, which make use of both labeled and unlabeled samples. I evaluate three categories of classical semi-supervised approaches: (i) self-training, (ii) transductive support vector machines (TSVM), and (iii) the graph-based label spreading (LS) method. In particular, I focus on the identification of four event classes, i.e., load loss, generation loss, line trip, and bus fault.
I have developed and publicly shared a comprehensive Event Identification package which consists of three aspects: data generation, feature extraction, and event identification with limited labels using semi-supervised methodologies. Using this package, I generate eventful PMU data for the South Carolina 500-bus synthetic network. My evaluation confirms that integrating additional unlabeled samples and utilizing LS for pseudo-labeling surpasses the outcomes achieved by the self-training and TSVM approaches. Moreover, the LS algorithm enhances the performance of all classifiers more consistently and robustly than the alternatives.
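The graph-based label spreading step at the heart of the semi-supervised pipeline can be sketched in a few lines of NumPy. The snippet below implements the classic Zhou et al. iteration on a toy two-cluster feature set; the cluster layout, RBF kernel width, and iteration count are illustrative assumptions, not the package's actual configuration:

```python
import numpy as np

def label_spreading(X, y, alpha=0.9, sigma=1.0, n_iter=100):
    """Graph-based label spreading (Zhou et al.): propagate the few known
    labels over an RBF affinity graph; y uses -1 for unlabeled points."""
    n = len(X)
    classes = np.unique(y[y >= 0])
    # RBF affinity matrix with zeroed diagonal
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetrically normalized propagation operator S = D^{-1/2} W D^{-1/2}
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(1))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # One-hot label matrix; unlabeled rows stay all-zero
    Y = np.zeros((n, len(classes)))
    for k, c in enumerate(classes):
        Y[y == c, k] = 1.0
    # Iterate F <- alpha * S F + (1 - alpha) * Y
    F = Y.copy()
    for _ in range(n_iter):
        F = alpha * (S @ F) + (1 - alpha) * Y
    return classes[F.argmax(1)]

# Toy demo: two well-separated event "clusters", one labeled sample each
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(5, 0.3, (20, 2))])
y = -np.ones(40, dtype=int)
y[0], y[20] = 0, 1          # only two labeled samples out of forty
pred = label_spreading(X, y)
```

In the dissertation's setting, the rows of X would be modal-dynamics features extracted from PMU windows, with event labels known for only a handful of rows.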
Date Created
2023
Agent

Towards Addressing GAN Training Instabilities: Dual-Objective GANs with Tunable Parameters

Description

Generative Adversarial Networks (GANs) have emerged as a powerful framework for generating realistic and high-quality data. In the original ``vanilla'' GAN formulation, two models -- the generator and discriminator -- are engaged in a min-max game and optimize the same value function. Despite offering an intuitive approach, vanilla GANs often face stability challenges such as vanishing gradients and mode collapse. Addressing these common failures, recent work has proposed the use of tunable classification losses in place of traditional value functions. Although parameterized robust loss families, e.g. $\alpha$-loss, have shown promising characteristics as value functions, this thesis argues that the generator and discriminator require separate objective functions to achieve their different goals. As a result, this thesis introduces the $(\alpha_{D}, \alpha_{G})$-GAN, a parameterized class of dual-objective GANs, as an alternative approach to the standard vanilla GAN. The $(\alpha_{D}, \alpha_{G})$-GAN formulation, inspired by $\alpha$-loss, allows practitioners to tune the parameters $(\alpha_{D}, \alpha_{G}) \in [0,\infty)^{2}$ to provide a more stable training process. The objectives for the generator and discriminator in $(\alpha_{D}, \alpha_{G})$-GAN are derived, and the advantages of using these objectives are investigated. In particular, the optimization trajectory of the generator is found to be influenced by the choice of $\alpha_{D}$ and $\alpha_{G}$. Empirical evidence is presented through experiments conducted on various datasets, including the 2D Gaussian Mixture Ring, Celeb-A image dataset, and LSUN Classroom image dataset. Performance metrics such as mode coverage and Fréchet Inception Distance (FID) are used to evaluate the effectiveness of the $(\alpha_{D}, \alpha_{G})$-GAN compared to the vanilla GAN and state-of-the-art Least Squares GAN (LSGAN). 
The experimental results demonstrate that tuning $\alpha_{D} < 1$ leads to improved stability, robustness to hyperparameter choice, and competitive performance compared to LSGAN.
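Mode coverage, one of the metrics cited above, is straightforward to compute for the 2D Gaussian Mixture Ring benchmark. The sketch below uses a hypothetical definition (a mode counts as covered if enough generated samples land within a fixed radius of its center); the radius and threshold are illustrative, not the thesis's exact criteria:

```python
import numpy as np

def mode_coverage(samples, centers, radius=0.3, min_frac=0.01):
    """Count modes 'covered': a mode counts if at least min_frac of the
    generated samples fall within `radius` of its center."""
    n = len(samples)
    covered = 0
    for c in centers:
        hits = np.linalg.norm(samples - c, axis=1) < radius
        if hits.sum() >= max(1, min_frac * n):
            covered += 1
    return covered

# 8 modes on a unit ring, as in the 2D Gaussian Mixture Ring benchmark
angles = 2 * np.pi * np.arange(8) / 8
centers = np.stack([np.cos(angles), np.sin(angles)], axis=1)

# A hypothetical mode-collapsed generator hitting only 3 of the 8 modes
rng = np.random.default_rng(1)
fake = np.vstack([centers[i] + rng.normal(0, 0.05, (100, 2))
                  for i in (0, 3, 6)])
```

A well-tuned $(\alpha_{D}, \alpha_{G})$-GAN would be expected to score closer to 8 on this metric than a mode-collapsed vanilla GAN.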
Date Created
2023
Agent

Distributed Learning and Data Collection with Strategic Agents

Description

The presence of strategic agents can pose unique challenges to data collection and distributed learning. This dissertation first explores the social network dimension of data collection markets, and then focuses on how strategic agents can be efficiently and effectively incentivized to cooperate in distributed machine learning frameworks. The first problem explores the impact of social learning in collecting and trading unverifiable information, where a data collector purchases data from users through a payment mechanism. Each user starts with a personal signal which represents the knowledge about the underlying state the data collector desires to learn. Through social interactions, each user also acquires additional information from his neighbors in the social network. It is revealed that both the data collector and the users can benefit from social learning, which drives down the privacy costs and helps to improve the state estimation for a given total payment budget. In the second half, a federated learning scheme to train a global learning model with strategic agents, who are not bound to contribute their resources unconditionally, is considered. Since the agents are not obliged to provide their true stochastic gradient updates and the server is not capable of directly validating the authenticity of reported updates, the learning process may reach a noncooperative equilibrium. First, the actions of the agents are assumed to be binary: cooperative or defective. If the cooperative action is taken, the agent sends a privacy-preserved version of its stochastic gradient signal. If the defective action is taken, the agent sends an arbitrary uninformative noise signal. Furthermore, this setup is extended to scenarios with more general action spaces, where the quality of the stochastic gradient updates has a range of discrete levels.
The proposed methodology evaluates each agent's stochastic gradient according to a reference gradient estimate which is constructed from the gradients provided by other agents, and rewards the agent based on that evaluation.
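A minimal sketch of this reference-gradient evaluation, assuming a leave-one-out mean as the reference and cosine similarity as the score (both illustrative choices, not necessarily the dissertation's exact construction):

```python
import numpy as np

def score_agents(grads):
    """Score each reported gradient by its cosine similarity to a reference
    built from the *other* agents' reports (leave-one-out mean)."""
    grads = np.asarray(grads)
    total = grads.sum(0)
    scores = []
    for i, g in enumerate(grads):
        ref = (total - g) / (len(grads) - 1)   # exclude agent i's own report
        scores.append(g @ ref / (np.linalg.norm(g) * np.linalg.norm(ref)))
    return np.array(scores)

rng = np.random.default_rng(2)
true_grad = rng.normal(size=50)
# Four cooperative agents report the true gradient plus small privacy noise;
# one defector reports uninformative noise of comparable magnitude.
honest = [true_grad + rng.normal(scale=0.1, size=50) for _ in range(4)]
defector = rng.normal(size=50)
scores = score_agents(honest + [defector])
```

Rewards proportional to such a score make cooperation the more profitable action, which is the incentive structure the dissertation formalizes.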
Date Created
2023
Agent

A Tunable Loss Function for Robust, Rigorous, and Reliable Machine Learning

Description

In the era of big data, more and more decisions and recommendations are being made by machine learning (ML) systems and algorithms. Despite their many successes, there have been notable deficiencies in the robustness, rigor, and reliability of these ML systems, which have had detrimental societal impacts. In the next generation of ML, these significant challenges must be addressed through careful algorithmic design, and it is crucial that practitioners and meta-algorithms have the necessary tools to construct ML models that align with human values and interests. In an effort to help address these problems, this dissertation studies a tunable loss function called α-loss for the ML setting of classification. The α-loss is a hyperparameterized loss function originating from information theory that continuously interpolates between the exponential (α = 1/2), log (α = 1), and 0-1 (α = ∞) losses, hence providing a holistic perspective on several classical loss functions in ML. Furthermore, the α-loss exhibits unique operating characteristics depending on the value (and regime) of α; notably, for α > 1, α-loss robustly trains models when noisy training data is present. Thus, α-loss can provide robustness to ML systems for classification tasks, and this has bearing on many applications, e.g., social media, finance, academia, and medicine; indeed, results are presented where α-loss produces more robust logistic regression models for COVID-19 survey data, with gains over state-of-the-art algorithmic approaches.
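The interpolation described above is easy to verify numerically. Writing the α-loss of assigning probability p to the true label as (α/(α−1))(1 − p^((α−1)/α)), with −log p as the α = 1 limit:

```python
import numpy as np

def alpha_loss(p, alpha):
    """alpha-loss of assigning probability p to the true label;
    alpha = 1 is treated as the log-loss limit."""
    if alpha == 1:
        return -np.log(p)
    return (alpha / (alpha - 1)) * (1 - p ** ((alpha - 1) / alpha))

p = 0.7
exp_like = alpha_loss(p, 0.5)   # equals 1/p - 1, the exponential loss
log_like = alpha_loss(p, 1)     # equals -log p, the log loss
zero_one = alpha_loss(p, 1e8)   # approaches 1 - p as alpha -> infinity
```

The three endpoints recover the exponential, log, and (soft) 0-1 losses exactly, which is the interpolation property the abstract states.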
Date Created
2022
Agent

Distributed Learning and Adaptive Algorithms for Edge Networks

Description

Edge networks pose unique challenges for machine learning and network management. The primary objective of this dissertation is to study deep learning and adaptive control aspects of edge networks and to address some of the unique challenges therein. This dissertation explores four particular problems of interest at the intersection of edge intelligence, deep learning, and network management. The first problem explores the learning of generative models in the edge learning setting. Since learning tasks in similar environments share model similarity, it is plausible to leverage pre-trained generative models from other edge nodes. Appealing to optimal transport theory tailored towards Wasserstein-1 generative adversarial networks, this part aims to develop a framework which systematically optimizes the generative model learning performance using local data at the edge node, while exploiting the adaptive coalescence of pre-trained generative models from other nodes. In the second part, a many-to-one wireless architecture for federated learning at the network edge, where multiple edge devices collaboratively train a model using local data, is considered. The unreliable nature of wireless connectivity, together with the constraints in computing resources at edge devices, dictates that the local updates at edge devices should be carefully crafted and compressed to match the available wireless communication resources and should work in concert with the receiver. Therefore, a stochastic gradient descent based bandlimited coordinate descent algorithm is designed for such settings. The third part explores adaptive traffic engineering algorithms in a dynamic network environment. The ages of traffic measurements exhibit significant variation due to asynchrony and random communication delays between routers and controllers.
Inspired by the software-defined networking architecture, a controller-assisted distributed routing scheme with recursive link weight reconfigurations, accounting for the impact of measurement ages and routing instability, is devised. The final part focuses on developing a federated learning based framework for traffic reshaping of electric vehicle (EV) charging. The absence of private EV owner information and the scattering of EV charging data among charging stations motivate the use of a federated learning approach. Federated learning algorithms are devised to minimize peak EV charging demand both spatially and temporally, while maximizing charging station profit.
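As a loose illustration of the kind of update compression that bandlimited settings call for, the sketch below applies top-k sparsification to a gradient vector. This is a simplified stand-in, not the dissertation's bandlimited coordinate descent algorithm, which additionally matches the compression to the available wireless resources:

```python
import numpy as np

def top_k_sparsify(grad, k):
    """Keep only the k largest-magnitude coordinates of a gradient update,
    zeroing the rest before transmission."""
    keep = np.argsort(np.abs(grad))[-k:]   # indices of the k largest entries
    sparse = np.zeros_like(grad)
    sparse[keep] = grad[keep]
    return sparse

g = np.array([0.1, -2.0, 0.05, 1.5, -0.3])
compressed = top_k_sparsify(g, 2)   # only the -2.0 and 1.5 entries survive
```

Each edge device would transmit only the surviving coordinates (and their indices), shrinking the uplink payload at the cost of a biased update.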
Date Created
2021
Agent

Machine Learning for the Analysis of Power System Loads: Cyber-Attack Detection and Generation of Synthetic Datasets

Description

As the field of machine learning increasingly provides real value to power system operations, the availability of rich measurement datasets has become crucial for the development of new applications and technologies. This dissertation focuses on the use of time-series load data for the design of novel data-driven algorithms. Loads are one of the main factors driving the behavior of a power system, and they depend on external phenomena which are not captured by traditional simulation tools. Thus, accurate models that capture the fundamental characteristics of time-series load data are necessary. In the first part of this dissertation, an example of a successful application of machine learning algorithms that leverage load data is presented. Prior work has shown that power system energy management systems are vulnerable to false data injection attacks against state estimation. Here, a data-driven approach for the detection and localization of such attacks is proposed. The detector uses historical data to learn the normal behavior of the loads in a system and subsequently identify whether any of the real-time observed measurements are being manipulated by an attacker. The second part of this work focuses on the design of generative models for time-series load data. Two separate techniques are used to learn load behaviors from real datasets and exploit them to generate realistic synthetic data. The first approach is based on principal component analysis (PCA), which is used to extract common temporal patterns from real data. The second method leverages conditional generative adversarial networks (cGANs); it overcomes the limitations of the PCA-based model while providing greater and more nuanced control over the generation of specific types of load profiles.
Finally, these two classes of models are combined in a multi-resolution generative scheme which is capable of producing any amount of time-series load data at any sampling resolution, for lengths ranging from a few seconds to years.
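A minimal sketch of the PCA-based generation idea: fit principal components to daily load profiles (one profile per row), then synthesize new profiles by resampling the component scores from a Gaussian fit. The sinusoidal "load" data, component count, and Gaussian resampling are illustrative assumptions, not the dissertation's exact model:

```python
import numpy as np

def pca_load_generator(X, n_components, n_samples, rng):
    """Fit PCA to load profiles (rows of X) and synthesize new profiles by
    sampling component scores from their empirical Gaussian fit."""
    mean = X.mean(0)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # real-data scores
    mu, sd = scores.mean(0), scores.std(0)
    new_scores = rng.normal(mu, sd, size=(n_samples, n_components))
    return mean + new_scores @ Vt[:n_components]

rng = np.random.default_rng(3)
t = np.linspace(0, 2 * np.pi, 96)          # a day at 15-minute resolution
amp = 0.5 + 0.1 * rng.normal(size=(200, 1))
real = 1 + amp * np.sin(t) + 0.05 * rng.normal(size=(200, 96))
synthetic = pca_load_generator(real, n_components=5, n_samples=50, rng=rng)
```

The leading components capture the shared daily shape, so the synthetic profiles inherit the temporal patterns of the training data.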
Date Created
2021
Agent

Audio Waveform Sample SVD Compression and Impact on Performance

Description


Lossy compression is a form of compression that slightly degrades a signal in ways that are ideally not detectable to the human ear. This is opposite to lossless compression, in which the sample is not degraded at all. While lossless compression may seem like the best option, lossy compression, which is used in most audio and video, reduces transmission time and results in much smaller file sizes. However, this compression can affect quality if it goes too far. The more compression there is on a waveform, the more degradation there is, and once a file is lossy compressed, this process is not reversible. This project will observe the degradation of an audio signal after the application of Singular Value Decomposition compression, a lossy compression that eliminates singular values from a signal’s matrix.
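The core of SVD compression can be sketched directly in NumPy: reshape the waveform samples into a matrix, keep only the k largest singular values, and reconstruct. The matrix shape and rank below are illustrative choices, not the project's actual parameters:

```python
import numpy as np

def svd_compress(x, shape, k):
    """Reshape a waveform into a matrix, keep the k largest singular values,
    and return the reconstructed (lossy) waveform."""
    M = x.reshape(shape)
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    approx = (U[:, :k] * s[:k]) @ Vt[:k]   # rank-k approximation
    return approx.reshape(-1)

rng = np.random.default_rng(4)
x = rng.normal(size=1024)                  # stand-in for audio samples
err_full = np.linalg.norm(x - svd_compress(x, (32, 32), 32))
err_8 = np.linalg.norm(x - svd_compress(x, (32, 32), 8))
```

Keeping all 32 singular values reconstructs the signal essentially exactly, while truncating to rank 8 discards energy irreversibly, which is precisely the quality-versus-size trade-off the project studies.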

Date Created
2021-05
Agent

Unobservable False Data Injection Attacks on Power Systems

Description

Reliable operation of modern power systems is ensured by an intelligent cyber layer that monitors and controls the physical system. Data collection and transmission are achieved by the supervisory control and data acquisition (SCADA) system, and data processing is performed by the energy management system (EMS). In recent decades, the development of phasor measurement units (PMUs) has enabled wide-area real-time monitoring and control. However, both SCADA-based and PMU-based cyber layers are prone to cyber attacks that can impact system operation and lead to severe physical consequences.

This dissertation studies false data injection (FDI) attacks that are unobservable to bad data detectors (BDD). Prior work has shown that an attacker-defender bi-level linear program (ADBLP) can be used to determine the worst-case consequences of FDI attacks aiming to maximize the physical power flow on a target line. However, the results were only demonstrated on small systems assuming that they are operated with DC optimal power flow (OPF). This dissertation is divided into four parts to thoroughly understand the consequences of these attacks as well as develop countermeasures.

The first part focuses on evaluating the vulnerability of large-scale power systems to FDI attacks. The solution technique introduced in prior work to solve the ADBLP is intractable on large-scale systems due to the large number of binary variables. Four new computationally efficient algorithms are presented to solve this problem.

The second part studies the vulnerability of N-1 reliable power systems operated by state-of-the-art EMSs commonly used in practice, specifically real-time contingency analysis (RTCA) and security-constrained economic dispatch (SCED). An ADBLP is formulated with detailed assumptions on the attacker's knowledge and system operations.

The third part considers FDI attacks on PMU measurements, which have strong temporal correlations due to their high data rate. It is shown that predictive filters can detect suddenly injected attacks, but not gradually ramping attacks.
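The intuition can be reproduced with even the simplest predictive filter, a one-step persistence predictor: a suddenly injected bias produces a single large residual, while a gradual ramp spreads the same bias over many small residuals that stay below any reasonable threshold. The signal and threshold below are illustrative, not the dissertation's filter design:

```python
import numpy as np

def residuals(z):
    """One-step persistence predictor: predict each sample from the previous
    one and return the absolute prediction residuals."""
    return np.abs(np.diff(z))

clean = np.ones(200)
step = clean.copy()
step[100:] += 0.5                               # suddenly injected bias
ramp = clean.copy()
ramp[100:] += np.linspace(0, 0.5, 100)          # same bias, gradually ramped

tau = 0.1  # hypothetical detection threshold
step_detected = residuals(step).max() > tau     # one 0.5 jump at sample 100
ramp_detected = residuals(ramp).max() > tau     # ~0.005 per-sample increments
```

The step attack trips the detector while the ramp slips under it, matching the stated limitation of predictive filtering.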

The last part proposes a machine learning-based attack detection framework consisting of a support vector regression (SVR) load predictor that predicts loads by exploiting both spatial and temporal correlations, and a subsequent support vector machine (SVM) attack detector that determines the existence of attacks.
Date Created
2020
Agent

Anticipating Postoperative Delirium During Cardiac Surgeries Involving Deep Hypothermia Circulatory Arrest

Description

Aortic aneurysms and dissections are life-threatening conditions addressed by replacing damaged sections of the aorta. Blood circulation must be halted to facilitate repairs. Ischemia places the body, especially the brain, at risk of damage. Deep hypothermia circulatory arrest (DHCA) is employed to protect patients and provide time for surgeons to complete repairs, on the basis that reducing body temperature suppresses the metabolic rate. Supplementary surgical techniques can be employed to reinforce the brain's protection and increase the duration for which circulation can be suspended. Even then, protection is not completely guaranteed. A medical condition that can arise early in recovery is postoperative delirium, which is correlated with poor long-term outcomes. This study develops a methodology to intraoperatively monitor neurophysiology through electroencephalography (EEG) and anticipate postoperative delirium. The earliest opportunity to detect occurrences of complications through EEG is immediately following DHCA, during warming. The first observable electrophysiological activity after complete suppression is a phenomenon known as burst suppression, which is related to the brain's metabolic state and the recovery of nominal neurological function. A metric termed burst suppression duty cycle (BSDC) is developed to characterize the changing electrophysiological dynamics. Predictions of postoperative delirium incidences are made by identifying deviations in the way these dynamics evolve. Sixteen cases are examined in this study. Accurate predictions can be made, with on average 89.74% of cases correctly classified when burst suppression concludes and 78.10% when burst suppression begins. The best-case receiver operating characteristic curve has an area under its convex hull of 0.8988, whereas the worst-case area under the hull is 0.7889.
These results demonstrate the feasibility of monitoring BSDC to anticipate postoperative delirium during burst suppression. They also motivate a further analysis on identifying footprints of causal mechanisms of neural injury within BSDC. Being able to raise warning signs of postoperative delirium early provides an opportunity to intervene and potentially avert neurological complications. Doing so would improve the success rate and quality of life after surgery.
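A minimal sketch of a BSDC-style computation, assuming a binary burst/suppression mask has already been extracted from the EEG; the sampling rate and window length are illustrative, not the study's parameters:

```python
import numpy as np

def bsdc(burst_mask, fs, window_s):
    """Burst suppression duty cycle: the fraction of each sliding window
    spent in bursts, from a binary burst(1)/suppression(0) mask at fs Hz."""
    w = int(window_s * fs)
    kernel = np.ones(w) / w
    return np.convolve(burst_mask.astype(float), kernel, mode="valid")

fs = 250                                     # hypothetical EEG sampling rate
mask = np.r_[np.ones(500), np.zeros(1500)]   # 2 s of burst, 6 s of suppression
duty = bsdc(mask, fs, window_s=8.0)          # one full 8 s window
```

Tracking how this duty cycle evolves during warming is the dynamic whose deviations the study uses for prediction.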
Date Created
2020
Agent

Quantifying Information Leakage via Adversarial Loss Functions: Theory and Practice

Description

Modern digital applications have significantly increased the leakage of private and sensitive personal data. While worst-case measures of leakage such as Differential Privacy (DP) provide the strongest guarantees, when utility matters, average-case information-theoretic measures can be more relevant. However, most such information-theoretic measures do not have clear operational meanings. This dissertation addresses this challenge.

This work introduces a tunable leakage measure called maximal $\alpha$-leakage which quantifies the maximal gain of an adversary in inferring any function of a data set. The inferential capability of the adversary is modeled by a class of loss functions, namely, $\alpha$-loss. The choice of $\alpha$ determines specific adversarial actions ranging from refining a belief for $\alpha =1$ to guessing the best posterior for $\alpha = \infty$, and for the two specific values maximal $\alpha$-leakage simplifies to mutual information and maximal leakage, respectively. Maximal $\alpha$-leakage is proved to have a composition property and be robust to side information.
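The $\alpha = \infty$ endpoint, maximal leakage, has a simple closed form for a discrete channel: $L(X \to Y) = \log \sum_y \max_x P_{Y|X}(y|x)$. A quick numerical check on two extreme channels (the $2 \times 2$ examples are illustrative):

```python
import numpy as np

def maximal_leakage(P):
    """Maximal leakage (the alpha = infinity endpoint of maximal
    alpha-leakage) of a channel matrix P whose rows are P(y|x):
    L(X -> Y) = log sum_y max_x P(y|x)."""
    return np.log(P.max(axis=0).sum())

noiseless = np.eye(2)            # the output reveals the input exactly
useless = np.full((2, 2), 0.5)   # the output is independent of the input
```

The noiseless channel leaks log 2 nats (one full bit) while the useless channel leaks nothing, bracketing the range that intermediate values of $\alpha$ interpolate.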

There is a fundamental disjoint between theoretical measures of information leakages and their applications in practice. This issue is addressed in the second part of this dissertation by proposing a data-driven framework for learning Censored and Fair Universal Representations (CFUR) of data. This framework is formulated as a constrained minimax optimization of the expected $\alpha$-loss where the constraint ensures a measure of the usefulness of the representation. The performance of the CFUR framework with $\alpha=1$ is evaluated on publicly accessible data sets; it is shown that multiple sensitive features can be effectively censored to achieve group fairness via demographic parity while ensuring accuracy for several a priori unknown downstream tasks.

Finally, focusing on worst-case measures, novel information-theoretic tools are used to refine the existing relationship between two such measures, $(\epsilon,\delta)$-DP and R\'enyi-DP. Applying these tools to the moments accountant framework, one can track the privacy guarantee achieved by adding Gaussian noise to Stochastic Gradient Descent (SGD) algorithms. Relative to state-of-the-art, for the same privacy budget, this method allows about 100 more SGD rounds for training deep learning models.
Date Created
2020
Agent