Harnessing Structure in Discrete and Non-convex optimization with applications in online learning, multi-agent systems, and phase retrieval

Description
This thesis examines the critical relationship between data, complex models, and the methods used to measure and analyze them. As models grow larger and more intricate, they require more data, making it vital to use that data effectively. The document starts with a deep dive into nonconvex functions, a fundamental element of modern complex systems, identifying key conditions that ensure these systems can be analyzed efficiently, a crucial consideration in an era of vast numbers of variables. Loss functions, traditionally seen as mere optimization tools, are analyzed and recast as measures of how accurately a model reflects reality. This redefined perspective permits the refinement of data-sourcing strategies for a better data economy. The focus of the investigation is the model itself, which is used to understand and harness the underlying patterns of complex systems. By incorporating structure both implicitly (through periodic patterns) and explicitly (using graphs), the model's ability to make sense of the data is enhanced. Moreover, online learning principles are applied to a crucial practical scenario: robotic resource monitoring. The results established in this thesis, backed by simulations and theoretical proofs, highlight the advantages of online learning methods over traditional ones commonly used in robotics. In sum, this thesis presents an integrated approach to measuring complex systems, providing new insights and methods that push forward the capabilities of machine learning.
Date Created
2024

Parameter Optimization with Conscious Allocation (POCA): Efficient Bayesian Hyperparameter Optimization with Adaptive Budget Assignment

Description
The performance of modern machine learning algorithms depends upon the selection of a set of hyperparameters. Common examples of hyperparameters are the learning rate and the number of layers in a dense neural network. Auto-ML is a branch of optimization that has produced important contributions in this area. Within Auto-ML, multi-fidelity approaches, which eliminate poorly performing configurations after evaluating them at low budgets, are among the most effective. However, the performance of these algorithms strongly depends on how effectively they allocate the computational budget to various hyperparameter configurations. We first present Parameter Optimization with Conscious Allocation 1.0 (POCA 1.0), a Hyperband-based algorithm for hyperparameter optimization that adaptively allocates the input budget to the hyperparameter configurations it generates following a Bayesian sampling scheme. We then present its successor, Parameter Optimization with Conscious Allocation 2.0 (POCA 2.0), which follows POCA 1.0’s successful philosophy while utilizing a time-series model to reduce wasted computational cost and providing a more flexible framework. We compare POCA 1.0 and 2.0 to their nearest competitor, BOHB, at optimizing the hyperparameters of a multilayer perceptron and find that both POCA algorithms exceed BOHB in low-budget hyperparameter optimization while performing similarly in high-budget scenarios.
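
The multi-fidelity idea described above (evaluate many configurations cheaply, discard the weak ones, and spend more budget on the survivors) can be illustrated with plain successive halving, the building block of Hyperband. This is a minimal sketch, not POCA or BOHB: the function names, the reduction factor `eta`, and the toy loss are all assumptions made for illustration.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Keep the best 1/eta of the configurations at each rung.

    `evaluate(config, budget)` is assumed to return a loss (lower is better).
    Survivors of a rung are re-evaluated with eta times more budget.
    """
    survivors = list(configs)
    budget = min_budget
    history = []  # (budget, number of configurations evaluated) per rung
    while True:
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        history.append((budget, len(ranked)))
        if len(ranked) == 1:
            return ranked[0], history
        keep = max(1, len(ranked) // eta)
        survivors = ranked[:keep]
        budget *= eta


def toy_loss(learning_rate, budget):
    """Hypothetical stand-in for a training run: best near 0.1, improves with budget."""
    return (learning_rate - 0.1) ** 2 + 1.0 / budget


candidates = [0.001, 0.01, 0.05, 0.1, 0.2, 0.3, 0.5, 0.7, 1.0]
best, history = successive_halving(candidates, toy_loss)
print(best, history)
```

With nine candidates and `eta=3`, the sketch evaluates nine configurations at budget 1, three at budget 3, and one at budget 9. How the budget is split across rungs is fixed here; the abstract's point is that adapting this allocation is where the gains lie.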
Date Created
2024-05

A Machine Learning Framework for Power System Event Identification via Modal Analysis of Phasor Measurement Unit Data

Description
Event identification is increasingly recognized as crucial for enhancing the reliability, security, and stability of the electric power system. With the growing deployment of Phasor Measurement Units (PMUs) and advancements in data science, there are promising opportunities for data-driven event identification via machine learning classification techniques, which this dissertation explores. In the first part of this dissertation, using measurements from multiple PMUs, I propose to identify events by extracting features based on modal dynamics. I combine such traditional physics-based feature extraction methods with machine learning to distinguish different event types. Using the obtained set of features, I investigate the performance of two well-known classification models, namely, logistic regression (LR) and support vector machines (SVM), to identify generation loss and line trip events in two datasets. The first dataset is obtained from simulated events in the Texas 2000-bus synthetic grid. The second is a proprietary dataset with labeled events obtained from a large utility in the USA. My results indicate that the proposed framework is promising for identifying the two types of events in the supervised setting. In the second part of the dissertation, I use semi-supervised learning techniques, which make use of both labeled and unlabeled samples. I evaluate three categories of classical semi-supervised approaches: (i) self-training, (ii) transductive support vector machines (TSVM), and (iii) the graph-based label spreading (LS) method. In particular, I focus on the identification of four event classes, i.e., load loss, generation loss, line trip, and bus fault.
I have developed and publicly shared a comprehensive Event Identification package which consists of three aspects: data generation, feature extraction, and event identification with limited labels using semi-supervised methodologies. Using this package, I generate eventful PMU data for the South Carolina 500-Bus synthetic network. My evaluation confirms that the integration of additional unlabeled samples and the utilization of LS for pseudo labeling surpasses the outcomes achieved by the self-training and TSVM approaches. Moreover, the LS algorithm consistently enhances the performance of all classifiers more robustly.
Date Created
2023

Towards Addressing GAN Training Instabilities: Dual-Objective GANs with Tunable Parameters

Description
Generative Adversarial Networks (GANs) have emerged as a powerful framework for generating realistic and high-quality data. In the original "vanilla" GAN formulation, two models, the generator and the discriminator, are engaged in a min-max game and optimize the same value function. Despite offering an intuitive approach, vanilla GANs often face stability challenges such as vanishing gradients and mode collapse. Addressing these common failures, recent work has proposed the use of tunable classification losses in place of traditional value functions. Although parameterized robust loss families, e.g., $\alpha$-loss, have shown promising characteristics as value functions, this thesis argues that the generator and discriminator require separate objective functions to achieve their different goals. As a result, this thesis introduces the $(\alpha_{D}, \alpha_{G})$-GAN, a parameterized class of dual-objective GANs, as an alternative approach to the standard vanilla GAN. The $(\alpha_{D}, \alpha_{G})$-GAN formulation, inspired by $\alpha$-loss, allows practitioners to tune the parameters $(\alpha_{D}, \alpha_{G}) \in [0,\infty)^{2}$ to provide a more stable training process. The objectives for the generator and discriminator in $(\alpha_{D}, \alpha_{G})$-GAN are derived, and the advantages of using these objectives are investigated. In particular, the optimization trajectory of the generator is found to be influenced by the choice of $\alpha_{D}$ and $\alpha_{G}$. Empirical evidence is presented through experiments conducted on various datasets, including the 2D Gaussian Mixture Ring, Celeb-A image dataset, and LSUN Classroom image dataset. Performance metrics such as mode coverage and Fréchet Inception Distance (FID) are used to evaluate the effectiveness of the $(\alpha_{D}, \alpha_{G})$-GAN compared to the vanilla GAN and state-of-the-art Least Squares GAN (LSGAN).
The experimental results demonstrate that tuning $\alpha_{D} < 1$ leads to improved stability, robustness to hyperparameter choice, and competitive performance compared to LSGAN.
Date Created
2023

Physical System Knowledge Extraction and Transfer Using Machine Learning

Description
Modern physical systems are evolving rapidly, with growing size, increasingly complex structures, and the incorporation of new devices. This calls for better planning, monitoring, and control. However, achieving these goals is challenging since the system knowledge (e.g., system structures and edge parameters) may be unavailable even for a system in normal operation, let alone under dynamic changes such as maintenance, reconfigurations, and events. Therefore, extracting system knowledge becomes a central topic. Fortunately, advanced metering techniques provide abundant data, leading to the emergence of Machine Learning (ML) methods with efficient learning and fast inference. This work proposes a systematic framework of ML-based methods to learn system knowledge under three what-if scenarios: (i) What if the system is normally operated? (ii) What if the system suffers dynamic interventions? (iii) What if the system is new with limited data? For each case, this thesis proposes principled solutions with extensive experiments. Chapter 2 tackles scenario (i), where the guiding principle is to learn an ML model that maintains physical consistency, bringing high extrapolation capacity under changing operational conditions. The key finding is that physical consistency can be linked to convexity, a central concept in optimization. Therefore, convexified ML designs are proposed, and global optimality implies faithfulness to the underlying physics. Chapter 3 handles scenario (ii), where the goal is to identify the event time, type, and locations. The problem is formalized as multi-class classification with special attention to accuracy and speed. Subsequently, Chapter 3 builds an ensemble learning framework to aggregate different ML models for better prediction. Next, to process high-volume data quickly, tensors (multi-dimensional arrays) are used to store and process data, yielding compact and informative vectors for fast inference.
Finally, if no labels exist, Chapter 3 uses physical properties to generate labels for learning. Chapter 4 deals with scenario (iii), where a feasible approach is to transfer knowledge from similar systems under the framework of Transfer Learning (TL). Chapter 4 proposes system-level TL that considers the network structure, complex spatial-temporal correlations, and different physical information.
Date Created
2022

A Tunable Loss Function for Robust, Rigorous, and Reliable Machine Learning

Description
In the era of big data, more and more decisions and recommendations are being made by machine learning (ML) systems and algorithms. Despite their many successes, there have been notable deficiencies in the robustness, rigor, and reliability of these ML systems, which have had detrimental societal impacts. In the next generation of ML, these significant challenges must be addressed through careful algorithmic design, and it is crucial that practitioners and meta-algorithms have the necessary tools to construct ML models that align with human values and interests. In an effort to help address these problems, this dissertation studies a tunable loss function called α-loss for the ML setting of classification. The α-loss is a hyperparameterized loss function originating from information theory that continuously interpolates between the exponential (α = 1/2), log (α = 1), and 0-1 (α = ∞) losses, hence providing a holistic perspective of several classical loss functions in ML. Furthermore, the α-loss exhibits unique operating characteristics depending on the value (and different regimes) of α; notably, for α > 1, α-loss robustly trains models when noisy training data is present. Thus, the α-loss can provide robustness to ML systems for classification tasks, which is relevant to many applications, e.g., social media, finance, academia, and medicine; indeed, results are presented where α-loss produces more robust logistic regression models for COVID-19 survey data with gains over state-of-the-art algorithmic approaches.
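
The interpolation described above can be checked numerically. The abstract does not state the formula; the sketch below assumes the commonly published form of the loss, α/(α − 1) · (1 − p^((α − 1)/α)), where p is the probability the model assigns to the true label, with the log-loss and 1 − p as the limits at α = 1 and α = ∞. The function names are illustrative.

```python
import math


def alpha_loss(p_true, alpha):
    """Alpha-loss of a prediction that assigns probability p_true to the true label.

    Assumed form: alpha / (alpha - 1) * (1 - p_true ** ((alpha - 1) / alpha)),
    with the limiting cases handled explicitly.
    """
    if not 0.0 < p_true <= 1.0:
        raise ValueError("p_true must lie in (0, 1]")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if alpha == 1:
        return -math.log(p_true)   # log-loss (limit as alpha -> 1)
    if math.isinf(alpha):
        return 1.0 - p_true        # soft 0-1 loss (limit as alpha -> infinity)
    return alpha / (alpha - 1.0) * (1.0 - p_true ** ((alpha - 1.0) / alpha))


def sigmoid(margin):
    return 1.0 / (1.0 + math.exp(-margin))


# With a sigmoid link, alpha = 1/2 reproduces the exponential loss exp(-margin).
margin = 2.0
print(alpha_loss(sigmoid(margin), 0.5), math.exp(-margin))
```

For a confidently wrong prediction (small p), the loss grows without bound when α ≤ 1 but is capped at α/(α − 1) when α > 1, which is one way to see why larger α is less sensitive to mislabeled examples.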
Date Created
2022

Bayesian Methods for Tuning Hyperparameters of Loss Functions in Machine Learning

Description
The introduction of parameterized loss functions for robustness in machine learning has led to questions as to how hyperparameter(s) of the loss functions can be tuned. This thesis explores how Bayesian methods can be leveraged to tune such hyperparameters. Specifically, a modified Gibbs sampling scheme is used to generate a distribution of loss parameters of tunable loss functions. The modified Gibbs sampler is a two-block sampler that alternates between sampling the loss parameter and optimizing the other model parameters. The sampling step is performed using slice sampling, while the optimization step is performed using gradient descent. This thesis explores the application of the modified Gibbs sampler to alpha-loss, a tunable loss function with a single parameter $\alpha \in (0,\infty]$, that is designed for the classification setting. Theoretically, it is shown that the Markov chain generated by a modified Gibbs sampling scheme is ergodic; that is, the chain has, and converges to, a unique stationary (posterior) distribution. Further, the modified Gibbs sampler is implemented in two experiments: a synthetic dataset and a canonical image dataset. The results show that the modified Gibbs sampler performs well under label noise, generating a distribution indicating preference for larger values of alpha, matching the outcomes of previous experiments.
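
The sampling half of the two-block scheme above relies on slice sampling. As a rough illustration, here is a generic univariate slice sampler using the standard stepping-out and shrinkage procedure. It is not the thesis's implementation: the function name, the `width` parameter, and the standard normal target (standing in for the conditional distribution of the loss parameter) are assumptions for this sketch.

```python
import math
import random


def slice_sample_step(x0, log_density, width=1.0, rng=random):
    """One slice-sampling update of a scalar, given an unnormalized log density."""
    # Draw the auxiliary height uniformly under the density at x0 (in log space).
    log_y = log_density(x0) + math.log(1.0 - rng.random())

    # Stepping out: place an interval of size `width` at random around x0,
    # then widen each end until it falls outside the slice.
    left = x0 - width * rng.random()
    right = left + width
    while log_density(left) > log_y:
        left -= width
    while log_density(right) > log_y:
        right += width

    # Shrinkage: propose uniformly in the interval, shrinking it toward x0
    # after every rejected proposal.
    while True:
        x1 = left + (right - left) * rng.random()
        if log_density(x1) > log_y:
            return x1
        if x1 < x0:
            left = x1
        else:
            right = x1


# Toy target: a standard normal, given only up to a constant.
rng = random.Random(0)
x, samples = 0.0, []
for _ in range(5000):
    x = slice_sample_step(x, lambda v: -0.5 * v * v, rng=rng)
    samples.append(x)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, var)
```

In a two-block sampler of the kind the abstract describes, a step like this would update the loss parameter while the other model parameters are held fixed, followed by a gradient-descent pass on those parameters with the loss parameter held fixed.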
Date Created
2022

GPS Spoofing attacks on PMUs: Practical Feasibility and Counter Measures

Description
In order to meet the world’s growing energy need, it is necessary to create a reliable, robust, and resilient electric power grid. One way to ensure the creation of such a grid is through the extensive use of synchrophasor technology that is based on devices called phasor measurement units (PMUs), and their derivatives, such as μPMUs. Global positioning system (GPS) time-synchronized wide-area monitoring, protection, and control enabled by PMUs has opened up new ways in which the power grid can tackle the problems it faces today. However, with the implementation of new technologies come new challenges, and one of those challenges when it comes to PMUs is the misuse of GPS as a method to obtain a time reference. The use of GPS in PMUs is very intuitive as it is a convenient method to time stamp electrical signals, which in turn helps provide an accurate snapshot of the performance of the PMU-monitored section of the grid. However, GPS is susceptible to different types of signal interruptions due to natural (such as weather) or unnatural (jamming, spoofing) causes. The focus of this thesis is on demonstrating the practical feasibility of GPS spoofing attacks on PMUs, as well as developing novel countermeasures for them. Prior research has demonstrated that GPS spoofing attacks on PMUs can cripple power system operation. The research conducted here first provides experimental evidence of the feasibility of such an attack using commonly available digital radios known as software-defined radios (SDRs). Next, it introduces a new countermeasure against such attacks using GPS signal redundancy and the low-power, long-range (LoRa) spread spectrum modulation technique. The proposed approach checks the integrity of the GPS signal at remote locations and compares the data with the PMU’s current output.
This countermeasure is a steppingstone towards developing a ready-to-deploy system that can provide an instant solution to the GPS spoofing detection problem for PMUs already placed in the power grid.
Date Created
2021

Predicting COVID-19 Using Self-Reported Survey Data

Description
Infectious diseases spread at a rapid rate due to the increasing mobility of the human population. It is important to have a variety of containment and assessment strategies to prevent and limit their spread. In the ongoing COVID-19 pandemic, telehealth services including daily health surveys are used to study the prevalence and severity of the disease. Daily health surveys can also help to study the progression and fluctuation of symptoms, as recalling, tracking, and explaining symptoms to doctors can often be challenging for patients. Data aggregates collected from the daily health surveys can be used to identify the surge of a disease in a community. This thesis enhances a well-known boosting algorithm, XGBoost, to predict COVID-19 from the anonymized self-reported survey responses provided by the Carnegie Mellon University (CMU) Delphi research group in collaboration with Facebook. Despite the tremendous COVID-19 surge in the United States, this survey dataset is highly imbalanced, with 84% negative COVID-19 cases and 16% positive cases. It is difficult to learn from an imbalanced dataset, especially when the dataset could also be noisy, as is common in self-reported surveys. This thesis addresses these challenges by enhancing XGBoost with a tunable loss function, α-loss, that interpolates between the exponential loss (α = 1/2), the log-loss (α = 1), and the 0-1 loss (α = ∞). Results show that tuning XGBoost with α-loss can enhance performance over the standard XGBoost with log-loss (α = 1).
Date Created
2021

Machine Learning for the Analysis of Power System Loads: Cyber-Attack Detection and Generation of Synthetic Datasets

Description
As the field of machine learning increasingly provides real value to power system operations, the availability of rich measurement datasets has become crucial for the development of new applications and technologies. This dissertation focuses on the use of time-series load data for the design of novel data-driven algorithms. Loads are one of the main factors driving the behavior of a power system, and they depend on external phenomena which are not captured by traditional simulation tools. Thus, accurate models that capture the fundamental characteristics of time-series load data are necessary. In the first part of this dissertation, an example of a successful application of machine learning algorithms that leverage load data is presented. Prior work has shown that power system energy management systems are vulnerable to false data injection attacks against state estimation. Here, a data-driven approach for the detection and localization of such attacks is proposed. The detector uses historical data to learn the normal behavior of the loads in a system and subsequently identifies whether any of the real-time observed measurements are being manipulated by an attacker. The second part of this work focuses on the design of generative models for time-series load data. Two separate techniques are used to learn load behaviors from real datasets and exploit them to generate realistic synthetic data. The first approach is based on principal component analysis (PCA), which is used to extract common temporal patterns from real data. The second method leverages conditional generative adversarial networks (cGANs); it overcomes the limitations of the PCA-based model while providing greater and more nuanced control over the generation of specific types of load profiles.
Finally, these two classes of models are combined in a multi-resolution generative scheme which is capable of producing any amount of time-series load data at any sampling resolution, for lengths ranging from a few seconds to years.
Date Created
2021