Physical System Knowledge Extraction and Transfer Using Machine Learning

Description

Modern physical systems are evolving rapidly, with growing size, increasingly complex structures, and the incorporation of new devices. This calls for better planning, monitoring, and control. However, achieving these goals is challenging because system knowledge (e.g., system structures and edge parameters) may be unavailable even for a normally operating system, let alone under dynamic changes such as maintenance, reconfigurations, and events. Extracting system knowledge therefore becomes a central topic. Fortunately, advanced metering techniques provide abundant data, enabling Machine Learning (ML) methods with efficient learning and fast inference. This thesis proposes a systematic framework of ML-based methods to learn system knowledge under three what-if scenarios: (i) What if the system is operated normally? (ii) What if the system undergoes dynamic interventions? (iii) What if the system is new and has limited data? For each case, this thesis proposes principled solutions with extensive experiments. Chapter 2 tackles scenario (i); the guiding principle is to learn an ML model that maintains physical consistency, which brings high extrapolation capacity under changing operational conditions. The key finding is that physical consistency can be linked to convexity, a central concept in optimization. Therefore, convexified ML designs are proposed, for which global optimality implies faithfulness to the underlying physics. Chapter 3 handles scenario (ii), where the goal is to identify the time, type, and location of events. The problem is formalized as multi-class classification with special attention to accuracy and speed. Chapter 3 builds an ensemble learning framework that aggregates different ML models for better prediction. Next, to process high-volume data quickly, a tensor (a multi-dimensional array) is used to store and process the data, yielding compact and informative vectors for fast inference. Finally, if no labels exist, Chapter 3 uses physical properties to generate labels for learning. Chapter 4 deals with scenario (iii), where a feasible approach is to transfer knowledge from similar systems under the framework of Transfer Learning (TL). Chapter 4 proposes system-level TL that considers the network structure, complex spatial-temporal correlations, and different physical information.

Contributors: Li, Haoran (Author) / Weng, Yang (Thesis advisor) / Tong, Hanghang (Committee member) / Dasarathy, Gautam (Committee member) / Sankar, Lalitha (Committee member) / Arizona State University (Publisher)

Created: 2022

Collaborative Learning and Optimization for Edge Intelligence

Description

With the proliferation of mobile computing and the Internet of Things (IoT), billions of mobile and IoT devices are connected to the Internet, generating enormous volumes of data at the network edge. Driven by this trend, there is an urgent need to push the artificial intelligence (AI) frontier to the network edge to fully unleash the potential of edge big data. This dissertation aims to comprehensively study collaborative learning and optimization algorithms to build a foundation for edge intelligence. Under this common theme, the dissertation is broadly organized into three parts. The first part focuses on model learning with limited data and limited computing capability at the network edge. A global model initialization is first obtained by running federated learning (FL) across many edge devices; based on this initialization, a semi-supervised algorithm is devised for an edge device to carry out quick adaptation, aiming to address the insufficiency of labeled data and to learn a personalized model efficiently. In the second part, collaborative learning between the edge and the cloud is studied to achieve real-time edge intelligence. More specifically, a distributionally robust optimization (DRO) approach is proposed to enable the synergy between local data processing and cloud knowledge transfer. Two uncertainty models are investigated for the cloud knowledge transfer: the distribution uncertainty set based on the cloud data distribution, and the prior distribution of the edge model conditioned on the cloud model. Collaborative learning algorithms are developed along this line. The final part focuses on developing an offline model-based safe Inverse Reinforcement Learning (IRL) algorithm for connected Autonomous Vehicles (AVs). A reward penalty is introduced to penalize unsafe states, and a risk-measure-based approach is proposed to mitigate the model uncertainty introduced by offline training. The experimental results demonstrate the improvement of the proposed algorithm over the existing baselines in terms of cumulative rewards.
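
The first part of the abstract above (a global initialization from federated learning, followed by quick on-device adaptation) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the linear model, the squared-error objective, the plain federated-averaging rule, and the names `local_update` and `federated_averaging` are not taken from the dissertation, and the dissertation's semi-supervised adaptation step is replaced by a few ordinary supervised gradient steps.

```python
import numpy as np

def local_update(w, X, y, lr=0.1, steps=5):
    """Run a few gradient steps on one device's mean squared error."""
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def federated_averaging(clients, dim, rounds=50):
    """Average locally updated models, weighted by local sample counts."""
    w = np.zeros(dim)
    total = sum(len(y) for _, y in clients)
    for _ in range(rounds):
        local_models = [local_update(w, X, y) for X, y in clients]
        w = sum((len(y) / total) * m for m, (_, y) in zip(local_models, clients))
    return w

# Hypothetical toy data: four devices sharing one noise-free linear model.
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
clients = []
for _ in range(4):
    X = rng.standard_normal((30, 3))
    clients.append((X, X @ w_true))

w_global = federated_averaging(clients, dim=3)          # global initialization
w_personal = local_update(w_global, *clients[0], steps=3)  # quick local adaptation
```

Starting the local adaptation from `w_global` rather than from zero is the point of the sketch: a device with little data only needs a few steps to personalize a model that already reflects the population.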

Contributors: Zhang, Zhaofeng (Author) / Zhang, Junshan (Thesis advisor) / Zhang, Yanchao (Thesis advisor) / Dasarathy, Gautam (Committee member) / Fan, Deliang (Committee member) / Arizona State University (Publisher)

Created: 2023

Bayesian Inference for Markov Kernels Valued in Wasserstein Spaces

Description

In this work, the author analyzes quantitative and structural aspects of Bayesian inference using Markov kernels, Wasserstein metrics, and Kantorovich monads. In particular, the author shows the following main results: first, that Markov kernels can be viewed as Borel measurable maps with values in a Wasserstein space; second, that the Disintegration Theorem can be interpreted as a literal equality of integrals using an original theory of integration for Markov kernels; third, that the Kantorovich monad can be defined for Wasserstein metrics of any order; and finally, that, under certain assumptions, a generalized Bayes’s Law for Markov kernels provably leads to convergence of the expected posterior distribution in the Wasserstein metric. These contributions provide a basis for studying further convergence, approximation, and stability properties of Bayesian inverse maps and inference processes using a unified theoretical framework that bridges between statistical inference, machine learning, and probabilistic programming semantics.
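
For readers unfamiliar with the metric named in the abstract above, the standard textbook definition of the Wasserstein distance of order p between two probability measures on a metric space (X, d) is given below. The notation (μ, ν, Γ, κ, P_p) is chosen here for illustration and is not necessarily the author's.

```latex
% Wasserstein distance of order p >= 1 between probability measures mu and nu
% on a metric space (X, d); Gamma(mu, nu) is the set of couplings, i.e. joint
% measures on X x X whose marginals are mu and nu.
W_p(\mu, \nu) \;=\; \left( \inf_{\gamma \in \Gamma(\mu, \nu)}
    \int_{X \times X} d(x, y)^p \, \mathrm{d}\gamma(x, y) \right)^{1/p}

% Viewing a Markov kernel kappa from a space Z to X as a map into the space
% P_p(X) of measures with finite p-th moment, equipped with W_p:
z \;\longmapsto\; \kappa(z, \cdot) \in P_p(X)
```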

Contributors: Eikenberry, Keenan (Author) / Cochran, Douglas (Thesis advisor) / Lan, Shiwei (Thesis advisor) / Dasarathy, Gautam (Committee member) / Kotschwar, Brett (Committee member) / Shahbaba, Babak (Committee member) / Arizona State University (Publisher)

Created: 2023

A Machine Learning Framework for Power System Event Identification via Modal Analysis of Phasor Measurement Unit Data

Description

Event identification is increasingly recognized as crucial for enhancing the reliability, security, and stability of the electric power system. With the growing deployment of Phasor Measurement Units (PMUs) and advancements in data science, there are promising opportunities to explore data-driven event identification via machine learning classification techniques, and this dissertation explores that potential. In the first part of this dissertation, using measurements from multiple PMUs, I propose to identify events by extracting features based on modal dynamics. I combine such traditional physics-based feature extraction methods with machine learning to distinguish different event types. Using the obtained set of features, I investigate the performance of two well-known classification models, namely logistic regression (LR) and support vector machines (SVM), in identifying generation loss and line trip events in two datasets. The first dataset is obtained from simulated events in the Texas 2000-bus synthetic grid. The second is a proprietary dataset with labeled events obtained from a large utility in the USA. My results indicate that the proposed framework is promising for identifying the two types of events in the supervised setting. In the second part of the dissertation, I use semi-supervised learning techniques, which make use of both labeled and unlabeled samples. I evaluate three categories of classical semi-supervised approaches: (i) self-training, (ii) transductive support vector machines (TSVM), and (iii) the graph-based label spreading (LS) method. In particular, I focus on the identification of four event classes, i.e., load loss, generation loss, line trip, and bus fault. I have developed and publicly shared a comprehensive Event Identification package which consists of three components: data generation, feature extraction, and event identification with limited labels using semi-supervised methodologies. Using this package, I generate eventful PMU data for the South Carolina 500-bus synthetic network. My evaluation confirms that integrating additional unlabeled samples and using LS for pseudo-labeling outperforms the self-training and TSVM approaches. Moreover, the LS algorithm enhances the performance of all classifiers more consistently and robustly.
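
To make the graph-based label spreading (LS) step mentioned above concrete, here is a minimal sketch of the classical closed-form label spreading rule on a Gaussian similarity graph. It is not the author's Event Identification package: the function name `label_spreading`, the kernel width `sigma`, the mixing weight `alpha`, and the two-cluster toy data standing in for modal features are all assumptions made for illustration.

```python
import numpy as np

def label_spreading(X, y, n_classes, alpha=0.8, sigma=1.0):
    """Pseudo-label samples; y holds class indices, -1 marks unlabeled ones."""
    # Gaussian similarity graph with no self-loops.
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq_dist / (2.0 * sigma**2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    deg = W.sum(axis=1)
    S = W / np.sqrt(np.outer(deg, deg))
    # One-hot matrix of the known labels (all-zero rows for unlabeled samples).
    Y = np.zeros((len(y), n_classes))
    labeled = y >= 0
    Y[labeled, y[labeled]] = 1.0
    # Closed-form fixed point of F <- alpha * S @ F + Y (up to a constant factor).
    F = np.linalg.solve(np.eye(len(y)) - alpha * S, Y)
    return F.argmax(axis=1)

# Hypothetical toy features: two well-separated clusters, one label per cluster.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(5.0, 0.3, (20, 2))])
y_true = np.repeat([0, 1], 20)
y = np.full(40, -1)
y[0], y[20] = 0, 1

pseudo = label_spreading(X, y, n_classes=2)
```

The pseudo-labels produced this way could then be used to train any downstream classifier, which is the role the abstract describes for LS.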

Contributors: Taghipourbazargani, Nima (Author) / Kosut, Oliver (Thesis advisor) / Sankar, Lalitha (Committee member) / Pal, Anamitra (Committee member) / Dasarathy, Gautam (Committee member) / Arizona State University (Publisher)

Created: 2023

Learning Predictive Models for Assisted Human Biomechanics

Description

This dissertation explores the use of artificial intelligence and machine learning techniques for the development of controllers for fully-powered robotic prosthetics. The aim of the research is to enable prosthetics to predict future states and control biomechanical properties in both linear and nonlinear fashions, with a particular focus on ergonomics. The research is motivated by the need to provide amputees with prosthetic devices that not only replicate the functionality of the missing limb, but also offer a high level of comfort and usability. Traditional prosthetic devices lack the sophistication to adjust to a user’s movement patterns and can cause discomfort and pain over time. The proposed solution involves the development of machine learning-based controllers that can learn from user movements and adjust the prosthetic device’s movements accordingly. The research involves a combination of simulation and real-world testing to evaluate the effectiveness of the proposed approach. The simulation involves the creation of a model of the prosthetic device and the use of machine learning algorithms to train controllers that predict future states and control biomechanical properties. The real-world testing involves the use of human subjects wearing the prosthetic device to evaluate its performance and usability. The research focuses on two main areas: the prediction of future states and the control of biomechanical properties. The prediction of future states involves the development of machine learning algorithms that can analyze a user’s movements and predict the next movements with a high degree of accuracy. The control of biomechanical properties involves the development of algorithms that can adjust the prosthetic device’s movements to ensure maximum comfort and usability for the user. The results of the research show that the use of artificial intelligence and machine learning techniques can significantly improve the performance and usability of prosthetic devices. The machine learning-based controllers developed in this research are capable of predicting future states and adjusting the prosthetic device’s movements in real time, leading to a significant improvement in ergonomics and usability. Overall, this dissertation provides a comprehensive analysis of the use of artificial intelligence and machine learning techniques for the development of controllers for fully-powered robotic prosthetics.

Contributors: CLARK, GEOFFEY M (Author) / Ben Amor, Heni (Thesis advisor) / Dasarathy, Gautam (Committee member) / Papandreou-Suppappola, Antonia (Committee member) / Ward, Jeffrey (Committee member) / Arizona State University (Publisher)

Created: 2023

Graph Regularized Linear Regression

Description

Linear-regression estimators have become widely accepted as a reliable statistical tool for predicting outcomes. Because linear regression is a long-established procedure, the properties of linear-regression estimators are well understood, and the estimators can be trained very quickly. Many estimators exist for modeling linear relationships, each having ideal conditions for optimal performance. The differences stem from the introduction of a bias into the parameter estimation through the use of various regularization strategies. One of the more popular ones is ridge regression, which uses ℓ2-penalization of the parameter vector. In this work, the proposed graph regularized linear estimator is compared against ridge regression when the parameter vector is known to be dense. When additional knowledge that the parameters are smooth with respect to a graph is available, it can be used to improve the parameter estimates. To achieve this goal, an additional smoothing penalty is introduced into the traditional loss function of ridge regression. The mean squared error (MSE) is used as the performance metric, and the analysis is presented for fixed design matrices having a unit covariance matrix. This specific problem setup makes it possible to study the theoretical conditions under which the graph regularized estimator outperforms the ridge estimator. The eigenvectors of the Laplacian matrix of the graph that connects the dimensions of the parameter vector form an integral part of the analysis. Experiments have been conducted on simulated data to compare the performance of the two estimators for Laplacian matrices of several types of graphs: complete, star, line, and 4-regular. The experimental results indicate that the theory can possibly be extended to more general settings by taking smoothness, a concept defined in this work, into consideration.
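
The estimator described above can be sketched in closed form. The exact penalty used in the thesis is not given in this abstract, so the sketch assumes the common form in which a Laplacian quadratic term gamma * b^T L b is added to the ridge objective; the function names, the line-graph example, the random (not unit-covariance) design, and all parameter values are illustrative assumptions.

```python
import numpy as np

def path_graph_laplacian(d):
    """Laplacian L = D - A of a line (path) graph on d nodes."""
    A = np.zeros((d, d))
    idx = np.arange(d - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A

def graph_regularized_ridge(X, y, L, lam, gamma):
    """Minimize ||y - X b||^2 + lam * ||b||^2 + gamma * b^T L b in closed form."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d) + gamma * L, X.T @ y)

# Hypothetical simulated data: a dense parameter vector that varies smoothly
# along the line graph.
rng = np.random.default_rng(0)
n, d = 40, 20
beta_true = np.linspace(1.0, 2.0, d)
X = rng.standard_normal((n, d))
y = X @ beta_true + rng.standard_normal(n)

L = path_graph_laplacian(d)
beta_ridge = graph_regularized_ridge(X, y, L, lam=1.0, gamma=0.0)  # plain ridge
beta_graph = graph_regularized_ridge(X, y, L, lam=1.0, gamma=5.0)  # with smoothing

# Graph roughness b^T L b of each estimate (sum of squared neighbor differences).
rough_ridge = beta_ridge @ L @ beta_ridge
rough_graph = beta_graph @ L @ beta_graph
```

Setting `gamma=0.0` recovers ordinary ridge regression, so the two estimators compared in the abstract differ only in the Laplacian term; the graph penalty can only reduce the roughness of the estimate, while whether it also reduces the MSE depends on how smooth the true parameters are.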

Contributors: Sajja, Akarshan (Author) / Dasarathy, Gautam (Thesis advisor) / Berisha, Visar (Committee member) / Yang, Yingzhen (Committee member) / Arizona State University (Publisher)

Created: 2022

Modeling and Exploiting the Structure of Data via Meta-Features for Robust and Efficient Machine Learning

Description

In the standard pipeline for machine learning model development, several design decisions are made largely based on trial and error. Take the classification problem as an example. The starting point for classifier design is a dataset with samples from the classes of interest. From this, the algorithm developer must decide which features to extract, which hypothesis class to condition on, which hyperparameters to select, and how to train the model. The design process is iterative with the developer trying different classifiers, feature sets, and hyper-parameters and using cross-validation to pick the model with the lowest error. As there are no guidelines for when to stop searching, developers can continue "optimizing" the model to the point where they begin to "fit to the dataset". These problems are amplified in the active learning setting, where the initial dataset may be unlabeled and label acquisition is costly. The aim in this dissertation is to develop algorithms that provide ML developers with additional information about the complexity of the underlying problem to guide downstream model development. I introduce the concept of "meta-features" - features extracted from a dataset that characterize the complexity of the underlying data generating process. In the context of classification, the complexity of the problem can be characterized by understanding two complementary meta-features: (a) the amount of overlap between classes, and (b) the geometry/topology of the decision boundary. Across three complementary works, I present a series of estimators for the meta-features that characterize overlap and geometry/topology of the decision boundary, and demonstrate how they can be used in algorithm development.

Contributors: Li, Weizhi (Author) / Berisha, Visar (Thesis advisor) / Dasarathy, Gautam (Thesis advisor) / Natesan Ramamurthy, Karthikeyan (Committee member) / Turaga, Pavan (Committee member) / Arizona State University (Publisher)

Created: 2022

Quantum Scattering and Machine Learning in Dirac Materials

Description

A remarkable phenomenon in contemporary physics is quantum scarring in classically chaotic systems, where the wave functions tend to concentrate on classical periodic orbits. Quantum scarring has been studied for more than four decades, but efficiently detecting quantum scars has remained challenging, relying mostly on human visualization of wave function patterns. This paper develops a machine learning approach to detecting quantum scars in an automated and highly efficient manner. In particular, this paper exploits meta-learning. The first step is to construct a few-shot classification algorithm, under the requirement that the one-shot classification accuracy be larger than 90%. A scheme based on a combination of neural networks is then proposed to improve the accuracy. This paper shows that the machine learning scheme can find the correct quantum scars from thousands of images of wave functions, without any human intervention, regardless of the symmetry of the underlying classical system. This is the first application of meta-learning to quantum systems. Interacting spin networks are fundamental to quantum computing. Data-based tomography of time-independent spin networks has been achieved, but an open challenge is to ascertain the structures of time-dependent spin networks using time series measurements taken locally from a small subset of the spins. Physically, the dynamical evolution of a spin network under time-dependent driving or perturbation is described by the Heisenberg equation of motion. Motivated by this basic fact, this paper articulates a physics-enhanced machine learning framework whose core is Heisenberg neural networks. This paper demonstrates that, from local measurements, not only can the local Hamiltonian be recovered, but the Hamiltonian reflecting the interacting structure of the whole system can also be faithfully reconstructed. Applying the Heisenberg neural machine to spin networks of a variety of structures shows that, in the extreme case where measurements are taken from only one spin, the achieved tomography fidelity values can reach about 90%. The developed machine learning framework is applicable to any time-dependent system whose quantum dynamical evolution is governed by the Heisenberg equation of motion.

Contributors: Han, Chendi (Author) / Lai, Ying-Cheng (Thesis advisor) / Yu, Hongbin (Committee member) / Dasarathy, Gautam (Committee member) / Seo, Jae-Sun (Committee member) / Arizona State University (Publisher)

Created: 2022

A Tunable Loss Function for Robust, Rigorous, and Reliable Machine Learning

Description

In the era of big data, more and more decisions and recommendations are being made by machine learning (ML) systems and algorithms. Despite their many successes, there have been notable deficiencies in the robustness, rigor, and reliability of these ML systems, which have had detrimental societal impacts. In the next generation of ML, these significant challenges must be addressed through careful algorithmic design, and it is crucial that practitioners and meta-algorithms have the necessary tools to construct ML models that align with human values and interests. In an effort to help address these problems, this dissertation studies a tunable loss function called α-loss for the ML setting of classification. The α-loss is a hyperparameterized loss function originating from information theory that continuously interpolates between the exponential (α = 1/2), log (α = 1), and 0-1 (α = ∞) losses, hence providing a holistic perspective on several classical loss functions in ML. Furthermore, the α-loss exhibits unique operating characteristics depending on the value (and different regimes) of α; notably, for α > 1, α-loss robustly trains models when noisy training data is present. Thus, the α-loss can provide robustness to ML systems for classification tasks, and this has bearing on many applications, e.g., social media, finance, academia, and medicine; indeed, results are presented where α-loss produces more robust logistic regression models for COVID-19 survey data, with gains over state-of-the-art algorithmic approaches.
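
The interpolation described above can be checked numerically. The sketch below assumes the form of the loss commonly given in the published α-loss literature, α/(α − 1) · (1 − p^((α − 1)/α)), where p is the probability the model assigns to the true label; this formula is not stated in the abstract itself, and the function names and example margins are hypothetical.

```python
import numpy as np

def alpha_loss(p, alpha):
    """α-loss of the probability p that a model assigns to the true label."""
    p = np.asarray(p, dtype=float)
    if alpha == 1.0:
        return -np.log(p)        # limit alpha -> 1: log loss
    if np.isinf(alpha):
        return 1.0 - p           # limit alpha -> infinity: soft 0-1 loss
    return alpha / (alpha - 1.0) * (1.0 - p ** ((alpha - 1.0) / alpha))

def sigmoid(z):
    """Logistic function mapping a classification margin to a probability."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical margins y * f(x) of a logistic model on three samples.
margins = np.array([-2.0, 0.0, 1.5])
p_true = sigmoid(margins)

exp_like = alpha_loss(p_true, 0.5)      # equals exp(-margin) for a sigmoid model
log_like = alpha_loss(p_true, 1.0)      # log loss
zero_one = alpha_loss(p_true, np.inf)   # 1 - p, a smooth surrogate of 0-1 loss
```

With a sigmoid output, α = 1/2 gives 1/p − 1 = exp(−margin), which is how the exponential loss appears as a special case. For α > 1 the loss of a badly misclassified sample stays bounded by α/(α − 1) instead of growing without limit, which is consistent with the robustness to noisy labels described above.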

Contributors: Sypherd, Tyler (Author) / Sankar, Lalitha (Thesis advisor) / Berisha, Visar (Committee member) / Dasarathy, Gautam (Committee member) / Kosut, Oliver (Committee member) / Arizona State University (Publisher)

Created: 2022

Text to Speech: Extension to Text to Braille Project

Description

Visual impairment is a significant challenge that affects millions of people worldwide. Access to written text, such as books, documents, and other printed materials, can be particularly difficult for individuals with visual impairments. In order to address this issue, our project aims to develop a text-to-Braille and speech translating device that will help people with visual impairments to access written text more easily and independently.

Contributors: Nguyen, Vu (Author) / Yu, Hongbin (Thesis director) / Dasarathy, Gautam (Committee member) / Barrett, The Honors College (Contributor) / Electrical Engineering Program (Contributor)

Created: 2023-05