Description
Edge computing applications have recently gained prominence as the world of internet-of-things becomes increasingly embedded into people's lives. Performing computations at the edge addresses multiple issues, such as memory bandwidth-latency bottlenecks and the exposure of sensitive data to external attackers. It is important to protect the data collected and processed by edge devices, and also to prevent unauthorized access to such data. It is also important to ensure that the computing hardware fits within the tight energy and area budgets of edge devices, which are being progressively scaled down in size. First, a novel low-power smart security prototype chip that combines multiple entropy sources, such as real-time electrocardiogram (ECG) data and SRAM-based physical unclonable functions (PUFs), is proposed for authentication and cryptography applications. Up to ~12X improvement in the equal error rate compared to a prior ECG-only authentication system is achieved by combining feature vectors obtained from ECG, heart rate variability, and the SRAM PUF. The resulting vectors can also be utilized for secure cryptography applications. Second, a novel noise-aware training algorithm that makes deep neural networks (DNNs) more robust to in-memory computing (IMC) hardware noise is developed and evaluated. Up to 17% accuracy is recovered in DNNs deployed on IMC prototype hardware. The noise-aware training principles are also used to improve the adversarial robustness of DNNs and to successfully defend against both adversarial input and weight attacks: up to ~10% improvement in robustness against adversarial input attacks and up to 33% improvement in robustness against adversarial weight attacks are achieved. Finally, a DNN training algorithm that pursues and optimizes both activation and weight sparsity simultaneously is proposed and evaluated to obtain highly compressed DNNs. This leads to up to a 4.7x reduction in the total number of FLOPs required to perform complex image recognition tasks. A custom sparse inference accelerator is designed and synthesized to evaluate the benefits of this FLOP reduction, achieving a speedup of 4.24x. In summary, this dissertation contains innovative algorithm and hardware design techniques, aided by machine learning, which enhance the security and efficiency of edge computing applications.
Contributors: Cherupally, Sai Kiran (Author) / Seo, Jae-Sun (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Cao, Yu (Kevin) (Committee member) / Fan, Deliang (Committee member) / Arizona State University (Publisher)
Created: 2022
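As a concrete illustration of the noise-aware training idea summarized in the abstract above, the following minimal sketch injects Gaussian perturbations into the weights on every forward pass, so the learned solution tolerates weight noise at inference time. The noise model, magnitudes, and toy logistic-regression task are illustrative assumptions, not the dissertation's actual method or code.

```python
# Hypothetical sketch: hardware-noise-aware training. Gaussian weight noise
# (standing in for measured IMC device noise) is injected during the forward
# pass, while updates are applied to the clean weights.
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classification data.
X = rng.normal(size=(256, 16))
true_w = rng.normal(size=16)
y = (X @ true_w > 0).astype(float)

w = np.zeros(16)
lr, noise_std = 0.1, 0.05  # noise_std models relative weight noise (assumed)

for epoch in range(200):
    # Inject weight noise during the forward pass (noise-aware training).
    w_noisy = w + noise_std * np.abs(w).mean() * rng.normal(size=w.shape)
    p = 1.0 / (1.0 + np.exp(-(X @ w_noisy)))   # sigmoid predictions
    grad = X.T @ (p - y) / len(y)              # logistic-loss gradient
    w -= lr * grad                             # update the clean weights

# Evaluate under freshly sampled simulated hardware noise; accuracy should
# degrade less than for a model trained without noise injection.
w_test = w + noise_std * np.abs(w).mean() * rng.normal(size=w.shape)
acc = (((X @ w_test) > 0) == (y > 0.5)).mean()
print(f"accuracy under simulated weight noise: {acc:.3f}")
```

The same training-time perturbation principle extends naturally to defending against adversarial weight attacks, since both amount to making the loss surface flat with respect to weight perturbations.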
Description
Artificial Intelligence (AI) and Machine Learning (ML) techniques have come a long way since their inception and have been used to build intelligent systems for a wide range of applications in everyday life. However, they are very computation-intensive and require the transfer of large volumes of data from memory to the computation units. This memory access time constitutes a significant part of the computational latency and is a performance bottleneck. To address this limitation and the ever-growing demand for implementation in hand-held and edge devices, in-memory computing (IMC) based AI/ML hardware accelerators have emerged. First, the dissertation presents an IMC static random access memory (SRAM) based hardware modeling and optimization framework. A unified systematic study closely models the IMC hardware and investigates how a number of design variables and non-idealities (e.g., device mismatch and ADC quantization) affect the deep neural network (DNN) accuracy of the IMC design. The framework allows co-optimized selection of different design variables, accounting for sources of noise in the IMC hardware, and robust implementation of a high-accuracy DNN. Next, it presents a k-nearest-neighbor (kNN) hardware accelerator in 65nm Complementary Metal-Oxide-Semiconductor (CMOS) technology. The accelerator combines an IMC SRAM developed for binarized deep neural networks with digital hardware that performs top-k sorting. The simulated kNN accelerator design processes up to 17.9 million query vectors per second while consuming 11.8 mW, demonstrating >4.8× energy-efficiency improvement over prior works. This dissertation also presents a novel floating-point precision IMC (FP-IMC) macro with a hybrid architecture that configurably supports two floating-point (FP) precisions. Implementing FP-precision multiply-accumulate (MAC) operations has been a challenge owing to their complexity. The design is implemented in 28nm CMOS and taped out, with the chip demonstrating 12.1 TFLOPS/W and 66.1 TFLOPS/W for 8-bit floating point (FP8) and block floating point (BF8), respectively. Finally, another iteration of the FP design is presented that is modeled to support multiple precision modes from FP8 up to FP32. Two approaches to the architectural design are compared, illustrating the throughput versus area-overhead trade-off. The simulated design shows a 2.1× improvement in normalized energy efficiency compared to the on-chip implementation of the FP-IMC.
Contributors: Saikia, Jyotishman (Author) / Seo, Jae-Sun (Thesis advisor) / Chakrabarti, Chaitali (Thesis advisor) / Fan, Deliang (Committee member) / Cao, Yu (Committee member) / Arizona State University (Publisher)
Created: 2023
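To make the non-ideality modeling described in the abstract above more concrete, the following minimal sketch perturbs an ideal IMC SRAM column partial sum with per-cell device mismatch and then quantizes the result with a limited-resolution ADC. The mismatch level, ADC resolution, and binary cell model are illustrative assumptions, not values from the dissertation's framework.

```python
# Hypothetical sketch: modeling one IMC SRAM column MAC with two common
# non-idealities -- multiplicative device mismatch and ADC quantization.
import numpy as np

rng = np.random.default_rng(1)

def imc_column_mac(x_bits, w_col, mismatch_sigma=0.03, adc_bits=5):
    """Return (ideal partial sum, modeled IMC output) for one column."""
    ideal = float(x_bits @ w_col)  # ideal analog partial sum
    # Device mismatch: per-cell multiplicative Gaussian variation (assumed).
    noisy = float(x_bits @ (w_col * (1 + mismatch_sigma
                                     * rng.normal(size=w_col.shape))))
    # ADC: uniform quantization of the bitline value over its full range.
    levels = 2 ** adc_bits
    full_scale = len(w_col)            # max possible sum for binary cells
    step = full_scale / (levels - 1)
    quantized = np.clip(round(noisy / step), 0, levels - 1) * step
    return ideal, quantized

x = rng.integers(0, 2, size=256)       # binary input activations
w = rng.integers(0, 2, size=256)       # binarized weight column
ideal, measured = imc_column_mac(x, w)
print(f"ideal sum = {ideal:.0f}, modeled IMC output = {measured:.2f}")
```

Sweeping mismatch_sigma and adc_bits in a model like this, then measuring the resulting DNN accuracy, is the kind of design-variable co-optimization such a framework enables.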
Description
This dissertation centers on the development of Bayesian methods for learning different types of variation in switching nonlinear gene regulatory networks (GRNs). A new nonlinear and dynamic multivariate GRN model is introduced to account for different sources of variability in GRNs. The new model is aimed at more precisely capturing the complexity of GRN interactions through the introduction of time-varying kinetic order parameters, while allowing for variability in multiple model parameters. This model is used as the drift function in the development of several stochastic GRN models based on Langevin dynamics. Six models are introduced which capture intrinsic and extrinsic noise in GRNs, thereby providing a full characterization of a stochastic regulatory system. A Bayesian hierarchical approach is developed for learning the Langevin model which best describes the noise dynamics at each time step. The trajectory of the state, which comprises the gene expression values, as well as the indicator corresponding to the correct noise model, are estimated via sequential Monte Carlo (SMC) with a high degree of accuracy. To address the problem of time-varying regulatory interactions, a Bayesian hierarchical model is introduced for learning variation in switching GRN architectures with unknown measurement noise covariance. The trajectory of the state and the indicator corresponding to the network configuration at each time point are estimated using SMC. This work is extended to a fully Bayesian hierarchical model to account for uncertainty in the process noise covariance associated with each network architecture. An SMC algorithm with local Gibbs sampling is developed to estimate the trajectory of the state and the indicator corresponding to the network configuration at each time point with a high degree of accuracy. The results demonstrate the efficacy of Bayesian methods for learning information in switching nonlinear GRNs.
Contributors: Vélez-Cruz, Nayely (Author) / Papandreou-Suppappola, Antonia (Thesis advisor) / Moraffah, Bahman (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Berisha, Visar (Committee member) / Arizona State University (Publisher)
Created: 2023
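As a rough illustration of the stochastic GRN models described in the abstract above, the following minimal sketch simulates one gene's expression under Langevin dynamics with the Euler-Maruyama scheme. The drift uses power-law kinetics with a time-varying kinetic order, and the diffusion term switches between a state-dependent "intrinsic" noise model and an additive "extrinsic" one. All functional forms and parameters are illustrative assumptions, not the dissertation's actual models.

```python
# Hypothetical sketch: Euler-Maruyama simulation of a switching Langevin
# model for a single gene's expression level.
import numpy as np

rng = np.random.default_rng(2)

def drift(x, t):
    # Power-law production with a slowly time-varying kinetic order, minus
    # linear degradation (a stand-in for the dissertation's drift function).
    kinetic_order = 0.5 + 0.2 * np.sin(0.05 * t)
    return 2.0 * x ** kinetic_order - 0.8 * x

def simulate(T=200, dt=0.05, x0=1.0, switch_at=100):
    x = np.empty(T)
    x[0] = x0
    for t in range(1, T):
        dw = rng.normal(scale=np.sqrt(dt))         # Brownian increment
        if t < switch_at:
            g = 0.3 * np.sqrt(max(x[t - 1], 0.0))  # intrinsic: state-dependent
        else:
            g = 0.3                                # extrinsic: additive
        x[t] = max(x[t - 1] + drift(x[t - 1], t) * dt + g * dw, 0.0)
    return x

traj = simulate()
print(f"mean expression before/after switch: "
      f"{traj[:100].mean():.2f} / {traj[100:].mean():.2f}")
```

In the dissertation's setting, an SMC (particle) filter would run over trajectories like this one, jointly estimating the state and an indicator variable selecting which noise model or network configuration is active at each time step.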