Description
Deep neural networks (DNNs), as a mainstream algorithm for various AI tasks, achieve higher accuracy at the cost of increased computational complexity and model size, posing great challenges to hardware platforms. This dissertation first tackles the design challenges of resistive random-access memory (RRAM) based in-memory computing (IMC) architectures. A new metric, model stability derived from the loss landscape, is proposed to shed light on accuracy under device variations and model compression, and to guide a novel variation-aware training (VAT) solution. The proposed method effectively improves post-mapping accuracy across multiple datasets. Next, a hybrid RRAM/SRAM IMC DNN inference accelerator is developed that integrates an RRAM-based IMC macro, a reconfigurable SRAM-based multiply-accumulate (MAC) macro, and a programmable shifter. The hybrid IMC accelerator fully recovers inference accuracy after mapping. Furthermore, this dissertation investigates architectural optimizations for high IMC utilization, low on-chip communication cost, and low energy-delay product (EDP), including on-chip interconnect design, processing element (PE) array utilization, and tile-to-router mapping and scheduling. The optimal choice of on-chip interconnect results in up to 6x improvement in energy-delay-area product for RRAM IMC architectures, while the PE and network-on-chip (NoC) optimizations show up to 62% improvement in PE utilization, 78% reduction in area, and 78% lower energy-area product for a wide range of modern DNNs. Finally, this dissertation proposes SIAM, a novel chiplet-based IMC benchmarking simulator, and a heterogeneous chiplet IMC architecture to address the limitations of a monolithic DNN accelerator. SIAM utilizes model-based and cycle-accurate simulation to provide a scalable and flexible evaluation framework, and is calibrated against SIMBA, a published silicon result from Nvidia. The heterogeneous architecture utilizes a custom mapping with a bank of big and little chiplets and a hybrid network-on-package (NoP) to optimize utilization, interconnect bandwidth, and energy efficiency. The proposed big-little chiplet-based RRAM IMC architecture significantly improves energy efficiency at lower area compared to conventional GPUs. In summary, this dissertation comprehensively investigates novel methods spanning devices, circuits, architecture, packaging, and algorithms to design scalable, high-performance, and energy-efficient IMC architectures.
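To illustrate the general idea behind variation-aware training, the sketch below injects multiplicative weight perturbations during the forward pass so the optimizer settles into a flatter, more variation-tolerant region of the loss landscape. This is a minimal PyTorch sketch of the principle, not the dissertation's exact VAT method; the `sigma` level and the Gaussian variation model are illustrative assumptions.

```python
import torch
import torch.nn as nn

class NoisyLinear(nn.Linear):
    """Linear layer that perturbs its weights during training only,
    mimicking the conductance variation seen after RRAM mapping."""
    def __init__(self, in_features, out_features, sigma=0.05):
        super().__init__(in_features, out_features)
        self.sigma = sigma  # assumed relative variation level

    def forward(self, x):
        w = self.weight
        if self.training:
            # Multiplicative Gaussian perturbation, redrawn every step
            w = w * (1 + self.sigma * torch.randn_like(w))
        return nn.functional.linear(x, w, self.bias)

# A model trained with such layers tolerates post-mapping variation better
model = nn.Sequential(NoisyLinear(784, 256), nn.ReLU(), NoisyLinear(256, 10))
```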
Contributors: Krishnan, Gokul (Author) / Cao, Yu (Thesis advisor) / Seo, Jae-Sun (Committee member) / Chakrabarti, Chaitali (Committee member) / Ogras, Umit Y. (Committee member) / Arizona State University (Publisher)
Created: 2022
Description
Edge computing applications have recently gained prominence as the world of internet-of-things becomes increasingly embedded into people's lives. Performing computations at the edge addresses multiple issues, such as memory bandwidth-latency bottlenecks and exposure of sensitive data to external attackers. It is important to protect the data collected and processed by edge devices, and to prevent unauthorized access to such data. It is also important to ensure that the computing hardware fits within the tight energy and area budgets of edge devices, which are being progressively scaled down in size. Firstly, a novel low-power smart security prototype chip is proposed that combines multiple entropy sources, such as real-time electrocardiogram (ECG) data and SRAM-based physical unclonable functions (PUFs), for authentication and cryptography applications. Up to ~12X improvement in the equal error rate compared to a prior ECG-only authentication system is achieved by combining feature vectors obtained from ECG, heart rate variability, and the SRAM PUF. The resulting vectors can also be utilized for secure cryptography applications. Secondly, novel in-memory computing (IMC) hardware noise-aware training algorithms that make DNNs more robust to hardware noise are developed and evaluated. Up to 17% accuracy was recovered in deep neural networks (DNNs) deployed on IMC prototype hardware. The noise-aware training principles are also used to improve the adversarial robustness of DNNs and successfully defend against both adversarial input and weight attacks: up to ~10% improvement in robustness against adversarial input attacks and up to 33% improvement in robustness against adversarial weight attacks are achieved. Finally, a DNN training algorithm that pursues and optimizes both activation and weight sparsity simultaneously is proposed and evaluated to obtain highly compressed DNNs. This led to up to 4.7x reduction in the total number of FLOPs required to perform complex image recognition tasks. A custom sparse inference accelerator is designed and synthesized to evaluate the benefits of this FLOP reduction, achieving a speedup of 4.24x. In summary, this dissertation contains innovative algorithm and hardware design techniques, aided by machine learning, which enhance the security and efficiency of edge computing applications.
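The noise-aware training principle can be sketched in the same spirit as variation-aware training: model the IMC hardware noise as a perturbation of the analog partial sums during the forward pass, so the trained network learns to absorb it. A minimal sketch under an assumed additive-Gaussian noise model; in practice the noise statistics would be characterized from the prototype hardware.

```python
import torch
import torch.nn as nn

class IMCNoisyLinear(nn.Linear):
    """Linear layer with additive noise on its outputs (standing in for
    the analog partial sums of an IMC macro) during training."""
    def __init__(self, in_features, out_features, noise_scale=0.02):
        super().__init__(in_features, out_features)
        self.noise_scale = noise_scale  # illustrative; hardware-calibrated in practice

    def forward(self, x):
        y = super().forward(x)
        if self.training:
            # Noise amplitude tied to the typical partial-sum magnitude
            y = y + self.noise_scale * y.detach().abs().mean() * torch.randn_like(y)
        return y
```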
Contributors: Cherupally, Sai Kiran (Author) / Seo, Jae-Sun (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Cao, Yu (Kevin) (Committee member) / Fan, Deliang (Committee member) / Arizona State University (Publisher)
Created: 2022
Description
Coarse-Grained Reconfigurable Arrays (CGRAs) are emerging accelerators that promise low-power acceleration of compute-intensive loops in applications. The acceleration achieved by a CGRA relies on the CGRA compiler efficiently mapping the compute-intensive loops onto the CGRA. The CGRA mapping problem, being NP-complete, is performed in a two-step process: scheduling and mapping. The scheduling algorithm allocates timeslots to the nodes of the data-flow graph (DFG), and the mapping algorithm maps the scheduled nodes onto the processing elements (PEs) of the CGRA. On a mapping failure, the initiation interval (II) is increased, and a new schedule is obtained for the increased II. Most previous mapping techniques use the Iterative Modulo Scheduling (IMS) algorithm to find a schedule for a given II. Since IMS generates a resource-constrained ASAP (as-soon-as-possible) schedule, even with increased II it tends to generate a similar schedule that is not mappable, and it does not explore the schedule space effectively. The problems encountered by IMS-based scheduling algorithms are explored, and an improved randomized scheduling algorithm for the application loops to be accelerated is proposed. When encountering a mapping failure for a given schedule, existing mapping algorithms either exit and retry the mapping anew, or recursively remove the previously mapped node to find a valid mapping (backtrack). Abandoning the mapping is extreme, but even backtracking may not be the best choice, since the root of the problem may not be the previous node. The challenges in existing algorithms are systematically analyzed and a failure-aware mapping algorithm is presented. The loops in general-purpose applications are often complicated, i.e., loops with perfect and imperfect nests and loops with nested if-then-else's (conditionals). The existing hardware-software solutions to execute branches and conditionals are inefficient. A co-design approach that efficiently executes complicated loops on a CGRA is proposed: the compiler transforms complex loops, maps them to the CGRA, and lays them out in memory in a specific manner, such that the hardware can fetch and execute the instructions from the right path at runtime. Finally, an open-source CGRA compilation and simulation framework is presented. Based on LLVM and gem5, it extracts loops, maps them onto the CGRA architecture, and executes them on the CGRA as a co-processor to an ARM CPU.
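A toy sketch of the schedule-and-retry loop described above, with randomized tie-breaking among ready DFG nodes so that repeated attempts at the same II explore different schedules (unlike deterministic ASAP scheduling, which keeps producing the same one). The DFG encoding and the resource model are simplified assumptions; the dissertation's algorithm is more sophisticated.

```python
import math
import random

def res_mii(num_ops, num_pes):
    # Resource-constrained lower bound on the initiation interval (II)
    return math.ceil(num_ops / num_pes)

def random_topo_order(dfg, rng):
    """dfg: {node: [predecessor nodes]}. Random tie-breaking among
    ready nodes is what lets retries explore the schedule space."""
    indeg = {n: len(ps) for n, ps in dfg.items()}
    succ = {n: [] for n in dfg}
    for n, ps in dfg.items():
        for p in ps:
            succ[p].append(n)
    ready = [n for n, d in indeg.items() if d == 0]
    order = []
    while ready:
        n = ready.pop(rng.randrange(len(ready)))
        order.append(n)
        for s in succ[n]:
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)
    return order

def randomized_modulo_schedule(dfg, num_pes, ii, rng):
    """Assign a timeslot per node; at most num_pes ops may share a slot
    modulo II. Mapping onto the PE array would follow; on a mapping
    failure, one can retry with a new random schedule at the same II
    before increasing II."""
    slot, usage = {}, {}
    for n in random_topo_order(dfg, rng):
        t = max((slot[p] + 1 for p in dfg[n]), default=0)
        while usage.get(t % ii, 0) >= num_pes:
            t += 1  # terminates whenever ii >= res_mii
        slot[n] = t
        usage[t % ii] = usage.get(t % ii, 0) + 1
    return slot

rng = random.Random(0)
dfg = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
print(randomized_modulo_schedule(dfg, num_pes=2, ii=res_mii(4, 2), rng=rng))
```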
Contributors: Balasubramanian, Mahesh (Author) / Shrivastava, Aviral (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Ren, Fengbo (Committee member) / Pozzi, Laura (Committee member) / Arizona State University (Publisher)
Created: 2021
Description
Machine learning techniques have found extensive application in dynamic fields like drones, self-driving vehicles, surveillance, and more. Their effectiveness stems from meticulously crafted deep neural networks (DNNs), extensive data gathering efforts, and resource-intensive model training processes. However, due to the unpredictable nature of the environment, these systems will inevitably encounter input samples that deviate from the distribution of their original training data, resulting in instability and performance degradation. To effectively detect the emergence of out-of-distribution (OOD) data, this dissertation first proposes a novel, self-supervised approach that evaluates the Mahalanobis distance between in-distribution (ID) and OOD samples in gradient space. A binary classifier is then introduced to guide the label selection for gradient calculation, which further boosts the detection performance. Next, to continuously adapt new OOD classes into the existing knowledge base, a unified framework for novelty detection and continual learning is proposed. The binary classifier, trained to distinguish OOD data from ID, is connected sequentially with the pre-trained model to form an “N + 1” classifier, where “N” represents the prior knowledge of N classes and “1” refers to the newly arrived OOD class. This continual learning process continues as “N+1+1+1+...”, assimilating the knowledge of each new OOD instance into the system. Finally, this dissertation demonstrates the practical implementation of novelty detection and continual learning within the domain of thermal analysis. To rapidly address the impact of voids in thermal interface material (TIM), a continuous adaptation approach is proposed that integrates trainable nodes into the graph at the locations where abnormal thermal behaviors are detected. With minimal training overhead, the model quickly adapts to the changes caused by the defects and regenerates accurate thermal predictions. In summary, this dissertation proposes several algorithms and practical applications in continual learning aimed at enhancing the stability and adaptability of the system. All proposed algorithms are validated through extensive experiments conducted on benchmark datasets such as CIFAR-10, CIFAR-100, and TinyImageNet for continual learning, and on real thermal data for thermal analysis.
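The gradient-space scoring step can be sketched as follows: fit the mean and covariance of gradient vectors collected from ID data, then flag a test sample as OOD when the Mahalanobis distance of its gradient exceeds a threshold. A minimal sketch; how the gradients are formed (which loss, which labels) is exactly what the dissertation's binary classifier is designed to guide.

```python
import numpy as np

def fit_id_gradient_stats(id_grads):
    """id_grads: (N, D) matrix of flattened gradients from ID samples."""
    mu = id_grads.mean(axis=0)
    cov = np.cov(id_grads, rowvar=False)
    cov += 1e-6 * np.eye(cov.shape[0])       # regularize for inversion
    return mu, np.linalg.inv(cov)

def mahalanobis_score(grad, mu, cov_inv):
    """Larger score => farther from the ID gradient statistics => OOD."""
    d = grad - mu
    return float(np.sqrt(d @ cov_inv @ d))
```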
Contributors: Sun, Jingbo (Author) / Cao, Yu (Thesis advisor) / Chhabria, Vidya (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Fan, Deliang (Committee member) / Seo, Jae-Sun (Committee member) / Arizona State University (Publisher)
Created: 2024
Description
Millimeter-wave (mmWave) and sub-terahertz (sub-THz) systems aim to utilize the large bandwidth available at these frequencies. This has the potential to enable several future applications that require high data rates, such as autonomous vehicles and digital twins. These systems, however, face several challenges that must be addressed before their gains can be realized in practice. First, they need to deploy large antenna arrays and use narrow beams to guarantee sufficient receive power, and adjusting the narrow beams of the large antenna arrays incurs massive beam training overhead. Second, sensitivity to blockages is a key challenge for mmWave and THz networks: since these networks mainly rely on line-of-sight (LOS) links, sudden link blockages highly threaten their reliability. Further, when the LOS link is blocked, the network typically needs to hand off the user to another LOS base station, which may incur critical latency, especially if a search over a large codebook of narrow beams is needed. A promising way to tackle both challenges lies in leveraging additional side information such as visual, LiDAR, radar, and position data. These sensors provide rich information about the wireless environment, which can be utilized for fast beam and blockage prediction. This dissertation presents a machine-learning framework for sensing-aided beam and blockage prediction. In particular, for beam prediction, this work proposes to utilize visual and positional data to predict the optimal beam indices. For the first time, this work investigates the sensing-aided beam prediction task in real-world vehicle-to-infrastructure and drone communication scenarios. Similarly, for blockage prediction, this dissertation proposes a multi-modal wireless communication solution that utilizes bimodal machine learning to perform proactive blockage prediction and user hand-off. Evaluations on both real-world and synthetic datasets illustrate the promising performance of the proposed solutions and highlight their potential for next-generation communication and sensing systems.
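In its simplest form, position-aided beam prediction reduces to a lightweight classifier from user coordinates to an index in the beam codebook, which can then restrict the beam sweep to the top few candidates. A hedged PyTorch sketch; the codebook size and network shape are assumptions, not the dissertation's exact architecture.

```python
import torch.nn as nn

NUM_BEAMS = 64  # assumed codebook size

# Maps a normalized (x, y) user position to logits over beam indices;
# trained with cross-entropy against the measured best beam.
beam_predictor = nn.Sequential(
    nn.Linear(2, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, NUM_BEAMS),
)
# At run time, sweeping only the top-k predicted beams (instead of all
# NUM_BEAMS) is what cuts the beam training overhead.
```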
Contributors: Charan, Gouranga (Author) / Alkhateeb, Ahmed (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Turaga, Pavan (Committee member) / Michelusi, Nicolò (Committee member) / Arizona State University (Publisher)
Created: 2024
Description
Medical ultrasound imaging is widely used today because it is non-invasive and cost-effective. Flow estimation helps in accurate diagnosis of vascular diseases and adds an important dimension to medical ultrasound imaging. Traditionally, flow estimation is done using Doppler-based methods, which only estimate velocity in the beam direction. Thus, when blood vessels are nearly orthogonal to the beam direction, the estimates contain large errors. In this dissertation, a low-cost blood flow estimation method that does not have the angle dependency of Doppler-based methods is presented.
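The angle dependency comes from the standard Doppler relation, reproduced here for context:

```latex
% f_d: measured Doppler shift, f_0: transmit frequency,
% c: speed of sound, \theta: beam-to-flow angle
\hat{v} = \frac{c\, f_d}{2 f_0 \cos\theta}
```

As the beam-to-flow angle approaches 90°, the cosine term approaches zero: the axial velocity component vanishes and small errors in the measured Doppler shift are amplified without bound, which is why near-orthogonal vessels defeat Doppler-based estimators.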

First, a velocity estimator based on speckle tracking and synthetic lateral phase is proposed for clutter-free blood flow.

Speckle tracking is based on kernel matching and does not have any angle dependency. While velocity estimation in the axial dimension is accurate, lateral velocity estimation is challenging due to reduced resolution and lack of phase information. This work presents a two-tiered method that first estimates the pixel-level movement using the sum of absolute differences (SAD), and then estimates the sub-pixel-level movement using synthetic phase information in the lateral dimension. Such a method achieves highly accurate velocity estimation with reduced complexity compared to a cross-correlation-based method. The average bias of the proposed estimation method is less than 2% for plug flow and less than 7% for parabolic flow.
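The two-tiered structure can be sketched as integer-pixel SAD matching followed by a sub-pixel refinement. In the sketch below, parabolic interpolation of the SAD surface stands in for the dissertation's synthetic-lateral-phase step (it is not that method); the kernel size and search range are illustrative.

```python
import numpy as np

def sad_displacement(kernel, frame_next, r0, c0, max_disp):
    """Tier 1: integer-pixel displacement of a (K, K) kernel taken at
    (r0, c0) in frame t, searched in frame t+1 by minimizing SAD."""
    K = kernel.shape[0]
    best, best_d = np.inf, (0, 0)
    for dr in range(-max_disp, max_disp + 1):
        for dc in range(-max_disp, max_disp + 1):
            r, c = r0 + dr, c0 + dc
            if r < 0 or c < 0 or r + K > frame_next.shape[0] or c + K > frame_next.shape[1]:
                continue  # candidate window falls outside the frame
            s = np.abs(kernel - frame_next[r:r + K, c:c + K]).sum()
            if s < best:
                best, best_d = s, (dr, dc)
    return best_d

def subpixel_offset(sad_m1, sad_0, sad_p1):
    """Tier 2 (stand-in): parabolic vertex of the three SAD values
    around the integer minimum gives a sub-pixel correction."""
    denom = sad_m1 - 2 * sad_0 + sad_p1
    return 0.0 if denom == 0 else 0.5 * (sad_m1 - sad_p1) / denom
```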

Second, blood is always accompanied by clutter, which originates from the vessel wall and surrounding tissue. As the magnitude of the blood signal is usually 40-60 dB lower than that of the clutter signal, clutter filtering is necessary before blood flow estimation. Clutter filters exploit the high-magnitude and low-frequency features of the clutter signal to effectively remove it from the compound (blood + clutter) signal. Instead of a low-complexity FIR filter or high-complexity SVD-based filters, a power/subspace-iteration-based method is proposed here for clutter filtering. Excellent clutter filtering performance is achieved for both slow- and fast-moving clutter, with lower complexity than SVD-based filters. For instance, use of the proposed method results in a bias of less than 8% and a standard deviation of less than 12% for fast-moving clutter when the beam-to-flow angle is 90°.
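A minimal sketch of the power/subspace-iteration idea: estimate the dominant (clutter) subspace of the slow-time correlation matrix with a few orthogonal iterations, avoiding a full SVD, then project it out of the data. The data layout and clutter rank are assumptions of this sketch.

```python
import numpy as np

def subspace_iteration_clutter_filter(X, clutter_rank=2, iters=5, seed=0):
    """X: (num_samples, slow_time) compound blood+clutter data.
    Clutter dominates the top eigenvectors of the slow-time correlation
    matrix because it is 40-60 dB stronger than blood."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    R = X.conj().T @ X                        # slow-time correlation matrix
    Q, _ = np.linalg.qr(rng.standard_normal((n, clutter_rank)))
    for _ in range(iters):                    # subspace (orthogonal) iteration
        Q, _ = np.linalg.qr(R @ Q)
    return X - (X @ Q) @ Q.conj().T           # project out clutter subspace
```

A handful of iterations typically suffices because of the large clutter-to-blood power ratio, which is also the source of the complexity advantage over a full SVD.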

Third, a flow rate estimation method based on kernel power weighting is proposed. As the velocity estimator is a kernel-based method, the estimation accuracy degrades near the vessel boundary. In order to account for kernels that are not fully inside the vessel, fractional weights are given to these kernels based on their signal power. The proposed method achieves excellent flow rate estimation results with less than 8% bias for both slow- and fast-moving clutter.
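The kernel power weighting can be sketched directly: kernels straddling the vessel wall contribute fractionally, in proportion to their blood-signal power, rather than being counted as fully inside or outside. The normalization by an interior-kernel reference power is an assumption of this sketch.

```python
import numpy as np

def weighted_flow_rate(kernel_velocities, kernel_powers, kernel_area, ref_power):
    """ref_power: typical power of a kernel fully inside the vessel.
    Boundary kernels get weights in (0, 1) based on their power."""
    w = np.clip(np.asarray(kernel_powers) / ref_power, 0.0, 1.0)
    return float(np.sum(w * np.asarray(kernel_velocities)) * kernel_area)
```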

The performance of the velocity estimator is also evaluated on challenging models. A 2D version of our two-tiered method is able to accurately estimate velocity vectors in a spinning disk as well as in a carotid bifurcation model, both of which are part of the 2018 synthetic aperture vector flow imaging (SA-VFI) challenge. In fact, the proposed method ranked 3rd in the challenge on the carotid bifurcation testing dataset. The flow estimation method is also evaluated for blood flow in vessels with stenosis. Simulation results show that the proposed method estimates the flow rate with less than 9% bias.
Contributors: Wei, Siyuan (Author) / Chakrabarti, Chaitali (Thesis advisor) / Papandreou-Suppappola, Antonia (Committee member) / Ogras, Umit Y. (Committee member) / Wenisch, Thomas F. (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
The past decade has seen a tremendous surge in running machine learning (ML) functions on mobile devices, from mere novelty applications to now indispensable features for the next generation of devices.

While the mobile platform capabilities range widely, long battery life and reliability are common design concerns that are crucial to remain competitive.

Consequently, state-of-the-art mobile platforms have become highly heterogeneous, combining powerful CPUs with GPUs to accelerate the computation of deep neural networks (DNNs), the most common structures used to perform ML operations.

Traditional von Neumann architectures, however, are not optimized for the high memory bandwidth and massively parallel computation that DNNs demand, which has propelled research into non-von Neumann architectures that can support them.

Re-imagining computer architectures to perform efficient DNN computations requires focusing on the prohibitive demands presented by DNNs and alleviating them. The two central challenges for efficient computation are (1) the large storage and movement of DNN weights and (2) the massively parallel multiplications needed to compute the DNN output.

Introducing sparsity into DNNs, where a certain percentage of either the weights or the outputs of the DNN are zero, greatly helps with both challenges. This, along with algorithm-hardware co-design to compress the DNNs, is demonstrated to greatly reduce the power consumption of hardware that computes DNNs. Additionally, exploring emerging technologies such as non-volatile memories and 3-D stacking of silicon, in conjunction with algorithm-hardware co-design, will pave the way for the next generation of mobile devices.

Towards the objectives stated above, our specific contributions include (a) an architecture based on a resistive crosspoint array that can update all stored values and compute a matrix-vector multiplication in parallel within a single cycle, (b) a framework for training DNNs with block-wise sparsity to drastically reduce the memory storage and total number of computations required to compute the output of DNNs, (c) an exploration of hardware implementations of sparse DNNs and architectural guidelines to reduce power consumption of implementations in monolithic 3D integrated circuits, and (d) a prototype accelerator chip in 65nm CMOS for long short-term memory (LSTM) networks trained with the proposed block-wise sparsity scheme.
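Contribution (b), block-wise sparsity, can be illustrated with a magnitude-based block pruning pass: weights are zeroed in fixed-size blocks so the hardware can skip whole blocks of storage and MACs rather than scattered scalars. The block shape and target density below are illustrative, not the dissertation's chosen values.

```python
import numpy as np

def blockwise_prune(W, block=(8, 1), density=0.25):
    """Keep the top `density` fraction of (Br, Bc) blocks by L1 norm;
    zero the rest. W's dimensions must be divisible by the block size."""
    Br, Bc = block
    Wb = W.reshape(W.shape[0] // Br, Br, W.shape[1] // Bc, Bc)
    score = np.abs(Wb).sum(axis=(1, 3))          # one L1 score per block
    k = max(1, int(density * score.size))
    thresh = np.partition(score.ravel(), -k)[-k]
    mask = (score >= thresh)[:, None, :, None]   # broadcast over blocks
    return (Wb * mask).reshape(W.shape)
```

Because the zeros are aligned to blocks, the accelerator only needs one index per block instead of one per weight, which is where the storage and computation savings come from.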
Contributors: Kadetotad, Deepak Vinayak (Author) / Seo, Jae-Sun (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Vrudhula, Sarma (Committee member) / Cao, Yu (Committee member) / Arizona State University (Publisher)
Created: 2019
Description
Deep learning (DL) has proved to be one of the most important developments to date, with far-reaching impacts in numerous fields like robotics, computer vision, surveillance, speech processing, machine translation, finance, etc. DL algorithms are now widely used for countless applications because of their ability to generalize real-world data, their robustness to noise in previously unseen data, and their high inference accuracy. With the ability to learn useful features from raw sensor data, deep learning algorithms have outperformed traditional AI algorithms and pushed the boundaries of what can be achieved with AI. In this work, we demonstrate the power of deep learning by developing a neural network to automatically detect cough instances from audio recorded in unconstrained environments. For this, 24-hour-long recordings from 9 different patients are collected and carefully labeled by medical personnel. A pre-processing algorithm is proposed to convert the event-based cough dataset to a more informative dataset with cough start and end times, and data augmentation is introduced to regularize the training procedure. The proposed neural network achieves 92.3% leave-one-out accuracy on data captured in the real world.
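A sketch of the kind of augmentation such a pipeline can use: jittering the labeled start/end boundaries to generate several clips per cough event. The function name and shift amounts are hypothetical, not the dissertation's exact scheme.

```python
def augment_cough_clips(audio, start_s, end_s, sample_rate, shifts=(-0.1, 0.0, 0.1)):
    """audio: 1-D waveform; start_s/end_s: labeled cough boundaries in
    seconds. Returns one clip per time shift, regularizing the model
    against imprecise onset labels."""
    clips = []
    for s in shifts:
        a = max(int((start_s + s) * sample_rate), 0)
        b = min(int((end_s + s) * sample_rate), len(audio))
        clips.append(audio[a:b])
    return clips
```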

Deep neural networks are composed of multiple layers that are compute- and memory-intensive, which makes it difficult to execute these algorithms in real time with low power consumption on existing general-purpose computers. In this work, we propose hardware accelerators for a traditional AI algorithm based on random forests and for two representative deep convolutional neural networks (AlexNet and VGG). With the proposed acceleration techniques, a ~30x performance improvement over a CPU was achieved for random forests. For deep CNNs, we demonstrate that much higher performance can be achieved via architecture-space exploration, using an optimization algorithm that takes system-level performance and area models of the hardware primitives as inputs and minimizes latency under given resource constraints. With this method, ~30 GOPS was achieved on Stratix V FPGA boards.
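The exploration loop rests on fast analytical models rather than simulation. A minimal sketch of such a model, with placeholder parameters, is a per-layer latency estimate that takes the worse of the compute-bound and bandwidth-bound times:

```python
def layer_latency_s(macs, pe_count, freq_hz, bytes_moved, mem_bw_bytes_s):
    """Roofline-style estimate: a layer is limited either by the MACs
    the PE array can issue or by the data the memory system can move."""
    compute_s = macs / (pe_count * freq_hz)
    memory_s = bytes_moved / mem_bw_bytes_s
    return max(compute_s, memory_s)

# An optimizer sweeps design knobs (pe_count, buffer sizes, ...) and
# sums these per-layer estimates to find the lowest-latency design that
# fits the FPGA's resource constraints.
```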

Hardware acceleration of DL algorithms alone is not always the most efficient way, nor sufficient, to achieve the desired performance. There is huge headroom for performance improvement provided the algorithms are designed with the hardware limitations and bottlenecks in mind. This work achieves hardware-software co-optimization of the Non-Maximal Suppression (NMS) algorithm through the proposed algorithmic changes and hardware architecture.

With CMOS scaling coming to an end and memory bandwidth bottlenecks increasing, CMOS-based systems might not scale enough to accommodate the requirements of more complicated and deeper neural networks in the future. In this work, we explore RRAM crossbars and arrays as a compact, high-performing, and energy-efficient alternative to CMOS accelerators for deep learning training and inference. We propose and implement RRAM periphery read and write circuits and achieve a ~3000x performance improvement in online dictionary learning compared to a CPU.

This work also examines realistic RRAM devices and their non-idealities. We present an in-depth study of the effects of RRAM non-idealities on inference accuracy when a pre-trained model is mapped to RRAM-based accelerators. To mitigate this issue, we propose Random Sparse Adaptation (RSA), a novel scheme that tunes the model to compensate for the faults of the RRAM array onto which it is mapped. The proposed method achieves inference accuracy much higher than the traditional read-verify-write (R-V-W) method, and RSA can recover lost inference accuracy 100x-1000x faster than R-V-W. Using 32-bit high-precision RSA cells, we achieved ~10% higher accuracy on faulty RRAM arrays than what can be achieved by mapping a deep network to a 32-level RRAM array with no variations.
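The core of RSA can be sketched as a frozen (faulty) RRAM weight tensor plus a sparse, trainable high-precision correction at randomly chosen positions; only the correction receives gradients, which is why adaptation converges so much faster than per-device R-V-W tuning. A minimal sketch; the sparsity fraction is illustrative.

```python
import torch

def make_rsa_correction(w_rram, frac=0.01, seed=0):
    """w_rram: weights as they actually read back from the faulty array
    (kept frozen). Returns a random sparse mask and a trainable delta."""
    g = torch.Generator().manual_seed(seed)
    mask = (torch.rand(w_rram.shape, generator=g) < frac).float()
    delta = torch.zeros_like(w_rram, requires_grad=True)
    return mask, delta

# Forward pass uses:  w_eff = w_rram.detach() + mask * delta
# Backprop updates only `delta`, i.e., the few high-precision cells.
```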
Contributors: Mohanty, Abinash (Author) / Cao, Yu (Thesis advisor) / Seo, Jae-Sun (Committee member) / Vrudhula, Sarma (Committee member) / Chakrabarti, Chaitali (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
A critical problem for airborne, shipboard, and land-based radars operating in maritime or littoral environments is the detection, identification, and tracking of targets against backscattering caused by the roughness of the sea surface. Statistical models, such as the compound K-distribution (CKD), were shown to accurately describe two separate structures of the sea clutter intensity fluctuations. The first structure is the texture, which is associated with long sea waves and exhibits a long temporal decorrelation period. The second structure is the speckle, which accounts for reflections from multiple scatterers and exhibits a short temporal decorrelation period from pulse to pulse. Existing methods for estimating the CKD model parameters do not include the thermal noise power, which is critical for real sea clutter processing. Estimation methods that include the noise power are either computationally intensive or require very large data records.

This work proposes two new approaches for accurately estimating all three CKD model parameters, including the noise power. The first method iteratively integrates the noise power estimation, using one-dimensional nonlinear curve fitting, with the estimation of the shape and scale parameters, using closed-form solutions in terms of the CKD intensity moments. The second method is similar to the first, except that it replaces integer-order intensity moments with fractional moments, which have been shown to achieve more accurate estimates of the shape parameter. These new methods can be implemented in real time without requiring large data records, and they achieve accurate estimation performance, as demonstrated with simulated and real sea clutter observation datasets. The work also investigates the numerically computed Cramér-Rao lower bound (CRLB) of the variance of the shape parameter estimate using intensity observations in thermal noise with unknown power. Using the CRLB, the asymptotic estimation performance behavior of the new estimators is studied and compared to that of other estimators.
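For context, the closed-form moment relations are simple in the noise-free case: for K-distributed intensity with shape parameter ν, the normalized second moment satisfies E[I²]/E[I]² = 2(ν + 1)/ν, which inverts directly. The sketch below implements only this noiseless special case; the dissertation's contribution is wrapping such closed forms in an iteration with a one-dimensional curve fit for the unknown noise power.

```python
import numpy as np

def k_shape_from_moments(intensity):
    """Noise-free moment estimator of the K-distribution shape
    parameter nu from intensity samples (a simplified special case)."""
    m1 = intensity.mean()
    m2 = (intensity ** 2).mean()
    r = m2 / m1 ** 2        # equals 2 * (1 + 1/nu) for noise-free CKD
    return 2.0 / (r - 2.0)  # valid for r > 2 (spikier than Rayleigh)
```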
Contributors: Northrop, Judith (Author) / Papandreou-Suppappola, Antonia (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Maurer, Alexander (Committee member) / Arizona State University (Publisher)
Created: 2019
Description
In recent years, there has been increased interest in sharing available bandwidth to avoid spectrum congestion. With an ever-increasing number of wireless users, it is critical to develop signal-processing-based spectrum sharing algorithms that achieve cooperative use of the allocated spectrum among multiple systems and reduce interference between them. This work studies the radar and communications coexistence problem using two main approaches. The first approach develops methodologies to increase radar target tracking performance under the low signal-to-interference-plus-noise ratio (SINR) conditions caused by coexisting, strong communications interference. The second approach jointly optimizes the performance of both systems by co-designing a common transmit waveform.

When concentrating on improving radar tracking performance, a pulsed radar tracking a single target while coexisting with high-powered communications interference is considered. Although the Cramér-Rao lower bound (CRLB) on the covariance of an unbiased estimator of deterministic parameters provides a bound on the estimation mean squared error (MSE), there exists an SINR threshold at which the estimator covariance rapidly deviates from the CRLB. After demonstrating, using the Barankin bound (BB), that different radar waveforms experience different estimation SINR thresholds, a new radar waveform design method is proposed based on predicting the waveform-dependent BB SINR threshold under low-SINR operating conditions.

A novel method of predicting the SINR threshold value for maximum likelihood estimation (MLE) is proposed. A relationship is shown to exist between the formulation of the BB kernel and the probability of the MLE selecting sidelobes. This relationship is demonstrated to be an accurate means of threshold prediction for radar estimation of target parameters such as frequency, time delay, and angle of arrival.

For the radar and communications co-design problem, the use of a common transmit waveform for a pulse-Doppler radar and a multiuser communications system is proposed. The signaling scheme for each system is selected from a class of waveforms with nonlinear phase functions by optimizing the waveform parameters to minimize interference between the two systems and among communications users. Using multi-objective optimization, a trade-off in system performance is demonstrated when selecting waveforms that minimize both system interference and tracking MSE.
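One concrete member of such a waveform class is a unit-modulus signal with polynomial phase, whose coefficients serve as the design parameters; a simple proxy for mutual interference between two candidate waveforms is their peak normalized cross-correlation. Both are sketched below as assumptions, not the dissertation's exact waveform family or objective.

```python
import numpy as np

def poly_phase_waveform(n_samples, coeffs):
    """Unit-modulus waveform exp(j*2*pi*phi(t)) with polynomial phase
    phi(t) = c1*t + c2*t^2 + ...; coeffs are the optimizer's knobs."""
    t = np.linspace(-0.5, 0.5, n_samples, endpoint=False)
    phase = np.zeros(n_samples)
    for k, c in enumerate(coeffs, start=1):
        phase += c * t ** k
    return np.exp(2j * np.pi * phase)

def peak_cross_interference(u, v):
    """Peak normalized cross-correlation between two waveforms: one
    scalar a multi-objective optimizer could minimize. Note that
    np.correlate conjugates its second argument internally."""
    xc = np.correlate(u, v, mode="full")
    return float(np.abs(xc).max() / (np.linalg.norm(u) * np.linalg.norm(v)))
```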
Contributors: Kota, John S (Author) / Papandreou-Suppappola, Antonia (Thesis advisor) / Berisha, Visar (Committee member) / Bliss, Daniel (Committee member) / Kovvali, Narayan (Committee member) / Arizona State University (Publisher)
Created: 2016