Matching Items (2)
Filtering by

Clear all filters

154195-Thumbnail Image.png
Description
Improving energy efficiency has always been the prime objective of the custom and automated digital circuit design techniques. As a result, a multitude of methods to reduce power without sacrificing performance have been proposed. However, as the field of design automation has matured over the last few decades, there have

Improving energy efficiency has always been the prime objective of the custom and automated digital circuit design techniques. As a result, a multitude of methods to reduce power without sacrificing performance have been proposed. However, as the field of design automation has matured over the last few decades, there have been no new automated design techniques, that can provide considerable improvements in circuit power, leakage and area. Although emerging nano-devices are expected to replace the existing MOSFET devices, they are far from being as mature as semiconductor devices and their full potential and promises are many years away from being practical.

The research described in this dissertation consists of four main parts. First is a new circuit architecture of a differential threshold logic flipflop called PNAND. The PNAND gate is an edge-triggered multi-input sequential cell whose next state function is a threshold function of its inputs. Second a new approach, called hybridization, that replaces flipflops and parts of their logic cones with PNAND cells is described. The resulting \hybrid circuit, which consists of conventional logic cells and PNANDs, is shown to have significantly less power consumption, smaller area, less standby power and less power variation.

Third, a new architecture of a field programmable array, called field programmable threshold logic array (FPTLA), in which the standard lookup table (LUT) is replaced by a PNAND is described. The FPTLA is shown to have as much as 50% lower energy-delay product compared to conventional FPGA using well known FPGA modeling tool called VPR.

Fourth, a novel clock skewing technique that makes use of the completion detection feature of the differential mode flipflops is described. This clock skewing method improves the area and power of the ASIC circuits by increasing slack on timing paths. An additional advantage of this method is the elimination of hold time violation on given short paths.

Several circuit design methodologies such as retiming and asynchronous circuit design can use the proposed threshold logic gate effectively. Therefore, the use of threshold logic flipflops in conventional design methodologies opens new avenues of research towards more energy-efficient circuits.
ContributorsKulkarni, Niranjan (Author) / Vrudhula, Sarma (Thesis advisor) / Colbourn, Charles (Committee member) / Seo, Jae-Sun (Committee member) / Yu, Shimeng (Committee member) / Arizona State University (Publisher)
Created2015
157804-Thumbnail Image.png
Description
While machine/deep learning algorithms have been successfully used in many practical applications including object detection and image/video classification, accurate, fast, and low-power hardware implementations of such algorithms are still a challenging task, especially for mobile systems such as Internet of Things, autonomous vehicles, and smart drones.

This work presents an energy-efficient

While machine/deep learning algorithms have been successfully used in many practical applications including object detection and image/video classification, accurate, fast, and low-power hardware implementations of such algorithms are still a challenging task, especially for mobile systems such as Internet of Things, autonomous vehicles, and smart drones.

This work presents an energy-efficient programmable application-specific integrated circuit (ASIC) accelerator for object detection. The proposed ASIC supports multi-class (face/traffic sign/car license plate/pedestrian), many-object (up to 50) in one image with different sizes (6 down-/11 up-scaling), and high accuracy (87% for face detection datasets). The proposed accelerator is composed of an integral channel detector with 2,000 classifiers for five rigid boosted templates to make a strong object detection. By jointly optimizing the algorithm and efficient hardware architecture, the prototype chip implemented in 65nm demonstrates real-time object detection of 20-50 frames/s with 22.5-181.7mW (0.54-1.75nJ/pixel) at 0.58-1.1V supply.



In this work, to reduce computation without accuracy degradation, an energy-efficient deep convolutional neural network (DCNN) accelerator is proposed based on a novel conditional computing scheme and integrates convolution with subsequent max-pooling operations. This way, the total number of bit-wise convolutions could be reduced by ~2x, without affecting the output feature values. This work also has been developing an optimized dataflow that exploits sparsity, maximizes data re-use and minimizes off-chip memory access, which can improve upon existing hardware works. The total off-chip memory access can be saved by 2.12x. Preliminary results of the proposed DCNN accelerator achieved a peak 7.35 TOPS/W for VGG-16 by post-layout simulation results in 40nm.

A number of recent efforts have attempted to design custom inference engine based on various approaches, including the systolic architecture, near memory processing, and in-meomry computing concept. This work evaluates a comprehensive comparison of these various approaches in a unified framework. This work also presents the proposed energy-efficient in-memory computing accelerator for deep neural networks (DNNs) by integrating many instances of in-memory computing macros with an ensemble of peripheral digital circuits, which supports configurable multibit activations and large-scale DNNs seamlessly while substantially improving the chip-level energy-efficiency. Proposed accelerator is fully designed in 65nm, demonstrating ultralow energy consumption for DNNs.
ContributorsKim, Minkyu (Author) / Seo, Jae-Sun (Thesis advisor) / Cao, Yu Kevin (Committee member) / Vrudhula, Sarma (Committee member) / Ogras, Umit Y. (Committee member) / Arizona State University (Publisher)
Created2019