<?xml version="1.0"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"><responseDate>2026-05-24T03:04:06Z</responseDate><request verb="GetRecord" metadataPrefix="oai_dc">https://keep.lib.asu.edu/oai/request</request><GetRecord><record><header><identifier>oai:keep.lib.asu.edu:node-157804</identifier><datestamp>2024-12-20T18:25:12Z</datestamp><setSpec>oai_pmh:all</setSpec><setSpec>oai_pmh:repo_items</setSpec></header><metadata><oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd"><dc:identifier>157804</dc:identifier>
          <dc:identifier>https://hdl.handle.net/2286/R.I.55506</dc:identifier>
                  <dc:rights>http://rightsstatements.org/vocab/InC/1.0/</dc:rights>
                  <dc:date>2019</dc:date>
                  <dc:format>133 pages</dc:format>
                  <dc:type>Doctoral Dissertation</dc:type>
          <dc:type>Academic theses</dc:type>
          <dc:type>Text</dc:type>
                  <dc:language>eng</dc:language>
                  <dc:contributor>Kim, Minkyu</dc:contributor>
          <dc:contributor>Seo, Jae-Sun</dc:contributor>
          <dc:contributor>Cao, Yu Kevin</dc:contributor>
          <dc:contributor>Vrudhula, Sarma</dc:contributor>
          <dc:contributor>Ogras, Umit Y.</dc:contributor>
          <dc:contributor>Arizona State University</dc:contributor>
                  <dc:description>Doctoral Dissertation Electrical Engineering 2019</dc:description>
          <dc:description>While machine/deep learning algorithms have been successfully used in many practical applications including object detection and image/video classification, accurate, fast, and low-power hardware implementation of such algorithms remains a challenging task, especially for mobile systems such as Internet of Things devices, autonomous vehicles, and smart drones.&lt;br/&gt;&lt;br/&gt;This work presents an energy-efficient programmable application-specific integrated circuit (ASIC) accelerator for object detection. The proposed ASIC supports multi-class detection (face/traffic sign/car license plate/pedestrian) of many objects (up to 50) in one image at different sizes (6 down-/11 up-scaling) with high accuracy (87% for face detection datasets). The proposed accelerator is composed of an integral channel detector with 2,000 classifiers for five rigid boosted templates to form a strong object detector. By jointly optimizing the algorithm and efficient hardware architecture, the prototype chip implemented in 65nm demonstrates real-time object detection of 20-50 frames/s with 22.5-181.7mW (0.54-1.75nJ/pixel) at 0.58-1.1V supply.&lt;br/&gt;&lt;br/&gt;To reduce computation without accuracy degradation, this work also proposes an energy-efficient deep convolutional neural network (DCNN) accelerator that is based on a novel conditional computing scheme and integrates convolution with subsequent max-pooling operations. This way, the total number of bit-wise convolutions can be reduced by ~2x without affecting the output feature values. This work also develops an optimized dataflow that exploits sparsity, maximizes data re-use, and minimizes off-chip memory access, which can improve upon existing hardware designs. The total off-chip memory access can be reduced by 2.12x.
Preliminary post-layout simulation results in 40nm show that the proposed DCNN accelerator achieves a peak of 7.35 TOPS/W for VGG-16.&lt;br/&gt;&lt;br/&gt;A number of recent efforts have attempted to design custom inference engines based on various approaches, including the systolic architecture, near-memory processing, and the in-memory computing concept. This work presents a comprehensive comparison of these approaches in a unified framework. This work also presents an energy-efficient in-memory computing accelerator for deep neural networks (DNNs) that integrates many instances of in-memory computing macros with an ensemble of peripheral digital circuits, supporting configurable multibit activations and large-scale DNNs seamlessly while substantially improving chip-level energy efficiency. The proposed accelerator is fully designed in 65nm, demonstrating ultralow energy consumption for DNNs.</dc:description>
                  <dc:subject>Electrical Engineering</dc:subject>
          <dc:subject>ASIC</dc:subject>
          <dc:subject>Deep learning</dc:subject>
          <dc:subject>hardware accelerator</dc:subject>
          <dc:subject>Machine learning</dc:subject>
          <dc:subject>Neural Networks</dc:subject>
                  <dc:title>Energy-Efficient ASIC Accelerators for Machine/Deep Learning Algorithms</dc:title></oai_dc:dc></metadata></record></GetRecord></OAI-PMH>
