ASU Electronic Theses and Dissertations
This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations/theses includes degree information, committee members, an abstract, supporting data or media.
In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.
Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection contact or visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.
Filtering by
- Creators: Ogras, Umit Y.
inherent low voltage stability. Moreover, designs increasingly use compiled instead of custom memory blocks, which frequently employ static, rather than pre-charged dynamic RFs. In this work, the various RFs designed for a microprocessor cache and register files are discussed. Comparison between static and dynamic RF power dissipation and timing characteristics is also presented. The relative timing and power advantages of the designs are shown to be dependent on the memory aspect ratio, i.e. array width and height.
In the first part of dissertation, a high performance Li-ion switching battery charger is proposed. Cascaded two loop (CTL) control architecture is used for seamless CC-CV transition, time based technique is utilized to minimize controller area and power consumption. Time domain controller is implemented by using voltage controlled oscillator (VCO) and voltage controlled delay line (VCDL). Several efficiency improvement techniques such as segmented power-FET, quasi-zero voltage switching (QZVS) and switching frequency reduction are proposed. The proposed switching battery charger is able to provide maximum 2 A charging current and has an peak efficiency of 93.3%. By configure the charger as boost converter, the charger is able to provide maximum 1.5 A charging current while achieving 96.3% peak efficiency.
The second part of dissertation presents a digital low dropout regulator (DLDO) for system on a chip (SoC) in portable devices application. The proposed DLDO achieve fast transient settling time, lower undershoot/overshoot and higher PSR performance compared to state of the art. By having a good PSR performance, the proposed DLDO is able to power mixed signal load. To achieve a fast load transient response, a load transient detector (LTD) enables boost mode operation of the digital PI controller. The boost mode operation achieves sub microsecond settling time, and reduces the settling time by 50% to 250 ns, undershoot/overshoot by 35% to 250 mV and 17% to 125 mV without compromising the system stability.
To enable real-time video analytics in emerging large scale Internet of things (IoT) applications, the computation must happen at the network edge (near the cameras) in a distributed fashion. Thus, edge computing must be adopted. Recent studies have shown that field-programmable gate arrays (FPGAs) are highly suitable for edge computing due to their architecture adaptiveness, high computational throughput for streaming processing, and high energy efficiency.
This thesis presents a generic OpenCL-defined CNN accelerator architecture optimized for FPGA-based real-time video analytics on edge. The proposed CNN OpenCL kernel adopts a highly pipelined and parallelized 1-D systolic array architecture, which explores both spatial and temporal parallelism for energy efficiency CNN acceleration on FPGAs. The large fan-in and fan-out of computational units to the memory interface are identified as the limiting factor in existing designs that causes scalability issues, and solutions are proposed to resolve the issue with compiler automation. The proposed CNN kernel is highly scalable and parameterized by three architecture parameters, namely pe_num, reuse_fac, and vec_fac, which can be adapted to achieve 100% utilization of the coarse-grained computation resources (e.g., DSP blocks) for a given FPGA. The proposed CNN kernel is generic and can be used to accelerate a wide range of CNN models without recompiling the FPGA kernel hardware. The performance of Alexnet, Resnet-50, Retinanet, and Light-weight Retinanet has been measured by the proposed CNN kernel on Intel Arria 10 GX1150 FPGA. The measurement result shows that the proposed CNN kernel, when mapped with 100% utilization of computation resources, can achieve a latency of 11ms, 84ms, 1614.9ms, and 990.34ms for Alexnet, Resnet-50, Retinanet, and Light-weight Retinanet respectively when the input feature maps and weights are represented using 32-bit floating-point data type.