Matching Items (6)

Description
Component efficiency is an ever more important concern for portable applications, where a finite battery means finite operating time. Higher-efficiency devices must be designed without compromising the performance that consumers have come to expect. Class D amplifiers deliver increased efficiency, but at the cost of distortion; Class AB amplifiers have low efficiency but high linearity. By modulating the supply voltage of a Class AB amplifier to form a Class H amplifier, efficiency can be increased while maintaining Class AB linearity. A Class AB amplifier with a 92 dB Power Supply Rejection Ratio (PSRR) and a Class H amplifier were designed in a 0.24 um process for portable audio applications. A multiphase buck converter increased the efficiency of the Class H amplifier while maintaining a response time fast enough to track audio frequencies. The Class H amplifier exceeded the efficiency of the Class AB amplifier by 5-7% over 5-30 mW of output power without affecting total harmonic distortion (THD) at the design specifications. The Class H design met all design specifications, showed performance comparable to the designed Class AB amplifier across 1 kHz-20 kHz and 0.01-30 mW, and delivered 30 mW into 16 Ohms without any increase in THD. These results show that Class H amplifiers merit further research into their potential for increasing audio-amplifier efficiency, and that even simple designs can yield significant efficiency gains without compromising linearity.
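
The efficiency gap described above can be seen from the ideal amplifier relations. As a rough illustration (not taken from the thesis), the sketch below compares the ideal Class B/AB efficiency from a fixed supply rail with the efficiency when the rail tracks the output envelope, as a Class H stage does. The 1.8 V rail, 0.2 V tracking headroom, and 16-Ohm load are assumed values, and converter, crossover, and bias losses are ignored.

import math

def class_b_efficiency(v_out_peak, v_supply):
    # Ideal Class B/AB efficiency for a sine output from a fixed rail:
    # eta = (pi/4) * (Vpeak / Vdd); crossover and bias losses are ignored.
    return (math.pi / 4) * (v_out_peak / v_supply)

def class_h_efficiency(v_out_peak, headroom=0.2):
    # Class H: the rail tracks the output envelope, so Vdd ~= Vpeak + headroom
    # (assumed 0.2 V); losses of the tracking converter are ignored.
    return class_b_efficiency(v_out_peak, v_out_peak + headroom)

r_load = 16.0  # ohms (assumed load)
for p_mw in (5, 10, 20, 30):
    v_peak = math.sqrt(2 * (p_mw / 1000.0) * r_load)  # sine peak for P = Vp^2 / (2R)
    fixed = class_b_efficiency(v_peak, v_supply=1.8)  # assumed 1.8 V fixed rail
    tracked = class_h_efficiency(v_peak)
    print(f"{p_mw:>2} mW: fixed rail {fixed:.0%}, tracked rail {tracked:.0%}")

At low output swings the fixed rail wastes most of its headroom, which is exactly where a tracked rail helps most.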
Contributors: Peterson, Cory (Author) / Bakkaloglu, Bertan (Thesis advisor) / Barnaby, Hugh (Committee member) / Kiaei, Sayfe (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
As the number of cores per chip increases, maintaining cache coherence becomes prohibitive in both power and performance. Non-Coherent Cache (NCC) architectures do away with hardware-based cache coherence, but they are difficult to program. Some existing architectures provide a middle ground by offering some shared memory in hardware. Specifically, the 48-core Intel Single-chip Cloud Computer (SCC) provides some off-chip (DRAM) shared memory and some on-chip (SRAM) shared memory. We call such architectures Hybrid Shared Memory, or HSM, manycore architectures. However, how to efficiently execute multi-threaded programs on HSM architectures is an open problem. To execute a multi-threaded program correctly on an HSM architecture, the compiler must: i) identify all the shared data and map it to the shared memory, and ii) map the frequently accessed shared data to the on-chip shared memory. This work presents a source-to-source translator, written using CETUS, that identifies a conservative superset of all the shared data in a multi-threaded application and maps it to the shared memory, thereby enabling execution on HSM architectures.
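
As a rough, hypothetical illustration of the second compiler step (placing the frequently accessed shared data in the small on-chip memory), the sketch below does a greedy placement by access density under an SRAM budget. The variable list, sizes, profile counts, and 8 KB budget are invented for the example; the thesis's actual CETUS-based analysis operates on C source, not on this kind of summary.

from dataclasses import dataclass

@dataclass
class Variable:
    name: str
    size_bytes: int
    access_count: int  # e.g. from profiling
    shared: bool       # conservatively shared if it may escape to another thread

def map_shared_data(variables, sram_budget_bytes):
    # Greedy placement: the most frequently accessed shared data (by accesses
    # per byte) goes to on-chip SRAM until the budget runs out; the remaining
    # shared data goes to off-chip DRAM; private data stays core-local.
    placement, used = {}, 0
    shared = sorted((v for v in variables if v.shared),
                    key=lambda v: v.access_count / v.size_bytes, reverse=True)
    for v in shared:
        if used + v.size_bytes <= sram_budget_bytes:
            placement[v.name] = "on-chip SRAM"
            used += v.size_bytes
        else:
            placement[v.name] = "off-chip DRAM"
    for v in variables:
        placement.setdefault(v.name, "core-local (private)")
    return placement

prog = [Variable("lock", 4, 90000, True),
        Variable("work_queue", 4096, 50000, True),
        Variable("results", 65536, 8000, True),
        Variable("tmp_buf", 1024, 120000, False)]
for name, where in map_shared_data(prog, sram_budget_bytes=8192).items():
    print(f"{name:<12} -> {where}")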
Contributors: Rawat, Tushar (Author) / Shrivastava, Aviral (Thesis advisor) / Dasgupta, Partha (Committee member) / Fainekos, Georgios (Committee member) / Arizona State University (Publisher)
Created: 2014
Description
Class D amplifiers are widely used in portable systems such as mobile phones to achieve high efficiency. The demands of portable electronics for low power consumption, to extend battery life and reduce heat dissipation, mandate efficient, high-performance audio amplifiers; the high efficiency of Class D amplifiers (CDAs) makes them particularly attractive for portable applications. The digital Class D amplifier is an interesting solution for increasing the efficiency of embedded systems, but on its own it falls short in PWM-stage linearity and power supply rejection. Efficient control is needed to correct these error sources in order to achieve high-fidelity sound quality across the whole audio frequency range. A fundamental analysis of various error sources due to non-idealities in the power stage is presented here, with a key focus on power supply perturbations driving the power stage of a Class D audio amplifier. Two types of closed-loop digital Class D architecture for PSRR improvement are proposed and modeled, both using double-sided uniform-sampling modulation. One architecture uses feedback around the power stage, and the second uses feedback into the digital domain. Simulation and experimental results confirm that the closed-loop PSRR and PS-IMD improve by around 30-40 dB and 25 dB, respectively.
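
A back-of-the-envelope way to see where a 30-40 dB closed-loop PSRR improvement can come from: supply ripple injected at the power stage inside a feedback loop is suppressed by roughly |1 + T(jw)|, where T is the loop gain. The sketch below assumes a simple single-pole loop with a 200 kHz unity-gain bandwidth purely for illustration; it is not the loop design from the thesis.

import math

def loop_gain(f_hz, unity_gain_bw_hz=200e3):
    # Assumed single-pole (integrator-like) loop: |T(jw)| = f_ugb / f.
    return unity_gain_bw_hz / f_hz

def psrr_improvement_db(f_hz):
    # Ripple injected inside the loop is attenuated by |1 + T(jw)|.
    return 20 * math.log10(abs(1 + loop_gain(f_hz)))

for f in (200, 1_000, 20_000):
    print(f"{f:>6} Hz: closed-loop PSRR improves by ~{psrr_improvement_db(f):.0f} dB")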
Contributors: Chakraborty, Bijeta (Author) / Bakkaloglu, Bertan (Thesis advisor) / Garrity, Douglas (Committee member) / Ozev, Sule (Committee member) / Arizona State University (Publisher)
Created: 2012
Description
With the end of Dennard scaling and Moore's law, architects have moved towards heterogeneous designs consisting of specialized cores to achieve higher performance and energy efficiency for a target application domain. Applications of linear algebra are ubiquitous in scientific computing, machine learning, statistics, and related fields, with matrix computations being fundamental to these linear algebra-based solutions. Designing multiple dense (or sparse) matrix computation routines on the same platform is quite challenging, and the difficulty is compounded by the fact that dense and sparse matrix computations have large differences in their storage and access patterns and are difficult to optimize on the same architecture. This thesis addresses this challenge and introduces a reconfigurable accelerator that supports both dense and sparse matrix computations efficiently.

The reconfigurable architecture has been optimized to execute the following linear algebra routines: GEMV (Dense General Matrix Vector Multiplication), GEMM (Dense General Matrix Matrix Multiplication), TRSM (Triangular Matrix Solver), LU Decomposition, Matrix Inverse, SpMV (Sparse Matrix Vector Multiplication), and SpMM (Sparse Matrix Matrix Multiplication). It is a multicore architecture in which each core consists of a 2D array of processing elements (PEs).

The 2D PE array is of size 4x4 and is scheduled to perform 4x4 matrix updates efficiently; a sequence of such updates is used to solve a larger problem inside a core. A novel partitioned block compressed sparse data structure (PBCSC/PBCSR) is used to perform sparse kernel updates. Scalable partitioning and mapping schemes are presented that map input matrices of any given size onto the multicore architecture. Design trade-offs related to the PE array dimension, the size of local memory inside a core, and the bandwidth between the on-chip memories and the cores are presented, and an optimal core configuration is developed from this analysis. Synthesis results using a 7nm PDK show that the proposed accelerator can achieve a performance of up to 32 GOPS using a single core.
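
As a software analogy (not the accelerator's hardware), the sketch below expresses a larger GEMM as a sequence of 4x4 tile updates, which is the scheduling idea the 4x4 PE array exploits. Matrix sizes are assumed to be multiples of 4 here, whereas the thesis's partitioning scheme handles arbitrary sizes.

TILE = 4  # matches the 4x4 PE array

def blocked_gemm(A, B, n):
    # C = A @ B for n x n row-major matrices (lists of lists), computed as a
    # sequence of 4x4 tile updates: C_tile += A_tile * B_tile. On the
    # accelerator, each such inner update would map onto the PE array.
    C = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, TILE):
        for j0 in range(0, n, TILE):
            for k0 in range(0, n, TILE):
                for i in range(i0, i0 + TILE):
                    for j in range(j0, j0 + TILE):
                        acc = C[i][j]
                        for k in range(k0, k0 + TILE):
                            acc += A[i][k] * B[k][j]
                        C[i][j] = acc
    return C

n = 8  # assumed multiple of TILE for this sketch
A = [[float(i + j) for j in range(n)] for i in range(n)]
ident = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
assert blocked_gemm(A, ident, n) == A  # multiplying by the identity returns A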
Contributors: Animesh, Saurabh (Author) / Chakrabarti, Chaitali (Thesis advisor) / Brunhaver, John (Committee member) / Ren, Fengbo (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
In this thesis, a digital-input Class D audio amplifier system that can reject power supply noise and the nonlinearity of the output stage is presented. The main digital Class D feed-forward path uses a fully digital sigma-delta PWM open-loop topology, and a feedback loop is used to suppress power supply noise and harmonic distortion. The design uses GlobalFoundries 0.18 um technology.

Based on simulation, the power supply rejection at 200 Hz is about -49 dB, with 81 dB dynamic range and -70 dB THD+N. The full-scale output power can reach as high as 27 mW while keeping THD+N at or below -68 dB. The system efficiency at full scale is about 82%.
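
To illustrate the fully digital sigma-delta front end in the feed-forward path, the sketch below implements a generic first-order digital sigma-delta quantizer that coarsens a high-resolution sample stream to a small number of duty levels while shaping the quantization error toward high frequencies. The 384 kHz rate, 16 levels, and first-order loop are assumptions made for illustration, not the topology used in the thesis.

import math

def first_order_sigma_delta(samples, levels=16):
    # Generic first-order digital sigma-delta: the accumulated error between
    # the input and the previous coarse output is quantized each sample,
    # pushing quantization noise toward high frequencies.
    out, acc = [], 0.0
    step = 2.0 / levels                      # quantizer step over [-1, 1]
    for x in samples:                        # x assumed in [-1, 1]
        acc += x - (out[-1] if out else 0.0)
        q = max(-1.0, min(1.0, round(acc / step) * step))
        out.append(q)
    return out

fs, f0 = 384_000, 1_000                      # assumed oversampling rate and test tone
tone = [0.5 * math.sin(2 * math.pi * f0 * n / fs) for n in range(2048)]
duty = first_order_sigma_delta(tone)
print("first 8 coarse duty values:", [round(v, 3) for v in duty[:8]])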
Contributors: Bai, Jing (Author) / Bakkaloglu, Bertan (Thesis advisor) / Arizona State University (Publisher)
Created: 2015
Description
Advances in electronics technology and innovative manufacturing processes have driven the semiconductor industry toward extensive miniaturization and ever-greater integration of chip design. One consequence of this sustained evolution has been the growing relative cost of accessing off-chip components, with external memory being one of the dominant contributors. In embedded systems and applications, where power consumption and cost are crucial factors, on-chip Scratch Pad Memories (SPMs) have proven to be a good alternative to caches. SPMs are more efficient than on-chip caches in a wide variety of aspects, including energy consumption, power dissipation, speed, area, and timing predictability; at the same time, however, they entail explicit software-level management. Specifically, system performance depends on the overlay scheme used to map code and data onto the size-limited SPMs, and for applications with large code sizes the overlay overhead becomes significant. This work evaluates and implements pre-fetching as a performance-improvement technique for SPMs, implemented in the code overlay manager provided with the Cell Broadband Engine (CBE) Synergistic Processing Unit (SPU) compiler from IBM, spu-gcc. Four different approaches proposed in this work use profiling information to predict pre-fetch calls. The pre-fetching technique achieves considerable performance improvement by hiding some of the code-overlay cost behind active computation, fetching the required code segment into the SPM in advance. Experimental results supporting this claim, obtained on the IBM Cell architecture platform, show substantial gains of more than 30%.
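
A toy model (not the spu-gcc overlay manager itself) of how profile-guided pre-fetching hides overlay cost: if the profiled most-likely successor of the current code segment is fetched while that segment is still computing, its DMA latency never shows up as a stall. The segment trace, successor table, and latency constant below are invented for the example.

OVERLAY_LATENCY = 1000  # cycles to DMA one code segment into the SPM (assumed)

# Profiled most-likely successor of each code segment (assumed profile).
profile_next = {"A": "B", "B": "C", "C": "A"}

def run(trace, prefetch=True):
    # Single resident overlay region (a simplification); a miss stalls for the
    # full DMA latency, while a correctly predicted prefetch overlaps with the
    # current segment's computation and costs no stall cycles.
    resident, stalls = {"A"}, 0
    for seg in trace:
        if seg not in resident:
            stalls += OVERLAY_LATENCY
            resident = {seg}
        nxt = profile_next.get(seg)
        if prefetch and nxt and nxt not in resident:
            resident.add(nxt)  # prefetched during this segment's computation
    return stalls

trace = ["A", "B", "C", "A", "B", "C", "A", "D", "A"]
print("stall cycles, no prefetch:  ", run(trace, prefetch=False))
print("stall cycles, with prefetch:", run(trace, prefetch=True))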
Contributors: Ghadge, Nikhil Dadasaheb (Author) / Chatha, Karamvir (Thesis advisor) / Shrivastava, Aviral (Committee member) / Lee, Yann-Hang (Committee member) / Arizona State University (Publisher)
Created: 2011