Matching Items (6)
Description
The dissolution of metal layers such as silver into chalcogenide glass layers such as germanium selenide substantially changes the resistivity of the metal and chalcogenide films. The incorporation of the metal is known to be achievable by ultraviolet light exposure or thermal processes. In this work, metal dissolution by exposure to gamma radiation is explored for radiation sensor applications. Test structures were designed and a process flow was developed for prototype sensor fabrication. The test structures were designed so that sensitivity to radiation could be studied. The focus is on the effect of gamma rays as well as ultraviolet light on silver dissolution in germanium selenide (Ge30Se70) chalcogenide glass. Ultraviolet radiation testing was used prior to gamma exposure to assess the basic mechanism. The test structures were electrically characterized before and after irradiation to assess the resistance change due to metal dissolution. A change in resistance was observed after irradiation and was found to depend on the radiation dose. The structures were also characterized using atomic force microscopy, and roughness measurements were made before and after irradiation. A change in roughness of the silver films on Ge30Se70 was observed following exposure. This indicated a loss of continuity of the film, which causes the increase in silver film resistance following irradiation. Recovery of the initial resistance in the structures was also observed after the radiation stress was removed. This recovery was explained by photo-stimulated deposition of silver from the chalcogenide at room temperature, confirmed by the re-appearance of silver dendrites on the chalcogenide surface. The results demonstrate that it is possible to use the metal dissolution effect in radiation sensing applications.
Contributors: Chandran, Ankitha (Author) / Kozicki, Michael N (Thesis advisor) / Holbert, Keith E. (Committee member) / Barnaby, Hugh (Committee member) / Arizona State University (Publisher)
Created: 2012
Description
Component efficiency is increasingly important in portable applications, where a finite battery means finite operating time. Higher-efficiency devices need to be designed that do not compromise the performance that consumers have come to expect. Class D amplifiers deliver on the goal of increased efficiency, but at the cost of distortion. Class AB amplifiers have low efficiency, but high linearity. By modulating the supply voltage of a Class AB amplifier to make a Class H amplifier, the efficiency can be increased while still maintaining the Class AB level of linearity. A 92 dB Power Supply Rejection Ratio (PSRR) Class AB amplifier and a Class H amplifier were designed in a 0.24 µm process for portable audio applications. Using a multiphase buck converter increased the efficiency of the Class H amplifier while still maintaining a response time fast enough to track audio frequencies. The efficiency of the Class H amplifier was 5-7% higher than that of the Class AB amplifier for output powers from 5 mW to 30 mW, without affecting the total harmonic distortion (THD) at the design specifications. The Class H amplifier design met all design specifications and showed performance comparable to the designed Class AB amplifier across 1 kHz-20 kHz and 0.01 mW-30 mW. The Class H design was able to output 30 mW into 16 Ω without any increase in THD. This design shows that Class H amplifiers merit more research into their potential for increasing the efficiency of audio amplifiers and that even simple designs can give significant increases in efficiency without compromising linearity.
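
The efficiency argument in this abstract can be illustrated with the textbook ideal class-B result, in which efficiency for a sine output of amplitude V_peak on rails of ±V_supply is (π/4)·(V_peak/V_supply). The sketch below is not the thesis's model: the function names, the fixed headroom, and the lossless converter default are all assumptions made for illustration, and an idealized model like this will not reproduce the measured 5-7% figure.

```python
import math


def class_b_efficiency(v_peak, v_supply):
    """Ideal class-B efficiency for a sine of amplitude v_peak on rails of +/-v_supply."""
    if not 0 < v_peak <= v_supply:
        raise ValueError("need 0 < v_peak <= v_supply")
    return (math.pi / 4) * v_peak / v_supply


def tracked_supply_efficiency(v_peak, headroom, converter_eff=1.0):
    """Same stage, but the rail is set to v_peak + headroom (hypothetical tracking supply).

    converter_eff models the loss of the supply modulator (1.0 means lossless, an assumption).
    """
    return converter_eff * class_b_efficiency(v_peak, v_peak + headroom)


# Quiet signal on a fixed 1.0 V rail versus a rail that follows the signal.
fixed = class_b_efficiency(0.25, 1.0)
tracked = tracked_supply_efficiency(0.25, headroom=0.1, converter_eff=0.9)
print(f"fixed rail: {fixed:.3f}, tracked rail: {tracked:.3f}")
```

The point of the comparison is only qualitative: at low output amplitudes a fixed rail wastes most of the supply voltage across the output devices, while a rail that follows the signal keeps the ratio V_peak/V_supply close to one.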
Contributors: Peterson, Cory (Author) / Bakkaloglu, Bertan (Thesis advisor) / Barnaby, Hugh (Committee member) / Kiaei, Sayfe (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Static CMOS logic has remained the dominant design style of digital systems for more than four decades due to its robustness and near-zero standby current. Static CMOS logic circuits consist of a network of combinational logic cells and clocked sequential elements, such as latches and flip-flops, that are used for sequencing computations over time. The majority of the digital design techniques to reduce power, area, and leakage over the past four decades have focused almost entirely on optimizing the combinational logic. This work explores alternate architectures for the flip-flops for improving the overall circuit performance, power and area. It consists of three main sections.

The first is the design of a multi-input configurable flip-flop structure with embedded logic. A conventional D-type flip-flop may be viewed as realizing an identity function, in which the output is simply the value of the input sampled at the clock edge. In contrast, the proposed multi-input flip-flop, named PNAND, can be configured to realize one of a family of Boolean functions called threshold functions. In essence, the PNAND is a circuit implementation of the well-known binary perceptron. Unlike other reconfigurable circuits, a PNAND can be configured by simply changing the assignment of signals to its inputs. Using a standard cell library of such gates, a technology mapping algorithm can be applied to transform a given netlist into one with an optimal mixture of conventional logic gates and threshold gates. This approach was used to fabricate a 32-bit Wallace tree multiplier and a 32-bit Booth multiplier in 65nm LP technology. Simulation and chip measurements show more than 30% improvement in dynamic power and more than 20% reduction in core area.
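
A threshold function outputs 1 when a weighted sum of its binary inputs reaches a threshold, which is what a binary perceptron computes. The sketch below illustrates how one fixed gate could realize different Boolean functions purely through the assignment of signals to its inputs. The 3-input cell, its unit weights, and its threshold of 2 are hypothetical choices for illustration, not the PNAND's actual parameters.

```python
from itertools import product


def threshold_gate(inputs, weights, threshold):
    """Return 1 when the weighted sum of the binary inputs reaches the threshold."""
    return int(sum(w * x for w, x in zip(weights, inputs)) >= threshold)


def cell(p, q, r):
    """Hypothetical fixed cell: unit weights, threshold 2 (a 3-input majority)."""
    return threshold_gate((p, q, r), (1, 1, 1), 2)


def and2(a, b):
    # Third input tied to constant 0: a + b + 0 >= 2 holds only when a = b = 1.
    return cell(a, b, 0)


def or2(a, b):
    # Third input tied to constant 1: a + b + 1 >= 2 holds when a or b is 1.
    return cell(a, b, 1)


for a, b in product((0, 1), repeat=2):
    print(a, b, "AND:", and2(a, b), "OR:", or2(a, b))
```

The cell itself never changes; only the wiring of its inputs does, which is the sense in which such a gate is configured without extra configuration storage.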

The functional yield of the PNAND decreases with geometry and voltage scaling. The second part of this research investigates the use of two mechanisms to improve the robustness of the PNAND circuit architecture. One is the use of forward and reverse body biases to change the device threshold and the other is the use of RRAM devices for low voltage operation.

The third part of this research focuses on the design of flip-flops with non-volatile storage. Spin-transfer torque magnetic tunnel junctions (STT-MTJ) are integrated with both conventional D flip-flop and PNAND circuits to implement non-volatile logic (NVL). These non-volatile storage enhanced flip-flops are able to save the state of the system locally when a power interruption occurs. However, manufacturing variations in the STT-MTJs and in the CMOS transistors significantly reduce the yield, leading to an overly pessimistic design and, consequently, higher energy consumption. A detailed analysis of the design trade-offs in the driver circuitry for performing backup and restore, and a novel method to design the energy-optimal driver for a given yield, are presented. Efficient designs of two nonvolatile flip-flop (NVFF) circuits are presented, in which the backup time is determined on a per-chip basis, minimizing the energy wastage while satisfying the yield constraint. To achieve a yield of 98%, the conventional approach would have to expend nearly 5X more energy than the minimum required, whereas the proposed tunable approach expends only 26% more energy than the minimum. A non-volatile threshold gate architecture, NV-TLFF, is designed with the same backup and restore circuitry in 65nm technology. The embedded logic in NV-TLFF compensates for the performance overhead of NVL. This leads to the possibility of zero-overhead non-volatile datapath circuits. An 8-bit multiply-and-accumulate (MAC) unit is designed to demonstrate the performance benefits of the proposed architecture. Based on the results of HSPICE simulations, the MAC circuit with the proposed NV-TLFF cells is shown to consume at least 20% less power and area compared to the circuit designed with conventional DFFs, without sacrificing any performance.
Contributors: Yang, Jinghua (Author) / Vrudhula, Sarma (Thesis advisor) / Barnaby, Hugh (Committee member) / Cao, Yu (Committee member) / Seo, Jae-Sun (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
The rapid improvement in computation capability has made deep convolutional neural networks (CNNs) a great success in recent years on many computer vision tasks with significantly improved accuracy. During the inference phase, many applications demand low-latency processing of one image with strict power consumption requirements, which reduces the efficiency of GPUs and other general-purpose platforms, bringing opportunities for specialized acceleration hardware, e.g. FPGAs, by customizing the digital circuit for deep learning inference. However, deploying CNNs on portable and embedded systems is still challenging due to large data volume, intensive computation, varying algorithm structures, and frequent memory accesses. This dissertation proposes a complete design methodology and framework to accelerate the inference process of various CNN algorithms on FPGA hardware with high performance, efficiency and flexibility.

As convolution contributes most operations in CNNs, the convolution acceleration scheme significantly affects the efficiency and performance of a hardware CNN accelerator. Convolution involves multiply and accumulate (MAC) operations with four levels of loops. Without fully studying the convolution loop optimization before the hardware design phase, the resulting accelerator can hardly exploit the data reuse and manage data movement efficiently. This work overcomes these barriers by quantitatively analyzing and optimizing the design objectives (e.g. memory access) of the CNN accelerator based on multiple design variables. An efficient dataflow and hardware architecture of CNN acceleration are proposed to minimize the data communication while maximizing the resource utilization to achieve high performance.
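
The loop structure of convolution can be sketched in plain Python. The grouping into four levels shown in the comments (kernel window, input feature maps, spatial scan, output feature maps) is one common way to count them and is an assumption here, as are the function name, the nested-list data layout, stride 1, no padding, and square kernels.

```python
def conv_layer(inputs, weights):
    """Direct convolution: inputs[c][y][x], weights[m][c][ky][kx]; stride 1, no padding."""
    n_out, n_in = len(weights), len(inputs)
    k = len(weights[0][0])                      # square k x k kernel assumed
    out_h = len(inputs[0]) - k + 1
    out_w = len(inputs[0][0]) - k + 1
    outputs = [[[0] * out_w for _ in range(out_h)] for _ in range(n_out)]
    for m in range(n_out):                      # level 4: across output feature maps
        for y in range(out_h):                  # level 3: scan over output positions
            for x in range(out_w):
                for c in range(n_in):           # level 2: across input feature maps
                    for ky in range(k):         # level 1: MACs within the kernel window
                        for kx in range(k):
                            outputs[m][y][x] += (
                                inputs[c][y + ky][x + kx] * weights[m][c][ky][kx]
                            )
    return outputs


# Two 3x3 input maps of ones, one output map with 2x2 kernels of ones.
ones_in = [[[1] * 3 for _ in range(3)] for _ in range(2)]
ones_w = [[[[1] * 2 for _ in range(2)] for _ in range(2)]]
print(conv_layer(ones_in, ones_w))
```

Because every loop level is independent of the others apart from the accumulation, a hardware design is free to reorder, tile, or unroll them, and those choices determine how often each input pixel and weight must be fetched from memory.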

Although great performance and efficiency can be achieved by customizing the FPGA hardware for each CNN model, significant effort and expertise are required, leading to long development times, which makes it difficult to keep up with the rapid development of CNN algorithms. In this work, we present an RTL-level CNN compiler that automatically generates customized FPGA hardware for the inference tasks of various CNNs, in order to enable high-level fast prototyping of CNNs from software to FPGA while still keeping the benefits of low-level hardware optimization. First, a general-purpose library of RTL modules is developed to model different operations at each layer. The integration and dataflow of physical modules are predefined in the top-level system template and reconfigured during compilation for a given CNN algorithm. The runtime control of layer-by-layer sequential computation is managed by the proposed execution schedule so that even highly irregular and complex network topologies, e.g. GoogLeNet and ResNet, can be compiled. The proposed methodology is demonstrated with various CNN algorithms, e.g. NiN, VGG, GoogLeNet and ResNet, on two different standalone FPGAs, achieving state-of-the-art performance.

Based on the optimized acceleration strategy, there are still a lot of design options, e.g. the degree and dimension of computation parallelism, the size of on-chip buffers, and the external memory bandwidth, which impact the utilization of computation resources and data communication efficiency, and finally affect the performance and energy consumption of the accelerator. The large design space of the accelerator makes it impractical to explore the optimal design choice during the real implementation phase. Therefore, a performance model is proposed in this work to quantitatively estimate the accelerator performance and resource utilization. By this means, the performance bottleneck and design bound can be identified and the optimal design option can be explored early in the design phase.
Contributors: Ma, Yufei (Author) / Vrudhula, Sarma (Thesis advisor) / Seo, Jae-Sun (Thesis advisor) / Cao, Yu (Committee member) / Barnaby, Hugh (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
Counterfeiting of goods is a widespread epidemic that is affecting the world economy. Conventional labeling techniques are proving inadequate to thwart determined counterfeiters equipped with sophisticated technologies. There is a growing need for secure labeling that is easy to manufacture and analyze but extremely difficult to copy. Programmable metallization cell technology operates on the principle of controllable reduction of metal ions to an electrodeposit in a solid electrolyte by application of bias. The nature of the metallic electrodeposit is unique for each instance of growth; moreover, it has a treelike, bifurcating fractal structure with high information capacity. These qualities of the electrodeposit can be exploited to use it as a physical unclonable function. Secure labels made from electrodeposits grown in a radial structure can provide enhanced authentication and protection from counterfeiting and tampering.

So far, only microscale radial structures and electrodeposits have been fabricated, which limits their use to labeling only high-value items due to the high cost associated with their fabrication and analysis. Therefore, there is a need for a simple recipe for fabrication of macroscale structures that does not need sophisticated lithography tools and a cleanroom environment. Moreover, the growth kinetics and material characteristics of such macroscale electrodeposits need to be investigated. In this thesis, a recipe for fabrication of a centimeter-scale radial structure for growing Ag electrodeposits using simple fabrication techniques was proposed. Fractal analysis of an electrodeposit suggested an information capacity of 1.27 × 10^19. The kinetics of growth were investigated by electrical characterization of the full cell and of the solid electrolyte alone at different temperatures. It was found that mass transport of ions is the rate-limiting process in the growth. Materials and optical characterization techniques revealed that the subtle relief-like structure, and consequently the distinct optical response, of the electrodeposit provides an added layer of security. Thus, the enormous information capacity, ease of fabrication and simplicity of analysis make macroscale fractal electrodeposits grown in radial programmable metallization cells excellent candidates for application as physical unclonable functions.
Contributors: Chamele, Ninad (Author) / Kozicki, Michael (Thesis advisor) / Barnaby, Hugh (Thesis advisor) / Newman, Nathan (Committee member) / Arizona State University (Publisher)
Created: 2017
Description
Lateral programmable metallization cells (PMC) utilize the properties of electrodeposits grown over a solid electrolyte channel. Such devices have an active anode and an inert cathode separated by a long electrodeposit channel in a coplanar arrangement. The ability to transport a large amount of metallic mass across the channel makes these devices attractive for various More-Than-Moore applications. Existing literature lacks a comprehensive study of electrodeposit growth kinetics in lateral PMCs. Moreover, the morphology of electrodeposit growth in larger, planar devices is also not understood. Despite the variety of applications, lateral PMCs have not been embraced by the semiconductor industry due to the incompatible materials and high operating voltages needed for such devices. In this work, a numerical model based on the basic processes in PMCs – cation drift and redox reactions – is proposed, and the effect of various materials parameters on the electrodeposit growth kinetics is reported. The morphology of the electrodeposit growth and the kinetics of the electrodeposition process are also studied in devices based on the Ag-Ge30Se70 materials system. It was observed that the electrodeposition process mainly consists of two regimes of growth – a cation drift limited regime and a mixed regime. The electrodeposition starts in the cation drift limited regime at low electric fields and transitions into the mixed regime as the field increases. The onset of the mixed regime can be controlled by the applied voltage, which also affects the morphology of electrodeposit growth. The numerical model was then used to successfully predict the device kinetics and the onset of the mixed regime. The problem of materials incompatibility with semiconductor manufacturing was solved by proposing a novel device structure. A bilayer structure using semiconductor foundry friendly materials was suggested as a candidate for the solid electrolyte.
The bilayer structure consists of a low resistivity oxide shunt layer on top of a high resistivity ion carrying oxide layer. Devices using Cu2O as the low resistivity shunt on top of Cu doped WO3 oxide were fabricated. The bilayer devices provided orders of magnitude improvement in device performance in the context of operating voltage and switching time. Electrical and materials characterization revealed the structure of bilayers and the mechanism of electrodeposition in these devices.
Contributors: Chamele, Ninad (Author) / Kozicki, Michael (Thesis advisor) / Barnaby, Hugh (Committee member) / Newman, Nathan (Committee member) / Gonzalez-Velo, Yago (Committee member) / Arizona State University (Publisher)
Created: 2020