Description
Programmable metallization cell (PMC) technology employs the mechanisms of metal ion transport in solid electrolytes (SE) and electrochemical redox reactions in order to form metallic electrodeposits. When a positive bias is applied to an anode opposite a cathode, atoms at the anode are oxidized to ions and dissolve into the SE. Under the influence of the electric field, the ions move to the cathode and are reduced to form the electrodeposits. These electrodeposits are filamentary in nature and persistent, and, since they are metallic, can alter the physical characteristics of the material on which they are formed. PMCs can be used as next-generation memories, radio frequency (RF) switches, and physical unclonable functions (PUFs).

The morphology of the filaments is affected by the biasing conditions. Under a relatively high applied electric field, they form as dendritic elements with a low fractal dimension (FD), whereas a low electric field leads to high-FD features. Ion depletion effects in the SE due to low ion diffusivity/mobility also influence the morphology by limiting the ion supply into the growing electrodeposit.
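
The fractal dimension mentioned above is typically estimated from images of the grown patterns; a minimal box-counting sketch in Python is shown below, assuming a binary image of the filament as input (the test image and box sizes here are illustrative, not data from this work).

    import numpy as np

    def box_counting_fd(binary_image, box_sizes=(2, 4, 8, 16, 32)):
        """Estimate the fractal dimension of a binary pattern by box counting.

        For each box size s, count the s-by-s boxes that contain at least one
        occupied pixel; FD is the slope of log(count) versus log(1/s)."""
        counts = []
        h, w = binary_image.shape
        for s in box_sizes:
            occupied = 0
            for i in range(0, h, s):
                for j in range(0, w, s):
                    if binary_image[i:i + s, j:j + s].any():
                        occupied += 1
            counts.append(occupied)
        slope, _ = np.polyfit(np.log(1.0 / np.array(box_sizes)), np.log(counts), 1)
        return slope

    # Hypothetical check: a straight line of pixels should give FD close to 1.
    img = np.zeros((64, 64), dtype=bool)
    img[32, :] = True
    print("Estimated fractal dimension:", round(box_counting_fd(img), 2))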

Ion transport in the SE is due to hopping transitions driven by drift and diffusion forces. A physical model of ion hopping with Brownian motion has been proposed, in which the ion transitions are random when the time window is larger than the characteristic time. The random growth process of the filaments in a PMC adds entropy to the electrodeposition, which leads to random features in the dendritic patterns. Such patterns have an extremely high information capacity due to the fractal nature of the electrodeposits.

In this project, lateral-growth PMCs were fabricated whose low resistance state (LRS) is less than 10 Ω, making them usable as RF switches. In addition, an array of radial-growth PMCs was fabricated on which multiple dendrites, all with different shapes, could be grown simultaneously. These patterns can be used as secure keys in PUFs, and authentication can be performed by optical scanning.

A kinetic Monte Carlo (KMC) model was developed to simulate ion transport in the SE under an electric field. The simulation results matched experimental data well, which validated the ion hopping model.
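
As an illustration of the kind of hop event such a KMC model steps through, here is a minimal one-dimensional sketch in which the hop rate follows an Arrhenius law with a field-lowered barrier; the attempt frequency, barrier height, hop distance, and field value are assumed for illustration and are not the parameters used in this work.

    import math, random

    # Illustrative parameters (assumed, not from the dissertation)
    NU0 = 1e12        # attempt frequency, 1/s
    EA = 0.5          # hopping barrier, eV
    A = 0.3e-9        # hop distance, m
    KT = 0.0259       # thermal energy at 300 K, eV
    E_FIELD = 1e8     # applied electric field, V/m

    def hop_rates(e_field):
        """Forward/backward hop rates with a field-lowered/raised barrier."""
        delta = e_field * A / 2.0          # q*E*a/2, numerically in eV for q = 1e
        fwd = NU0 * math.exp(-(EA - delta) / KT)
        bwd = NU0 * math.exp(-(EA + delta) / KT)
        return fwd, bwd

    def kmc_drift(n_steps=10000, e_field=E_FIELD):
        """Track net ion displacement and elapsed time over n_steps KMC events."""
        x, t = 0.0, 0.0
        fwd, bwd = hop_rates(e_field)
        total = fwd + bwd
        for _ in range(n_steps):
            t += -math.log(1.0 - random.random()) / total   # exponential waiting time
            x += A if random.random() < fwd / total else -A
        return x, t

    x, t = kmc_drift()
    print("net displacement: %.3e m in %.3e s (drift velocity %.3e m/s)" % (x, t, x / t))
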
Contributors: Yu, Weijie (Author) / Kozicki, Michael N (Thesis advisor) / Barnaby, Hugh (Thesis advisor) / Diaz, Rodolfo (Committee member) / Goryll, Michael (Committee member) / Arizona State University (Publisher)
Created: 2015
Description
Static CMOS logic has remained the dominant design style of digital systems for more than four decades due to its robustness and near zero standby current. Static CMOS logic circuits consist of a network of combinational logic cells and clocked sequential elements, such as latches and flip-flops, that are used for sequencing computations over time. The majority of the digital design techniques to reduce power, area, and leakage over the past four decades have focused almost entirely on optimizing the combinational logic. This work explores alternate architectures for the flip-flops for improving the overall circuit performance, power and area. It consists of three main sections.

First is the design of a multi-input configurable flip-flop structure with embedded logic. A conventional D-type flip-flop may be viewed as realizing an identity function, in which the output is simply the value of the input sampled at the clock edge. In contrast, the proposed multi-input flip-flop, named PNAND, can be configured to realize one of a family of Boolean functions called threshold functions. In essence, the PNAND is a circuit implementation of the well-known binary perceptron. Unlike other reconfigurable circuits, a PNAND can be configured by simply changing the assignment of signals to its inputs. Using a standard cell library of such gates, a technology mapping algorithm can be applied to transform a given netlist into one with an optimal mixture of conventional logic gates and threshold gates. This approach was used to fabricate a 32-bit Wallace tree multiplier and a 32-bit Booth multiplier in 65nm LP technology. Simulation and chip measurements show more than 30% improvement in dynamic power and more than 20% reduction in core area.
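
To make the notion of a threshold function concrete, the sketch below evaluates the binary perceptron function the PNAND is described as implementing; the weight and threshold assignments are hypothetical and only illustrate how one gate realizes several Boolean functions, not the actual PNAND configuration.

    def threshold_gate(inputs, weights, threshold):
        """Threshold (perceptron) function: 1 if the weighted sum of the binary
        inputs meets the threshold, else 0."""
        weighted_sum = sum(w * x for w, x in zip(weights, inputs))
        return 1 if weighted_sum >= threshold else 0

    # Hypothetical configurations: the same gate realizes different Boolean
    # functions purely by the assignment of weights and threshold.
    AND3 = dict(weights=(1, 1, 1), threshold=3)   # 3-input AND
    OR3  = dict(weights=(1, 1, 1), threshold=1)   # 3-input OR
    MAJ3 = dict(weights=(1, 1, 1), threshold=2)   # 3-input majority

    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                x = (a, b, c)
                print(x,
                      "AND:", threshold_gate(x, **AND3),
                      "OR:",  threshold_gate(x, **OR3),
                      "MAJ:", threshold_gate(x, **MAJ3))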

The functional yield of the PNAND reduces with geometry and voltage scaling. The second part of this research investigates the use of two mechanisms to improve the robustness of the PNAND circuit architecture. One is the use of forward and reverse body biases to change the device threshold, and the other is the use of RRAM devices for low voltage operation.

The third part of this research focused on the design of flip-flops with non-volatile storage. Spin-transfer torque magnetic tunnel junctions (STT-MTJ) are integrated with both the conventional D flip-flop and the PNAND circuits to implement non-volatile logic (NVL). These non-volatile storage enhanced flip-flops are able to save the state of the system locally when a power interruption occurs. However, manufacturing variations in the STT-MTJs and in the CMOS transistors significantly reduce the yield, leading to an overly pessimistic design and, consequently, higher energy consumption. A detailed analysis of the design trade-offs in the driver circuitry for performing backup and restore, and a novel method to design the energy-optimal driver for a given yield, is presented. Efficient designs of two nonvolatile flip-flop (NVFF) circuits are presented, in which the backup time is determined on a per-chip basis, minimizing the energy wastage while satisfying the yield constraint. To achieve a yield of 98%, the conventional approach would have to expend nearly 5X more energy than the minimum required, whereas the proposed tunable approach expends only 26% more energy than the minimum. A non-volatile threshold gate architecture, NV-TLFF, is designed with the same backup and restore circuitry in 65nm technology. The embedded logic in NV-TLFF compensates for the performance overhead of NVL. This leads to the possibility of zero-overhead non-volatile datapath circuits. An 8-bit multiply-and-accumulate (MAC) unit is designed to demonstrate the performance benefits of the proposed architecture. Based on the results of HSPICE simulations, the MAC circuit with the proposed NV-TLFF cells is shown to consume at least 20% less power and area as compared to the circuit designed with conventional DFFs, without sacrificing any performance.
Contributors: Yang, Jinghua (Author) / Vrudhula, Sarma (Thesis advisor) / Barnaby, Hugh (Committee member) / Cao, Yu (Committee member) / Seo, Jae-Sun (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
The aging mechanism in devices is prone to uncertainties due to dynamic stress conditions. In AMS circuits these can lead to momentary fluctuations in circuit voltage that may be missed by a compact model and hence cause unpredictable failure. Firstly, multiple aging effects in the devices may have underlying correlations. The generation of new traps during TDDB may significantly accelerate BTI, since these traps are close to the dielectric-Si interface in scaled technology. Secondly, the prevalent reliability analysis lacks a direct validation of the lifetime of devices and circuits. The aging mechanism of BTI causes gradual degradation of the device, leading to threshold voltage shift and an increasing failure rate. In the 28nm HKMG technology, the contribution of BTI to NMOS degradation has become significant at high temperature as compared to channel hot carrier (CHC) effects. This requires revising the End of Lifetime (EOL) calculation based on contributions from individual aging effects, especially in feedback loops. Conventionally, aging in devices is extrapolated from a short-term measurement, but this practice results in unreliable prediction of EOL caused by variability in initial parameters and stress conditions. To mitigate the extrapolation issues and improve predictability, this work aims at providing a new approach to test the device to EOL in a fast and controllable manner. The contributions of this thesis include: (1) based on the stochastic trapping/de-trapping mechanism, new compact BTI models are developed and verified with 14nm FinFET and 28nm HKMG data; moreover, these models are implemented in circuit simulation, illustrating a significant increase in failure rate due to accelerated BTI; (2) developing a model to predict accelerated aging under special conditions such as feedback loops and stacked inverters; (3) introducing a feedback-loop-based test methodology called Adaptive Accelerated Aging (AAA) that can generate accurate aging data up to EOL; (4) presenting simulation and experimental data for the models and providing test setups for multiple stress conditions, including those that reach EOL within one hour on a device as well as on a ring oscillator (RO) circuit, for validation of the proposed methodology; and (5) scaling these models to find a guard band for VLSI design circuits that reflects realistic aging impact.
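
For context, the conventional extrapolation that the AAA methodology is intended to replace typically fits the short-term threshold-voltage shift to a power law in time and projects it out to a failure criterion; the sketch below shows that calculation with purely illustrative numbers (the prefactor, exponent, and 50 mV criterion are assumptions, not values from this thesis).

    import numpy as np

    def fit_power_law(t, dvth):
        """Fit dVth = A * t**n in log-log space and return (A, n)."""
        n, log_a = np.polyfit(np.log(t), np.log(dvth), 1)
        return np.exp(log_a), n

    def extrapolate_eol(a, n, dvth_fail):
        """Time at which the fitted power law reaches the failure criterion."""
        return (dvth_fail / a) ** (1.0 / n)

    # Illustrative short-term stress data (seconds, volts) -- assumed, not measured.
    t_meas = np.array([10.0, 100.0, 1e3, 1e4])
    dvth_meas = 2e-3 * t_meas ** 0.18          # synthetic power-law degradation

    a, n = fit_power_law(t_meas, dvth_meas)
    eol_seconds = extrapolate_eol(a, n, dvth_fail=0.05)   # 50 mV criterion (assumed)
    print("fitted exponent n = %.2f, extrapolated EOL = %.2e s (%.1f years)"
          % (n, eol_seconds, eol_seconds / 3.15e7))
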
Contributors: Patra, Devyani (Author) / Cao, Yu (Thesis advisor) / Barnaby, Hugh (Thesis advisor) / Seo, Jae-Sun (Committee member) / Arizona State University (Publisher)
Created: 2017
Description
A VCO is a ubiquitous circuit in many systems and places stringent demands on phase noise. Lowering the noise that migrates from the power supply has been a topic of active work for many years. Because a ring oscillator (RO) based VCO is more sensitive to supply noise, finding an effective technique to reduce that noise is especially important. Beyond conventional supply-noise reduction techniques such as filtering, adjusting transistor channel lengths, and mutual cancellation of current noise, the 28nm UTBB FD-SOI process launched by STMicroelectronics offers a new method of noise reduction, realized by allowing the circuit designer to dynamically control the threshold voltage. In this thesis, a new linear coarse-fine VCO structure with a 1 V supply voltage is designed for the ring-type VCO. The structure is also designed so that the frequency coverage can be flexibly tuned through fine and coarse tunable on-board resistors. The thesis presents a model of the phase noise reduction method, and the model is shown to be meaningful with the newly designed VCO circuit. For instance, given 1 μV/√Hz of white noise coupled onto the supply, the 3 GHz VCO achieves more than 7 dBc/Hz of phase noise reduction at a 10 MHz frequency offset.
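
As a rough illustration of why supply noise matters for a ring-oscillator VCO, the hedged sketch below applies the standard narrowband-FM approximation, which converts a supply noise density into a phase noise contribution through the oscillator's supply-pushing sensitivity; the pushing value is an assumption, not a figure extracted from this design.

    import math

    def supply_noise_phase_noise(k_push_hz_per_v, vn_density, f_offset):
        """Phase noise (dBc/Hz) at f_offset due to supply noise, using the
        narrowband-FM approximation L(f) = Kpush^2 * Sv / (2 * f^2)."""
        sv = vn_density ** 2                      # V^2/Hz
        l_linear = (k_push_hz_per_v ** 2) * sv / (2.0 * f_offset ** 2)
        return 10.0 * math.log10(l_linear)

    # Assumed numbers for illustration only.
    k_push = 200e6       # supply pushing, Hz/V (assumed for a ring VCO)
    vn = 1e-6            # supply noise density, V/sqrt(Hz), as quoted above
    f_off = 10e6         # offset frequency, Hz

    before = supply_noise_phase_noise(k_push, vn, f_off)
    after = supply_noise_phase_noise(k_push / 2.0, vn, f_off)   # pushing halved
    print("phase noise contribution: %.1f dBc/Hz -> %.1f dBc/Hz (%.1f dB lower)"
          % (before, after, before - after))
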
Contributors: Tang, Miao (Author) / Barnaby, Hugh (Thesis advisor) / Bakkaloglu, Bertan (Committee member) / Mikkola, Esko (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
Semiconductor memory is a key component of computing systems. Beyond conventional memory and data storage applications, this dissertation explores both mainstream and emerging non-volatile memory (eNVM) technologies for radiation environments, hardware security systems, and machine learning applications.

In a radiation environment, e.g. aerospace, memory devices are exposed to various energetic particles. The strike of these energetic particles can generate electron-hole pairs (directly or indirectly) as they pass through the semiconductor device, resulting in photo-induced current, and may change the memory state. First, the trend of radiation effects in the mainstream memory technologies with technology node scaling is reviewed. Then, single event effects in oxide-based resistive random access memory (RRAM), one of the eNVM technologies, are investigated from the circuit level to the system level.
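
To give a sense of scale for the charge such a strike can deposit, the sketch below applies the standard silicon conversion of roughly 10.4 fC of electron-hole charge per micron of track per unit LET (from 3.6 eV per pair and the density of silicon); the LET and collection depth chosen are illustrative assumptions, not values from this dissertation.

    # Charge deposited by an ion strike in silicon (back-of-the-envelope).
    E_PAIR_EV = 3.6          # energy per electron-hole pair in Si, eV
    RHO_SI = 2.328e3         # density of Si, mg/cm^3
    Q_E = 1.602e-19          # elementary charge, C

    def charge_per_micron(let_mev_cm2_per_mg):
        """Deposited charge (fC/um) for a given LET in MeV*cm^2/mg."""
        mev_per_um = let_mev_cm2_per_mg * RHO_SI * 1e-4      # MeV per micron of track
        pairs_per_um = mev_per_um * 1e6 / E_PAIR_EV
        return pairs_per_um * Q_E * 1e15                     # fC per micron

    # Illustrative case: LET = 10 MeV*cm^2/mg, 1 um collection depth (assumed).
    let, depth_um = 10.0, 1.0
    q_fc = charge_per_micron(let) * depth_um
    print("~%.1f fC/um per unit LET; %.0f fC collected over %.0f um at LET %.0f"
          % (charge_per_micron(1.0), q_fc, depth_um, let))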

Physical unclonable functions (PUFs) have been widely investigated as a promising hardware security primitive, which employs the inherent randomness in a physical system (e.g. the intrinsic semiconductor manufacturing variability). In this dissertation, two RRAM-based PUF implementations are proposed for cryptographic key generation (weak PUF) and device authentication (strong PUF), respectively. The performance of the RRAM PUFs is evaluated with experiment and simulation. The impact of non-ideal circuit effects on the performance of the PUFs is also investigated, and optimization strategies are proposed to mitigate these effects. In addition, the resistance against modeling and machine learning attacks is analyzed.
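
The weak-PUF idea can be illustrated in a few lines: after programming, the cell-to-cell spread of RRAM resistance is digitized into key bits, for example by comparing each cell against a split reference near the array median. The sketch below uses synthetic statistics, and the actual key-generation circuit in the dissertation may differ.

    import numpy as np

    rng = np.random.default_rng(7)

    def generate_key(resistances, reference):
        """Digitize analog resistance variability into key bits: 1 if a cell's
        resistance is above the reference, else 0."""
        return (resistances > reference).astype(np.uint8)

    # Synthetic RRAM array: log-normal spread around a nominal resistance.
    n_cells = 128
    resistances = rng.lognormal(mean=np.log(10e3), sigma=0.25, size=n_cells)  # ohms

    reference = np.median(resistances)     # split reference -> ~50% ones (uniformity)
    key = generate_key(resistances, reference)

    # Re-reading with small read noise should reproduce the key (low bit error rate).
    reread = resistances * rng.normal(1.0, 0.01, size=n_cells)
    key2 = generate_key(reread, reference)
    print("key uniformity: %.2f, bit error rate between reads: %.3f"
          % (key.mean(), np.mean(key != key2)))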

Deep neural networks (DNNs) have shown remarkable improvements in various intelligent applications such as image classification, speech classification, and object localization and detection, and increasing efforts have been devoted to developing hardware accelerators. In this dissertation, two types of compute-in-memory (CIM) based hardware accelerator designs, using SRAM and eNVM technologies, are proposed for two binary neural networks, i.e. the hybrid BNN (HBNN) and the XNOR-BNN, respectively, targeting hardware resource-limited platforms such as edge devices. These designs feature high throughput, scalability, low latency, and high energy efficiency. Finally, the proposed SRAM-based designs have been successfully taped out and validated in TSMC 65 nm technology.
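
The arithmetic kernel behind an XNOR-BNN accelerator reduces a dot product of ±1 vectors to an XNOR followed by a population count; the sketch below verifies that identity in plain Python (the vector length and contents are arbitrary).

    import random

    def binary_dot_xnor(a_bits, b_bits):
        """Dot product of two {-1,+1} vectors encoded as 0/1 bits, computed as
        2 * popcount(XNOR(a, b)) - N."""
        n = len(a_bits)
        matches = sum(1 for a, b in zip(a_bits, b_bits) if a == b)  # popcount of XNOR
        return 2 * matches - n

    def reference_dot(a_bits, b_bits):
        """Direct dot product after mapping bit 0 -> -1 and bit 1 -> +1."""
        to_pm1 = lambda bit: 2 * bit - 1
        return sum(to_pm1(a) * to_pm1(b) for a, b in zip(a_bits, b_bits))

    random.seed(0)
    a = [random.randint(0, 1) for _ in range(64)]
    b = [random.randint(0, 1) for _ in range(64)]
    assert binary_dot_xnor(a, b) == reference_dot(a, b)
    print("binary dot product:", binary_dot_xnor(a, b))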

Overall, this dissertation paves the path for new applications of memory technologies toward secure and energy-efficient artificial intelligence systems.
Contributors: Liu, Rui (Author) / Yu, Shimeng (Thesis advisor, Committee member) / Cao, Yu (Committee member) / Barnaby, Hugh (Committee member) / Seo, Jae-Sun (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
The rapid improvement in computation capability has made deep convolutional neural networks (CNNs) a great success in recent years on many computer vision tasks with significantly improved accuracy. During the inference phase, many applications demand low-latency processing of one image under strict power consumption requirements, which reduces the efficiency of GPUs and other general-purpose platforms and creates opportunities for dedicated acceleration hardware, e.g. FPGAs, where the digital circuit is customized for deep learning inference. However, deploying CNNs on portable and embedded systems is still challenging due to the large data volume, intensive computation, varying algorithm structures, and frequent memory accesses. This dissertation proposes a complete design methodology and framework to accelerate the inference process of various CNN algorithms on FPGA hardware with high performance, efficiency and flexibility.

As convolution contributes most operations in CNNs, the convolution acceleration scheme significantly affects the efficiency and performance of a hardware CNN accelerator. Convolution involves multiply and accumulate (MAC) operations with four levels of loops. Without fully studying the convolution loop optimization before the hardware design phase, the resulting accelerator can hardly exploit the data reuse and manage data movement efficiently. This work overcomes these barriers by quantitatively analyzing and optimizing the design objectives (e.g. memory access) of the CNN accelerator based on multiple design variables. An efficient dataflow and hardware architecture of CNN acceleration are proposed to minimize the data communication while maximizing the resource utilization to achieve high performance.
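
For readers unfamiliar with the loop structure being optimized, a direct, unoptimized software rendering of the convolution loop nest is sketched below, with the loops grouped into four levels (MACs within a kernel window, across input feature maps, across spatial positions, and across output feature maps); the layer dimensions are arbitrary.

    import numpy as np

    def conv_layer(ifmap, weights):
        """Naive convolution (stride 1, no padding) written as the four loop
        levels an accelerator must schedule: output feature maps, spatial
        positions, input feature maps, and the kernel-window MACs."""
        n_in, h_in, w_in = ifmap.shape
        n_out, _, k, _ = weights.shape
        h_out, w_out = h_in - k + 1, w_in - k + 1
        ofmap = np.zeros((n_out, h_out, w_out))
        for oc in range(n_out):                       # across output feature maps
            for oy in range(h_out):                   # across spatial positions
                for ox in range(w_out):
                    acc = 0.0
                    for ic in range(n_in):            # across input feature maps
                        for ky in range(k):           # MACs within a kernel window
                            for kx in range(k):
                                acc += ifmap[ic, oy + ky, ox + kx] * weights[oc, ic, ky, kx]
                    ofmap[oc, oy, ox] = acc
        return ofmap

    # Small illustrative layer (dimensions are arbitrary).
    x = np.random.rand(3, 8, 8)
    w = np.random.rand(4, 3, 3, 3)
    print("output shape:", conv_layer(x, w).shape)    # (4, 6, 6)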

Although great performance and efficiency can be achieved by customizing the FPGA hardware for each CNN model, significant efforts and expertise are required, leading to long development time, which makes it difficult to catch up with the rapid development of CNN algorithms. In this work, we present an RTL-level CNN compiler that automatically generates customized FPGA hardware for the inference tasks of various CNNs, in order to enable high-level fast prototyping of CNNs from software to FPGA while keeping the benefits of low-level hardware optimization. First, a general-purpose library of RTL modules is developed to model different operations at each layer. The integration and dataflow of physical modules are predefined in the top-level system template and reconfigured during compilation for a given CNN algorithm. The runtime control of layer-by-layer sequential computation is managed by the proposed execution schedule so that even highly irregular and complex network topologies, e.g. GoogLeNet and ResNet, can be compiled. The proposed methodology is demonstrated with various CNN algorithms, e.g. NiN, VGG, GoogLeNet and ResNet, on two different standalone FPGAs, achieving state-of-the-art performance.

Based on the optimized acceleration strategy, there are still many design options, e.g. the degree and dimension of computation parallelism, the size of on-chip buffers, and the external memory bandwidth, which impact the utilization of computation resources and the data communication efficiency, and ultimately affect the performance and energy consumption of the accelerator. The large design space of the accelerator makes it impractical to explore the optimal design choice during the real implementation phase. Therefore, a performance model is proposed in this work to quantitatively estimate the accelerator performance and resource utilization. By this means, the performance bottleneck and design bound can be identified and the optimal design option can be explored early in the design phase.
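
A heavily simplified stand-in for such a performance model is sketched below: it estimates a compute-bound latency from the MAC count and the MAC-array parallelism, a memory-bound latency from the off-chip traffic and DRAM bandwidth, and takes the larger of the two; the layer size, parallelism, clock, and bandwidth are placeholder assumptions, not the model developed in this work.

    def layer_latency_ms(macs, off_chip_bytes, n_mac_units, clock_hz, dram_bytes_per_s):
        """Roofline-style estimate: the layer is limited by whichever is slower,
        the MAC array or the external memory interface."""
        compute_s = (macs / n_mac_units) / clock_hz
        memory_s = off_chip_bytes / dram_bytes_per_s
        bound = "memory" if memory_s > compute_s else "compute"
        return 1e3 * max(compute_s, memory_s), bound

    # Placeholder numbers for one convolution layer and one accelerator configuration.
    macs = 3 * 3 * 256 * 256 * 14 * 14             # kernel * channels * output pixels
    latency, bound = layer_latency_ms(macs,
                                      off_chip_bytes=8e6,        # assumed traffic
                                      n_mac_units=1024,          # parallel MACs (assumed)
                                      clock_hz=200e6,            # 200 MHz (assumed)
                                      dram_bytes_per_s=12.8e9)   # DDR bandwidth (assumed)
    print("estimated latency: %.2f ms (%s-bound)" % (latency, bound))
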
Contributors: Ma, Yufei (Author) / Vrudhula, Sarma (Thesis advisor) / Seo, Jae-Sun (Thesis advisor) / Cao, Yu (Committee member) / Barnaby, Hugh (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
Space exploration is a large field that requires high-performing circuitry due to the harsh environment. In a space environment, one of the biggest factors leading to circuit failure is radiation. Circuits must be robust enough to continue operating after being exposed to high doses of radiation. Bandgap reference (BGR) circuits are designed to be voltage references that stay stable across a wide range of supply voltages and temperatures. A bandgap reference is a sub-block of a larger circuit that supplies its critical elements with a constant voltage. When used in a space environment with large amounts of radiation, a BGR needs to maintain its output voltage to enable the rest of the circuit to operate under proper conditions. Since a BGR is not a standalone circuit, it is difficult and expensive to test whether it is maintaining its reference voltage.

This thesis describes a methodology for isolating and simulating bandgap references. Both NPN and PNP bandgap references are simulated over a variety of radiation doses and dose rates. This methodology allows the degradation of a BGR due to radiation to be modeled easily and affordably. Many circuits experience enhanced low dose rate sensitivity (ELDRS), which can lead to failure at low total ionizing doses (TID) of radiation. A compact model library capturing transistor degradation at both high and low dose rates (HDR and LDR) is used to evaluate bandgap reference reliability. Specifically, two bandgap references used in commercial off-the-shelf low-dropout regulators (LDOs) are evaluated. The LDOs are reverse-engineered in a simulation program with integrated circuit emphasis (SPICE), and within the two LDOs the bandgaps are the points of interest. One LDO has a positive regulated voltage and the other a negative regulated voltage, requiring an NPN-based and a PNP-based BGR, respectively. This simulation methodology draws conclusions about the above bandgap references and how they operate under radiation at different doses and dose rates.
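
For reference, the nominal behavior these simulations probe can be summarized by the first-order bandgap relation Vref ≈ VBE + K·VT·ln(N), in which the negative temperature coefficient of VBE is cancelled by a scaled PTAT term; the sketch below evaluates this expression with textbook values, not with parameters extracted from the LDOs studied here.

    import math

    K_B = 1.380649e-23      # Boltzmann constant, J/K
    Q_E = 1.602176634e-19   # elementary charge, C

    def bandgap_vref(temp_k, vbe_at_300k=0.65, tc_vbe=-2e-3, n_ratio=8, gain_k=None):
        """First-order bandgap output: Vref = VBE(T) + K * VT * ln(N).
        The CTAT slope of VBE (~ -2 mV/K) is cancelled by the PTAT term."""
        vt = K_B * temp_k / Q_E
        vbe = vbe_at_300k + tc_vbe * (temp_k - 300.0)
        if gain_k is None:
            # Choose K so the PTAT slope cancels the CTAT slope.
            gain_k = -tc_vbe / ((K_B / Q_E) * math.log(n_ratio))
        return vbe + gain_k * vt * math.log(n_ratio)

    for t in (233.0, 300.0, 398.0):   # -40 C, 27 C, 125 C
        print("T = %3.0f K  Vref = %.4f V" % (t, bandgap_vref(t)))
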
Contributors: Davis, Parker William (Author) / Barnaby, Hugh (Thesis advisor) / Kitchen, Jennifer (Committee member) / Privat, Aymeric (Committee member) / Arizona State University (Publisher)
Created: 2019
Description
Over the decades, scientists have been scaling devices to increasingly smaller feature sizes for ever better performance of complementary metal-oxide-semiconductor (CMOS) technology, to meet requirements on speed, complexity, circuit density, power consumption, and ultimately cost demanded by many advanced applications. However, these ultra-scaled CMOS devices also bring some drawbacks. Aging due to bias temperature instability (BTI) and hot carrier injection (HCI) is the dominant cause of functional failure in large-scale logic circuits. The aging phenomena, on top of process variations, translate into complexity and reduced design margin for circuits. Such issues call for "design for reliability". To increase the overall design efficiency, it is important to (i) study the impact of aging at the circuit level along with the transistor-level understanding, (ii) calibrate the theoretical findings with measurement data, and (iii) implement tools that analyze the impact of BTI and HCI reliability on circuit timing at each stage of the VLSI design process. In this work, post-silicon measurements on a 28nm HK-MG technology are performed to study the effect of aging on the frequency degradation of digital circuits. A novel voltage-controlled ring oscillator (VCO) structure, developed by the NIMO research group, is used to determine the effect of aging mechanisms such as NBTI, PBTI and SILC on circuit parameters. An accelerated aging methodology is proposed to avoid the time-consuming measurement process and the extrapolation of data to the end of life; thus, instead of predicting the circuit behavior, one can measure it within a short period of time. Finally, to bridge the gap between device-level models and circuit-level aging analysis, a System Level Reliability Analysis Flow (SyRA), developed by the NIMO group, is implemented for a TSMC 65nm industrial-level design to achieve one-step reliability prediction for digital designs.
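
One common first-order way to relate a measured threshold-voltage shift to ring-oscillator frequency degradation is the alpha-power delay model, in which gate delay scales as VDD/(VDD - Vth)^alpha; the sketch below applies it with assumed values of VDD, Vth, and alpha rather than parameters extracted from the 28nm measurements.

    def ro_freq_degradation(dvth, vdd=1.0, vth0=0.35, alpha=1.3):
        """Fractional ring-oscillator frequency loss for a threshold shift dvth,
        using the alpha-power delay model: delay ~ VDD / (VDD - Vth)^alpha."""
        f0 = (vdd - vth0) ** alpha / vdd
        f1 = (vdd - vth0 - dvth) ** alpha / vdd
        return 1.0 - f1 / f0

    for dvth_mv in (10, 30, 50):
        loss = ro_freq_degradation(dvth_mv * 1e-3)
        print("dVth = %2d mV  ->  frequency degradation ~ %.1f%%" % (dvth_mv, 100 * loss))
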
Contributors: Bansal, Ankita (Author) / Cao, Yu (Thesis advisor) / Seo, Jae sun (Committee member) / Barnaby, Hugh (Committee member) / Arizona State University (Publisher)
Created: 2016
Description
The formation of dendrites in materials is usually seen as a failure-inducing defect in devices. Naturally, most research views dendrites as a problem needing a solution, focusing on process control techniques and post-mortem analysis of various stress patterns with the ultimate goal of total suppression of the structures. However, programmable metallization cell (PMC) technology embraces dendrite formation in chalcogenide glasses by utilizing the nascent conductive filaments as its core operative element. Furthermore, exciting More-than-Moore capabilities in the realms of device watermarking and hardware encryption schemes are made possible by the random nature of dendritic branch growth. While dendritic structures have been observed and are well documented in solid-state materials, there is still no satisfactory theoretical model that can provide insight into and a better understanding of how dendrites form. Ultimately, what is desired is the capability to predict the final structure of the conductive filament in a PMC device so that exciting new applications can be developed with PMC technology.

This thesis details the results of an effort to create a first-principles MATLAB simulation model that uses configurable physical parameters to generate images of dendritic structures. Generated images are compared against real-world samples. While growth has a significant random component, several reliable characteristics form under similar parameter sets and can be monitored, such as the relative length of major dendrite arms, common branching angles, and overall growth directionality.

The first simulation model that was constructed takes a Newtonian perspective of the problem and is implemented using the Euler numerical method. This model has several shortcomings, stemming largely from the simplistic treatment of the problem, but is highly performant. The model is then revised to use the Verlet numerical method, which increases the simulation accuracy but still does not fully resolve the issues with the theoretical background. The final simulation model returns to the Euler method but is a stochastic model based on Mott-Gurney ion hopping theory applied to solids. The results from this model are seen to match real samples most closely of all the simulations.
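
The difference between the two deterministic integrators can be seen in a few lines; the sketch below advances a particle in a simple harmonic potential with both an explicit Euler step and a velocity-Verlet step so the accuracy gap is visible (the force law and step size are illustrative and unrelated to the MATLAB dendrite model itself).

    def force(x):
        """Illustrative restoring force, F = -k*x with k = m = 1."""
        return -x

    def euler_step(x, v, dt):
        """Explicit Euler: position and velocity advanced with current values."""
        return x + v * dt, v + force(x) * dt

    def verlet_step(x, v, dt):
        """Velocity Verlet: position uses a half-step of acceleration, velocity
        uses the average of old and new accelerations (better energy behavior)."""
        a0 = force(x)
        x_new = x + v * dt + 0.5 * a0 * dt * dt
        a1 = force(x_new)
        return x_new, v + 0.5 * (a0 + a1) * dt

    dt, steps = 0.05, 2000
    xe, ve = 1.0, 0.0
    xv, vv = 1.0, 0.0
    for _ in range(steps):
        xe, ve = euler_step(xe, ve, dt)
        xv, vv = verlet_step(xv, vv, dt)

    energy = lambda x, v: 0.5 * v * v + 0.5 * x * x   # exact value is 0.5
    print("energy after %d steps: Euler %.3f, Verlet %.5f (exact 0.5)"
          % (steps, energy(xe, ve), energy(xv, vv)))
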
Contributors: Foss, Ryan (Author) / Kozicki, Michael N (Thesis advisor) / Barnaby, Hugh (Committee member) / Allee, David R. (Committee member) / Arizona State University (Publisher)
Created: 2016
Description
The scaling of transistors has numerous advantages, such as increased memory density, lower power consumption, and better performance, but it also gives rise to many reliability issues. One of the major reliability issues is hot carrier injection and the effect it has on device degradation over time, which causes serious circuit malfunctions.

Hot carrier injection has been studied since the early 1980s, and much research has been done on the various hot carrier injection mechanisms and on how devices are damaged by this effect. However, most of the existing hot carrier degradation models do not consider the physics involved in the degradation process; they simply calculate the change in threshold voltage for different stress voltages and times, and from this an analytical expression is formulated that predicts the device lifetime.

This thesis starts by discussing various hot carrier injection mechanisms and the effects they have on the device. Studies have shown that charge trapping in the gate oxide and interface trap generation are two mechanisms of device degradation. How various device parameters are affected by these traps is discussed here. Physics-based models such as the lucky electron model and the substrate current model are presented, giving an idea of how the gate current and substrate current can be related to hot carrier injection and to the density of traps created.

Devices are stressed under various voltages, and from the experimental data obtained, the densities of trapped charges and interface traps are calculated using the midgap technique. In this thesis, a simple analytical model based on substrate current is used to calculate the density of charges trapped in the oxide and of the interface traps generated, as a function of stress voltage and stress time. The model is verified against the data and against TCAD simulations. Finally, the analytical model is incorporated into a Verilog-A model, and the threshold voltage shift due to hot carrier stress is calculated based on the surface potential method.
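
The charge-separation arithmetic behind the midgap technique is simple to state: the midgap voltage shift is attributed to oxide-trapped charge and the remaining stretch-out of the threshold voltage to interface traps, each converted to an areal density through Cox/q; the sketch below evaluates those two formulas with an assumed oxide thickness and voltage shifts, not with the measured stress data.

    EPS_OX = 3.45e-13      # permittivity of SiO2, F/cm
    Q_E = 1.602e-19        # elementary charge, C

    def midgap_densities(dvth, dvmg, tox_cm):
        """Midgap charge separation: oxide-trapped charge density from the midgap
        shift, interface-trap density from the remaining threshold-voltage shift."""
        cox = EPS_OX / tox_cm                      # F/cm^2
        n_ot = cox * abs(dvmg) / Q_E               # cm^-2
        n_it = cox * abs(dvth - dvmg) / Q_E        # cm^-2
        return n_ot, n_it

    # Assumed example: 5 nm oxide, -30 mV midgap shift, +50 mV threshold shift.
    n_ot, n_it = midgap_densities(dvth=0.05, dvmg=-0.03, tox_cm=5e-7)
    print("N_ot ~ %.2e cm^-2, N_it ~ %.2e cm^-2" % (n_ot, n_it))
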
Contributors: Muthuseenu, Kiraneswar (Author) / Barnaby, Hugh (Thesis advisor) / Kozicki, Michael (Committee member) / Velo, Yago Gonzalez (Committee member) / Arizona State University (Publisher)
Created: 2017