Energy-efficient digital circuit design using threshold logic gates

Description

Improving energy efficiency has always been the prime objective of custom and automated digital circuit design techniques. As a result, a multitude of methods to reduce power without sacrificing performance have been proposed. However, as the field of design automation has matured over the last few decades, no new automated design techniques have emerged that provide considerable improvements in circuit power, leakage and area. Although emerging nano-devices are expected to replace existing MOSFET devices, they are far from being as mature as semiconductor devices, and their full potential and promise are many years away from being practical.

The research described in this dissertation consists of four main parts. The first is a new circuit architecture for a differential threshold logic flipflop called PNAND. The PNAND gate is an edge-triggered multi-input sequential cell whose next-state function is a threshold function of its inputs. Second, a new approach called hybridization, which replaces flipflops and parts of their logic cones with PNAND cells, is described. The resulting hybrid circuit, which consists of conventional logic cells and PNANDs, is shown to have significantly lower power consumption, smaller area, lower standby power and less power variation.
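To make the term "threshold function" concrete, the following is a minimal behavioral sketch in Python. It is not the PNAND circuit itself: the function names, the integer weights and the majority example are illustrative assumptions, chosen only to show how a weighted sum compared against a threshold defines a Boolean function.

```python
def threshold_fn(inputs, weights, threshold):
    """Return 1 if the weighted sum of the binary inputs reaches the threshold, else 0."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return int(weighted_sum >= threshold)


def majority3(a, b, c):
    """3-input majority written as a threshold function (hypothetical example):
    unit weights and a threshold of 2."""
    return threshold_fn((a, b, c), (1, 1, 1), 2)
```

In this behavioral view, an edge-triggered threshold cell would evaluate such a function of its inputs at the clock edge and store the result as its next state.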

Third, a new architecture of a field programmable array, called the field programmable threshold logic array (FPTLA), in which the standard lookup table (LUT) is replaced by a PNAND, is described. The FPTLA is shown to have as much as 50% lower energy-delay product than a conventional FPGA, as evaluated with the well-known FPGA modeling tool VPR.

Fourth, a novel clock skewing technique that makes use of the completion detection feature of the differential mode flipflops is described. This clock skewing method improves the area and power of ASIC circuits by increasing slack on timing paths. An additional advantage of this method is the elimination of hold-time violations on given short paths.

Several circuit design methodologies such as retiming and asynchronous circuit design can use the proposed threshold logic gate effectively. Therefore, the use of threshold logic flipflops in conventional design methodologies opens new avenues of research towards more energy-efficient circuits.

Contributors: Kulkarni, Niranjan (Author) / Vrudhula, Sarma (Thesis advisor) / Colbourn, Charles (Committee member) / Seo, Jae-Sun (Committee member) / Yu, Shimeng (Committee member) / Arizona State University (Publisher)

Created: 2015

Modeling and Simulation of the Programmable Metallization Cells (PMCs) and Diamond-Based Power Devices

Description

This PhD thesis consists of three main themes. The first part focuses on modeling Silver (Ag)-Chalcogenide glass based resistive memory devices known as Programmable Metallization Cells (PMCs). The proposed models are examined with Technology Computer Aided Design (TCAD) simulations. To relate electrochemistry to carrier-trap statistics in chalcogenide glass (ChG) films, an analytical mapping for electron trapping is derived. Then, a physics-based model is proposed to explain the dynamic behavior of the photodoping mechanism in lateral PMCs. Finally, to extract the time constant of ChG materials, a method is proposed for determining the carrier mobility with and without UV light exposure. To validate these models, TCAD simulation results obtained with Silvaco ATLAS are also presented, and they show good agreement with the models.

In the second theme of this dissertation, a new model is presented to predict single event transients in 1T-1R memory arrays, treated as an inverter, where the PMC is modeled as a constant resistance while the OFF transistor is modeled as a diode in parallel with a capacitance. The model divides the output voltage transient response of an inverter into three time segments, where an ionizing particle striking through the drain–body junction of the OFF-state NMOS is represented as a photocurrent pulse. If this current source is large enough, the output voltage can drop to a negative voltage. In this model, the OFF-state NMOS is represented as the parallel combination of an ideal diode and the intrinsic capacitance of the drain–body junction, while a resistance represents an ON-state NMOS. The proposed model is verified by 3-D TCAD mixed-mode device simulations. To investigate the flexibility of the model, the effects of important parameters, such as the ON-state PMOS resistance, the doping concentration of the p-region in the diode, and the photocurrent pulse, are examined.

The third theme of this dissertation develops various models, together with TCAD simulations, to describe the behavior of different diamond-based devices, including PIN diodes and bipolar junction transistors (BJTs). Diamond is a very attractive material for contemporary power semiconductor devices because of its excellent material properties, such as high breakdown voltage and superior thermal conductivity compared to other materials. Collectively, this research project advances the development of high power and high temperature electronics using diamond-based semiconductors. During the fabrication of diamond-based devices, structural defects, particularly threading dislocations (TDs), may affect the electrical properties of the device, and models were developed to account for such defects. Understanding their behavior helps to explain and predict the performance of diamond-based devices. Here, the electrical conductance through TD sites is shown to be governed by Poole-Frenkel emission (PFE) over the temperature (T) range 323 K < T < 423 K. Analytical models were fitted to experimental data over this temperature range. Next, the Silvaco Atlas tool, a commercial drift-diffusion based TCAD package, was used to model diamond-based BJTs. Here, some field plate methods are proposed in order to decrease the surface electric field. The models used in Atlas are modified to account for hopping transport in the impurity bands associated with high activation energies for both boron-doped and phosphorus-doped diamond.
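For reference, the textbook form of Poole-Frenkel emission is sketched below. This is the standard expression, not necessarily the exact model fitted in the dissertation, and all symbols are assumptions introduced here: $E$ is the electric field, $\phi_B$ the trap barrier height (in volts), $\varepsilon_r$ the relative permittivity, $q$ the elementary charge and $k_B$ the Boltzmann constant.

```latex
\sigma(E, T) \;\propto\; \exp\!\left[ -\,\frac{q\left(\phi_B - \sqrt{\dfrac{qE}{\pi \varepsilon_0 \varepsilon_r}}\right)}{k_B T} \right]
```

Under this form, a plot of $\ln\sigma$ against $\sqrt{E}$ at fixed temperature is a straight line, which is the usual signature used to identify PFE in measured data.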

Contributors: Saremi, Mehdi (Author) / Goodnick, Stephen M (Thesis advisor) / Vasileska, Dragica (Committee member) / Kozicki, Michael N (Committee member) / Yu, Shimeng (Committee member) / Arizona State University (Publisher)

Created: 2017

Algorithm and Hardware Co-design for Learning On-a-chip

Description

Machine learning technology has made remarkable achievements in recent years. It has rivalled or exceeded human performance in many intellectual tasks, including image recognition, face detection and the game of Go. Many machine learning algorithms require a huge amount of computation, such as the multiplication of large matrices. As silicon technology has scaled to the sub-14 nm regime, simply scaling down the device can no longer provide enough speed-up. New device technologies and system architectures are needed to improve computing capacity. Hardware designed specifically for machine learning is in high demand, and effort is needed on the joint design and optimization of both hardware and algorithm.

For machine learning acceleration, traditional SRAM and DRAM based systems suffer from low capacity, high latency, and high standby power. Instead, emerging memories, such as Phase Change Random Access Memory (PRAM), Spin-Transfer Torque Magnetic Random Access Memory (STT-MRAM), and Resistive Random Access Memory (RRAM), are promising candidates that provide low standby power, high data density, fast access and excellent scalability. This dissertation proposes a hierarchical memory modeling framework and models PRAM and STT-MRAM at four different levels of abstraction. With the proposed models, various simulations are conducted to investigate performance, optimization, variability, reliability, and scalability.

Emerging memory devices such as RRAM can work as a 2-D crosspoint array to speed up the multiplication and accumulation in machine learning algorithms. This dissertation proposes a new parallel programming scheme to achieve in-memory learning with an RRAM crosspoint array. The programming circuitry is designed and simulated in TSMC 65 nm technology, showing a 900X speedup for the dictionary learning task compared to CPU performance.
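The way a crosspoint array performs multiplication and accumulation can be sketched in a few lines of Python. This is an idealized illustration only: it assumes each cell behaves as a fixed conductance, ignores wire resistance and sneak paths, and the function name and example values are hypothetical. Row voltages multiply cell conductances (Ohm's law) and the resulting currents add along each column (Kirchhoff's current law).

```python
def crosspoint_mac(conductances, voltages):
    """Ideal crosspoint read: column current I_j = sum_i V_i * G[i][j].

    conductances: list of rows, each a list of cell conductances in siemens.
    voltages:     list of row voltages in volts, one per row.
    """
    if len(conductances) != len(voltages):
        raise ValueError("one voltage is required per array row")
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


# Hypothetical 2x2 array: weights stored as conductances, inputs applied as voltages.
G = [[1e-6, 2e-6],
     [3e-6, 4e-6]]
V = [0.5, 1.0]
column_currents = crosspoint_mac(G, V)
```

All columns are read in the same step, which is why the array computes a full matrix-vector product in parallel rather than one multiply at a time.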

From the algorithm perspective, inspired by the high accuracy and low power of the brain, this dissertation proposes a bio-plausible feedforward inhibition spiking neural network with a Spike-Rate-Dependent-Plasticity (SRDP) learning rule. It achieves more than 95% accuracy on the MNIST dataset, which is comparable to the sparse coding algorithm, but requires far fewer computations. The role of inhibition in this network is systematically studied and shown to improve the hardware efficiency of learning.

Contributors: Xu, Zihan (Author) / Cao, Yu (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Seo, Jae-Sun (Committee member) / Yu, Shimeng (Committee member) / Arizona State University (Publisher)

Created: 2017

Cu-Silica Based Programmable Metallization Cell: Fabrication, Characterization and Applications

Description

The Programmable Metallization Cell (PMC) is a novel solid-state resistive switching technology. It has a simple metal-insulator-metal (MIM) structure in which one metal is electrochemically active (Cu) and the other is inert (Pt or W). An insulating film (silica), which acts as a solid electrolyte for ion transport, is sandwiched between these two electrodes. The resistance of a PMC can be altered by an external electrical stimulus. The change of resistance is attributed to the formation or dissolution of Cu metal filament(s) within the silica layer, which is associated with electrochemical redox reactions and ion transport. In this dissertation, a comprehensive study of the microfabrication method and its impact on PMC device performance is presented, the impact of gamma-ray total ionizing dose (TID) on device reliability is investigated, and the material properties of doped and undoped silica switching layers are examined by impedance spectroscopy (IS). Owing to their inherent CMOS compatibility, Cu-silica PMCs have great potential to be adopted in many emerging technologies, such as non-volatile storage cells and selector cells in ultra-dense 3D crosspoint memories, as well as electronic synapses in brain-inspired neuromorphic computing. Cu-silica PMC device performance for these applications is also assessed in this dissertation.

Contributors: Chen, Wenhao (Author) / Kozicki, Michael N (Thesis advisor) / Barnaby, Hugh J (Thesis advisor) / Yu, Shimeng (Committee member) / Thornton, Trevor (Committee member) / Arizona State University (Publisher)

Created: 2017

Voltage Sense Amplifier (VSA) Design For RRAM Cross-Point Memory Array Structures

Description

RRAM is an emerging technology that looks to replace FLASH NOR and possibly NAND memory. It is attractive because it uses an adjustable resistance and does not rely on charge, which is critical in circuitry with sub-10 nm feature sizes. However, RRAM cross-point arrays suffer severely from leakage currents that prevent proper readings in larger arrays. In this research, a selector with an exponential I-V characteristic was added to each cell to minimize this current. With this technique, HSPICE simulations determined the largest supportable array size to be 512x512 cells using the conventional voltage sense amplifier. However, as the array size increases, the sensing latency also increases markedly because of additional sneak path currents, approaching 873 ns for the 512x512 array.

Contributors: Madler, Ryan Anton (Author) / Yu, Shimeng (Thesis director) / Cao, Yu (Committee member) / Electrical Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2016-05

Digital Modeling of Analog Effect Circuits

Description

While SPICE circuit simulation software gives researchers and industry accurate information regarding the behavior and characteristics of circuits, the auditory effect of SPICE circuit simulation on audio circuits is not well documented. This project takes a thoroughly analyzed and popular audio effect circuit, the Ibanez Tubescreamer, and simulates its distortion effect on a .wav file in order to hear the effect of SPICE simulation. Specifically, the TS-808 schematic is drawn in the SPICE program LTSPICE and simulated using generated sinusoids and recorded .wav files. Specific components are imported using .MODEL and .SUBCKT to accurately represent the diodes, bipolar transistors, op amps, and other components in order to hear how each component affects the response. Various transient responses are extracted as .wav files and assembled as figures in order to characterize the effect of the circuit on the input. Once the actual circuit is built and debugged, the same transient analysis is applied and compared to the figures gathered from the SPICE simulation. These results are then compared, along with a subjective hearing test of the digital simulation and the analog circuit, in order to test the validity of the SPICE simulations. The digital simulations reveal that the distortion follows the signature characteristics of the Ibanez Tubescreamer, which shows that SPICE simulation gives insight into the real effects of audio circuits modeled in SPICE programs. Diodes (such as silicon, germanium, Zener, red LEDs and blue LEDs) can dramatically change the waveforms and sound of the inputs within the circuit, whereas the op amps (such as the JRC4558, TL072, and NE5532) have little to no effect on the waveforms or on the subjective character of the output .wav files.

After building the circuit and listening to both the analog circuit and the digital simulation, differences between the two are apparent, but the two are very similar in nature, indicating that the SPICE simulation can give meaningful insight into the sound of the actual analog circuit. Some of the differences can be explained by the variance of the equipment and environment used in recording and playback. Since this project did not use high fidelity audio recording equipment or consistent playback equipment, it is uncertain whether the simulation could be classified as a completely accurate representation of the actual circuit. Further work on the project would involve recording and playing back in a constant environment and examining a wider range of specific components instead of a single permutation.

Contributors: Macias, Cole Thomas (Author) / Goryll, Michael (Thesis director) / Yu, Shimeng (Committee member) / Electrical Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2015-12

Reconfigurable architectures and systems for IoT applications

Description

The Internet of Things (IoT) has become a popular topic in industry in recent years. It describes an ecosystem of internet-connected devices, or things, that enrich everyday life by improving our productivity and efficiency. The primary components of the IoT ecosystem are hardware, software and services. While the software and services of an IoT system focus on data collection and processing to make decisions, the underlying hardware is responsible for sensing the information, preprocessing it and transmitting it to the servers. Since the IoT ecosystem is still in its infancy, there is a great need for rapid prototyping platforms that would help accelerate the hardware design process. However, depending on the target IoT application, different sensors are required to sense signals such as heart rate, temperature, pressure and acceleration, so there is a great need for reconfigurable platforms that can prototype different sensor interfacing circuits.

This thesis primarily focuses on two important hardware aspects of an IoT system: (a) an FPAA based reconfigurable sensing front-end system and (b) an FPGA based reconfigurable processing system. To enable reconfiguration capability for any sensor type, the Programmable ANalog Device Array (PANDA), a transistor-level analog reconfigurable platform, is proposed. CAD tools required for implementing front-end circuits on the platform are also developed. To demonstrate the capability of the platform on silicon, a small-scale array of 24×25 PANDA cells is fabricated in 65 nm technology. Several analog circuit building blocks, including amplifiers, bias circuits and filters, are prototyped on the platform, which demonstrates its effectiveness for rapidly prototyping IoT sensor interfaces.

IoT systems typically use machine learning algorithms that run on the servers to process the data in order to make decisions. Recently, embedded processors have been used to preprocess the data at the energy-constrained sensor node or at the IoT gateway, which saves considerable transmission energy and bandwidth. Using conventional CPU based systems to implement the machine learning algorithms is not energy-efficient. Hence, an FPGA based hardware accelerator is proposed, and an optimization methodology is developed to maximize the throughput of any convolutional neural network (CNN) based machine learning algorithm on a resource-constrained FPGA.

Contributors: Suda, Naveen (Author) / Cao, Yu (Thesis advisor) / Bakkaloglu, Bertan (Committee member) / Ozev, Sule (Committee member) / Yu, Shimeng (Committee member) / Seo, Jae-Sun (Committee member) / Arizona State University (Publisher)

Created: 2016

Multi-Level Control of Conductive Nano-Filament Evolution in HfO2 ReRAM by Pulse-Train Operations

Description

Precise electrical manipulation of nanoscale defects such as vacancy nano-filaments is highly desired for the multi-level control of ReRAM. In this paper we present a systematic investigation of the pulse-train operation scheme for reliable multi-level control of conductive filament evolution. By applying the pulse-train scheme to a 3-bit-per-cell HfO2 ReRAM, the relative standard deviations of the resistance levels are improved by up to 80% compared to the single-pulse scheme. The observed exponential relationship between the saturated resistance and the pulse amplitude provides evidence for the gap-formation model of the filament-rupture process.

Contributors: Zhao, L. (Author) / Chen, H.-Y. (Author) / Wu, S.-C (Author) / Jiang, Z. (Author) / Yu, Shimeng (Author) / Hou, T.-H. (Author) / Wong, H.-S. Philip (Author) / Nishi, Y. (Author) / Ira A. Fulton Schools of Engineering (Contributor)

Created: 2014-03-26

Stochastic Learning in Oxide Binary Synaptic Device for Neuromorphic Computing

Description

Hardware implementation of neuromorphic computing is attractive as a computing paradigm beyond conventional digital computing. In this work, we show that the SET (off-to-on) transition of metal oxide resistive switching memory becomes probabilistic under a weak programming condition. The switching variability of the binary synaptic device implements a stochastic learning rule. This stochastic SET transition was statistically measured and modeled for a simulation of a winner-take-all network for competitive learning. The simulation illustrates that, with such stochastic learning, the orientation classification of input patterns can be effectively realized. The system performance metrics were compared between the conventional approach using an analog synapse and the approach in this work, which employs a binary synapse with stochastic learning. The feasibility of using a binary synapse in neuromorphic computing may relax the constraints on engineering continuous multilevel intermediate states and widen the material choice for synaptic device design.
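The idea of a stochastic learning rule on binary synapses in a winner-take-all network can be illustrated with the short Python sketch below. It is a simplified stand-in, not the measured device model or the network from the paper: the function names, the fixed switching probability `p_set` and the toy patterns are all assumptions made for illustration.

```python
import random


def stochastic_set(weights, active, p_set, rng):
    """Apply one weak-programming step to a row of binary synapses.

    An OFF synapse (0) whose input is active switches ON (1) with
    probability p_set; ON synapses and inactive inputs are left unchanged.
    """
    return [
        1 if w == 1 or (a and rng.random() < p_set) else 0
        for w, a in zip(weights, active)
    ]


def winner_index(weight_rows, pattern):
    """Winner-take-all: index of the neuron with the largest weighted input."""
    sums = [sum(w * x for w, x in zip(row, pattern)) for row in weight_rows]
    return max(range(len(sums)), key=sums.__getitem__)


rng = random.Random(0)              # seeded only so the sketch is repeatable
rows = [[0, 0, 0, 0], [1, 0, 0, 0]]  # two neurons, four binary synapses each
pattern = [1, 1, 0, 0]
win = winner_index(rows, pattern)    # neuron 1 already overlaps the pattern
rows[win] = stochastic_set(rows[win], pattern, p_set=0.3, rng=rng)
```

Because each update switches only a random fraction of the eligible synapses, repeated presentations move the winning neuron's weights toward the pattern gradually, which is how a binary device can stand in for an analog weight change on average.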

Contributors: Yu, Shimeng (Author) / Gao, Bin (Author) / Fang, Zheng (Author) / Yu, Hongyu (Author) / Kang, Jinfeng (Author) / Wong, H.-S. Philip (Author) / Ira A. Fulton Schools of Engineering (Contributor)

Created: 2013-10-31