Matching Items (65)

Description
Semiconductor memory is a key component of computing systems. Beyond conventional memory and data storage applications, this dissertation explores both mainstream and emerging non-volatile memory (eNVM) technologies for radiation environments, hardware security systems, and machine learning applications.

In radiation environments, e.g. aerospace, memory devices are exposed to various energetic particles. A particle strike generates electron-hole pairs (directly or indirectly) as it passes through the semiconductor device, producing a photo-induced current that may flip the memory state. First, the trend of radiation effects in mainstream memory technologies with technology node scaling is reviewed. Then, single event effects in oxide-based resistive random access memory (RRAM), one of the eNVM technologies, are investigated from the circuit level to the system level.
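
As a concrete back-of-envelope example of the charge-generation mechanism described above, the sketch below applies the standard silicon rule of thumb (about 3.6 eV per electron-hole pair, roughly 10 fC of collected charge per micron of track per unit LET); the LET and collection-depth values are illustrative assumptions, not numbers from the dissertation.

```python
# Rough, illustrative estimate of charge deposited by a heavy-ion strike in silicon.
# Values below are hypothetical, not taken from the dissertation.

EV_PER_PAIR = 3.6          # eV needed to create one e-h pair in silicon
SI_DENSITY = 2.32e3        # silicon density in mg/cm^3
Q_ELECTRON = 1.602e-19     # C

def deposited_charge_fc(let_mev_cm2_per_mg: float, depth_um: float) -> float:
    """Charge (fC) generated along a track of `depth_um` microns for a given LET."""
    energy_ev_per_um = let_mev_cm2_per_mg * 1e6 * SI_DENSITY * 1e-4  # eV per micron
    pairs = energy_ev_per_um * depth_um / EV_PER_PAIR
    return pairs * Q_ELECTRON * 1e15  # convert C to fC

# Example: LET = 10 MeV*cm^2/mg over a 1 um collection depth -> ~103 fC,
# well above the few-fC critical charge of scaled memory cells, hence an upset.
print(deposited_charge_fc(10.0, 1.0))
```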

The physical unclonable function (PUF) has been widely investigated as a promising hardware security primitive that exploits the inherent randomness of a physical system (e.g. intrinsic semiconductor manufacturing variability). In this dissertation, two RRAM-based PUF implementations are proposed, for cryptographic key generation (weak PUF) and device authentication (strong PUF), respectively. The performance of the RRAM PUFs is evaluated through experiments and simulations. The impact of non-ideal circuit effects on PUF performance is also investigated, and optimization strategies are proposed to mitigate these effects. In addition, the resistance against modeling and machine learning attacks is analyzed.
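
As an illustration of the weak-PUF idea, the sketch below extracts one key bit per RRAM cell by comparing its read resistance against the array median; the log-normal spread, cell count, and median-comparison scheme are generic assumptions for illustration, not the specific circuit proposed in the dissertation.

```python
import math
import random
import statistics

def read_resistances(n_cells, mean_ohm=10e3, sigma=0.15):
    """Hypothetical post-programming cell resistances with log-normal cell-to-cell spread."""
    return [random.lognormvariate(math.log(mean_ohm), sigma) for _ in range(n_cells)]

def generate_key(resistances):
    """One key bit per cell: 1 if the cell reads above the array median, else 0."""
    ref = statistics.median(resistances)
    return [1 if r > ref else 0 for r in resistances]

cells = read_resistances(128)
key = generate_key(cells)                   # 128-bit device-unique key
print(sum(key), "ones out of", len(key))    # roughly half ones (good uniformity)
```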

Deep neural networks (DNNs) have shown remarkable improvements in various intelligent applications such as image classification, speech classification, and object localization and detection, and increasing efforts have been devoted to developing hardware accelerators. In this dissertation, two compute-in-memory (CIM) based hardware accelerator designs, using SRAM and eNVM technologies, are proposed for two binary neural networks, the hybrid BNN (HBNN) and the XNOR-BNN, respectively, targeting hardware-resource-limited platforms such as edge devices. These designs feature high throughput, scalability, low latency, and high energy efficiency. Finally, the proposed SRAM-based design has been successfully taped out and validated in TSMC 65 nm technology.
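
For clarity on the XNOR-BNN arithmetic mentioned above, the sketch below shows how a binary dot product reduces to XNOR plus popcount once weights and activations are constrained to {-1, +1}; this is the generic arithmetic only, not the taped-out SRAM macro itself.

```python
# Weights and activations in {-1, +1} are encoded as {0, 1} bits, so a dot product
# becomes a bitwise XNOR followed by a population count.

def xnor_popcount_dot(acts_bits, wts_bits):
    """Dot product of two {-1,+1} vectors stored as 0/1 bit lists."""
    n = len(acts_bits)
    matches = sum(1 for a, w in zip(acts_bits, wts_bits) if a == w)  # XNOR = 1 count
    return 2 * matches - n   # map popcount back to the {-1,+1} dot product

# Check against the full-precision dot product on bipolar values.
acts = [1, 0, 1, 1, 0, 0, 1, 0]       # encodes [+1, -1, +1, +1, -1, -1, +1, -1]
wts  = [1, 1, 0, 1, 0, 1, 1, 0]
bipolar = lambda b: 2 * b - 1
assert xnor_popcount_dot(acts, wts) == sum(bipolar(a) * bipolar(w) for a, w in zip(acts, wts))
```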

Overall, this dissertation paves the path for new applications of memory technologies toward secure and energy-efficient artificial intelligence systems.
Contributors: Liu, Rui (Author) / Yu, Shimeng (Thesis advisor, Committee member) / Cao, Yu (Committee member) / Barnaby, Hugh (Committee member) / Seo, Jae-Sun (Committee member) / Arizona State University (Publisher)
Created: 2018

Description
The rapid improvement in computation capability has made deep convolutional neural networks (CNNs) a great success in recent years on many computer vision tasks, with significantly improved accuracy. During the inference phase, many applications demand low-latency processing of a single image under strict power constraints, which reduces the efficiency of GPUs and other general-purpose platforms and creates opportunities for dedicated acceleration hardware, e.g. FPGAs, whose digital circuits can be customized for deep learning inference. However, deploying CNNs on portable and embedded systems is still challenging due to the large data volume, intensive computation, varying algorithm structures, and frequent memory accesses. This dissertation proposes a complete design methodology and framework to accelerate the inference of various CNN algorithms on FPGA hardware with high performance, efficiency, and flexibility.

As convolution contributes most of the operations in CNNs, the convolution acceleration scheme significantly affects the efficiency and performance of a hardware CNN accelerator. Convolution involves multiply-and-accumulate (MAC) operations organized in four levels of loops. Without fully studying convolution loop optimization before the hardware design phase, the resulting accelerator can hardly exploit data reuse or manage data movement efficiently. This work overcomes these barriers by quantitatively analyzing and optimizing the design objectives (e.g. memory access) of the CNN accelerator over multiple design variables. An efficient dataflow and hardware architecture for CNN acceleration are proposed that minimize data communication while maximizing resource utilization to achieve high performance.
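
For reference, a plain-software version of the four convolution loop levels discussed above is sketched below (kernel window, input channels, output feature map, output channels); the dimensions and loop order are illustrative, since reordering, tiling, and unrolling these loops is exactly the design space the accelerator explores.

```python
def conv_layer(inp, wts, C_in, C_out, H_out, W_out, K):
    """inp[c][y][x], wts[m][c][ky][kx] -> out[m][y][x]; stride 1, no padding."""
    out = [[[0.0] * W_out for _ in range(H_out)] for _ in range(C_out)]
    for m in range(C_out):                 # Loop-4: output channels
        for y in range(H_out):             # Loop-3: output feature map rows
            for x in range(W_out):         #         output feature map columns
                acc = 0.0
                for c in range(C_in):      # Loop-2: input channels
                    for ky in range(K):    # Loop-1: kernel window
                        for kx in range(K):
                            acc += inp[c][y + ky][x + kx] * wts[m][c][ky][kx]
                out[m][y][x] = acc         # one MAC chain per output pixel
    return out
```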

Although great performance and efficiency can be achieved by customizing the FPGA hardware for each CNN model, significant effort and expertise are required, leading to long development times that make it difficult to keep up with the rapid development of CNN algorithms. In this work, we present an RTL-level CNN compiler that automatically generates customized FPGA hardware for the inference tasks of various CNNs, enabling fast high-level prototyping of CNNs from software to FPGA while keeping the benefits of low-level hardware optimization. First, a general-purpose library of RTL modules is developed to model the different operations at each layer. The integration and dataflow of the physical modules are predefined in a top-level system template and reconfigured during compilation for a given CNN algorithm. The runtime control of layer-by-layer sequential computation is managed by the proposed execution schedule so that even highly irregular and complex network topologies, e.g. GoogLeNet and ResNet, can be compiled. The proposed methodology is demonstrated with various CNN algorithms, e.g. NiN, VGG, GoogLeNet and ResNet, on two different standalone FPGAs, achieving state-of-the-art performance.
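
A minimal sketch of the compilation idea is given below: the network is captured as an ordered list of layer descriptors, and each descriptor selects and parameterizes a predefined RTL module, yielding a layer-by-layer execution schedule. The module names, descriptor fields, and template files are hypothetical placeholders, not the actual compiler's interface.

```python
# Hypothetical layer descriptors for a small network.
LAYERS = [
    {"type": "conv", "in_ch": 3,   "out_ch": 64,  "kernel": 3, "stride": 1},
    {"type": "pool", "kernel": 2,  "stride": 2},
    {"type": "conv", "in_ch": 64,  "out_ch": 128, "kernel": 3, "stride": 1},
    {"type": "fc",   "in_dim": 128, "out_dim": 10},
]

# Hypothetical mapping from layer type to a predefined RTL module template.
RTL_TEMPLATES = {"conv": "conv_unit.v", "pool": "pool_unit.v", "fc": "fc_unit.v"}

def compile_schedule(layers):
    """Return the module instantiations and execution order for the top-level template."""
    schedule = []
    for i, layer in enumerate(layers):
        schedule.append({"step": i,
                         "module": RTL_TEMPLATES[layer["type"]],
                         "params": {k: v for k, v in layer.items() if k != "type"}})
    return schedule

for step in compile_schedule(LAYERS):
    print(step)
```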

Based on the optimized acceleration strategy, there are still many design options, e.g. the degree and dimension of computation parallelism, the size of on-chip buffers, and the external memory bandwidth, which affect the utilization of computation resources and the efficiency of data communication, and ultimately the performance and energy consumption of the accelerator. The large design space of the accelerator makes it impractical to explore the optimal design choice during the implementation phase. Therefore, a performance model is proposed in this work to quantitatively estimate the accelerator performance and resource utilization. By this means, performance bottlenecks and design bounds can be identified, and the optimal design option can be explored early in the design phase.
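
The sketch below illustrates the kind of analytical estimate such a performance model produces: per-layer latency is bounded by the larger of compute cycles (MACs over parallel MAC units) and memory cycles (external traffic over available bandwidth). The parameter names and example numbers are assumptions for illustration, not the thesis's calibrated model.

```python
def layer_latency_cycles(macs, bytes_moved, pe_count, bytes_per_cycle):
    """Roofline-style per-layer latency estimate."""
    compute_cycles = macs / pe_count                 # fully utilized MAC array
    memory_cycles = bytes_moved / bytes_per_cycle    # external bandwidth bound
    return max(compute_cycles, memory_cycles)        # the bottleneck dominates

# Example: a 3x3 conv layer, 64->128 channels on a 56x56 map, 1024 PEs, 16 B/cycle.
macs = 3 * 3 * 64 * 128 * 56 * 56
bytes_moved = (64 * 58 * 58 + 128 * 56 * 56 + 3 * 3 * 64 * 128) * 2  # fmaps + weights, 16-bit
print(layer_latency_cycles(macs, bytes_moved, pe_count=1024, bytes_per_cycle=16))
```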
Contributors: Ma, Yufei (Author) / Vrudhula, Sarma (Thesis advisor) / Seo, Jae-Sun (Thesis advisor) / Cao, Yu (Committee member) / Barnaby, Hugh (Committee member) / Arizona State University (Publisher)
Created: 2018

Description
Space exploration is a large field that requires high-performing circuitry able to withstand a harsh environment. Within a space environment, one of the biggest factors leading to circuit failure is radiation: circuits must be robust enough to continue operating after exposure to high doses of radiation. Bandgap reference (BGR) circuits are designed as voltage references that stay stable across a wide range of supply voltages and temperatures. A bandgap reference is a subcircuit of a larger circuit that supplies its critical elements with a constant voltage. When used in a space environment with large amounts of radiation, a BGR must maintain its output voltage to enable the rest of the circuit to operate under proper conditions. Since a BGR is not a standalone circuit, it is difficult and expensive to test whether a BGR is maintaining its reference voltage.
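
For background, the sketch below shows the textbook bandgap principle these circuits rely on: a CTAT base-emitter voltage is summed with a gained-up PTAT term (kT/q)·ln(N) so the output sits near 1.2 V across temperature. The coefficients are illustrative, not extracted from the regulators studied in this thesis.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
Q_E = 1.602e-19      # electron charge, C

def vref(temp_c, vbe_25c=0.65, ctat_slope=-2e-3, n=8, gain=11.2):
    """First-order bandgap output: CTAT V_BE plus gained-up PTAT (kT/q)ln(N)."""
    t_k = temp_c + 273.15
    v_be = vbe_25c + ctat_slope * (temp_c - 25.0)   # CTAT term, ~ -2 mV/C (assumed)
    dv_be = (K_B * t_k / Q_E) * math.log(n)         # PTAT term from an N:1 emitter ratio
    return v_be + gain * dv_be

for t in (-55, 25, 125):
    print(t, "C ->", round(vref(t), 4), "V")        # stays near ~1.25 V over temperature
```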

This thesis describes a methodology for isolating and simulating bandgap references. Both NPN- and PNP-based bandgap references are simulated over a variety of radiation doses and dose rates. This methodology allows the radiation-induced degradation of a BGR to be modeled easily and affordably. Many circuits experience enhanced low dose rate sensitivity (ELDRS), which can lead to failure at low total ionizing doses (TID) of radiation. A compact model library capturing transistor degradation at both high and low dose rates (HDR and LDR) is used to assess bandgap reference reliability. Specifically, two bandgap references used in commercial off-the-shelf low-dropout regulators (LDOs) are evaluated. The LDOs are reverse engineered in a simulation program with integrated circuit emphasis (SPICE), with the bandgaps as the points of interest. One LDO has a positive regulated voltage and the other a negative regulated voltage, requiring an NPN-based and a PNP-based BGR, respectively. This simulation methodology draws conclusions about how the above bandgap references operate under radiation at different doses and dose rates.
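
As a strongly simplified illustration of the kind of sweep such a compact-model library enables, the sketch below degrades bipolar current gain with total dose using a linear Δ(1/β) term and a dose-rate enhancement factor; the functional form and every constant are generic placeholders, not the calibrated models used in the thesis.

```python
BETA_0 = 150.0            # pre-irradiation current gain (assumed)
K_HDR = 2.0e-8            # delta(1/beta) per rad(Si) at high dose rate (assumed)
ELDRS_FACTOR = 4.0        # low-dose-rate damage enhancement factor (assumed)

def beta_after_dose(tid_rad, low_dose_rate=False):
    """Degraded current gain after a given total ionizing dose in rad(Si)."""
    k = K_HDR * (ELDRS_FACTOR if low_dose_rate else 1.0)
    return 1.0 / (1.0 / BETA_0 + k * tid_rad)

for tid in (10e3, 50e3, 100e3):   # rad(Si)
    print(int(tid), "rad:", round(beta_after_dose(tid), 1), "(HDR)",
          round(beta_after_dose(tid, True), 1), "(LDR)")
```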
Contributors: Davis, Parker William (Author) / Barnaby, Hugh (Thesis advisor) / Kitchen, Jennifer (Committee member) / Privat, Aymeric (Committee member) / Arizona State University (Publisher)
Created: 2019

Description
Non-volatile memory (NVM) has become a staple of everyday consumer life. NVM is found inside cell phones, laptops, and, most recently, wearable tech such as smart watches. NAND Flash has been an excellent solution where fast, compact NVM is required. However, current technology nodes are nearing the physical limits of scaling, preventing further improvement of flash. To overcome the limitations of flash and meet consumer demand for progressively faster and denser NVM, new technologies are needed. One possible candidate for the replacement of NAND Flash is the programmable metallization cell (PMC). PMCs are a type of resistive memory, meaning they do not rely on charge storage to maintain a logic state. Depending on the application, devices containing NVM may be exposed to harsh radiation environments. As part of the process of developing a novel memory technology, it is important to characterize the effects irradiation has on device functionality.

This thesis characterizes the effects of ionizing γ-ray irradiation on the retention of the programmed resistive state of a PMC. The PMC devices tested used Ag-doped Ge30Se70 as the solid electrolyte layer and were fabricated by the thesis author in a Class 100 clean room. Individual device tiles were wire bonded into ceramic packages and tested under both biased and floating contact conditions.

The first scenario shows that PMC devices are capable of retaining their programmed state up to the maximum exposed total ionizing dose (TID) of 3.1 Mrad(Si); in this scenario, the contacts of the PMC devices were left floating during exposure. The second scenario shows that the PMC devices retain their state up to the maximum TID of 10.1 Mrad(Si); here, the contacts were biased, with a 50 mV read voltage applied to the anode contact. Analysis of the results shows that Ge30Se70 PMCs are tolerant of ionizing radiation and can retain a programmed state to a higher TID than NAND Flash memory.
Contributors: Taggart, Jennifer Lynn (Author) / Barnaby, Hugh (Thesis advisor) / Kozicki, Michael (Committee member) / Holbert, Keith E. (Committee member) / Arizona State University (Publisher)
Created: 2015

Description
For decades, scientists have been scaling devices to ever smaller feature sizes to improve the performance of complementary metal-oxide semiconductor (CMOS) technology and meet the speed, complexity, circuit density, power consumption, and ultimately cost requirements of many advanced applications. However, these ultra-scaled CMOS devices also bring drawbacks. Aging due to bias temperature instability (BTI) and hot carrier injection (HCI) is the dominant cause of functional failure in large-scale logic circuits. The aging phenomena, on top of process variations, translate into complexity and reduced design margin for circuits. Such issues call for "Design for Reliability". To increase overall design efficiency, it is important to (i) study the impact of aging at the circuit level along with a transistor-level understanding, (ii) calibrate the theoretical findings with measurement data, and (iii) implement tools that analyze the impact of BTI and HCI reliability on circuit timing at each stage of the VLSI design process. In this work, post-silicon measurements of a 28 nm HK-MG technology are performed to study the effect of aging on the frequency degradation of digital circuits. A novel voltage-controlled ring oscillator (VCO) structure developed by the NIMO research group is used to determine the effect of aging mechanisms such as NBTI, PBTI, and SILC on circuit parameters. An accelerated aging methodology is proposed to avoid the time-consuming measurement process and the extrapolation of data to end of life; thus, instead of predicting circuit behavior, one can measure it within a short period of time. Finally, to bridge the gap between device-level models and circuit-level aging analysis, the System Level Reliability Analysis Flow (SyRA) developed by the NIMO group is implemented for a TSMC 65 nm industrial-level design to achieve one-step reliability prediction for digital designs.
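
To connect threshold-voltage shifts to the frequency degradation discussed above, the sketch below uses the alpha-power-law delay model, in which ring-oscillator frequency scales roughly as (VDD − Vth)^α; the supply, threshold, and exponent values are illustrative assumptions, not fitted to the 28 nm measurements.

```python
VDD = 0.9        # supply voltage, V (assumed)
VTH0 = 0.35      # fresh threshold voltage, V (assumed)
ALPHA = 1.3      # velocity-saturation exponent (assumed)

def freq_degradation_pct(delta_vth):
    """Percentage frequency loss for a BTI-induced threshold-voltage shift."""
    fresh = (VDD - VTH0) ** ALPHA
    aged = (VDD - VTH0 - delta_vth) ** ALPHA
    return 100.0 * (1.0 - aged / fresh)

for dvth_mv in (10, 30, 50):
    print(dvth_mv, "mV shift ->", round(freq_degradation_pct(dvth_mv / 1000.0), 2), "% slower")
```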
Contributors: Bansal, Ankita (Author) / Cao, Yu (Thesis advisor) / Seo, Jae sun (Committee member) / Barnaby, Hugh (Committee member) / Arizona State University (Publisher)
Created: 2016

Description
The formation of dendrites in materials is usually seen as a failure-inducing defect in devices. Naturally, most research views dendrites as a problem needing a solution, focusing on process control techniques and post-mortem analysis of various stress patterns with the ultimate goal of totally suppressing the structures. However, programmable metallization cell (PMC) technology embraces dendrite formation in chalcogenide glasses by using the nascent conductive filaments as its core operative element. Furthermore, exciting More-than-Moore capabilities in the realms of device watermarking and hardware encryption schemes are made possible by the random nature of dendritic branch growth. While dendritic structures have been observed and are well documented in solid-state materials, there is still no satisfactory theoretical model that provides insight into and a better understanding of how dendrites form. Ultimately, what is desired is the capability to predict the final structure of the conductive filament in a PMC device so that exciting new applications can be developed with PMC technology.

This thesis details the results of an effort to create a first-principles MATLAB simulation model that uses configurable physical parameters to generate images of dendritic structures. Generated images are compared against real-world samples. While growth has a significant random component, several reliable characteristics form under similar parameter sets and can be monitored, such as the relative length of major dendrite arms, common branching angles, and overall growth directionality.

The first simulation model takes a Newtonian perspective of the problem and is implemented with the Euler numerical method. This model has several shortcomings, stemming mainly from its simplistic treatment of the problem, but is highly performant. The model is then revised to use the Verlet numerical method, which increases simulation accuracy but still does not fully resolve the issues with the theoretical background. The final simulation model returns to the Euler method but is a stochastic model based on the Mott-Gurney ion hopping theory applied to solids. The results from this model match real samples the most closely of all the simulations.
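
The sketch below (written in Python, whereas the thesis model is in MATLAB) illustrates the field-assisted hop rate at the heart of a stochastic Mott-Gurney-style model, ν = ν₀·exp(−Ea/kT)·sinh(qaE/2kT), with a hop accepted per Monte Carlo step with probability ≈ ν·dt; the attempt frequency, barrier height, and hop distance are illustrative values, not the parameters calibrated in the thesis.

```python
import math
import random

K_B = 8.617e-5      # Boltzmann constant, eV/K
Q_A = 1.0           # ion charge in units of the elementary charge

def hop_rate(e_field_v_per_m, temp_k=300.0, nu0=1e12, ea_ev=0.3, a_m=0.5e-9):
    """Field-assisted hop attempts per second in the field direction."""
    kt = K_B * temp_k
    field_term = Q_A * a_m * e_field_v_per_m / (2.0 * kt)   # (q*a*E)/(2kT), dimensionless
    return nu0 * math.exp(-ea_ev / kt) * math.sinh(field_term)

def hops_this_step(e_field_v_per_m, dt=1e-9):
    """Accept or reject one hop in a time step dt (first-order approximation of rate*dt)."""
    return random.random() < min(1.0, hop_rate(e_field_v_per_m) * dt)

print(hop_rate(1e8))   # hop rate at a 1 MV/cm field
```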
Contributors: Foss, Ryan (Author) / Kozicki, Michael N (Thesis advisor) / Barnaby, Hugh (Committee member) / Allee, David R. (Committee member) / Arizona State University (Publisher)
Created: 2016

Description
The scaling of transistors has numerous advantages, such as increased memory density, lower power consumption, and better performance; on the other hand, it also gives rise to many reliability issues. One of the major reliability issues is hot carrier injection and the device degradation it causes over time, which leads to serious circuit malfunctions.

Hot carrier injection has been studied since the early 1980s, and much research has been done on the various hot carrier injection mechanisms and how devices are damaged by this effect. However, most existing hot carrier degradation models do not consider the physics involved in the degradation process; they simply calculate the change in threshold voltage for different stress voltages and times and, based on this, formulate an analytical expression that predicts the device lifetime.

This thesis starts by discussing various hot carrier injection mechanisms and the effects they have on the device. Studies have shown that charge trapping in the gate oxide and interface trap generation are the two mechanisms behind device degradation; how various device parameters are affected by these traps is discussed here. Physics-based models such as the lucky electron model and the substrate current model are presented, giving an idea of how the gate current and substrate current can be related to hot carrier injection and the density of traps created.
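
For concreteness, the sketch below writes out the lucky electron model's "age" relation in its common form, where interface-trap buildup follows a power law in stress time weighted by (Isub/Id) raised to the ratio of the trap-creation and impact-ionization energies; the exponents and prefactor are typical illustrative values, not parameters extracted in this thesis.

```python
PHI_IT_OVER_PHI_I = 2.9   # ratio of trap-creation to impact-ionization energy (typically ~3)
N_EXP = 0.5               # time power-law exponent (typically 0.5-0.7, assumed)
C_FIT = 1.0               # fitting prefactor (assumed)

def delta_nit(stress_time_s, i_d_a, i_sub_a, width_m):
    """Relative interface-trap buildup under hot-carrier stress (arbitrary units)."""
    age = (i_d_a / width_m) * (i_sub_a / i_d_a) ** PHI_IT_OVER_PHI_I * stress_time_s
    return C_FIT * age ** N_EXP

# Example: 1 mA drain current, 10 uA substrate current, 1 um wide device, 1000 s stress.
print(delta_nit(1e3, 1e-3, 10e-6, 1e-6))
```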

Devices are stressed under various voltages, and from the experimental data obtained, the densities of trapped oxide charges and interface traps are calculated using the midgap technique. In this thesis, a simple analytical model based on the substrate current is used to calculate the density of trapped charges in the oxide and of the interface traps generated as a function of stress voltage and stress time. The model is verified against the data and TCAD simulations. Finally, the analytical model is incorporated into a Verilog-A model, and based on the surface potential method, the threshold voltage shift due to hot carrier stress is calculated.
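
The midgap separation step mentioned above can be summarized as in the sketch below: the midgap-voltage shift is attributed to oxide-trapped charge, and the extra midgap-to-threshold stretch-out to interface traps, each converted to an areal density through C_ox/q. The oxide thickness and example shifts are assumed values for illustration.

```python
Q_E = 1.602e-19      # elementary charge, C
EPS_OX = 3.45e-11    # SiO2 permittivity, F/m
T_OX = 5e-9          # assumed gate oxide thickness, m

def trap_densities(dv_mg, dv_th):
    """Return (oxide-trapped charge density, interface-trap density) in cm^-2."""
    c_ox = EPS_OX / T_OX                    # areal oxide capacitance, F/m^2
    n_ot = c_ox * abs(dv_mg) / Q_E * 1e-4   # midgap shift -> oxide-trapped charge
    n_it = c_ox * abs(dv_th - dv_mg) / Q_E * 1e-4   # stretch-out -> interface traps
    return n_ot, n_it

# Example: midgap voltage shifted by -50 mV, threshold voltage shifted by -20 mV.
print(trap_densities(-0.05, -0.02))
```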
Contributors: Muthuseenu, Kiraneswar (Author) / Barnaby, Hugh (Thesis advisor) / Kozicki, Michael (Committee member) / Velo, Yago Gonzalez (Committee member) / Arizona State University (Publisher)
Created: 2017

Description
With the natural resources of Earth depleting rapidly, the natural resources of other celestial bodies are being considered as a potential replacement. The number of space missions has therefore been rising steadily, and with it the need for more sophisticated spectrometer devices. The most important requirements in such applications are low area and low power consumption.

To save area, scintillators have been developed that can resolve both neutron and gamma events, unlike traditional scintillators, which resolve only one of the two and therefore require the spacecraft to carry two such devices. With this development, however, the demands on the readout electronics have also increased, since they must now discriminate between neutron and gamma events.
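
For background on what such discrimination involves, the sketch below shows the conventional charge-comparison pulse-shape-discrimination metric (tail-to-total charge ratio), which exploits the larger slow scintillation component of neutron events. This is the classic baseline only, not the novel architecture proposed in this work, and the window index and threshold are assumed values.

```python
def psd_ratio(samples, tail_start_idx):
    """Tail-to-total integral of a digitized scintillation pulse."""
    total = sum(samples)
    tail = sum(samples[tail_start_idx:])
    return tail / total if total > 0 else 0.0

def classify(samples, tail_start_idx=20, threshold=0.25):
    """Label a pulse as 'neutron' if its tail fraction exceeds the threshold."""
    return "neutron" if psd_ratio(samples, tail_start_idx) > threshold else "gamma"
```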

This work presents a novel architecture for discriminating such events and compares the results with another approach developed by a partner company. The results show excellent potential of this approach for neutron-gamma discrimination, and the team at ASU will expand on this design to build a working prototype of the complete spectrometer device.
Contributors: Gupta, Kush (Author) / Barnaby, Hugh (Thesis advisor) / Hardgrove, Craig (Committee member) / Ozev, Sule (Committee member) / Arizona State University (Publisher)
Created: 2017

Description
A microbial fuel cell (MFC) is a bio-inspired, carbon-neutral, renewable electrochemical converter that extracts electricity from the catabolic reactions of micro-organisms. It is a promising technology capable of directly converting the abundant biomass on the planet into electricity and potentially alleviating the emerging global warming and energy crises. However, the current and power densities of MFCs are low compared with conventional energy conversion techniques. Since the MFC's debut in 2002, many studies have adopted a variety of new configurations and structures to improve the power density. The reported maximum areal and volumetric power densities range from 19 mW/m² to 1.57 W/m² and from 6.3 W/m³ to 392 W/m³, respectively, which are still low compared with conventional energy conversion techniques. In this dissertation, the impact of scaling on the performance of MFCs is investigated, and it is found that by scaling down the characteristic length of an MFC, the surface-area-to-volume ratio increases and the current and power densities improve. Accordingly, a miniaturized MFC with a gold anode, fabricated by micro-electro-mechanical systems (MEMS) technology, is presented, demonstrating a high power density of 3300 W/m³. The performance of the MEMS MFC is further improved by adopting anodes with higher surface-area-to-volume ratios, such as carbon nanotube (CNT) and graphene based anodes, pushing the maximum power density to a record 11220 W/m³. A novel supercapacitor operated by regulating the respiration of the bacteria is also presented, achieving a high current density of 531.2 A/m² (1,060,000 A/m³) and power density of 197.5 W/m² (395,000 W/m³), one to two orders of magnitude higher than any previously reported microbial electrochemical technique.
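
The scaling argument above can be made concrete with the sketch below: for a planar-anode chamber, the volumetric power density is the areal power density divided by the chamber height, so shrinking the characteristic length directly boosts W/m³. The areal power density and chamber heights used are hypothetical round numbers, not values from the dissertation.

```python
def volumetric_power_density(areal_power_w_per_m2, chamber_height_m):
    """W/m^3 for a planar-anode chamber: P/V = (P/A) * (A/V) = (P/A) / height."""
    return areal_power_w_per_m2 / chamber_height_m

# A macro-scale cell (1 cm deep) vs a MEMS-scale cell (100 um deep), both at the
# same hypothetical areal power density of 0.1 W/m^2:
print(volumetric_power_density(0.1, 1e-2))     # 10 W/m^3
print(volumetric_power_density(0.1, 100e-6))   # 1000 W/m^3
```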
Contributors: Ren, Hao (Author) / Chae, Junseok (Thesis advisor) / Bakkaloglu, Bertan (Committee member) / Phillips, Stephen (Committee member) / Goryll, Michael (Committee member) / Arizona State University (Publisher)
Created: 2016

Description
The market for high-speed camera chips, or image sensors, has experienced rapid growth over the past decades owing to its broad application space in security, biomedical equipment, and mobile devices. CMOS (complementary metal-oxide-semiconductor) technology has significantly improved the performance of high-speed camera chips by enabling the monolithic integration of pixel circuits and on-chip analog-to-digital conversion. However, for low-light-intensity applications, many CMOS image sensors have a sub-optimal dynamic range, particularly in high-speed operation. Thus, the requirement for a sensor to achieve both a high frame rate and a high fill factor is attracting more attention. Another drawback of the high-speed camera chip is its high power demand due to its high operating frequency. Therefore, a CMOS image sensor with high frame rate, high fill factor, high voltage range, and low power is difficult to realize.

This thesis presents the design of the pixel circuit, pixel array, and column readout chain for a high-speed camera chip. An integrated PN (positive-negative) junction photodiode and an accompanying ten-transistor pixel circuit are implemented in a 0.18 µm CMOS technology. Multiple methods are applied to minimize the subthreshold currents, which is critical for low-light detection. A layout sharing technique is used to increase the fill factor to 64.63%. Four programmable gain amplifiers (PGAs) and 10-bit pipeline analog-to-digital converters (ADCs) complete the on-chip analog-to-digital conversion. Simulation results of the extracted circuit indicate an ENOB (effective number of bits) greater than 8 bits with a figure of merit (FoM) of 0.789. The minimum detectable voltage level is determined to be 470 μV based on noise analysis. The total power consumption of the PGA and ADC is 8.2 mW for each conversion. The whole camera chip reaches 10508 frames per second (fps) at full resolution within a 3.1 mm × 3.4 mm area.
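
For context on how such ADC figures are computed, the sketch below evaluates the conventional Walden figure of merit, FoM = P / (2^ENOB · f_s), in energy per conversion step; whether the 0.789 value quoted above follows exactly this definition and unit is not stated, so the power, ENOB, and sample-rate inputs are placeholders.

```python
def walden_fom_pj(power_w, enob_bits, sample_rate_hz):
    """Walden ADC figure of merit: energy per conversion step in pJ."""
    return power_w / (2 ** enob_bits * sample_rate_hz) * 1e12

# Placeholder example: 8.2 mW total PGA+ADC power, ENOB = 8, 50 MS/s sample rate.
print(round(walden_fom_pj(8.2e-3, 8, 50e6), 3), "pJ/conversion-step")
```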
Contributors: Zhao, Tong (Author) / Barnaby, Hugh (Thesis advisor) / Mikkola, Esko (Thesis advisor) / Bakkaloglu, Bertan (Committee member) / Arizona State University (Publisher)
Created: 2017