Matching Items (21)

Simulation models for programmable metallization cells

Description

Advances in software and applications continue to demand advances in memory. The ideal memory would be non-volatile and have maximal capacity, speed, retention time, endurance, and radiation hardness while also having minimal physical size, energy usage, and cost. The programmable

Advances in software and applications continue to demand advances in memory. The ideal memory would be non-volatile and have maximal capacity, speed, retention time, endurance, and radiation hardness while also having minimal physical size, energy usage, and cost. The programmable metallization cell (PMC) is an emerging memory technology that is likely to surpass flash memory in all the listed ideal memory characteristics. A comprehensive physics-based model is needed to fully understand PMC operation and aid in design optimization. With the intent of advancing the PMC modeling effort, this thesis presents two simulation models for the PMC. The first model is a finite element model based on Silvaco Atlas finite element analysis software. Limitations of the software are identified that make this model inconsistent with the operating mechanism of the PMC. The second model is a physics-based numerical model developed for the PMC. This model is successful in matching data measured from a chalcogenide glass PMC designed and manufactured at ASU. Matched operating characteristics observable in the current and resistance vs. voltage data include the OFF/ON resistances and write/erase and electrodeposition voltage thresholds. Multilevel programming is also explained and demonstrated with the numerical model. The numerical model has already proven useful by revealing some information presented about the operation and characteristics of the PMC.

Contributors

Agent

Created

Date Created
2013

152288-Thumbnail Image.png

Resistance switching in chalcogenide based programmable metallization cells (PMC) and sensors under gamma-rays

Description

Chalcogenide glass (ChG) materials have gained wide attention because of their applications in conductive bridge random access memory (CBRAM), phase change memories (PC-RAM), optical rewritable disks (CD-RW and DVD-RW), microelectromechanical systems (MEMS), microfluidics, and optical communications. One of the significant

Chalcogenide glass (ChG) materials have gained wide attention because of their applications in conductive bridge random access memory (CBRAM), phase change memories (PC-RAM), optical rewritable disks (CD-RW and DVD-RW), microelectromechanical systems (MEMS), microfluidics, and optical communications. One of the significant properties of ChG materials is the change in the resistivity of the material when a metal such as Ag or Cu is added to it by diffusion. This study demonstrates the potential radiation-sensing capabilities of two metal/chalcogenide glass device configurations. Lateral and vertical device configurations sense the radiation-induced migration of Ag+ ions in germanium selenide glasses via changes in electrical resistance between electrodes on the ChG. Before irradiation, these devices exhibit a high-resistance `OFF-state' (in the order of 10E12) but following irradiation, with either 60-Co gamma-rays or UV light, their resistance drops to a low-resistance `ON-state' (around 10E3). Lateral devices have exhibited cyclical recovery with room temperature annealing of the Ag doped ChG, which suggests potential uses in reusable radiation sensor applications. The feasibility of producing inexpensive flexible radiation sensors has been demonstrated by studying the effects of mechanical strain and temperature stress on sensors formed on flexible polymer substrate. The mechanisms of radiation-induced Ag/Ag+ transport and reactions in ChG have been modeled using a finite element device simulator, ATLAS. The essential reactions captured by the simulator are radiation-induced carrier generation, combined with reduction/oxidation for Ag species in the chalcogenide film. Metal-doped ChGs are solid electrolytes that have both ionic and electronic conductivity. The ChG based Programmable Metallization Cell (PMC) is a technology platform that offers electric field dependent resistance switching mechanisms by formation and dissolution of nano sized conductive filaments in a ChG solid electrolyte between oxidizable and inert electrodes. This study identifies silver anode agglomeration in PMC devices following large radiation dose exposure and considers device failure mechanisms via electrical and material characterization. The results demonstrate that by changing device structural parameters, silver agglomeration in PMC devices can be suppressed and reliable resistance switching may be maintained for extremely high doses ranging from 4 Mrad(GeSe) to more than 10 Mrad (ChG).

Contributors

Agent

Created

Date Created
2013

153070-Thumbnail Image.png

Construction of GCCFG for inter-procedural optimizations in Software Managed Manycore (SMM)

Description

Software Managed Manycore (SMM) architectures - in which each core has only a scratch pad memory (instead of caches), - are a promising solution for scaling memory hierarchy to hundreds of cores. However, in these architectures, the code and data

Software Managed Manycore (SMM) architectures - in which each core has only a scratch pad memory (instead of caches), - are a promising solution for scaling memory hierarchy to hundreds of cores. However, in these architectures, the code and data of the tasks mapped to the cores must be explicitly managed in the software by the compiler. State-of-the-art compiler techniques for SMM architectures require inter-procedural information and analysis. A call graph of the program does not have enough information, and Global CFG, i.e., combining all the control flow graphs of the program has too much information, and becomes too big. As a result, most new techniques have informally defined and used GCCFG (Global Call Control Flow Graph) - a whole program representation which captures the control-flow as well as function call information in a succinct way - to perform inter-procedural analysis. However, how to construct it has not been shown yet. We find that for several simple call and control flow graphs, constructing GCCFG is relatively straightforward, but there are several cases in common applications where unique graph transformation is needed in order to formally and correctly construct the GCCFG. This paper fills this gap, and develops graph transformations to allow the construction of GCCFG in (almost) all cases. Our experiments show that by using succinct representation (GCCFG) rather than elaborate representation (GlobalCFG), the compilation time of state-of-the-art code management technique [4] can be improved by an average of 5X, and that of stack management [20] can be improved by an average of 4X.

Contributors

Agent

Created

Date Created
2014

154501-Thumbnail Image.png

Analysis and design of native file system enhancements for storage class memory

Description

As persistent non-volatile memory solutions become integrated in the computing ecosystem and landscape, traditional commodity file systems architected and developed for traditional block I/O based memory solutions must be reevaluated. A majority of commodity file systems have been architected

As persistent non-volatile memory solutions become integrated in the computing ecosystem and landscape, traditional commodity file systems architected and developed for traditional block I/O based memory solutions must be reevaluated. A majority of commodity file systems have been architected and designed with the goal of managing data on non-volatile storage devices such as hard disk drives (HDDs) and solid state drives (SSDs). HDDs and SSDs are attached to a computing system via a controller or I/O hub, often referred to as the southbridge. The point of HDD and SSD attachment creates multiple levels of translation for any data managed by the CPU that must be stored in non-volatile memory (NVM) on an HDD or SSD. Storage Class Memory (SCM) devices provide the ability to store data at the CPU and DRAM level of a computing system. A novel set of modifications to the ext2 and ext4 commodity file systems to address the needs of SCM will be presented and discussed. An in-depth analysis of many existing file systems, from multiple sources, will be presented along with an analysis to identify key modifications and extensions that would be necessary to execute file system on SCM devices. From this analysis, modifications and extensions have been applied to the FAT commodity file system for key functional tests that will be presented to demonstrate the operation and execution of the file system extensions.

Contributors

Agent

Created

Date Created
2016

151200-Thumbnail Image.png

Compilation of stream programs onto embedded multicore architectures

Description

In recent years, we have observed the prevalence of stream applications in many embedded domains. Stream programs distinguish themselves from traditional sequential programming languages through well defined independent actors, explicit data communication, and stable code/data access patterns. In order to

In recent years, we have observed the prevalence of stream applications in many embedded domains. Stream programs distinguish themselves from traditional sequential programming languages through well defined independent actors, explicit data communication, and stable code/data access patterns. In order to achieve high performance and low power, scratch pad memory (SPM) has been introduced in today's embedded multicore processors. Current design frameworks for developing stream applications on SPM enhanced embedded architectures typically do not include a compiler that can perform automatic partitioning, mapping and scheduling under limited on-chip SPM capacities and memory access delays. Consequently, many designs are implemented manually, which leads to lengthy tasks and inferior designs. In this work, optimization techniques that automatically compile stream programs onto embedded multi-core architectures are proposed. As an initial case study, we implemented an automatic target recognition (ATR) algorithm on the IBM Cell Broadband Engine (BE). Then integer linear programming (ILP) and heuristic approaches were proposed to schedule stream programs on a single core embedded processor that has an SPM with code overlay. Later, ILP and heuristic approaches for Compiling Stream programs on SPM enhanced Multicore Processors (CSMP) were studied. The proposed CSMP ILP and heuristic approaches do not optimize for cycles in stream applications. Further, the number of software pipeline stages in the implementation is dependent on actor to processing engine (PE) mapping and is uncontrollable. We next presented a Retiming technique for Throughput optimization on Embedded Multi-core processors (RTEM). RTEM approach inherently handles cycles and can accept an upper bound on the number of software pipeline stages to be generated. We further enhanced RTEM by incorporating unrolling (URSTEM) that preserves all the beneficial properties of RTEM heuristic and also scales with the number of PEs through unrolling.

Contributors

Agent

Created

Date Created
2012

150204-Thumbnail Image.png

Programmable metallization cell devices for flexible electronics

Description

Programmable metallization cell (PMC) technology is based on an electrochemical phenomenon in which a metallic electrodeposit can be grown or dissolved between two electrodes depending on the voltage applied between them. Devices based on this phenomenon exhibit a unique, self-healing

Programmable metallization cell (PMC) technology is based on an electrochemical phenomenon in which a metallic electrodeposit can be grown or dissolved between two electrodes depending on the voltage applied between them. Devices based on this phenomenon exhibit a unique, self-healing property, as a broken metallic structure can be healed by applying an appropriate voltage between the two broken ends. This work explores methods of fabricating interconnects and switches based on PMC technology on flexible substrates. The objective was the evaluation of the feasibility of using this technology in flexible electronics applications in which reliability is a primary concern. The re-healable property of the interconnect is characterized for the silver doped germanium selenide (Ag-Ge-Se) solid electrolyte system. This property was evaluated by measuring the resistances of the healed interconnect structures and comparing these to the resistances of the unbroken structures. The reliability of the interconnects in both unbroken and healed states is studied by investigating the resistances of the structures to DC voltages, AC voltages and different temperatures as a function of time. This work also explores replacing silver with copper for these interconnects to enhance their reliability. A model for PMC-based switches on flexible substrates is proposed and compared to the observed device behavior with the objective of developing a formal design methodology for these devices. The switches were subjected to voltage sweeps and their resistance was investigated as a function of sweep voltage. The resistance of the switches as a function of voltage pulse magnitude when placed in series with a resistance was also investigated. A model was then developed to explain the behavior of these devices. All observations were based on statistical measurements to account for random errors. The results of this work demonstrate that solid electrolyte based interconnects display self-healing capability, which depends on the applied healing voltage and the current limit. However, they fail at lower current densities than metal interconnects due to an ion-drift induced failure mechanism. The results on the PMC based switches demonstrate that a model comprising a Schottky diode in parallel with a variable resistor predicts the behavior of the device.

Contributors

Agent

Created

Date Created
2011

150476-Thumbnail Image.png

Multidimensional DFT IP generators for FPGA platforms

Description

Multidimensional (MD) discrete Fourier transform (DFT) is a key kernel algorithm in many signal processing applications, such as radar imaging and medical imaging. Traditionally, a two-dimensional (2-D) DFT is computed using Row-Column (RC) decomposition, where one-dimensional (1-D) DFTs are computed

Multidimensional (MD) discrete Fourier transform (DFT) is a key kernel algorithm in many signal processing applications, such as radar imaging and medical imaging. Traditionally, a two-dimensional (2-D) DFT is computed using Row-Column (RC) decomposition, where one-dimensional (1-D) DFTs are computed along the rows followed by 1-D DFTs along the columns. However, architectures based on RC decomposition are not efficient for large input size data which have to be stored in external memories based Synchronous Dynamic RAM (SDRAM). In this dissertation, first an efficient architecture to implement 2-D DFT for large-sized input data is proposed. This architecture achieves very high throughput by exploiting the inherent parallelism due to a novel 2-D decomposition and by utilizing the row-wise burst access pattern of the SDRAM external memory. In addition, an automatic IP generator is provided for mapping this architecture onto a reconfigurable platform of Xilinx Virtex-5 devices. For a 2048x2048 input size, the proposed architecture is 1.96 times faster than RC decomposition based implementation under the same memory constraints, and also outperforms other existing implementations. While the proposed 2-D DFT IP can achieve high performance, its output is bit-reversed. For systems where the output is required to be in natural order, use of this DFT IP would result in timing overhead. To solve this problem, a new bandwidth-efficient MD DFT IP that is transpose-free and produces outputs in natural order is proposed. It is based on a novel decomposition algorithm that takes into account the output order, FPGA resources, and the characteristics of off-chip memory access. An IP generator is designed and integrated into an in-house FPGA development platform, AlgoFLEX, for easy verification and fast integration. The corresponding 2-D and 3-D DFT architectures are ported onto the BEE3 board and their performance measured and analyzed. The results shows that the architecture can maintain the maximum memory bandwidth throughout the whole procedure while avoiding matrix transpose operations used in most other MD DFT implementations. The proposed architecture has also been ported onto the Xilinx ML605 board. When clocked at 100 MHz, 2048x2048 images with complex single-precision can be processed in less than 27 ms. Finally, transpose-free imaging flows for range-Doppler algorithm (RDA) and chirp-scaling algorithm (CSA) in SAR imaging are proposed. The corresponding implementations take advantage of the memory access patterns designed for the MD DFT IP and have superior timing performance. The RDA and CSA flows are mapped onto a unified architecture which is implemented on an FPGA platform. When clocked at 100MHz, the RDA and CSA computations with data size 4096x4096 can be completed in 323ms and 162ms, respectively. This implementation outperforms existing SAR image accelerators based on FPGA and GPU.

Contributors

Agent

Created

Date Created
2012

149580-Thumbnail Image.png

Improving code overlay performance by pre-fetching in scratch pad memory systems

Description

Advances in electronics technology and innovative manufacturing processes have driven the semiconductor industry towards extensive miniaturization & ever greater integration of chip design. One consequence of this sustained evolution has been the growing relative cost of accessing off-chip components with

Advances in electronics technology and innovative manufacturing processes have driven the semiconductor industry towards extensive miniaturization & ever greater integration of chip design. One consequence of this sustained evolution has been the growing relative cost of accessing off-chip components with external memory being one of the dominant contributors. In embedded systems and applications, where power consumption and cost are extremely crucial factors, the use of on chip Scratch Pad Memories (SPMs) has proven to be a good alternative to caches. SPMs are more efficient than on-chip caches in a wide variety of aspects including energy consumption, power dissipation, speed performance, area, and timing predictability. However, at the same time, they entail explicit software-level management. Specifically, the system performance depends upon overlay scheme for mapping code and data onto the size-limited SPMs. It has been found that for applications with large code sizes, the overlay overhead cost becomes significant. This work aims to evaluate and implement pre-fetching as a performance improvement technique for SPMs. It is implemented in code overlay manager, provided with the Cell Broadband Engine (CBE) Synergistic Processing Unit (SPU) compiler from IBM, spu-gcc. Four different approaches proposed in this work use profiling information to predict pre-fetch calls. The pre-fetching technique achieves considerable performance improvement by hiding some of the code overlay cost behind active computations by fetching the required code segment in advance into SPM. Experimental results supporting this claim are obtained using the IBM Cell architecture platform with substantial gain of more than 30%.

Contributors

Agent

Created

Date Created
2011

149552-Thumbnail Image.png

Kinetics of programmable metallization cell memory

Description

Programmable Metallization Cell (PMC) technology has been shown to possess the necessary qualities for it to be considered as a leading contender for the next generation memory. These qualities include high speed and endurance, extreme scalability, ease of fabrication, ultra

Programmable Metallization Cell (PMC) technology has been shown to possess the necessary qualities for it to be considered as a leading contender for the next generation memory. These qualities include high speed and endurance, extreme scalability, ease of fabrication, ultra low power operation, and perhaps most importantly ease of integration with the CMOS back end of line (BEOL) process flow. One area where detailed study is lacking is the reliability of PMC devices. In previous reliability work, the low and high resistance states were monitored for periods of hours to days without any applied voltage and the results were extrapolated to several years (>10) but little has been done to analyze the low resistance state under stress. With or without stress, the low resistance state appears to be highly stable but a gradual increase in resistance with time, less than one order of magnitude after ten years when extrapolated, has been observed. It is important to understand the physics behind this resistance rise mechanism to comprehend the reliability issues associated with the low resistance state. This is also related to the erase process in PMC cells where the transition from the ON to OFF state occurs under a negative voltage. Hence it is important to investigate this erase process in PMC cells under different conditions and to model it. Analyzing the programming and the erase operations separately is important for any memory technology but its ability to cycle efficiently (reliably) at low voltages and for more than 1E4 cycles (without affecting the cells performance) is more critical. Future memory technologies must operate with the low power supply voltages (<1V) required for small geometry nodes. Low voltage programming of PMC memory devices has previously been demonstrated using slow voltage sweeps and small numbers of fast pulses. In this work PMC memory cells were cycled at low voltages using symmetric pulses with different load resistances and the distribution of the ON and OFF resistances was analyzed. The effect of the program current used during the program-erase cycling on the resulting resistance distributions is also investigated. Finally the variation found in the behavior of similar resistance ON states in PMC cells was analyzed more in detail and measures to reduce this variation were looked into. It was found that slow low current programming helped reducing the variation in erase times of similar resistance ON states in PMC cells. This scheme was also used as a pre-conditioning technique and the improvements in subsequent cycling behavior were compared.

Contributors

Agent

Created

Date Created
2011

149321-Thumbnail Image.png

NVM challenges in medical devices

Description

Electronic devices are gaining an increasing market share in the medical field. Medical devices are becoming more sophisticated, and encompassing more applications. Unlike consumer electronics, medical devices have far more limitations when it comes to area, power and most importantly

Electronic devices are gaining an increasing market share in the medical field. Medical devices are becoming more sophisticated, and encompassing more applications. Unlike consumer electronics, medical devices have far more limitations when it comes to area, power and most importantly reliability. The medical devices industry has recently seen the advantages of using Flash memory instead of Read Only Memory (ROM) for firmware storage, and in some cases to replace Electrically Programmable Read Only Memories (EEPROMs) in medical devices for frequent data storage. There are direct advantages to using Flash memory instead of Read Only Memory, most importantly the fact that firmware can be rewritten along the development cycle and in the field. However, Flash technology requires high voltage circuitry that makes it harder to integrate into low power devices. There have been a lot of advances in Non-Volatile Memory (NVM) technologies, and many Flash rivals are starting to gain attention. The purpose of this thesis is to evaluate these new technologies against Flash to determine the feasibility as well as the advantages of each technology. The focus is on embedded memory in a medical device micro-controller and application specific integrated circuits (ASIC). A behavioral model of a Programmable Metallization Cell (PMC) was used to simulate the behavior and determine the advantages of using PMC technology versus flash. When compared to flash test data, PMC based embedded memory showed a reduction in power consumption by many orders of magnitude. Analysis showed that an approximated 20% device longevity increase can be achieved by using embedded PMC technology.

Contributors

Agent

Created

Date Created
2010