This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about each thesis or dissertation includes degree information, committee members, an abstract, and any supporting data or media.

In addition to the electronic theses in the ASU Digital Repository, ASU Theses and Dissertations can also be found in the ASU Library Catalog.

Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection, visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.

Description
Developing non-traditional device models is gaining popularity as silicon-based electrical devices approach their scaling limits. Membrane systems, also called P systems, are a new class of biological computation models inspired by the way cells process chemical signals. Spiking Neural P systems (SNP systems), a particular kind of membrane system, are inspired by the way neurons in the brain interact using electrical spikes. Compared to traditional Boolean logic, SNP systems not only perform similar functions but also provide a more promising solution for reliable computation. Two basic neuron types, Low Pass (LP) neurons and High Pass (HP) neurons, are introduced. These two basic neuron types can be used to build an arbitrary SNP neuron, which leads to the conclusion that they are Turing complete, since SNP systems have been proved Turing complete. The two basic neuron types are further used as elements to construct general-purpose arithmetic circuits, such as adders, subtractors, and comparators. In this thesis, erroneous behaviors of neurons are discussed. Transmission error (spike loss) is proved to be equivalent to threshold error, which makes the discussion of threshold error more universal. To improve reliability, a new structure called a motif is proposed. Compared to Triple Modular Redundancy, the motif design demonstrates its efficiency and effectiveness in both single-neuron and arithmetic-circuit analyses. DRAM-based CMOS circuits are used to implement the two basic neuron types, and their functionality is verified using SPICE simulations. The motif-improved adder and comparator, as compared to conventional Boolean logic designs, are much more reliable, with lower leakage and smaller silicon area. This leads to the conclusion that SNP systems could provide a more promising solution for reliable computation than conventional Boolean logic.
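As a rough illustration of the threshold-neuron idea behind the LP/HP construction, the sketch below builds Boolean gates from spike-counting neurons. The firing rules, class design, and gate thresholds are simplified assumptions for illustration, not the thesis's formal SNP definitions.

```python
# Illustrative sketch only: the LP/HP firing rules below are assumptions,
# not the thesis's definitions.

class SpikingNeuron:
    """Threshold neuron that counts input spikes in one time step."""
    def __init__(self, threshold, kind):
        self.threshold = threshold
        self.kind = kind  # "LP" or "HP"

    def fire(self, spikes_in):
        n = sum(spikes_in)
        if self.kind == "HP":   # fires when enough spikes arrive
            return 1 if n >= self.threshold else 0
        else:                   # "LP": fires when few spikes arrive
            return 1 if n < self.threshold else 0

# Boolean gates from threshold neurons (a standard threshold-logic fact):
AND = SpikingNeuron(2, "HP")   # fires only if both inputs spike
OR  = SpikingNeuron(1, "HP")   # fires if at least one input spikes
NOT = SpikingNeuron(1, "LP")   # fires only if the input is silent

for a in (0, 1):
    for b in (0, 1):
        print(a, b, AND.fire([a, b]), OR.fire([a, b]), NOT.fire([a]))
```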
Contributors: An, Pei (Author) / Cao, Yu (Thesis advisor) / Barnaby, Hugh (Committee member) / Chakrabarti, Chaitali (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Integrated photonics requires high-gain optical materials in the telecom wavelength range for optical amplifiers and coherent light sources. Erbium (Er) containing materials are ideal candidates due to the 1.5 μm emission from Er^3+ ions. However, the Er density in typical Er-doped materials is less than 1 × 10^20 cm^-3, limiting the maximum optical gain to a few dB/cm, too small to be useful for integrated photonics applications. Er compounds could potentially solve this problem since they contain much higher Er densities. So far, existing Er compounds have suffered from short lifetimes and strong upconversion effects, mainly due to the poor quality of crystals produced by various methods of thin-film growth and deposition. This dissertation explores a new Er compound, erbium chloride silicate (ECS, Er3(SiO4)2Cl), in nanowire form, which facilitates the growth of high-quality single crystals. Growth methods for such single-crystal ECS nanowires have been established, and various structural and optical characterizations have been carried out. The high crystal quality of the ECS material leads to a long lifetime of the first excited state of Er^3+ ions, up to 1 ms at Er densities above 10^22 cm^-3. This Er lifetime-density product was found to be the largest among all Er-containing materials. A unique integrating-sphere method was developed to measure the absorption cross section of ECS nanowires from 440 to 1580 nm. Pump-probe experiments demonstrated a 644 dB/cm signal enhancement from a single ECS wire. It was estimated that such large signal enhancement can overcome the absorption to yield a net material gain, but is not sufficient to compensate for waveguide propagation loss. In order to suppress the upconversion process in ECS, ytterbium (Yb) and yttrium (Y) ions were introduced as substituents for Er in the ECS crystal structure to reduce the Er density. While the addition of Yb ions was only partially successful, erbium yttrium chloride silicate (EYCS) with controllable Er density was synthesized successfully. EYCS with 30 at. % Er was found to be the best: it shows the strongest PL emission at 1.5 μm and can potentially be used as a high-gain material.
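To see why an Er density below 1 × 10^20 cm^-3 caps the gain at a few dB/cm, a back-of-the-envelope estimate is sketched below; the emission cross section used is a typical literature value, not a number from the dissertation.

```python
import math

# Small-signal material gain g = sigma_e * N (cm^-1), converted to dB/cm.
SIGMA_E = 5e-21   # assumed Er3+ emission cross section near 1.5 um, cm^2 (typical literature value)
N_DOPED = 1e20    # Er density of a typical Er-doped material, cm^-3
N_ECS   = 1e22    # Er density quoted for ECS, cm^-3

def gain_db_per_cm(sigma, n):
    return 10.0 / math.log(10.0) * sigma * n   # 4.34 * g[cm^-1]

print(f"Er-doped material: {gain_db_per_cm(SIGMA_E, N_DOPED):.1f} dB/cm")  # ~2 dB/cm
print(f"ECS upper bound:   {gain_db_per_cm(SIGMA_E, N_ECS):.0f} dB/cm")
```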
Contributors: Yin, Leijun (Author) / Ning, Cun-Zheng (Thesis advisor) / Chamberlin, Ralph (Committee member) / Yu, Hongbin (Committee member) / Menéndez, Jose (Committee member) / Ponce, Fernando (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
With increasing transistor counts and shrinking feature sizes, reducing power consumption has become a major design constraint. This has given rise to aggressive architectural changes for on-chip power management and to rapid development of energy-efficient hardware accelerators. Accordingly, the objective of this research is to help software developers leverage these hardware techniques and improve the energy efficiency of the system. To achieve this, I propose two solutions for the Linux kernel. First, optimal use of these architectural enhancements requires accurate modeling of processor power consumption. Though many models of processor power consumption exist in the literature, few capture power consumption at the task level. Task-level energy models are a requirement for an operating system (OS) to perform real-time power management, since the OS time-multiplexes tasks to enable sharing of hardware resources. I propose a detailed design methodology for constructing an architecture-agnostic task-level power model and incorporating it into a modern operating system to build an online task-level power profiler. The profiler is implemented inside the latest Linux kernel and validated on an Intel Sandy Bridge processor. It has a negligible overhead of less than 1% of hardware resource consumption, and its power predictions were demonstrated on application benchmarks from SPEC and PARSEC with less than 4% error. I also demonstrate the importance of the proposed profiler for emerging architectural techniques through use-case scenarios, including heterogeneous computing and fine-grained per-core DVFS. Second, along with architectural enhancements in general-purpose processors, hardware accelerators such as coarse-grained reconfigurable architectures (CGRAs) are gaining popularity. Unlike vector processors, which rely on data parallelism, CGRAs provide greater flexibility and compiler-level control, making them more suitable for the present SoC environment. To provide a streamlined development environment for CGRAs, I propose a flexible framework in Linux for CGRA design space exploration. With accurate and flexible hardware models, fine-grained integration with an accurate architectural simulator, and Linux memory management and DMA support, a user can carry out extensive CGRA experiments in a full-system environment.
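A task-level power model of this kind is commonly built as a regression from per-task performance-counter rates to power. The sketch below illustrates the idea; the counter set, the linear form, and the calibration data are illustrative assumptions, not the dissertation's trained model.

```python
# Minimal sketch of an architecture-agnostic task-level power model:
# per-task power is estimated as a linear combination of per-task
# performance-counter rates. Counter choice and coefficients are
# illustrative assumptions, not the dissertation's trained model.
import numpy as np

COUNTERS = ["cycles", "instructions", "llc_misses", "mem_accesses"]

def fit_power_model(counter_rates, measured_power):
    """Least-squares fit: power ~ w . rates + idle_power."""
    X = np.hstack([counter_rates, np.ones((len(counter_rates), 1))])
    w, *_ = np.linalg.lstsq(X, measured_power, rcond=None)
    return w

def task_power(w, task_rates):
    return float(w[:-1] @ task_rates + w[-1])

# Calibration: system-wide counter rates vs. wall-power samples (synthetic here)
rng = np.random.default_rng(0)
X = rng.random((100, len(COUNTERS)))
P = X @ [3.0, 1.5, 8.0, 4.0] + 20.0           # synthetic ground truth
w = fit_power_model(X, P)
print(task_power(w, np.array([0.5, 0.2, 0.1, 0.3])))  # watts attributed to one task
```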
Contributors: Desai, Digant Pareshkumar (Author) / Vrudhula, Sarma (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Wu, Carole-Jean (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Multicore processors have proliferated in nearly all forms of computing, from servers and desktops to smartphones. The primary reason for this broad adoption is their ability to overcome the power wall by providing higher performance at lower power consumption. With multicores, the need for dynamic energy management (DEM) is far greater than for single-core processors: DEM is no longer just a mechanism to keep a processor under specified temperature limits, but a set of techniques that manage processor controls such as dynamic voltage and frequency scaling (DVFS), task migration, and fan speed to achieve a stated objective. The objectives span a wide range, from maximizing throughput, energy efficiency, or processor reliability to minimizing power consumption or peak temperature, subject to even broader temperature, power, timing, and reliability constraints. DEM can thus be complex and challenging. Since many DEM techniques often operate together on a single processor, there is a need to unify them, and this dissertation addresses that need. A DEM framework is proposed that provides a unifying processor model incorporating power, thermal, timing, and reliability models, and that supports various DEM control mechanisms and many different objective functions along with equally diverse constraint specifications. Using the framework, a range of novel solutions is derived for instances of DEM problems, including maximizing processor performance or energy efficiency and minimizing power consumption or peak temperature under constraints on maximum temperature, memory reliability, and task deadlines. Finally, a robust closed-loop controller is proposed to implement these solutions on a real processor platform with very low operational overhead. Along with the controller design, a model-identification methodology for obtaining the power and thermal models required by the controller is discussed. The controller is architecture independent and hence easily portable across platforms; it has been successfully deployed on an Intel Sandy Bridge processor, where it increased the energy efficiency of the processor by over 30%.
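The closed-loop idea can be illustrated with a simple proportional controller that steps the core frequency to hold temperature near a setpoint. The gain, P-state table, and toy thermal plant below are illustrative assumptions, not the dissertation's identified models or control law.

```python
# Minimal sketch of a closed-loop DVFS thermal controller: a proportional
# controller steps the core frequency to hold temperature near a setpoint.
# The P-state table, gain, and toy thermal plant are assumptions.

FREQS_GHZ = [1.2, 1.6, 2.0, 2.4, 2.8, 3.2]  # assumed P-state table

def dvfs_step(temp_c, setpoint_c, freq_idx, gain=1.0):
    """Return the next frequency index given the temperature error."""
    error = setpoint_c - temp_c                # positive -> thermal headroom
    step = int(round(gain * error / 5.0))      # ~one P-state per 5 degC of error
    return max(0, min(len(FREQS_GHZ) - 1, freq_idx + step))

# Toy plant: temperature relaxes toward an equilibrium set by the frequency.
temp, idx = 45.0, len(FREQS_GHZ) - 1
for _ in range(20):
    idx = dvfs_step(temp, setpoint_c=70.0, freq_idx=idx)
    equilibrium = 35.0 + 14.0 * FREQS_GHZ[idx]  # assumed thermal model
    temp += 0.3 * (equilibrium - temp)
    print(f"{FREQS_GHZ[idx]:.1f} GHz, {temp:5.1f} degC")
```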
Contributors: Hanumaiah, Vinay (Author) / Vrudhula, Sarma (Thesis advisor) / Chatha, Karamvir (Committee member) / Chakrabarti, Chaitali (Committee member) / Rodriguez, Armando (Committee member) / Askin, Ronald (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Adaptive processing and classification of electrocardiogram (ECG) signals are important for eliminating the strenuous process of manually annotating ECG recordings for clinical use. Such algorithms require robust models whose parameters can adequately describe the ECG signals. Although different dynamic statistical models describing ECG signals currently exist, they depend considerably on a priori information and user-specified model parameters. Also, ECG beat morphologies, which vary greatly across patients and disease states, cannot be uniquely characterized by a single model. In this work, sequential Bayesian methods are used to appropriately model ECG signals and adaptively select the corresponding model parameters. An adaptive framework based on a sequential Bayesian tracking method is proposed to adaptively select the cardiac parameters that minimize the estimation error, thus precluding the need for pre-processing. Simulations using real ECG data from the online PhysioNet database demonstrate the improved performance of the proposed algorithm in accurately estimating critical heart-disease parameters. In addition, two new approaches to ECG modeling are presented using the interacting multiple model and the sequential Markov chain Monte Carlo technique with adaptive model selection. Both methods can adaptively choose between different models for various ECG beat morphologies without requiring prior ECG information, as demonstrated on real ECG signals. A supervised Bayesian maximum-likelihood (ML) classifier uses the estimated model parameters to classify different types of cardiac arrhythmias. However, the lack of sufficient representative training data and the large inter-patient variability pose a challenge to existing supervised learning algorithms, resulting in poor classification performance. In addition, recently developed unsupervised learning methods require a priori knowledge of the number of diseases in order to cluster the ECG data, a number which often evolves over time. To address these issues, an adaptive-learning ECG classification method that uses Dirichlet process Gaussian mixture models is proposed. This approach places no restriction on the number of disease classes and requires no training data. The algorithm is adapted to be patient-specific by labeling or identifying the generated mixtures using the Bayesian ML method, assuming the availability of labeled training data.
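A Dirichlet process Gaussian mixture of the kind described can be sketched with scikit-learn's truncated DP mixture, which discovers the number of active clusters from the data. The two-dimensional stand-in features below are illustrative assumptions, not the dissertation's ECG model parameters.

```python
# Minimal sketch of Dirichlet-process clustering of beat features, using
# scikit-learn's truncated DP Gaussian mixture. Feature choice and data
# are illustrative assumptions, not the dissertation's ECG features.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(1)
# Stand-in beat features (e.g., per-beat morphology parameters), 3 latent classes
X = np.vstack([rng.normal(m, 0.3, size=(100, 2)) for m in (0.0, 2.0, 4.0)])

dpgmm = BayesianGaussianMixture(
    n_components=10,                                   # truncation level, not class count
    weight_concentration_prior_type="dirichlet_process",
    weight_concentration_prior=0.1,                    # small -> fewer active clusters
    covariance_type="full",
    random_state=0,
).fit(X)

labels = dpgmm.predict(X)
print("clusters discovered:", len(np.unique(labels)))  # typically 3, with no preset count
```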
Contributors: Edla, Shwetha Reddy (Author) / Papandreou-Suppappola, Antonia (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Kovvali, Narayan (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Arizona State University (Publisher)
Created: 2012
Description
In this dissertation, remote plasma interactions with the surfaces of low-k interlayer dielectric (ILD), Cu, and Cu adhesion layers are investigated. The first part of the study focuses on the simultaneous plasma treatment of ILD and chemical mechanical polishing (CMP) Cu surfaces using N2/H2 plasma processes. H atoms and radicals in the plasma react with the carbon groups, leading to carbon removal from the ILD films. Results indicate that an N2 plasma forms an amide-like layer on the surface, which apparently reduces carbon abstraction by a subsequent H2 plasma process. In addition, FTIR spectra indicate the formation of hydroxyl (Si-OH) groups following plasma exposure. Processing at increased temperature (380 °C) reduces hydroxyl-group formation compared to ambient-temperature processes, resulting in smaller changes in the dielectric constant. For CMP Cu surfaces, carbonate contamination was removed by an H2 plasma process at elevated temperature, while C-C and C-H contamination was removed by an N2 plasma process at elevated temperature. The second part of this study examined oxide stability and cleaning of Ru surfaces, as well as the thermal stability of Cu films on the Ru layers. The ~2-monolayer native Ru oxide was reduced after H-plasma processing. The thermal stability, or islanding, of the Cu film on the Ru substrate was characterized by in-situ XPS. After plasma cleaning of the Ru adhesion layer, the deposited Cu exhibited full coverage. In contrast, for Cu deposition on the native Ru oxide substrate, Cu islanding was detected and was described in terms of grain-boundary grooving and surface and interface energies. The third part investigated the thermal stability of 7 nm Ti, Pt, and Ru interfacial adhesion layers between a Cu film (10 nm) and a Ta barrier layer (4 nm). The barrier properties and interfacial stability were evaluated by Rutherford backscattering spectrometry (RBS). Atomic force microscopy (AFM) was used to measure the surfaces before and after annealing; all the surfaces were relatively smooth, excluding islanding or de-wetting phenomena as a cause of instability. RBS showed no discernible diffusion across the adhesion layer/Ta and Ta/Si interfaces, which provides a stable underlying layer. For a Ti interfacial layer, RBS indicates that during 400 °C annealing Ti interdiffuses through the Cu film and accumulates at the surface. For the Pt/Cu system, Pt interdiffusion is detected, though it is less pronounced than for Ti. Among the three adhesion-layer candidates, Ru shows negligible diffusion into the Cu film, indicating thermal stability at 400 °C.
Contributors: Liu, Xin (Author) / Nemanich, Robert (Thesis advisor) / Chamberlin, Ralph (Committee member) / Chen, Tingyong (Committee member) / Smith, David (Committee member) / Ponce, Fernando (Committee member) / Arizona State University (Publisher)
Created: 2012
Description
We are expecting hundreds of cores per chip in the near future; however, scaling the memory architecture of manycore processors is a major challenge. Cache coherence provides all cores with a single image of memory at any time during execution, yet coherent cache architectures are not believed to scale to hundreds or thousands of cores. In addition, caches and coherence logic already account for 20-50% of the total power consumption of the processor and 30-60% of the die area. A more scalable architecture is therefore needed for manycore processors. Software Managed Manycore (SMM) architectures have emerged as a solution. They have a scalable memory design in which each core has direct access only to its local scratchpad memory, and any data transfers to or from other memories must be done explicitly in the application using Direct Memory Access (DMA) commands. The lack of automatic memory management in hardware makes such architectures extremely power-efficient, but it also makes them difficult to program. If the code or data of the task mapped onto a core cannot fit in the local scratchpad memory, DMA calls must be added to bring in the code or data before it is required, and it may need to be evicted after use. Doing this adds considerable complexity to the programmer's job: programmers must now worry about data management on top of the already complex problem of functional correctness. This dissertation presents a comprehensive compiler and runtime integration that automatically manages the code and data of each task in the limited local memory of the core. We first developed Complete Circular Stack Management, which manages stack frames between the local memory and the main memory and also addresses the stack-pointer problem. Though it works, we found that the management could be further optimized for most cases, so Smart Stack Data Management (SSDM) is provided. In this work, we formulate the stack data management problem and propose a greedy algorithm for it. We then propose a general cost-estimation algorithm, based on which the CMSM heuristic for the code-mapping problem is developed. Finally, because heap data is dynamic in nature and therefore hard to manage, we provide two schemes to manage an unlimited amount of heap data in a constant-sized region of local memory. In addition to these separate schemes for different kinds of data, we also provide a memory-partitioning methodology.
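The flavor of circular stack management can be sketched as follows: when a new frame does not fit in the scratchpad region, the oldest frames are evicted to main memory (a DMA transfer on real hardware) and restored on return. Sizes and the eviction policy details are illustrative assumptions, not the dissertation's algorithm.

```python
# Minimal sketch of circular stack management on a scratchpad: when a new
# frame does not fit in the local region, the oldest frames are evicted to
# main memory (a DMA transfer on real hardware). Sizes and policy are
# illustrative assumptions, not the dissertation's algorithm.

LOCAL_BYTES = 4096

local, evicted = [], []      # frames resident in scratchpad / in main memory

def push_frame(name, size):
    while sum(s for _, s in local) + size > LOCAL_BYTES:
        evicted.append(local.pop(0))       # dma_out(oldest frame)
    local.append((name, size))

def pop_frame():
    name, _ = local.pop()
    if not local and evicted:
        local.append(evicted.pop())        # dma_in(caller's frame) on return
    return name

for i, sz in enumerate([1500, 1500, 1500, 1000]):
    push_frame(f"f{i}", sz)
print([n for n, _ in local], [n for n, _ in evicted])
# f0 was evicted to make room; it is fetched back as the call stack unwinds.
```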
Contributors: Bai, Ke (Author) / Shrivastava, Aviral (Thesis advisor) / Chatha, Karamvir (Committee member) / Xue, Guoliang (Committee member) / Chakrabarti, Chaitali (Committee member) / Arizona State University (Publisher)
Created: 2014
Description
In this dissertation, the interface chemistry and electronic structure of plasma-enhanced atomic layer deposited (PEALD) dielectrics on GaN are investigated with x-ray and ultraviolet photoemission spectroscopy (XPS and UPS). Three interrelated issues are discussed: (1) PEALD dielectric growth process optimization, (2) the interface electronic structure of comparative PEALD dielectrics on GaN, and (3) the interface electronic structure of PEALD dielectrics on Ga- and N-face GaN. The first study involved an in-depth case study of PEALD Al2O3 growth using dimethylaluminum isopropoxide, with a special focus on oxygen plasma effects. Saturated, self-limiting growth of Al2O3 films was obtained, with an enhanced growth rate within the PEALD temperature window (25-220 °C). The properties of Al2O3 deposited at various temperatures were characterized to better understand the relation between growth parameters and film properties. In the second study, the interface electronic structures of PEALD dielectrics on Ga-face GaN films were measured. Five promising dielectrics (Al2O3, HfO2, SiO2, La2O3, and ZnO) with a range of band gap energies were chosen. Prior to dielectric growth, a combined wet-chemical and in-situ H2/N2 plasma clean was employed to remove carbon contamination and prepare the surface for deposition. The surface band bending and band offsets of the dielectrics on GaN were measured by XPS and UPS. The trends in the experimental band offsets were related to the dielectric band gap energies, and the experimental band offsets were near the values calculated from the charge neutrality level model. The third study focused on the effect of the polarization bound charge of Ga- and N-face GaN on the interface electronic structure. A surface pretreatment consisting of an NH4OH wet chemical clean and an in-situ NH3 plasma treatment was applied to remove carbon contamination, retain monolayer oxygen coverage, and potentially passivate N-vacancy-related defects. The surface band bending and polarization charge compensation of Ga- and N-face GaN were investigated, and the surface band bending and band offsets were determined for Al2O3, HfO2, and SiO2 on both faces. Different dielectric thicknesses and post-deposition processing were investigated to understand process-related defect formation and/or reduction.
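Band offsets of this kind are typically extracted with Kraut's XPS method, sketched below. The formula is the standard one; all numeric inputs are placeholders rather than measurements from this work, and sign conventions vary in the literature.

```python
# Minimal sketch of Kraut's XPS band-offset extraction. All inputs are
# placeholder values, not measurements from this dissertation.

E_G_GAN = 3.4            # GaN band gap, eV
E_G_AL2O3 = 6.8          # assumed band gap of PEALD Al2O3, eV

def band_offsets(cl_to_vbm_diel, cl_to_vbm_sub, delta_cl_interface,
                 eg_diel, eg_sub=E_G_GAN):
    """Kraut method: returns (valence, conduction) band-offset magnitudes in eV."""
    delta_ev = cl_to_vbm_diel - cl_to_vbm_sub - delta_cl_interface
    delta_ec = eg_diel - eg_sub - delta_ev
    return delta_ev, delta_ec

# Placeholders: Al 2p-to-VBM (bulk Al2O3), Ga 3d-to-VBM (bulk GaN),
# and the Al 2p - Ga 3d core-level separation measured at the interface.
dEv, dEc = band_offsets(71.0, 17.8, 51.7, E_G_AL2O3)
print(f"valence-band offset ~ {dEv:.1f} eV, conduction-band offset ~ {dEc:.1f} eV")
```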
Contributors: Yang, Jialing (Author) / Nemanich, Robert J (Thesis advisor) / Chen, Tingyong (Committee member) / Peng, Xihong (Committee member) / Ponce, Fernando (Committee member) / Smith, David (Committee member) / Arizona State University (Publisher)
Created: 2014
Description
Immunosignaturing is a medical test for assessing the health status of a patient by applying microarrays of random-sequence peptides to determine the patient's immune fingerprint, associating antibodies from a biological sample with immune responses. Immunosignature measurements can potentially provide pre-symptomatic diagnosis for infectious diseases or detection of biological threats. Currently, traditional bioinformatics tools, such as data-mining classification algorithms, are used to process the large amount of peptide microarray data. However, these methods generally require training data and do not adapt to changing immune conditions or additional patient information. This work proposes advanced processing techniques to improve the classification and identification of single and multiple underlying immune-response states embedded in immunosignatures, making it possible to detect both known and previously unknown diseases or biothreat agents. Novel adaptive-learning methodologies for unsupervised and semi-supervised clustering, integrated with immunosignature feature extraction approaches, are proposed. The techniques are based on extracting novel stochastic features from microarray binding intensities and using Dirichlet process Gaussian mixture models to adaptively cluster the immunosignatures in the feature space. This learning-while-clustering approach allows continuous discovery of antibody activity by adaptively detecting new disease states, with limited a priori disease or patient information. A beta process factor analysis model for determining underlying patient immune responses is also proposed to further improve the adaptive clustering performance by forming new relationships between patients and antibody activity. To extend the clustering methods to diagnosing multiple states in a patient, the adaptive hierarchical Dirichlet process is integrated with modified beta process factor analysis latent feature modeling to identify relationships between patients and infectious agents. The use of Bayesian nonparametric adaptive learning techniques allows for further clustering as additional patient data is received. Significant improvements in feature identification and immune-response clustering are demonstrated using samples from patients with different diseases.
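The feature-extraction step can be sketched as computing a few stochastic summary statistics per patient from the raw binding intensities, which would then be clustered with a Dirichlet process mixture. The four moment features and synthetic data below are illustrative assumptions, not the dissertation's exact feature set.

```python
# Minimal sketch of extracting stochastic features from peptide-array
# binding intensities before Dirichlet-process clustering. The four
# moment features are illustrative assumptions, not the dissertation's
# exact feature set.
import numpy as np
from scipy import stats

def immunosignature_features(intensities):
    """Summarize one patient's peptide binding intensities (1-D array)."""
    x = np.log1p(intensities)            # compress the heavy-tailed dynamic range
    return np.array([x.mean(), x.std(),
                     stats.skew(x), stats.kurtosis(x)])

rng = np.random.default_rng(2)
patients = [rng.gamma(shape=k, scale=50.0, size=10000) for k in (1.5, 1.5, 6.0)]
F = np.vstack([immunosignature_features(p) for p in patients])
print(F.round(2))   # rows: patients; the third patient stands apart in feature space
# These feature vectors would then be clustered with a DP Gaussian mixture.
```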
Contributors: Malin, Anna (Author) / Papandreou-Suppappola, Antonia (Thesis advisor) / Bliss, Daniel (Committee member) / Chakrabarti, Chaitali (Committee member) / Kovvali, Narayan (Committee member) / Lacroix, Zoé (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Stream computing has emerged as an important model of computation for embedded system applications, particularly in the multimedia and network-processing domains. In the recent past, several programming languages and embedded multi-core processors have been proposed for streaming applications. This thesis examines the execution and dynamic scheduling of stream programs on embedded multi-core processors. It addresses the problem in the context of a multi-tasking environment with a time-varying allocation of processing elements for a particular streaming application. As a solution, the thesis proposes a two-step approach in which the stream program is first compiled to gather key application information and to generate re-targetable code. A lightweight dynamic scheduler forms the second stage of the approach: it uses the static information and the available resources to assign or partition the application across the multi-core architecture. The objective of the dynamic scheduler is to maximize the throughput of the application, and it is sensitive to the resource constraints (processing elements, scratchpad memory, DMA bandwidth) imposed by the target architecture. We evaluate the proposed approach by compiling and scheduling benchmark stream programs on a representative embedded multi-core processor, and we present experimental results comparing the quality of the generated solutions with existing techniques.
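A throughput-oriented partitioner of this kind can be sketched as a greedy assignment of stream actors to the least-loaded core that still has scratchpad capacity. The actor costs, capacities, and policy below are illustrative assumptions, not the thesis's compiler-derived scheduler.

```python
# Minimal sketch of throughput-oriented dynamic partitioning: stream actors
# are greedily assigned to the currently least-loaded core that still has
# scratchpad capacity. Actor costs and capacities are illustrative
# assumptions, not the thesis's compiler-derived data.

def partition(actors, n_cores, spm_bytes):
    """actors: list of (name, cycles, spm_need). Returns per-core assignment."""
    cores = [{"load": 0, "spm": spm_bytes, "actors": []} for _ in range(n_cores)]
    for name, cycles, spm_need in sorted(actors, key=lambda a: -a[1]):
        fits = [c for c in cores if c["spm"] >= spm_need]
        if not fits:
            raise ValueError(f"{name} fits on no core")
        c = min(fits, key=lambda c: c["load"])  # least-loaded feasible core
        c["load"] += cycles
        c["spm"] -= spm_need
        c["actors"].append(name)
    return cores

actors = [("src", 100, 2048), ("fir", 400, 8192), ("fft", 600, 8192),
          ("dec", 300, 4096), ("sink", 50, 1024)]
for i, c in enumerate(partition(actors, n_cores=2, spm_bytes=16384)):
    print(f"core {i}: {c['actors']}, load={c['load']} cycles")
# Steady-state throughput is bounded by the most-loaded core (the makespan).
```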
Contributors: Lee, Haeseung (Author) / Chatha, Karamvir (Thesis advisor) / Vrudhula, Sarma (Committee member) / Chakrabarti, Chaitali (Committee member) / Wu, Carole-Jean (Committee member) / Arizona State University (Publisher)
Created: 2013