Matching Items (103)
Filtering by

Clear all filters

149992-Thumbnail Image.png
Description
Process variations have become increasingly important for scaled technologies starting at 45nm. The increased variations are primarily due to random dopant fluctuations, line-edge roughness and oxide thickness fluctuation. These variations greatly impact all aspects of circuit performance and pose a grand challenge to future robust IC design. To improve robustness,

Process variations have become increasingly important for scaled technologies starting at 45nm. The increased variations are primarily due to random dopant fluctuations, line-edge roughness and oxide thickness fluctuation. These variations greatly impact all aspects of circuit performance and pose a grand challenge to future robust IC design. To improve robustness, efficient methodology is required that considers effect of variations in the design flow. Analyzing timing variability of complex circuits with HSPICE simulations is very time consuming. This thesis proposes an analytical model to predict variability in CMOS circuits that is quick and accurate. There are several analytical models to estimate nominal delay performance but very little work has been done to accurately model delay variability. The proposed model is comprehensive and estimates nominal delay and variability as a function of transistor width, load capacitance and transition time. First, models are developed for library gates and the accuracy of the models is verified with HSPICE simulations for 45nm and 32nm technology nodes. The difference between predicted and simulated σ/μ for the library gates is less than 1%. Next, the accuracy of the model for nominal delay is verified for larger circuits including ISCAS'85 benchmark circuits. The model predicted results are within 4% error of HSPICE simulated results and take a small fraction of the time, for 45nm technology. Delay variability is analyzed for various paths and it is observed that non-critical paths can become critical because of Vth variation. Variability on shortest paths show that rate of hold violations increase enormously with increasing Vth variation.
ContributorsGummalla, Samatha (Author) / Chakrabarti, Chaitali (Thesis advisor) / Cao, Yu (Thesis advisor) / Bakkaloglu, Bertan (Committee member) / Arizona State University (Publisher)
Created2011
150167-Thumbnail Image.png
Description
Redundant Binary (RBR) number representations have been extensively used in the past for high-throughput Digital Signal Processing (DSP) systems. Data-path components based on this number system have smaller critical path delay but larger area compared to conventional two's complement systems. This work explores the use of RBR number representation for

Redundant Binary (RBR) number representations have been extensively used in the past for high-throughput Digital Signal Processing (DSP) systems. Data-path components based on this number system have smaller critical path delay but larger area compared to conventional two's complement systems. This work explores the use of RBR number representation for implementing high-throughput DSP systems that are also energy-efficient. Data-path components such as adders and multipliers are evaluated with respect to critical path delay, energy and Energy-Delay Product (EDP). A new design for a RBR adder with very good EDP performance has been proposed. The corresponding RBR parallel adder has a much lower critical path delay and EDP compared to two's complement carry select and carry look-ahead adder implementations. Next, several RBR multiplier architectures are investigated and their performance compared to two's complement systems. These include two new multiplier architectures: a purely RBR multiplier where both the operands are in RBR form, and a hybrid multiplier where the multiplicand is in RBR form and the other operand is represented in conventional two's complement form. Both the RBR and hybrid designs are demonstrated to have better EDP performance compared to conventional two's complement multipliers. The hybrid multiplier is also shown to have a superior EDP performance compared to the RBR multiplier, with much lower implementation area. Analysis on the effect of bit-precision is also performed, and it is shown that the performance gain of RBR systems improves for higher bit precision. Next, in order to demonstrate the efficacy of the RBR representation at the system-level, the performance of RBR and hybrid implementations of some common DSP kernels such as Discrete Cosine Transform, edge detection using Sobel operator, complex multiplication, Lifting-based Discrete Wavelet Transform (9, 7) filter, and FIR filter, is compared with two's complement systems. It is shown that for relatively large computation modules, the RBR to two's complement conversion overhead gets amortized. In case of systems with high complexity, for iso-throughput, both the hybrid and RBR implementations are demonstrated to be superior with lower average energy consumption. For low complexity systems, the conversion overhead is significant, and overpowers the EDP performance gain obtained from the RBR computation operation.
ContributorsMahadevan, Rupa (Author) / Chakrabarti, Chaitali (Thesis advisor) / Kiaei, Sayfe (Committee member) / Cao, Yu (Committee member) / Arizona State University (Publisher)
Created2011
150187-Thumbnail Image.png
Description
Genomic and proteomic sequences, which are in the form of deoxyribonucleic acid (DNA) and amino acids respectively, play a vital role in the structure, function and diversity of every living cell. As a result, various genomic and proteomic sequence processing methods have been proposed from diverse disciplines, including biology, chemistry,

Genomic and proteomic sequences, which are in the form of deoxyribonucleic acid (DNA) and amino acids respectively, play a vital role in the structure, function and diversity of every living cell. As a result, various genomic and proteomic sequence processing methods have been proposed from diverse disciplines, including biology, chemistry, physics, computer science and electrical engineering. In particular, signal processing techniques were applied to the problems of sequence querying and alignment, that compare and classify regions of similarity in the sequences based on their composition. However, although current approaches obtain results that can be attributed to key biological properties, they require pre-processing and lack robustness to sequence repetitions. In addition, these approaches do not provide much support for efficiently querying sub-sequences, a process that is essential for tracking localized database matches. In this work, a query-based alignment method for biological sequences that maps sequences to time-domain waveforms before processing the waveforms for alignment in the time-frequency plane is first proposed. The mapping uses waveforms, such as time-domain Gaussian functions, with unique sequence representations in the time-frequency plane. The proposed alignment method employs a robust querying algorithm that utilizes a time-frequency signal expansion whose basis function is matched to the basic waveform in the mapped sequences. The resulting WAVEQuery approach is demonstrated for both DNA and protein sequences using the matching pursuit decomposition as the signal basis expansion. The alignment localization of WAVEQuery is specifically evaluated over repetitive database segments, and operable in real-time without pre-processing. It is demonstrated that WAVEQuery significantly outperforms the biological sequence alignment method BLAST for queries with repetitive segments for DNA sequences. A generalized version of the WAVEQuery approach with the metaplectic transform is also described for protein sequence structure prediction. For protein alignment, it is often necessary to not only compare the one-dimensional (1-D) primary sequence structure but also the secondary and tertiary three-dimensional (3-D) space structures. This is done after considering the conformations in the 3-D space due to the degrees of freedom of these structures. As a result, a novel directionality based 3-D waveform mapping for the 3-D protein structures is also proposed and it is used to compare protein structures using a matched filter approach. By incorporating a 3-D time axis, a highly-localized Gaussian-windowed chirp waveform is defined, and the amino acid information is mapped to the chirp parameters that are then directly used to obtain directionality in the 3-D space. This mapping is unique in that additional characteristic protein information such as hydrophobicity, that relates the sequence with the structure, can be added as another representation parameter. The additional parameter helps tracking similarities over local segments of the structure, this enabling classification of distantly related proteins which have partial structural similarities. This approach is successfully tested for pairwise alignments over full length structures, alignments over multiple structures to form a phylogenetic trees, and also alignments over local segments. Also, basic classification over protein structural classes using directional descriptors for the protein structure is performed.
ContributorsRavichandran, Lakshminarayan (Author) / Papandreou-Suppappola, Antonia (Thesis advisor) / Spanias, Andreas S (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Lacroix, Zoé (Committee member) / Arizona State University (Publisher)
Created2011
152360-Thumbnail Image.png
Description
In this work, we present approximate adders and multipliers to reduce data-path complexity of specialized hardware for various image processing systems. These approximate circuits have a lower area, latency and power consumption compared to their accurate counterparts and produce fairly accurate results. We build upon the work on approximate adders

In this work, we present approximate adders and multipliers to reduce data-path complexity of specialized hardware for various image processing systems. These approximate circuits have a lower area, latency and power consumption compared to their accurate counterparts and produce fairly accurate results. We build upon the work on approximate adders and multipliers presented in [23] and [24]. First, we show how choice of algorithm and parallel adder design can be used to implement 2D Discrete Cosine Transform (DCT) algorithm with good performance but low area. Our implementation of the 2D DCT has comparable PSNR performance with respect to the algorithm presented in [23] with ~35-50% reduction in area. Next, we use the approximate 2x2 multiplier presented in [24] to implement parallel approximate multipliers. We demonstrate that if some of the 2x2 multipliers in the design of the parallel multiplier are accurate, the accuracy of the multiplier improves significantly, especially when two large numbers are multiplied. We choose Gaussian FIR Filter and Fast Fourier Transform (FFT) algorithms to illustrate the efficacy of our proposed approximate multiplier. We show that application of the proposed approximate multiplier improves the PSNR performance of 32x32 FFT implementation by 4.7 dB compared to the implementation using the approximate multiplier described in [24]. We also implement a state-of-the-art image enlargement algorithm, namely Segment Adaptive Gradient Angle (SAGA) [29], in hardware. The algorithm is mapped to pipelined hardware blocks and we synthesized the design using 90 nm technology. We show that a 64x64 image can be processed in 496.48 µs when clocked at 100 MHz. The average PSNR performance of our implementation using accurate parallel adders and multipliers is 31.33 dB and that using approximate parallel adders and multipliers is 30.86 dB, when evaluated against the original image. The PSNR performance of both designs is comparable to the performance of the double precision floating point MATLAB implementation of the algorithm.
ContributorsVasudevan, Madhu (Author) / Chakrabarti, Chaitali (Thesis advisor) / Frakes, David (Committee member) / Gupta, Sandeep (Committee member) / Arizona State University (Publisher)
Created2013
152344-Thumbnail Image.png
Description
Structural integrity is an important characteristic of performance for critical components used in applications such as aeronautics, materials, construction and transportation. When appraising the structural integrity of these components, evaluation methods must be accurate. In addition to possessing capability to perform damage detection, the ability to monitor the level of

Structural integrity is an important characteristic of performance for critical components used in applications such as aeronautics, materials, construction and transportation. When appraising the structural integrity of these components, evaluation methods must be accurate. In addition to possessing capability to perform damage detection, the ability to monitor the level of damage over time can provide extremely useful information in assessing the operational worthiness of a structure and in determining whether the structure should be repaired or removed from service. In this work, a sequential Bayesian approach with active sensing is employed for monitoring crack growth within fatigue-loaded materials. The monitoring approach is based on predicting crack damage state dynamics and modeling crack length observations. Since fatigue loading of a structural component can change while in service, an interacting multiple model technique is employed to estimate probabilities of different loading modes and incorporate this information in the crack length estimation problem. For the observation model, features are obtained from regions of high signal energy in the time-frequency plane and modeled for each crack length damage condition. Although this observation model approach exhibits high classification accuracy, the resolution characteristics can change depending upon the extent of the damage. Therefore, several different transmission waveforms and receiver sensors are considered to create multiple modes for making observations of crack damage. Resolution characteristics of the different observation modes are assessed using a predicted mean squared error criterion and observations are obtained using the predicted, optimal observation modes based on these characteristics. Calculation of the predicted mean square error metric can be computationally intensive, especially if performed in real time, and an approximation method is proposed. With this approach, the real time computational burden is decreased significantly and the number of possible observation modes can be increased. Using sensor measurements from real experiments, the overall sequential Bayesian estimation approach, with the adaptive capability of varying the state dynamics and observation modes, is demonstrated for tracking crack damage.
ContributorsHuff, Daniel W (Author) / Papandreou-Suppappola, Antonia (Thesis advisor) / Kovvali, Narayan (Committee member) / Chakrabarti, Chaitali (Committee member) / Chattopadhyay, Aditi (Committee member) / Arizona State University (Publisher)
Created2013
151465-Thumbnail Image.png
Description
Adaptive processing and classification of electrocardiogram (ECG) signals are important in eliminating the strenuous process of manually annotating ECG recordings for clinical use. Such algorithms require robust models whose parameters can adequately describe the ECG signals. Although different dynamic statistical models describing ECG signals currently exist, they depend considerably on

Adaptive processing and classification of electrocardiogram (ECG) signals are important in eliminating the strenuous process of manually annotating ECG recordings for clinical use. Such algorithms require robust models whose parameters can adequately describe the ECG signals. Although different dynamic statistical models describing ECG signals currently exist, they depend considerably on a priori information and user-specified model parameters. Also, ECG beat morphologies, which vary greatly across patients and disease states, cannot be uniquely characterized by a single model. In this work, sequential Bayesian based methods are used to appropriately model and adaptively select the corresponding model parameters of ECG signals. An adaptive framework based on a sequential Bayesian tracking method is proposed to adaptively select the cardiac parameters that minimize the estimation error, thus precluding the need for pre-processing. Simulations using real ECG data from the online Physionet database demonstrate the improvement in performance of the proposed algorithm in accurately estimating critical heart disease parameters. In addition, two new approaches to ECG modeling are presented using the interacting multiple model and the sequential Markov chain Monte Carlo technique with adaptive model selection. Both these methods can adaptively choose between different models for various ECG beat morphologies without requiring prior ECG information, as demonstrated by using real ECG signals. A supervised Bayesian maximum-likelihood (ML) based classifier uses the estimated model parameters to classify different types of cardiac arrhythmias. However, the non-availability of sufficient amounts of representative training data and the large inter-patient variability pose a challenge to the existing supervised learning algorithms, resulting in a poor classification performance. In addition, recently developed unsupervised learning methods require a priori knowledge on the number of diseases to cluster the ECG data, which often evolves over time. In order to address these issues, an adaptive learning ECG classification method that uses Dirichlet process Gaussian mixture models is proposed. This approach does not place any restriction on the number of disease classes, nor does it require any training data. This algorithm is adapted to be patient-specific by labeling or identifying the generated mixtures using the Bayesian ML method, assuming the availability of labeled training data.
ContributorsEdla, Shwetha Reddy (Author) / Papandreou-Suppappola, Antonia (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Kovvali, Narayan (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Arizona State University (Publisher)
Created2012
151700-Thumbnail Image.png
Description
Ultrasound imaging is one of the major medical imaging modalities. It is cheap, non-invasive and has low power consumption. Doppler processing is an important part of many ultrasound imaging systems. It is used to provide blood velocity information and is built on top of B-mode systems. We investigate the performance

Ultrasound imaging is one of the major medical imaging modalities. It is cheap, non-invasive and has low power consumption. Doppler processing is an important part of many ultrasound imaging systems. It is used to provide blood velocity information and is built on top of B-mode systems. We investigate the performance of two velocity estimation schemes used in Doppler processing systems, namely, directional velocity estimation (DVE) and conventional velocity estimation (CVE). We find that DVE provides better estimation performance and is the only functioning method when the beam to flow angle is large. Unfortunately, DVE is computationally expensive and also requires divisions and square root operations that are hard to implement. We propose two approximation techniques to replace these computations. The simulation results on cyst images show that the proposed approximations do not affect the estimation performance. We also study backend processing which includes envelope detection, log compression and scan conversion. Three different envelope detection methods are compared. Among them, FIR based Hilbert Transform is considered the best choice when phase information is not needed, while quadrature demodulation is a better choice if phase information is necessary. Bilinear and Gaussian interpolation are considered for scan conversion. Through simulations of a cyst image, we show that bilinear interpolation provides comparable contrast-to-noise ratio (CNR) performance with Gaussian interpolation and has lower computational complexity. Thus, bilinear interpolation is chosen for our system.
ContributorsWei, Siyuan (Author) / Chakrabarti, Chaitali (Thesis advisor) / Frakes, David (Committee member) / Papandreou-Suppappola, Antonia (Committee member) / Arizona State University (Publisher)
Created2013
152139-Thumbnail Image.png
Description
ABSTRACT Developing new non-traditional device models is gaining popularity as the silicon-based electrical device approaches its limitation when it scales down. Membrane systems, also called P systems, are a new class of biological computation model inspired by the way cells process chemical signals. Spiking Neural P systems (SNP systems), a

ABSTRACT Developing new non-traditional device models is gaining popularity as the silicon-based electrical device approaches its limitation when it scales down. Membrane systems, also called P systems, are a new class of biological computation model inspired by the way cells process chemical signals. Spiking Neural P systems (SNP systems), a certain kind of membrane systems, is inspired by the way the neurons in brain interact using electrical spikes. Compared to the traditional Boolean logic, SNP systems not only perform similar functions but also provide a more promising solution for reliable computation. Two basic neuron types, Low Pass (LP) neurons and High Pass (HP) neurons, are introduced. These two basic types of neurons are capable to build an arbitrary SNP neuron. This leads to the conclusion that these two basic neuron types are Turing complete since SNP systems has been proved Turing complete. These two basic types of neurons are further used as the elements to construct general-purpose arithmetic circuits, such as adder, subtractor and comparator. In this thesis, erroneous behaviors of neurons are discussed. Transmission error (spike loss) is proved to be equivalent to threshold error, which makes threshold error discussion more universal. To improve the reliability, a new structure called motif is proposed. Compared to Triple Modular Redundancy improvement, motif design presents its efficiency and effectiveness in both single neuron and arithmetic circuit analysis. DRAM-based CMOS circuits are used to implement the two basic types of neurons. Functionality of basic type neurons is proved using the SPICE simulations. The motif improved adder and the comparator, as compared to conventional Boolean logic design, are much more reliable with lower leakage, and smaller silicon area. This leads to the conclusion that SNP system could provide a more promising solution for reliable computation than the conventional Boolean logic.
ContributorsAn, Pei (Author) / Cao, Yu (Thesis advisor) / Barnaby, Hugh (Committee member) / Chakrabarti, Chaitali (Committee member) / Arizona State University (Publisher)
Created2013
152173-Thumbnail Image.png
Description
Stream computing has emerged as an importantmodel of computation for embedded system applications particularly in the multimedia and network processing domains. In recent past several programming languages and embedded multi-core processors have been proposed for streaming applications. This thesis examines the execution and dynamic scheduling of stream programs on embedded

Stream computing has emerged as an importantmodel of computation for embedded system applications particularly in the multimedia and network processing domains. In recent past several programming languages and embedded multi-core processors have been proposed for streaming applications. This thesis examines the execution and dynamic scheduling of stream programs on embedded multi-core processors. The thesis addresses the problem in the context of a multi-tasking environment with a time varying allocation of processing elements for a particular streaming application. As a solution the thesis proposes a two step approach where the stream program is compiled to gather key application information, and to generate re-targetable code. A light weight dynamic scheduler incorporates the second stage of the approach. The dynamic scheduler utilizes the static information and available resources to assign or partition the application across the multi-core architecture. The objective of the dynamic scheduler is to maximize the throughput of the application, and it is sensitive to the resource (processing elements, scratch-pad memory, DMA bandwidth) constraints imposed by the target architecture. We evaluate the proposed approach by compiling and scheduling benchmark stream programs on a representative embedded multi-core processor. We present experimental results that evaluate the quality of the solutions generated by the proposed approach by comparisons with existing techniques.
ContributorsLee, Haeseung (Author) / Chatha, Karamvir (Thesis advisor) / Vrudhula, Sarma (Committee member) / Chakrabarti, Chaitali (Committee member) / Wu, Carole-Jean (Committee member) / Arizona State University (Publisher)
Created2013
151310-Thumbnail Image.png
Description
Characterization of standard cells is one of the crucial steps in the IC design. Scaling of CMOS technology has lead to timing un-certainties such as that of cross coupling noise due to interconnect parasitic, skew variation due to voltage jitter and proximity effect of multiple inputs switching (MIS). Due to

Characterization of standard cells is one of the crucial steps in the IC design. Scaling of CMOS technology has lead to timing un-certainties such as that of cross coupling noise due to interconnect parasitic, skew variation due to voltage jitter and proximity effect of multiple inputs switching (MIS). Due to increased operating frequency and process variation, the probability of MIS occurrence and setup / hold failure within a clock cycle is high. The delay variation due to temporal proximity of MIS is significant for multiple input gates in the standard cell library. The shortest paths are affected by MIS due to the lack of averaging effect. Thus, sensitive designs such as that of SRAM row and column decoder circuits have high probability for MIS impact. The traditional static timing analysis (STA) assumes single input switching (SIS) scenario which is not adequate enough to capture gate delay accurately, as the delay variation due to temporal proximity of the MIS is ~15%-45%. Whereas, considering all possible scenarios of MIS for characterization is computationally intensive with huge data volume. Various modeling techniques are developed for the characterization of MIS effect. Some techniques require coefficient extraction through multiple spice simulation, and do not discuss speed up approach or apply models with complicated algorithms to account for MIS effect. The STA flow accounts for process variation through uncertainty parameter to improve product yield. Some of the MIS delay variability models account for MIS variation through table look up approach, resulting in huge data volume or do not consider propagation of RAT in the design flow. Thus, there is a need for a methodology to model MIS effect with less computational resource, and integration of such effect into design flow without trading off the accuracy. A finite-point based analytical model for MIS effect is proposed for multiple input logic gates and similar approach is extended for setup/hold characterization of sequential elements. Integration of MIS variation into design flow is explored. The proposed methodology is validated using benchmark circuits at 45nm technology node under process variation. Experimental results show significant reduction in runtime and data volume with ~10% error compared to that of SPICE simulation.
ContributorsSubramaniam, Anupama R (Author) / Cao, Yu (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Roveda, Janet (Committee member) / Yu, Hongbin (Committee member) / Arizona State University (Publisher)
Created2012