Matching Items (8)

152360-Thumbnail Image.png

Image processing using approximate data-path units

Description

In this work, we present approximate adders and multipliers to reduce data-path complexity of specialized hardware for various image processing systems. These approximate circuits have a lower area, latency and

In this work, we present approximate adders and multipliers to reduce data-path complexity of specialized hardware for various image processing systems. These approximate circuits have a lower area, latency and power consumption compared to their accurate counterparts and produce fairly accurate results. We build upon the work on approximate adders and multipliers presented in [23] and [24]. First, we show how choice of algorithm and parallel adder design can be used to implement 2D Discrete Cosine Transform (DCT) algorithm with good performance but low area. Our implementation of the 2D DCT has comparable PSNR performance with respect to the algorithm presented in [23] with ~35-50% reduction in area. Next, we use the approximate 2x2 multiplier presented in [24] to implement parallel approximate multipliers. We demonstrate that if some of the 2x2 multipliers in the design of the parallel multiplier are accurate, the accuracy of the multiplier improves significantly, especially when two large numbers are multiplied. We choose Gaussian FIR Filter and Fast Fourier Transform (FFT) algorithms to illustrate the efficacy of our proposed approximate multiplier. We show that application of the proposed approximate multiplier improves the PSNR performance of 32x32 FFT implementation by 4.7 dB compared to the implementation using the approximate multiplier described in [24]. We also implement a state-of-the-art image enlargement algorithm, namely Segment Adaptive Gradient Angle (SAGA) [29], in hardware. The algorithm is mapped to pipelined hardware blocks and we synthesized the design using 90 nm technology. We show that a 64x64 image can be processed in 496.48 µs when clocked at 100 MHz. The average PSNR performance of our implementation using accurate parallel adders and multipliers is 31.33 dB and that using approximate parallel adders and multipliers is 30.86 dB, when evaluated against the original image. The PSNR performance of both designs is comparable to the performance of the double precision floating point MATLAB implementation of the algorithm.

Contributors

Agent

Created

Date Created
  • 2013

151474-Thumbnail Image.png

Medical implant receiver system

Description

The medical industry has benefited greatly by electronic integration resulting in the explosive growth of active medical implants. These devices often treat and monitor chronic health conditions and require very

The medical industry has benefited greatly by electronic integration resulting in the explosive growth of active medical implants. These devices often treat and monitor chronic health conditions and require very minimal power usage. A key part of these medical implants is an ultra-low power two way wireless communication system. This enables both control of the implant as well as relay of information collected. This research has focused on a high performance receiver for medical implant applications. One commonly quoted specification to compare receivers is energy per bit required. This metric is useful, but incomplete in that it ignores Sensitivity level, bit error rate, and immunity to interferers. In this study exploration of receiver architectures and convergence upon a comprehensive solution is done. This analysis is used to design and build a system for validation. The Direct Conversion Receiver architecture implemented for the MICS standard in 0.18 µm CMOS process consumes approximately 2 mW is competitive with published research.

Contributors

Agent

Created

Date Created
  • 2012

154195-Thumbnail Image.png

Energy-efficient digital circuit design using threshold logic gates

Description

Improving energy efficiency has always been the prime objective of the custom and automated digital circuit design techniques. As a result, a multitude of methods to reduce power without sacrificing

Improving energy efficiency has always been the prime objective of the custom and automated digital circuit design techniques. As a result, a multitude of methods to reduce power without sacrificing performance have been proposed. However, as the field of design automation has matured over the last few decades, there have been no new automated design techniques, that can provide considerable improvements in circuit power, leakage and area. Although emerging nano-devices are expected to replace the existing MOSFET devices, they are far from being as mature as semiconductor devices and their full potential and promises are many years away from being practical.

The research described in this dissertation consists of four main parts. First is a new circuit architecture of a differential threshold logic flipflop called PNAND. The PNAND gate is an edge-triggered multi-input sequential cell whose next state function is a threshold function of its inputs. Second a new approach, called hybridization, that replaces flipflops and parts of their logic cones with PNAND cells is described. The resulting \hybrid circuit, which consists of conventional logic cells and PNANDs, is shown to have significantly less power consumption, smaller area, less standby power and less power variation.

Third, a new architecture of a field programmable array, called field programmable threshold logic array (FPTLA), in which the standard lookup table (LUT) is replaced by a PNAND is described. The FPTLA is shown to have as much as 50% lower energy-delay product compared to conventional FPGA using well known FPGA modeling tool called VPR.

Fourth, a novel clock skewing technique that makes use of the completion detection feature of the differential mode flipflops is described. This clock skewing method improves the area and power of the ASIC circuits by increasing slack on timing paths. An additional advantage of this method is the elimination of hold time violation on given short paths.

Several circuit design methodologies such as retiming and asynchronous circuit design can use the proposed threshold logic gate effectively. Therefore, the use of threshold logic flipflops in conventional design methodologies opens new avenues of research towards more energy-efficient circuits.

Contributors

Agent

Created

Date Created
  • 2015

158679-Thumbnail Image.png

Experimental Evaluation of the Feasibility of Wearable Piezoelectric Energy Harvesting

Description

Technological advances in low power wearable electronics and energy optimization techniques

make motion energy harvesting a viable energy source. However, it has not been

widely adopted due to bulky energy harvester designs

Technological advances in low power wearable electronics and energy optimization techniques

make motion energy harvesting a viable energy source. However, it has not been

widely adopted due to bulky energy harvester designs that are uncomfortable to wear. This

work addresses this problem by analyzing the feasibility of powering low wearable power

devices using piezoelectric energy generated at the human knee. We start with a novel

mathematical model for estimating the power generated from human knee joint movements.

This thesis’s major contribution is to analyze the feasibility of human motion energy harvesting

and validating this analytical model using a commercially available piezoelectric

module. To this end, we implemented an experimental setup that replicates a human knee.

Then, we performed experiments at different excitation frequencies and amplitudes with

two commercially available Macro Fiber Composite (MFC) modules. These experimental

results are used to validate the analytical model and predict the energy harvested as a function

of the number of steps taken in a day. The model estimates that 13μWcan be generated

on an average while walking with a 4.8% modeling error. The obtained results show that

piezoelectricity is indeed a viable approach for powering low-power wearable devices.

Contributors

Agent

Created

Date Created
  • 2020

154931-Thumbnail Image.png

Integrated CMOS-based low power electrochemical impedance spectroscopy for biomedical applications

Description

This thesis dissertation presents design of portable low power Electrochemical Impedance Spectroscopy (EIS) system which can be used for biomedical applications such as tear diagnosis, blood diagnosis, or any other

This thesis dissertation presents design of portable low power Electrochemical Impedance Spectroscopy (EIS) system which can be used for biomedical applications such as tear diagnosis, blood diagnosis, or any other body-fluid diagnosis. Two design methodologies are explained in this dissertation (a) a discrete component-based portable low-power EIS system and (b) an integrated CMOS-based portable low-power EIS system. Both EIS systems were tested in a laboratory environment and the characterization results are compared. The advantages and disadvantages of the integrated EIS system relative to the discrete component-based EIS system are presented including experimental data. The specifications of both EIS systems are compared with commercially available non-portable EIS workstations. These designed EIS systems are handheld and very low-cost relative to the currently available commercial EIS workstations.

Contributors

Agent

Created

Date Created
  • 2016

152924-Thumbnail Image.png

Design techniques for ultra-low noise and low power low dropout (LDO) regulators

Description

Modern day deep sub-micron SOC architectures often demand very low supply noise levels. As supply voltage decreases with decreasing deep sub-micron gate length, noise on the power supply starts playing

Modern day deep sub-micron SOC architectures often demand very low supply noise levels. As supply voltage decreases with decreasing deep sub-micron gate length, noise on the power supply starts playing a dominant role in noise-sensitive analog blocks, especially high precision ADC, PLL, and RF SOC's. Most handheld and portable applications and highly sensitive medical instrumentation circuits tend to use low noise regulators as on-chip or on board power supply. Nonlinearities associated with LNA's, mixers and oscillators up-convert low frequency noise with the signal band. Specifically, synthesizer and TCXO phase noise, LNA and mixer noise figure, and adjacent channel power ratios of the PA are heavily influenced by the supply noise and ripple. This poses a stringent requirement on a very low noise power supply with high accuracy and fast transient response. Low Dropout (LDO) regulators are preferred over switching regulators for these applications due to their attractive low noise and low ripple features. LDO's shield sensitive blocks from high frequency fluctuations on the power supply while providing high accuracy, fast response supply regulation.

This research focuses on developing innovative techniques to reduce the noise of any generic wideband LDO, stable with or without load capacitor. The proposed techniques include Switched RC Filtering to reduce the Bandgap Reference noise, Current Mode Chopping to reduce the Error Amplifier noise & MOS-R based RC filter to reduce the noise due to bias current. The residual chopping ripple was reduced using a Switched Capacitor notch filter. Using these techniques, the integrated noise of a wideband LDO was brought down to 15µV in the integration band of 10Hz to 100kHz. These techniques can be integrated into any generic LDO without any significant area overhead.

Contributors

Agent

Created

Date Created
  • 2014

153334-Thumbnail Image.png

In support of high quality 3-D ultrasound imaging for hand-held devices

Description

Three dimensional (3-D) ultrasound is safe, inexpensive, and has been shown to drastically improve system ease-of-use, diagnostic efficiency, and patient throughput. However, its high computational complexity and resulting high power

Three dimensional (3-D) ultrasound is safe, inexpensive, and has been shown to drastically improve system ease-of-use, diagnostic efficiency, and patient throughput. However, its high computational complexity and resulting high power consumption has precluded its use in hand-held applications.

In this dissertation, algorithm-architecture co-design techniques that aim to make hand-held 3-D ultrasound a reality are presented. First, image enhancement methods to improve signal-to-noise ratio (SNR) are proposed. These include virtual source firing techniques and a low overhead digital front-end architecture using orthogonal chirps and orthogonal Golay codes.

Second, algorithm-architecture co-design techniques to reduce the power consumption of 3-D SAU imaging systems is presented. These include (i) a subaperture multiplexing strategy and the corresponding apodization method to alleviate the signal bandwidth bottleneck, and (ii) a highly efficient iterative delay calculation method to eliminate complex operations such as multiplications, divisions and square-root in delay calculation during beamforming. These techniques were used to define Sonic Millip3De, a 3-D die stacked architecture for digital beamforming in SAU systems. Sonic Millip3De produces 3-D high resolution images at 2 frames per second with system power consumption of 15W in 45nm technology.

Third, a new beamforming method based on separable delay decomposition is proposed to reduce the computational complexity of the beamforming unit in an SAU system. The method is based on minimizing the root-mean-square error (RMSE) due to delay decomposition. It reduces the beamforming complexity of a SAU system by 19x while providing high image fidelity that is comparable to non-separable beamforming. The resulting modified Sonic Millip3De architecture supports a frame rate of 32 volumes per second while maintaining power consumption of 15W in 45nm technology.

Next a 3-D plane-wave imaging system that utilizes both separable beamforming and coherent compounding is presented. The resulting system has computational complexity comparable to that of a non-separable non-compounding baseline system while significantly improving contrast-to-noise ratio and SNR. The modified Sonic Millip3De architecture is now capable of generating high resolution images at 1000 volumes per second with 9-fire-angle compounding.

Contributors

Agent

Created

Date Created
  • 2015

149617-Thumbnail Image.png

Topics in power and performance optimization of embedded systems

Description

The ubiquity of embedded computational systems has exploded in recent years impacting everything from hand-held computers and automotive driver assistance to battlefield command and control and autonomous systems. Typical embedded

The ubiquity of embedded computational systems has exploded in recent years impacting everything from hand-held computers and automotive driver assistance to battlefield command and control and autonomous systems. Typical embedded computing systems are characterized by highly resource constrained operating environments. In particular, limited energy resources constrain performance in embedded systems often reliant on independent fuel or battery supplies. Ultimately, mitigating energy consumption without sacrificing performance in these systems is paramount. In this work power/performance optimization emphasizing prevailing data centric applications including video and signal processing is addressed for energy constrained embedded systems. Frameworks are presented which exchange quality of service (QoS) for reduced power consumption enabling power aware energy management. Power aware systems provide users with tools for precisely managing available energy resources in light of user priorities, extending availability when QoS can be sacrificed. Specifically, power aware management tools for next generation bistable electrophoretic displays and the state of the art H.264 video codec are introduced. The multiprocessor system on chip (MPSoC) paradigm is examined in the context of next generation many-core hand-held computing devices. MPSoC architectures promise to breach the power/performance wall prohibiting advancement of complex high performance single core architectures. Several many-core distributed memory MPSoC architectures are commercially available, while the tools necessary to effectively tap their enormous potential remain largely open for discovery. Adaptable scalability in many-core systems is addressed through a scalable high performance multicore H.264 video decoder implemented on the representative Cell Broadband Engine (CBE) architecture. The resulting agile performance scalable system enables efficient adaptive power optimization via decoding-rate driven sleep and voltage/frequency state management. The significant problem of mapping applications onto these architectures is additionally addressed from the perspective of instruction mapping for limited distributed memory architectures with a code overlay generator implemented on the CBE. Finally runtime scheduling and mapping of scalable applications in multitasking environments is addressed through the introduction of a lightweight work partitioning framework targeting streaming applications with low latency and near optimal throughput demonstrated on the CBE.

Contributors

Agent

Created

Date Created
  • 2011