Search Content

Energy-efficient scheduling for heterogeneous servers in the dark silicon era

Description

Driven by stringent power and thermal constraints, heterogeneous multi-core processors, such as the ARM big-LITTLE architecture, are becoming increasingly popular. In this thesis, the use of low-power heterogeneous multi-cores as Microservers using web search as a motivational application is addressed. In particular, I propose a new family of scheduling policies…

Driven by stringent power and thermal constraints, heterogeneous multi-core processors, such as the ARM big-LITTLE architecture, are becoming increasingly popular. In this thesis, the use of low-power heterogeneous multi-cores as Microservers using web search as a motivational application is addressed. In particular, I propose a new family of scheduling policies for heterogeneous microservers that assign incoming search queries to available cores so as to optimize for performance metrics such as mean response time and service level agreements, while guaranteeing thermally-safe operation. Thorough experimental evaluations on a big-LITTLE platform demonstrate, on an heterogeneous eight-core Samsung Exynos 5422 MpSoC, with four big and little cores each, that naive performance oriented scheduling policies quickly result in thermal instability, while the proposed policies not only reduce peak temperature but also achieve 4.8x reduction in processing time and 5.6x increase in energy efficiency compared to baseline scheduling policies.

ContributorsJain, Sankalp (Author) / Ogras, Umit Y. (Thesis advisor) / Garg, Siddharth (Committee member) / Chakrabarti, Chaitali (Committee member) / Arizona State University (Publisher)

Created2015

In support of high quality 3-D ultrasound imaging for hand-held devices

Description

Three dimensional (3-D) ultrasound is safe, inexpensive, and has been shown to drastically improve system ease-of-use, diagnostic efficiency, and patient throughput. However, its high computational complexity and resulting high power consumption has precluded its use in hand-held applications.

In this dissertation, algorithm-architecture co-design techniques that aim to make hand-held 3-D ultrasound…

Three dimensional (3-D) ultrasound is safe, inexpensive, and has been shown to drastically improve system ease-of-use, diagnostic efficiency, and patient throughput. However, its high computational complexity and resulting high power consumption has precluded its use in hand-held applications.

In this dissertation, algorithm-architecture co-design techniques that aim to make hand-held 3-D ultrasound a reality are presented. First, image enhancement methods to improve signal-to-noise ratio (SNR) are proposed. These include virtual source firing techniques and a low overhead digital front-end architecture using orthogonal chirps and orthogonal Golay codes.

Second, algorithm-architecture co-design techniques to reduce the power consumption of 3-D SAU imaging systems is presented. These include (i) a subaperture multiplexing strategy and the corresponding apodization method to alleviate the signal bandwidth bottleneck, and (ii) a highly efficient iterative delay calculation method to eliminate complex operations such as multiplications, divisions and square-root in delay calculation during beamforming. These techniques were used to define Sonic Millip3De, a 3-D die stacked architecture for digital beamforming in SAU systems. Sonic Millip3De produces 3-D high resolution images at 2 frames per second with system power consumption of 15W in 45nm technology.

Third, a new beamforming method based on separable delay decomposition is proposed to reduce the computational complexity of the beamforming unit in an SAU system. The method is based on minimizing the root-mean-square error (RMSE) due to delay decomposition. It reduces the beamforming complexity of a SAU system by 19x while providing high image fidelity that is comparable to non-separable beamforming. The resulting modified Sonic Millip3De architecture supports a frame rate of 32 volumes per second while maintaining power consumption of 15W in 45nm technology.

Next a 3-D plane-wave imaging system that utilizes both separable beamforming and coherent compounding is presented. The resulting system has computational complexity comparable to that of a non-separable non-compounding baseline system while significantly improving contrast-to-noise ratio and SNR. The modified Sonic Millip3De architecture is now capable of generating high resolution images at 1000 volumes per second with 9-fire-angle compounding.

ContributorsYang, Ming (Author) / Chakrabarti, Chaitali (Thesis advisor) / Papandreou-Suppappola, Antonia (Committee member) / Karam, Lina (Committee member) / Frakes, David (Committee member) / Ogras, Umit Y. (Committee member) / Arizona State University (Publisher)

Created2015

A low power digital controller for DC-DC converter applications with integrated PFM mode detector

Description

Switching Converters (SC) are an excellent choice for hand held devices due to their high power conversion efficiency. However, they suffer from two major drawbacks. The first drawback is that their dynamic response is sensitive to variations in inductor (L) and capacitor (C) values. A cost effective solution is implemented…

Switching Converters (SC) are an excellent choice for hand held devices due to their high power conversion efficiency. However, they suffer from two major drawbacks. The first drawback is that their dynamic response is sensitive to variations in inductor (L) and capacitor (C) values. A cost effective solution is implemented by designing a programmable digital controller. Despite variations in L and C values, the target dynamic response can be achieved by computing and programming the filter coefficients for a particular L and C. Besides, digital controllers have higher immunity to environmental changes such as temperature and aging of components. The second drawback of SCs is their poor efficiency during low load conditions if operated in Pulse Width Modulation (PWM) mode. However, if operated in Pulse Frequency Modulation (PFM) mode, better efficiency numbers can be achieved. A mostly-digital way of detecting PFM mode is implemented. Besides, a slow serial interface to program the chip, and a high speed serial interface to characterize mixed signal blocks as well as to ship data in or out for debug purposes are designed. The chip is taped out in 0.18µm IBM's radiation hardened CMOS process technology. A test board is built with the chip, external power FETs and driver IC. At the time of this writing, PWM operation, PFM detection, transitions between PWM and PFM, and both serial interfaces are validated on the test board.

ContributorsMumma Reddy, Abhiram (Author) / Bakkaloglu, Bertan (Thesis advisor) / Ogras, Umit Y. (Committee member) / Seo, Jae-Sun (Committee member) / Arizona State University (Publisher)

Created2014

An Analysis of the Unmanned Aerial Systems-to-Ground Channel and Joint Sensing and Communications Systems Using Software Defined Radio

Description

Software-defined radio provides users with a low-cost and flexible platform for implementing and studying advanced communications and remote sensing applications. Two such applications include unmanned aerial system-to-ground communications channel and joint sensing and communication systems. In this work, these applications are studied.

In the first part, unmanned aerial system-to-ground communications…

Software-defined radio provides users with a low-cost and flexible platform for implementing and studying advanced communications and remote sensing applications. Two such applications include unmanned aerial system-to-ground communications channel and joint sensing and communication systems. In this work, these applications are studied.

In the first part, unmanned aerial system-to-ground communications channel models are derived from empirical data collected from software-defined radio transceivers in residential and mountainous desert environments using a small (< 20 kg) unmanned aerial system during low-altitude flight (< 130 m). The Kullback-Leibler divergence measure was employed to characterize model mismatch from the empirical data. Using this measure the derived models accurately describe the underlying data.

In the second part, an experimental joint sensing and communications system is implemented using a network of software-defined radio transceivers. A novel co-design receiver architecture is presented and demonstrated within a three-node joint multiple access system topology consisting of an independent radar and communications transmitter along with a joint radar and communications receiver. The receiver tracks an emulated target moving along a predefined path and simultaneously decodes a communications message. Experimental system performance bounds are characterized jointly using the communications channel capacity and novel estimation information rate.

ContributorsGutierrez, Richard (Author) / Bliss, Daniel W (Thesis advisor) / Papandreou-Suppappola, Antonia (Committee member) / Ogras, Umit Y. (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Arizona State University (Publisher)

Created2018

An Electrical-Stimulus-Only BIST IC For Capacitive MEMS Accelerometer Sensitivity Characterization

Description

Testing and calibration constitute a significant part of the overall manufacturing cost of microelectromechanical system (MEMS) devices. Developing a low-cost testing and calibration scheme applicable at the user side that ensures the continuous reliability and accuracy is a crucial need. The main purpose of testing is to eliminate defective devices…

Testing and calibration constitute a significant part of the overall manufacturing cost of microelectromechanical system (MEMS) devices. Developing a low-cost testing and calibration scheme applicable at the user side that ensures the continuous reliability and accuracy is a crucial need. The main purpose of testing is to eliminate defective devices and to verify the qualifications of a product is met. The calibration process for capacitive MEMS devices, for the most part, entails the determination of the mechanical sensitivity. In this work, a physical-stimulus-free built-in-self-test (BIST) integrated circuit (IC) design characterizing the sensitivity of capacitive MEMS accelerometers is presented. The BIST circuity can extract the amplitude and phase response of the acceleration sensor's mechanics under electrical excitation within 0.55% of error with respect to its mechanical sensitivity under the physical stimulus. Sensitivity characterization is performed using a low computation complexity multivariate linear regression model. The BIST circuitry maximizes the use of existing analog and mixed-signal readout signal chain and the host processor core, without the need for computationally expensive Fast Fourier Transform (FFT)-based approaches. The BIST IC is designed and fabricated using the 0.18-µm CMOS technology. The sensor analog front-end and BIST circuitry are integrated with a three-axis, low-g capacitive MEMS accelerometer in a single hermetically sealed package. The BIST circuitry occupies 0.3 mm2 with a total readout IC area of 1.0 mm2 and consumes 8.9 mW during self-test operation.

ContributorsOzel, Muhlis Kenan (Author) / Bakkaloglu, Bertan (Thesis advisor) / Ozev, Sule (Thesis advisor) / Kiaei, Sayfe (Committee member) / Ogras, Umit Y. (Committee member) / Arizona State University (Publisher)

Created2017

Dual Application ADC using Three Calibration Techniques in 10nm Technology

Description

In this work, a 12-bit ADC with three types of calibration is proposed for high speed security applications as well as a precision application. This converter performs for both applications because it satisfies all the necessary specifications such as minimal device mismatch and offset, programmability to decrease aging effects, high…

In this work, a 12-bit ADC with three types of calibration is proposed for high speed security applications as well as a precision application. This converter performs for both applications because it satisfies all the necessary specifications such as minimal device mismatch and offset, programmability to decrease aging effects, high SNR for increased ENOB and fast conversion rate. The designed converter implements three types of calibration necessary for offset and gain error, including: a correlated double sampling integrator used in the first stage of the ADC, a power up auto zero technique implemented in the digital code to store any offset and subtract out if necessary, and an automatic startup and manual calibration to control the common mode voltages. The proposed ADC was designed in Intel’s 10nm technology. This ADC is designed to monitor DC voltages for the precision and high speed applications. The conversion rate of the analog to digital converter is programmable to 7µs or 910ns, depending on the precision or high speed application, respectively. The range of the input and reference supply is 0 to 1.25V. The ADC is designed in Intel 10nm technology using a 1.8V supply consuming an area of 0.0705mm2. This thesis explores challenges of designing a dual-purpose analog to digital converter, which include: 1.) increased offset in 10nm technology, 2.) dual application ADC that can be accurate and fast, 3.) reducing the parasitic capacitance of the ADC, and 4.) gain error that occurs in ADCs.

ContributorsSchmelter, Brooke (Author) / Bakkaloglu, Bertan (Thesis advisor) / Ogras, Umit Y. (Committee member) / Kitchen, Jennifer (Committee member) / Arizona State University (Publisher)

Created2017

Removing Reliance on Tester of a VCO-Based ADC Using an On-Chip DAC

Description

Accessibility to the internal nodes of an analog/mixed-signal circuit while testing is extremely difficult. Furthermore, with technology scaling, the effect of process variations becomes more pronounced which in turn effects the test time, test cost, and die yield. As devices become more unreliable, the probability of failure of a die…

Accessibility to the internal nodes of an analog/mixed-signal circuit while testing is extremely difficult. Furthermore, with technology scaling, the effect of process variations becomes more pronounced which in turn effects the test time, test cost, and die yield. As devices become more unreliable, the probability of failure of a die increases, yield decreases affecting the quality of test and cost.Therefore, test time minimization and test cost reduction are important. Moreover, process variations can affect the performance of analog/mixed circuits. Therefore, the performance of a System On-Chip(SoC) which tends to integrate multiple band gap reference circuits (BGRs) is effected due to the wide variations caused in the behavior of the BGR as a result of increasing process variations. Calibration of the BGR is, thus, important in the test process so as to obtain accuracy in the measurement of the output voltage of BGR. Furthermore, as test time minimization and test cost reduction are important in a test process, Built-in Self Test (BIST) techniques have become more popular. To obtain accuracy in the measurement of the output voltage of BGR, a VCO-based zoom-in ADC architecture that was designed to calibrate the output of the BGR voltage which dictates the circuit performance. However, the zoom-voltages for the circuit are generated using a tester. As the number of such ADCs integrated on a SoC increase, the number of nodes to be accessed by the tester increase. Moreover, the capacitance of the probe affects the accuracy of the applied input voltages of the VCO-based ADC. Therefore, accessibility decreases with increase in scaling.Further, generating a wide range of inputs becomes burdensome for the tester. For all the above reasons, an on-chip DAC circuitry was proposed as a part of this thesis, to decrease the reliance on tester. The suggested DAC architecture is a simple resistor string whose resolution depends on the number of zoom-in voltages to be generated. This architecture has a linear and monotonic behavior which is very important as the VCO has a highly non-linear behavior. Thus, the voltages generated by the DAC should be accurate with minimum error so that the worst-case Integral Non-Linearity error (INL) is less than 1mV considering resistor mismatches over process variations. With the increase in the number of VCO-based ADCs on a chip, the test time savings increase exponentially. Thus, the introduction of an on-chip DAC circuitry offers various advantages like decreasing accessibility requirement during the test process, occupying less area, reducing test cost and most importantly, decreasing the reliance on tester.

ContributorsRavouri, Yestina (Author) / Ozev, Sule (Thesis advisor) / Ogras, Umit Y. (Committee member) / Christen, Jennifer Blain (Committee member) / Arizona State University (Publisher)

Created2017

Monitor-Based In-Field Wearout Mitigation for CMOS RF Integrated Circuits

Description

Performance failure due to aging is an increasing concern for RF circuits. While most aging studies are focused on the concept of mean-time-to-failure, for analog circuits, aging results in continuous degradation in performance before it causes catastrophic failures. In this regard, the lifetime of RF/analog circuits, which is defined as…

Performance failure due to aging is an increasing concern for RF circuits. While most aging studies are focused on the concept of mean-time-to-failure, for analog circuits, aging results in continuous degradation in performance before it causes catastrophic failures. In this regard, the lifetime of RF/analog circuits, which is defined as the point where at least one specification fails, is not just determined by aging at the device level, but also by the slack in the specifications, process variations, and the stress conditions on the devices. In this dissertation, firstly, a methodology for analyzing the performance degradation of RF circuits caused by aging mechanisms in MOSFET devices at design-time (pre-silicon) is presented. An algorithm to determine reliability hotspots in the circuit is proposed and design-time optimization methods to enhance the lifetime by making the most likely to fail circuit components more reliable is performed. RF circuits are used as test cases to demonstrate that the lifetime can be enhanced using the proposed design-time technique with low area and no performance impact. Secondly, in-field monitoring and recovering technique for the performance of aged RF circuits is discussed. The proposed in-field technique is based on two phases: During the design time, degradation profiles of the aged circuit are obtained through simulations. From these profiles, hotspot identification of aged RF circuits are conducted and the circuit variable that is easy to measure but highly correlated to the performance of the primary circuit is determined for a monitoring purpose. After deployment, an on-chip DC monitor is periodically activated and its results are used to monitor, and if necessary, recover the circuit performances degraded by aging mechanisms. It is also necessary to co-design the monitoring and recovery mechanism along with the primary circuit for minimal performance impact. A low noise amplifier (LNA) and LC-tank oscillators are fabricated for case studies to demonstrate that the lifetime can be enhanced using the proposed monitoring and recovery techniques in the field. Experimental results with fabricated LNA/oscillator chips show the performance degradation from the accelerated stress conditions and this loss can be recovered by the proposed mitigation scheme.

ContributorsChang, Doo Hwang (Author) / Ozev, Sule (Thesis advisor) / Bakkaloglu, Bertan (Committee member) / Kitchen, Jennifer (Committee member) / Ogras, Umit Y. (Committee member) / Arizona State University (Publisher)

Created2017

Power-Performance Modeling and Adaptive Management of Heterogeneous Mobile Platforms

Description

Nearly 60% of the world population uses a mobile phone, which is typically powered by a system-on-chip (SoC). While the mobile platform capabilities range widely, responsiveness, long battery life and reliability are common design concerns that are crucial to remain competitive. Consequently, state-of-the-art mobile platforms have become highly heterogeneous by…

Nearly 60% of the world population uses a mobile phone, which is typically powered by a system-on-chip (SoC). While the mobile platform capabilities range widely, responsiveness, long battery life and reliability are common design concerns that are crucial to remain competitive. Consequently, state-of-the-art mobile platforms have become highly heterogeneous by combining a powerful SoC with numerous other resources, including display, memory, power management IC, battery and wireless modems. Furthermore, the SoC itself is a heterogeneous resource that integrates many processing elements, such as CPU cores, GPU, video, image, and audio processors. Therefore, CPU cores do not dominate the platform power consumption under many application scenarios.

Competitive performance requires higher operating frequency, and leads to larger power consumption. In turn, power consumption increases the junction and skin temperatures, which have adverse effects on the device reliability and user experience. As a result, allocating the power budget among the major platform resources and temperature control have become fundamental consideration for mobile platforms. Dynamic thermal and power management algorithms address this problem by putting a subset of the processing elements or shared resources to sleep states, or throttling their frequencies. However, an adhoc approach could easily cripple the performance, if it slows down the performance-critical processing element. Furthermore, mobile platforms run a wide range of applications with time varying workload characteristics, unlike early generations, which supported only limited functionality. As a result, there is a need for adaptive power and performance management approaches that consider the platform as a whole, rather than focusing on a subset. Towards this need, our specific contributions include (a) a framework to dynamically select the Pareto-optimal frequency and active cores for the heterogeneous CPUs, such as ARM big.Little architecture, (b) a dynamic power budgeting approach for allocating optimal power consumption to the CPU and GPU using performance sensitivity models for each PE, (c) an adaptive GPU frame time sensitivity prediction model to aid power management algorithms, and (d) an online learning algorithm that constructs adaptive run-time models for non-stationary workloads.

ContributorsGupta, Ujjwala (Author) / Ogras, Umit Y. (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Kishinevsky, Michael (Committee member) / Dutt, Nikil (Committee member) / Arizona State University (Publisher)

Created2018

High Performance Power Management Integrated Circuits for Portable Devices

Description

Portable devices often require multiple power management IC (PMIC) to power different sub-modules, Li-ion batteries are well suited for portable devices because of its small size, high energy density and long life cycle. Since Li-ion battery is the major power source for portable device, fast and high-efficiency battery charging solution…

Portable devices often require multiple power management IC (PMIC) to power different sub-modules, Li-ion batteries are well suited for portable devices because of its small size, high energy density and long life cycle. Since Li-ion battery is the major power source for portable device, fast and high-efficiency battery charging solution has become a major requirement in portable device application.

In the first part of dissertation, a high performance Li-ion switching battery charger is proposed. Cascaded two loop (CTL) control architecture is used for seamless CC-CV transition, time based technique is utilized to minimize controller area and power consumption. Time domain controller is implemented by using voltage controlled oscillator (VCO) and voltage controlled delay line (VCDL). Several efficiency improvement techniques such as segmented power-FET, quasi-zero voltage switching (QZVS) and switching frequency reduction are proposed. The proposed switching battery charger is able to provide maximum 2 A charging current and has an peak efficiency of 93.3%. By configure the charger as boost converter, the charger is able to provide maximum 1.5 A charging current while achieving 96.3% peak efficiency.

The second part of dissertation presents a digital low dropout regulator (DLDO) for system on a chip (SoC) in portable devices application. The proposed DLDO achieve fast transient settling time, lower undershoot/overshoot and higher PSR performance compared to state of the art. By having a good PSR performance, the proposed DLDO is able to power mixed signal load. To achieve a fast load transient response, a load transient detector (LTD) enables boost mode operation of the digital PI controller. The boost mode operation achieves sub microsecond settling time, and reduces the settling time by 50% to 250 ns, undershoot/overshoot by 35% to 250 mV and 17% to 125 mV without compromising the system stability.

ContributorsLim, Chai Yong (Author) / Kiaei, Sayfe (Thesis advisor) / Bakkaloglu, Bertan (Committee member) / Ogras, Umit Y. (Committee member) / Seo, Jae-Sun (Committee member) / Arizona State University (Publisher)

Created2018

Filtering by