Search Content

Matching Items (3)

Filtering by

System level power and thermal management on embedded processors

Description

Semiconductor scaling technology has led to a sharp growth in transistor counts. This has resulted in an exponential increase on both power dissipation and heat flux (or power density) in modern microprocessors. These microprocessors are integrated as the major components in many modern embedded devices, which offer richer features and attain higher performance than ever before. Therefore, power and thermal management have become the significant design considerations for modern embedded devices. Dynamic voltage/frequency scaling (DVFS) and dynamic power management (DPM) are two well-known hardware capabilities offered by modern embedded processors. However, the power or thermal aware performance optimization is not fully explored for the mainstream embedded processors with discrete DVFS and DPM capabilities. Many key problems have not been answered yet. What is the maximum performance that an embedded processor can achieve under power or thermal constraint for a periodic application? Does there exist an efficient algorithm for the power or thermal management problems with guaranteed quality bound? These questions are hard to be answered because the discrete settings of DVFS and DPM enhance the complexity of many power and thermal management problems, which are generally NP-hard. The dissertation presents a comprehensive study on these NP-hard power and thermal management problems for embedded processors with discrete DVFS and DPM capabilities. In the domain of power management, the dissertation addresses the power minimization problem for real-time schedules, the energy-constrained make-span minimization problem on homogeneous and heterogeneous chip multiprocessors (CMP) architectures, and the battery aware energy management problem with nonlinear battery discharging model. In the domain of thermal management, the work addresses several thermal-constrained performance maximization problems for periodic embedded applications. All the addressed problems are proved to be NP-hard or strongly NP-hard in the study. Then the work focuses on the design of the off-line optimal or polynomial time approximation algorithms as solutions in the problem design space. Several addressed NP-hard problems are tackled by dynamic programming with optimal solutions and pseudo-polynomial run time complexity. Because the optimal algorithms are not efficient in worst case, the fully polynomial time approximation algorithms are provided as more efficient solutions. Some efficient heuristic algorithms are also presented as solutions to several addressed problems. The comprehensive study answers the key questions in order to fully explore the power and thermal management potentials on embedded processors with discrete DVFS and DPM capabilities. The provided solutions enable the theoretical analysis of the maximum performance for periodic embedded applications under power or thermal constraints.

ContributorsZhang, Sushu (Author) / Chatha, Karam S (Thesis advisor) / Cao, Yu (Committee member) / Konjevod, Goran (Committee member) / Vrudhula, Sarma (Committee member) / Xue, Guoliang (Committee member) / Arizona State University (Publisher)

Created2012

Smart compilers for reliable and power-efficient embedded computing

Description

Thanks to continuous technology scaling, intelligent, fast and smaller digital systems are now available at affordable costs. As a result, digital systems have found use in a wide range of application areas that were not even imagined before, including medical (e.g., MRI, remote or post-operative monitoring devices, etc.), automotive (e.g., adaptive cruise control, anti-lock brakes, etc.), security systems (e.g., residential security gateways, surveillance devices, etc.), and in- and out-of-body sensing (e.g., capsule swallowed by patients measuring digestive system pH, heart monitors, etc.). Such computing systems, which are completely embedded within the application, are called embedded systems, as opposed to general purpose computing systems. In the design of such embedded systems, power consumption and reliability are indispensable system requirements. In battery operated portable devices, the battery is the single largest factor contributing to device cost, weight, recharging time, frequency and ultimately its usability. For example, in the Apple iPhone 4 smart-phone, the battery is $40\%$ of the device weight, occupies $36\%$ of its volume and allows only $7$ hours (over 3G) of talk time. As embedded systems find use in a range of sensitive applications, from bio-medical applications to safety and security systems, the reliability of the computations performed becomes a crucial factor. At our current technology-node, portable embedded systems are prone to expect failures due to soft errors at the rate of once-per-year; but with aggressive technology scaling, the rate is predicted to increase exponentially to once-per-hour. Over the years, researchers have been successful in developing techniques, implemented at different layers of the design-spectrum, to improve system power efficiency and reliability. Among the layers of design abstraction, I observe that the interface between the compiler and processor micro-architecture possesses a unique potential for efficient design optimizations. A compiler designer is able to observe and analyze the application software at a finer granularity; while the processor architect analyzes the system output (power, performance, etc.) for each executed instruction. At the compiler micro-architecture interface, if the system knowledge at the two design layers can be integrated, design optimizations at the two layers can be modified to efficiently utilize available resources and thereby achieve appreciable system-level benefits. To this effect, the thesis statement is that, ``by merging system design information at the compiler and micro-architecture design layers, smart compilers can be developed, that achieve reliable and power-efficient embedded computing through: i) Pure compiler techniques, ii) Hybrid compiler micro-architecture techniques, and iii) Compiler-aware architectures''. In this dissertation demonstrates, through contributions in each of the three compiler-based techniques, the effectiveness of smart compilers in achieving power-efficiency and reliability in embedded systems.

ContributorsJeyapaul, Reiley (Author) / Shrivastava, Aviral (Thesis advisor) / Vrudhula, Sarma (Committee member) / Clark, Lawrence (Committee member) / Colbourn, Charles (Committee member) / Arizona State University (Publisher)

Created2012

Compilation of stream programs onto embedded multicore architectures

Description

In recent years, we have observed the prevalence of stream applications in many embedded domains. Stream programs distinguish themselves from traditional sequential programming languages through well defined independent actors, explicit data communication, and stable code/data access patterns. In order to achieve high performance and low power, scratch pad memory (SPM) has been introduced in today's embedded multicore processors. Current design frameworks for developing stream applications on SPM enhanced embedded architectures typically do not include a compiler that can perform automatic partitioning, mapping and scheduling under limited on-chip SPM capacities and memory access delays. Consequently, many designs are implemented manually, which leads to lengthy tasks and inferior designs. In this work, optimization techniques that automatically compile stream programs onto embedded multi-core architectures are proposed. As an initial case study, we implemented an automatic target recognition (ATR) algorithm on the IBM Cell Broadband Engine (BE). Then integer linear programming (ILP) and heuristic approaches were proposed to schedule stream programs on a single core embedded processor that has an SPM with code overlay. Later, ILP and heuristic approaches for Compiling Stream programs on SPM enhanced Multicore Processors (CSMP) were studied. The proposed CSMP ILP and heuristic approaches do not optimize for cycles in stream applications. Further, the number of software pipeline stages in the implementation is dependent on actor to processing engine (PE) mapping and is uncontrollable. We next presented a Retiming technique for Throughput optimization on Embedded Multi-core processors (RTEM). RTEM approach inherently handles cycles and can accept an upper bound on the number of software pipeline stages to be generated. We further enhanced RTEM by incorporating unrolling (URSTEM) that preserves all the beneficial properties of RTEM heuristic and also scales with the number of PEs through unrolling.

ContributorsChe, Weijia (Author) / Chatha, Karam Singh (Thesis advisor) / Vrudhula, Sarma (Committee member) / Chakrabarti, Chaitali (Committee member) / Shrivastava, Aviral (Committee member) / Arizona State University (Publisher)

Created2012