Matching Items (17)

133905-Thumbnail Image.png

Networking Pricing Analysis

Description

This thesis examines the impact of price changes of select microprocessors on the market share and 5-year gross profit net present values of Company X in the networking market through

This thesis examines the impact of price changes of select microprocessors on the market share and 5-year gross profit net present values of Company X in the networking market through a multi-step analysis. The networking market includes segments including media processing, cloud services, security, routers & switches, and access points. For this thesis our team focused on the routers & switches, as well as the security segments. Company X wants to capitalize on the expected growth of the networking market as it transitions to its fifth generation (henceforth referred to as 5G) by positioning itself favorably in its customers eyes through high quality products offered at competitive prices. Our team performed a quantitative analysis of benchmark data to measure the performances of Company X's products against those of its competitors. We collected this data from third party computer reviewers, as well as the published reports of Company X and its competitors. Through the use of a preference matrix, we then normalized this performance data to adjust for different scales. In order to provide a well-rounded analysis, we adjusted these normalized performances for power consumption (using thermal design power as a proxy) as well as price. We believe these adjusted performances are more valuable than raw benchmark data, as they appeal to the demands of price-sensitive customers. Based on these comparisons, our team was able to assess price changes for their market and discounted financial impact on Company X. Our findings challenge the current pricing of one of the two products being analyzed and suggests a 9% decrease in the price of said product. This recommendation most effectively positions Company X for the development of 5G by offering the best balance of market share and NPV.

Contributors

Agent

Created

Date Created
  • 2018-05

152892-Thumbnail Image.png

Constrained energy optimization in heterogeneous platforms using generalized scaling models

Description

Mobile platforms are becoming highly heterogeneous by combining a powerful multiprocessor system-on-chip (MpSoC) with numerous resources including display, memory, power management IC (PMIC), battery and wireless modems into a compact

Mobile platforms are becoming highly heterogeneous by combining a powerful multiprocessor system-on-chip (MpSoC) with numerous resources including display, memory, power management IC (PMIC), battery and wireless modems into a compact package. Furthermore, the MpSoC itself is a heterogeneous resource that integrates many processing elements such as CPU cores, GPU, video, image, and audio processors. As a result, optimization approaches targeting mobile computing needs to consider the platform at various levels of granularity.

Platform energy consumption and responsiveness are two major considerations for mobile systems since they determine the battery life and user satisfaction, respectively. In this work, the models for power consumption, response time, and energy consumption of heterogeneous mobile platforms are presented. Then, these models are used to optimize the energy consumption of baseline platforms under power, response time, and temperature constraints with and without introducing new resources. It is shown, the optimal design choices depend on dynamic power management algorithm, and adding new resources is more energy efficient than scaling existing resources alone. The framework is verified through actual experiments on Qualcomm Snapdragon 800 based tablet MDP/T. Furthermore, usage of the framework at both design and runtime optimization is also presented.

Contributors

Agent

Created

Date Created
  • 2014

150197-Thumbnail Image.png

Design of an automated validation environment for a radiation hardened MIPS microprocessor

Description

Ever reducing time to market, along with short product lifetimes, has created a need to shorten the microprocessor design time. Verification of the design and its analysis are two major

Ever reducing time to market, along with short product lifetimes, has created a need to shorten the microprocessor design time. Verification of the design and its analysis are two major components of this design cycle. Design validation techniques can be broadly classified into two major categories: simulation based approaches and formal techniques. Simulation based microprocessor validation involves running millions of cycles using random or pseudo random tests and allows verification of the register transfer level (RTL) model against an architectural model, i.e., that the processor executes instructions as required. The validation effort involves model checking to a high level description or simulation of the design against the RTL implementation. Formal techniques exhaustively analyze parts of the design but, do not verify RTL against the architecture specification. The focus of this work is to implement a fully automated validation environment for a MIPS based radiation hardened microprocessor using simulation based approaches. The basic framework uses the classical validation approach in which the design to be validated is described in a Hardware Definition Language (HDL) such as VHDL or Verilog. To implement a simulation based approach a number of random or pseudo random tests are generated. The output of the HDL based design is compared against the one obtained from a "perfect" model implementing similar functionality, a mismatch in the results would thus indicate a bug in the HDL based design. Effort is made to design the environment in such a manner that it can support validation during different stages of the design cycle. The validation environment includes appropriate changes so as to support architecture changes which are introduced because of radiation hardening. The manner in which the validation environment is build is highly dependent on the specifications of the perfect model used for comparisons. This work implements the validation environment for two MIPS simulators as the reference model. Two bugs have been discovered in the RTL model, using simulation based approaches through the validation environment.

Contributors

Agent

Created

Date Created
  • 2011

153089-Thumbnail Image.png

Data movement energy characterization of emerging smartphone workloads for mobile platforms

Description

A benchmark suite that is representative of the programs a processor typically executes is necessary to understand a processor's performance or energy consumption characteristics. The first contribution of this work

A benchmark suite that is representative of the programs a processor typically executes is necessary to understand a processor's performance or energy consumption characteristics. The first contribution of this work addresses this need for mobile platforms with MobileBench, a selection of representative smartphone applications. In smartphones, like any other portable computing systems, energy is a limited resource. Based on the energy characterization of a commercial widely-used smartphone, application cores are found to consume a significant part of the total energy consumption of the device. With this insight, the subsequent part of this thesis focuses on the portion of energy that is spent to move data from the memory system to the application core's internal registers. The primary motivation for this work comes from the relatively higher power consumption associated with a data movement instruction compared to that of an arithmetic instruction. The data movement energy cost is worsened esp. in a System on Chip (SoC) because the amount of data received and exchanged in a SoC based smartphone increases at an explosive rate. A detailed investigation is performed to quantify the impact of data movement

on the overall energy consumption of a smartphone device. To aid this study, microbenchmarks that generate desired data movement patterns between different levels of the memory hierarchy are designed. Energy costs of data movement are then computed by measuring the instantaneous power consumption of the device when the micro benchmarks are executed. This work makes an extensive use of hardware performance counters to validate the memory access behavior of microbenchmarks and to characterize the energy consumed in moving data. Finally, the calculated energy costs of data movement are used to characterize the portion of energy that MobileBench applications spend in moving data. The results of this study show that a significant 35% of the total device energy is spent in data movement alone. Energy is an increasingly important criteria in the context of designing architectures for future smartphones and this thesis offers insights into data movement energy consumption.

Contributors

Agent

Created

Date Created
  • 2014

153288-Thumbnail Image.png

Register files for embedded low-power applications including microprocessors

Description

Register file (RF) memory is important in low power system on chip (SOC) due to its

inherent low voltage stability. Moreover, designs increasingly use compiled instead of custom memory blocks, which

Register file (RF) memory is important in low power system on chip (SOC) due to its

inherent low voltage stability. Moreover, designs increasingly use compiled instead of custom memory blocks, which frequently employ static, rather than pre-charged dynamic RFs. In this work, the various RFs designed for a microprocessor cache and register files are discussed. Comparison between static and dynamic RF power dissipation and timing characteristics is also presented. The relative timing and power advantages of the designs are shown to be dependent on the memory aspect ratio, i.e. array width and height.

Contributors

Agent

Created

Date Created
  • 2014

154425-Thumbnail Image.png

Post-silicon validation of radiation hardened microprocessor, embedded flash and test structures

Description

Digital systems are essential to the technological advancements in space exploration. Microprocessor and flash memory are the essential parts of such a digital system. Space exploration requires a special class

Digital systems are essential to the technological advancements in space exploration. Microprocessor and flash memory are the essential parts of such a digital system. Space exploration requires a special class of radiation hardened microprocessors and flash memories, which are not functionally disrupted in the presence of radiation. The reference design ‘HERMES’ is a radiation-hardened microprocessor with performance comparable to commercially available designs. The reference design ‘eFlash’ is a prototype of soft-error hardened flash memory for configuring Xilinx FPGAs. These designs are manufactured using a foundry bulk CMOS 90-nm low standby power (LP) process. This thesis presents the post-silicon validation results of these designs.

Contributors

Agent

Created

Date Created
  • 2016

153968-Thumbnail Image.png

Compiler and architecture design for coarse-grained programmable accelerators

Description

The holy grail of computer hardware across all market segments has been to sustain performance improvement at the same pace as silicon technology scales. As the technology scales and the

The holy grail of computer hardware across all market segments has been to sustain performance improvement at the same pace as silicon technology scales. As the technology scales and the size of transistors shrinks, the power consumption and energy usage per transistor decrease. On the other hand, the transistor density increases significantly by technology scaling. Due to technology factors, the reduction in power consumption per transistor is not sufficient to offset the increase in power consumption per unit area. Therefore, to improve performance, increasing energy-efficiency must be addressed at all design levels from circuit level to application and algorithm levels.

At architectural level, one promising approach is to populate the system with hardware accelerators each optimized for a specific task. One drawback of hardware accelerators is that they are not programmable. Therefore, their utilization can be low as they perform one specific function. Using software programmable accelerators is an alternative approach to achieve high energy-efficiency and programmability. Due to intrinsic characteristics of software accelerators, they can exploit both instruction level parallelism and data level parallelism.

Coarse-Grained Reconfigurable Architecture (CGRA) is a software programmable accelerator consists of a number of word-level functional units. Motivated by promising characteristics of software programmable accelerators, the potentials of CGRAs in future computing platforms is studied and an end-to-end CGRA research framework is developed. This framework consists of three different aspects: CGRA architectural design, integration in a computing system, and CGRA compiler. First, the design and implementation of a CGRA and its instruction set is presented. This design is then modeled in a cycle accurate system simulator. The simulation platform enables us to investigate several problems associated with a CGRA when it is deployed as an accelerator in a computing system. Next, the problem of mapping a compute intensive region of a program to CGRAs is formulated. From this formulation, several efficient algorithms are developed which effectively utilize CGRA scarce resources very well to minimize the running time of input applications. Finally, these mapping algorithms are integrated in a compiler framework to construct a compiler for CGRA

Contributors

Agent

Created

Date Created
  • 2015

153414-Thumbnail Image.png

Energy-efficient scheduling for heterogeneous servers in the dark silicon era

Description

Driven by stringent power and thermal constraints, heterogeneous multi-core processors, such as the ARM big-LITTLE architecture, are becoming increasingly popular. In this thesis, the use of low-power heterogeneous multi-cores as

Driven by stringent power and thermal constraints, heterogeneous multi-core processors, such as the ARM big-LITTLE architecture, are becoming increasingly popular. In this thesis, the use of low-power heterogeneous multi-cores as Microservers using web search as a motivational application is addressed. In particular, I propose a new family of scheduling policies for heterogeneous microservers that assign incoming search queries to available cores so as to optimize for performance metrics such as mean response time and service level agreements, while guaranteeing thermally-safe operation. Thorough experimental evaluations on a big-LITTLE platform demonstrate, on an heterogeneous eight-core Samsung Exynos 5422 MpSoC, with four big and little cores each, that naive performance oriented scheduling policies quickly result in thermal instability, while the proposed policies not only reduce peak temperature but also achieve 4.8x reduction in processing time and 5.6x increase in energy efficiency compared to baseline scheduling policies.

Contributors

Agent

Created

Date Created
  • 2015

152388-Thumbnail Image.png

Chip level implementation techniques for radiation hardened microprocessors

Description

Microprocessors are the processing heart of any digital system and are central to all the technological advancements of the age including space exploration and monitoring. The demands of space exploration

Microprocessors are the processing heart of any digital system and are central to all the technological advancements of the age including space exploration and monitoring. The demands of space exploration require a special class of microprocessors called radiation hardened microprocessors which are less susceptible to radiation present outside the earth's atmosphere, in other words their functioning is not disrupted even in presence of disruptive radiation. The presence of these particles forces the designers to come up with design techniques at circuit and chip levels to alleviate the errors which can be encountered in the functioning of microprocessors. Microprocessor evolution has been very rapid in terms of performance but the same cannot be said about its rad-hard counterpart. With the total data processing capability overall increasing rapidly, the clear lack of performance of the processors manifests as a bottleneck in any processing system. To design high performance rad-hard microprocessors designers have to overcome difficult design problems at various design stages i.e. Architecture, Synthesis, Floorplanning, Optimization, routing and analysis all the while maintaining circuit radiation hardness. The reference design `HERMES' is targeted at 90nm IBM G process and is expected to reach 500Mhz which is twice as fast any processor currently available. Chapter 1 talks about the mechanisms of radiation effects which cause upsets and degradation to the functioning of digital circuits. Chapter 2 gives a brief description of the components which are used in the design and are part of the consistent efforts at ASUVLSI lab culminating in this chip level implementation of the design. Chapter 3 explains the basic digital design ASIC flow and the changes made to it leading to a rad-hard specific ASIC flow used in implementing this chip. Chapter 4 talks about the triple mode redundant (TMR) specific flow which is used in the block implementation, delineating the challenges faced and the solutions proposed to make the flow work. Chapter 5 explains the challenges faced and solutions arrived at while using the top-level flow described in chapter 3. Chapter 6 puts together the results and analyzes the design in terms of basic integrated circuit design constraints.

Contributors

Agent

Created

Date Created
  • 2013

152415-Thumbnail Image.png

Compiler and runtime for memory management on software managed manycore processors

Description

We are expecting hundreds of cores per chip in the near future. However, scaling the memory architecture in manycore architectures becomes a major challenge. Cache coherence provides a single image

We are expecting hundreds of cores per chip in the near future. However, scaling the memory architecture in manycore architectures becomes a major challenge. Cache coherence provides a single image of memory at any time in execution to all the cores, yet coherent cache architectures are believed will not scale to hundreds and thousands of cores. In addition, caches and coherence logic already take 20-50% of the total power consumption of the processor and 30-60% of die area. Therefore, a more scalable architecture is needed for manycore architectures. Software Managed Manycore (SMM) architectures emerge as a solution. They have scalable memory design in which each core has direct access to only its local scratchpad memory, and any data transfers to/from other memories must be done explicitly in the application using Direct Memory Access (DMA) commands. Lack of automatic memory management in the hardware makes such architectures extremely power-efficient, but they also become difficult to program. If the code/data of the task mapped onto a core cannot fit in the local scratchpad memory, then DMA calls must be added to bring in the code/data before it is required, and it may need to be evicted after its use. However, doing this adds a lot of complexity to the programmer's job. Now programmers must worry about data management, on top of worrying about the functional correctness of the program - which is already quite complex. This dissertation presents a comprehensive compiler and runtime integration to automatically manage the code and data of each task in the limited local memory of the core. We firstly developed a Complete Circular Stack Management. It manages stack frames between the local memory and the main memory, and addresses the stack pointer problem as well. Though it works, we found we could further optimize the management for most cases. Thus a Smart Stack Data Management (SSDM) is provided. In this work, we formulate the stack data management problem and propose a greedy algorithm for the same. Later on, we propose a general cost estimation algorithm, based on which CMSM heuristic for code mapping problem is developed. Finally, heap data is dynamic in nature and therefore it is hard to manage it. We provide two schemes to manage unlimited amount of heap data in constant sized region in the local memory. In addition to those separate schemes for different kinds of data, we also provide a memory partition methodology.

Contributors

Agent

Created

Date Created
  • 2014