Matching Items (3)
Filtering by

Clear all filters

151945-Thumbnail Image.png
Description
In recent years we have witnessed a shift towards multi-processor system-on-chips (MPSoCs) to address the demands of embedded devices (such as cell phones, GPS devices, luxury car features, etc.). Highly optimized MPSoCs are well-suited to tackle the complex application demands desired by the end user customer. These MPSoCs incorporate a

In recent years we have witnessed a shift towards multi-processor system-on-chips (MPSoCs) to address the demands of embedded devices (such as cell phones, GPS devices, luxury car features, etc.). Highly optimized MPSoCs are well-suited to tackle the complex application demands desired by the end user customer. These MPSoCs incorporate a constellation of heterogeneous processing elements (PEs) (general purpose PEs and application-specific integrated circuits (ASICS)). A typical MPSoC will be composed of a application processor, such as an ARM Coretex-A9 with cache coherent memory hierarchy, and several application sub-systems. Each of these sub-systems are composed of highly optimized instruction processors, graphics/DSP processors, and custom hardware accelerators. Typically, these sub-systems utilize scratchpad memories (SPM) rather than support cache coherency. The overall architecture is an integration of the various sub-systems through a high bandwidth system-level interconnect (such as a Network-on-Chip (NoC)). The shift to MPSoCs has been fueled by three major factors: demand for high performance, the use of component libraries, and short design turn around time. As customers continue to desire more and more complex applications on their embedded devices the performance demand for these devices continues to increase. Designers have turned to using MPSoCs to address this demand. By using pre-made IP libraries designers can quickly piece together a MPSoC that will meet the application demands of the end user with minimal time spent designing new hardware. Additionally, the use of MPSoCs allows designers to generate new devices very quickly and thus reducing the time to market. In this work, a complete MPSoC synthesis design flow is presented. We first present a technique \cite{leary1_intro} to address the synthesis of the interconnect architecture (particularly Network-on-Chip (NoC)). We then address the synthesis of the memory architecture of a MPSoC sub-system \cite{leary2_intro}. Lastly, we present a co-synthesis technique to generate the functional and memory architectures simultaneously. The validity and quality of each synthesis technique is demonstrated through extensive experimentation.
ContributorsLeary, Glenn (Author) / Chatha, Karamvir S (Thesis advisor) / Vrudhula, Sarma (Committee member) / Shrivastava, Aviral (Committee member) / Beraha, Rudy (Committee member) / Arizona State University (Publisher)
Created2013
152993-Thumbnail Image.png
Description
The need for multi-core architectural trends was realized in the desktop computing domain fairly long back. This trend is also beginning to be seen in the deeply embedded systems such as automotive and avionics industry owing to ever increasing demands in terms of sheer computational bandwidth, responsiveness, reliability and power

The need for multi-core architectural trends was realized in the desktop computing domain fairly long back. This trend is also beginning to be seen in the deeply embedded systems such as automotive and avionics industry owing to ever increasing demands in terms of sheer computational bandwidth, responsiveness, reliability and power consumption constraints. The adoption of such multi-core architectures in safety critical systems is often met with resistance owing to the overhead in migration of the existing stable code base to the new system setup, typically requiring extensive re-design. This also brings about the need for exhaustive testing and validation that goes hand in hand with such a migration, especially in safety critical real-time systems.

This project highlights the steps to develop an asymmetric multiprocessing variant of Micrium µC/OS-II real-time operating system suited for a multi-core system. This RTOS variant also supports multi-core synchronization, shared memory management and multi-core messaging queues.

Since such specialized embedded systems are usually developed by system designers focused more so on the functionality than on the coding standards, the adoption of automatic production code generation tools, such as SIMULINK's Embedded Coder, is increasingly becoming the industry norm. Such tools are capable of producing robust, industry compliant code with very little roll out time. This project documents the process of extending SIMULINK's automatic code generation tool for the AMP variant of µC/OS-II on Freescale's MPC5675K, dual-core Microcontroller Unit. This includes code generation from task based models and multi-rate models. Apart from this, it also de-scribes the development of additional software tools to allow semantically consistent communication between task on the same kernel and those across the kernels.
ContributorsBulusu, Girish Rao (Author) / Lee, Yann-Hang (Thesis advisor) / Fainekos, Georgios (Committee member) / Wu, Carole-Jean (Committee member) / Arizona State University (Publisher)
Created2014
153193-Thumbnail Image.png
Description
As the number of cores per chip increases, maintaining cache coherence becomes prohibitive for both power and performance. Non Coherent Cache (NCC) architectures do away with hardware-based cache coherence, but they become difficult to program. Some existing architectures provide a middle ground by providing some shared memory in the hardware.

As the number of cores per chip increases, maintaining cache coherence becomes prohibitive for both power and performance. Non Coherent Cache (NCC) architectures do away with hardware-based cache coherence, but they become difficult to program. Some existing architectures provide a middle ground by providing some shared memory in the hardware. Specifically, the 48-core Intel Single-chip Cloud Computer (SCC) provides some off-chip (DRAM) shared memory some on-chip (SRAM) shared memory. We call such architectures Hybrid Shared Memory, or HSM, manycore architectures. However, how to efficiently execute multi-threaded programs on HSM architectures is an open problem. To be able to execute a multi-threaded program correctly on HSM architectures, the compiler must: i) identify all the shared data and map it to the shared memory, and ii) map the frequently accessed shared data to the on-chip shared memory. This work presents a source-to-source translator written using CETUS that identifies a conservative superset of all the shared data in a multi-threaded application and maps it to the shared memory such that it enables execution on HSM architectures.
ContributorsRawat, Tushar (Author) / Shrivastava, Aviral (Thesis advisor) / Dasgupta, Partha (Committee member) / Fainekos, Georgios (Committee member) / Arizona State University (Publisher)
Created2014