Search Content

StreamWorks: an energy-efficient embedded co-processor for stream computing

Description

Stream processing has emerged as an important model of computation especially in the context of multimedia and communication sub-systems of embedded System-on-Chip (SoC) architectures. The dataflow nature of streaming applications allows them to be most naturally expressed as a set of kernels iteratively operating on continuous streams of data. The…

Stream processing has emerged as an important model of computation especially in the context of multimedia and communication sub-systems of embedded System-on-Chip (SoC) architectures. The dataflow nature of streaming applications allows them to be most naturally expressed as a set of kernels iteratively operating on continuous streams of data. The kernels are computationally intensive and are mainly characterized by real-time constraints that demand high throughput and data bandwidth with limited global data reuse. Conventional architectures fail to meet these demands due to their poorly matched execution models and the overheads associated with instruction and data movements.

This work presents StreamWorks, a multi-core embedded architecture for energy-efficient stream computing. The basic processing element in the StreamWorks architecture is the StreamEngine (SE) which is responsible for iteratively executing a stream kernel. SE introduces an instruction locking mechanism that exploits the iterative nature of the kernels and enables fine-grain instruction reuse. Each instruction in a SE is locked to a Reservation Station (RS) and revitalizes itself after execution; thus never retiring from the RS. The entire kernel is hosted in RS Banks (RSBs) close to functional units for energy-efficient instruction delivery. The dataflow semantics of stream kernels are captured by a context-aware dataflow execution mode that efficiently exploits the Instruction Level Parallelism (ILP) and Data-level parallelism (DLP) within stream kernels.

Multiple SEs are grouped together to form a StreamCluster (SC) that communicate via a local interconnect. A novel software FIFO virtualization technique with split-join functionality is proposed for efficient and scalable stream communication across SEs. The proposed communication mechanism exploits the Task-level parallelism (TLP) of the stream application. The performance and scalability of the communication mechanism is evaluated against the existing data movement schemes for scratchpad based multi-core architectures. Further, overlay schemes and architectural support are proposed that allow hosting any number of kernels on the StreamWorks architecture. The proposed oevrlay schemes for code management supports kernel(context) switching for the most common use cases and can be adapted for any multi-core architecture that use software managed local memories.

The performance and energy-efficiency of the StreamWorks architecture is evaluated for stream kernel and application benchmarks by implementing the architecture in 45nm TSMC and comparison with a low power RISC core and a contemporary accelerator.

ContributorsPanda, Amrit (Author) / Chatha, Karam S. (Thesis advisor) / Wu, Carole-Jean (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Shrivastava, Aviral (Committee member) / Arizona State University (Publisher)

Created2014

Evidence-based development of trustworthy mobile medical apps

Description

Widespread adoption of smartphone based Mobile Medical Apps (MMAs) is opening new avenues for innovation, bringing MMAs to the forefront of low cost healthcare delivery. These apps often control human physiology and work on sensitive data. Thus it is necessary to have evidences of their trustworthiness i.e. maintaining privacy of…

Widespread adoption of smartphone based Mobile Medical Apps (MMAs) is opening new avenues for innovation, bringing MMAs to the forefront of low cost healthcare delivery. These apps often control human physiology and work on sensitive data. Thus it is necessary to have evidences of their trustworthiness i.e. maintaining privacy of health data, long term operation of wearable sensors and ensuring no harm to the user before actual marketing. Traditionally, clinical studies are used to validate the trustworthiness of medical systems. However, they can take long time and could potentially harm the user. Such evidences can be generated using simulations and mathematical analysis. These methods involve estimating the MMA interactions with human physiology. However, the nonlinear nature of human physiology makes the estimation challenging.

This research analyzes and develops MMA software while considering its interactions with human physiology to assure trustworthiness. A novel app development methodology is used to objectively evaluate trustworthiness of a MMA by generating evidences using automatic techniques. It involves developing the Health-Dev β tool to generate a) evidences of trustworthiness of MMAs and b) requirements assured code generation for vulnerable components of the MMA without hindering the app development process. In this method, all requests from MMAs pass through a trustworthy entity, Trustworthy Data Manager which checks if the app request satisfies the MMA requirements. This method is intended to expedite the design to marketing process of MMAs. The objectives of this research is to develop models, tools and theory for evidence generation and can be divided into the following themes:

• Sustainable design configuration estimation of MMAs: Developing an optimization framework which can generate sustainable and safe sensor configuration while considering interactions of the MMA with the environment.

• Evidence generation using simulation and formal methods: Developing models and tools to verify safety properties of the MMA design to ensure no harm to the human physiology.

• Automatic code generation for MMAs: Investigating methods for automatically

• Performance analysis of trustworthy data manager: Evaluating response time generating trustworthy software for vulnerable components of a MMA and evidences.performance of trustworthy data manager under interactions from non-MMA smartphone apps.

ContributorsBagade, Priyanka (Author) / Gupta, Sandeep K. S. (Thesis advisor) / Wu, Carole-Jean (Committee member) / Doupe, Adam (Committee member) / Zhang, Yi (Committee member) / Arizona State University (Publisher)

Created2015

WCET-aware scratchpad memory management for hard real-time systems

Description

Cyber-physical systems and hard real-time systems have strict timing constraints that specify deadlines until which tasks must finish their execution. Missing a deadline can cause unexpected outcome or endanger human lives in safety-critical applications, such as automotive or aeronautical systems. It is, therefore, of utmost importance to obtain and optimize…

Cyber-physical systems and hard real-time systems have strict timing constraints that specify deadlines until which tasks must finish their execution. Missing a deadline can cause unexpected outcome or endanger human lives in safety-critical applications, such as automotive or aeronautical systems. It is, therefore, of utmost importance to obtain and optimize a safe upper bound of each task’s execution time or the worst-case execution time (WCET), to guarantee the absence of any missed deadline. Unfortunately, conventional microarchitectural components, such as caches and branch predictors, are only optimized for average-case performance and often make WCET analysis complicated and pessimistic. Caches especially have a large impact on the worst-case performance due to expensive off- chip memory accesses involved in cache miss handling. In this regard, software-controlled scratchpad memories (SPMs) have become a promising alternative to caches. An SPM is a raw SRAM, controlled only by executing data movement instructions explicitly at runtime, and such explicit control facilitates static analyses to obtain safe and tight upper bounds of WCETs. SPM management techniques, used in compilers targeting an SPM-based processor, determine how to use a given SPM space by deciding where to insert data movement instructions and what operations to perform at those program locations. This dissertation presents several management techniques for program code and stack data, which aim to optimize the WCETs of a given program. The proposed code management techniques include optimal allocation algorithms and a polynomial-time heuristic for allocating functions to the SPM space, with or without the use of abstraction of SPM regions, and a heuristic for splitting functions into smaller partitions. The proposed stack data management technique, on the other hand, finds an optimal set of program locations to evict and restore stack frames to avoid stack overflows, when the call stack resides in a size-limited SPM. In the evaluation, the WCETs of various benchmarks including real-world automotive applications are statically calculated for SPMs and caches in several different memory configurations.

ContributorsKim, Yooseong (Author) / Shrivastava, Aviral (Thesis advisor) / Broman, David (Committee member) / Fainekos, Georgios (Committee member) / Wu, Carole-Jean (Committee member) / Arizona State University (Publisher)

Created2017

Warning a distracted driver: smart phone applications, informative warnings and automated driving take-over requests

Description

While various collision warning studies in driving have been conducted, only a handful of studies have investigated the effectiveness of warnings with a distracted driver. Across four experiments, the present study aimed to understand the apparent gap in the literature of distracted drivers and warning effectiveness, specifically by studying various…

While various collision warning studies in driving have been conducted, only a handful of studies have investigated the effectiveness of warnings with a distracted driver. Across four experiments, the present study aimed to understand the apparent gap in the literature of distracted drivers and warning effectiveness, specifically by studying various warnings presented to drivers while they were operating a smart phone. Experiment One attempted to understand which smart phone tasks, (text vs image) or (self-paced vs other-paced) are the most distracting to a driver. Experiment Two compared the effectiveness of different smartphone based applications (app’s) for mitigating driver distraction. Experiment Three investigated the effects of informative auditory and tactile warnings which were designed to convey directional information to a distracted driver (moving towards or away). Lastly, Experiment Four extended the research into the area of autonomous driving by investigating the effectiveness of different auditory take-over request signals. Novel to both Experiment Three and Four was that the warnings were delivered from the source of the distraction (i.e., by either the sound triggered at the smart phone location or through a vibration given on the wrist of the hand holding the smart phone). This warning placement was an attempt to break the driver’s attentional focus on their smart phone and understand how to best re-orient the driver in order to improve the driver’s situational awareness (SA). The overall goal was to explore these novel methods of improved SA so drivers may more quickly and appropriately respond to a critical event.

ContributorsMcNabb, Jaimie Christine (Author) / Gray, Dr. Rob (Thesis advisor) / Branaghan, Dr. Russell (Committee member) / Becker, Dr. Vaughn (Committee member) / Arizona State University (Publisher)

Created2017

Filtering by

StreamWorks: an energy-efficient embedded co-processor for stream computing

Evidence-based development of trustworthy mobile medical apps

WCET-aware scratchpad memory management for hard real-time systems

Warning a distracted driver: smart phone applications, informative warnings and automated driving take-over requests