
System-level synthesis of dataplane subsystems for MPSoCs

Description

In recent years we have witnessed a shift towards multi-processor systems-on-chip (MPSoCs) to address the demands of embedded devices (such as cell phones, GPS devices, and luxury car features). Highly optimized MPSoCs are well suited to the complex applications that end users demand. These MPSoCs incorporate a constellation of heterogeneous processing elements (PEs), both general-purpose PEs and application-specific integrated circuits (ASICs). A typical MPSoC is composed of an application processor, such as an ARM Cortex-A9 with a cache-coherent memory hierarchy, and several application sub-systems. Each of these sub-systems is composed of highly optimized instruction processors, graphics/DSP processors, and custom hardware accelerators. Typically, these sub-systems utilize scratchpad memories (SPMs) rather than supporting cache coherency. The overall architecture integrates the various sub-systems through a high-bandwidth system-level interconnect (such as a Network-on-Chip (NoC)). The shift to MPSoCs has been fueled by three major factors: demand for high performance, the use of component libraries, and short design turnaround time. As customers continue to desire more and more complex applications on their embedded devices, the performance demand for these devices continues to increase. Designers have turned to MPSoCs to address this demand. By using pre-made IP libraries, designers can quickly piece together an MPSoC that meets the application demands of the end user with minimal time spent designing new hardware. Additionally, the use of MPSoCs allows designers to generate new devices very quickly, thus reducing the time to market. In this work, a complete MPSoC synthesis design flow is presented. We first present a technique \cite{leary1_intro} to address the synthesis of the interconnect architecture (particularly the Network-on-Chip (NoC)). We then address the synthesis of the memory architecture of an MPSoC sub-system \cite{leary2_intro}. Lastly, we present a co-synthesis technique to generate the functional and memory architectures simultaneously. The validity and quality of each synthesis technique are demonstrated through extensive experimentation.

Contributors: Leary, Glenn (Author) / Chatha, Karamvir S (Thesis advisor) / Vrudhula, Sarma (Committee member) / Shrivastava, Aviral (Committee member) / Beraha, Rudy (Committee member) / Arizona State University (Publisher)

Created: 2013

A structured design methodology for high performance VLSI arrays

Description

With the geometric growth in integrated circuit technology due to transistor scaling, along with the system-on-chip design strategy, the complexity of integrated circuits has increased manifold. Achieving a short time to market with high reliability and performance is one of the most competitive challenges. Both custom and ASIC design methodologies have evolved over time to cope with this, but the high manual labor in custom design and the statistic design in ASIC are still causes of concern. This work proposes a new circuit design strategy that focuses mostly on arrayed structures such as TLB, RF, cache, and IPCAM, reduces the manual effort to a great extent, and makes the design regular and repetitive while still achieving high performance. The method proposes making the complete design a custom schematic but using standard cells. This requires adding some custom cells to the already exhaustive library to optimize the design for performance. Once the schematic is finalized, the designer places these standard cells in a spreadsheet, placing the cells in the critical paths close together. A Perl script then generates a Cadence Encounter compatible placement file. The design is then routed in Encounter. Since the designer is the best judge of the circuit architecture, placement by the designer allows the most optimal design to be achieved. Several designs, including IPCAM, issue logic, TLB, RF, and cache designs, were carried out and their performance was compared against the fully custom and ASIC flows. The TLB, RF, and cache were part of the HEMES microprocessor.

Contributors: Maurya, Satendra Kumar (Author) / Clark, Lawrence T (Thesis advisor) / Holbert, Keith E. (Committee member) / Vrudhula, Sarma (Committee member) / Allee, David (Committee member) / Arizona State University (Publisher)

Created: 2012

The value of a STEM PhD

Description

The quality and quantity of talented members of the US STEM workforce have been a subject of great interest to policy and decision makers for the past 40 years. Recent research indicates that while there exist specific shortages in specific disciplines and areas of expertise in the private sector and the federal government, there is no noticeable shortage in any STEM academic discipline, but rather a surplus of PhDs vying for increasingly scarce tenure-track positions. Despite the seeming availability of industry and private sector jobs, recent PhDs still struggle to find employment in those areas. I argue that the decades-old narrative suggesting a shortage of STEM PhDs in the US poses a threat to the value of the natural science PhD, and that this narrative contributes significantly to why so many PhDs struggle to find career employment in their fields. This study aims to address the following question: what is the value of a STEM PhD outside academia? I begin with a critical review of existing literature, and then analyze programmatic documents for STEM PhD programs at ASU, interviews with industry employers, and an examination of the public face of value for these degrees. I then uncover the nature of the value alignment, value disconnect, and value erosion in the ecosystem which produces and then employs STEM PhDs, concluding with specific areas which merit special consideration in an effort to increase the value of these degrees for all stakeholders involved.

Contributors: Garbee, Elizabeth (Author) / Maynard, Andrew D. (Thesis advisor) / Wetmore, Jameson (Committee member) / Anderson, Derrick (Committee member) / Arizona State University (Publisher)

Created: 2018

Design & Analysis of a 21st Century, Scalable, Student-Centric Model of Innovation at the Collegiate Level

Description

The Luminosity Lab, located at Arizona State University, is a prototype for a novel model of interdisciplinary, student-led innovation. The model’s design was informed by the following desired outcomes: i) the model would be well-suited for the 21st century, ii) it would attract, motivate, and retain the university’s strongest student talent, iii) it would operate without the oversight of faculty, and iv) it would work towards the conceptualization, design, development, and deployment of solutions that would positively impact society. This model of interdisciplinary research was tested at Arizona State University across four academic years with participation of over 200 students, who represented more than 20 academic disciplines. The results have shown successful integration of interdisciplinary expertise to identify unmet needs, design innovative concepts, and develop research-informed solutions. This dissertation analyzes Luminosity’s model to determine the following: i) Can a collegiate, student-driven interdisciplinary model of innovation designed for the 21st century perform without faculty management? ii) What are the motivators and culture that enable student success within this model? and iii) How does Luminosity differ from traditional research opportunities and learning experiences?
Through a qualitative, grounded theory analysis, this dissertation examines the phenomena of the students engaging in Luminosity’s model, who have demonstrated their ability to serve as the principal investigators and innovators in conducting substantial discovery, research, and innovation work through full project life cycles. This study supports a theory that highly talented students often feel limited by the pace and scope of their college educations, and yearn for experiences that motivate them with agency, achievement, mastery, affinity for colleagues, and a desire to impact society. Through the cumulative effect of these motivators and an organizational design that facilitates a bottom-up approach to student-driven innovation, Luminosity has established itself as a novel model of research and development in the collegiate space.

Contributors: Naufel, Mark Naufel (Author) / Becker, David V (Thesis advisor) / Cooke, Nancy J. (Committee member) / Anderson, Derrick (Committee member) / Arizona State University (Publisher)

Created: 2020

Reduced Order Models and Approximations for Hardware Acceleration of Neural Networks

Description

Many real-world engineering problems require simulations to evaluate the design objectives and constraints. Often, due to the complexity of the system model, simulations can be prohibitive in terms of computation time. One approach to overcome this issue is to construct a surrogate model, which approximates the original model. The focus of this work is on data-driven surrogate models, in which empirical approximations of the output are performed given the input parameters. Recently, neural networks (NNs) have re-emerged as a popular method for constructing data-driven surrogate models. Although NNs have achieved excellent accuracy and are widely used, they pose their own challenges. This work addresses two common challenges, the need for: (1) hardware acceleration and (2) uncertainty quantification (UQ) in the presence of input variability. The high demand in the inference phase of deep NNs in cloud servers and edge devices calls for the design of low-power custom hardware accelerators. The first part of this work describes the design of an energy-efficient long short-term memory (LSTM) accelerator. The overarching goal is to aggressively reduce the power consumption and area of the LSTM components using approximate computing, and then use architectural-level techniques to boost the performance. The proposed design is synthesized, placed, and routed as an application-specific integrated circuit (ASIC). The results demonstrate that this accelerator is 1.2X more energy-efficient and 3.6X more area-efficient than the baseline LSTM. In the second part of this work, a robust framework for addressing UQ is developed based on an alternate data-driven surrogate model referred to as polynomial chaos expansion (PCE). In contrast to many existing approaches, no assumptions are made on the elements of the function space, and UQ is a function of the expansion coefficients. Moreover, the sensitivity of the output with respect to any subset of the input variables can be computed analytically by post-processing the PCE coefficients. This provides a systematic and incremental method for pruning or changing the order of the model. This framework is evaluated on several real-world applications from different domains and is extended to classification tasks as well.
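To illustrate how a sensitivity measure can be read off expansion coefficients, here is a minimal sketch. It is not the dissertation's framework. It assumes an orthonormal polynomial basis with independent inputs, so that the output variance equals the sum of squared non-constant coefficients, and it uses the standard variance-based (Sobol-style) index. The function names and the toy coefficients are hypothetical.

```python
from typing import Dict, Iterable, Tuple

MultiIndex = Tuple[int, ...]  # polynomial degree per input variable


def pce_variance(coeffs: Dict[MultiIndex, float]) -> float:
    """Output variance under an orthonormal basis (assumption):
    the sum of squared coefficients of all non-constant terms."""
    return sum(c * c for alpha, c in coeffs.items() if any(alpha))


def subset_sensitivity(coeffs: Dict[MultiIndex, float],
                       subset: Iterable[int]) -> float:
    """Fraction of variance explained by terms that involve only
    the variables in `subset` (closed variance-based index)."""
    keep = set(subset)
    part = sum(
        c * c
        for alpha, c in coeffs.items()
        if any(alpha)
        and all(i in keep for i, deg in enumerate(alpha) if deg > 0)
    )
    return part / pce_variance(coeffs)


def total_sensitivity(coeffs: Dict[MultiIndex, float], var: int) -> float:
    """Fraction of variance from every term in which `var` appears,
    including its interactions with other variables."""
    part = sum(c * c for alpha, c in coeffs.items() if alpha[var] > 0)
    return part / pce_variance(coeffs)


# Hypothetical two-input expansion: constant, x0, x1, and an x0*x1 term.
toy_coeffs = {(0, 0): 1.0, (1, 0): 2.0, (0, 1): 1.0, (1, 1): 1.0}
s_x0 = subset_sensitivity(toy_coeffs, [0])     # 4/6 of the variance
s_all = subset_sensitivity(toy_coeffs, [0, 1])  # all of the variance
t_x0 = total_sensitivity(toy_coeffs, 0)         # (4 + 1)/6
```

Under these assumptions, a variable whose total index is near zero contributes almost nothing to the output variance, which is one way such indices could guide pruning of model terms.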

Contributors: Azari, Elham (Author) / Vrudhula, Sarma (Thesis advisor) / Fainekos, Georgios (Committee member) / Ren, Fengbo (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)

Created: 2021
