
Accelerating Linear Algebra and Machine Learning Kernels on a Massively Parallel Reconfigurable Architecture

Full metadata

Description

This thesis presents efficient implementations of several linear algebra kernels, machine learning kernels, and a neural-network-based recommender systems engine on Transformer, a massively parallel reconfigurable architecture. The linear algebra kernels include the Triangular Matrix Solver (TRSM), LU Decomposition (LUD), QR Decomposition (QRD), and Matrix Inversion. The machine learning kernels include the Long Short-Term Memory (LSTM) cell and the Gated Recurrent Unit (GRU) cell used in recurrent neural networks. The recommender systems engine consists of multiple kernels, including fully connected layers, an embedding layer, 1-D batch normalization, and the Adam optimizer.
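For reference, the sketch below shows what two of these kernels compute: a serial forward-substitution solve for a lower-triangular system (the core of TRSM) and a single LSTM cell step, written in plain NumPy. This is a minimal illustration of the kernel mathematics under our own naming conventions; it does not reflect the parallel mapping onto Transformer's tiles and GPEs developed in the thesis.

```python
import numpy as np

def trsm_lower(L, b):
    """Solve L x = b by forward substitution, with L lower-triangular.
    Serial reference for the TRSM kernel, not the parallel mapping."""
    n = L.shape[0]
    x = np.zeros(n)
    for i in range(n):
        # Subtract contributions of already-solved unknowns,
        # then divide by the diagonal element.
        x[i] = (b[i] - L[i, :i] @ x[:i]) / L[i, i]
    return x

def lstm_cell(x, h_prev, c_prev, W, U, bias):
    """One LSTM step with stacked gate weights (order: i, f, g, o).
    W: (4H, D) input weights, U: (4H, H) recurrent weights, bias: (4H,)."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + bias          # all four gate pre-activations
    i = 1 / (1 + np.exp(-z[0*H:1*H]))      # input gate
    f = 1 / (1 + np.exp(-z[1*H:2*H]))      # forget gate
    g = np.tanh(z[2*H:3*H])                # candidate cell state
    o = 1 / (1 + np.exp(-z[3*H:4*H]))      # output gate
    c = f * c_prev + i * g                 # new cell state
    h = o * np.tanh(c)                     # new hidden state
    return h, c
```

As a sanity check, `trsm_lower` on a random lower-triangular system should agree with `np.linalg.solve` applied to the same inputs.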

Transformer is a massively parallel reconfigurable multicore architecture designed at the University of Michigan. The configuration considered here has 4 tiles with 16 General Processing Elements (GPEs) per tile. It supports a two-level cache hierarchy in which the L1 and L2 caches can each operate in shared (S) or private (P) mode. The architecture was modeled in gem5, and cycle-accurate simulations were run to evaluate performance in terms of execution time, giga-operations per second per watt (GOPS/W), and giga-floating-point operations per second per watt (GFLOPS/W).
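To make the efficiency metrics concrete, the snippet below evaluates GFLOPS/W from an operation count, an execution time, and an average power figure. Every number here is a hypothetical placeholder rather than a value from the thesis; only the formula, GFLOPS/W = (FLOPs / time) / power, follows from the metric's definition.

```python
# GFLOPS/W = (floating-point ops / execution time) / average power.
# All values below are hypothetical, for illustration only.
flops = 2 * 512**3          # e.g., op count of a 512x512 matrix multiply
exec_time_s = 1.5e-3        # execution time in seconds (hypothetical)
power_w = 1.6               # average power draw in watts (hypothetical)

gflops = flops / exec_time_s / 1e9
print(f"{gflops / power_w:.1f} GFLOPS/W")
```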

This thesis shows that each linear algebra kernel achieves its highest performance under a particular cache mode, and that the optimal mode depends on the matrix size. At smaller matrix sizes, L1P, L2P is the best cache mode for TRSM, L1S, L2S for LUD, and L1P, L2S for QRD. For each kernel, the optimal mode shifts as the matrix size grows; for TRSM, for instance, L1P, L2P is best at the smaller sizes (N = 64, 128, 256, 512) but L1S, L2P is best at the larger size (N = 1024). For the machine learning kernels, L1P, L2P is the best cache mode for all network parameter sizes.

gem5 simulations show that the peak performance for TRSM, LUD, QRD, and Matrix Inversion at the 14 nm node is 97.5, 59.4, 133.0, and 83.05 GFLOPS/W, respectively. For LSTM and GRU, the peak performance is 44.06 and 69.3 GFLOPS/W, respectively.

The neural-network-based recommender systems engine was implemented in the L1S, L2S cache mode. It includes both a forward pass and a backward pass, and it is significantly more demanding than the individual kernels in both computation and data movement. The most computationally intensive block is the fully connected layer, followed by the Adam optimizer. The overall performance of the recommender systems engine is 54.55 GFLOPS/W and 169.12 GOPS/W.
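For orientation, here is a minimal NumPy sketch that chains the engine's main forward-pass blocks: an embedding lookup, a fully connected layer, and 1-D batch normalization. All sizes, names, and the ReLU/sigmoid choices are assumptions made for illustration; the thesis's actual engine, its backward pass, and its Adam optimizer step are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
num_items, embed_dim, hidden_dim, batch = 1000, 32, 64, 8

emb_table = rng.standard_normal((num_items, embed_dim))
W1 = rng.standard_normal((embed_dim, hidden_dim)) * 0.1
b1 = np.zeros(hidden_dim)
gamma, beta = np.ones(hidden_dim), np.zeros(hidden_dim)
W2 = rng.standard_normal((hidden_dim, 1)) * 0.1

def forward(item_ids):
    x = emb_table[item_ids]                  # embedding lookup
    h = x @ W1 + b1                          # fully connected layer
    mu, var = h.mean(axis=0), h.var(axis=0)  # 1-D batch norm over the batch
    h = gamma * (h - mu) / np.sqrt(var + 1e-5) + beta
    h = np.maximum(h, 0.0)                   # ReLU (assumed activation)
    return 1 / (1 + np.exp(-(h @ W2)))       # sigmoid score per example

scores = forward(rng.integers(0, num_items, size=batch))
print(scores.shape)   # (8, 1)
```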

Date Created
2019
Contributors
  • Soorishetty, Anuraag (Author)
  • Chakrabarti, Chaitali (Thesis advisor)
  • Kim, Hun Seok (Committee member)
  • LiKamWa, Robert (Committee member)
  • Arizona State University (Publisher)
Topical Subject
  • Electrical Engineering
Resource Type
Text
Genre
Masters Thesis
Academic theses
Extent
68 pages
Language
eng
Copyright Statement
In Copyright
Primary Member of
ASU Electronic Theses and Dissertations
Peer-reviewed
No
Open Access
No
Handle
https://hdl.handle.net/2286/R.I.55557
Level of coding
minimal
Note
Masters Thesis Electrical Engineering 2019
System Created
  • 2020-01-14 09:15:46
System Modified
  • 2021-08-26 09:47:01
