Matching Items (3)
Filtering by

Clear all filters

156610-Thumbnail Image.png
Description
Deep neural networks (DNN) have shown tremendous success in various cognitive tasks, such as image classification, speech recognition, etc. However, their usage on resource-constrained edge devices has been limited due to high computation and large memory requirement.

To overcome these challenges, recent works have extensively investigated model compression techniques such

Deep neural networks (DNN) have shown tremendous success in various cognitive tasks, such as image classification, speech recognition, etc. However, their usage on resource-constrained edge devices has been limited due to high computation and large memory requirement.

To overcome these challenges, recent works have extensively investigated model compression techniques such as element-wise sparsity, structured sparsity and quantization. While most of these works have applied these compression techniques in isolation, there have been very few studies on application of quantization and structured sparsity together on a DNN model.

This thesis co-optimizes structured sparsity and quantization constraints on DNN models during training. Specifically, it obtains optimal setting of 2-bit weight and 2-bit activation coupled with 4X structured compression by performing combined exploration of quantization and structured compression settings. The optimal DNN model achieves 50X weight memory reduction compared to floating-point uncompressed DNN. This memory saving is significant since applying only structured sparsity constraints achieves 2X memory savings and only quantization constraints achieves 16X memory savings. The algorithm has been validated on both high and low capacity DNNs and on wide-sparse and deep-sparse DNN models. Experiments demonstrated that deep-sparse DNN outperforms shallow-dense DNN with varying level of memory savings depending on DNN precision and sparsity levels. This work further proposed a Pareto-optimal approach to systematically extract optimal DNN models from a huge set of sparse and dense DNN models. The resulting 11 optimal designs were further evaluated by considering overall DNN memory which includes activation memory and weight memory. It was found that there is only a small change in the memory footprint of the optimal designs corresponding to the low sparsity DNNs. However, activation memory cannot be ignored for high sparsity DNNs.
ContributorsSrivastava, Gaurav (Author) / Seo, Jae-Sun (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Berisha, Visar (Committee member) / Arizona State University (Publisher)
Created2018
154757-Thumbnail Image.png
Description
Speech recognition and keyword detection are becoming increasingly popular applications for mobile systems. While deep neural network (DNN) implementation of these systems have very good performance,

they have large memory and compute resource requirements, making their implementation on a mobile device quite challenging. In this thesis, techniques to reduce the

Speech recognition and keyword detection are becoming increasingly popular applications for mobile systems. While deep neural network (DNN) implementation of these systems have very good performance,

they have large memory and compute resource requirements, making their implementation on a mobile device quite challenging. In this thesis, techniques to reduce the memory and computation cost

of keyword detection and speech recognition networks (or DNNs) are presented.

The first technique is based on representing all weights and biases by a small number of bits and mapping all nodal computations into fixed-point ones with minimal degradation in the

accuracy. Experiments conducted on the Resource Management (RM) database show that for the keyword detection neural network, representing the weights by 5 bits results in a 6 fold reduction in memory compared to a floating point implementation with very little loss in performance. Similarly, for the speech recognition neural network, representing the weights by 6 bits results in a 5 fold reduction in memory while maintaining an error rate similar to a floating point implementation. Additional reduction in memory is achieved by a technique called weight pruning,

where the weights are classified as sensitive and insensitive and the sensitive weights are represented with higher precision. A combination of these two techniques helps reduce the memory

footprint by 81 - 84% for speech recognition and keyword detection networks respectively.

Further reduction in memory size is achieved by judiciously dropping connections for large blocks of weights. The corresponding technique, termed coarse-grain sparsification, introduces

hardware-aware sparsity during DNN training, which leads to efficient weight memory compression and significant reduction in the number of computations during classification without

loss of accuracy. Keyword detection and speech recognition DNNs trained with 75% of the weights dropped and classified with 5-6 bit weight precision effectively reduced the weight memory

requirement by ~95% compared to a fully-connected network with double precision, while showing similar performance in keyword detection accuracy and word error rate.
ContributorsArunachalam, Sairam (Author) / Chakrabarti, Chaitali (Thesis advisor) / Seo, Jae-Sun (Thesis advisor) / Cao, Yu (Committee member) / Arizona State University (Publisher)
Created2016
168739-Thumbnail Image.png
Description
Visual navigation is a useful and important task for a variety of applications. As the preva­lence of robots increase, there is an increasing need for energy-­efficient navigation methods as well. Many aspects of efficient visual navigation algorithms have been implemented in the lit­erature, but there is a lack of work

Visual navigation is a useful and important task for a variety of applications. As the preva­lence of robots increase, there is an increasing need for energy-­efficient navigation methods as well. Many aspects of efficient visual navigation algorithms have been implemented in the lit­erature, but there is a lack of work on evaluation of the efficiency of the image sensors. In this thesis, two methods are evaluated: adaptive image sensor quantization for traditional camera pipelines as well as new event­-based sensors for low­-power computer vision.The first contribution in this thesis is an evaluation of performing varying levels of sen­sor linear and logarithmic quantization with the task of visual simultaneous localization and mapping (SLAM). This unconventional method can provide efficiency benefits with a trade­ off between accuracy of the task and energy-­efficiency. A new sensor quantization method, gradient­-based quantization, is introduced to improve the accuracy of the task. This method only lowers the bit level of parts of the image that are less likely to be important in the SLAM algorithm since lower bit levels signify better energy­-efficiency, but worse task accuracy. The third contribution is an evaluation of the efficiency and accuracy of event­-based camera inten­sity representations for the task of optical flow. The results of performing a learning based optical flow are provided for each of five different reconstruction methods along with ablation studies. Lastly, the challenges of an event feature­-based SLAM system are presented with re­sults demonstrating the necessity for high quality and high­ resolution event data. The work in this thesis provides studies useful for examining trade­offs for an efficient visual navigation system with traditional and event vision sensors. The results of this thesis also provide multiple directions for future work.
ContributorsChristie, Olivia Catherine (Author) / Jayasuriya, Suren (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created2022