Matching Items (2)
Filtering by

Clear all filters

155003-Thumbnail Image.png
Description
The technological advances in the past few decades have made possible creation and consumption of digital visual content at an explosive rate. Consequently, there is a need for efficient quality monitoring systems to ensure minimal degradation of images and videos during various processing operations like compression, transmission, storage etc. Objective

The technological advances in the past few decades have made possible creation and consumption of digital visual content at an explosive rate. Consequently, there is a need for efficient quality monitoring systems to ensure minimal degradation of images and videos during various processing operations like compression, transmission, storage etc. Objective Image Quality Assessment (IQA) algorithms have been developed that predict quality scores which match well with human subjective quality assessment. However, a lot of research still remains to be done before IQA algorithms can be deployed in real world systems. Long runtimes for one frame of image is a major hurdle. Graphics Processing Units (GPUs), equipped with massive number of computational cores, provide an opportunity to accelerate IQA algorithms by performing computations in parallel. Indeed, General Purpose Graphics Processing Units (GPGPU) techniques have been applied to a few Full Reference IQA algorithms which fall under the. We present a GPGPU implementation of Blind Image Integrity Notator using DCT Statistics (BLIINDS-II), which falls under the No Reference IQA algorithm paradigm. We have been able to achieve a speedup of over 30x over the previous CPU version of this algorithm. We test our implementation using various distorted images from the CSIQ database and present the performance trends observed. We achieve a very consistent performance of around 9 milliseconds per distorted image, which made possible the execution of over 100 images per second (100 fps).
ContributorsYadav, Aman (Author) / Sohoni, Sohum (Thesis advisor) / Aukes, Daniel (Committee member) / Redkar, Sangram (Committee member) / Arizona State University (Publisher)
Created2016
155837-Thumbnail Image.png
Description
With the advent of GPGPU, many applications are being accelerated by using CUDA programing paradigm. We are able to achieve around 10x -100x speedups by simply porting the application on to the GPU and running the parallel chunk of code on its multi cored SIMT (Single instruction multiple thread) architecture.

With the advent of GPGPU, many applications are being accelerated by using CUDA programing paradigm. We are able to achieve around 10x -100x speedups by simply porting the application on to the GPU and running the parallel chunk of code on its multi cored SIMT (Single instruction multiple thread) architecture. But for optimal performance it is necessary to make sure that all the GPU resources are efficiently used, and the latencies in the application are minimized. For this, it is essential to monitor the Hardware usage of the algorithm and thus diagnose the compute and memory bottlenecks in the implementation. In the following thesis, we will be analyzing the mapping of CUDA implementation of BLIINDS-II algorithm on the underlying GPU hardware, and come up with a Kepler architecture specific solution of using shuffle instruction via CUB library to tackle the two major bottlenecks in the algorithm. Experiments were conducted to convey the advantage of using shuffle instru3ction in algorithm over only using shared memory as a buffer to global memory. With the new implementation of BLIINDS-II algorithm using CUB library, a speedup of around 13.7% was achieved.
ContributorsWadekar, Ameya (Author) / Sohoni, Sohum (Thesis advisor) / Aukes, Daniel (Committee member) / Redkar, Sangram (Committee member) / Arizona State University (Publisher)
Created2017