Matching Items (13)

154977-Thumbnail Image.png

The feasibility of domain specific compilation for spatially programmable architectures

Description

Integrated circuits must be energy efficient. This efficiency affects all aspects of chip design, from the battery life of embedded devices to thermal heating on high performance servers. As technology

Integrated circuits must be energy efficient. This efficiency affects all aspects of chip design, from the battery life of embedded devices to thermal heating on high performance servers. As technology scaling slows, future generations of transistors will lack the energy efficiency gains as it has had in previous generations. Therefore, other sources of energy efficiency will be much more important. Many computations have the potential to be executed for extreme energy efficiency but are not instigated because the platforms they run on are not optimized for efficient execution. ASICs improve energy efficiency by reducing flexibility and leveraging the properties of a specific computation. However, ASICs are fixed in function and therefore have incredible opportunity cost. FPGAs offer a reconfigurable solution but are 25x less energy efficient than ASIC implementation. Spatially programmable architectures (SPAs) are similar in design and structure to ASICs and FPGAs but are able bridge the ASIC-FPGA energy efficiency gap by trading flexibility for efficiency. However, SPAs are difficult to program because they do not share the same programming model as normal architectures that execute in time. This work addresses compiler challenges for coarse grained, locally interconnected SPA for domain efficiency (SPADE). A novel SPADE topology, called the wave pipeline, is introduced that is designed for the image signal processing domain that is both efficient and simple to compile to. A compiler for the wave pipeline is created that solves for maximum energy and area efficiency using low complexity, greedy methods. The wave pipeline topology and compiler allow for us to investigate and experiment with image signal processing applications to prove the feasibility of SPADE compilers.

Contributors

Agent

Created

Date Created
  • 2016

155148-Thumbnail Image.png

Subjective and objective evaluation of visual attention models

Description

Visual attention (VA) is the study of mechanisms that allow the human visual system (HVS) to selectively process relevant visual information. This work focuses on the subjective and objective evaluation

Visual attention (VA) is the study of mechanisms that allow the human visual system (HVS) to selectively process relevant visual information. This work focuses on the subjective and objective evaluation of computational VA models for the distortion-free case as well as in the presence of image distortions.

Existing VA models are traditionally evaluated by using VA metrics that quantify the match between predicted saliency and fixation data obtained from eye-tracking experiments on human observers. Though there is a considerable number of objective VA metrics, there exists no study that validates that these metrics are adequate for the evaluation of VA models. This work constructs a VA Quality (VAQ) Database by subjectively assessing the prediction performance of VA models on distortion-free images. Additionally, shortcomings in existing metrics are discussed through illustrative examples and a new metric that uses local weights based on fixation density and that overcomes these flaws, is proposed. The proposed VA metric outperforms all other popular existing metrics in terms of the correlation with subjective ratings.

In practice, the image quality is affected by a host of factors at several stages of the image processing pipeline such as acquisition, compression, and transmission. However, none of the existing studies have discussed the subjective and objective evaluation of visual saliency models in the presence of distortion. In this work, a Distortion-based Visual Attention Quality (DVAQ) subjective database is constructed to evaluate the quality of VA maps for images in the presence of distortions. For creating this database, saliency maps obtained from images subjected to various types of distortions, including blur, noise and compression, and varying levels of distortion severity are rated by human observers in terms of their visual resemblance to corresponding ground-truth fixation density maps. The performance of traditionally used as well as recently proposed VA metrics are evaluated by correlating their scores with the human subjective ratings. In addition, an objective evaluation of 20 state-of-the-art VA models is performed using the top-performing VA metrics together with a study of how the VA models’ prediction performance changes with different types and levels of distortions.

Contributors

Agent

Created

Date Created
  • 2016

153241-Thumbnail Image.png

Spatial and multi-temporal visual change detection with application to SAR image analysis

Description

Thousands of high-resolution images are generated each day. Detecting and analyzing variations in these images are key steps in image understanding. This work focuses on spatial and multitemporal

visual change detection

Thousands of high-resolution images are generated each day. Detecting and analyzing variations in these images are key steps in image understanding. This work focuses on spatial and multitemporal

visual change detection and its applications in multi-temporal synthetic aperture radar (SAR) images.

The Canny edge detector is one of the most widely-used edge detection algorithms due to its superior performance in terms of SNR and edge localization and only one response to a single edge. In this work, we propose a mechanism to implement the Canny algorithm at the block level without any loss in edge detection performance as compared to the original frame-level Canny algorithm. The resulting block-based algorithm has significantly reduced memory requirements and can achieve a significantly reduced latency. Furthermore, the proposed algorithm can be easily integrated with other block-based image processing systems. In addition, quantitative evaluations and subjective tests show that the edge detection performance of the proposed algorithm is better than the original frame-based algorithm, especially when noise is present in the images.

In the context of multi-temporal SAR images for earth monitoring applications, one critical issue is the detection of changes occurring after a natural or anthropic disaster. In this work, we propose a novel similarity measure for automatic change detection using a pair of SAR images

acquired at different times and apply it in both the spatial and wavelet domains. This measure is based on the evolution of the local statistics of the image between two dates. The local statistics are modeled as a Gaussian Mixture Model (GMM), which is more suitable and flexible to approximate the local distribution of the SAR image with distinct land-cover typologies. Tests on real datasets show that the proposed detectors outperform existing methods in terms of the quality of the similarity maps, which are assessed using the receiver operating characteristic (ROC) curves, and in terms of the total error rates of the final change detection maps. Furthermore, we proposed a new

similarity measure for automatic change detection based on a divisive normalization transform in order to reduce the computation complexity. Tests show that our proposed DNT-based change detector

exhibits competitive detection performance while achieving lower computational complexity as compared to previously suggested methods.

Contributors

Agent

Created

Date Created
  • 2014

157934-Thumbnail Image.png

Driver Assistance System and Feedback for Hybrid Electric Vehicles Using Sensor Fusion

Description

Transportation plays a significant role in every human's life. Numerous factors, such as cost of living, available amenities, work style, to name a few, play a vital role in determining

Transportation plays a significant role in every human's life. Numerous factors, such as cost of living, available amenities, work style, to name a few, play a vital role in determining the amount of travel time. Such factors, among others, led in part to an increased need for private transportation and, consequently, leading to an increase in the purchase of private cars. Also, road safety was impacted by numerous factors such as Driving Under Influence (DUI), driver’s distraction due to the increase in the use of mobile devices while driving. These factors led to an increasing need for an Advanced Driver Assistance System (ADAS) to help the driver stay aware of the environment and to improve road safety.

EcoCAR3 is one of the Advanced Vehicle Technology Competitions, sponsored by the United States Department of Energy (DoE) and managed by Argonne National Laboratory in partnership with the North American automotive industry. Students are challenged beyond the traditional classroom environment in these competitions, where they redesign a donated production vehicle to improve energy efficiency and to meet emission standards while maintaining the features that are attractive to the customer, including but not limited to performance, consumer acceptability, safety, and cost.

This thesis presents a driver assistance system interface that was implemented as part of EcoCAR3, including the adopted sensors, hardware and software components, system implementation, validation, and testing. The implemented driver assistance system uses a combination of range measurement sensors to determine the distance, relative location, & the relative velocity of obstacles and surrounding objects together with a computer vision algorithm for obstacle detection and classification. The sensor system and vision system were tested individually and then combined within the overall system. Also, a visual and audio feedback system was designed and implemented to provide timely feedback for the driver as an attempt to enhance situational awareness and improve safety.

Since the driver assistance system was designed and developed as part of a DoE sponsored competition, the system needed to satisfy competition requirements and rules. This work attempted to optimize the system in terms of performance, robustness, and cost while satisfying these constraints.

Contributors

Agent

Created

Date Created
  • 2019

158654-Thumbnail Image.png

Robust Deep Learning Through Selective Feature Regeneration.

Description

In recent years, the widespread use of deep neural networks (DNNs) has facilitated great improvements in performance for computer vision tasks like image classification and object recognition. In most realistic

In recent years, the widespread use of deep neural networks (DNNs) has facilitated great improvements in performance for computer vision tasks like image classification and object recognition. In most realistic computer vision applications, an input image undergoes some form of image distortion such as blur and additive noise during image acquisition or transmission. Deep networks trained on pristine images perform poorly when tested on such distortions. DNN predictions have also been shown to be vulnerable to carefully crafted adversarial perturbations. Specifically, so-called universal adversarial perturbations are image-agnostic perturbations that can be added to any image and can fool a target network into making erroneous predictions. This work proposes selective DNN feature regeneration to improve the robustness of existing DNNs to image distortions and universal adversarial perturbations.

In the context of common naturally occurring image distortions, a metric is proposed to identify the most susceptible DNN convolutional filters and rank them in order of the highest gain in classification accuracy upon correction. The proposed approach called DeepCorrect applies small stacks of convolutional layers with residual connections at the output of these ranked filters and trains them to correct the most distortion-affected filter activations, whilst leaving the rest of the pre-trained filter outputs in the network unchanged. Performance results show that applying DeepCorrect models for common vision tasks significantly improves the robustness of DNNs against distorted images and outperforms other alternative approaches.

In the context of universal adversarial perturbations, departing from existing defense strategies that work mostly in the image domain, a novel and effective defense which only operates in the DNN feature domain is presented. This approach identifies pre-trained convolutional features that are most vulnerable to adversarial perturbations and deploys trainable feature regeneration units which transform these DNN filter activations into resilient features that are robust to universal perturbations. Regenerating only the top 50% adversarially susceptible activations in at most 6 DNN layers and leaving all remaining DNN activations unchanged can outperform existing defense strategies across different network architectures and across various universal attacks.

Contributors

Agent

Created

Date Created
  • 2020

149422-Thumbnail Image.png

3D modeling using multi-view images

Description

There is a growing interest in the creation of three-dimensional (3D) images and videos due to the growing demand for 3D visual media in commercial markets. A possible solution to

There is a growing interest in the creation of three-dimensional (3D) images and videos due to the growing demand for 3D visual media in commercial markets. A possible solution to produce 3D media files is to convert existing 2D images and videos to 3D. The 2D to 3D conversion methods that estimate the depth map from 2D scenes for 3D reconstruction present an efficient approach to save on the cost of the coding, transmission and storage of 3D visual media in practical applications. Various 2D to 3D conversion methods based on depth maps have been developed using existing image and video processing techniques. The depth maps can be estimated either from a single 2D view or from multiple 2D views. This thesis presents a MATLAB-based 2D to 3D conversion system from multiple views based on the computation of a sparse depth map. The 2D to 3D conversion system is able to deal with the multiple views obtained from uncalibrated hand-held cameras without knowledge of the prior camera parameters or scene geometry. The implemented system consists of techniques for image feature detection and registration, two-view geometry estimation, projective 3D scene reconstruction and metric upgrade to reconstruct the 3D structures by means of a metric transformation. The implemented 2D to 3D conversion system is tested using different multi-view image sets. The obtained experimental results of reconstructed sparse depth maps of feature points in 3D scenes provide relative depth information of the objects. Sample ground-truth depth data points are used to calculate a scale factor in order to estimate the true depth by scaling the obtained relative depth information using the estimated scale factor. It was found out that the obtained reconstructed depth map is consistent with the ground-truth depth data.

Contributors

Agent

Created

Date Created
  • 2010

152389-Thumbnail Image.png

Automated animal coloration quantification in digital images using dominant colors and skin classification

Description

The origin and function of color in animals has been a subject of great interest for taxonomists and ecologists in recent years. Coloration in animals is useful for many important

The origin and function of color in animals has been a subject of great interest for taxonomists and ecologists in recent years. Coloration in animals is useful for many important functions like species identification, camouflage and understanding evolutionary relationships. Quantitative measurements of color signal and patch size in mammals, birds and reptiles, to name a few are strong indicators of sexual selection cues and individual health. These measurements provide valuable insights into the impact of environmental conditions on habitat and breeding of mammals, birds and reptiles. Recent advances in the area of digital cameras and sensors have led to a significant increase in the use of digital photography as a means of color quantification in animals. Although a significant amount of research has been conducted on ways to standardize image acquisition conditions and calibrate cameras for use in animal color quantification, almost no work has been done on designing automated methods for animal color quantification. This thesis presents a novel perceptual"–"based framework for the automated extraction and quantification of animal coloration from digital images with slowly varying (almost homogenous) background colors. This implemented framework uses a combination of several techniques including color space quantization using a few dominant colors, foreground"–"background identification, Bayesian classification and mixture Gaussian modelling of conditional densities, edge"–"enhanced model"–"based classification and Saturation"–"Brightness quantization to extract the colored patch. This approach assumes no prior information about the color of either the subject or the background and also the position of the subject in the image. The performance of the proposed method is evaluated for the plumage color of the wild house finches. Segmentation results obtained using the implemented framework are compared with manually scored results to illustrate the performance of this system. The segmentation results show a high correlation with manually scored images. This novel framework also eliminates common problems in manual scoring of digital images such as low repeatability and inter"–"observer error.

Contributors

Agent

Created

Date Created
  • 2013

150440-Thumbnail Image.png

Efficient perceptual super-resolution

Description

Super-Resolution (SR) techniques are widely developed to increase image resolution by fusing several Low-Resolution (LR) images of the same scene to overcome sensor hardware limitations and reduce media impairments in

Super-Resolution (SR) techniques are widely developed to increase image resolution by fusing several Low-Resolution (LR) images of the same scene to overcome sensor hardware limitations and reduce media impairments in a cost-effective manner. When choosing a solution for the SR problem, there is always a trade-off between computational efficiency and High-Resolution (HR) image quality. Existing SR approaches suffer from extremely high computational requirements due to the high number of unknowns to be estimated in the solution of the SR inverse problem. This thesis proposes efficient iterative SR techniques based on Visual Attention (VA) and perceptual modeling of the human visual system. In the first part of this thesis, an efficient ATtentive-SELective Perceptual-based (AT-SELP) SR framework is presented, where only a subset of perceptually significant active pixels is selected for processing by the SR algorithm based on a local contrast sensitivity threshold model and a proposed low complexity saliency detector. The proposed saliency detector utilizes a probability of detection rule inspired by concepts of luminance masking and visual attention. The second part of this thesis further enhances on the efficiency of selective SR approaches by presenting an ATtentive (AT) SR framework that is completely driven by VA region detectors. Additionally, different VA techniques that combine several low-level features, such as center-surround differences in intensity and orientation, patch luminance and contrast, bandpass outputs of patch luminance and contrast, and difference of Gaussians of luminance intensity are integrated and analyzed to illustrate the effectiveness of the proposed selective SR frameworks. The proposed AT-SELP SR and AT-SR frameworks proved to be flexible by integrating a Maximum A Posteriori (MAP)-based SR algorithm as well as a fast two-stage Fusion-Restoration (FR) SR estimator. By adopting the proposed selective SR frameworks, simulation results show significant reduction on average in computational complexity with comparable visual quality in terms of quantitative metrics such as PSNR, SNR or MAE gains, and subjective assessment. The third part of this thesis proposes a Perceptually Weighted (WP) SR technique that incorporates unequal weighting parameters in the cost function of iterative SR problems. The proposed approach is inspired by the unequal processing of the Human Visual System (HVS) to different local image features in an image. Simulation results show an enhanced reconstruction quality and faster convergence rates when applied to the MAP-based and FR-based SR schemes.

Contributors

Agent

Created

Date Created
  • 2011

157065-Thumbnail Image.png

Performance Evaluation of Object Proposal Generators for Salient Object Detection

Description

The detection and segmentation of objects appearing in a natural scene, often referred to as Object Detection, has gained a lot of interest in the computer vision field. Although most

The detection and segmentation of objects appearing in a natural scene, often referred to as Object Detection, has gained a lot of interest in the computer vision field. Although most existing object detectors aim to detect all the objects in a given scene, it is important to evaluate whether these methods are capable of detecting the salient objects in the scene when constraining the number of proposals that can be generated due to constraints on timing or computations during execution. Salient objects are objects that tend to be more fixated by human subjects. The detection of salient objects is important in applications such as image collection browsing, image display on small devices, and perceptual compression.

This thesis proposes a novel evaluation framework that analyses the performance of popular existing object proposal generators in detecting the most salient objects. This work also shows that, by incorporating saliency constraints, the number of generated object proposals and thus the computational cost can be decreased significantly for a target true positive detection rate (TPR).

As part of the proposed framework, salient ground-truth masks are generated from the given original ground-truth masks for a given dataset. Given an object detection dataset, this work constructs salient object location ground-truth data, referred to here as salient ground-truth data for short, that only denotes the locations of salient objects. This is obtained by first computing a saliency map for the input image and then using it to assign a saliency score to each object in the image. Objects whose saliency scores are sufficiently high are referred to as salient objects. The detection rates are analyzed for existing object proposal generators with respect to the original ground-truth masks and the generated salient ground-truth masks.

As part of this work, a salient object detection database with salient ground-truth masks was constructed from the PASCAL VOC 2007 dataset. Not only does this dataset aid in analyzing the performance of existing object detectors for salient object detection, but it also helps in the development of new object detection methods and evaluating their performance in terms of successful detection of salient objects.

Contributors

Agent

Created

Date Created
  • 2019

152770-Thumbnail Image.png

Texture structure analysis

Description

Texture analysis plays an important role in applications like automated pattern inspection, image and video compression, content-based image retrieval, remote-sensing, medical imaging and document processing, to name a few. Texture

Texture analysis plays an important role in applications like automated pattern inspection, image and video compression, content-based image retrieval, remote-sensing, medical imaging and document processing, to name a few. Texture Structure Analysis is the process of studying the structure present in the textures. This structure can be expressed in terms of perceived regularity. Our human visual system (HVS) uses the perceived regularity as one of the important pre-attentive cues in low-level image understanding. Similar to the HVS, image processing and computer vision systems can make fast and efficient decisions if they can quantify this regularity automatically. In this work, the problem of quantifying the degree of perceived regularity when looking at an arbitrary texture is introduced and addressed. One key contribution of this work is in proposing an objective no-reference perceptual texture regularity metric based on visual saliency. Other key contributions include an adaptive texture synthesis method based on texture regularity, and a low-complexity reduced-reference visual quality metric for assessing the quality of synthesized textures. In order to use the best performing visual attention model on textures, the performance of the most popular visual attention models to predict the visual saliency on textures is evaluated. Since there is no publicly available database with ground-truth saliency maps on images with exclusive texture content, a new eye-tracking database is systematically built. Using the Visual Saliency Map (VSM) generated by the best visual attention model, the proposed texture regularity metric is computed. The proposed metric is based on the observation that VSM characteristics differ between textures of differing regularity. The proposed texture regularity metric is based on two texture regularity scores, namely a textural similarity score and a spatial distribution score. In order to evaluate the performance of the proposed regularity metric, a texture regularity database called RegTEX, is built as a part of this work. It is shown through subjective testing that the proposed metric has a strong correlation with the Mean Opinion Score (MOS) for the perceived regularity of textures. The proposed method is also shown to be robust to geometric and photometric transformations and outperforms some of the popular texture regularity metrics in predicting the perceived regularity. The impact of the proposed metric to improve the performance of many image-processing applications is also presented. The influence of the perceived texture regularity on the perceptual quality of synthesized textures is demonstrated through building a synthesized textures database named SynTEX. It is shown through subjective testing that textures with different degrees of perceived regularities exhibit different degrees of vulnerability to artifacts resulting from different texture synthesis approaches. This work also proposes an algorithm for adaptively selecting the appropriate texture synthesis method based on the perceived regularity of the original texture. A reduced-reference texture quality metric for texture synthesis is also proposed as part of this work. The metric is based on the change in perceived regularity and the change in perceived granularity between the original and the synthesized textures. The perceived granularity is quantified through a new granularity metric that is proposed in this work. It is shown through subjective testing that the proposed quality metric, using just 2 parameters, has a strong correlation with the MOS for the fidelity of synthesized textures and outperforms the state-of-the-art full-reference quality metrics on 3 different texture databases. Finally, the ability of the proposed regularity metric in predicting the perceived degradation of textures due to compression and blur artifacts is also established.

Contributors

Agent

Created

Date Created
  • 2014