Matching Items (2)
Filtering by

Clear all filters

156219-Thumbnail Image.png
Description
Deep learning architectures have been widely explored in computer vision and have

depicted commendable performance in a variety of applications. A fundamental challenge

in training deep networks is the requirement of large amounts of labeled training

data. While gathering large quantities of unlabeled data is cheap and easy, annotating

the data is an expensive

Deep learning architectures have been widely explored in computer vision and have

depicted commendable performance in a variety of applications. A fundamental challenge

in training deep networks is the requirement of large amounts of labeled training

data. While gathering large quantities of unlabeled data is cheap and easy, annotating

the data is an expensive process in terms of time, labor and human expertise.

Thus, developing algorithms that minimize the human effort in training deep models

is of immense practical importance. Active learning algorithms automatically identify

salient and exemplar samples from large amounts of unlabeled data and can augment

maximal information to supervised learning models, thereby reducing the human annotation

effort in training machine learning models. The goal of this dissertation is to

fuse ideas from deep learning and active learning and design novel deep active learning

algorithms. The proposed learning methodologies explore diverse label spaces to

solve different computer vision applications. Three major contributions have emerged

from this work; (i) a deep active framework for multi-class image classication, (ii)

a deep active model with and without label correlation for multi-label image classi-

cation and (iii) a deep active paradigm for regression. Extensive empirical studies

on a variety of multi-class, multi-label and regression vision datasets corroborate the

potential of the proposed methods for real-world applications. Additional contributions

include: (i) a multimodal emotion database consisting of recordings of facial

expressions, body gestures, vocal expressions and physiological signals of actors enacting

various emotions, (ii) four multimodal deep belief network models and (iii)

an in-depth analysis of the effect of transfer of multimodal emotion features between

source and target networks on classification accuracy and training time. These related

contributions help comprehend the challenges involved in training deep learning

models and motivate the main goal of this dissertation.
ContributorsRanganathan, Hiranmayi (Author) / Sethuraman, Panchanathan (Thesis advisor) / Papandreou-Suppappola, Antonia (Committee member) / Li, Baoxin (Committee member) / Chakraborty, Shayok (Committee member) / Arizona State University (Publisher)
Created2018
155339-Thumbnail Image.png
Description
The widespread adoption of computer vision models is often constrained by the issue of domain mismatch. Models that are trained with data belonging to one distribution, perform poorly when tested with data from a different distribution. Variations in vision based data can be attributed to the following reasons, viz., differences

The widespread adoption of computer vision models is often constrained by the issue of domain mismatch. Models that are trained with data belonging to one distribution, perform poorly when tested with data from a different distribution. Variations in vision based data can be attributed to the following reasons, viz., differences in image quality (resolution, brightness, occlusion and color), changes in camera perspective, dissimilar backgrounds and an inherent diversity of the samples themselves. Machine learning techniques like transfer learning are employed to adapt computational models across distributions. Domain adaptation is a special case of transfer learning, where knowledge from a source domain is transferred to a target domain in the form of learned models and efficient feature representations.

The dissertation outlines novel domain adaptation approaches across different feature spaces; (i) a linear Support Vector Machine model for domain alignment; (ii) a nonlinear kernel based approach that embeds domain-aligned data for enhanced classification; (iii) a hierarchical model implemented using deep learning, that estimates domain-aligned hash values for the source and target data, and (iv) a proposal for a feature selection technique to reduce cross-domain disparity. These adaptation procedures are tested and validated across a range of computer vision applications like object classification, facial expression recognition, digit recognition, and activity recognition. The dissertation also provides a unique perspective of domain adaptation literature from the point-of-view of linear, nonlinear and hierarchical feature spaces. The dissertation concludes with a discussion on the future directions for research that highlight the role of domain adaptation in an era of rapid advancements in artificial intelligence.
ContributorsDemakethepalli Venkateswara, Hemanth (Author) / Panchanathan, Sethuraman (Thesis advisor) / Li, Baoxin (Committee member) / Davulcu, Hasan (Committee member) / Ye, Jieping (Committee member) / Chakraborty, Shayok (Committee member) / Arizona State University (Publisher)
Created2017