Matching Items (2)
Filtering by

Clear all filters

156430-Thumbnail Image.png
Description
Machine learning models convert raw data in the form of video, images, audio,

text, etc. into feature representations that are convenient for computational process-

ing. Deep neural networks have proven to be very efficient feature extractors for a

variety of machine learning tasks. Generative models based on deep neural networks

introduce constraints on the

Machine learning models convert raw data in the form of video, images, audio,

text, etc. into feature representations that are convenient for computational process-

ing. Deep neural networks have proven to be very efficient feature extractors for a

variety of machine learning tasks. Generative models based on deep neural networks

introduce constraints on the feature space to learn transferable and disentangled rep-

resentations. Transferable feature representations help in training machine learning

models that are robust across different distributions of data. For example, with the

application of transferable features in domain adaptation, models trained on a source

distribution can be applied to a data from a target distribution even though the dis-

tributions may be different. In style transfer and image-to-image translation, disen-

tangled representations allow for the separation of style and content when translating

images.

This thesis examines learning transferable data representations in novel deep gen-

erative models. The Semi-Supervised Adversarial Translator (SAT) utilizes adversar-

ial methods and cross-domain weight sharing in a neural network to extract trans-

ferable representations. These transferable interpretations can then be decoded into

the original image or a similar image in another domain. The Explicit Disentangling

Network (EDN) utilizes generative methods to disentangle images into their core at-

tributes and then segments sets of related attributes. The EDN can separate these

attributes by controlling the ow of information using a novel combination of losses

and network architecture. This separation of attributes allows precise modi_cations

to speci_c components of the data representation, boosting the performance of ma-

chine learning tasks. The effectiveness of these models is evaluated across domain

adaptation, style transfer, and image-to-image translation tasks.
ContributorsEusebio, Jose Miguel Ang (Author) / Panchanathan, Sethuraman (Thesis advisor) / Davulcu, Hasan (Committee member) / Venkateswara, Hemanth (Committee member) / Arizona State University (Publisher)
Created2018
158117-Thumbnail Image.png
Description
Visual object recognition has achieved great success with advancements in deep learning technologies. Notably, the existing recognition models have gained human-level performance on many of the recognition tasks. However, these models are data hungry, and their performance is constrained by the amount of training data. Inspired by the human ability

Visual object recognition has achieved great success with advancements in deep learning technologies. Notably, the existing recognition models have gained human-level performance on many of the recognition tasks. However, these models are data hungry, and their performance is constrained by the amount of training data. Inspired by the human ability to recognize object categories based on textual descriptions of objects and previous visual knowledge, the research community has extensively pursued the area of zero-shot learning. In this area of research, machine vision models are trained to recognize object categories that are not observed during the training process. Zero-shot learning models leverage textual information to transfer visual knowledge from seen object categories in order to recognize unseen object categories.

Generative models have recently gained popularity as they synthesize unseen visual features and convert zero-shot learning into a classical supervised learning problem. These generative models are trained using seen classes and are expected to implicitly transfer the knowledge from seen to unseen classes. However, their performance is stymied by overfitting towards seen classes, which leads to substandard performance in generalized zero-shot learning. To address this concern, this dissertation proposes a novel generative model that leverages the semantic relationship between seen and unseen categories and explicitly performs knowledge transfer from seen categories to unseen categories. Experiments were conducted on several benchmark datasets to demonstrate the efficacy of the proposed model for both zero-shot learning and generalized zero-shot learning. The dissertation also provides a unique Student-Teacher based generative model for zero-shot learning and concludes with future research directions in this area.
ContributorsVyas, Maunil Rohitbhai (Author) / Panchanathan, Sethuraman (Thesis advisor) / Venkateswara, Hemanth (Thesis advisor) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)
Created2020