Theses and Dissertations
Displaying 1 - 2 of 2
Filtering by
- All Subjects: multimodal
- Creators: Papandreou-Suppappola, Antonia
Description
Deep learning architectures have been widely explored in computer vision and have
demonstrated commendable performance in a variety of applications. A fundamental challenge
in training deep networks is the requirement of large amounts of labeled training
data. While gathering large quantities of unlabeled data is cheap and easy, annotating
the data is an expensive process in terms of time, labor and human expertise.
Thus, developing algorithms that minimize the human effort in training deep models
is of immense practical importance. Active learning algorithms automatically identify
salient and exemplar samples from large amounts of unlabeled data and can augment
maximal information to supervised learning models, thereby reducing the human annotation
effort in training machine learning models. The goal of this dissertation is to
fuse ideas from deep learning and active learning and design novel deep active learning
algorithms. The proposed learning methodologies explore diverse label spaces to
solve different computer vision applications. Three major contributions have emerged
from this work: (i) a deep active framework for multi-class image classification, (ii)
a deep active model with and without label correlation for multi-label image
classification and (iii) a deep active paradigm for regression. Extensive empirical studies
on a variety of multi-class, multi-label and regression vision datasets corroborate the
potential of the proposed methods for real-world applications. Additional contributions
include: (i) a multimodal emotion database consisting of recordings of facial
expressions, body gestures, vocal expressions and physiological signals of actors enacting
various emotions, (ii) four multimodal deep belief network models and (iii)
an in-depth analysis of the effect of transfer of multimodal emotion features between
source and target networks on classification accuracy and training time. These related
contributions help comprehend the challenges involved in training deep learning
models and motivate the main goal of this dissertation.
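The abstract describes active learning as automatically identifying salient, exemplar samples from unlabeled data to reduce annotation effort. As a minimal sketch of that idea (not the dissertation's actual algorithms), the snippet below uses least-confidence sampling, a common acquisition function chosen here as an assumption, to pick which unlabeled pool samples to send to a human annotator:

```python
import numpy as np

def least_confidence_query(probs, k):
    """Select the k pool samples the model is least confident about.

    probs: (n_samples, n_classes) array of predicted class probabilities.
    Returns indices of the k samples whose top-class probability is lowest;
    these are the candidates forwarded for human annotation.
    """
    confidence = probs.max(axis=1)      # top-class probability per sample
    return np.argsort(confidence)[:k]   # least confident first

# Toy pool: softmax outputs for 4 unlabeled images over 3 classes.
pool_probs = np.array([
    [0.98, 0.01, 0.01],   # model is confident
    [0.40, 0.35, 0.25],   # uncertain
    [0.34, 0.33, 0.33],   # most uncertain
    [0.90, 0.05, 0.05],
])
query = least_confidence_query(pool_probs, k=2)
```

In a full loop, the queried samples would be labeled, added to the training set, and the model retrained before the next query round.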
Contributors: Ranganathan, Hiranmayi (Author) / Panchanathan, Sethuraman (Thesis advisor) / Papandreou-Suppappola, Antonia (Committee member) / Li, Baoxin (Committee member) / Chakraborty, Shayok (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
As the demand for wireless systems increases exponentially, it has become necessary
for different wireless modalities, like radar and communication systems, to share the
available bandwidth. One approach to realize coexistence successfully is for each
system to adopt a transmit waveform with a unique nonlinear time-varying phase
function. At the receiver of the system of interest, the waveform received for
processing may still suffer from low signal-to-interference-plus-noise ratio (SINR) due to the
presence of the waveforms that are matched to the other coexisting systems. This
thesis uses a time-frequency-based approach to increase the SINR of a system by estimating the unique nonlinear instantaneous frequency (IF) of the waveform matched
to the system. Specifically, the IF is estimated using the synchrosqueezing transform,
a highly localized time-frequency representation that also enables reconstruction of
individual waveform components. As the IF estimate is biased, modified versions of
the transform are investigated to obtain estimators that are both unbiased and also
matched to the unique nonlinear phase function of a given waveform. Simulations
using transmit waveforms of coexisting wireless systems are provided to demonstrate
the performance of the proposed approach using both biased and unbiased IF estimators.
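The thesis estimates the IF with the synchrosqueezing transform, which sharpens a time-frequency representation before ridge extraction. As a simplified, hypothetical illustration of time-frequency IF estimation (not the thesis's estimator), the sketch below extracts the ridge, i.e. the per-frame magnitude peak, of an ordinary STFT for a linear-FM waveform; the chirp parameters are assumptions chosen for the example:

```python
import numpy as np
from scipy.signal import stft

# Linear-FM (chirp) waveform: phase 2*pi*(f0*t + 0.5*rate*t**2),
# so the true instantaneous frequency is f0 + rate*t.
fs = 1000.0                       # sampling rate, Hz
t = np.arange(0.0, 1.0, 1 / fs)
f0, rate = 50.0, 100.0            # start frequency and sweep rate
x = np.cos(2 * np.pi * (f0 * t + 0.5 * rate * t**2))

# Coarse IF estimate: at each STFT frame, take the frequency bin with
# the largest magnitude (the time-frequency "ridge").
f, frames, Z = stft(x, fs=fs, nperseg=128)
if_est = f[np.abs(Z).argmax(axis=0)]

# True IF at the frame centers, for comparison.
if_true = f0 + rate * frames
```

The ridge of a plain STFT is biased by the window's frequency resolution (here fs/nperseg ≈ 7.8 Hz per bin); synchrosqueezing reassigns STFT energy toward the true IF curve, which is what motivates its use, and the bias correction, in the thesis.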
Contributors: Gattani, Vineet Sunil (Author) / Papandreou-Suppappola, Antonia (Thesis advisor) / Richmond, Christ (Committee member) / Maurer, Alexander (Committee member) / Arizona State University (Publisher)
Created: 2018