Matching Items (5)

Filtering by

Clear all filters

136785-Thumbnail Image.png

Exploring the Design of Vibrotactile Cues for Visio-Haptic Sensory Substitution

Description

This paper presents the design and evaluation of a haptic interface for augmenting human-human interpersonal interactions by delivering facial expressions of an interaction partner to an individual who is blind using a visual-to-tactile mapping of facial action units and emotions.

This paper presents the design and evaluation of a haptic interface for augmenting human-human interpersonal interactions by delivering facial expressions of an interaction partner to an individual who is blind using a visual-to-tactile mapping of facial action units and emotions. Pancake shaftless vibration motors are mounted on the back of a chair to provide vibrotactile stimulation in the context of a dyadic (one-on-one) interaction across a table. This work explores the design of spatiotemporal vibration patterns that can be used to convey the basic building blocks of facial movements according to the Facial Action Unit Coding System. A behavioral study was conducted to explore the factors that influence the naturalness of conveying affect using vibrotactile cues.

Contributors

Agent

Created

Date Created
2014-05

158120-Thumbnail Image.png

Language Image Transformer

Description

Humans perceive the environment using multiple modalities like vision, speech (language), touch, taste, and smell. The knowledge obtained from one modality usually complements the other. Learning through several modalities helps in constructing an accurate model of the environment. Most of

Humans perceive the environment using multiple modalities like vision, speech (language), touch, taste, and smell. The knowledge obtained from one modality usually complements the other. Learning through several modalities helps in constructing an accurate model of the environment. Most of the current vision and language models are modality-specific and, in many cases, extensively use deep-learning based attention mechanisms for learning powerful representations. This work discusses the role of attention in associating vision and language for generating shared representation. Language Image Transformer (LIT) is proposed for learning multi-modal representations of the environment. It uses a training objective based on Contrastive Predictive Coding (CPC) to maximize the Mutual Information (MI) between the visual and linguistic representations. It learns the relationship between the modalities using the proposed cross-modal attention layers. It is trained and evaluated using captioning datasets, MS COCO, and Conceptual Captions. The results and the analysis offers a perspective on the use of Mutual Information Maximisation (MIM) for generating generalizable representations across multiple modalities.

Contributors

Agent

Created

Date Created
2020

158278-Thumbnail Image.png

Generalized Domain Adaptation for Visual Domains

Description

Humans have a great ability to recognize objects in different environments irrespective of their variations. However, the same does not apply to machine learning models which are unable to generalize to images of objects from different domains. The generalization of

Humans have a great ability to recognize objects in different environments irrespective of their variations. However, the same does not apply to machine learning models which are unable to generalize to images of objects from different domains. The generalization of these models to new data is constrained by the domain gap. Many factors such as image background, image resolution, color, camera perspective and variations in the objects are responsible for the domain gap between the training data (source domain) and testing data (target domain). Domain adaptation algorithms aim to overcome the domain gap between the source and target domains and learn robust models that can perform well across both the domains.

This thesis provides solutions for the standard problem of unsupervised domain adaptation (UDA) and the more generic problem of generalized domain adaptation (GDA). The contributions of this thesis are as follows. (1) Certain and Consistent Domain Adaptation model for closed-set unsupervised domain adaptation by aligning the features of the source and target domain using deep neural networks. (2) A multi-adversarial deep learning model for generalized domain adaptation. (3) A gating model that detects out-of-distribution samples for generalized domain adaptation.

The models were tested across multiple computer vision datasets for domain adaptation.

The dissertation concludes with a discussion on the proposed approaches and future directions for research in closed set and generalized domain adaptation.

Contributors

Agent

Created

Date Created
2020

158259-Thumbnail Image.png

Incremental Learning With Sample Generation From Pretrained Networks

Description

In the last decade deep learning based models have revolutionized machine learning and computer vision applications. However, these models are data-hungry and training them is a time-consuming process. In addition, when deep neural networks are updated to augment their prediction

In the last decade deep learning based models have revolutionized machine learning and computer vision applications. However, these models are data-hungry and training them is a time-consuming process. In addition, when deep neural networks are updated to augment their prediction space with new data, they run into the problem of catastrophic forgetting, where the model forgets previously learned knowledge as it overfits to the newly available data. Incremental learning algorithms enable deep neural networks to prevent catastrophic forgetting by retaining knowledge of previously observed data while also learning from newly available data.

This thesis presents three models for incremental learning; (i) Design of an algorithm for generative incremental learning using a pre-trained deep neural network classifier; (ii) Development of a hashing based clustering algorithm for efficient incremental learning; (iii) Design of a student-teacher coupled neural network to distill knowledge for incremental learning. The proposed algorithms were evaluated using popular vision datasets for classification tasks. The thesis concludes with a discussion about the feasibility of using these techniques to transfer information between networks and also for incremental learning applications.

Contributors

Agent

Created

Date Created
2020

157758-Thumbnail Image.png

Deep domain fusion for adaptive image classification

Description

Endowing machines with the ability to understand digital images is a critical task for a host of high-impact applications, including pathology detection in radiographic imaging, autonomous vehicles, and assistive technology for the visually impaired. Computer vision systems rely on large

Endowing machines with the ability to understand digital images is a critical task for a host of high-impact applications, including pathology detection in radiographic imaging, autonomous vehicles, and assistive technology for the visually impaired. Computer vision systems rely on large corpora of annotated data in order to train task-specific visual recognition models. Despite significant advances made over the past decade, the fact remains collecting and annotating the data needed to successfully train a model is a prohibitively expensive endeavor. Moreover, these models are prone to rapid performance degradation when applied to data sampled from a different domain. Recent works in the development of deep adaptation networks seek to overcome these challenges by facilitating transfer learning between source and target domains. In parallel, the unification of dominant semi-supervised learning techniques has illustrated unprecedented potential for utilizing unlabeled data to train classification models in defiance of discouragingly meager sets of annotated data.

In this thesis, a novel domain adaptation algorithm -- Domain Adaptive Fusion (DAF) -- is proposed, which encourages a domain-invariant linear relationship between the pixel-space of different domains and the prediction-space while being trained under a domain adversarial signal. The thoughtful combination of key components in unsupervised domain adaptation and semi-supervised learning enable DAF to effectively bridge the gap between source and target domains. Experiments performed on computer vision benchmark datasets for domain adaptation endorse the efficacy of this hybrid approach, outperforming all of the baseline architectures on most of the transfer tasks.

Contributors

Agent

Created

Date Created
2019