Matching Items (6)

Description
Communicating with computers through thought has become a remarkable achievement in recent years, made possible by electroencephalography (EEG). Brain-computer interfaces (BCIs) rely heavily on EEG signals for communication between humans and computers. With the advent of deep learning, many recent studies have applied these techniques to EEG data to perform tasks such as emotion recognition, motor imagery classification, sleep analysis, and more. Despite the rising interest in EEG signal classification, very few studies have explored the MindBigData dataset, which collects EEG signals recorded while a subject sees a digit and thinks about it. This dataset brings us closer to realizing the idea of mind-reading, or communication via thought; classifying these signals into the digit the user is thinking about is therefore a challenging task, and it motivates the study of this dataset with existing deep learning techniques. Given the recent success of the transformer architecture in domains such as Computer Vision and Natural Language Processing, this thesis studies the transformer architecture for EEG signal classification and also explores other deep learning techniques for the same task. The proposed classification pipeline achieves performance comparable to existing methods.
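The abstract does not specify the architecture, but a minimal sketch of a transformer-based EEG classifier of the kind it describes might look like the following (the channel count, model size, window length, and ten-digit output head are illustrative assumptions, not the thesis's actual configuration):

```python
import torch
import torch.nn as nn

class EEGTransformerClassifier(nn.Module):
    """Hypothetical transformer encoder for classifying EEG windows into digits 0-9."""
    def __init__(self, n_channels=14, d_model=64, n_heads=4, n_layers=2, n_classes=10):
        super().__init__()
        # Project each multi-channel EEG time step into the model dimension.
        self.input_proj = nn.Linear(n_channels, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        self.classifier = nn.Linear(d_model, n_classes)

    def forward(self, x):                       # x: (batch, time, channels)
        h = self.input_proj(x)                  # (batch, time, d_model)
        h = self.encoder(h)                     # self-attention over time steps
        return self.classifier(h.mean(dim=1))   # pool over time, predict digit

# Example: a batch of 8 windows of 256 samples from a 14-electrode headset.
logits = EEGTransformerClassifier()(torch.randn(8, 256, 14))
print(logits.shape)  # torch.Size([8, 10])
```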
Contributors: Muglikar, Omkar Dushyant (Author) / Wang, Yalin (Thesis advisor) / Liang, Jianming (Committee member) / Venkateswara, Hemanth (Committee member) / Arizona State University (Publisher)
Created: 2021
Description
Recently, Generative Adversarial Networks (GANs) have been applied to the problem of Cold-Start Recommendation, but the training performance of these models is hampered by the extreme sparsity of warm users' purchase behavior. This thesis introduces a novel representation for user vectors that combines user demographics with user preferences, making the model a hybrid system that uses both Collaborative Filtering and Content-Based Recommendation. The system models user purchase behavior using weighted user-product preferences (explicit feedback) rather than binary user-product interactions (implicit feedback). Using this representation, a novel sparse adversarial model, the Sparse ReguLarized Generative Adversarial Network (SRLGAN), is developed for Cold-Start Recommendation. SRLGAN leverages the sparse user-purchase behavior, which ensures training stability and avoids over-fitting on warm users. The performance of SRLGAN is evaluated on two popular datasets and demonstrates state-of-the-art results.
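The abstract does not give SRLGAN's actual objective, but a hedged sketch of how a sparsity penalty on generated preference vectors might be combined with an adversarial loss is shown below (the network sizes, the L1 penalty, and lambda_sparse are illustrative assumptions in the spirit of the idea, not the thesis's formulation):

```python
import torch
import torch.nn as nn

# Toy generator/discriminator over a catalog of 1000 products.
G = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 1000), nn.Sigmoid())
D = nn.Sequential(nn.Linear(1000, 256), nn.ReLU(), nn.Linear(256, 1))

adv_loss = nn.BCEWithLogitsLoss()
lambda_sparse = 0.1  # illustrative regularization weight

z = torch.randn(32, 128)          # noise + (in a hybrid model) user demographics
fake_prefs = G(z)                 # generated weighted user-product preferences
real_labels = torch.ones(32, 1)

# The generator wants D to accept its fakes, while an L1 term keeps the
# generated preference vectors as sparse as real warm-user purchase behavior.
g_loss = adv_loss(D(fake_prefs), real_labels) + lambda_sparse * fake_prefs.abs().mean()
g_loss.backward()
```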
Contributors: Shah, Aksheshkumar Ajaykumar (Author) / Venkateswara, Hemanth (Thesis advisor) / Berman, Spring (Thesis advisor) / Ladani, Leila J (Committee member) / Arizona State University (Publisher)
Created: 2022
Description
Visual object recognition has achieved great success with advancements in deep learning technologies. Notably, existing recognition models have attained human-level performance on many recognition tasks. However, these models are data-hungry, and their performance is constrained by the amount of training data. Inspired by the human ability to recognize object categories from textual descriptions and prior visual knowledge, the research community has extensively pursued the area of zero-shot learning. In this area of research, machine vision models are trained to recognize object categories that are not observed during training. Zero-shot learning models leverage textual information to transfer visual knowledge from seen object categories in order to recognize unseen object categories.

Generative models have recently gained popularity as they synthesize unseen visual features and convert zero-shot learning into a classical supervised learning problem. These generative models are trained using seen classes and are expected to implicitly transfer the knowledge from seen to unseen classes. However, their performance is stymied by overfitting towards seen classes, which leads to substandard performance in generalized zero-shot learning. To address this concern, this dissertation proposes a novel generative model that leverages the semantic relationship between seen and unseen categories and explicitly performs knowledge transfer from seen categories to unseen categories. Experiments were conducted on several benchmark datasets to demonstrate the efficacy of the proposed model for both zero-shot learning and generalized zero-shot learning. The dissertation also provides a unique Student-Teacher based generative model for zero-shot learning and concludes with future research directions in this area.
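As an illustration of the generative approach described above (this is a generic sketch, not the dissertation's model), a conditional generator can map class semantics plus noise to visual features, so that features for unseen classes can be synthesized and fed to an ordinary supervised classifier; the 85-dimensional attributes and 2048-dimensional ResNet-style features are assumed dimensions:

```python
import torch
import torch.nn as nn

class FeatureGenerator(nn.Module):
    """Conditional generator: class attributes + noise -> synthetic visual features."""
    def __init__(self, attr_dim=85, noise_dim=64, feat_dim=2048):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(attr_dim + noise_dim, 1024), nn.LeakyReLU(0.2),
            nn.Linear(1024, feat_dim), nn.ReLU())  # non-negative, ResNet-like features

    def forward(self, attrs, noise):
        return self.net(torch.cat([attrs, noise], dim=1))

G = FeatureGenerator()
unseen_attrs = torch.rand(16, 85)           # semantic vectors of unseen classes
fake_feats = G(unseen_attrs, torch.randn(16, 64))
print(fake_feats.shape)                     # torch.Size([16, 2048]) synthetic features
```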
Contributors: Vyas, Maunil Rohitbhai (Author) / Panchanathan, Sethuraman (Thesis advisor) / Venkateswara, Hemanth (Thesis advisor) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)
Created: 2020
Description
Over the past decade, advancements in neural networks have been instrumental in achieving remarkable breakthroughs in the field of computer vision. One of their applications is in creating assistive technology that improves the lives of visually impaired people by making the world around them more accessible. Extensive research on convolutional neural networks has led to human-level performance in vision tasks including image classification, object detection, instance segmentation, semantic segmentation, panoptic segmentation, and scene text recognition. All the aforementioned tasks, individually or in combination, have been used to create assistive technologies to improve accessibility for the blind.

This dissertation outlines various applications that improve accessibility and independence for visually impaired people while shopping by helping them identify products in retail stores. The dissertation includes the following contributions: (i) a dataset containing images of breakfast-cereal products and a classifier using a deep neural network (ResNet); (ii) a dataset for training a text detection and scene-text recognition model; (iii) a model for text detection and scene-text recognition to identify products from images captured with a user-controlled camera; (iv) a dataset of twenty thousand products, with product information and related images, that can be used to train and test a system designed to identify products.
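A minimal sketch of contribution (i), fine-tuning a pretrained ResNet as a product classifier, is shown below; the choice of ResNet-50, the 50 cereal classes, and the frozen-backbone strategy are illustrative assumptions, not details taken from the dissertation:

```python
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained ResNet-50 backbone.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
for param in model.parameters():
    param.requires_grad = False             # freeze the pretrained backbone

# Replace the final layer with a head for (hypothetically) 50 cereal products.
model.fc = nn.Linear(model.fc.in_features, 50)
# The new head's parameters are trainable by default; train with cross-entropy.
```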
Contributors: Patel, Akshar (Author) / Panchanathan, Sethuraman (Thesis advisor) / Venkateswara, Hemanth (Thesis advisor) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)
Created: 2020
Description
Feature embeddings differ from raw features in that the former obey certain properties, such as a notion of similarity/dissimilarity in their embedding space. word2vec is a preeminent example in this direction, where similarity in the embedding space is measured by cosine similarity. Such language embedding models have seen numerous applications in both the language and vision communities, as they capture the information in their modality (the English language) efficiently. Inspired by these language models, this work focuses on learning embedding spaces for two visual computing tasks: 1. Image Hashing and 2. Zero-Shot Learning. The training set was used to learn embedding spaces over which similarity/dissimilarity is measured using distance metrics such as Hamming, Euclidean, and cosine distances. While the above-mentioned language models learn generic word embeddings, this work learns task-specific embeddings that can be used separately for Image Retrieval and Classification.
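The three distance measures named above can be computed on toy embeddings as follows (a small illustrative snippet; the 128-dimensional vectors and the sign-thresholding used to obtain binary codes are assumptions for demonstration):

```python
import torch
import torch.nn.functional as F

a, b = torch.randn(128), torch.randn(128)
cosine = F.cosine_similarity(a, b, dim=0)   # word2vec-style similarity
euclidean = torch.dist(a, b)                # L2 distance

# Hamming distance applies to binary codes, as used in image hashing:
ca, cb = (a > 0).int(), (b > 0).int()
hamming = (ca != cb).sum()
print(cosine.item(), euclidean.item(), hamming.item())
```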

Image Hashing is the task of mapping images to binary codes such that some notion of user-defined similarity is preserved. The first part of this work focuses on designing a new framework that uses the hash-tags associated with web images to learn the binary codes. Such codes can be used in several applications, including Image Retrieval and Image Classification. Further, this framework requires no labelled data, making it very inexpensive. Results show that the proposed approach surpasses state-of-the-art approaches by a significant margin.
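A generic hashing head of the kind such frameworks use is sketched below (this is not the thesis's exact framework, and it omits the hash-tag supervision; the 48-bit code length and the tanh relaxation with sign() at retrieval time are common choices assumed here for illustration):

```python
import torch
import torch.nn as nn

class HashingHead(nn.Module):
    """Maps image features to relaxed binary codes."""
    def __init__(self, feat_dim=2048, n_bits=48):
        super().__init__()
        self.fc = nn.Linear(feat_dim, n_bits)

    def forward(self, features):
        return torch.tanh(self.fc(features))    # in (-1, 1) during training

head = HashingHead()
codes = torch.sign(head(torch.randn(4, 2048)))  # binary codes in {-1, +1}
# Retrieval compares codes by Hamming distance:
dist = (codes[0] != codes[1]).sum()
```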

Zero-shot classification is the task of classifying a test sample into a new class that was not seen during training. This is possible by establishing a relationship between the training and testing classes using auxiliary information. In the second part of this thesis, a framework is designed that trains with hand-crafted attribute vectors and word vectors but does not require the expensive attribute vectors at test time. More specifically, an intermediate space is learned between the word vector space and the image feature space using the hand-crafted attribute vectors. Preliminary results on two zero-shot classification datasets show that this is a promising direction to explore.
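A hedged sketch of the two-stage mapping described above follows: word vectors are mapped into an attribute-shaped intermediate space (supervised by the hand-crafted attributes during training) and from there into image-feature space, so only word vectors are needed at test time. The dimensions (300-d word2vec, 85-d attributes, 2048-d features) and the linear maps are assumptions, not the thesis's design:

```python
import torch
import torch.nn as nn

word_to_attr = nn.Linear(300, 85)    # trained to match attribute vectors
attr_to_feat = nn.Linear(85, 2048)   # aligns intermediate space with image features

# Test time: only the class name's word vector is required.
word_vec = torch.randn(1, 300)
predicted_feat = attr_to_feat(word_to_attr(word_vec))
# A test image is then classified by the nearest predicted class feature
# (e.g., by cosine similarity in image-feature space).
```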
Contributors: Gattupalli, Jaya Vijetha (Author) / Li, Baoxin (Thesis advisor) / Yang, Yezhou (Committee member) / Venkateswara, Hemanth (Committee member) / Arizona State University (Publisher)
Created: 2019
Description
In the last decade, deep learning-based models have revolutionized machine learning and computer vision applications. However, these models are data-hungry, and training them is a time-consuming process. In addition, when deep neural networks are updated to augment their prediction space with new data, they run into the problem of catastrophic forgetting, where the model forgets previously learned knowledge as it overfits to the newly available data. Incremental learning algorithms enable deep neural networks to prevent catastrophic forgetting by retaining knowledge of previously observed data while also learning from newly available data.

This thesis presents three models for incremental learning: (i) design of an algorithm for generative incremental learning using a pre-trained deep neural network classifier; (ii) development of a hashing-based clustering algorithm for efficient incremental learning; (iii) design of a student-teacher coupled neural network that distills knowledge for incremental learning. The proposed algorithms were evaluated on popular vision datasets for classification tasks. The thesis concludes with a discussion of the feasibility of using these techniques to transfer information between networks and for incremental learning applications.
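A minimal sketch of the distillation objective behind contribution (iii) is shown below: the student matches the frozen teacher's softened outputs on old classes while learning new labels from hard targets. The temperature T, weighting alpha, and class count are illustrative assumptions, not values from the thesis:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-softened distributions,
    # scaled by T^2 to keep gradient magnitudes comparable across temperatures.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    # Hard targets: ordinary cross-entropy on the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

loss = distillation_loss(torch.randn(8, 10), torch.randn(8, 10),
                         torch.randint(0, 10, (8,)))
```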
Contributors: Patil, Rishabh (Author) / Venkateswara, Hemanth (Thesis advisor) / Panchanathan, Sethuraman (Thesis advisor) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)
Created: 2020