This collection includes both ASU Theses and Dissertations, submitted by graduate students, and the Barrett, Honors College theses submitted by undergraduate students. 

Displaying 1 - 3 of 3
Filtering by

Clear all filters

153249-Thumbnail Image.png
Description
In this thesis we consider the problem of facial expression recognition (FER) from video sequences. Our method is based on subspace representations and Grassmann manifold based learning. We use Local Binary Pattern (LBP) at the frame level for representing the facial features. Next we develop a model to represent the

In this thesis we consider the problem of facial expression recognition (FER) from video sequences. Our method is based on subspace representations and Grassmann manifold based learning. We use Local Binary Pattern (LBP) at the frame level for representing the facial features. Next we develop a model to represent the video sequence in a lower dimensional expression subspace and also as a linear dynamical system using Autoregressive Moving Average (ARMA) model. As these subspaces lie on Grassmann space, we use Grassmann manifold based learning techniques such as kernel Fisher Discriminant Analysis with Grassmann kernels for classification. We consider six expressions namely, Angry (AN), Disgust (Di), Fear (Fe), Happy (Ha), Sadness (Sa) and Surprise (Su) for classification. We perform experiments on extended Cohn-Kanade (CK+) facial expression database to evaluate the expression recognition performance. Our method demonstrates good expression recognition performance outperforming other state of the art FER algorithms. We achieve an average recognition accuracy of 97.41% using a method based on expression subspace, kernel-FDA and Support Vector Machines (SVM) classifier. By using a simpler classifier, 1-Nearest Neighbor (1-NN) along with kernel-FDA, we achieve a recognition accuracy of 97.09%. We find that to process a group of 19 frames in a video sequence, LBP feature extraction requires majority of computation time (97 %) which is about 1.662 seconds on the Intel Core i3, dual core platform. However when only 3 frames (onset, middle and peak) of a video sequence are used, the computational complexity is reduced by about 83.75 % to 260 milliseconds at the expense of drop in the recognition accuracy to 92.88 %.
ContributorsYellamraju, Anirudh (Author) / Chakrabarti, Chaitali (Thesis advisor) / Turaga, Pavan (Thesis advisor) / Karam, Lina (Committee member) / Arizona State University (Publisher)
Created2014
189297-Thumbnail Image.png
Description
This thesis encompasses a comprehensive research effort dedicated to overcoming the critical bottlenecks that hinder the current generation of neural networks, thereby significantly advancing their reliability and performance. Deep neural networks, with their millions of parameters, suffer from over-parameterization and lack of constraints, leading to limited generalization capabilities. In other

This thesis encompasses a comprehensive research effort dedicated to overcoming the critical bottlenecks that hinder the current generation of neural networks, thereby significantly advancing their reliability and performance. Deep neural networks, with their millions of parameters, suffer from over-parameterization and lack of constraints, leading to limited generalization capabilities. In other words, the complex architecture and millions of parameters present challenges in finding the right balance between capturing useful patterns and avoiding noise in the data. To address these issues, this thesis explores novel solutions based on knowledge distillation, enabling the learning of robust representations. Leveraging the capabilities of large-scale networks, effective learning strategies are developed. Moreover, the limitations of dependency on external networks in the distillation process, which often require large-scale models, are effectively overcome by proposing a self-distillation strategy. The proposed approach empowers the model to generate high-level knowledge within a single network, pushing the boundaries of knowledge distillation. The effectiveness of the proposed method is not only demonstrated across diverse applications, including image classification, object detection, and semantic segmentation but also explored in practical considerations such as handling data scarcity and assessing the transferability of the model to other learning tasks. Another major obstacle hindering the development of reliable and robust models lies in their black-box nature, impeding clear insights into the contributions toward the final predictions and yielding uninterpretable feature representations. To address this challenge, this thesis introduces techniques that incorporate simple yet powerful deep constraints rooted in Riemannian geometry. These constraints confer geometric qualities upon the latent representation, thereby fostering a more interpretable and insightful representation. In addition to its primary focus on general tasks like image classification and activity recognition, this strategy offers significant benefits in real-world applications where data scarcity is prevalent. Moreover, its robustness in feature removal showcases its potential for edge applications. By successfully tackling these challenges, this research contributes to advancing the field of machine learning and provides a foundation for building more reliable and robust systems across various application domains.
ContributorsChoi, Hongjun (Author) / Turaga, Pavan (Thesis advisor) / Jayasuriya, Suren (Committee member) / Li, Wenwen (Committee member) / Fazli, Pooyan (Committee member) / Arizona State University (Publisher)
Created2023
158817-Thumbnail Image.png
Description
Over the past decade, machine learning research has made great strides and significant impact in several fields. Its success is greatly attributed to the development of effective machine learning algorithms like deep neural networks (a.k.a. deep learning), availability of large-scale databases and access to specialized hardware like Graphic Processing Units.

Over the past decade, machine learning research has made great strides and significant impact in several fields. Its success is greatly attributed to the development of effective machine learning algorithms like deep neural networks (a.k.a. deep learning), availability of large-scale databases and access to specialized hardware like Graphic Processing Units. When designing and training machine learning systems, researchers often assume access to large quantities of data that capture different possible variations. Variations in the data is needed to incorporate desired invariance and robustness properties in the machine learning system, especially in the case of deep learning algorithms. However, it is very difficult to gather such data in a real-world setting. For example, in certain medical/healthcare applications, it is very challenging to have access to data from all possible scenarios or with the necessary amount of variations as required to train the system. Additionally, the over-parameterized and unconstrained nature of deep neural networks can cause them to be poorly trained and in many cases over-confident which, in turn, can hamper their reliability and generalizability. This dissertation is a compendium of my research efforts to address the above challenges. I propose building invariant feature representations by wedding concepts from topological data analysis and Riemannian geometry, that automatically incorporate the desired invariance properties for different computer vision applications. I discuss how deep learning can be used to address some of the common challenges faced when working with topological data analysis methods. I describe alternative learning strategies based on unsupervised learning and transfer learning to address issues like dataset shifts and limited training data. Finally, I discuss my preliminary work on applying simple orthogonal constraints on deep learning feature representations to help develop more reliable and better calibrated models.
ContributorsSom, Anirudh (Author) / Turaga, Pavan (Thesis advisor) / Krishnamurthi, Narayanan (Committee member) / Spanias, Andreas (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)
Created2020