This collection includes both ASU Theses and Dissertations, submitted by graduate students, and the Barrett, Honors College theses submitted by undergraduate students. 

Displaying 201 - 210 of 213
Filtering by

Clear all filters

165198-Thumbnail Image.jpg
ContributorsHaagen, Jordan (Author) / Turaga, Pavan (Thesis director) / Drummond Otten, Caitlin (Committee member) / Barrett, The Honors College (Contributor) / Arts, Media and Engineering Sch T (Contributor)
Created2022-05
165200-Thumbnail Image.png
Description
Oftentimes, patients struggle to accurately describe their symptoms to medical professionals, which produces erroneous diagnoses, delaying and preventing treatment. My app, Augnosis, will streamline constructive communication between patient and doctor, and allow for more accurate diagnoses. The goal of this project was to create an app capable of gathering data

Oftentimes, patients struggle to accurately describe their symptoms to medical professionals, which produces erroneous diagnoses, delaying and preventing treatment. My app, Augnosis, will streamline constructive communication between patient and doctor, and allow for more accurate diagnoses. The goal of this project was to create an app capable of gathering data on visual symptoms of facial acne and categorizing it to differentiate between diagnoses using image recognition and identification. “Augnosis”, is a combination of the words “Augmented Reality” and “Self-Diagnosis”, the former being the medium in which it is immersed and the latter detailing its functionality.
ContributorsGoyal, Nandika (Author) / Johnson, Mina (Thesis director) / Bryan, Chris (Committee member) / Turaga, Pavan (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)
Created2022-05
190759-Thumbnail Image.png
Description
This thesis presents robust and novel solutions using knowledge distillation with geometric approaches and multimodal data that can address the current challenges in deep learning, providing a comprehensive understanding of the learning process involved in knowledge distillation. Deep learning has attained significant success in various applications, such as health and

This thesis presents robust and novel solutions using knowledge distillation with geometric approaches and multimodal data that can address the current challenges in deep learning, providing a comprehensive understanding of the learning process involved in knowledge distillation. Deep learning has attained significant success in various applications, such as health and wellness promotion, smart homes, and intelligent surveillance. In general, stacking more layers or increasing the number of trainable parameters causes deep networks to exhibit improved performance. However, this causes the model to become large, resulting in an additional need for computing and power resources for training, storage, and deployment. These are the core challenges in incorporating such models into small devices with limited power and computational resources. In this thesis, robust solutions aimed at addressing the aforementioned challenges are presented. These proposed methodologies and algorithmic contributions enhance the performance and efficiency of deep learning models. The thesis encompasses a comprehensive exploration of knowledge distillation, an approach that holds promise for creating compact models from high-capacity ones, while preserving their performance. This exploration covers diverse datasets, including both time series and image data, shedding light on the pivotal role of augmentation methods in knowledge distillation. The effects of these methods are rigorously examined through empirical experiments. Furthermore, the study within this thesis delves into the efficient utilization of features derived from two different teacher models, each trained on dissimilar data representations, including time-series and image data. Through these investigations, I present novel approaches to knowledge distillation, leveraging geometric techniques for the analysis of multimodal data. These solutions not only address real-world challenges but also offer valuable insights and recommendations for modeling in new applications.
ContributorsJeon, Eunsom (Author) / Turaga, Pavan (Thesis advisor) / Li, Baoxin (Committee member) / Lee, Hyunglae (Committee member) / Jayasuriya, Suren (Committee member) / Arizona State University (Publisher)
Created2023
190963-Thumbnail Image.png
Description
The need for robust verification and validation of automated vehicles (AVs) to ensure driving safety grows more urgent as increasing numbers of AVs are allowed to operate on open roads. To address this need, AV developers can present a safety case to regulators and the public that provides an evidence-based

The need for robust verification and validation of automated vehicles (AVs) to ensure driving safety grows more urgent as increasing numbers of AVs are allowed to operate on open roads. To address this need, AV developers can present a safety case to regulators and the public that provides an evidence-based justification of their assertion that an AV is safe to operate on open roads. This work aims to describe the development of a scenario-based testing methodology that contributes to this safety case. A high-level definition of this test selection and scoring methodology (TSSM) is first presented, along with an outline of its scope and key ideas. This is followed by a literature review that details the current state of the art in AV testing, including the driving performance metrics and equations that provide a basis for the TSSM. A chart-based method for quantifying an AV’s operational design domain (ODD) and behavioral competency portfolio is then described that provides the foundation for a scenario generation and filtration process. After outlining a method for the AV to progress through increasingly robust test methods based on its current technology readiness level (TRL), the generation and filtration of two sets of scenarios by the TSSM is outlined: a standardized set that can be used to compare the performance of vehicles with identical ODD and behavioral competency portfolios, and a set containing high-relevance scenarios that is partially randomized to ensure test integrity. A related framework for incorporating testing on open roads is subsequently specified. An equation for an overall AV driving performance score is then defined that quantifies the aggregate performance of the AV across all generated scenarios. The TSSM continues according to an iterative process, which includes a method for exploring edge and corner scenarios, until a stopping condition is achieved. Two proofs of concept are provided: a demonstration of the ability of the TSSM to pare scenarios from a preexisting database, and an example ODD and behavioral competency portfolio specification form. Finally, this work concludes by evaluating the TSSM and its proofs of concept and outlining possible future work on the methodology.
ContributorsO'Malley, Gavin (Author) / Wishart, Jeffrey (Thesis advisor) / Zhao, Junfeng (Thesis advisor) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created2023
190708-Thumbnail Image.png
Description
Generative models are deep neural network-based models trained to learn the underlying distribution of a dataset. Once trained, these models can be used to sample novel data points from this distribution. Their impressive capabilities have been manifested in various generative tasks, encompassing areas like image-to-image translation, style transfer, image editing,

Generative models are deep neural network-based models trained to learn the underlying distribution of a dataset. Once trained, these models can be used to sample novel data points from this distribution. Their impressive capabilities have been manifested in various generative tasks, encompassing areas like image-to-image translation, style transfer, image editing, and more. One notable application of generative models is data augmentation, aimed at expanding and diversifying the training dataset to augment the performance of deep learning models for a downstream task. Generative models can be used to create new samples similar to the original data but with different variations and properties that are difficult to capture with traditional data augmentation techniques. However, the quality, diversity, and controllability of the shape and structure of the generated samples from these models are often directly proportional to the size and diversity of the training dataset. A more extensive and diverse training dataset allows the generative model to capture overall structures present in the data and generate more diverse and realistic-looking samples. In this dissertation, I present innovative methods designed to enhance the robustness and controllability of generative models, drawing upon physics-based, probabilistic, and geometric techniques. These methods help improve the generalization and controllability of the generative model without necessarily relying on large training datasets. I enhance the robustness of generative models by integrating classical geometric moments for shape awareness and minimizing trainable parameters. Additionally, I employ non-parametric priors for the generative model's latent space through basic probability and optimization methods to improve the fidelity of interpolated images. I adopt a hybrid approach to address domain-specific challenges with limited data and controllability, combining physics-based rendering with generative models for more realistic results. These approaches are particularly relevant in industrial settings, where the training datasets are small and class imbalance is common. Through extensive experiments on various datasets, I demonstrate the effectiveness of the proposed methods over conventional approaches.
ContributorsSingh, Rajhans (Author) / Turaga, Pavan (Thesis advisor) / Jayasuriya, Suren (Committee member) / Berisha, Visar (Committee member) / Fazli, Pooyan (Committee member) / Arizona State University (Publisher)
Created2023
193413-Thumbnail Image.png
Description
Recently developed large language models have achieved remarkable success on a wide range of natural language tasks. Furthermore, they have been shown to possess an impressive ability to generate fluent and coherent text. Despite all the notable abilities of these models, there exist several efficiency and reliability related challenges. For

Recently developed large language models have achieved remarkable success on a wide range of natural language tasks. Furthermore, they have been shown to possess an impressive ability to generate fluent and coherent text. Despite all the notable abilities of these models, there exist several efficiency and reliability related challenges. For example, they are vulnerable to a phenomenon called 'hallucination' in which they generate text that is not factually correct and they also have a large number of parameters which makes their inference slow and computationally expensive. With the objective of taking a step closer towards further enabling the widespread adoption of the Natural Language Processing (NLP) systems, this dissertation studies the following question: how to effectively address the efficiency and reliability related concerns of the NLP systems? Specifically, to improve the reliability of models, this dissertation first presents an approach that actively detects and mitigates the hallucinations of LLMs using a retrieval augmented methodology. Note that another strategy to mitigate incorrect predictions is abstention from answering when error is likely, i.e., selective prediction. To this end, I present selective prediction approaches and conduct extensive experiments to demonstrate their effectiveness. Building on top of selective prediction, I also present post-abstention strategies that focus on reliably increasing the coverage of a selective prediction system without considerably impacting its accuracy. Furthermore, this dissertation covers multiple aspects of improving the efficiency including 'inference efficiency' (making model inferences in a computationally efficient manner without sacrificing the prediction accuracy), 'data sample efficiency' (efficiently collecting data instances for training a task-specific system), 'open-domain QA reader efficiency' (leveraging the external knowledge efficiently while answering open-domain questions), and 'evaluation efficiency' (comparing the performance of different models efficiently). In summary, this dissertation highlights several challenges pertinent to the efficiency and reliability involved in the development of NLP systems and provides effective solutions to address them.
ContributorsVarshney, Neeraj (Author) / Baral, Chitta (Thesis advisor) / Yang, Yezhou (Committee member) / Gopalan, Nakul (Committee member) / Banerjee, Pratyay (Committee member) / Arizona State University (Publisher)
Created2024
193509-Thumbnail Image.png
Description
In the rapidly evolving field of computer vision, propelled by advancements in deeplearning, the integration of hardware-software co-design has become crucial to overcome the limitations of traditional imaging systems. This dissertation explores the integration of hardware-software co-design in computational imaging, particularly in light transport acquisition and Non-Line-of-Sight (NLOS) imaging. By leveraging projector-camera systems and

In the rapidly evolving field of computer vision, propelled by advancements in deeplearning, the integration of hardware-software co-design has become crucial to overcome the limitations of traditional imaging systems. This dissertation explores the integration of hardware-software co-design in computational imaging, particularly in light transport acquisition and Non-Line-of-Sight (NLOS) imaging. By leveraging projector-camera systems and computational techniques, this thesis address critical challenges in imaging complex environments, such as adverse weather conditions, low-light scenarios, and the imaging of reflective or transparent objects. The first contribution in this thesis is the theory, design, and implementation of a slope disparity gating system, which is a vertically aligned configuration of a synchronized raster scanning projector and rolling-shutter camera, facilitating selective imaging through disparity-based triangulation. This system introduces a novel, hardware-oriented approach to selective imaging, circumventing the limitations of post-capture processing. The second contribution of this thesis is the realization of two innovative approaches for spotlight optimization to improve localization and tracking for NLOS imaging. The first approach utilizes radiosity-based optimization to improve 3D localization and object identification for small-scale laboratory settings. The second approach introduces a learningbased illumination network along with a differentiable renderer and NLOS estimation network to optimize human 2D localization and activity recognition. This approach is validated on a large, room-scale scene with complex line-of-sight geometries and occluders. The third contribution of this thesis is an attention-based neural network for passive NLOS settings where there is no controllable illumination. The thesis demonstrates realtime, dynamic NLOS human tracking where the camera is moving on a mobile robotic platform. In addition, this thesis contains an appendix featuring temporally consistent relighting for portrait videos with applications in computer graphics and vision.
ContributorsChandran, Sreenithy (Author) / Jayasuriya, Suren (Thesis advisor) / Turaga, Pavan (Committee member) / Dasarathy, Gautam (Committee member) / Kubo, Hiroyuki (Committee member) / Arizona State University (Publisher)
Created2024
193491-Thumbnail Image.png
Description
With the exponential growth of multi-modal data in the field of computer vision, the ability to do inference effectively among multiple modalities—such as visual, textual, and auditory data—shows significant opportunities. The rapid development of cross-modal applications such as retrieval and association is primarily attributed to their ability to bridge the

With the exponential growth of multi-modal data in the field of computer vision, the ability to do inference effectively among multiple modalities—such as visual, textual, and auditory data—shows significant opportunities. The rapid development of cross-modal applications such as retrieval and association is primarily attributed to their ability to bridge the gap between different modalities of data. However, the current mainstream cross-modal methods always heavily rely on the availability of fully annotated paired data, presenting a significant challenge due to the scarcity of precisely matched datasets in real-world scenarios. In response to this bottleneck, several sophisticated deep learning algorithms are designed to substantially improve the inference capabilities across a broad spectrum of cross-modal applications. This dissertation introduces novel deep learning algorithms aimed at enhancing inference capabilities in cross-modal applications, which take four primary aspects. Firstly, it introduces the algorithm for image retrieval by learning hashing codes. This algorithm only utilizes the other modality data in weakly supervised tags format rather than the supervised label. Secondly, it designs a novel framework for learning the joint embeddings of images and texts for the cross-modal retrieval tasks. It efficiently learns the binary codes from the continuous CLIP feature space and can even deliver competitive performance compared with the results from non-hashing methods. Thirdly, it conducts a method to learn the fragment-level embeddings that capture fine-grained cross-modal association in images and texts. This method uses the fragment proposals in an unsupervised manner. Lastly, this dissertation also outlines the algorithm to enhance the mask-text association ability of pre-trained semantic segmentation models with zero examples provided. Extensive future plans to further improve this algorithm for semantic segmentation tasks will be discussed.
ContributorsZhuo, Yaoxin (Author) / Li, Baoxin (Thesis advisor) / Wu, Teresa (Committee member) / Davulcu, Hasan (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created2024
193555-Thumbnail Image.png
Description
Rapid advancements in artificial intelligence (AI) have revolutionized various do- mains, enabling the development of sophisticated models capable of solving complex problems. However, as AI systems increasingly participate in critical decision-making processes, concerns about their interpretability, robustness, and reliability have in- tensified. Interpretable AI models, such as the Concept-Centric Transformer

Rapid advancements in artificial intelligence (AI) have revolutionized various do- mains, enabling the development of sophisticated models capable of solving complex problems. However, as AI systems increasingly participate in critical decision-making processes, concerns about their interpretability, robustness, and reliability have in- tensified. Interpretable AI models, such as the Concept-Centric Transformer (CCT), have emerged as promising solutions to enhance transparency in AI models. Yet, in- creasing model interpretability often requires enriching training data with concept ex- planations, escalating training costs. Therefore, intrinsically interpretable models like CCT must be designed to be data-efficient, generalizable—to accommodate smaller training sets—and robust against noise and adversarial attacks. Despite progress in interpretable AI, ensuring the robustness of these models remains a challenge.This thesis enhances the data efficiency and generalizability of the CCT model by integrating four techniques: Perturbation Random Masking (PRM), Attention Random Dropout (ARD), and the integration of manifold mixup and input mixup for memory broadcast. Comprehensive experiments on benchmark datasets such as CIFAR-100, CUB-200-2011, and ImageNet show that the enhanced CCT model achieves modest performance improvements over the original model when using a full training set. Furthermore, this performance gap increases as the training data volume decreases, particularly in few-shot learning scenarios. The enhanced CCT maintains high accuracy with limited data (even without explicitly training on ex- ample concept-level explanations), demonstrating its potential for real-world appli- cations where labeled data are scarce. These findings suggest that the enhancements enable more effective use of CCT in settings with data constraints. Ablation studies reveal that no single technique—PRM, ARD, or mixups—dominates in enhancing performance and data efficiency. Each contributes nearly equally, and their combined application yields the best results, indicating a synergistic effect that bolsters the model’s capabilities without any single method being predominant. The results of this research highlight the efficacy of the proposed enhancements in refining CCT models for greater performance, robustness, and data efficiency. By demonstrating improved performance and resilience, particularly in data-limited sce- narios, this thesis underscores the practical applicability of advanced AI systems in critical decision-making roles.
ContributorsPark, Keun Hee (Author) / Pavlic, Theodore (Thesis advisor) / Choi, YooJung (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created2024
193546-Thumbnail Image.png
Description
In the age of artificial intelligence, Machine Learning (ML) has become a pervasive force, impacting countless aspects of our lives. As ML’s influence expands, concerns about its reliability and trustworthiness have intensified, with security and robustness emerging as significant challenges. For instance, it has been demonstrated that slight perturbations to

In the age of artificial intelligence, Machine Learning (ML) has become a pervasive force, impacting countless aspects of our lives. As ML’s influence expands, concerns about its reliability and trustworthiness have intensified, with security and robustness emerging as significant challenges. For instance, it has been demonstrated that slight perturbations to a stop sign can cause ML classifiers to misidentify it as a speed limit sign, raising concerns about whether ML algorithms are suitable for real-world deployments. To tackle these issues, Responsible Machine Learning (Responsible ML) has emerged with a clear mission: to develop secure and robust ML algorithms. This dissertation aims to develop Responsible Machine Learning algorithms under real-world constraints. Specifically, recognizing the role of adversarial attacks in exposing security vulnerabilities and robustifying the ML methods, it lays down the foundation of Responsible ML by outlining a novel taxonomy of adversarial attacks within real-world settings, categorizing them into black-box target-specific, and target-agnostic attacks. Subsequently, it proposes potent adversarial attacks in each category, aiming to obtain effectiveness and efficiency. Transcending conventional boundaries, it then introduces the notion of causality into Responsible ML (a.k.a., Causal Responsible ML), presenting the causal adversarial attack. This represents the first principled framework to explain the transferability of adversarial attacks to unknown models by identifying their common source of vulnerabilities, thereby exposing the pinnacle of threat and vulnerability: conducting successful attacks on any model with no prior knowledge. Finally, acknowledging the surge of Generative AI, this dissertation explores Responsible ML for Generative AI. It introduces a novel adversarial attack that unveils their adversarial vulnerabilities and devises a strong defense mechanism to bolster the models’ robustness against potential attacks.
ContributorsMoraffah, Raha (Author) / Liu, Huan (Thesis advisor) / Yang, Yezhou (Committee member) / Xiao, Chaowei (Committee member) / Turaga, Pavan (Committee member) / Carley, Kathleen (Committee member) / Arizona State University (Publisher)
Created2024