Pavlic, Theodore

Integrating Adversarial Training, Noise Injection, and Mixup into XAI: Pathways to Enhancing Data Efficiency and Generalizability

Description

Rapid advancements in artificial intelligence (AI) have revolutionized various domains, enabling the development of sophisticated models capable of solving complex problems. However, as AI systems increasingly participate in critical decision-making processes, concerns about their interpretability, robustness, and reliability have intensified. Interpretable AI models, such as the Concept-Centric Transformer (CCT), have emerged as promising solutions to enhance transparency in AI models. Yet, increasing model interpretability often requires enriching training data with concept explanations, escalating training costs. Therefore, intrinsically interpretable models like CCT must be designed to be data-efficient, generalizable (to accommodate smaller training sets), and robust against noise and adversarial attacks. Despite progress in interpretable AI, ensuring the robustness of these models remains a challenge. This thesis enhances the data efficiency and generalizability of the CCT model by integrating four techniques: Perturbation Random Masking (PRM), Attention Random Dropout (ARD), and manifold mixup and input mixup for memory broadcast. Comprehensive experiments on benchmark datasets such as CIFAR-100, CUB-200-2011, and ImageNet show that the enhanced CCT model achieves modest performance improvements over the original model when using a full training set. Furthermore, this performance gap increases as the training data volume decreases, particularly in few-shot learning scenarios. The enhanced CCT maintains high accuracy with limited data (even without explicitly training on example concept-level explanations), demonstrating its potential for real-world applications where labeled data are scarce. These findings suggest that the enhancements enable more effective use of CCT in settings with data constraints. Ablation studies reveal that no single technique (PRM, ARD, or mixups) dominates in enhancing performance and data efficiency. Each contributes nearly equally, and their combined application yields the best results, indicating a synergistic effect that bolsters the model’s capabilities without any single method being predominant. The results of this research highlight the efficacy of the proposed enhancements in refining CCT models for greater performance, robustness, and data efficiency. By demonstrating improved performance and resilience, particularly in data-limited scenarios, this thesis underscores the practical applicability of advanced AI systems in critical decision-making roles.
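For readers unfamiliar with mixup, the general idea behind input mixup can be sketched as follows. This is a generic, minimal illustration in plain Python and not the thesis's implementation: the function name `mixup_pair`, the default `alpha`, and the list-based inputs are assumptions made for the example. Manifold mixup applies the same kind of interpolation to hidden representations rather than raw inputs; how either variant is wired into CCT's memory broadcast is not described in this record.

```python
import random


def mixup_pair(x_a, y_a, x_b, y_b, alpha=0.2, rng=random):
    """Blend two (input, one-hot label) pairs with a Beta(alpha, alpha) weight.

    Hypothetical helper for illustration only; alpha=0.2 is an assumed default.
    """
    # Sample the mixing coefficient lam in [0, 1].
    lam = rng.betavariate(alpha, alpha)
    # Interpolate the inputs and the labels with the same coefficient.
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam


if __name__ == "__main__":
    x, y, lam = mixup_pair([0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
    print(lam, x, y)
```

Because the labels are mixed with the same weight as the inputs, the mixed label still sums to one when both original labels are one-hot.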

Date Created

2024

Agent

Author (aut): Park, Keun Hee
Thesis advisor (ths): Pavlic, Theodore
Committee member: Choi, YooJung
Committee member: Yang, Yezhou
Publisher (pbl): Arizona State University

LanSAR – Language-commanded Scene-aware Action Response

Description

Robot motion and control remains a complex problem both in general and in the field of machine learning (ML). Without ML approaches, robot controllers are typically designed manually, which can take considerable time, generally requiring accounting for a range of edge cases and often producing models highly constrained to specific tasks. ML can decrease the time it takes to create a model while simultaneously allowing it to operate on a broader range of tasks. The utilization of neural networks to learn from demonstration is, in particular, an approach with growing popularity due to its potential to quickly fit the parameters of a model to mimic training data. Many such neural networks, especially in the realm of transformer-based architectures, act more as planners, taking in an initial context and then generating a sequence from that context one step at a time. Others hybridize the approach, predicting a latent plan and conditioning immediate actions on that plan. Such approaches may limit a model’s ability to interact with a dynamic environment, needing to replan to fully update its understanding of the environmental context. In this thesis, Language-commanded Scene-aware Action Response (LanSAR) is proposed as a reactive transformer-based neural network that makes immediate decisions based on previous actions and environmental changes. Its actions are further conditioned on a language command, serving as a control mechanism while also narrowing the distribution of possible actions around this command. It is shown that LanSAR successfully learns a strong representation of multimodal visual and spatial input, and learns reasonable motions in relation to most language commands. It is also shown that LanSAR can struggle with both the accuracy of motions and understanding the specific semantics of language commands.

Date Created

2024

Agent

Author (aut): Hardy, Adam
Thesis advisor (ths): Ben Amor, Heni
Committee member: Srivastava, Siddharth
Committee member: Pavlic, Theodore
Publisher (pbl): Arizona State University

Statistical Sequence Alignment of Protein Coding Regions

Description

Sequence alignment is an essential method in bioinformatics and the basis of many analyses, including phylogenetic inference, ancestral sequence reconstruction, and gene annotation. Sequence artifacts and errors made in alignment reconstruction can impact downstream analyses, leading to erroneous conclusions in comparative and functional genomic studies. While such errors are eventually fixed in the reference genomes of model organisms, many genomes used by researchers contain these artifacts, often forcing researchers to discard large amounts of data to prevent artifacts from impacting results. I developed COATi, a statistical, codon-aware pairwise aligner designed to align protein-coding sequences in the presence of artifacts commonly introduced by sequencing or annotation errors, such as early stop codons and abiological frameshifts. Unlike common sequence aligners, which rely on amino acid translations, only model insertions and deletions between codons, or lack a statistical model, COATi combines a codon substitution model specifically designed for protein-coding regions, a complex insertion-deletion model, and a sequencing base calling error step. The alignment algorithm is based on finite state transducers (FSTs), computational machines well-suited for modeling sequence evolution. I show that COATi outperforms available methods using a simulated empirical pairwise alignment dataset as a benchmark. The FST-based model and alignment algorithm in COATi are resource-intensive for sequences longer than a few kilobases. To address this constraint, I developed an approximate model compatible with traditional dynamic programming alignment algorithms. I describe how the original codon substitution model is transformed to build an approximate model and how the alignment algorithm is implemented by modifying the popular Gotoh algorithm. I simulated a benchmark of alignments and measured how well the marginal models approximate the original method.
Finally, I present a novel tool for analyzing sequence alignments. Available metrics can measure the similarity between two alignments or the column uncertainty within an alignment but cannot produce a site-specific comparison of two or more alignments. AlnDotPlot is an R software package inspired by traditional dot plots that can provide valuable insights when comparing pairwise alignments. I describe AlnDotPlot and showcase its utility in displaying a single alignment, comparing different pairwise alignments, and summarizing alignment space.
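As background on the Gotoh algorithm mentioned in this abstract, the following is a minimal sketch of the standard, unmodified algorithm for global pairwise alignment with affine gap penalties. It is not COATi's codon-aware variant: the function name `gotoh_score` and the match, mismatch, and gap scores are illustrative assumptions, and only the optimal score is returned (no traceback).

```python
NEG_INF = float("-inf")


def gotoh_score(a, b, match=1.0, mismatch=-1.0, gap_open=-2.0, gap_extend=-0.5):
    """Optimal global alignment score with affine gaps (standard Gotoh recurrences).

    gap_open is the score of the first gapped position in a run;
    gap_extend is the score of each additional gapped position.
    All scoring values here are assumed, illustrative defaults.
    """
    n, m = len(a), len(b)
    # M: a[i-1] aligned to b[j-1]; X: a[i-1] aligned to a gap; Y: b[j-1] aligned to a gap.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0
    for i in range(1, n + 1):
        X[i][0] = gap_open + (i - 1) * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + (j - 1) * gap_extend
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Open a new gap from M or Y, or extend an existing gap in X.
            X[i][j] = max(M[i - 1][j] + gap_open,
                          X[i - 1][j] + gap_extend,
                          Y[i - 1][j] + gap_open)
            # Open a new gap from M or X, or extend an existing gap in Y.
            Y[i][j] = max(M[i][j - 1] + gap_open,
                          Y[i][j - 1] + gap_extend,
                          X[i][j - 1] + gap_open)
    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    print(gotoh_score("ACGT", "ACT"))
```

Keeping three tables instead of one is what lets the recurrence charge a different price for opening a gap than for extending one while staying at quadratic time.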

Date Created

2023

Agent

Author (aut): Garcia Mesa, Juan Jose
Thesis advisor (ths): Cartwright, Reed A
Committee member: Taylor, Jesse
Committee member: Pavlic, Theodore
Committee member: Ozkan, Banu
Publisher (pbl): Arizona State University

A Unified Visual and Persistent RESTful Tool for Modular and Hierarchical Modeling

Description

Component-based models are commonly employed to simulate discrete dynamical systems. These models lend themselves to formalizing the structures of systems at multiple levels of granularity. Visual development of component-based models serves to simplify the iterative and incremental model specification activities. The Parallel Discrete Events System Specification (DEVS) formalism offers a flexible yet rigorous approach for decomposing a whole model into its components or, alternatively, composing a whole model from components. While different concepts, frameworks, and tools offer a variety of visual modeling capabilities, most pose limitations, such as visualizing multiple model hierarchies at any level with arbitrary depths. The visual and persistent layout of any number of hierarchy levels of models can be maintained and navigated seamlessly. Persistent storage is another capability needed for the modeling, simulating, verifying, and validating lifecycle. These are important features to improve the demanding task of creating and changing modular, hierarchical simulation models. This thesis proposes a new approach and develops a tool for the visual development of models. This tool supports storing and reconstructing graphical models using a NoSQL database. It offers unique capabilities important for developing increasingly larger and more complex models essential for analyzing, designing, and building Digital Twins.

Date Created

2023

Agent

Author (aut): Mohite, Sheetal Chandrakant
Thesis advisor (ths): Sarjoughian, Hessam S
Committee member: Bryan, Chris
Committee member: Pavlic, Theodore
Publisher (pbl): Arizona State University

Social Insect-Inspired Behaviors for Collective Search Operations by Unmanned Aerial Vehicle (UAV) Swarms

Description

A swarm of unmanned aerial vehicles (UAVs) has many potential applications including disaster relief, search and rescue, and area surveillance. A critical factor to a UAV swarm’s success is its ability to collectively locate and pursue targets determined to be of high quality with minimal and decentralized communication. Prior work has investigated nature-based solutions to this problem, in particular the behavior of honeybees when making decisions on future nest sites. A UAV swarm may mimic this behavior for similar ends, taking advantage of widespread sensor coverage induced by a large population. To determine whether the proven success of honeybee strategies may still be found in UAV swarms in more complex and difficult conditions, a series of simulations were created in Python using a behavior modeled after the work of Cooke et al. UAV and environmental properties were varied to determine the importance of each to the success of the swarm and to find emergent behaviors caused by combinations of variables. From the simulation work done, it was found that agent population and lifespan were the two most important factors to swarm success, with preference towards small teams with long-lasting UAVs.

Date Created

2023-05

Agent

Author (aut): Gao, Max
Thesis director: Berman, Spring
Committee member: Pavlic, Theodore
Contributor (ctb): Barrett, The Honors College
Contributor (ctb): College of Integrative Sciences and Arts
Contributor (ctb): Engineering Programs

Ant Scattering: Evaluating Aggregation and Consensus Post Physical Decentralization in Temnothorax rugatulus ant colonies

Description

Aggregation is a fundamental principle of animal behavior; it is especially significant to highly social species, like ants. Ants typically aggregate their workers and brood in a central nest, potentially due to advantages in colony defense and regulation of the environment. In many ant species, when a colony must abandon its nest, it can effectively reach consensus on a new home. Ants of the genus Temnothorax have become a model for this collective decision-making process, and for decentralized cognition more broadly. Previous studies examine emigration by well-aggregated colonies, but can these ants also reach consensus when the colony has been scattered? Such scattering may readily occur in nature if the nest is disturbed by natural or man-made disasters. In this exploratory study, Temnothorax rugatulus colonies were randomly scattered in an arena and presented with a binary equal choice of nest sites. The colonies were able to re-coalesce; however, consensus was more difficult than for aggregated colonies and involved an additional primary phase of multiple temporary aggregations eventually yielding to reunification. The maximum percent of colony utilization for these aggregates was reached within the first hour, after which point consensus tended to rise as aggregation decreased. Small, but frequent, aggregates formed within the first twenty minutes and remained and dissolved to the nest by varying processes. Each colony included a clump containing the queen, with the majority of aggregates containing at least one brood item. These findings provide additional insight into house-hunting experiments in more naturally challenging circumstances, as well as aggregation within Temnothorax colonies.

Date Created

2023

Agent

Author (aut): Goodland, Brooke
Thesis advisor (ths): Shaffer, Zachary
Thesis advisor (ths): Pratt, Stephen
Committee member: Pavlic, Theodore
Publisher (pbl): Arizona State University

‘Greenwashing’ in the Fashion Industry: An Evaluation of Frameworks for Identifying Deceptive Environmental Claims

Description

In this project, I analyze representative samples from three different fashion brands’ sustainability-related informational materials provided to the public through their websites, annual reports, and clothing tags that promote the company’s environmental initiatives. The three companies were chosen because they each represent global fashion: they are all extremely large, popular, and prevalent brands. These materials are evaluated against three frameworks for identifying deceptive greenwashing claims. I identify instances in which these frameworks are successful in categorizing deceptive claims from these companies as well as instances in which they appear to be vulnerable. To address the vulnerabilities I discover in the three existing frameworks for identifying greenwashing, I propose six new guidelines to be used in conjunction with these frameworks that will help to ensure that consumers can have a more ample toolbox to identify deceptive sustainability claims.

Date Created

2023-05

Agent

Author (aut): Ladewig, Emily
Thesis director: Pavlic, Theodore
Committee member: Roschke, Kristy
Contributor (ctb): Barrett, The Honors College
Contributor (ctb): School of International Letters and Cultures
Contributor (ctb): School of Art
Contributor (ctb): School of Sustainability

Odor Divergence of Enantiomers

Description

Enantiomers are pairs of non-superimposable mirror-image molecules. One molecule in the pair is the clockwise version (+) while the other is the counterclockwise version (-). Some pairs have divergent odor qualities, e.g. L-carvone (“spearmint”) vs. D-carvone (“caraway”), while other pairs do not. Existing theory about the origin of such differences is largely qualitative (Friedman and Miller, 1971; Bentley, 2006; Brookes et al., 2008). While quantitative models based on intrinsic molecular features predict some structure–odor relationships (Keller et al., 2017), they cannot identify, e.g. the more intense enantiomer in a pair; the mathematical operations underlying such features are invariant under symmetry (Shadmany et al., 2018). Only the olfactory receptor (OR) can break this symmetry because each molecule within an enantiomeric pair will have a different binding configuration with a receptor. However, features that predict odor divergence within a pair may be identifiable; for example, six-membered ring flexibility has been offered as a candidate (Brookes et al., 2008). To address this problem, we collected detection threshold data for >400 molecules (organized into enantiomeric pairs) from a variety of public data sources and academic literature. From each pair, we computed the within-pair divergence in odor detection threshold, as well as Mordred descriptors (molecular features derived from the structure of a molecule) and Morgan fingerprints (mathematical representations of molecule structure). While these molecular features are identical within-pair (due to symmetry), they remain distinct across pairs. The resulting structure+perception dataset was used to build a predictive model of odor detection threshold divergence. It predicted a modest fraction of variance in odor detection threshold divergence (r² ~ 0.3 in cross-validation).
We speculate that most of the remaining variance could be explained by a better understanding of the ligand-receptor binding process.

Date Created

2023-05

Agent

Author (aut): Coleman, Liyah
Thesis director: Pavlic, Theodore
Committee member: Gerkin, Richard
Contributor (ctb): Barrett, The Honors College
Contributor (ctb): Computer Science - BS

Combining learning with knowledge-rich planning allows for efficient multi-agent solutions to the problem of perpetual sparse rewards

Description

This work has improved the quality of the solution to the sparse rewards problem by combining reinforcement learning (RL) with knowledge-rich planning. Classical methods for coping with sparse rewards during reinforcement learning modify the reward landscape so as to better guide the learner. In contrast, this work combines RL with a planner in order to utilize other information about the environment. As the scope for representing environmental information is limited in RL, this work has conflated a model-free learning algorithm – temporal difference (TD) learning – with a Hierarchical Task Network (HTN) planner to accommodate rich environmental information in the algorithm. In the perpetual sparse rewards problem, rewards reemerge after being collected within a fixed interval of time, culminating in a lack of a well-defined goal state as an exit condition to the problem. Incorporating planning in the learning algorithm not only improves the quality of the solution, but the algorithm also avoids the ambiguity of incorporating a goal of maximizing profit while using only a planning algorithm to solve this problem. Upon occasionally using the HTN planner, this algorithm provides the necessary tweak toward the optimal solution. In this work, I have demonstrated an on-policy algorithm that has improved the quality of the solution over vanilla reinforcement learning. The objective of this work has been to observe the capacity of the synthesized algorithm in finding optimal policies to maximize rewards, awareness of the environment, and the awareness of the presence of other agents in the vicinity.
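As background on the temporal difference (TD) learning named in this abstract, a generic tabular TD(0) value update can be sketched as below. This is a textbook-style illustration, not the thesis's combined TD and HTN algorithm: the function name `td0_update`, the dictionary value table, and the default step size and discount factor are assumptions for the example.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular TD(0) update to `values` in place and return the TD error.

    values: dict mapping state -> estimated value (missing states count as 0.0).
    alpha (step size) and gamma (discount) are assumed, illustrative defaults.
    """
    current = values.get(state, 0.0)
    # TD error: bootstrapped target minus the current estimate.
    td_error = reward + gamma * values.get(next_state, 0.0) - current
    # Move the estimate a fraction alpha toward the target.
    values[state] = current + alpha * td_error
    return td_error


if __name__ == "__main__":
    table = {}
    print(td0_update(table, "s0", 1.0, "s1"), table)
```

With sparse rewards, most transitions have zero reward, so the TD error is driven almost entirely by the bootstrapped term; that is one way to see why extra guidance, such as a planner, can help.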

Date Created

2022

Agent

Author (aut): Nandan, Swastik
Thesis advisor (ths): Pavlic, Theodore
Thesis advisor (ths): Das, Jnaneshwar
Committee member: Berman, Spring
Publisher (pbl): Arizona State University

Design of a Graph Neural Network Coupled with an Advantage Actor-Critic Reinforcement Learning Algorithm for Multi-Agent Navigation

Description

A Graph Neural Network (GNN) is a type of neural network architecture that operates on data consisting of objects and their relationships, which are represented by a graph. Within the graph, nodes represent objects and edges represent associations between those objects. The representation of relationships and correlations between data is unique to graph structures. GNNs exploit this feature of graphs by augmenting both forms of data, individual and relational, and have been designed to allow for communication and sharing of data within each neural network layer. These benefits allow each node to have an enriched perspective, or a better understanding, of its neighbouring nodes and its connections to those nodes. The ability of GNNs to efficiently process high-dimensional node data and multi-faceted relationships among nodes gives them advantages over neural network architectures such as Convolutional Neural Networks (CNNs) that do not implicitly handle relational data. These quintessential characteristics of GNN models make them suitable for solving problems in which the correspondences among input data are needed to produce an accurate and precise representation of these data. GNN frameworks may significantly improve existing communication and control techniques for multi-agent tasks by implicitly representing not only information associated with the individual agents, such as agent position, velocity, and camera data, but also their relationships with one another, such as distances between the agents and their ability to communicate with one another. One such task is a multi-agent navigation problem in which the agents must coordinate with one another in a decentralized manner, using proximity sensors only, to navigate safely to their intended goal positions in the environment without collisions or deadlocks. 
The contribution of this thesis is the design of an end-to-end decentralized control scheme for multi-agent navigation that utilizes GNNs to prevent inter-agent collisions and deadlocks. The contributions consist of the development, simulation and evaluation of the performance of an advantage actor-critic (A2C) reinforcement learning algorithm that employs actor and critic networks for training that simultaneously approximate the policy function and value function, respectively. These networks are implemented using GNN frameworks for navigation by groups of 3, 5, 10 and 15 agents in simulated two-dimensional environments. It is observed that in 40% to 50% of the simulation trials, 70% to 80% of the agents reach their goal positions without colliding with other agents or becoming trapped in deadlocks. The model is also compared to a random-run simulation, in which actions are chosen randomly for the agents, and it is observed that the model performs notably well for smaller groups of agents.
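The idea that each node gains an enriched view of its neighbours can be illustrated with one round of message passing. The sketch below is a simplified assumption and not the thesis's network: the function name `message_passing_step` is hypothetical, the neighbour messages are combined by a plain mean, and the fixed 0.5/0.5 blend stands in for the learned weights that a real GNN layer would apply.

```python
def message_passing_step(features, edges):
    """One round of mean-aggregation message passing on an undirected graph.

    features: dict mapping node -> list of floats (e.g. an agent's position).
    edges: iterable of (u, v) pairs (e.g. agents within sensing range).
    The fixed 0.5/0.5 blend is an illustrative stand-in for learned weights.
    """
    neighbors = {node: [] for node in features}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    updated = {}
    for node, own in features.items():
        nbrs = neighbors[node]
        if nbrs:
            # Average each feature dimension over the node's neighbours.
            mean = [sum(features[nb][k] for nb in nbrs) / len(nbrs)
                    for k in range(len(own))]
        else:
            # An isolated node receives no messages and keeps its own features.
            mean = list(own)
        updated[node] = [0.5 * o + 0.5 * m for o, m in zip(own, mean)]
    return updated


if __name__ == "__main__":
    feats = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [0.0, 0.0]}
    print(message_passing_step(feats, [(0, 1)]))
```

Because each node reads only from its direct neighbours, the same update can run independently on every agent, which fits the decentralized setting the abstract describes.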

Date Created

2022

Agent

Author (aut): Ayalasomayajula, Manaswini
Thesis advisor (ths): Berman, Spring
Committee member: Mian, Sami
Committee member: Pavlic, Theodore
Publisher (pbl): Arizona State University
