Matching Items (255)
Description
The future will be replete with Artificial Intelligence (AI) based agents closely collaborating with humans. Although it is challenging to construct such systems for real-world conditions, the Intelligent Tutoring System (ITS) community has proposed several techniques to work closely with students. However, there is a need to extend these systems outside the controlled environment of the classroom. More recently, the Human-Aware Planning (HAP) community has developed generalized AI techniques for collaborating with humans and providing personalized support or guidance to the collaborators. In this thesis, the learnings from the ITS community are extended to construct such human-aware systems for real-world domains and evaluate them with real stakeholders. First, the applicability of HAP to ITS is demonstrated by modeling the behavior in a classroom and a state-of-the-art tutoring system called Dragoon. Then these techniques are extended to provide decision support to a human teammate, and the effectiveness of the framework is evaluated through ablation studies to support students in constructing their plan of study (\ipos). The results show that these techniques are helpful and can support users in their tasks. In the third section of the thesis, an ITS scenario of asking questions (or problems) in active environments is modeled by constructing questions to elicit a human teammate's model of understanding. The framework is evaluated through a user study, whose results show that the queries can be used to elicit the human teammate's mental model.
Contributors: Grover, Sachin (Author) / Kambhampati, Subbarao (Thesis advisor) / Smith, David (Committee member) / Srivastava, Sidhharth (Committee member) / VanLehn, Kurt (Committee member) / Arizona State University (Publisher)
Created: 2022
Description
The impact of Artificial Intelligence (AI) has increased significantly in daily life. AI is taking big strides towards moving into areas of life that are critical, such as healthcare, but also into areas such as entertainment and leisure. Deep neural networks have been pivotal in making all these advancements possible. However, a well-known problem with deep neural networks is the lack of explanations for the choices they make. To combat this, several methods have been tried in the research community. One example is ranking individual features by how influential they are in the decision-making process. In contrast, a newer class of methods uses Concept Activation Vectors (CAVs), which extract higher-level concepts from the trained model to capture more information as a mixture of several features and not just one. The goal of this thesis is to employ concepts in a novel domain: to explain how a deep learning model uses computer vision to classify music into different genres. Due to the advances in computer vision with deep learning for classification tasks, it is now standard practice to convert an audio clip into corresponding spectrograms and use those spectrograms as image inputs to the deep learning model. Thus, a pre-trained model can classify the spectrogram images (representing songs) into musical genres. The proposed explanation system, called “Why Pop?”, tries to answer certain questions about the classification process, such as what parts of the spectrogram influence the model the most, what concepts were extracted, and how they differ across classes. These explanations help the user gain insights into the model’s learnings, biases, and decision-making process.
Contributors: Sharma, Shubham (Author) / Bryan, Chris (Thesis advisor) / McDaniel, Troy (Committee member) / Sarwat, Mohamed (Committee member) / Arizona State University (Publisher)
Created: 2022
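The description above mentions converting an audio clip into a spectrogram that is then treated as an image. As a rough illustration of that conversion only (not the thesis's pipeline), the following sketch computes a magnitude spectrogram with a short-time Fourier transform in plain Python. The function name `stft_magnitude`, the sample rate, the frame size, and the synthetic 1000 Hz tone are all assumptions made for this example; a practical system would use an FFT library and typically a mel or log-frequency scale.

```python
import cmath
import math


def stft_magnitude(signal, frame_size, hop):
    """Return one list of magnitudes (bins 0..frame_size//2) per frame."""
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_size)]
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            # Direct DFT of the windowed frame at frequency bin k.
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Hypothetical input: a pure 1000 Hz tone sampled at 8000 Hz.
SAMPLE_RATE = 8000
FRAME_SIZE = 64
signal = [math.sin(2 * math.pi * 1000.0 * t / SAMPLE_RATE) for t in range(512)]

spectrogram = stft_magnitude(signal, FRAME_SIZE, hop=32)

# The strongest bin of the first frame should sit at the tone's frequency.
first_frame = spectrogram[0]
peak_bin = max(range(len(first_frame)), key=lambda k: first_frame[k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE
```

Each inner list is one column of the spectrogram image (time runs across frames, frequency across bins), which is the form an image classifier would consume after scaling to pixel intensities.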
Description
Automated driving systems (ADS) have come a long way since their inception. It is clear that these systems rely heavily on stochastic deep learning techniques for perception, planning, and prediction, as it is impossible to construct every possible driving scenario to generate driving policies. Moreover, these systems need to be trained and validated extensively on typical and abnormal driving situations before they can be trusted with human life. However, most publicly available driving datasets consist only of typical driving behaviors. On the other hand, there is a plethora of videos available on the internet that capture abnormal driving scenarios, but they are unusable for ADS training or testing because they lack important information such as camera calibration parameters and annotated vehicle trajectories. This thesis proposes a new toolbox, DeepCrashTest-V2, that is capable of reconstructing high-quality simulations from monocular dashcam videos found on the internet. The toolbox not only estimates crucial parameters such as camera calibration, ego-motion, and surrounding road user trajectories but also creates a virtual world in Car Learning to Act (CARLA) using data from OpenStreetMap to simulate the estimated trajectories. The toolbox is open-source and is available as a Python package on GitHub at https://github.com/C-Aniruddh/deepcrashtest_v2.
Contributors: Chandratre, Aniruddh Vinay (Author) / Fainekos, Georgios (Thesis advisor) / Ben Amor, Hani (Thesis advisor) / Pedrielli, Giulia (Committee member) / Arizona State University (Publisher)
Created: 2022
Description
The increasing availability of data and advances in computation have spurred the development of data-driven approaches for modeling complex dynamical systems. These approaches are based on the idea that the underlying structure of a complex system can be discovered from data using mathematical and computational techniques. They also show promise for addressing the challenges of modeling high-dimensional, nonlinear systems with limited data. In this expository work, the state of the art in data-driven approaches for modeling complex dynamical systems is surveyed systematically. First, the general formulation of data-driven modeling of dynamical systems is discussed. Then several representative methods in feature engineering and system identification/prediction are reviewed, including recent advances and key challenges.
Contributors: Shi, Wenlong (Author) / Ren, Yi (Thesis advisor) / Hong, Qijun (Committee member) / Jiao, Yang (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created: 2022
Description
Recent advances in techniques allow the extraction of Cyber Threat Information (CTI) from online content, such as social media, blog articles, and posts in discussion forums. Most research work focuses on social media and blog posts since their content is often contributed by cybersecurity experts and is usually in a cleaner format. While posts in online forums are noisier and less structured, online forums attract more users than other sources and contain much valuable information that may help predict cyber threats. Therefore, effectively extracting CTI from online forum posts is an important task in today's data-driven cybersecurity defenses. Many Natural Language Processing (NLP) techniques have been applied to cybersecurity domains to extract useful information; however, there is still room for improvement. In this dissertation, a new Named Entity Recognition framework for cybersecurity domains and thread structure construction methods for unstructured forums are proposed to support the extraction of CTI. These are then extended to filter forum posts and eliminate topics unrelated to cybersecurity using the Cyber Attack Relevance Scale (CARS), to identify users who are knowledgeable about cybersecurity as additional sources of information, and to extract trending topic phrases related to cyber attacks in hacker forums as clues for predicting potential future attacks.
Contributors: Kashihara, Kazuaki (Author) / Baral, Chitta (Thesis advisor) / Doupe, Adam (Committee member) / Blanco, Eduardo (Committee member) / Wang, Ruoyu (Committee member) / Arizona State University (Publisher)
Created: 2022
Description
Machine learning models, and neural networks in particular, are well known for being inscrutable in nature. From image classification tasks and generative techniques for data augmentation to general-purpose natural language models, neural networks are currently the algorithm of preference riding the top of the current artificial intelligence (AI) wave, having experienced a greater boost in popularity than any other machine learning solution. However, due to their inscrutable design based on the optimization of millions of parameters, it is very difficult to understand how their decisions are influenced or why (and when) they fail. While some works aim at explaining neural network decisions or making systems inherently interpretable, the great majority of state-of-the-art machine learning works prioritize performance over interpretability, effectively becoming black boxes. Hence, there is still uncertainty in the decision boundaries of these already deployed solutions, whose predictions should still be analyzed and taken with care. This becomes even more important when these models are used in sensitive scenarios such as medicine, criminal justice, settings with inherent social biases, or settings where egregious mispredictions can negatively impact the system or human trust down the line. Thus, the aim of this work is to provide a comprehensive analysis of the failure modes of state-of-the-art neural networks from three domains: large image classifiers and their misclassifications, generative adversarial networks when used for data augmentation, and transformer networks applied to structured representations and reasoning about actions and change.
Contributors: Olmo Hernandez, Alberto (Author) / Kambhampati, Subbarao (Thesis advisor) / Liu, Huan (Committee member) / Li, Baoxin (Committee member) / Sengupta, Sailik (Committee member) / Arizona State University (Publisher)
Created: 2022
Description
Adversarial threats to deep learning are increasingly becoming a concern due to the ubiquitous deployment of deep neural networks (DNNs) in many security-sensitive domains. Among the existing threats, adversarial weight perturbation is an emerging class of threats that attempts to perturb the weight parameters of DNNs to breach security and privacy. In this thesis, the first weight perturbation attack introduced is called the Bit-Flip Attack (BFA), which can maliciously flip a small number of bits within a computer’s main memory system storing the DNN weight parameters to achieve malicious objectives. Our developed algorithm can achieve three specific attack objectives: i) untargeted accuracy degradation attack, ii) targeted attack, and iii) Trojan attack. Moreover, BFA utilizes the rowhammer technique to demonstrate the bit-flip attack in an actual computer prototype. While the bit-flip attack is conducted in a white-box setting, the subsequent contribution of this thesis is to develop another novel weight perturbation attack in a black-box setting. Consequently, this thesis discusses a new study of DNN model vulnerabilities in a multi-tenant Field Programmable Gate Array (FPGA) cloud under a strict black-box framework. This newly developed attack framework injects faults in the malicious tenant by duplicating specific DNN weight packages during data transmission between off-chip memory and on-chip buffer of a victim FPGA. The proposed attack is also experimentally validated in a multi-tenant cloud FPGA prototype. In the final part, the focus shifts toward deep learning model privacy, specifically the attack popularly known as model extraction, which can steal partial DNN weight parameters remotely with the aid of a memory side-channel attack. In addition, a novel training algorithm is designed to utilize the partially leaked DNN weight bit information, making the model extraction attack more effective. The algorithm effectively leverages the partially leaked bit information and generates a substitute prototype of the victim model with almost identical performance to the victim.
Contributors: Rakin, Adnan Siraj (Author) / Fan, Deliang (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Seo, Jae-Sun (Committee member) / Cao, Yu (Committee member) / Arizona State University (Publisher)
Created: 2022
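To give a sense of why flipping only a few stored bits can matter, the sketch below shows the arithmetic effect of one bit flip on a single quantized weight. It assumes, purely for illustration, that weights are stored as signed 8-bit integers in two's complement; the description above does not state the storage format. The function name `flip_bit_int8` and the sample weights are hypothetical, and the sketch does not model rowhammer, the bit-selection algorithm, or any part of the thesis's method.

```python
def flip_bit_int8(weight, bit):
    """Flip one bit of a signed 8-bit weight stored in two's complement."""
    if not -128 <= weight <= 127:
        raise ValueError("weight must fit in a signed 8-bit integer")
    if not 0 <= bit <= 7:
        raise ValueError("bit index must be between 0 and 7")
    raw = weight & 0xFF      # reinterpret the value as an unsigned byte
    raw ^= 1 << bit          # flip the chosen bit
    # Convert the byte back to a signed value.
    return raw - 256 if raw >= 128 else raw


# Hypothetical quantized weights (assumed int8 storage).
weights = [3, -20, 57, 110]

# Flipping the most significant bit changes sign and magnitude drastically,
# while flipping the least significant bit changes the value by only 1.
flipped_msb = [flip_bit_int8(w, 7) for w in weights]
flipped_lsb = [flip_bit_int8(w, 0) for w in weights]
```

Under this assumed encoding, a most-significant-bit flip always moves a weight by 128, which is why a small number of well-chosen flips can plausibly shift a layer's output far more than random low-order noise would.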
Description
Multi-arm robotic collaboration means controlling multiple robotic arms so that they work together on the same task. During the collaboration, the agent is required to avoid all possible collisions between the parts of the robotic arms. Thus, incentivizing collaboration and preventing collisions are the two principles followed by the agent during the training process. Nowadays, more and more applications, both in industry and in daily life, require at least two arms instead of a single arm. A dual-arm robot meets the needs of many more types of tasks, such as folding clothes at home, making a hamburger on a grill, or picking and placing a product in a warehouse. The applications in this thesis all involve object pushing. This thesis focuses on how to train the agent to learn to push an object as far away as possible. Reinforcement Learning (RL), which is a type of Machine Learning (ML), is utilized to train the agent to generate optimal actions. Deep Deterministic Policy Gradient (DDPG) and Hindsight Experience Replay (HER) are the two RL methods used in this thesis.
Contributors: Lin, Steve (Author) / Ben Amor, Hani (Thesis advisor) / Redkar, Sangram (Committee member) / Zhang, Yu (Committee member) / Arizona State University (Publisher)
Created: 2023
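The description above names Hindsight Experience Replay (HER) as one of the methods used. The core idea of HER is to relabel a stored episode with a goal that was actually reached, so that a sparse reward becomes informative. The sketch below shows that relabeling step with the "final" strategy (the goal is replaced by the last achieved state) on a made-up one-dimensional pushing episode. The transition layout, the tolerance, and the names `sparse_reward` and `relabel_with_final_goal` are assumptions for illustration, not the thesis's implementation.

```python
def sparse_reward(achieved, goal, tol=0.05):
    """Return 0.0 when the achieved position is within tol of the goal, else -1.0."""
    return 0.0 if abs(achieved - goal) <= tol else -1.0


def relabel_with_final_goal(episode):
    """Copy an episode, substituting the final achieved position as the goal."""
    final_achieved = episode[-1]["achieved"]
    relabeled = []
    for transition in episode:
        new_transition = dict(transition)   # leave the original untouched
        new_transition["goal"] = final_achieved
        new_transition["reward"] = sparse_reward(transition["achieved"],
                                                 final_achieved)
        relabeled.append(new_transition)
    return relabeled


# Hypothetical episode: the object was pushed to 0.4 but the goal was 1.0,
# so every original reward is -1.0 and carries no learning signal.
original_goal = 1.0
achieved_positions = [0.1, 0.3, 0.4]
episode = [
    {"action": 0.2, "achieved": pos, "goal": original_goal,
     "reward": sparse_reward(pos, original_goal)}
    for pos in achieved_positions
]

hindsight_episode = relabel_with_final_goal(episode)
original_rewards = [t["reward"] for t in episode]
hindsight_rewards = [t["reward"] for t in hindsight_episode]
```

In an off-policy learner such as DDPG, both the original and the relabeled transitions would typically be added to the replay buffer, so the agent also learns from goals it reached by accident.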
Description
Scientific research encompasses a variety of objectives, including measurement, making predictions, identifying laws, and more. The advent of advanced measurement technologies and computational methods has largely automated the processes of big data collection and prediction. However, the discovery of laws, particularly universal ones, still heavily relies on human intellect. Even with human intelligence, complex systems present a unique challenge in discerning the laws that govern them. Even the preliminary step, system description, poses a substantial challenge. Numerous metrics have been developed, but universally applicable laws remain elusive. Due to the cognitive limitations of human comprehension, a direct understanding of big data derived from complex systems is impractical. Therefore, simplification becomes essential for identifying hidden regularities, enabling scientists to abstract observations or draw connections with existing knowledge. As a result, the concept of macrostates -- simplified, lower-dimensional representations of high-dimensional systems -- proves to be indispensable. Macrostates serve a role beyond simplification. They are integral in deciphering reusable laws for complex systems. In physics, macrostates form the foundation for constructing laws and provide building blocks for studying relationships between quantities, rather than pursuing case-by-case analysis. Therefore, the concept of macrostates facilitates the discovery of regularities across various systems. Recognizing the importance of macrostates, I propose the relational macrostate theory and a machine learning framework, MacroNet, to identify macrostates and design microstates. The relational macrostate theory defines a macrostate based on the relationships between observations, enabling the abstraction from microscopic details. 
In MacroNet, I propose an architecture to encode microstates into macrostates, allowing for the sampling of microstates associated with a specific macrostate. My experiments on simulated systems demonstrate the effectiveness of this theory and method in identifying macrostates such as energy. Furthermore, I apply this theory and method to a complex chemical system, analyzing oil droplets with intricate movement patterns in a Petri dish, to answer the question, "which combinations of parameters control which behavior?" The macrostate theory allows me to identify a two-dimensional macrostate, establish a mapping between the chemical compound and the macrostate, and decipher the relationship between oil droplet patterns and the macrostate.
Contributors: Zhang, Yanbo (Author) / Walker, Sara I (Thesis advisor) / Anbar, Ariel (Committee member) / Daniels, Bryan (Committee member) / Das, Jnaneshwar (Committee member) / Davies, Paul (Committee member) / Arizona State University (Publisher)
Created: 2023
Description
Generative models are deep neural network-based models trained to learn the underlying distribution of a dataset. Once trained, these models can be used to sample novel data points from this distribution. Their impressive capabilities have been manifested in various generative tasks, encompassing areas like image-to-image translation, style transfer, image editing, and more. One notable application of generative models is data augmentation, aimed at expanding and diversifying the training dataset to augment the performance of deep learning models for a downstream task. Generative models can be used to create new samples similar to the original data but with different variations and properties that are difficult to capture with traditional data augmentation techniques. However, the quality, diversity, and controllability of the shape and structure of the generated samples from these models are often directly proportional to the size and diversity of the training dataset. A more extensive and diverse training dataset allows the generative model to capture overall structures present in the data and generate more diverse and realistic-looking samples. In this dissertation, I present innovative methods designed to enhance the robustness and controllability of generative models, drawing upon physics-based, probabilistic, and geometric techniques. These methods help improve the generalization and controllability of the generative model without necessarily relying on large training datasets. I enhance the robustness of generative models by integrating classical geometric moments for shape awareness and minimizing trainable parameters. Additionally, I employ non-parametric priors for the generative model's latent space through basic probability and optimization methods to improve the fidelity of interpolated images. 
I adopt a hybrid approach to address domain-specific challenges with limited data and controllability, combining physics-based rendering with generative models for more realistic results. These approaches are particularly relevant in industrial settings, where the training datasets are small and class imbalance is common. Through extensive experiments on various datasets, I demonstrate the effectiveness of the proposed methods over conventional approaches.
Contributors: Singh, Rajhans (Author) / Turaga, Pavan (Thesis advisor) / Jayasuriya, Suren (Committee member) / Berisha, Visar (Committee member) / Fazli, Pooyan (Committee member) / Arizona State University (Publisher)
Created: 2023
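The description above refers to classical geometric moments as a source of shape awareness. For readers unfamiliar with them, the sketch below computes raw moments, the centroid, and central moments of a tiny grayscale image using the standard definitions, where the raw moment of order (p, q) is the sum of x^p * y^q * I(x, y) over all pixels. The image and the function names are made up for illustration; how the dissertation integrates moments into a generative model is not shown here.

```python
def raw_moment(image, p, q):
    """Raw geometric moment: sum of x**p * y**q * intensity over all pixels."""
    return sum((x ** p) * (y ** q) * value
               for y, row in enumerate(image)
               for x, value in enumerate(row))


def centroid(image):
    """Intensity-weighted center of mass (x, y) of the image."""
    total = raw_moment(image, 0, 0)
    return raw_moment(image, 1, 0) / total, raw_moment(image, 0, 1) / total


def central_moment(image, p, q):
    """Moment taken about the centroid, which makes it translation invariant."""
    cx, cy = centroid(image)
    return sum(((x - cx) ** p) * ((y - cy) ** q) * value
               for y, row in enumerate(image)
               for x, value in enumerate(row))


# Hypothetical 4x4 image containing a 2x2 bright square.
image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]

area = raw_moment(image, 0, 0)           # total intensity
center = centroid(image)                 # where the shape sits
spread_x = central_moment(image, 2, 0)   # horizontal spread
spread_y = central_moment(image, 0, 2)   # vertical spread
skew_xy = central_moment(image, 1, 1)    # zero for an axis-aligned square
```

Low-order moments like these summarize position, size, and orientation of a shape in a handful of numbers, which is one reason they are attractive as compact shape descriptors.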