This collection includes both ASU Theses and Dissertations, submitted by graduate students, and the Barrett, Honors College theses submitted by undergraduate students. 

Displaying 1 - 8 of 8
Filtering by

Clear all filters

156290-Thumbnail Image.png
Description
Data breaches have been on a rise and financial sector is among the top targeted. It can take a few months and upto a few years to identify the occurrence of a data breach. A major motivation behind data breaches is financial gain, hence most of the data ends u

Data breaches have been on a rise and financial sector is among the top targeted. It can take a few months and upto a few years to identify the occurrence of a data breach. A major motivation behind data breaches is financial gain, hence most of the data ends up being on sale on the darkweb websites. It is important to identify sale of such stolen information on a timely and relevant manner. In this research, we present a system for timely identification of sale of stolen data on darkweb websites. We frame identifying sale of stolen data as a multi-label classification problem and leverage several machine learning approaches based on the thread content (textual) and social network analysis of the user communication seen on darkweb websites. The system generates alerts about trends based on popularity amongst the users of such websites. We evaluate our system using the K-fold cross validation as well as manual evaluation of blind (unseen) data. The method of combining social network and textual features outperforms baseline method i.e only using textual features, by 15 to 20 % improved precision. The alerts provide a good insight and we illustrate our findings by cases studies of the results.
ContributorsDharaiya, Krishna Tushar (Author) / Shakarian, Paulo (Thesis advisor) / Doupe, Adam (Committee member) / Shoshitaishvili, Yan (Committee member) / Arizona State University (Publisher)
Created2018
156681-Thumbnail Image.png
Description
With the rise of the Internet of Things, embedded systems have become an integral part of life and can be found almost anywhere. Their prevalence and increased interconnectivity has made them a prime target for malicious attacks. Today, the vast majority of embedded devices are powered by ARM processors. To

With the rise of the Internet of Things, embedded systems have become an integral part of life and can be found almost anywhere. Their prevalence and increased interconnectivity has made them a prime target for malicious attacks. Today, the vast majority of embedded devices are powered by ARM processors. To protect their processors from attacks, ARM introduced a hardware security extension known as TrustZone. It provides an isolated execution environment within the embedded device in which to deploy various memory integrity and malware detection tools.

Even though Secure World can monitor the Normal World, attackers can attempt to bypass the security measures to retain control of a compromised system. CacheKit is a new type of rootkit that exploits such a vulnerability in the ARM architecture to hide in Normal World cache from memory introspection tools running in Secure World by exploiting cache locking mechanisms. If left unchecked, ARM processors that provide hardware assisted cache locking for performance and time-critical applications in real-time and embedded systems would be completely vulnerable to this undetectable and untraceable attack. Therefore, a new approach is needed to ensure the correct use of such mechanisms and prevent malicious code from being hidden in the cache.

CacheLight is a lightweight approach that leverages the TrustZone and Virtualization extensions of the ARM architecture to allow the system to continue to securely provide these hardware facilities to users while preventing attackers from exploiting them. CacheLight restricts the ability to lock the cache to the Secure World of the processor such that the Normal World can still request certain memory to be locked into the cache by the secure operating system (OS) through a Secure Monitor Call (SMC). This grants the secure OS the power to verify and validate the information that will be locked in the requested cache way thereby ensuring that any data that remains in the cache will not be inconsistent with what exists in main memory for inspection. Malicious attempts to hide data can be prevented and recovered for analysis while legitimate requests can still generate valid entries in the cache.
ContributorsGutierrez, Mauricio (Author) / Zhao, Ziming (Thesis advisor) / Doupe, Adam (Committee member) / Shoshitaishvili, Yan (Committee member) / Arizona State University (Publisher)
Created2018
133260-Thumbnail Image.png
Description
Smart cars are defined by the European Union Agency for Network and Information Security (ENISA) as systems providing connected, added-value features in order to enhance car users' experience or improve car safety. Because of their extra features, smart cars utilize sophisticated computer systems. These systems, particularly the Controller Area Network

Smart cars are defined by the European Union Agency for Network and Information Security (ENISA) as systems providing connected, added-value features in order to enhance car users' experience or improve car safety. Because of their extra features, smart cars utilize sophisticated computer systems. These systems, particularly the Controller Area Network (CAN) bus and protocol, have been shown to provide information that can be used to accurately identify individual Electronic Control Units (ECUs) within a car and the driver that is operating a car. I expand upon this work to consider how information from in-vehicle computer systems can be used to identify individual vehicles. I consider fingerprinting vehicles as a means of aiding in stolen car recovery, thwarting VIN forgery, and supporting an intrusion detection system for networks of smart and autonomous vehicles in the near future. I provide an overview of in-vehicle computer systems and detail my work toward building an ECU testbed and fingerprinting vehicles.
ContributorsDavison, Paulina (Author) / Zhao, Ziming (Thesis director) / Ahn, Gail-Joon (Committee member) / Shoshitaishvili, Yan (Committee member) / Doupe, Adam (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created2018-05
189330-Thumbnail Image.png
Description
This thesis presents a study on the fuzzing of Linux binaries to find occluded bugs. Fuzzing is a widely-used technique for identifying software bugs. Despite their effectiveness, state-of-the-art fuzzers suffer from limitations in efficiency and effectiveness. Fuzzers based on random mutations are fast but struggle to generate high-quality inputs. In

This thesis presents a study on the fuzzing of Linux binaries to find occluded bugs. Fuzzing is a widely-used technique for identifying software bugs. Despite their effectiveness, state-of-the-art fuzzers suffer from limitations in efficiency and effectiveness. Fuzzers based on random mutations are fast but struggle to generate high-quality inputs. In contrast, fuzzers based on symbolic execution produce quality inputs but lack execution speed. This paper proposes FlakJack, a novel hybrid fuzzer that patches the binary on the go to detect occluded bugs guarded by surface bugs. To dynamically overcome the challenge of patching binaries, the paper introduces multiple patching strategies based on the type of bug detected. The performance of FlakJack was evaluated on ten widely-used real-world binaries and one chaff dataset binary. The results indicate that many bugs found recently were already present in previous versions but were occluded by surface bugs. FlakJack’s approach improved the bug-finding ability by patching surface bugs that usually guard occluded bugs, significantly reducing patching cycles. Despite its unbalanced approach compared to other coverage-guided fuzzers, FlakJack is fast, lightweight, and robust. False- Positives can be filtered out quickly, and the approach is practical in other parts of the target. The paper shows that the FlakJack approach can significantly improve fuzzing performance without relying on complex strategies.
ContributorsPraveen Menon, Gokulkrishna (Author) / Bao, Tiffany (Thesis advisor) / Shoshitaishvili, Yan (Thesis advisor) / Doupe, Adam (Committee member) / Arizona State University (Publisher)
Created2023
171701-Thumbnail Image.png
Description
Reverse engineering is a process focused on gaining an understanding for the intricaciesof a system. This practice is critical in cybersecurity as it promotes the findings and patching of vulnerabilities as well as the counteracting of malware. Disassemblers and decompilers have become essential when reverse engineering due to the readability of information they

Reverse engineering is a process focused on gaining an understanding for the intricaciesof a system. This practice is critical in cybersecurity as it promotes the findings and patching of vulnerabilities as well as the counteracting of malware. Disassemblers and decompilers have become essential when reverse engineering due to the readability of information they transcribe from binary files. However, these tools still tend to produce involved and complicated outputs that hinder the acquisition of knowledge during binary analysis. Cognitive Load Theory (CLT) explains that this hindrance is due to the human brain’s inability to process superfluous amounts of data. CLT classifies this data into three cognitive load types — intrinsic, extraneous, and germane — that each can help gauge complex procedures. In this research paper, a novel program call graph is presented accounting for these CLT principles. The goal of this graphical view is to reduce the cognitive load tied to the depiction of binary information and to enhance the overall binary analysis process. This feature was implemented within the binary analysis tool, angr and it’s user interface counterpart, angr-management. Additionally, this paper will examine a conducted user study to quantitatively and qualitatively evaluate the effectiveness of the newly proposed proximity view (PV). The user study includes a binary challenge solving portion measured by defined metrics and a survey phase to receive direct participant feedback regarding the view. The results from this study show statistically significant evidence that PV aids in challenge solving and improves the overall understanding binaries. The results also signify that this improvement comes with the cost of time. The survey section of the user study further indicates that users find PV beneficial to the reverse engineering process, but additional information needs to be included in future developments.
ContributorsSmits, Sean (Author) / Wang, Ruoyu (Thesis advisor) / Shoshitaishvili, Yan (Thesis advisor) / Doupe, Adam (Committee member) / Arizona State University (Publisher)
Created2022
171711-Thumbnail Image.png
Description
Binary analysis and software debugging are critical tools in the modern softwaresecurity ecosystem. With the security arms race between attackers discovering and exploiting vulnerabilities and the development teams patching bugs ever-tightening, there is an immense need for more tooling to streamline the binary analysis and debugging processes. Whether attempting to find the root

Binary analysis and software debugging are critical tools in the modern softwaresecurity ecosystem. With the security arms race between attackers discovering and exploiting vulnerabilities and the development teams patching bugs ever-tightening, there is an immense need for more tooling to streamline the binary analysis and debugging processes. Whether attempting to find the root cause for a buffer overflow or a segmentation fault, the analysis process often involves manually tracing the movement of data throughout a program’s life cycle. Up until this point, there has not been a viable solution to the human limitation of maintaining a cohesive mental image of the intricacies of a program’s data flow. This thesis proposes a novel data dependency graph (DDG) analysis as an addi- tion to angr’s analyses suite. This new analysis ingests a symbolic execution trace in order to generate a directed acyclic graph of the program’s data dependencies. In addition to the development of the backend logic needed to generate this graph, an angr management view to visualize the DDG was implemented. This user interface provides functionality for ancestor and descendant dependency tracing and sub-graph creation. To evaluate the analysis, a user study was conducted to measure the view’s efficacy in regards to binary analysis and software debugging. The study consisted of a control group and experimental group attempting to solve a series of 3 chal- lenges and subsequently providing feedback concerning perceived functionality and comprehensibility pertaining to the view. The results show that the view had a positive trend in relation to challenge-solving accuracy in its target domain, as participants solved 32% more challenges 21% faster when using the analysis than when using vanilla angr management.
ContributorsCapuano, Bailey Kellen (Author) / Shoshitaishvili, Yan (Thesis advisor) / Wang, Ruoyu (Thesis advisor) / Doupe, Adam (Committee member) / Arizona State University (Publisher)
Created2022
158545-Thumbnail Image.png
Description
Due to the increase in computer and database dependency, the damage caused by malicious codes increases. Moreover, gravity and the magnitude of malicious attacks by hackers grow at an unprecedented rate. A key challenge lies on detecting such malicious attacks and codes in real-time by the use of existing methods,

Due to the increase in computer and database dependency, the damage caused by malicious codes increases. Moreover, gravity and the magnitude of malicious attacks by hackers grow at an unprecedented rate. A key challenge lies on detecting such malicious attacks and codes in real-time by the use of existing methods, such as a signature-based detection approach. To this end, computer scientists have attempted to classify heterogeneous types of malware on the basis of their observable characteristics. Existing literature focuses on classifying binary codes, due to the greater accessibility of malware binary than source code. Also, for the improved speed and scalability, machine learning-based approaches are widely used. Despite such merits, the machine learning-based approach critically lacks the interpretability of its outcome, thus restricts understandings of why a given code belongs to a particular type of malicious malware and, importantly, why some portions of a code are reused very often by hackers. In this light, this study aims to enhance understanding of malware by directly investigating reused codes and uncovering their characteristics.

To examine reused codes in malware, both malware with source code and malware with binary code are considered in this thesis. For malware with source code, reused code chunks in the Mirai botnet. This study lists frequently reused code chunks and analyzes the characteristics and location of the code. For malware with binary code, this study performs reverse engineering on the binary code for human readers to comprehend, visually inspects reused codes in binary ransomware code, and illustrates the functionality of the reused codes on the basis of similar behaviors and tactics.

This study makes a novel contribution to the literature by directly investigating the characteristics of reused code in malware. The findings of the study can help cybersecurity practitioners and scholars increase the performance of malware classification.
ContributorsLEe, Yeonjung (Author) / Bao, Youzhi (Thesis advisor) / Doupe, Adam (Committee member) / Shoshitaishvili, Yan (Committee member) / Arizona State University (Publisher)
Created2020
190944-Thumbnail Image.png
Description
The rise in popularity of applications and services that charge for access to proprietary trained models has led to increased interest in the robustness of these models and the security of the environments in which inference is conducted. State-of-the-art attacks extract models and generate adversarial examples by inferring relationships between

The rise in popularity of applications and services that charge for access to proprietary trained models has led to increased interest in the robustness of these models and the security of the environments in which inference is conducted. State-of-the-art attacks extract models and generate adversarial examples by inferring relationships between a model’s input and output. Popular variants of these attacks have been shown to be deterred by countermeasures that poison predicted class distributions and mask class boundary gradients. Neural networks are also vulnerable to timing side-channel attacks. This work builds on top of Subneural, an attack framework that uses floating point timing side channels to extract neural structures. Novel applications of addition timing side channels are introduced, allowing the signs and arrangements of leaked parameters to be discerned more efficiently. Addition timing is also used to leak network biases, making the framework applicable to a wider range of targets. The enhanced framework is shown to be effective against models protected by prediction poisoning and gradient masking adversarial countermeasures and to be competitive with adaptive black box adversarial attacks against stateful defenses. Mitigations necessary to protect against floating-point timing side-channel attacks are also presented.
ContributorsVipat, Gaurav (Author) / Shoshitaishvili, Yan (Thesis advisor) / Doupe, Adam (Committee member) / Srivastava, Siddharth (Committee member) / Arizona State University (Publisher)
Created2023