Search Content

A Hacker-Centric Perspective to Empower Cyber Defense

Description

Malicious hackers utilize the World Wide Web to share knowledge. Previous work has demonstrated that information mined from online hacking communities can be used as precursors to cyber-attacks. In a threatening scenario, where security alert systems are facing high false positive rates, understanding the people behind cyber incidents can hel…

Malicious hackers utilize the World Wide Web to share knowledge. Previous work has demonstrated that information mined from online hacking communities can be used as precursors to cyber-attacks. In a threatening scenario, where security alert systems are facing high false positive rates, understanding the people behind cyber incidents can help reduce the risk of attacks. However, the rapidly evolving nature of those communities leads to limitations still largely unexplored, such as: who are the skilled and influential individuals forming those groups, how they self-organize along the lines of technical expertise, how ideas propagate within them, and which internal patterns can signal imminent cyber offensives? In this dissertation, I have studied four key parts of this complex problem set. Initially, I leverage content, social network, and seniority analysis to mine key-hackers on darkweb forums, identifying skilled and influential individuals who are likely to succeed in their cybercriminal goals. Next, as hackers often use Web platforms to advertise and recruit collaborators, I analyze how social influence contributes to user engagement online. On social media, two time constraints are proposed to extend standard influence measures, which increases their correlation with adoption probability and consequently improves hashtag adoption prediction. On darkweb forums, the prediction of where and when hackers will post a message in the near future is accomplished by analyzing their recurrent interactions with other hackers. After that, I demonstrate how vendors of malware and malicious exploits organically form hidden organizations on darkweb marketplaces, obtaining significant consistency across the vendors’ communities extracted using the similarity of their products in different networks. Finally, I predict imminent cyber-attacks correlating malicious hacking activity on darkweb forums with real-world cyber incidents, evidencing how social indicators are crucial for the performance of the proposed model. This research is a hybrid of social network analysis (SNA), machine learning (ML), evolutionary computation (EC), and temporal logic (TL), presenting expressive contributions to empower cyber defense.

ContributorsSantana Marin, Ericsson (Author) / Shakarian, Paulo (Thesis advisor) / Doupe, Adam (Committee member) / Liu, Huan (Committee member) / Ferrara, Emilio (Committee member) / Arizona State University (Publisher)

Created2020

A Study of Online Security Practices

Description

Data from a total of 282 online web applications was collected, and accounts for 230 of those web applications were created in order to gather data about authentication practices, multistep authentication practices, security question practices, fallback authentication practices, and other security practices for online accounts. The account creation and data…

Data from a total of 282 online web applications was collected, and accounts for 230 of those web applications were created in order to gather data about authentication practices, multistep authentication practices, security question practices, fallback authentication practices, and other security practices for online accounts. The account creation and data collection was done between June 2016 and April 2017. The password strengths for online accounts were analyzed and password strength data was compared to existing data. Security questions used by online accounts were evaluated for security and usability, and fallback authentication practices were assessed based on their adherence to best practices. Alternative authentication schemes were examined, and other security considerations such as use of HTTPS and CAPTCHAs were explored. Based on existing data, password policies require stronger passwords in for web applications in 2017 compared to the requirements in 2010. Nevertheless, password policies for many accounts are still not adequate. About a quarter of online web applications examined use security questions, and many of the questions have usability and security concerns. Security mechanisms such as HTTPS and continuous authentication are in general not used in conjunction with security questions for most web applications, which reduces the overall security of the web application. A majority of web applications use email addresses as the login credential and the password recovery credential and do not follow best practices. About a quarter of accounts use multistep authentication and a quarter of accounts employ continuous authentication, yet most accounts fail to combine security measures for defense in depth. The overall conclusion is that some online web applications are using secure practices; however, a majority of online web applications fail to properly implement and utilize secure practices.

ContributorsGutierrez, Garrett (Author) / Bazzi, Rida (Thesis advisor) / Ahn, Gail-Joon (Committee member) / Doupe, Adam (Committee member) / Arizona State University (Publisher)

Created2017

A Proactive Approach to Detect IoT Based Flooding Attacks by Using Software Defined Networks and Manufacturer Usage Descriptions

Description

The advent of the Internet of Things (IoT) and its increasing appearances in

Small Office/Home Office (SOHO) networks pose a unique issue to the availability

and health of the Internet at large. Many of these devices are shipped insecurely, with

poor default user and password credentials and oftentimes the general consumer does

not have…

The advent of the Internet of Things (IoT) and its increasing appearances in

Small Office/Home Office (SOHO) networks pose a unique issue to the availability

and health of the Internet at large. Many of these devices are shipped insecurely, with

poor default user and password credentials and oftentimes the general consumer does

not have the technical knowledge of how they may secure their devices and networks.

The many vulnerabilities of the IoT coupled with the immense number of existing

devices provide opportunities for malicious actors to compromise such devices and

use them in large scale distributed denial of service attacks, preventing legitimate

users from using services and degrading the health of the Internet in general.

This thesis presents an approach that leverages the benefits of an Internet Engineering

Task Force (IETF) proposed standard named Manufacturer Usage Descriptions,

that is used in conjunction with the concept of Software Defined Networks

(SDN) in order to detect malicious traffic generated from IoT devices suspected of

being utilized in coordinated flooding attacks. The approach then works towards

the ability to detect these attacks at their sources through periodic monitoring of

preemptively permitted flow rules and determining which of the flows within the permitted

set are misbehaving by using an acceptable traffic range using Exponentially

Weighted Moving Averages (EWMA).

ContributorsChang, Laurence Hao (Author) / Yau, Stephen (Thesis advisor) / Doupe, Adam (Committee member) / Huang, Dijiang (Committee member) / Arizona State University (Publisher)

Created2018

Representation Learning for Trustworthy AI

Description

Artificial Intelligence (AI) systems have achieved outstanding performance and have been found to be better than humans at various tasks, such as sentiment analysis, and face recognition. However, the majority of these state-of-the-art AI systems use complex Deep Learning (DL) methods which present challenges for human experts to design and…

Artificial Intelligence (AI) systems have achieved outstanding performance and have been found to be better than humans at various tasks, such as sentiment analysis, and face recognition. However, the majority of these state-of-the-art AI systems use complex Deep Learning (DL) methods which present challenges for human experts to design and evaluate such models with respect to privacy, fairness, and robustness. Recent examination of DL models reveals that representations may include information that could lead to privacy violations, unfairness, and robustness issues. This results in AI systems that are potentially untrustworthy from a socio-technical standpoint. Trustworthiness in AI is defined by a set of model properties such as non-discriminatory bias, protection of users’ sensitive attributes, and lawful decision-making. The characteristics of trustworthy AI can be grouped into three categories: Reliability, Resiliency, and Responsibility. Past research has shown that the successful integration of an AI model depends on its trustworthiness. Thus it is crucial for organizations and researchers to build trustworthy AI systems to facilitate the seamless integration and adoption of intelligent technologies. The main issue with existing AI systems is that they are primarily trained to improve technical measures such as accuracy on a specific task but are not considerate of socio-technical measures. The aim of this dissertation is to propose methods for improving the trustworthiness of AI systems through representation learning. DL models’ representations contain information about a given input and can be used for tasks such as detecting fake news on social media or predicting the sentiment of a review. The findings of this dissertation significantly expand the scope of trustworthy AI research and establish a new paradigm for modifying data representations to balance between properties of trustworthy AI. Specifically, this research investigates multiple techniques such as reinforcement learning for understanding trustworthiness in users’ privacy, fairness, and robustness in classification tasks like cyberbullying detection and fake news detection. Since most social measures in trustworthy AI cannot be used to fine-tune or train an AI model directly, the main contribution of this dissertation lies in using reinforcement learning to alter an AI system’s behavior based on non-differentiable social measures.

ContributorsMosallanezhad, Ahmadreza (Author) / Liu, Huan (Thesis advisor) / Mancenido, Michelle (Thesis advisor) / Doupe, Adam (Committee member) / Maciejewski, Ross (Committee member) / Arizona State University (Publisher)

Created2023

Trapped in Transparency: Analyzing the Effectiveness of Security Defenses in Real-World Scenarios

Description

Honeypots – cyber deception technique used to lure attackers into a trap. They contain fake confidential information to make an attacker believe that their attack has been successful. One of the prerequisites for a honeypot to be effective is that it needs to be undetectable. Deploying sniffing and event logging…

Honeypots – cyber deception technique used to lure attackers into a trap. They contain fake confidential information to make an attacker believe that their attack has been successful. One of the prerequisites for a honeypot to be effective is that it needs to be undetectable. Deploying sniffing and event logging tools alongside the honeypot also helps understand the mindset of the attacker after successful attacks. Is there any data that backs up the claim that honeypots are effective in real life scenarios? The answer is no.Game-theoretic models have been helpful to approximate attacker and defender actions in cyber security. However, in the past these models have relied on expert- created data. The goal of this research project is to determine the effectiveness of honeypots using real-world data. So, how to deploy effective honeypots? This is where honey-patches come into play. Honey-patches are software patches designed to hinder the attacker’s ability to determine whether an attack has been successful or not. When an attacker launches a successful attack on a software, the honey-patch transparently redirects the attacker into a honeypot. The honeypot contains fake information which makes the attacker believe they were successful while in reality they were not. After conducting a series of experiments and analyzing the results, there is a clear indication that honey-patches are not the perfect application security solution having both pros and cons.

ContributorsChauhan, Purv Rakeshkumar (Author) / Doupe, Adam (Thesis advisor) / Bao, Youzhi (Committee member) / Wang, Ruoyu (Committee member) / Arizona State University (Publisher)

Created2022

Enhancing Binary Analysis through Cognitive Load Theory

Description

Reverse engineering is a process focused on gaining an understanding for the intricaciesof a system. This practice is critical in cybersecurity as it promotes the findings and patching of vulnerabilities as well as the counteracting of malware. Disassemblers and decompilers have become essential when reverse engineering due to the readability of information they…

Reverse engineering is a process focused on gaining an understanding for the intricaciesof a system. This practice is critical in cybersecurity as it promotes the findings and patching of vulnerabilities as well as the counteracting of malware. Disassemblers and decompilers have become essential when reverse engineering due to the readability of information they transcribe from binary files. However, these tools still tend to produce involved and complicated outputs that hinder the acquisition of knowledge during binary analysis. Cognitive Load Theory (CLT) explains that this hindrance is due to the human brain’s inability to process superfluous amounts of data. CLT classifies this data into three cognitive load types — intrinsic, extraneous, and germane — that each can help gauge complex procedures. In this research paper, a novel program call graph is presented accounting for these CLT principles. The goal of this graphical view is to reduce the cognitive load tied to the depiction of binary information and to enhance the overall binary analysis process. This feature was implemented within the binary analysis tool, angr and it’s user interface counterpart, angr-management. Additionally, this paper will examine a conducted user study to quantitatively and qualitatively evaluate the effectiveness of the newly proposed proximity view (PV). The user study includes a binary challenge solving portion measured by defined metrics and a survey phase to receive direct participant feedback regarding the view. The results from this study show statistically significant evidence that PV aids in challenge solving and improves the overall understanding binaries. The results also signify that this improvement comes with the cost of time. The survey section of the user study further indicates that users find PV beneficial to the reverse engineering process, but additional information needs to be included in future developments.

ContributorsSmits, Sean (Author) / Wang, Ruoyu (Thesis advisor) / Shoshitaishvili, Yan (Thesis advisor) / Doupe, Adam (Committee member) / Arizona State University (Publisher)

Created2022

Visualizing Information Flow Graph-Based Approach to Tracing Data Dependencies for Binary Analysis

Description

Binary analysis and software debugging are critical tools in the modern softwaresecurity ecosystem. With the security arms race between attackers discovering and exploiting vulnerabilities and the development teams patching bugs ever-tightening, there is an immense need for more tooling to streamline the binary analysis and debugging processes. Whether attempting to find the root…

Binary analysis and software debugging are critical tools in the modern softwaresecurity ecosystem. With the security arms race between attackers discovering and exploiting vulnerabilities and the development teams patching bugs ever-tightening, there is an immense need for more tooling to streamline the binary analysis and debugging processes. Whether attempting to find the root cause for a buffer overflow or a segmentation fault, the analysis process often involves manually tracing the movement of data throughout a program’s life cycle. Up until this point, there has not been a viable solution to the human limitation of maintaining a cohesive mental image of the intricacies of a program’s data flow. This thesis proposes a novel data dependency graph (DDG) analysis as an addi- tion to angr’s analyses suite. This new analysis ingests a symbolic execution trace in order to generate a directed acyclic graph of the program’s data dependencies. In addition to the development of the backend logic needed to generate this graph, an angr management view to visualize the DDG was implemented. This user interface provides functionality for ancestor and descendant dependency tracing and sub-graph creation. To evaluate the analysis, a user study was conducted to measure the view’s efficacy in regards to binary analysis and software debugging. The study consisted of a control group and experimental group attempting to solve a series of 3 chal- lenges and subsequently providing feedback concerning perceived functionality and comprehensibility pertaining to the view. The results show that the view had a positive trend in relation to challenge-solving accuracy in its target domain, as participants solved 32% more challenges 21% faster when using the analysis than when using vanilla angr management.

ContributorsCapuano, Bailey Kellen (Author) / Shoshitaishvili, Yan (Thesis advisor) / Wang, Ruoyu (Thesis advisor) / Doupe, Adam (Committee member) / Arizona State University (Publisher)

Created2022

Extracting Semantic Information from Online Conversations to Enhance Cyber Defense

Description

Recent advances in techniques allow the extraction of Cyber Threat Information (CTI) from online content, such as social media, blog articles, and posts in discussion forums. Most research work focuses on social media and blog posts since their content is often contributed by cybersecurity experts and is usually of cleaner…

Recent advances in techniques allow the extraction of Cyber Threat Information (CTI) from online content, such as social media, blog articles, and posts in discussion forums. Most research work focuses on social media and blog posts since their content is often contributed by cybersecurity experts and is usually of cleaner formats. While posts in online forums are noisier and less structured, online forums attract more users than other sources and contain much valuable information that may help predict cyber threats. Therefore, effectively extracting CTI from online forum posts is an important task in today's data-driven cybersecurity defenses. Many Natural Language Processing (NLP) techniques are applied to the cybersecurity domains to extract the useful information, however, there is still space to improve. In this dissertation, a new Named Entity Recognition framework for cybersecurity domains and thread structure construction methods for unstructured forums are proposed to support the extraction of CTI. Then, extend them to filter the posts in the forums to eliminate non cybersecurity related topics with Cyber Attack Relevance Scale (CARS), extract the cybersecurity knowledgeable users to enhance more information for enhancing cybersecurity, and extract trending topic phrases related to cyber attacks in the hackers forums to find the clues for potential future attacks to predict them.

ContributorsKashihara, Kazuaki (Author) / Baral, Chitta (Thesis advisor) / Doupe, Adam (Committee member) / Blanco, Eduardo (Committee member) / Wang, Ruoyu (Committee member) / Arizona State University (Publisher)

Created2022

Automated reflection of CTF hostile exploits (ARCHES): inductive programming techniques for network traffic comprehension and reflection

Description

As the gap widens between the number of security threats and the number of security professionals, the need for automated security tools becomes increasingly important. These automated systems assist security professionals by identifying and/or fixing potential vulnerabilities before they can be exploited. One such category of tools is exploit generators,…

As the gap widens between the number of security threats and the number of security professionals, the need for automated security tools becomes increasingly important. These automated systems assist security professionals by identifying and/or fixing potential vulnerabilities before they can be exploited. One such category of tools is exploit generators, which craft exploits to demonstrate a vulnerability and provide guidance on how to repair it. Existing exploit generators largely use the application code, either through static or dynamic analysis, to locate crashes and craft a payload.

This thesis proposes the Automated Reflection of CTF Hostile Exploits (ARCHES), an exploit generator that learns by example. ARCHES uses an inductive programming library named IRE to generate exploits from exploit examples. In doing so, ARCHES can create an exploit only from example exploit payloads without interacting with the service. By representing each component of the exploit interaction as a collection of theories for how that component occurs, ARCHES can identify critical state information and replicate an executable exploit. This methodology learns rapidly and works with only a few examples. The ARCHES exploit generator is targeted towards Capture the Flag (CTF) events as a suitable environment for initial research.

The effectiveness of this methodology was evaluated on four exploits with features that demonstrate the capabilities and limitations of this methodology. ARCHES is capable of reproducing exploits that require an understanding of state dependent input, such as a flag id. Additionally, ARCHES can handle basic utilization of state information that is revealed through service output. However, limitations in this methodology result in failure to replicate exploits that require a loop, intricate mathematics, or multiple TCP connections.

Inductive programming has potential as a security tool to augment existing automated security tools. Future research into these techniques will provide more capabilities for security professionals in academia and in industry.

ContributorsCrosley, Zackary (Author) / Doupe, Adam (Thesis advisor) / Shoshitaishvili, Yan (Committee member) / Wang, Ruoyu (Committee member) / Arizona State University (Publisher)

Created2019

Attribute-Based Encryption for Fine-Grained Access Control over Sensitive Data

Description

The traditional access control system suffers from the problem of separation of data ownership and management. It poses data security issues in application scenarios such as cloud computing and blockchain where the data owners either do not trust the data storage provider or even do not know who would have…

The traditional access control system suffers from the problem of separation of data ownership and management. It poses data security issues in application scenarios such as cloud computing and blockchain where the data owners either do not trust the data storage provider or even do not know who would have access to their data once they are appended to the chain. In these scenarios, the data owner actually loses control of the data once they are uploaded to the outside storage. Encryption-before-uploading is the way to solve this issue, however traditional encryption schemes such as AES, RSA, ECC, bring about great overheads in key management on the data owner end and could not provide fine-grained access control as well.

Attribute-Based Encryption (ABE) is a cryptographic way to implement attribute-based access control, which is a fine-grained access control model, thus solving all aforementioned issues. With ABE, the data owner would encrypt the data by a self-defined access control policy before uploading the data. The access control policy is an AND-OR boolean formula over attributes. Only users with attributes that satisfy the access control policy could decrypt the ciphertext. However the existing ABE schemes do not provide some important features in practical applications, e.g., user revocation and attribute expiration. Furthermore, most existing work focus on how to use ABE to protect cloud stored data, while not the blockchain applications.

The main objective of this thesis is to provide solutions to add two important features of the ABE schemes, i.e., user revocation and attribute expiration, and also provide a practical trust framework for using ABE to protect blockchain data. To add the feature of user revocation, I propose to add user's hierarchical identity into the private attribute key. In this way, only users whose identity is not revoked and attributes satisfy the access control policy could decrypt the ciphertext. To add the feature of attribute expiration, I propose to add the attribute valid time period into the private attribute key. The data would be encrypted by access control policy where all attributes have a temporal value. In this way, only users whose attributes both satisfy the access policy and at the same time these attributes do not expire,

are allowed to decrypt the ciphertext. To use ABE in the blockchain applications, I propose an ABE-enabled trust framework in a very popular blockchain platform, Hyperledger Fabric. Based on the design, I implement a light-weight attribute certificate authority for attribute distribution and validation; I implement the proposed ABE schemes and provide a toolkit which supports system setup, key generation,

data encryption and data decryption. All these modules were integrated into a demo system for protecting sensitive les in a blockchain application.

ContributorsDong, Qiuxiang (Author) / Huang, Dijiang (Thesis advisor) / Sen, Arunabha (Committee member) / Doupe, Adam (Committee member) / Arizona State University (Publisher)

Created2020

ASU Electronic Theses and Dissertations

Filtering by

A Hacker-Centric Perspective to Empower Cyber Defense

A Study of Online Security Practices

A Proactive Approach to Detect IoT Based Flooding Attacks by Using Software Defined Networks and Manufacturer Usage Descriptions

Representation Learning for Trustworthy AI

Trapped in Transparency: Analyzing the Effectiveness of Security Defenses in Real-World Scenarios

Enhancing Binary Analysis through Cognitive Load Theory

Visualizing Information Flow Graph-Based Approach to Tracing Data Dependencies for Binary Analysis

Extracting Semantic Information from Online Conversations to Enhance Cyber Defense

Automated reflection of CTF hostile exploits (ARCHES): inductive programming techniques for network traffic comprehension and reflection

Attribute-Based Encryption for Fine-Grained Access Control over Sensitive Data