Search Content

Moving Target Defense Using Live Migration of Docker Containers

Description

Today the information technology systems have addresses, software stacks and other configuration remaining unchanged for a long period of time. This paves way for malicious attacks in the system from unknown vulnerabilities. The attacker can take advantage of this situation and plan their attacks with sufficient time. To protect our…

Today the information technology systems have addresses, software stacks and other configuration remaining unchanged for a long period of time. This paves way for malicious attacks in the system from unknown vulnerabilities. The attacker can take advantage of this situation and plan their attacks with sufficient time. To protect our system from this threat, Moving Target Defense is required where the attack surface is dynamically changed, making it difficult to strike.

In this thesis, I incorporate live migration of Docker container using CRIU (checkpoint restore) for moving target defense. There are 460K Dockerized applications, a 3100% growth over 2 years[1]. Over 4 billion containers have been pulled so far from Docker hub. Docker is supported by a large and fast growing community of contributors and users. As an example, there are 125K Docker Meetup members worldwide. As we see industry adapting to Docker rapidly, a moving target defense solution involving containers is beneficial for being robust and fast. A proof of concept implementation is included for studying performance attributes of Docker migration.

The detection of attack is using a scenario involving definitions of normal events on servers. By defining system activities, and extracting syslog in centralized server, attack can be detected via extracting abnormal activates and this detection can be a trigger for the Docker migration.

ContributorsBohara, Bhakti (Author) / Huang, Dijiang (Thesis advisor) / Doupe, Adam (Committee member) / Zhao, Ziming (Committee member) / Arizona State University (Publisher)

Created2017

iGen: Toward Automatic Generation and Analysis of Indicators of Compromise (IOCs) using Convolutional Neural Network

Description

Field of cyber threats is evolving rapidly and every day multitude of new information about malware and Advanced Persistent Threats (APTs) is generated in the form of malware reports, blog articles, forum posts, etc. However, current Threat Intelligence (TI) systems have several limitations. First, most of the TI systems examine…

Field of cyber threats is evolving rapidly and every day multitude of new information about malware and Advanced Persistent Threats (APTs) is generated in the form of malware reports, blog articles, forum posts, etc. However, current Threat Intelligence (TI) systems have several limitations. First, most of the TI systems examine and interpret data manually with the help of analysts. Second, some of them generate Indicators of Compromise (IOCs) directly using regular expressions without understanding the contextual meaning of those IOCs from the data sources which allows the tools to include lot of false positives. Third, lot of TI systems consider either one or two data sources for the generation of IOCs, and misses some of the most valuable IOCs from other data sources.

To overcome these limitations, we propose iGen, a novel approach to fully automate the process of IOC generation and analysis. Proposed approach is based on the idea that our model can understand English texts like human beings, and extract the IOCs from the different data sources intelligently. Identification of the IOCs is done on the basis of the syntax and semantics of the sentence as well as context words (e.g., ``attacked'', ``suspicious'') present in the sentence which helps the approach work on any kind of data source. Our proposed technique, first removes the words with no contextual meaning like stop words and punctuations etc. Then using the rest of the words in the sentence and output label (IOC or non-IOC sentence), our model intelligently learn to classify sentences into IOC and non-IOC sentences. Once IOC sentences are identified using this learned Convolutional Neural Network (CNN) based approach, next step is to identify the IOC tokens (like domains, IP, URL) in the sentences. This CNN based classification model helps in removing false positives (like IPs which are not malicious). Afterwards, IOCs extracted from different data sources are correlated to find the links between thousands of apparently unrelated attack instances, particularly infrastructures shared between them. Our approach fully automates the process of IOC generation from gathering data from different sources to creating rules (e.g. OpenIOC, snort rules, STIX rules) for deployment on

the security infrastructure.

iGen has collected around 400K IOCs till now with a precision of 95\%, better than any state-of-art method.

ContributorsPanwar, Anupam (Author) / Ahn, Gail-Joon (Thesis advisor) / Doupe, Adam (Committee member) / Zhao, Ziming (Committee member) / Arizona State University (Publisher)

Created2017

Defeating Attackers by Bridging the Gaps Between Security and Intelligence

Description

The omnipresent data, growing number of network devices, and evolving attack techniques have been challenging organizations’ security defenses over the past decade. With humongous volumes of logs generated by those network devices, looking for patterns of malicious activities and identifying them in time is growing beyond the capabilities of their…

The omnipresent data, growing number of network devices, and evolving attack techniques have been challenging organizations’ security defenses over the past decade. With humongous volumes of logs generated by those network devices, looking for patterns of malicious activities and identifying them in time is growing beyond the capabilities of their defense systems. Deep Learning, a subset of Machine Learning (ML) and Artificial Intelligence (AI), fills in this gapwith its ability to learn from huge amounts of data, and improve its performance as the data it learns from increases. In this dissertation, I bring forward security issues pertaining to two top threats that most organizations fear, Advanced Persistent Threat (APT), and Distributed Denial of Service (DDoS), along with deep learning models built towards addressing those security issues. First, I present a deep learning model, APT Detection, capable of detecting anomalous activities in a system. Evaluation of this model demonstrates how it can contribute to early detection of an APT attack with an Area Under the Curve (AUC) of up to 91% on a Receiver Operating Characteristic (ROC) curve. Second, I present DAPT2020, a first of its kind dataset capturing an APT attack exploiting web and system vulnerabilities in an emulated organization’s production network. Evaluation of the dataset using well known machine learning models demonstrates the need for better deep learning models to detect APT attacks. I then present DAPT2021, a semi-synthetic dataset capturing an APT attackexploiting human vulnerabilities, alongside 2 less skilled attacks. By emulating the normal behavior of the employees in a set target organization, DAPT2021 has been created to enable researchers study the causations and correlations among the captured data, a much-needed information to detect an underlying threat early. Finally, I present a distributed defense framework, SmartDefense, that can detect and mitigate over 90% of DDoS traffic at the source and over 97.5% of the remaining DDoS traffic at the Internet Service Provider’s (ISP’s) edge network. Evaluation of this work shows how by using attributes sent by customer edge network, SmartDefense can further help ISPs prevent up to 51.95% of the DDoS traffic from going to the destination.

ContributorsMyneni, Sowmya (Author) / Xue, Guoliang (Thesis advisor) / Doupe, Adam (Committee member) / Li, Baoxin (Committee member) / Baral, Chitta (Committee member) / Arizona State University (Publisher)

Created2022

Unearthing Hidden Bugs: Harnessing Fuzzing With Dynamic Patching in FlakJack

Description

This thesis presents a study on the fuzzing of Linux binaries to find occluded bugs. Fuzzing is a widely-used technique for identifying software bugs. Despite their effectiveness, state-of-the-art fuzzers suffer from limitations in efficiency and effectiveness. Fuzzers based on random mutations are fast but struggle to generate high-quality inputs. In…

This thesis presents a study on the fuzzing of Linux binaries to find occluded bugs. Fuzzing is a widely-used technique for identifying software bugs. Despite their effectiveness, state-of-the-art fuzzers suffer from limitations in efficiency and effectiveness. Fuzzers based on random mutations are fast but struggle to generate high-quality inputs. In contrast, fuzzers based on symbolic execution produce quality inputs but lack execution speed. This paper proposes FlakJack, a novel hybrid fuzzer that patches the binary on the go to detect occluded bugs guarded by surface bugs. To dynamically overcome the challenge of patching binaries, the paper introduces multiple patching strategies based on the type of bug detected. The performance of FlakJack was evaluated on ten widely-used real-world binaries and one chaff dataset binary. The results indicate that many bugs found recently were already present in previous versions but were occluded by surface bugs. FlakJack’s approach improved the bug-finding ability by patching surface bugs that usually guard occluded bugs, significantly reducing patching cycles. Despite its unbalanced approach compared to other coverage-guided fuzzers, FlakJack is fast, lightweight, and robust. False- Positives can be filtered out quickly, and the approach is practical in other parts of the target. The paper shows that the FlakJack approach can significantly improve fuzzing performance without relying on complex strategies.

ContributorsPraveen Menon, Gokulkrishna (Author) / Bao, Tiffany (Thesis advisor) / Shoshitaishvili, Yan (Thesis advisor) / Doupe, Adam (Committee member) / Arizona State University (Publisher)

Created2023

A Model for Calculating Damage Potential in Computer Systems

Description

For systems having computers as a significant component, it becomes a critical task to identify the potential threats that the users of the system can present, while being both inside and outside the system. One of the most important factors that differentiate an insider from an outsider is the fact…

For systems having computers as a significant component, it becomes a critical task to identify the potential threats that the users of the system can present, while being both inside and outside the system. One of the most important factors that differentiate an insider from an outsider is the fact that the insider being a part of the system, owns privileges that enable him/her access to the resources and processes of the system through valid capabilities. An insider with malicious intent can potentially be more damaging compared to outsiders. The above differences help to understand the notion and scope of an insider.

The significant loss to organizations due to the failure to detect and mitigate the insider threat has resulted in an increased interest in insider threat detection. The well-studied effective techniques proposed for defending against attacks by outsiders have not been proven successful against insider attacks. Although a number of security policies and models to deal with the insider threat have been developed, the approach taken by most organizations is the use of audit logs after the attack has taken place. Such approaches are inspired by academic research proposals to address the problem by tracking activities of the insider in the system. Although tracking and logging are important, it is argued that they are not sufficient. Thus, the necessity to predict the potential damage of an insider is considered to help build a stronger evaluation and mitigation strategy for the insider attack. In this thesis, the question that seeks to be answered is the following: `Considering the relationships that exist between the insiders and their role, their access to the resources and the resource set, what is the potential damage that an insider can cause?'

A general system model is introduced that can capture general insider attacks including those documented by Computer Emergency Response Team (CERT) for the Software Engineering Institute (SEI). Further, initial formulations of the damage potential for leakage and availability in the model is introduced. The model usefulness is shown by expressing 14 of actual attacks in the model and show how for each case the attack could have been mitigated.

ContributorsNolastname, Sharad (Author) / Bazzi, Rida (Thesis advisor) / Sen, Arunabha (Committee member) / Doupe, Adam (Committee member) / Arizona State University (Publisher)

Created2019

A Hacker-Centric Perspective to Empower Cyber Defense

Description

Malicious hackers utilize the World Wide Web to share knowledge. Previous work has demonstrated that information mined from online hacking communities can be used as precursors to cyber-attacks. In a threatening scenario, where security alert systems are facing high false positive rates, understanding the people behind cyber incidents can hel…

Malicious hackers utilize the World Wide Web to share knowledge. Previous work has demonstrated that information mined from online hacking communities can be used as precursors to cyber-attacks. In a threatening scenario, where security alert systems are facing high false positive rates, understanding the people behind cyber incidents can help reduce the risk of attacks. However, the rapidly evolving nature of those communities leads to limitations still largely unexplored, such as: who are the skilled and influential individuals forming those groups, how they self-organize along the lines of technical expertise, how ideas propagate within them, and which internal patterns can signal imminent cyber offensives? In this dissertation, I have studied four key parts of this complex problem set. Initially, I leverage content, social network, and seniority analysis to mine key-hackers on darkweb forums, identifying skilled and influential individuals who are likely to succeed in their cybercriminal goals. Next, as hackers often use Web platforms to advertise and recruit collaborators, I analyze how social influence contributes to user engagement online. On social media, two time constraints are proposed to extend standard influence measures, which increases their correlation with adoption probability and consequently improves hashtag adoption prediction. On darkweb forums, the prediction of where and when hackers will post a message in the near future is accomplished by analyzing their recurrent interactions with other hackers. After that, I demonstrate how vendors of malware and malicious exploits organically form hidden organizations on darkweb marketplaces, obtaining significant consistency across the vendors’ communities extracted using the similarity of their products in different networks. Finally, I predict imminent cyber-attacks correlating malicious hacking activity on darkweb forums with real-world cyber incidents, evidencing how social indicators are crucial for the performance of the proposed model. This research is a hybrid of social network analysis (SNA), machine learning (ML), evolutionary computation (EC), and temporal logic (TL), presenting expressive contributions to empower cyber defense.

ContributorsSantana Marin, Ericsson (Author) / Shakarian, Paulo (Thesis advisor) / Doupe, Adam (Committee member) / Liu, Huan (Committee member) / Ferrara, Emilio (Committee member) / Arizona State University (Publisher)

Created2020

A Study of Online Security Practices

Description

Data from a total of 282 online web applications was collected, and accounts for 230 of those web applications were created in order to gather data about authentication practices, multistep authentication practices, security question practices, fallback authentication practices, and other security practices for online accounts. The account creation and data…

Data from a total of 282 online web applications was collected, and accounts for 230 of those web applications were created in order to gather data about authentication practices, multistep authentication practices, security question practices, fallback authentication practices, and other security practices for online accounts. The account creation and data collection was done between June 2016 and April 2017. The password strengths for online accounts were analyzed and password strength data was compared to existing data. Security questions used by online accounts were evaluated for security and usability, and fallback authentication practices were assessed based on their adherence to best practices. Alternative authentication schemes were examined, and other security considerations such as use of HTTPS and CAPTCHAs were explored. Based on existing data, password policies require stronger passwords in for web applications in 2017 compared to the requirements in 2010. Nevertheless, password policies for many accounts are still not adequate. About a quarter of online web applications examined use security questions, and many of the questions have usability and security concerns. Security mechanisms such as HTTPS and continuous authentication are in general not used in conjunction with security questions for most web applications, which reduces the overall security of the web application. A majority of web applications use email addresses as the login credential and the password recovery credential and do not follow best practices. About a quarter of accounts use multistep authentication and a quarter of accounts employ continuous authentication, yet most accounts fail to combine security measures for defense in depth. The overall conclusion is that some online web applications are using secure practices; however, a majority of online web applications fail to properly implement and utilize secure practices.

ContributorsGutierrez, Garrett (Author) / Bazzi, Rida (Thesis advisor) / Ahn, Gail-Joon (Committee member) / Doupe, Adam (Committee member) / Arizona State University (Publisher)

Created2017

A Proactive Approach to Detect IoT Based Flooding Attacks by Using Software Defined Networks and Manufacturer Usage Descriptions

Description

The advent of the Internet of Things (IoT) and its increasing appearances in

Small Office/Home Office (SOHO) networks pose a unique issue to the availability

and health of the Internet at large. Many of these devices are shipped insecurely, with

poor default user and password credentials and oftentimes the general consumer does

not have…

The advent of the Internet of Things (IoT) and its increasing appearances in

Small Office/Home Office (SOHO) networks pose a unique issue to the availability

and health of the Internet at large. Many of these devices are shipped insecurely, with

poor default user and password credentials and oftentimes the general consumer does

not have the technical knowledge of how they may secure their devices and networks.

The many vulnerabilities of the IoT coupled with the immense number of existing

devices provide opportunities for malicious actors to compromise such devices and

use them in large scale distributed denial of service attacks, preventing legitimate

users from using services and degrading the health of the Internet in general.

This thesis presents an approach that leverages the benefits of an Internet Engineering

Task Force (IETF) proposed standard named Manufacturer Usage Descriptions,

that is used in conjunction with the concept of Software Defined Networks

(SDN) in order to detect malicious traffic generated from IoT devices suspected of

being utilized in coordinated flooding attacks. The approach then works towards

the ability to detect these attacks at their sources through periodic monitoring of

preemptively permitted flow rules and determining which of the flows within the permitted

set are misbehaving by using an acceptable traffic range using Exponentially

Weighted Moving Averages (EWMA).

ContributorsChang, Laurence Hao (Author) / Yau, Stephen (Thesis advisor) / Doupe, Adam (Committee member) / Huang, Dijiang (Committee member) / Arizona State University (Publisher)

Created2018

Representation Learning for Trustworthy AI

Description

Artificial Intelligence (AI) systems have achieved outstanding performance and have been found to be better than humans at various tasks, such as sentiment analysis, and face recognition. However, the majority of these state-of-the-art AI systems use complex Deep Learning (DL) methods which present challenges for human experts to design and…

Artificial Intelligence (AI) systems have achieved outstanding performance and have been found to be better than humans at various tasks, such as sentiment analysis, and face recognition. However, the majority of these state-of-the-art AI systems use complex Deep Learning (DL) methods which present challenges for human experts to design and evaluate such models with respect to privacy, fairness, and robustness. Recent examination of DL models reveals that representations may include information that could lead to privacy violations, unfairness, and robustness issues. This results in AI systems that are potentially untrustworthy from a socio-technical standpoint. Trustworthiness in AI is defined by a set of model properties such as non-discriminatory bias, protection of users’ sensitive attributes, and lawful decision-making. The characteristics of trustworthy AI can be grouped into three categories: Reliability, Resiliency, and Responsibility. Past research has shown that the successful integration of an AI model depends on its trustworthiness. Thus it is crucial for organizations and researchers to build trustworthy AI systems to facilitate the seamless integration and adoption of intelligent technologies. The main issue with existing AI systems is that they are primarily trained to improve technical measures such as accuracy on a specific task but are not considerate of socio-technical measures. The aim of this dissertation is to propose methods for improving the trustworthiness of AI systems through representation learning. DL models’ representations contain information about a given input and can be used for tasks such as detecting fake news on social media or predicting the sentiment of a review. The findings of this dissertation significantly expand the scope of trustworthy AI research and establish a new paradigm for modifying data representations to balance between properties of trustworthy AI. Specifically, this research investigates multiple techniques such as reinforcement learning for understanding trustworthiness in users’ privacy, fairness, and robustness in classification tasks like cyberbullying detection and fake news detection. Since most social measures in trustworthy AI cannot be used to fine-tune or train an AI model directly, the main contribution of this dissertation lies in using reinforcement learning to alter an AI system’s behavior based on non-differentiable social measures.

ContributorsMosallanezhad, Ahmadreza (Author) / Liu, Huan (Thesis advisor) / Mancenido, Michelle (Thesis advisor) / Doupe, Adam (Committee member) / Maciejewski, Ross (Committee member) / Arizona State University (Publisher)

Created2023

Enhancing Binary Analysis through Cognitive Load Theory

Description

Reverse engineering is a process focused on gaining an understanding for the intricaciesof a system. This practice is critical in cybersecurity as it promotes the findings and patching of vulnerabilities as well as the counteracting of malware. Disassemblers and decompilers have become essential when reverse engineering due to the readability of information they…

Reverse engineering is a process focused on gaining an understanding for the intricaciesof a system. This practice is critical in cybersecurity as it promotes the findings and patching of vulnerabilities as well as the counteracting of malware. Disassemblers and decompilers have become essential when reverse engineering due to the readability of information they transcribe from binary files. However, these tools still tend to produce involved and complicated outputs that hinder the acquisition of knowledge during binary analysis. Cognitive Load Theory (CLT) explains that this hindrance is due to the human brain’s inability to process superfluous amounts of data. CLT classifies this data into three cognitive load types — intrinsic, extraneous, and germane — that each can help gauge complex procedures. In this research paper, a novel program call graph is presented accounting for these CLT principles. The goal of this graphical view is to reduce the cognitive load tied to the depiction of binary information and to enhance the overall binary analysis process. This feature was implemented within the binary analysis tool, angr and it’s user interface counterpart, angr-management. Additionally, this paper will examine a conducted user study to quantitatively and qualitatively evaluate the effectiveness of the newly proposed proximity view (PV). The user study includes a binary challenge solving portion measured by defined metrics and a survey phase to receive direct participant feedback regarding the view. The results from this study show statistically significant evidence that PV aids in challenge solving and improves the overall understanding binaries. The results also signify that this improvement comes with the cost of time. The survey section of the user study further indicates that users find PV beneficial to the reverse engineering process, but additional information needs to be included in future developments.

ContributorsSmits, Sean (Author) / Wang, Ruoyu (Thesis advisor) / Shoshitaishvili, Yan (Thesis advisor) / Doupe, Adam (Committee member) / Arizona State University (Publisher)

Created2022

Theses and Dissertations