Darkweb Cyber Threat Intelligence Mining through the I2P Protocol

Description

This thesis project focused on malicious hacking community activities accessible through the I2P protocol. We visited 315 distinct I2P sites to identify those with malicious hacking content. We also wrote software to scrape and parse data from relevant I2P sites. The data was integrated into the CySIS databases for further analysis to contribute to the larger CySIS Lab Darkweb Cyber Threat Intelligence Mining research. We found that the I2P cryptonet was slow and had only a small amount of malicious hacking community activity. However, we also found evidence of a growing perception that Tor anonymity could be compromised. This work will contribute to understanding the malicious hacker community as some Tor users, seeking assured anonymity, transition to I2P.

Contributors: Hutchins, James Keith (Author) / Shakarian, Paulo (Thesis director) / Ahn, Gail-Joon (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2016-12

Data Driven Game Theoretic Cyber Threat Mitigation

Description

Penetration testing is regarded as the gold standard for understanding how well an organization can withstand sophisticated cyber-attacks. However, the recent prevalence of markets specializing in zero-day exploits on the darknet makes exploits widely available to potential attackers. The cost associated with these sophisticated kits generally precludes penetration testers from simply obtaining such exploits, so an alternative approach is needed to understand what exploits an attacker will most likely purchase and how to defend against them. In this paper, we introduce a data-driven security game framework to model an attacker and provide policy recommendations to the defender. In addition to providing a formal framework and algorithms to develop strategies, we present experimental results from applying our framework, for various system configurations, on real-world exploit market data actively mined from the darknet.

Contributors: Robertson, John James (Author) / Shakarian, Paulo (Thesis director) / Doupe, Adam (Committee member) / Electrical Engineering Program (Contributor) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2016-05

An Algorithm for Merging Identities

Description

In online social networks the identities of users are concealed, often by design. This anonymity makes it possible for a single person to have multiple accounts and to engage in malicious activity such as defrauding service providers, leveraging social influence, or hiding activities that would otherwise be detected. There are various methods for detecting whether two online users in a network are the same person in reality, and the simplest way to use this information is to merge their identities and treat the two users as a single user. However, this raises the issue of how to deal with these composite identities. To solve this problem, we introduce a mathematical abstraction for representing users and their identities as partitions on a set. We then define a similarity function, SIM, between two partitions, a set of properties that SIM must have, and a threshold that SIM must exceed for two users to be considered the same person. The main theoretical result of our work is a proof that for any given partition and similarity threshold, there is a unique way to merge the identities of similar users such that no two identities are similar. We also present two algorithms, COLLAPSE and SIM_MERGE, that merge the identities of users to find this unique set of identities. We prove that both algorithms execute in polynomial time, and we perform an experiment on dark web social network data from over 6000 users that demonstrates the runtime of SIM_MERGE.
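
The merging idea described above can be illustrated with a minimal sketch. This is not the thesis's COLLAPSE or SIM_MERGE algorithm: the function names `jaccard` and `merge_similar` are hypothetical, identities are modeled simply as sets of attributes, and Jaccard similarity is only a stand-in for SIM (whether it satisfies the properties the thesis requires of SIM is not claimed here). The sketch repeatedly merges any pair of identities whose similarity exceeds the threshold until no such pair remains.

```python
from itertools import combinations


def jaccard(a: frozenset, b: frozenset) -> float:
    """Stand-in similarity (an assumption, not the thesis's SIM)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_similar(identities, threshold, sim=jaccard):
    """Merge identities until no pair has similarity above the threshold."""
    merged = [frozenset(i) for i in identities]
    changed = True
    while changed:
        changed = False
        for x, y in combinations(merged, 2):
            if sim(x, y) > threshold:
                # Replace the similar pair with their union, then rescan.
                merged.remove(x)
                merged.remove(y)
                merged.append(x | y)
                changed = True
                break
    return set(merged)


# Hypothetical example data: three accounts described by attribute sets.
accounts = [{"a", "b", "c"}, {"b", "c", "d"}, {"x", "y"}]
result = merge_similar(accounts, threshold=0.4)
```

Each merge shortens the list by one, so the loop always terminates. With an arbitrary similarity function the final result may depend on merge order; the uniqueness result in the abstract applies to SIM functions with the properties defined in the thesis.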

Contributors: Polican, Andrew Dominic (Author) / Shakarian, Paulo (Thesis director) / Sen, Arunabha (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2018-05

Extracting Semantic Information from Online Conversations to Enhance Cyber Defense

Description

Recent advances in techniques allow the extraction of Cyber Threat Information (CTI) from online content, such as social media, blog articles, and posts in discussion forums. Most research focuses on social media and blog posts since their content is often contributed by cybersecurity experts and is usually in cleaner formats. While posts in online forums are noisier and less structured, online forums attract more users than other sources and contain much valuable information that may help predict cyber threats. Therefore, effectively extracting CTI from online forum posts is an important task in today's data-driven cybersecurity defenses. Many Natural Language Processing (NLP) techniques have been applied to cybersecurity domains to extract useful information; however, there is still room for improvement. In this dissertation, a new Named Entity Recognition framework for cybersecurity domains and thread structure construction methods for unstructured forums are proposed to support the extraction of CTI. These are then extended to filter forum posts and eliminate non-cybersecurity topics using the Cyber Attack Relevance Scale (CARS), to identify cybersecurity-knowledgeable users as an additional source of information for enhancing cybersecurity, and to extract trending topic phrases related to cyber attacks in hacker forums as clues for predicting potential future attacks.

Contributors: Kashihara, Kazuaki (Author) / Baral, Chitta (Thesis advisor) / Doupe, Adam (Committee member) / Blanco, Eduardo (Committee member) / Wang, Ruoyu (Committee member) / Arizona State University (Publisher)

Created: 2022

FDB: A Framework for Flexible and Efficient Fuzzer Debugging

Description

Fuzzing is currently a thriving research area in the cybersecurity field. This work begins by introducing code that brings partial replayability capabilities to AFL++ in an attempt to solve the challenge of the highly random nature of fuzzing that comes from the large number of random mutations on input seeds. The code addresses two of the three sources of nondeterminism described in this work. Furthermore, this work introduces Fuzzing Debugger (FDB), a highly configurable framework to facilitate the debugging of fuzzing by interfacing with GDB. Three debugging modes are described which attempt to tackle two use cases of FDB: (1) pinpointing nondeterminism in fuzz runs, thereby paving the way for replayable fuzz runs, and (2) systematically finding preferable stopping points for seed analysis.

Contributors: Liu, Denis (Author) / Bao, Tiffany (Thesis director) / Shoshitaishvili, Yan (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2023-05

Machine Learning and Mario Speedruns

Description

Machine learning has a near-infinite number of applications whose potential has yet to be fully harnessed and realized. This thesis outlines two areas in which machine learning can be used and demonstrates the execution of one methodology in each area. The first area is self-play in video games, where a neural model is researched and described that teaches a computer to complete a level of Super Mario World (1990) on its own. The neural model in question was inspired by the academic paper “Evolving Neural Networks through Augmenting Topologies”, which was written by Kenneth O. Stanley and Risto Miikkulainen of the University of Texas at Austin. The model that is actually described is from YouTuber SethBling of the California Institute of Technology. The second area is cybersecurity, where an algorithm is described from the academic paper “Process Based Volatile Memory Forensics for Ransomware Detection”, written by Asad Arfeen, Muhammad Asim Khan, Obad Zafar, and Usama Ahsan. This algorithm uses Python and the Volatility framework to detect malicious software in an infected system.

Contributors: Ballecer, Joshua (Author) / Yang, Yezhou (Thesis director) / Luo, Yiran (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2023-05

An Exploratory Literature Review of Efforts Towards Improving Cybersecurity

Description

Data breaches and software vulnerabilities are increasingly severe problems that incur both monetary and reputational costs for companies as well as societal impacts. While companies have clear monetary and legal incentives to mitigate the risk of data breaches, they have significantly less incentive to mitigate software product vulnerabilities, and their existing incentive is widely considered insufficient. In this thesis, I initially set out to perform a statistical analysis correlating company characteristics and behavior with the characteristics of the data breaches they suffer, and to perform a meta-analysis of existing literature. While the attempted statistical analysis was hindered by a lack of sufficiently comprehensive free company datasets, I have recorded my efforts in finding suitable databases. I have also performed an exploratory literature review of 15 papers in the field of improving cybersecurity, identified four blockers to security addressed and three elements of solutions proposed by the papers, and derived insights from the distribution of these blockers and elements of solutions in the papers reviewed.

Contributors: Mac, Anthony (Author) / Bazzi, Rida (Thesis director) / Shoshitaishvili, Yan (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2022-05

Defeating Attackers by Bridging the Gaps Between Security and Intelligence

Description

The omnipresent data, growing number of network devices, and evolving attack techniques have been challenging organizations’ security defenses over the past decade. With humongous volumes of logs generated by those network devices, looking for patterns of malicious activities and identifying them in time is growing beyond the capabilities of their defense systems. Deep Learning, a subset of Machine Learning (ML) and Artificial Intelligence (AI), fills this gap with its ability to learn from huge amounts of data and to improve its performance as the data it learns from increases. In this dissertation, I bring forward security issues pertaining to two top threats that most organizations fear, Advanced Persistent Threat (APT) and Distributed Denial of Service (DDoS), along with deep learning models built towards addressing those security issues. First, I present a deep learning model, APT Detection, capable of detecting anomalous activities in a system. Evaluation of this model demonstrates how it can contribute to early detection of an APT attack with an Area Under the Curve (AUC) of up to 91% on a Receiver Operating Characteristic (ROC) curve. Second, I present DAPT2020, a first-of-its-kind dataset capturing an APT attack exploiting web and system vulnerabilities in an emulated organization’s production network. Evaluation of the dataset using well-known machine learning models demonstrates the need for better deep learning models to detect APT attacks. I then present DAPT2021, a semi-synthetic dataset capturing an APT attack exploiting human vulnerabilities, alongside two less-skilled attacks. By emulating the normal behavior of the employees in a set target organization, DAPT2021 has been created to enable researchers to study the causations and correlations among the captured data, much-needed information for detecting an underlying threat early.
Finally, I present a distributed defense framework, SmartDefense, that can detect and mitigate over 90% of DDoS traffic at the source and over 97.5% of the remaining DDoS traffic at the Internet Service Provider’s (ISP’s) edge network. Evaluation of this work shows how, by using attributes sent by the customer edge network, SmartDefense can further help ISPs prevent up to 51.95% of the DDoS traffic from going to the destination.

Contributors: Myneni, Sowmya (Author) / Xue, Guoliang (Thesis advisor) / Doupe, Adam (Committee member) / Li, Baoxin (Committee member) / Baral, Chitta (Committee member) / Arizona State University (Publisher)

Created: 2022

PyAntiPhish: A Python-Based Machine Learning Detector of Phishing Websites and An Examination of Relevant URL-Based Features

Description

Phishing is one of the most common and effective attack vectors in modern cybercrime. Rather than targeting a technical vulnerability in a computer system, phishing attacks target human behavioral or emotional tendencies through manipulative emails, text messages, or phone calls. Through PyAntiPhish, I attempt to create my own version of an anti-phishing solution through a series of experiments testing different machine learning classifiers and URL features. With an end-goal implementation as a Chromium browser extension utilizing Python-based machine learning classifiers (those available via the scikit-learn library), my project uses a combination of Python, TypeScript, and Node.js, as well as AWS Lambda and API Gateway, to act as a solution capable of blocking phishing attacks from the web browser.
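
To make the notion of URL-based features concrete, here is a minimal sketch of how a URL might be turned into a feature dictionary before being passed to a classifier. The abstract does not list the features PyAntiPhish actually uses, so the function name `url_features` and every feature below are illustrative assumptions, not the project's implementation.

```python
import re
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Extract a few illustrative (assumed) lexical features from a URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": "@" in url,
        # Crude dotted-quad check; does not validate octet ranges.
        "host_is_ipv4": bool(re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", host)),
        "uses_https": parsed.scheme == "https",
    }


# Hypothetical example URLs.
ip_url = url_features("http://192.168.0.1/login")
named_url = url_features("https://secure-login.example.com/a")
```

A dictionary of this kind could then be converted to a numeric vector and used to train or query a classifier; which features are actually informative is the kind of question the experiments described above examine.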

Contributors: Yang, Branden (Author) / Osburn, Steven (Thesis director) / Malpe, Adwith (Committee member) / Ahn, Gail-Joon (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2024-05

Exploring the Usage of Drones to Perform Network Reconnaissance and Other Wireless Network Exploitation Methods

Description

Wardriving is when prospective malicious hackers drive with a portable computer to sniff out and map potentially vulnerable networks. With the advent of smart homes and other Internet of Things devices, this raises the possibility of more insecure targets. The hardware available to the public has also become smaller and more powerful. One no longer needs to carry a complete laptop to carry out network mapping. With this miniaturization and the greater popularity of quadcopter technology, the two can be combined to create a more efficient wardriving setup in a potentially more target-rich environment. Thus, we set out to create a prototype as a proof of concept of this combination. By creating a bracket for a Raspberry Pi to be mounted to a drone with other wireless sniffing equipment, we demonstrate that one can use various off-the-shelf components to create a powerful network detection device. In this write-up, we also outline some of the challenges encountered in combining these two technologies, as well as the solutions to those challenges. Adding payload weight to drones that are not designed for it degrades characteristics such as flight behavior and power consumption. Less computing power is available due to the miniaturization that must take place for a drone-mounted solution. Communication between the miniature computer and a ground control computer is also essential to overall system operation. Below, we highlight solutions to these problems as well as improvements that can be implemented for maximum system effectiveness.

Contributors: Her, Zachary (Author) / Walker, Elizabeth (Co-author) / Gupta, Sandeep (Thesis director) / Wang, Ruoyu (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2022-05