Search Content

Data Driven Game Theoretic Cyber Threat Mitigation

Description

Penetration testing is regarded as the gold-standard for understanding how well an organization can withstand sophisticated cyber-attacks. However, the recent prevalence of markets specializing in zero-day exploits on the darknet make exploits widely available to potential attackers. The cost associated with these sophisticated kits generally precludes penetration testers from simply…

Penetration testing is regarded as the gold-standard for understanding how well an organization can withstand sophisticated cyber-attacks. However, the recent prevalence of markets specializing in zero-day exploits on the darknet make exploits widely available to potential attackers. The cost associated with these sophisticated kits generally precludes penetration testers from simply obtaining such exploits – so an alternative approach is needed to understand what exploits an attacker will most likely purchase and how to defend against them. In this paper, we introduce a data-driven security game framework to model an attacker and provide policy recommendations to the defender. In addition to providing a formal framework and algorithms to develop strategies, we present experimental results from applying our framework, for various system conﬁgurations, on real-world exploit market data actively mined from the darknet.

ContributorsRobertson, John James (Author) / Shakarian, Paulo (Thesis director) / Doupe, Adam (Committee member) / Electrical Engineering Program (Contributor) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created2016-05

Extracting Semantic Information from Online Conversations to Enhance Cyber Defense

Description

Recent advances in techniques allow the extraction of Cyber Threat Information (CTI) from online content, such as social media, blog articles, and posts in discussion forums. Most research work focuses on social media and blog posts since their content is often contributed by cybersecurity experts and is usually of cleaner…

Recent advances in techniques allow the extraction of Cyber Threat Information (CTI) from online content, such as social media, blog articles, and posts in discussion forums. Most research work focuses on social media and blog posts since their content is often contributed by cybersecurity experts and is usually of cleaner formats. While posts in online forums are noisier and less structured, online forums attract more users than other sources and contain much valuable information that may help predict cyber threats. Therefore, effectively extracting CTI from online forum posts is an important task in today's data-driven cybersecurity defenses. Many Natural Language Processing (NLP) techniques are applied to the cybersecurity domains to extract the useful information, however, there is still space to improve. In this dissertation, a new Named Entity Recognition framework for cybersecurity domains and thread structure construction methods for unstructured forums are proposed to support the extraction of CTI. Then, extend them to filter the posts in the forums to eliminate non cybersecurity related topics with Cyber Attack Relevance Scale (CARS), extract the cybersecurity knowledgeable users to enhance more information for enhancing cybersecurity, and extract trending topic phrases related to cyber attacks in the hackers forums to find the clues for potential future attacks to predict them.

ContributorsKashihara, Kazuaki (Author) / Baral, Chitta (Thesis advisor) / Doupe, Adam (Committee member) / Blanco, Eduardo (Committee member) / Wang, Ruoyu (Committee member) / Arizona State University (Publisher)

Created2022

Representation Learning for Trustworthy AI

Description

Artificial Intelligence (AI) systems have achieved outstanding performance and have been found to be better than humans at various tasks, such as sentiment analysis, and face recognition. However, the majority of these state-of-the-art AI systems use complex Deep Learning (DL) methods which present challenges for human experts to design and…

Artificial Intelligence (AI) systems have achieved outstanding performance and have been found to be better than humans at various tasks, such as sentiment analysis, and face recognition. However, the majority of these state-of-the-art AI systems use complex Deep Learning (DL) methods which present challenges for human experts to design and evaluate such models with respect to privacy, fairness, and robustness. Recent examination of DL models reveals that representations may include information that could lead to privacy violations, unfairness, and robustness issues. This results in AI systems that are potentially untrustworthy from a socio-technical standpoint. Trustworthiness in AI is defined by a set of model properties such as non-discriminatory bias, protection of users’ sensitive attributes, and lawful decision-making. The characteristics of trustworthy AI can be grouped into three categories: Reliability, Resiliency, and Responsibility. Past research has shown that the successful integration of an AI model depends on its trustworthiness. Thus it is crucial for organizations and researchers to build trustworthy AI systems to facilitate the seamless integration and adoption of intelligent technologies. The main issue with existing AI systems is that they are primarily trained to improve technical measures such as accuracy on a specific task but are not considerate of socio-technical measures. The aim of this dissertation is to propose methods for improving the trustworthiness of AI systems through representation learning. DL models’ representations contain information about a given input and can be used for tasks such as detecting fake news on social media or predicting the sentiment of a review. The findings of this dissertation significantly expand the scope of trustworthy AI research and establish a new paradigm for modifying data representations to balance between properties of trustworthy AI. Specifically, this research investigates multiple techniques such as reinforcement learning for understanding trustworthiness in users’ privacy, fairness, and robustness in classification tasks like cyberbullying detection and fake news detection. Since most social measures in trustworthy AI cannot be used to fine-tune or train an AI model directly, the main contribution of this dissertation lies in using reinforcement learning to alter an AI system’s behavior based on non-differentiable social measures.

ContributorsMosallanezhad, Ahmadreza (Author) / Liu, Huan (Thesis advisor) / Mancenido, Michelle (Thesis advisor) / Doupe, Adam (Committee member) / Maciejewski, Ross (Committee member) / Arizona State University (Publisher)

Created2023

Misinformation Detection in Social Media

Description

The pervasive use of social media gives it a crucial role in helping the public perceive reliable information. Meanwhile, the openness and timeliness of social networking sites also allow for the rapid creation and dissemination of misinformation. It becomes increasingly difficult for online users to find accurate and trustworthy information.…

The pervasive use of social media gives it a crucial role in helping the public perceive reliable information. Meanwhile, the openness and timeliness of social networking sites also allow for the rapid creation and dissemination of misinformation. It becomes increasingly difficult for online users to find accurate and trustworthy information. As witnessed in recent incidents of misinformation, it escalates quickly and can impact social media users with undesirable consequences and wreak havoc instantaneously. Different from some existing research in psychology and social sciences about misinformation, social media platforms pose unprecedented challenges for misinformation detection. First, intentional spreaders of misinformation will actively disguise themselves. Second, content of misinformation may be manipulated to avoid being detected, while abundant contextual information may play a vital role in detecting it. Third, not only accuracy, earliness of a detection method is also important in containing misinformation from being viral. Fourth, social media platforms have been used as a fundamental data source for various disciplines, and these research may have been conducted in the presence of misinformation. To tackle the challenges, we focus on developing machine learning algorithms that are robust to adversarial manipulation and data scarcity.

The main objective of this dissertation is to provide a systematic study of misinformation detection in social media. To tackle the challenges of adversarial attacks, I propose adaptive detection algorithms to deal with the active manipulations of misinformation spreaders via content and networks. To facilitate content-based approaches, I analyze the contextual data of misinformation and propose to incorporate the specific contextual patterns of misinformation into a principled detection framework. Considering its rapidly growing nature, I study how misinformation can be detected at an early stage. In particular, I focus on the challenge of data scarcity and propose a novel framework to enable historical data to be utilized for emerging incidents that are seemingly irrelevant. With misinformation being viral, applications that rely on social media data face the challenge of corrupted data. To this end, I present robust statistical relational learning and personalization algorithms to minimize the negative effect of misinformation.

ContributorsWu, Liang (Author) / Liu, Huan (Thesis advisor) / Tong, Hanghang (Committee member) / Doupe, Adam (Committee member) / Davison, Brian D. (Committee member) / Arizona State University (Publisher)

Created2019

Individuals, collectives, sisters: vernacular cosmopolitan praxis among Muslim women in transnational cyberspaces

Description

This dissertation research analyzes the vernacular cosmopolitan praxis of Muslim women in transnational cyberspaces related to topical and collective action networks, in an effort to detangle cosmopolitanism from its Western biases and to move away from studies of online Muslim populations based on geographical locations or homogenous networks, linking individuals…

This dissertation research analyzes the vernacular cosmopolitan praxis of Muslim women in transnational cyberspaces related to topical and collective action networks, in an effort to detangle cosmopolitanism from its Western biases and to move away from studies of online Muslim populations based on geographical locations or homogenous networks, linking individuals through their religious practices or consumption of religious knowledge. Through highlighting praxes rather than contexts, this dissertation disrupts the East/West binary and challenges stereotypes ascribed to Muslim women. One of the research questions related to the cosmopolitan praxis of Muslim women is the following: in what ways do Muslim women engage with "others" online and contribute to bridging dissimilar people? Findings suggest that the social media use of Muslim women contributes to their subtle resistance against communal norms and, although it can serve as an extension to their voices, some of their voices are more readily mediated than others. I employ a connective content analysis methodology to put various datasets associated with topics and collective actions related to Muslim women into conversation. The methodology not only highlights consistencies in the qualitative themes that were iteratively developed through the analyses of the datasets, but also more tangible connections related to social media users, topics, and content. Consequently, this thesis is as much concerned with recasting Western-oriented conceptualizations of cosmopolitanism as it is with how networks dynamics foster and constrain cosmopolitan praxis because digital networks have the extraordinary capacity to link dissimilar people beyond any other type of medium. One of the research questions related to the methodology is: how do network dynamics facilitate and constrain cosmopolitan praxis? The substantive chapters related to the datasets discuss various forms of cosmopolitan praxis that were identified in the analysis: social media use by Muslim women that expands the connective memory, which could contribute to dispelling stereotypes ascribed to them; activism that addresses universal concerns related to women's rights regardless of context; dialogically devising basic standards of social conduct and gender relations; and expressions of tolerance toward divergent views, including alternative interpretations of Islamic beliefs and practices.

ContributorsRobinson, Rebecca S (Author) / Lim, Merlyna (Thesis advisor) / Ali, Souad T. (Committee member) / Parmentier, Mary Jane C, (Committee member) / Arizona State University (Publisher)

Created2014

Programmable Insight: A Computational Methodology to Explore Online News Use of Frames

Description

The Internet is a major source of online news content. Online news is a form of large-scale narrative text with rich, complex contents that embed deep meanings (facts, strategic communication frames, and biases) for shaping and transitioning standards, values, attitudes, and beliefs of the masses. Currently, this body of narrative…

The Internet is a major source of online news content. Online news is a form of large-scale narrative text with rich, complex contents that embed deep meanings (facts, strategic communication frames, and biases) for shaping and transitioning standards, values, attitudes, and beliefs of the masses. Currently, this body of narrative text remains untapped due—in large part—to human limitations. The human ability to comprehend rich text and extract hidden meanings is far superior to known computational algorithms but remains unscalable. In this research, computational treatment is given to online news framing for exposing a deeper level of expressivity coined “double subjectivity” as characterized by its cumulative amplification effects. A visual language is offered for extracting spatial and temporal dynamics of double subjectivity that may give insight into social influence about critical issues, such as environmental, economic, or political discourse. This research offers benefits of 1) scalability for processing hidden meanings in big data and 2) visibility of the entire network dynamics over time and space to give users insight into the current status and future trends of mass communication.

ContributorsCheeks, Loretta H. (Author) / Gaffar, Ashraf (Thesis advisor) / Wald, Dara M (Committee member) / Ben Amor, Hani (Committee member) / Doupe, Adam (Committee member) / Cooke, Nancy J. (Committee member) / Arizona State University (Publisher)

Created2017

A Study of Online Security Practices

Description

Data from a total of 282 online web applications was collected, and accounts for 230 of those web applications were created in order to gather data about authentication practices, multistep authentication practices, security question practices, fallback authentication practices, and other security practices for online accounts. The account creation and data…

Data from a total of 282 online web applications was collected, and accounts for 230 of those web applications were created in order to gather data about authentication practices, multistep authentication practices, security question practices, fallback authentication practices, and other security practices for online accounts. The account creation and data collection was done between June 2016 and April 2017. The password strengths for online accounts were analyzed and password strength data was compared to existing data. Security questions used by online accounts were evaluated for security and usability, and fallback authentication practices were assessed based on their adherence to best practices. Alternative authentication schemes were examined, and other security considerations such as use of HTTPS and CAPTCHAs were explored. Based on existing data, password policies require stronger passwords in for web applications in 2017 compared to the requirements in 2010. Nevertheless, password policies for many accounts are still not adequate. About a quarter of online web applications examined use security questions, and many of the questions have usability and security concerns. Security mechanisms such as HTTPS and continuous authentication are in general not used in conjunction with security questions for most web applications, which reduces the overall security of the web application. A majority of web applications use email addresses as the login credential and the password recovery credential and do not follow best practices. About a quarter of accounts use multistep authentication and a quarter of accounts employ continuous authentication, yet most accounts fail to combine security measures for defense in depth. The overall conclusion is that some online web applications are using secure practices; however, a majority of online web applications fail to properly implement and utilize secure practices.

ContributorsGutierrez, Garrett (Author) / Bazzi, Rida (Thesis advisor) / Ahn, Gail-Joon (Committee member) / Doupe, Adam (Committee member) / Arizona State University (Publisher)

Created2017

A Hacker-Centric Perspective to Empower Cyber Defense

Description

Malicious hackers utilize the World Wide Web to share knowledge. Previous work has demonstrated that information mined from online hacking communities can be used as precursors to cyber-attacks. In a threatening scenario, where security alert systems are facing high false positive rates, understanding the people behind cyber incidents can hel…

Malicious hackers utilize the World Wide Web to share knowledge. Previous work has demonstrated that information mined from online hacking communities can be used as precursors to cyber-attacks. In a threatening scenario, where security alert systems are facing high false positive rates, understanding the people behind cyber incidents can help reduce the risk of attacks. However, the rapidly evolving nature of those communities leads to limitations still largely unexplored, such as: who are the skilled and influential individuals forming those groups, how they self-organize along the lines of technical expertise, how ideas propagate within them, and which internal patterns can signal imminent cyber offensives? In this dissertation, I have studied four key parts of this complex problem set. Initially, I leverage content, social network, and seniority analysis to mine key-hackers on darkweb forums, identifying skilled and influential individuals who are likely to succeed in their cybercriminal goals. Next, as hackers often use Web platforms to advertise and recruit collaborators, I analyze how social influence contributes to user engagement online. On social media, two time constraints are proposed to extend standard influence measures, which increases their correlation with adoption probability and consequently improves hashtag adoption prediction. On darkweb forums, the prediction of where and when hackers will post a message in the near future is accomplished by analyzing their recurrent interactions with other hackers. After that, I demonstrate how vendors of malware and malicious exploits organically form hidden organizations on darkweb marketplaces, obtaining significant consistency across the vendors’ communities extracted using the similarity of their products in different networks. Finally, I predict imminent cyber-attacks correlating malicious hacking activity on darkweb forums with real-world cyber incidents, evidencing how social indicators are crucial for the performance of the proposed model. This research is a hybrid of social network analysis (SNA), machine learning (ML), evolutionary computation (EC), and temporal logic (TL), presenting expressive contributions to empower cyber defense.

ContributorsSantana Marin, Ericsson (Author) / Shakarian, Paulo (Thesis advisor) / Doupe, Adam (Committee member) / Liu, Huan (Committee member) / Ferrara, Emilio (Committee member) / Arizona State University (Publisher)

Created2020

Improving AI planning by using extensible components

Description

Despite incremental improvements over decades, academic planning solutions see relatively little use in many industrial domains despite the relevance of planning paradigms to those problems. This work observes four shortfalls of existing academic solutions which contribute to this lack of adoption.

To address these shortfalls this work defines model-independent semantics for…

Despite incremental improvements over decades, academic planning solutions see relatively little use in many industrial domains despite the relevance of planning paradigms to those problems. This work observes four shortfalls of existing academic solutions which contribute to this lack of adoption.

To address these shortfalls this work defines model-independent semantics for planning and introduces an extensible planning library. This library is shown to produce feasible results on an existing benchmark domain, overcome the usual modeling limitations of traditional planners, and accommodate domain-dependent knowledge about the problem structure within the planning process.

ContributorsJonas, Michael (Author) / Gaffar, Ashraf (Thesis advisor) / Fainekos, Georgios (Committee member) / Doupe, Adam (Committee member) / Herley, Cormac (Committee member) / Arizona State University (Publisher)

Created2016

Filtering by