Search Content

Topic chains for determining risk of unauthorized information transfer

Description

Corporations invest considerable resources to create, preserve and analyze

their data; yet while organizations are interested in protecting against

unauthorized data transfer, there lacks a comprehensive metric to discriminate

what data are at risk of leaking.

This thesis motivates the need for a quantitative leakage risk metric, and

provides a risk assessment system,…

Corporations invest considerable resources to create, preserve and analyze

their data; yet while organizations are interested in protecting against

unauthorized data transfer, there lacks a comprehensive metric to discriminate

what data are at risk of leaking.

This thesis motivates the need for a quantitative leakage risk metric, and

provides a risk assessment system, called Whispers, for computing it. Using

unsupervised machine learning techniques, Whispers uncovers themes in an

organization's document corpus, including previously unknown or unclassified

data. Then, by correlating the document with its authors, Whispers can

identify which data are easier to contain, and conversely which are at risk.

Using the Enron email database, Whispers constructs a social network segmented

by topic themes. This graph uncovers communication channels within the

organization. Using this social network, Whispers determines the risk of each

topic by measuring the rate at which simulated leaks are not detected. For the

Enron set, Whispers identified 18 separate topic themes between January 1999

and December 2000. The highest risk data emanated from the legal department

with a leakage risk as high as 60%.

ContributorsWright, Jeremy (Author) / Syrotiuk, Violet (Thesis advisor) / Davulcu, Hasan (Committee member) / Yau, Stephen (Committee member) / Arizona State University (Publisher)

Created2014

E-mail header injections - an analysis of the World Wide Web

Description

E-Mail header injection vulnerability is a class of vulnerability that can occur in web applications that use user input to construct e-mail messages. E-Mail injection is possible when the mailing script fails to check for the presence of e-mail headers in user input (either form fields or URL parameters). The…

E-Mail header injection vulnerability is a class of vulnerability that can occur in web applications that use user input to construct e-mail messages. E-Mail injection is possible when the mailing script fails to check for the presence of e-mail headers in user input (either form fields or URL parameters). The vulnerability exists in the reference implementation of the built-in “mail” functionality in popular languages like PHP, Java, Python, and Ruby. With the proper injection string, this vulnerability can be exploited to inject additional headers and/or modify existing headers in an e-mail message, allowing an attacker to completely alter the content of the e-mail.

This thesis develops a scalable mechanism to automatically detect E-Mail Header Injection vulnerability and uses this mechanism to quantify the prevalence of E- Mail Header Injection vulnerabilities on the Internet. Using a black-box testing approach, the system crawled 21,675,680 URLs to find URLs which contained form fields. 6,794,917 such forms were found by the system, of which 1,132,157 forms contained e-mail fields. The system used this data feed to discern the forms that could be fuzzed with malicious payloads. Amongst the 934,016 forms tested, 52,724 forms were found to be injectable with more malicious payloads. The system tested 46,156 of these and was able to find 496 vulnerable URLs across 222 domains, which proves that the threat is widespread and deserves future research attention.

ContributorsChandramouli, Sai Prashanth (Author) / Doupe, Adam (Thesis advisor) / Ahn, Gail-Joon (Committee member) / Zhao, Ziming (Committee member) / Arizona State University (Publisher)

Created2016

Next Generation Black-Box Web Application Vulnerability Analysis Framework

Description

Web applications are an incredibly important aspect of our modern lives. Organizations

and developers use automated vulnerability analysis tools, also known as

scanners, to automatically find vulnerabilities in their web applications during development.

Scanners have traditionally fallen into two types of approaches: black-box

and white-box. In the black-box approaches, the scanner does not have…

Web applications are an incredibly important aspect of our modern lives. Organizations

and developers use automated vulnerability analysis tools, also known as

scanners, to automatically find vulnerabilities in their web applications during development.

Scanners have traditionally fallen into two types of approaches: black-box

and white-box. In the black-box approaches, the scanner does not have access to the

source code of the web application whereas a white-box approach has access to the

source code. Today’s state-of-the-art black-box vulnerability scanners employ various

methods to fuzz and detect vulnerabilities in a web application. However, these

scanners attempt to fuzz the web application with a number of known payloads and

to try to trigger a vulnerability. This technique is simple but does not understand

the web application that it is testing. This thesis, presents a new approach to vulnerability

analysis. The vulnerability analysis module presented uses a novel approach

of Inductive Reverse Engineering (IRE) to understand and model the web application.

IRE first attempts to understand the behavior of the web application by giving

certain number of input/output pairs to the web application. Then, the IRE module

hypothesizes a set of programs (in a limited language specific to web applications,

called AWL) that satisfy the input/output pairs. These hypotheses takes the form of

a directed acyclic graph (DAG). AWL vulnerability analysis module can then attempt

to detect vulnerabilities in this DAG. Further, it generates the payload based on the

DAG, and therefore this payload will be a precise payload to trigger the potential vulnerability

(based on our understanding of the program). It then tests this potential

vulnerability using the generated payload on the actual web application, and creates

a verification procedure to see if the potential vulnerability is actually vulnerable,

based on the web application’s response.

ContributorsKhairnar, Tejas (Author) / Doupe, Adam (Thesis advisor) / Ahn, Gail-Joon (Committee member) / Zhao, Ziming (Committee member) / Arizona State University (Publisher)

Created2017

IRE: A Framework For Inductive Reverse Engineering

Description

Reverse engineering is critical to reasoning about how a system behaves. While complete access to a system inherently allows for perfect analysis, partial access is inherently uncertain. This is the case foran individual agent in a distributed system. Inductive Reverse Engineering (IRE) enables analysis under

such circumstances. IRE does this by…

Reverse engineering is critical to reasoning about how a system behaves. While complete access to a system inherently allows for perfect analysis, partial access is inherently uncertain. This is the case foran individual agent in a distributed system. Inductive Reverse Engineering (IRE) enables analysis under

such circumstances. IRE does this by producing program spaces consistent with individual input-output examples for a given domain-specific language. Then, IRE intersects those program spaces to produce a generalized program consistent with all examples. IRE, an easy to use framework, allows this domain-specific language to be specified in the form of Theorist s, which produce Theory s, a succinct way of representing the program space.

Programs are often much more complex than simple string transformations. One of the ways in which they are more complex is in the way that they follow a conversation-like behavior, potentially following some underlying protocol. As a result, IRE represents program interactions as Conversations in order to

more correctly model a distributed system. This, for instance, enables IRE to model dynamically captured inputs received from other agents in the distributed system.

While domain-specific knowledge provided by a user is extremely valuable, such information is not always possible. IRE mitigates this by automatically inferring program grammars, allowing it to still perform efficient searches of the program space. It does this by intersecting conversations prior to synthesis in order to understand what portions of conversations are constant.

IRE exists to be a tool that can aid in automatic reverse engineering across numerous domains. Further, IRE aspires to be a centralized location and interface for implementing program synthesis and automatic black box analysis techniques.

ContributorsNelson, Connor David (Author) / Doupe, Adam (Thesis advisor) / Shoshitaishvili, Yan (Committee member) / Wang, Ruoyu (Committee member) / Arizona State University (Publisher)

Created2019

Filtering by

Topic chains for determining risk of unauthorized information transfer

E-mail header injections - an analysis of the World Wide Web

Next Generation Black-Box Web Application Vulnerability Analysis Framework

IRE: A Framework For Inductive Reverse Engineering