Matching Items (2)
Filtering by

Clear all filters

134439-Thumbnail Image.png
Description
In the area of hardware, reverse engineering was traditionally focused on developing clones—duplicated components that performed the same functionality of the original component. While reverse engineering techniques have been applied to software, these techniques have instead focused on understanding high-level software designs to ease the software maintenance burden. This approach

In the area of hardware, reverse engineering was traditionally focused on developing clones—duplicated components that performed the same functionality of the original component. While reverse engineering techniques have been applied to software, these techniques have instead focused on understanding high-level software designs to ease the software maintenance burden. This approach works well for traditional applications that contain source code, however, there are circumstances, particularly regarding web applications, where it would be very beneficial to clone a web application and no source code is present, e.g., for security testing of the application or for offline mock testing of a third-party web service. We call this the web application cloning problem.
This thesis presents a possible solution to the problem of web application cloning. Our approach is a novel application of inductive programming, which we call inductive reverse engineering. The goal of inductive reverse engineering is to automatically reverse engineer an abstraction of the web application’s code in a completely black-box manner. We build this approach using recent advances in inductive programming, and we solve several technical challenges to scale the inductive programming techniques to realistic-sized web applications. We target the initial version of our inductive reverse engineering tool to a subset of web applications, i.e., those that do not store state and those that do not have loops. We introduce an evaluation methodology for web application cloning techniques and evaluate our approach on several real-world web applications. The results indicate that inductive reverse engineering can effectively reverse engineer specific types of web applications. In the future, we hope to extend the power of inductive reverse engineering to web applications with state and to learn loops, while still maintaining tractability.
ContributorsLiao, Kevin (Author) / Doupe, Adam (Thesis director) / Ahn, Gail-Joon (Committee member) / Zhao, Ziming (Committee member) / Computer Science and Engineering Program (Contributor, Contributor) / W. P. Carey School of Business (Contributor) / Barrett, The Honors College (Contributor)
Created2017-05
157515-Thumbnail Image.png
Description
Reverse engineering is critical to reasoning about how a system behaves. While complete access to a system inherently allows for perfect analysis, partial access is inherently uncertain. This is the case foran individual agent in a distributed system. Inductive Reverse Engineering (IRE) enables analysis under

such circumstances. IRE does this by

Reverse engineering is critical to reasoning about how a system behaves. While complete access to a system inherently allows for perfect analysis, partial access is inherently uncertain. This is the case foran individual agent in a distributed system. Inductive Reverse Engineering (IRE) enables analysis under

such circumstances. IRE does this by producing program spaces consistent with individual input-output examples for a given domain-specific language. Then, IRE intersects those program spaces to produce a generalized program consistent with all examples. IRE, an easy to use framework, allows this domain-specific language to be specified in the form of Theorist s, which produce Theory s, a succinct way of representing the program space.

Programs are often much more complex than simple string transformations. One of the ways in which they are more complex is in the way that they follow a conversation-like behavior, potentially following some underlying protocol. As a result, IRE represents program interactions as Conversations in order to

more correctly model a distributed system. This, for instance, enables IRE to model dynamically captured inputs received from other agents in the distributed system.

While domain-specific knowledge provided by a user is extremely valuable, such information is not always possible. IRE mitigates this by automatically inferring program grammars, allowing it to still perform efficient searches of the program space. It does this by intersecting conversations prior to synthesis in order to understand what portions of conversations are constant.

IRE exists to be a tool that can aid in automatic reverse engineering across numerous domains. Further, IRE aspires to be a centralized location and interface for implementing program synthesis and automatic black box analysis techniques.
ContributorsNelson, Connor David (Author) / Doupe, Adam (Thesis advisor) / Shoshitaishvili, Yan (Committee member) / Wang, Ruoyu (Committee member) / Arizona State University (Publisher)
Created2019