Search Content

Using contextual information to improve phishing warning effectiveness

Description

Internet browsers are today capable of warning internet users of a potential phishing attack. Browsers identify these websites by referring to blacklists of reported phishing websites maintained by trusted organizations like Google, Phishtank etc. On identifying a Unified Resource Locator (URL) requested by a user as a reported phishing URL,…

Internet browsers are today capable of warning internet users of a potential phishing attack. Browsers identify these websites by referring to blacklists of reported phishing websites maintained by trusted organizations like Google, Phishtank etc. On identifying a Unified Resource Locator (URL) requested by a user as a reported phishing URL, browsers like Mozilla Firefox and Google Chrome display an 'active' warning message in an attempt to stop the user from making a potentially dangerous decision of visiting the website and sharing confidential information like username-password, credit card information, social security number etc.

However, these warnings are not always successful at safeguarding the user from a phishing attack. On several occasions, users ignore these warnings and 'click through' them, eventually landing at the potentially dangerous website and giving away confidential information. Failure to understand the warning, failure to differentiate different types of browser warnings, diminishing trust on browser warnings due to repeated encounter are some of the reasons that make users ignore these warnings. It is important to address these factors in order to eventually improve a user’s reaction to these warnings.

In this thesis, I propose a novel design to improve the effectiveness and reliability of phishing warning messages. This design utilizes the name of the target website that a fake website is mimicking, to display a simple, easy to understand and interactive warning message with the primary objective of keeping the user away from a potentially spoof website.

ContributorsSharma, Satyabrata (Author) / Bazzi, Rida (Thesis advisor) / Walker, Erin (Committee member) / Gaffar, Ashraf (Committee member) / Arizona State University (Publisher)

Created2015

Context-aware search principles in automated learning environments

Description

Many web search improvements have been developed since the advent of the modern search engine, but one underrepresented area is the application of specific customizations to search results for educational web sites. In order to address this issue and improve the relevance of search results in automated learning environments, this…

Many web search improvements have been developed since the advent of the modern search engine, but one underrepresented area is the application of specific customizations to search results for educational web sites. In order to address this issue and improve the relevance of search results in automated learning environments, this work has integrated context-aware search principles with applications of preference based re-ranking and query modifications. This research investigates several aspects of context-aware search principles, specifically context-sensitive and preference based re-ranking of results which take user inputs as to their preferred content, and combines this with search query modifications which automatically search for a variety of modified terms based on the given search query, integrating these results into the overall re-ranking for the context. The result of this work is a novel web search algorithm which could be applied to any online learning environment attempting to collect relevant resources for learning about a given topic. The algorithm has been evaluated through user studies comparing traditional search results to the context-aware results returned through the algorithm for a given topic. These studies explore how this integration of methods could provide improved relevance in the search results returned when compared against other modern search engines.

ContributorsVan Egmond, Eric (Author) / Burleson, Winslow (Thesis advisor) / Syrotiuk, Violet (Thesis advisor) / Nelson, Brian (Committee member) / Arizona State University (Publisher)

Created2014

Topic chains for determining risk of unauthorized information transfer

Description

Corporations invest considerable resources to create, preserve and analyze

their data; yet while organizations are interested in protecting against

unauthorized data transfer, there lacks a comprehensive metric to discriminate

what data are at risk of leaking.

This thesis motivates the need for a quantitative leakage risk metric, and

provides a risk assessment system,…

Corporations invest considerable resources to create, preserve and analyze

their data; yet while organizations are interested in protecting against

unauthorized data transfer, there lacks a comprehensive metric to discriminate

what data are at risk of leaking.

This thesis motivates the need for a quantitative leakage risk metric, and

provides a risk assessment system, called Whispers, for computing it. Using

unsupervised machine learning techniques, Whispers uncovers themes in an

organization's document corpus, including previously unknown or unclassified

data. Then, by correlating the document with its authors, Whispers can

identify which data are easier to contain, and conversely which are at risk.

Using the Enron email database, Whispers constructs a social network segmented

by topic themes. This graph uncovers communication channels within the

organization. Using this social network, Whispers determines the risk of each

topic by measuring the rate at which simulated leaks are not detected. For the

Enron set, Whispers identified 18 separate topic themes between January 1999

and December 2000. The highest risk data emanated from the legal department

with a leakage risk as high as 60%.

ContributorsWright, Jeremy (Author) / Syrotiuk, Violet (Thesis advisor) / Davulcu, Hasan (Committee member) / Yau, Stephen (Committee member) / Arizona State University (Publisher)

Created2014

Establishing distributed social network trust model in MobiCloud system

Description

This thesis proposed a novel approach to establish the trust model in a social network scenario based on users' emails. Email is one of the most important social connections nowadays. By analyzing email exchange activities among users, a social network trust model can be established to judge the trust rate…

This thesis proposed a novel approach to establish the trust model in a social network scenario based on users' emails. Email is one of the most important social connections nowadays. By analyzing email exchange activities among users, a social network trust model can be established to judge the trust rate between each two users. The whole trust checking process is divided into two steps: local checking and remote checking. Local checking directly contacts the email server to calculate the trust rate based on user's own email communication history. Remote checking is a distributed computing process to get help from user's social network friends and built the trust rate together. The email-based trust model is built upon a cloud computing framework called MobiCloud. Inside MobiCloud, each user occupies a virtual machine which can directly communicate with others. Based on this feature, the distributed trust model is implemented as a combination of local analysis and remote analysis in the cloud. Experiment results show that the trust evaluation model can give accurate trust rate even in a small scale social network which does not have lots of social connections. With this trust model, the security in both social network services and email communication could be improved.

ContributorsZhong, Yunji (Author) / Huang, Dijiang (Thesis advisor) / Dasgupta, Partha (Committee member) / Syrotiuk, Violet (Committee member) / Arizona State University (Publisher)

Created2011

Client-driven dynamic database updates

Description

This thesis addresses the problem of online schema updates where the goal is to be able to update relational database schemas without reducing the database system's availability. Unlike some other work in this area, this thesis presents an approach which is completely client-driven and does not require specialized database management…

This thesis addresses the problem of online schema updates where the goal is to be able to update relational database schemas without reducing the database system's availability. Unlike some other work in this area, this thesis presents an approach which is completely client-driven and does not require specialized database management systems (DBMS). Also, unlike other client-driven work, this approach provides support for a richer set of schema updates including vertical split (normalization), horizontal split, vertical and horizontal merge (union), difference and intersection. The update process automatically generates a runtime update client from a mapping between the old the new schemas. The solution has been validated by testing it on a relatively small database of around 300,000 records per table and less than 1 Gb, but with limited memory buffer size of 24 Mb. This thesis presents the study of the overhead of the update process as a function of the transaction rates and the batch size used to copy data from the old to the new schema. It shows that the overhead introduced is minimal for medium size applications and that the update can be achieved with no more than one minute of downtime.

ContributorsTyagi, Preetika (Author) / Bazzi, Rida (Thesis advisor) / Candan, Kasim S (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)

Created2011

The design and analysis of hash families for use in broadcast encryption

Description

Broadcast Encryption is the task of cryptographically securing communication in a broadcast environment so that only a dynamically specified subset of subscribers, called the privileged subset, may decrypt the communication. In practical applications, it is desirable for a Broadcast Encryption Scheme (BES) to demonstrate resilience against attacks by colluding, unprivileged…

Broadcast Encryption is the task of cryptographically securing communication in a broadcast environment so that only a dynamically specified subset of subscribers, called the privileged subset, may decrypt the communication. In practical applications, it is desirable for a Broadcast Encryption Scheme (BES) to demonstrate resilience against attacks by colluding, unprivileged subscribers. Minimal Perfect Hash Families (PHFs) have been shown to provide a basis for the construction of memory-efficient t-resilient Key Pre-distribution Schemes (KPSs) from multiple instances of 1-resilient KPSs. Using this technique, the task of constructing a large t-resilient BES is reduced to finding a near-minimal PHF of appropriate parameters. While combinatorial and probabilistic constructions exist for minimal PHFs with certain parameters, the complexity of constructing them in general is currently unknown. This thesis introduces a new type of hash family, called a Scattering Hash Family (ScHF), which is designed to allow for the scalable and ingredient-independent design of memory-efficient BESs for large parameters, specifically resilience and total number of subscribers. A general BES construction using ScHFs is shown, which constructs t-resilient KPSs from other KPSs of any resilience ≤w≤t. In addition to demonstrating how ScHFs can be used to produce BESs , this thesis explores several ScHF construction techniques. The initial technique demonstrates a probabilistic, non-constructive proof of existence for ScHFs . This construction is then derandomized into a direct, polynomial time construction of near-minimal ScHFs using the method of conditional expectations. As an alternative approach to direct construction, representing ScHFs as a k-restriction problem allows for the indirect construction of ScHFs via randomized post-optimization. Using the methods defined, ScHFs are constructed and the parameters' effects on solution size are analyzed. For large strengths, constructive techniques lose significant performance, and as such, asymptotic analysis is performed using the non-constructive existential results. This work concludes with an analysis of the benefits and disadvantages of BESs based on the constructed ScHFs. Due to the novel nature of ScHFs, the results of this analysis are used as the foundation for an empirical comparison between ScHF-based and PHF-based BESs . The primary bases of comparison are construction efficiency, key material requirements, and message transmission overhead.

ContributorsO'Brien, Devon James (Author) / Colbourn, Charles J (Thesis advisor) / Bazzi, Rida (Committee member) / Richa, Andrea (Committee member) / Arizona State University (Publisher)

Created2012

A network function virtualization based load balancer for TCP

Description

A load balancer is an essential part of many network systems. A load balancer is capable of dividing and redistributing incoming network traffic to different back end servers, thus improving reliability and performance. Existing load balancing solutions can be classified into two categories: hardware-based or software-based. Hardware-based load balancing systems…

A load balancer is an essential part of many network systems. A load balancer is capable of dividing and redistributing incoming network traffic to different back end servers, thus improving reliability and performance. Existing load balancing solutions can be classified into two categories: hardware-based or software-based. Hardware-based load balancing systems are hard to manage and force network administrators to scale up (replacing with more powerful but expensive hardware) when their system can not handle the growing traffic. Software-based solutions have a limitation when dealing with a single large TCP flow. In recent years, with the fast developments of virtualization technology, a new trend of network function virtualization (NFV) is being adopted. Instead of using proprietary hardware, an NFV network infrastructure uses virtual machines running to implement network functions such as load balancers, firewalls, etc. In this thesis, a new load balancing system is designed and evaluated. This system is high performance and flexible. It can fully utilize the bandwidth between a load balancer and back end servers compared to traditional load balancers such as HAProxy. The experimental results show that using this NFV load balancer could have $n$ ($n$ is the number of back end servers) times better performance than HAProxy. Also, an extract, transform and load (ETL) application was implemented to demonstrate that this load balancer can shorten data load time. The experiment shows that when loading a large data set (18.3GB), our load balancer needs only 28\% less time than traditional load balancer.

ContributorsWu, Jinxuan (Author) / Syrotiuk, Violet R. (Thesis advisor) / Bazzi, Rida (Committee member) / Huang, Dijiang (Committee member) / Arizona State University (Publisher)

Created2015

Covering arrays: generation and post-optimization

Description

Exhaustive testing is generally infeasible except in the smallest of systems. Research

has shown that testing the interactions among fewer (up to 6) components is generally

sufficient while retaining the capability to detect up to 99% of defects. This leads to a

substantial decrease in the number of tests. Covering arrays are combinatorial…

Exhaustive testing is generally infeasible except in the smallest of systems. Research

has shown that testing the interactions among fewer (up to 6) components is generally

sufficient while retaining the capability to detect up to 99% of defects. This leads to a

substantial decrease in the number of tests. Covering arrays are combinatorial objects

that guarantee that every interaction is tested at least once.

In the absence of direct constructions, forming small covering arrays is generally

an expensive computational task. Algorithms to generate covering arrays have been

extensively studied yet no single algorithm provides the smallest solution. More

recently research has been directed towards a new technique called post-optimization.

These algorithms take an existing covering array and attempt to reduce its size.

This thesis presents a new idea for post-optimization by representing covering

arrays as graphs. Some properties of these graphs are established and the results are

contrasted with existing post-optimization algorithms. The idea is then generalized to

close variants of covering arrays with surprising results which in some cases reduce

the size by 30%. Applications of the method to generation and test prioritization are

studied and some interesting results are reported.

ContributorsKaria, Rushang Vinod (Author) / Colbourn, Charles J (Thesis advisor) / Syrotiuk, Violet (Committee member) / Richa, Andréa W. (Committee member) / Arizona State University (Publisher)

Created2015

Smoothed Airtime Linear Tuning and Optimized REACT with Multi-hop Extensions

Description

Medium access control (MAC) is a fundamental problem in wireless networks.

In ad-hoc wireless networks especially, many of the performance and scaling issues

these networks face can be attributed to their use of the core IEEE 802.11 MAC

protocol: distributed coordination function (DCF). Smoothed Airtime Linear Tuning

(SALT) is a new contention window tuning…

Medium access control (MAC) is a fundamental problem in wireless networks.

In ad-hoc wireless networks especially, many of the performance and scaling issues

these networks face can be attributed to their use of the core IEEE 802.11 MAC

protocol: distributed coordination function (DCF). Smoothed Airtime Linear Tuning

(SALT) is a new contention window tuning algorithm proposed to address some of the

deficiencies of DCF in 802.11 ad-hoc networks. SALT works alongside a new user level

and optimized implementation of REACT, a distributed resource allocation protocol,

to ensure that each node secures the amount of airtime allocated to it by REACT.

The algorithm accomplishes that by tuning the contention window size parameter

that is part of the 802.11 backoff process. SALT converges more tightly on airtime

allocations than a contention window tuning algorithm from previous work and this

increases fairness in transmission opportunities and reduces jitter more than either

802.11 DCF or the other tuning algorithm. REACT and SALT were also extended

to the multi-hop flow scenario with the introduction of a new airtime reservation

algorithm. With a reservation in place multi-hop TCP throughput actually increased

when running SALT and REACT as compared to 802.11 DCF, and the combination of

protocols still managed to maintain its fairness and jitter advantages. All experiments

were performed on a wireless testbed, not in simulation.

ContributorsMellott, Matthew (Author) / Syrotiuk, Violet (Thesis advisor) / Colbourn, Charles (Committee member) / Tinnirello, Ilenia (Committee member) / Arizona State University (Publisher)

Created2018

Predicting and Interpreting Students Performance using Supervised Learning and Shapley Additive Explanations

Description

Due to large data resources generated by online educational applications, Educational Data Mining (EDM) has improved learning effects in different ways: Students Visualization, Recommendations for students, Students Modeling, Grouping Students, etc. A lot of programming assignments have the features like automating submissions, examining the test cases to verify the correctness,…

Due to large data resources generated by online educational applications, Educational Data Mining (EDM) has improved learning effects in different ways: Students Visualization, Recommendations for students, Students Modeling, Grouping Students, etc. A lot of programming assignments have the features like automating submissions, examining the test cases to verify the correctness, but limited studies compared different statistical techniques with latest frameworks, and interpreted models in a unified approach.

In this thesis, several data mining algorithms have been applied to analyze students’ code assignment submission data from a real classroom study. The goal of this work is to explore

and predict students’ performances. Multiple machine learning models and the model accuracy were evaluated based on the Shapley Additive Explanation.

The Cross-Validation shows the Gradient Boosting Decision Tree has the best precision 85.93% with average 82.90%. Features like Component grade, Due Date, Submission Times have higher impact than others. Baseline model received lower precision due to lack of non-linear fitting.

ContributorsTian, Wenbo (Author) / Hsiao, Ihan (Thesis advisor) / Bazzi, Rida (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)

Created2019