Search Content

Detection of advanced bots in smartphones through user profiling

Description

This thesis addresses the ever increasing threat of botnets in the smartphone domain and focuses on the Android platform and the botnets using Online Social Networks (OSNs) as Command and Control (C&C;) medium. With any botnet, C&C; is one of the components on which the survival of botnet depends. Individual…

This thesis addresses the ever increasing threat of botnets in the smartphone domain and focuses on the Android platform and the botnets using Online Social Networks (OSNs) as Command and Control (C&C;) medium. With any botnet, C&C; is one of the components on which the survival of botnet depends. Individual bots use the C&C; channel to receive commands and send the data. This thesis develops active host based approach for identifying the presence of bot based on the anomalies in the usage patterns of the user before and after the bot is installed on the user smartphone and alerting the user to the presence of the bot. A profile is constructed for each user based on the regular web usage patterns (achieved by intercepting the http(s) traffic) and implementing machine learning techniques to continuously learn the user's behavior and changes in the behavior and all the while looking for any anomalies in the user behavior above a threshold which will cause the user to be notified of the anomalous traffic. A prototype bot which uses OSN s as C&C; channel is constructed and used for testing. Users are given smartphones(Nexus 4 and Galaxy Nexus) running Application proxy which intercepts http(s) traffic and relay it to a server which uses the traffic and constructs the model for a particular user and look for any signs of anomalies. This approach lays the groundwork for the future host-based counter measures for smartphone botnets using OSN s as C&C; channel.

ContributorsKilari, Vishnu Teja (Author) / Xue, Guoliang (Thesis advisor) / Ahn, Gail-Joon (Committee member) / Dasgupta, Partha (Committee member) / Arizona State University (Publisher)

Created2013

An SDN-based IPS development framework in cloud networking environment

Description

Security has been one of the top concerns in cloud community while cloud resource abuse and malicious insiders are considered as top threats. Traditionally, Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS) have been widely deployed to manipulate cloud security, with the latter one providing additional prevention capability. However,…

Security has been one of the top concerns in cloud community while cloud resource abuse and malicious insiders are considered as top threats. Traditionally, Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS) have been widely deployed to manipulate cloud security, with the latter one providing additional prevention capability. However, as one of the most creative networking technologies, Software-Defined Networking (SDN) is rarely used to implement IDPS in the cloud computing environment because the lack of comprehensive development framework and processing flow. Simply migration from traditional IDS/IPS systems to SDN environment are not effective enough for detecting and defending malicious attacks. Hence, in this thesis, we present an IPS development framework to help user easily design and implement their defensive systems in cloud system by SDN technology. This framework enables SDN approaches to enhance the system security and performance. A Traffic Information Platform (TIP) is proposed as the cornerstone with several upper layer security modules such as Detection, Analysis and Prevention components. Benefiting from the flexible, compatible and programmable features of SDN, Customized Detection Engine, Network Topology Finder, Source Tracer and further user-developed security appliances are plugged in our framework to construct a SDN-based defensive system. Two main categories Python-based APIs are designed to support developers for further development. This system is designed and implemented based on the POX controller and Open vSwitch in the cloud computing environment. The efficiency of this framework is demonstrated by a sample IPS implementation and the performance of our framework is also evaluated.

ContributorsXiong, Zhengyang (Author) / Huang, Dijiang (Thesis advisor) / Xue, Guoliang (Committee member) / Dalvucu, Hasan (Committee member) / Arizona State University (Publisher)

Created2014

Infinite cacheflow: a rule-caching solution for software defined networks

Description

New OpenFlow switches support a wide range of network applications, such as firewalls, load balancers, routers, and traffic monitoring. While ternary content addressable memory (TCAM) allows switches to process packets at high speed based on multiple header fields, today's commodity switches support just thousands to tens of thousands of forwarding…

New OpenFlow switches support a wide range of network applications, such as firewalls, load balancers, routers, and traffic monitoring. While ternary content addressable memory (TCAM) allows switches to process packets at high speed based on multiple header fields, today's commodity switches support just thousands to tens of thousands of forwarding rules. To allow for finer-grained policies on this hardware, efficient ways to support the abstraction of a switch are needed with arbitrarily large rule tables. To do so, a hardware-software hybrid switch is designed that relies on rule caching to provide large rule tables at low cost. Unlike traditional caching solutions, neither individual rules are cached (to respect rule dependencies) nor compressed (to preserve the per-rule traffic counts). Instead long dependency chains are ``spliced'' to cache smaller groups of rules while preserving the semantics of the network policy. The proposed hybrid switch design satisfies three criteria: (1) responsiveness, to allow rapid changes to the cache with minimal effect on traffic throughput; (2) transparency, to faithfully support native OpenFlow semantics; (3) correctness, to cache rules while preserving the semantics of the original policy. The evaluation of the hybrid switch on large rule tables suggest that it can effectively expose the benefits of both hardware and software switches to the controller and to applications running on top of it.

ContributorsAlipourfard, Omid (Author) / Syrotiuk, Violet R. (Thesis advisor) / Richa, Andréa W. (Committee member) / Xue, Guoliang (Committee member) / Arizona State University (Publisher)

Created2014

Privacy preserving controls for Android applications

Description

Android is currently the most widely used mobile operating system. The permission model in Android governs the resource access privileges of applications. The permission model however is amenable to various attacks, including re-delegation attacks, background snooping attacks and disclosure of private information. This thesis is aimed at understanding, analyzing and…

Android is currently the most widely used mobile operating system. The permission model in Android governs the resource access privileges of applications. The permission model however is amenable to various attacks, including re-delegation attacks, background snooping attacks and disclosure of private information. This thesis is aimed at understanding, analyzing and performing forensics on application behavior. This research sheds light on several security aspects, including the use of inter-process communications (IPC) to perform permission re-delegation attacks.

Android permission system is more of app-driven rather than user controlled, which means it is the applications that specify their permission requirement and the only thing which the user can do is choose not to install a particular application based on the requirements. Given the all or nothing choice, users succumb to pressures and needs to accept permissions requested. This thesis proposes a couple of ways for providing the users finer grained control of application privileges. The same methods can be used to evade the Permission Re-delegation attack.

This thesis also proposes and implements a novel methodology in Android that can be used to control the access privileges of an Android application, taking into consideration the context of the running application. This application-context based permission usage is further used to analyze a set of sample applications. We found the evidence of applications spoofing or divulging user sensitive information such as location information, contact information, phone id and numbers, in the background. Such activities can be used to track users for a variety of privacy-intrusive purposes. We have developed implementations that minimize several forms of privacy leaks that are routinely done by stock applications.

ContributorsGollapudi, Narasimha Aditya (Author) / Dasgupta, Partha (Thesis advisor) / Xue, Guoliang (Committee member) / Doupe, Adam (Committee member) / Arizona State University (Publisher)

Created2014

Sparse learning package with stability selection and application to alzheimer's disease

Description

Sparse learning is a technique in machine learning for feature selection and dimensionality reduction, to find a sparse set of the most relevant features. In any machine learning problem, there is a considerable amount of irrelevant information, and separating relevant information from the irrelevant information has been a topic of…

Sparse learning is a technique in machine learning for feature selection and dimensionality reduction, to find a sparse set of the most relevant features. In any machine learning problem, there is a considerable amount of irrelevant information, and separating relevant information from the irrelevant information has been a topic of focus. In supervised learning like regression, the data consists of many features and only a subset of the features may be responsible for the result. Also, the features might require special structural requirements, which introduces additional complexity for feature selection. The sparse learning package, provides a set of algorithms for learning a sparse set of the most relevant features for both regression and classification problems. Structural dependencies among features which introduce additional requirements are also provided as part of the package. The features may be grouped together, and there may exist hierarchies and over- lapping groups among these, and there may be requirements for selecting the most relevant groups among them. In spite of getting sparse solutions, the solutions are not guaranteed to be robust. For the selection to be robust, there are certain techniques which provide theoretical justification of why certain features are selected. The stability selection, is a method for feature selection which allows the use of existing sparse learning methods to select the stable set of features for a given training sample. This is done by assigning probabilities for the features: by sub-sampling the training data and using a specific sparse learning technique to learn the relevant features, and repeating this a large number of times, and counting the probability as the number of times a feature is selected. Cross-validation which is used to determine the best parameter value over a range of values, further allows to select the best parameter value. This is done by selecting the parameter value which gives the maximum accuracy score. With such a combination of algorithms, with good convergence guarantees, stable feature selection properties and the inclusion of various structural dependencies among features, the sparse learning package will be a powerful tool for machine learning research. Modular structure, C implementation, ATLAS integration for fast linear algebraic subroutines, make it one of the best tool for a large sparse setting. The varied collection of algorithms, support for group sparsity, batch algorithms, are a few of the notable functionality of the SLEP package, and these features can be used in a variety of fields to infer relevant elements. The Alzheimer Disease(AD) is a neurodegenerative disease, which gradually leads to dementia. The SLEP package is used for feature selection for getting the most relevant biomarkers from the available AD dataset, and the results show that, indeed, only a subset of the features are required to gain valuable insights.

ContributorsThulasiram, Ramesh (Author) / Ye, Jieping (Thesis advisor) / Xue, Guoliang (Committee member) / Sen, Arunabha (Committee member) / Arizona State University (Publisher)

Created2011

Adapting sensing and transmission times to improve secondary user throughput in cognitive radio ad hoc networks

Description

Cognitive Radios (CR) are designed to dynamically reconﬁgure their transmission and/or reception parameters to utilize the bandwidth efﬁciently. With a rapidly ﬂuctuating radio environment, spectrum management becomes crucial for cognitive radios. In a Cognitive Radio Ad Hoc Network (CRAHN) setting, the sensing and transmission times of the cognitive radio play…

Cognitive Radios (CR) are designed to dynamically reconﬁgure their transmission and/or reception parameters to utilize the bandwidth efﬁciently. With a rapidly ﬂuctuating radio environment, spectrum management becomes crucial for cognitive radios. In a Cognitive Radio Ad Hoc Network (CRAHN) setting, the sensing and transmission times of the cognitive radio play a more important role because of the decentralized nature of the network. They have a direct impact on the throughput. Due to the tradeoff between throughput and the sensing time, ﬁnding optimal values for sensing time and transmission time is difﬁcult. In this thesis, a method is proposed to improve the throughput of a CRAHN by dynamically changing the sensing and transmission times. To simulate the CRAHN setting, ns-2, the network simulator with an extension for CRAHN is used. The CRAHN extension module implements the required Primary User (PU) and Secondary User (SU) and other CR functionalities to simulate a realistic CRAHN scenario. First, this work presents a detailed analysis of various CR parameters, their interactions, their individual contributions to the throughput to understand how they affect the transmissions in the network. Based on the results of this analysis, changes to the system model in the CRAHN extension are proposed. Instantaneous throughput of the network is introduced in the new model, which helps to determine how the parameters should adapt based on the current throughput. Along with instantaneous throughput, checks are done for interference with the PUs and their transmission power, before modifying these CR parameters. Simulation results demonstrate that the throughput of the CRAHN with the adaptive sensing and transmission times is signiﬁcantly higher as compared to that of non-adaptive parameters.

ContributorsBapat, Namrata Arun (Author) / Syrotiuk, Violet R. (Thesis advisor) / Ahn, Gail-Joon (Committee member) / Xue, Guoliang (Committee member) / Arizona State University (Publisher)

Created2012

Adaptive sampling and learning in recommendation systems

Description

This thesis studies recommendation systems and considers joint sampling and learning. Sampling in recommendation systems is to obtain users' ratings on specific items chosen by the recommendation platform, and learning is to infer the unknown ratings of users to items given the existing data. In this thesis, the problem is…

This thesis studies recommendation systems and considers joint sampling and learning. Sampling in recommendation systems is to obtain users' ratings on specific items chosen by the recommendation platform, and learning is to infer the unknown ratings of users to items given the existing data. In this thesis, the problem is formulated as an adaptive matrix completion problem in which sampling is to reveal the unknown entries of a $U\times M$ matrix where $U$ is the number of users, $M$ is the number of items, and each entry of the $U\times M$ matrix represents the rating of a user to an item. In the literature, this matrix completion problem has been studied under a static setting, i.e., recovering the matrix based on a set of partial ratings. This thesis considers both sampling and learning, and proposes an adaptive algorithm. The algorithm adapts its sampling and learning based on the existing data. The idea is to sample items that reveal more information based on the previous sampling results and then learn based on clustering. Performance of the proposed algorithm has been evaluated using simulations.

ContributorsZhu, Lingfang (Author) / Xue, Guoliang (Thesis advisor) / He, Jingrui (Committee member) / Tong, Hanghang (Committee member) / Arizona State University (Publisher)

Created2015

Multiobjective Optimization Based Approach for Truth Discovery

Description

There are many applications where the truth is unknown. The truth values are

guessed by different sources. The values of different properties can be obtained from

various sources. These will lead to the disagreement in sources. An important task

is to obtain the truth from these sometimes contradictory sources. In the extension

of computing…

There are many applications where the truth is unknown. The truth values are

guessed by different sources. The values of different properties can be obtained from

various sources. These will lead to the disagreement in sources. An important task

is to obtain the truth from these sometimes contradictory sources. In the extension

of computing the truth, the reliability of sources needs to be computed. There are

models which compute the precision values. In those earlier models Banerjee et al.

(2005) Dong and Naumann (2009) Kasneci et al. (2011) Li et al. (2012) Marian and

Wu (2011) Zhao and Han (2012) Zhao et al. (2012), multiple properties are modeled

individually. In one of the existing works, the heterogeneous properties are modeled in

a joined way. In that work, the framework i.e. Conflict Resolution on Heterogeneous

Data (CRH) framework is based on the single objective optimization. Due to the

single objective optimization and non-convex optimization problem, only one local

optimal solution is found. As this is a non-convex optimization problem, the optimal

point depends upon the initial point. This single objective optimization problem is

converted into a multi-objective optimization problem. Due to the multi-objective

optimization problem, the Pareto optimal points are computed. In an extension of

that, the single objective optimization problem is solved with numerous initial points.

The above two approaches are used for finding the solution better than the solution

obtained in the CRH with median as the initial point for the continuous variables and

majority voting as the initial point for the categorical variables. In the experiments,

the solution, coming from the CRH, lies in the Pareto optimal points of the multiobjective

optimization and the solution coming from the CRH is the optimum solution

in these experiments.

ContributorsJain, Karan (Author) / Xue, Guoliang (Thesis advisor) / Sen, Arunabha (Committee member) / Sarwat, Mohamed (Committee member) / Arizona State University (Publisher)

Created2019

Designing a Software Platform for Evaluating Cyber-Attacks on The Electric PowerGrid

Description

Energy management system (EMS) is at the heart of the operation and control of a modern electrical grid. Because of economic, safety, and security reasons, access to industrial grade EMS and real-world power system data is extremely limited. Therefore, the ability to simulate an EMS is invaluable in researching the…

Energy management system (EMS) is at the heart of the operation and control of a modern electrical grid. Because of economic, safety, and security reasons, access to industrial grade EMS and real-world power system data is extremely limited. Therefore, the ability to simulate an EMS is invaluable in researching the EMS in normal and anomalous operating conditions.

I first lay the groundwork for a basic EMS loop simulation in modern power grids and review a class of cybersecurity threats called false data injection (FDI) attacks. Then I propose a software architecture as the basis of software simulation of the EMS loop and explain an actual software platform built using the proposed architecture. I also explain in detail the power analysis libraries used for building the platform with examples and illustrations from the implemented application. Finally, I will use the platform to simulate FDI attacks on two synthetic power system test cases and analyze and visualize the consequences using the capabilities built into the platform.

ContributorsKhodadadeh, Roozbeh (Author) / Sankar, Lalitha (Thesis advisor) / Xue, Guoliang (Thesis advisor) / Kosut, Oliver (Committee member) / Arizona State University (Publisher)

Created2019

Comparing a commercial and an SDN-based load balancer in a campus network

Description

Commercial load balancers are often in use, and the production network at Arizona State University (ASU) is no exception. However, because the load balancer uses IP addresses, the solution does not apply to all applications. One such application is Rsyslog. This software processes syslog packets and stores them in files.…

Commercial load balancers are often in use, and the production network at Arizona State University (ASU) is no exception. However, because the load balancer uses IP addresses, the solution does not apply to all applications. One such application is Rsyslog. This software processes syslog packets and stores them in files. The loss rate of incoming log packets is high due to the incoming rate of the data. The Rsyslog servers are overwhelmed by the continuous data stream. To solve this problem a software defined networking (SDN) based load balancer is designed to perform a transport-level load balancing over the incoming load to Rsyslog servers. In this solution the load is forwarded to one Rsyslog server at a time, according to one of a Round-Robin, Random, or Load-Based policy. This gives time to other servers to process the data they have received and prevent them from being overwhelmed. The evaluation of the proposed solution is conducted a physical testbed with the same data feed as the commercial solution. The results suggest that the SDN-based load balancer is competitive with the commercial load balancer. Replacing the software OpenFlow switch with a hardware switch is likely to further improve the results.

ContributorsGhaffarinejad, Ashkan (Author) / Syrotiuk, Violet R. (Thesis advisor) / Xue, Guoliang (Committee member) / Huang, Dijiang (Committee member) / Arizona State University (Publisher)

Created2015

Filtering by