Search Content

Multiobjective Optimization Based Approach for Truth Discovery

Description

There are many applications where the truth is unknown. The truth values are

guessed by different sources. The values of different properties can be obtained from

various sources. These will lead to the disagreement in sources. An important task

is to obtain the truth from these sometimes contradictory sources. In the extension

of computing…

There are many applications where the truth is unknown. The truth values are

guessed by different sources. The values of different properties can be obtained from

various sources. These will lead to the disagreement in sources. An important task

is to obtain the truth from these sometimes contradictory sources. In the extension

of computing the truth, the reliability of sources needs to be computed. There are

models which compute the precision values. In those earlier models Banerjee et al.

(2005) Dong and Naumann (2009) Kasneci et al. (2011) Li et al. (2012) Marian and

Wu (2011) Zhao and Han (2012) Zhao et al. (2012), multiple properties are modeled

individually. In one of the existing works, the heterogeneous properties are modeled in

a joined way. In that work, the framework i.e. Conflict Resolution on Heterogeneous

Data (CRH) framework is based on the single objective optimization. Due to the

single objective optimization and non-convex optimization problem, only one local

optimal solution is found. As this is a non-convex optimization problem, the optimal

point depends upon the initial point. This single objective optimization problem is

converted into a multi-objective optimization problem. Due to the multi-objective

optimization problem, the Pareto optimal points are computed. In an extension of

that, the single objective optimization problem is solved with numerous initial points.

The above two approaches are used for finding the solution better than the solution

obtained in the CRH with median as the initial point for the continuous variables and

majority voting as the initial point for the categorical variables. In the experiments,

the solution, coming from the CRH, lies in the Pareto optimal points of the multiobjective

optimization and the solution coming from the CRH is the optimum solution

in these experiments.

ContributorsJain, Karan (Author) / Xue, Guoliang (Thesis advisor) / Sen, Arunabha (Committee member) / Sarwat, Mohamed (Committee member) / Arizona State University (Publisher)

Created2019

Designing a Software Platform for Evaluating Cyber-Attacks on The Electric PowerGrid

Description

Energy management system (EMS) is at the heart of the operation and control of a modern electrical grid. Because of economic, safety, and security reasons, access to industrial grade EMS and real-world power system data is extremely limited. Therefore, the ability to simulate an EMS is invaluable in researching the…

Energy management system (EMS) is at the heart of the operation and control of a modern electrical grid. Because of economic, safety, and security reasons, access to industrial grade EMS and real-world power system data is extremely limited. Therefore, the ability to simulate an EMS is invaluable in researching the EMS in normal and anomalous operating conditions.

I first lay the groundwork for a basic EMS loop simulation in modern power grids and review a class of cybersecurity threats called false data injection (FDI) attacks. Then I propose a software architecture as the basis of software simulation of the EMS loop and explain an actual software platform built using the proposed architecture. I also explain in detail the power analysis libraries used for building the platform with examples and illustrations from the implemented application. Finally, I will use the platform to simulate FDI attacks on two synthetic power system test cases and analyze and visualize the consequences using the capabilities built into the platform.

ContributorsKhodadadeh, Roozbeh (Author) / Sankar, Lalitha (Thesis advisor) / Xue, Guoliang (Thesis advisor) / Kosut, Oliver (Committee member) / Arizona State University (Publisher)

Created2019

Maximizing Routing Throughput with Applications to Delay Tolerant Networks

Description

Many applications require efficient data routing and dissemination in Delay Tolerant Networks (DTNs) in order to maximize the throughput of data in the network, such as providing healthcare to remote communities, and spreading related information in Mobile Social Networks (MSNs). In this thesis, the feasibility of using boats in the…

Many applications require efficient data routing and dissemination in Delay Tolerant Networks (DTNs) in order to maximize the throughput of data in the network, such as providing healthcare to remote communities, and spreading related information in Mobile Social Networks (MSNs). In this thesis, the feasibility of using boats in the Amazon Delta Riverine region as data mule nodes is investigated and a robust data routing algorithm based on a fountain code approach is designed to ensure fast and timely data delivery considering unpredictable boat delays, break-downs, and high transmission failures. Then, the scenario of providing healthcare in Amazon Delta Region is extended to a general All-or-Nothing (Splittable) Multicommodity Flow (ANF) problem and a polynomial time constant approximation algorithm is designed for the maximum throughput routing problem based on a randomized rounding scheme with applications to DTNs. In an MSN, message content is closely related to users’ preferences, and can be used to significantly impact the performance of data dissemination. An interest- and content-based algorithm is developed where the contents of the messages, along with the network structural information are taken into consideration when making message relay decisions in order to maximize data throughput in an MSN. Extensive experiments show the effectiveness of the above proposed data dissemination algorithm by comparing it with state-of-the-art techniques.

ContributorsLiu, Mengxue (Author) / Richa, Andréa W. (Thesis advisor) / Johnson, Thienne (Committee member) / Syrotiuk, Violet R. (Committee member) / Xue, Guoliang (Committee member) / Arizona State University (Publisher)

Created2018

Design, Analysis and Computation in Wireless and Optical Networks

Description

In the realm of network science, many topics can be abstracted as graph problems, such as routing, connectivity enhancement, resource/frequency allocation and so on. Though most of them are NP-hard to solve, heuristics as well as approximation algorithms are proposed to achieve reasonably good results. Accordingly, this dissertation studies graph…

In the realm of network science, many topics can be abstracted as graph problems, such as routing, connectivity enhancement, resource/frequency allocation and so on. Though most of them are NP-hard to solve, heuristics as well as approximation algorithms are proposed to achieve reasonably good results. Accordingly, this dissertation studies graph related problems encountered in real applications. Two problems studied in this dissertation are derived from wireless network, two more problems studied are under scenarios of FIWI and optical network, one more problem is in Radio- Frequency Identification (RFID) domain and the last problem is inspired by satellite deployment.

The objective of most of relay nodes placement problems, is to place the fewest number of relay nodes in the deployment area so that the network, formed by the sensors and the relay nodes, is connected. Under the fixed budget scenario, the expense involved in procuring the minimum number of relay nodes to make the network connected, may exceed the budget. In this dissertation, we study a family of problems whose goal is to design a network with “maximal connectedness” or “minimal disconnectedness”, subject to a fixed budget constraint. Apart from “connectivity”, we also study relay node problem in which degree constraint is considered. The balance of reducing the degree of the network while maximizing communication forms the basis of our d-degree minimum arrangement(d-MA) problem. In this dissertation, we look at several approaches to solving the generalized d-MA problem where we embed a graph onto a subgraph of a given degree.

In recent years, considerable research has been conducted on optical and FIWI networks. Utilizing a recently proposed concept “candidate trees” in optical network, this dissertation studies counting problem on complete graphs. Closed form expressions are given for certain cases and a polynomial counting algorithm for general cases is also presented. Routing plays a major role in FiWi networks. Accordingly to a novel path length metric which emphasizes on “heaviest edge”, this dissertation proposes a polynomial algorithm on single path computation. NP-completeness proof as well as approximation algorithm are presented for multi-path routing.

Radio-frequency identification (RFID) technology is extensively used at present for identification and tracking of a multitude of objects. In many configurations, simultaneous activation of two readers may cause a “reader collision” when tags are present in the intersection of the sensing ranges of both readers. This dissertation ad- dresses slotted time access for Readers and tries to provide a collision-free scheduling scheme while minimizing total reading time.

Finally, this dissertation studies a monitoring problem on the surface of the earth for significant environmental, social/political and extreme events using satellites as sensors. It is assumed that the impact of a significant event spills into neighboring regions and there will be corresponding indicators. Careful deployment of sensors, utilizing “Identifying Codes”, can ensure that even though the number of deployed sensors is fewer than the number of regions, it may be possible to uniquely identify the region where the event has taken place.

ContributorsZhou, Chenyang (Author) / Richa, Andrea (Thesis advisor) / Sen, Arunabha (Thesis advisor) / Xue, Guoliang (Committee member) / Walkowiak, Krzysztof (Committee member) / Arizona State University (Publisher)

Created2019

Mining Data with Feature Interactions

Description

Models using feature interactions have been applied successfully in many areas such as biomedical analysis, recommender systems. The popularity of using feature interactions mainly lies in (1) they are able to capture the nonlinearity of the data compared with linear effects and (2) they enjoy great interpretability. In this thesis,…

Models using feature interactions have been applied successfully in many areas such as biomedical analysis, recommender systems. The popularity of using feature interactions mainly lies in (1) they are able to capture the nonlinearity of the data compared with linear effects and (2) they enjoy great interpretability. In this thesis, I propose a series of formulations using feature interactions for real world problems and develop efficient algorithms for solving them.

Specifically, I first propose to directly solve the non-convex formulation of the weak hierarchical Lasso which imposes weak hierarchy on individual features and interactions but can only be approximately solved by a convex relaxation in existing studies. I further propose to use the non-convex weak hierarchical Lasso formulation for hypothesis testing on the interaction features with hierarchical assumptions. Secondly, I propose a type of bi-linear models that take advantage of interactions of features for drug discovery problems where specific drug-drug pairs or drug-disease pairs are of interest. These models are learned by maximizing the number of positive data pairs that rank above the average score of unlabeled data pairs. Then I generalize the method to the case of using the top-ranked unlabeled data pairs for representative construction and derive an efficient algorithm for the extended formulation. Last but not least, motivated by a special form of bi-linear models, I propose a framework that enables simultaneously subgrouping data points and building specific models on the subgroups for learning on massive and heterogeneous datasets. Experiments on synthetic and real datasets are conducted to demonstrate the effectiveness or efficiency of the proposed methods.

ContributorsLiu, Yashu (Author) / Ye, Jieping (Thesis advisor) / Xue, Guoliang (Thesis advisor) / Liu, Huan (Committee member) / Mittelmann, Hans D (Committee member) / Arizona State University (Publisher)

Created2018

Mobile cloud application framework and offloading strategies

Description

Mobile Cloud computing has shown its capability to support mobile devices for

provisioning computing, storage and communication resources. A distributed mobile

cloud service system called "POEM" is presented to manage the mobile cloud resource

and compose mobile cloud applications. POEM considers resource management not

only between mobile devices and clouds, but also among mobile…

Mobile Cloud computing has shown its capability to support mobile devices for

provisioning computing, storage and communication resources. A distributed mobile

cloud service system called "POEM" is presented to manage the mobile cloud resource

and compose mobile cloud applications. POEM considers resource management not

only between mobile devices and clouds, but also among mobile devices. It implements

both computation offloading and service composition features. The proposed POEM

solution is demonstrated by using OSGi and XMPP techniques.

Offloading is one major type of collaborations between mobile device and cloud

to achieve less execution time and less energy consumption. Offloading decisions for

mobile cloud collaboration involve many decision factors. One of important decision

factors is the network unavailability. This report presents an offloading decision model

that takes network unavailability into consideration. The application execution time

and energy consumption in both ideal network and network with some unavailability

are analyzed. Based on the presented theoretical model, an application partition

algorithm and a decision module are presented to produce an offloading decision that

is resistant to network unavailability.

Existing offloading models mainly focus on the one-to-one offloading relation. To

address the multi-factor and multi-site offloading mobile cloud application scenarios,

a multi-factor multi-site risk-based offloading model is presented, which abstracts the

offloading impact factors as for offloading benefit and offloading risk. The offloading

decision is made based on a comprehensive offloading risk evaluation. This presented

model is generic and expendable. Four offloading impact factors are presented to show

the construction and operation of the presented offloading model, which can be easily

extended to incorporate more factors to make offloading decision more comprehensive.

The overall offloading benefits and risks are aggregated based on the mobile cloud

users' preference.

The offloading topology may change during the whole application life. A set of

algorithms are presented to address the service topology reconfiguration problem in

several mobile cloud representative application scenarios, i.e., they are modeled as

finite horizon scenarios, infinite horizon scenarios, and large state space scenarios to

represent ad hoc, long-term, and large-scale mobile cloud service composition scenarios,

respectively.

ContributorsWu, Huijun (Author) / Huang, Dijiang (Thesis advisor) / Xue, Guoliang (Committee member) / Dasgupta, Partha (Committee member) / Mirchandani, Pitu (Committee member) / Arizona State University (Publisher)

Created2016

Toward customizable multi-tenant SaaS applications

Description

Nowadays, Computing is so pervasive that it has become indeed the 5th utility (after water, electricity, gas, telephony) as Leonard Kleinrock once envisioned. Evolved from utility computing, cloud computing has emerged as a computing infrastructure that enables rapid delivery of computing resources as a utility in a dynamically…

Nowadays, Computing is so pervasive that it has become indeed the 5th utility (after water, electricity, gas, telephony) as Leonard Kleinrock once envisioned. Evolved from utility computing, cloud computing has emerged as a computing infrastructure that enables rapid delivery of computing resources as a utility in a dynamically scalable, virtualized manner. However, the current industrial cloud computing implementations promote segregation among different cloud providers, which leads to user lockdown because of prohibitive migration cost. On the other hand, Service-Orented Computing (SOC) including service-oriented architecture (SOA) and Web Services (WS) promote standardization and openness with its enabling standards and communication protocols. This thesis proposes a Service-Oriented Cloud Computing Architecture by combining the best attributes of the two paradigms to promote an open, interoperable environment for cloud computing development. Mutil-tenancy SaaS applicantions built on top of SOCCA have more flexibility and are not locked down by a certain platform. Tenants residing on a multi-tenant application appear to be the sole owner of the application and not aware of the existence of others. A multi-tenant SaaS application accommodates each tenant’s unique requirements by allowing tenant-level customization. A complex SaaS application that supports hundreds, even thousands of tenants could have hundreds of customization points with each of them providing multiple options, and this could result in a huge number of ways to customize the application. This dissertation also proposes innovative customization approaches, which studies similar tenants’ customization choices and each individual users behaviors, then provides guided semi-automated customization process for the future tenants. A semi-automated customization process could enable tenants to quickly implement the customization that best suits their business needs.

ContributorsSun, Xin (Author) / Tsai, Wei-Tek (Thesis advisor) / Xue, Guoliang (Committee member) / Davulcu, Hasan (Committee member) / Sarjoughian, Hessam S. (Committee member) / Arizona State University (Publisher)

Created2016

Optimal resource allocation in social and critical infrastructure networks

Description

We live in a networked world with a multitude of networks, such as communication networks, electric power grid, transportation networks and water distribution networks, all around us. In addition to such physical (infrastructure) networks, recent years have seen tremendous proliferation of social networks, such as Facebook, Twitter, LinkedIn, Instagram, Google+…

We live in a networked world with a multitude of networks, such as communication networks, electric power grid, transportation networks and water distribution networks, all around us. In addition to such physical (infrastructure) networks, recent years have seen tremendous proliferation of social networks, such as Facebook, Twitter, LinkedIn, Instagram, Google+ and others. These powerful social networks are not only used for harnessing revenue from the infrastructure networks, but are also increasingly being used as “non-conventional sensors” for monitoring the infrastructure networks. Accordingly, nowadays, analyses of social and infrastructure networks go hand-in-hand. This dissertation studies resource allocation problems encountered in this set of diverse, heterogeneous, and interdependent networks. Three problems studied in this dissertation are encountered in the physical network domain while the three other problems studied are encountered in the social network domain.

The first problem from the infrastructure network domain relates to distributed files storage scheme with a goal of enhancing robustness of data storage by making it tolerant against large scale geographically-correlated failures. The second problem relates to placement of relay nodes in a deployment area with multiple sensor nodes with a goal of augmenting connectivity of the resulting network, while staying within the budget specifying the maximum number of relay nodes that can be deployed. The third problem studied in this dissertation relates to complex interdependencies that exist between infrastructure networks, such as power grid and communication network. The progressive recovery problem in an interdependent network is studied whose goal is to maximize system utility over the time when recovery process of failed entities takes place in a sequential manner.

The three problems studied from the social network domain relate to influence propagation in adversarial environment and political sentiment assessment in various states in a country with a goal of creation of a “political heat map” of the country. In the first problem of the influence propagation domain, the goal of the second player is to restrict the influence of the first player, while in the second problem the goal of the second player is to have a larger market share with least amount of initial investment.

ContributorsMazumder, Anisha (Author) / Sen, Arunabha (Thesis advisor) / Richa, Andrea (Committee member) / Xue, Guoliang (Committee member) / Reisslein, Martin (Committee member) / Arizona State University (Publisher)

Created2016

An investigation of machine learning for password evaluation

Description

Passwords are ubiquitous and are poised to stay that way due to their relative usability, security and deployability when compared with alternative authentication schemes. Unfortunately, humans struggle with some of the assumptions or requirements that are necessary for truly strong passwords. As administrators try to push users towards password complexity…

Passwords are ubiquitous and are poised to stay that way due to their relative usability, security and deployability when compared with alternative authentication schemes. Unfortunately, humans struggle with some of the assumptions or requirements that are necessary for truly strong passwords. As administrators try to push users towards password complexity and diversity, users still end up using predictable mangling patterns on old passwords and reusing the same passwords across services; users even inadvertently converge on the same patterns to a surprising degree, making an attacker’s job easier. This work explores using machine learning techniques to pick out strong passwords from weak ones, from a dataset of 10 million passwords, based on how structurally similar they were to the rest of the set.

ContributorsTodd, Margaret Nicole (Author) / Xue, Guoliang (Thesis advisor) / Ahn, Gail-Joon (Committee member) / Huang, Dijiang (Committee member) / Arizona State University (Publisher)

Created2016

Scaling Up Large-scale Sparse Learning and Its Application to Medical Imaging

Description

Large-scale $\ell_1$-regularized loss minimization problems arise in high-dimensional applications such as compressed sensing and high-dimensional supervised learning, including classification and regression problems. In many applications, it remains challenging to apply the sparse learning model to large-scale problems that have massive data samples with high-dimensional features. One popular and promising strategy…

Large-scale $\ell_1$-regularized loss minimization problems arise in high-dimensional applications such as compressed sensing and high-dimensional supervised learning, including classification and regression problems. In many applications, it remains challenging to apply the sparse learning model to large-scale problems that have massive data samples with high-dimensional features. One popular and promising strategy is to scaling up the optimization problem in parallel. Parallel solvers run multiple cores on a shared memory system or a distributed environment to speed up the computation, while the practical usage is limited by the huge dimension in the feature space and synchronization problems.

In this dissertation, I carry out the research along the direction with particular focuses on scaling up the optimization of sparse learning for supervised and unsupervised learning problems. For the supervised learning, I firstly propose an asynchronous parallel solver to optimize the large-scale sparse learning model in a multithreading environment. Moreover, I propose a distributed framework to conduct the learning process when the dataset is distributed stored among different machines. Then the proposed model is further extended to the studies of risk genetic factors for Alzheimer's Disease (AD) among different research institutions, integrating a group feature selection framework to rank the top risk SNPs for AD. For the unsupervised learning problem, I propose a highly efficient solver, termed Stochastic Coordinate Coding (SCC), scaling up the optimization of dictionary learning and sparse coding problems. The common issue for the medical imaging research is that the longitudinal features of patients among different time points are beneficial to study together. To further improve the dictionary learning model, I propose a multi-task dictionary learning method, learning the different task simultaneously and utilizing shared and individual dictionary to encode both consistent and changing imaging features.

ContributorsLi, Qingyang (Author) / Ye, Jieping (Thesis advisor) / Xue, Guoliang (Thesis advisor) / He, Jingrui (Committee member) / Wang, Yalin (Committee member) / Li, Jing (Committee member) / Arizona State University (Publisher)

Created2017

Filtering by