Search Content

Correlation based tools for analysis of dynamic networks

Description

Time series analysis of dynamic networks is an important area of study that helps in predicting changes in networks. Changes in networks are used to analyze deviations in the network characteristics. This analysis helps in characterizing any network that has dynamic behavior. This area of study has applications in many…

Time series analysis of dynamic networks is an important area of study that helps in predicting changes in networks. Changes in networks are used to analyze deviations in the network characteristics. This analysis helps in characterizing any network that has dynamic behavior. This area of study has applications in many domains such as communication networks, climate networks, social networks, transportation networks, and biological networks. The aim of this research is to analyze the structural characteristics of such dynamic networks. This thesis examines tools that help to analyze the structure of the networks and explores a technique for computation and analysis of a large climate dataset. The computations for analyzing the structural characteristics are done in a computing cluster and there is a linear speed up in computation time compared to a single-core computer. As an application, a large sea ice concentration anomaly dataset is analyzed. The large dataset is used to construct a correlation based graph. The results suggest that the climate data has the characteristics of a small-world graph.

ContributorsParamasivam, Kumaraguru (Author) / Colbourn, Charles J (Thesis advisor) / Sen, Arunabhas (Committee member) / Syrotiuk, Violet R. (Committee member) / Arizona State University (Publisher)

Created2011

Sparse learning package with stability selection and application to alzheimer's disease

Description

Sparse learning is a technique in machine learning for feature selection and dimensionality reduction, to find a sparse set of the most relevant features. In any machine learning problem, there is a considerable amount of irrelevant information, and separating relevant information from the irrelevant information has been a topic of…

Sparse learning is a technique in machine learning for feature selection and dimensionality reduction, to find a sparse set of the most relevant features. In any machine learning problem, there is a considerable amount of irrelevant information, and separating relevant information from the irrelevant information has been a topic of focus. In supervised learning like regression, the data consists of many features and only a subset of the features may be responsible for the result. Also, the features might require special structural requirements, which introduces additional complexity for feature selection. The sparse learning package, provides a set of algorithms for learning a sparse set of the most relevant features for both regression and classification problems. Structural dependencies among features which introduce additional requirements are also provided as part of the package. The features may be grouped together, and there may exist hierarchies and over- lapping groups among these, and there may be requirements for selecting the most relevant groups among them. In spite of getting sparse solutions, the solutions are not guaranteed to be robust. For the selection to be robust, there are certain techniques which provide theoretical justification of why certain features are selected. The stability selection, is a method for feature selection which allows the use of existing sparse learning methods to select the stable set of features for a given training sample. This is done by assigning probabilities for the features: by sub-sampling the training data and using a specific sparse learning technique to learn the relevant features, and repeating this a large number of times, and counting the probability as the number of times a feature is selected. Cross-validation which is used to determine the best parameter value over a range of values, further allows to select the best parameter value. This is done by selecting the parameter value which gives the maximum accuracy score. With such a combination of algorithms, with good convergence guarantees, stable feature selection properties and the inclusion of various structural dependencies among features, the sparse learning package will be a powerful tool for machine learning research. Modular structure, C implementation, ATLAS integration for fast linear algebraic subroutines, make it one of the best tool for a large sparse setting. The varied collection of algorithms, support for group sparsity, batch algorithms, are a few of the notable functionality of the SLEP package, and these features can be used in a variety of fields to infer relevant elements. The Alzheimer Disease(AD) is a neurodegenerative disease, which gradually leads to dementia. The SLEP package is used for feature selection for getting the most relevant biomarkers from the available AD dataset, and the results show that, indeed, only a subset of the features are required to gain valuable insights.

ContributorsThulasiram, Ramesh (Author) / Ye, Jieping (Thesis advisor) / Xue, Guoliang (Committee member) / Sen, Arunabha (Committee member) / Arizona State University (Publisher)

Created2011

Multi-task learning via structured regularization: formulations, algorithms, and applications

Description

Multi-task learning (MTL) aims to improve the generalization performance (of the resulting classifiers) by learning multiple related tasks simultaneously. Specifically, MTL exploits the intrinsic task relatedness, based on which the informative domain knowledge from each task can be shared across multiple tasks and thus facilitate the individual task learning. It…

Multi-task learning (MTL) aims to improve the generalization performance (of the resulting classifiers) by learning multiple related tasks simultaneously. Specifically, MTL exploits the intrinsic task relatedness, based on which the informative domain knowledge from each task can be shared across multiple tasks and thus facilitate the individual task learning. It is particularly desirable to share the domain knowledge (among the tasks) when there are a number of related tasks but only limited training data is available for each task. Modeling the relationship of multiple tasks is critical to the generalization performance of the MTL algorithms. In this dissertation, I propose a series of MTL approaches which assume that multiple tasks are intrinsically related via a shared low-dimensional feature space. The proposed MTL approaches are developed to deal with different scenarios and settings; they are respectively formulated as mathematical optimization problems of minimizing the empirical loss regularized by different structures. For all proposed MTL formulations, I develop the associated optimization algorithms to find their globally optimal solution efficiently. I also conduct theoretical analysis for certain MTL approaches by deriving the globally optimal solution recovery condition and the performance bound. To demonstrate the practical performance, I apply the proposed MTL approaches on different real-world applications: (1) Automated annotation of the Drosophila gene expression pattern images; (2) Categorization of the Yahoo web pages. Our experimental results demonstrate the efficiency and effectiveness of the proposed algorithms.

ContributorsChen, Jianhui (Author) / Ye, Jieping (Thesis advisor) / Kumar, Sudhir (Committee member) / Liu, Huan (Committee member) / Xue, Guoliang (Committee member) / Arizona State University (Publisher)

Created2011

Adaptive decentralized routing and detection of overlapping communities

Description

This dissertation studies routing in small-world networks such as grids plus long-range edges and real networks. Kleinberg showed that geography-based greedy routing in a grid-based network takes an expected number of steps polylogarithmic in the network size, thus justifying empirical efficiency observed beginning with Milgram. A counterpart for the grid-based…

This dissertation studies routing in small-world networks such as grids plus long-range edges and real networks. Kleinberg showed that geography-based greedy routing in a grid-based network takes an expected number of steps polylogarithmic in the network size, thus justifying empirical efficiency observed beginning with Milgram. A counterpart for the grid-based model is provided; it creates all edges deterministically and shows an asymptotically matching upper bound on the route length. The main goal is to improve greedy routing through a decentralized machine learning process. Two considered methods are based on weighted majority and an algorithm of de Farias and Megiddo, both learning from feedback using ensembles of experts. Tests are run on both artificial and real networks, with decentralized spectral graph embedding supplying geometric information for real networks where it is not intrinsically available. An important measure analyzed in this work is overpayment, the difference between the cost of the method and that of the shortest path. Adaptive routing overtakes greedy after about a hundred or fewer searches per node, consistently across different network sizes and types. Learning stabilizes, typically at overpayment of a third to a half of that by greedy. The problem is made more difficult by eliminating the knowledge of neighbors' locations or by introducing uncooperative nodes. Even under these conditions, the learned routes are usually better than the greedy routes. The second part of the dissertation is related to the community structure of unannotated networks. A modularity-based algorithm of Newman is extended to work with overlapping communities (including considerably overlapping communities), where each node locally makes decisions to which potential communities it belongs. To measure quality of a cover of overlapping communities, a notion of a node contribution to modularity is introduced, and subsequently the notion of modularity is extended from partitions to covers. The final part considers a problem of network anonymization, mostly by the means of edge deletion. The point of interest is utility preservation. It is shown that a concentration on the preservation of routing abilities might damage the preservation of community structure, and vice versa.

ContributorsBakun, Oleg (Author) / Konjevod, Goran (Thesis advisor) / Richa, Andrea (Thesis advisor) / Syrotiuk, Violet R. (Committee member) / Czygrinow, Andrzej (Committee member) / Arizona State University (Publisher)

Created2011

Structured sparse learning and its applications to biomedical and biological data

Description

Sparsity has become an important modeling tool in areas such as genetics, signal and audio processing, medical image processing, etc. Via the penalization of l-1 norm based regularization, the structured sparse learning algorithms can produce highly accurate models while imposing various predefined structures on the data, such as feature groups…

Sparsity has become an important modeling tool in areas such as genetics, signal and audio processing, medical image processing, etc. Via the penalization of l-1 norm based regularization, the structured sparse learning algorithms can produce highly accurate models while imposing various predefined structures on the data, such as feature groups or graphs. In this thesis, I first propose to solve a sparse learning model with a general group structure, where the predefined groups may overlap with each other. Then, I present three real world applications which can benefit from the group structured sparse learning technique. In the first application, I study the Alzheimer's Disease diagnosis problem using multi-modality neuroimaging data. In this dataset, not every subject has all data sources available, exhibiting an unique and challenging block-wise missing pattern. In the second application, I study the automatic annotation and retrieval of fruit-fly gene expression pattern images. Combined with the spatial information, sparse learning techniques can be used to construct effective representation of the expression images. In the third application, I present a new computational approach to annotate developmental stage for Drosophila embryos in the gene expression images. In addition, it provides a stage score that enables one to more finely annotate each embryo so that they are divided into early and late periods of development within standard stage demarcations. Stage scores help us to illuminate global gene activities and changes much better, and more refined stage annotations improve our ability to better interpret results when expression pattern matches are discovered between genes.

ContributorsYuan, Lei (Author) / Ye, Jieping (Thesis advisor) / Wang, Yalin (Committee member) / Xue, Guoliang (Committee member) / Kumar, Sudhir (Committee member) / Arizona State University (Publisher)

Created2013

A new backoff strategy using topological persistence in wireless networks

Description

Contention based IEEE 802.11MAC uses the binary exponential backoff algorithm (BEB) for the contention resolution. The protocol suffers poor performance in the heavily loaded networks and MANETs, high collision rate and packet drops, probabilistic delay guarantees, and unfairness. Many backoff strategies were proposed to improve the performance of IEEE 802.11…

Contention based IEEE 802.11MAC uses the binary exponential backoff algorithm (BEB) for the contention resolution. The protocol suffers poor performance in the heavily loaded networks and MANETs, high collision rate and packet drops, probabilistic delay guarantees, and unfairness. Many backoff strategies were proposed to improve the performance of IEEE 802.11 but all ignore the network topology and demand. Persistence is defined as the fraction of time a node is allowed to transmit, when this allowance should take into account topology and load, it is topology and load aware persistence (TLA). We develop a relation between contention window size and the TLA-persistence. We implement a new backoff strategy where the TLA-persistence is defined as the lexicographic max-min channel allocation. We use a centralized algorithm to calculate each node's TLApersistence and then convert it into a contention window size. The new backoff strategy is evaluated in simulation, comparing with that of the IEEE 802.11 using BEB. In most of the static scenarios like exposed terminal, flow in the middle, star topology, and heavy loaded multi-hop networks and in MANETs, through the simulation study, we show that the new backoff strategy achieves higher overall average throughput as compared to that of the IEEE 802.11 using BEB.

ContributorsBhyravajosyula, Sai Vishnu Kiran (Author) / Syrotiuk, Violet R. (Thesis advisor) / Sen, Arunabha (Committee member) / Richa, Andrea (Committee member) / Arizona State University (Publisher)

Created2013

Scheduled medium access control in mobile ad hoc networks

Description

The primary function of the medium access control (MAC) protocol is managing access to a shared communication channel. From the viewpoint of transmitters, the MAC protocol determines each transmitter's persistence, the fraction of time it is permitted to spend transmitting. Schedule-based schemes implement stable persistences, achieving low variation in delay…

The primary function of the medium access control (MAC) protocol is managing access to a shared communication channel. From the viewpoint of transmitters, the MAC protocol determines each transmitter's persistence, the fraction of time it is permitted to spend transmitting. Schedule-based schemes implement stable persistences, achieving low variation in delay and throughput, and sometimes bounding maximum delay. However, they adapt slowly, if at all, to changes in the network. Contention-based schemes are agile, adapting quickly to changes in perceived contention, but suffer from short-term unfairness, large variations in packet delay, and poor performance at high load. The perfect MAC protocol, it seems, embodies the strengths of both contention- and schedule-based approaches while avoiding their weaknesses. This thesis culminates in the design of a Variable-Weight and Adaptive Topology Transparent (VWATT) MAC protocol. The design of VWATT first required answers for two questions: (1) If a node is equipped with schedules of different weights, which weight should it employ? (2) How is the node to compute the desired weight in a network lacking centralized control? The first question is answered by the Topology- and Load-Aware (TLA) allocation which defines target persistences that conform to both network topology and traffic load. Simulations show the TLA allocation to outperform IEEE 802.11, improving on the expectation and variation of delay, throughput, and drop rate. The second question is answered in the design of an Adaptive Topology- and Load-Aware Scheduled (ATLAS) MAC that computes the TLA allocation in a decentralized and adaptive manner. Simulation results show that ATLAS converges quickly on the TLA allocation, supporting highly dynamic networks. With these questions answered, a construction based on transversal designs is given for a variable-weight topology transparent schedule that allows nodes to dynamically and independently select weights to accommodate local topology and traffic load. The schedule maintains a guarantee on maximum delay when the maximum neighbourhood size is not too large. The schedule is integrated with the distributed computation of ATLAS to create VWATT. Simulations indicate that VWATT offers the stable performance characteristics of a scheduled MAC while adapting quickly to changes in topology and traffic load.

ContributorsLutz, Jonathan (Author) / Colbourn, Charles J (Thesis advisor) / Syrotiuk, Violet R. (Thesis advisor) / Konjevod, Goran (Committee member) / Lloyd, Errol L. (Committee member) / Arizona State University (Publisher)

Created2013

Optimization for resource-constrained wireless networks

Description

Nowadays, wireless communications and networks have been widely used in our daily lives. One of the most important topics related to networking research is using optimization tools to improve the utilization of network resources. In this dissertation, we concentrate on optimization for resource-constrained wireless networks, and study two fundamental resource-allocation…

Nowadays, wireless communications and networks have been widely used in our daily lives. One of the most important topics related to networking research is using optimization tools to improve the utilization of network resources. In this dissertation, we concentrate on optimization for resource-constrained wireless networks, and study two fundamental resource-allocation problems: 1) distributed routing optimization and 2) anypath routing optimization. The study on the distributed routing optimization problem is composed of two main thrusts, targeted at understanding distributed routing and resource optimization for multihop wireless networks. The first thrust is dedicated to understanding the impact of full-duplex transmission on wireless network resource optimization. We propose two provably good distributed algorithms to optimize the resources in a full-duplex wireless network. We prove their optimality and also provide network status analysis using dual space information. The second thrust is dedicated to understanding the influence of network entity load constraints on network resource allocation and routing computation. We propose a provably good distributed algorithm to allocate wireless resources. In addition, we propose a new subgradient optimization framework, which can provide findgrained convergence, optimality, and dual space information at each iteration. This framework can provide a useful theoretical foundation for many networking optimization problems. The study on the anypath routing optimization problem is composed of two main thrusts. The first thrust is dedicated to understanding the computational complexity of multi-constrained anypath routing and designing approximate solutions. We prove that this problem is NP-hard when the number of constraints is larger than one. We present two polynomial time K-approximation algorithms. One is a centralized algorithm while the other one is a distributed algorithm. For the second thrust, we study directional anypath routing and present a cross-layer design of MAC and routing. For the MAC layer, we present a directional anycast MAC. For the routing layer, we propose two polynomial time routing algorithms to compute directional anypaths based on two antenna models, and prove their ptimality based on the packet delivery ratio metric.

ContributorsFang, Xi (Author) / Xue, Guoliang (Thesis advisor) / Yau, Sik-Sang (Committee member) / Ye, Jieping (Committee member) / Zhang, Junshan (Committee member) / Arizona State University (Publisher)

Created2013

Design, analysis and resource allocations in networks in presence of region-based faults

Description

Communication networks, both wired and wireless, are expected to have a certain level of fault-tolerance capability.These networks are also expected to ensure a graceful degradation in performance when some of the network components fail. Traditional studies on fault tolerance in communication networks, for the most part, make no assumptions regarding…

Communication networks, both wired and wireless, are expected to have a certain level of fault-tolerance capability.These networks are also expected to ensure a graceful degradation in performance when some of the network components fail. Traditional studies on fault tolerance in communication networks, for the most part, make no assumptions regarding the location of node/link faults, i.e., the faulty nodes and links may be close to each other or far from each other. However, in many real life scenarios, there exists a strong spatial correlation among the faulty nodes and links. Such failures are often encountered in disaster situations, e.g., natural calamities or enemy attacks. In presence of such region-based faults, many of traditional network analysis and fault-tolerant metrics, that are valid under non-spatially correlated faults, are no longer applicable. To this effect, the main thrust of this research is design and analysis of robust networks in presence of such region-based faults. One important finding of this research is that if some prior knowledge is available on the maximum size of the region that might be affected due to a region-based fault, this piece of knowledge can be effectively utilized for resource efficient design of networks. It has been shown in this dissertation that in some scenarios, effective utilization of this knowledge may result in substantial saving is transmission power in wireless networks. In this dissertation, the impact of region-based faults on the connectivity of wireless networks has been studied and a new metric, region-based connectivity, is proposed to measure the fault-tolerance capability of a network. In addition, novel metrics, such as the region-based component decomposition number(RBCDN) and region-based largest component size(RBLCS) have been proposed to capture the network state, when a region-based fault disconnects the network. Finally, this dissertation presents efficient resource allocation techniques that ensure tolerance against region-based faults, in distributed file storage networks and data center networks.

ContributorsBanerjee, Sujogya (Author) / Sen, Arunabha (Thesis advisor) / Xue, Guoliang (Committee member) / Richa, Andrea (Committee member) / Hurlbert, Glenn (Committee member) / Arizona State University (Publisher)

Created2013

Coping with selfish behavior in networks using game theory

Description

While network problems have been addressed using a central administrative domain with a single objective, the devices in most networks are actually not owned by a single entity but by many individual entities. These entities make their decisions independently and selfishly, and maybe cooperate with a small group of other…

While network problems have been addressed using a central administrative domain with a single objective, the devices in most networks are actually not owned by a single entity but by many individual entities. These entities make their decisions independently and selfishly, and maybe cooperate with a small group of other entities only when this form of coalition yields a better return. The interaction among multiple independent decision-makers necessitates the use of game theory, including economic notions related to markets and incentives. In this dissertation, we are interested in modeling, analyzing, addressing network problems caused by the selfish behavior of network entities. First, we study how the selfish behavior of network entities affects the system performance while users are competing for limited resource. For this resource allocation domain, we aim to study the selfish routing problem in networks with fair queuing on links, the relay assignment problem in cooperative networks, and the channel allocation problem in wireless networks. Another important aspect of this dissertation is the study of designing efficient mechanisms to incentivize network entities to achieve certain system objective. For this incentive mechanism domain, we aim to motivate wireless devices to serve as relays for cooperative communication, and to recruit smartphones for crowdsourcing. In addition, we apply different game theoretic approaches to problems in security and privacy domain. For this domain, we aim to analyze how a user could defend against a smart jammer, who can quickly learn about the user's transmission power. We also design mechanisms to encourage mobile phone users to participate in location privacy protection, in order to achieve k-anonymity.

ContributorsYang, Dejun (Author) / Xue, Guoliang (Thesis advisor) / Richa, Andrea (Committee member) / Sen, Arunabha (Committee member) / Zhang, Junshan (Committee member) / Arizona State University (Publisher)

Created2013

Filtering by