Search Content

Adaptive decentralized routing and detection of overlapping communities

Description

This dissertation studies routing in small-world networks such as grids plus long-range edges and real networks. Kleinberg showed that geography-based greedy routing in a grid-based network takes an expected number of steps polylogarithmic in the network size, thus justifying empirical efficiency observed beginning with Milgram. A counterpart for the grid-based…

This dissertation studies routing in small-world networks such as grids plus long-range edges and real networks. Kleinberg showed that geography-based greedy routing in a grid-based network takes an expected number of steps polylogarithmic in the network size, thus justifying empirical efficiency observed beginning with Milgram. A counterpart for the grid-based model is provided; it creates all edges deterministically and shows an asymptotically matching upper bound on the route length. The main goal is to improve greedy routing through a decentralized machine learning process. Two considered methods are based on weighted majority and an algorithm of de Farias and Megiddo, both learning from feedback using ensembles of experts. Tests are run on both artificial and real networks, with decentralized spectral graph embedding supplying geometric information for real networks where it is not intrinsically available. An important measure analyzed in this work is overpayment, the difference between the cost of the method and that of the shortest path. Adaptive routing overtakes greedy after about a hundred or fewer searches per node, consistently across different network sizes and types. Learning stabilizes, typically at overpayment of a third to a half of that by greedy. The problem is made more difficult by eliminating the knowledge of neighbors' locations or by introducing uncooperative nodes. Even under these conditions, the learned routes are usually better than the greedy routes. The second part of the dissertation is related to the community structure of unannotated networks. A modularity-based algorithm of Newman is extended to work with overlapping communities (including considerably overlapping communities), where each node locally makes decisions to which potential communities it belongs. To measure quality of a cover of overlapping communities, a notion of a node contribution to modularity is introduced, and subsequently the notion of modularity is extended from partitions to covers. The final part considers a problem of network anonymization, mostly by the means of edge deletion. The point of interest is utility preservation. It is shown that a concentration on the preservation of routing abilities might damage the preservation of community structure, and vice versa.

ContributorsBakun, Oleg (Author) / Konjevod, Goran (Thesis advisor) / Richa, Andrea (Thesis advisor) / Syrotiuk, Violet R. (Committee member) / Czygrinow, Andrzej (Committee member) / Arizona State University (Publisher)

Created2011

Covering arrays: algorithms and asymptotics

Description

Modern software and hardware systems are composed of a large number of components. Often different components of a system interact with each other in unforeseen and undesired ways to cause failures. Covering arrays are a useful mathematical tool for testing all possible t-way interactions among the components of a system.
…

Modern software and hardware systems are composed of a large number of components. Often different components of a system interact with each other in unforeseen and undesired ways to cause failures. Covering arrays are a useful mathematical tool for testing all possible t-way interactions among the components of a system.

The two major issues concerning covering arrays are explicit construction of a covering array, and exact or approximate determination of the covering array number---the minimum size of a covering array. Although these problems have been investigated extensively for the last couple of decades, in this thesis we present significant improvements on both of these questions using tools from the probabilistic method and randomized algorithms.

First, a series of improvements is developed on the previously known upper bounds on covering array numbers. An estimate for the discrete Stein-Lovász-Johnson bound is derived and the Stein- Lovász -Johnson bound is improved upon using an alteration strategy. Then group actions on the set of symbols are explored to establish two asymptotic upper bounds on covering array numbers that are tighter than any of the presently known bounds.

Second, an algorithmic paradigm, called the two-stage framework, is introduced for covering array construction. A number of concrete algorithms from this framework are analyzed, and it is shown that they outperform current methods in the range of parameter values that are of practical relevance. In some cases, a reduction in the number of tests by more than 50% is achieved.

Third, the Lovász local lemma is applied on covering perfect hash families to obtain an upper bound on covering array numbers that is tightest of all known bounds. This bound leads to a Moser-Tardos type algorithm that employs linear algebraic computation over finite fields to construct covering arrays. In some cases, this algorithm outperforms currently used methods by more than an 80% margin.

Finally, partial covering arrays are introduced to investigate a few practically relevant relaxations of the covering requirement. Using probabilistic methods, bounds are obtained on partial covering arrays that are significantly smaller than for covering arrays. Also, randomized algorithms are provided that construct such arrays in expected polynomial time.

ContributorsSarakāra, Kauśika (Author) / Colbourn, Charles J. (Thesis advisor) / Czygrinow, Andrzej (Committee member) / Richa, Andréa W. (Committee member) / Syrotiuk, Violet R. (Committee member) / Arizona State University (Publisher)

Created2016

The first-fit algorithm uses many colors on some interval graphs

Description

Graph coloring is about allocating resources that can be shared except where there are certain pairwise conflicts between recipients. The simplest coloring algorithm that attempts to conserve resources is called first fit. Interval graphs are used in models for scheduling (in computer science and operations research) and in biochemistry for…

Graph coloring is about allocating resources that can be shared except where there are certain pairwise conflicts between recipients. The simplest coloring algorithm that attempts to conserve resources is called first fit. Interval graphs are used in models for scheduling (in computer science and operations research) and in biochemistry for one-dimensional molecules such as genetic material. It is not known precisely how much waste in the worst case is due to the first-fit algorithm for coloring interval graphs. However, after decades of research the range is narrow. Kierstead proved that the performance ratio R is at most 40. Pemmaraju, Raman, and Varadarajan proved that R is at most 10. This can be improved to 8. Witsenhausen, and independently Chrobak and Slusarek, proved that R is at least 4. Slusarek improved this to 4.45. Kierstead and Trotter extended the method of Chrobak and Slusarek to one good for a lower bound of 4.99999 or so. The method relies on number sequences with a certain property of order. It is shown here that each sequence considered in the construction satisfies a linear recurrence; that R is at least 5; that the Fibonacci sequence is in some sense minimally useless for the construction; and that the Fibonacci sequence is a point of accumulation in some space for the useful sequences of the construction. Limitations of all earlier constructions are revealed.

ContributorsSmith, David A. (Author) / Kierstead, Henry A. (Thesis advisor) / Czygrinow, Andrzej (Committee member) / Gelb, Anne (Committee member) / Hurlbert, Glenn H. (Committee member) / Kadell, Kevin W. J. (Committee member) / Arizona State University (Publisher)

Created2010

Access Balancing in Storage Systems by Labelling Steiner Systems

Description

A storage system requiring file redundancy and on-line repairability can be represented as a Steiner system, a combinatorial design with the property that every $t$-subset of its points occurs in exactly one of its blocks. Under this representation, files are the points and storage units are the blocks of the…

A storage system requiring file redundancy and on-line repairability can be represented as a Steiner system, a combinatorial design with the property that every $t$-subset of its points occurs in exactly one of its blocks. Under this representation, files are the points and storage units are the blocks of the Steiner system, or vice-versa. Often, the popularities of the files of such storage systems run the gamut, with some files receiving hardly any attention, and others receiving most of it. For such systems, minimizing the difference in the collective popularity between any two storage units is nontrivial; this is the access balancing problem. With regard to the representative Steiner system, the access balancing problem in its simplest form amounts to constructing either a point or block labelling: an assignment of a set of integer labels (popularity ranks) to the Steiner system's point set or block set, respectively, requiring of the former assignment that the sums of the labelled points of any two blocks differ as little as possible and of the latter that the sums of the labels assigned to the containing blocks of any two distinct points differ as little as possible. The central aim of this dissertation is to supply point and block labellings for Steiner systems of block size greater than three, for which up to this point no attempt has been made. Four major results are given in this connection. First, motivated by the close connection between the size of the independent sets of a Steiner system and the quality of its labellings, a Steiner triple system of any admissible order is constructed with a pair of disjoint independent sets of maximum cardinality. Second, the spectrum of resolvable Bose triple systems is determined in order to label some Steiner 2-designs with block size four. Third, several kinds of independent sets are used to point-label Steiner 2-designs with block size four. Finally, optimal and close to optimal block labellings are given for an infinite class of 1-rotational resolvable Steiner 2-designs with arbitrarily large block size by exploiting their underlying group-theoretic properties.

ContributorsLusi, Dylan (Author) / Colbourn, Charles J (Thesis advisor) / Czygrinow, Andrzej (Committee member) / Fainekos, Georgios (Committee member) / Richa, Andrea (Committee member) / Arizona State University (Publisher)

Created2021

Hash Families and Applications to t-Restrictions

Description

The construction of many families of combinatorial objects remains a challenging problem. A t-restriction is an array where a predicate is satisfied for every t columns; an example is a perfect hash family (PHF). The composition of a PHF and any t-restriction satisfying predicate P yields another t-restriction also satisfying…

The construction of many families of combinatorial objects remains a challenging problem. A t-restriction is an array where a predicate is satisfied for every t columns; an example is a perfect hash family (PHF). The composition of a PHF and any t-restriction satisfying predicate P yields another t-restriction also satisfying P with more columns than the original t-restriction had. This thesis concerns three approaches in determining the smallest size of PHFs.

Firstly, hash families in which the associated property is satisfied at least some number lambda times are considered, called higher-index, which guarantees redundancy when constructing t-restrictions. Some direct and optimal constructions of hash families of higher index are given. A new recursive construction is established that generalizes previous results and generates higher-index PHFs with more columns. Probabilistic methods are employed to obtain an upper bound on the optimal size of higher-index PHFs when the number of columns is large. A new deterministic algorithm is developed that generates such PHFs meeting this bound, and computational results are reported.

Secondly, a restriction on the structure of PHFs is introduced, called fractal, a method from Blackburn. His method is extended in several ways; from homogeneous hash families (every row has the same number of symbols) to heterogeneous ones; and to distributing hash families, a relaxation of the predicate for PHFs. Recursive constructions with fractal hash families as ingredients are given, and improve upon on the best-known sizes of many PHFs.

Thirdly, a method of Colbourn and Lanus is extended in which they horizontally copied a given hash family and greedily applied transformations to each copy. Transformations of existential t-restrictions are introduced, which allow for the method to be applicable to any t-restriction having structure like those of hash families. A genetic algorithm is employed for finding the "best" such transformations. Computational results of the GA are reported using PHFs, as the number of transformations permitted is large compared to the number of symbols. Finally, an analysis is given of what trade-offs exist between computation time and the number of t-sets left not satisfying the predicate.

ContributorsDougherty, Ryan Edward (Author) / Colbourn, Charles J (Thesis advisor) / Czygrinow, Andrzej (Committee member) / Forrest, Stephanie (Committee member) / Richa, Andrea (Committee member) / Arizona State University (Publisher)

Created2019

Filtering by

Adaptive decentralized routing and detection of overlapping communities

Covering arrays: algorithms and asymptotics

The first-fit algorithm uses many colors on some interval graphs

Access Balancing in Storage Systems by Labelling Steiner Systems

Hash Families and Applications to t-Restrictions