Adaptive decentralized routing and detection of overlapping communities

Document
Description
This dissertation studies routing in small-world networks such as grids plus long-range edges and real networks. Kleinberg showed that geography-based greedy routing in a grid-based network takes an expected number of steps polylogarithmic in the network size, thus justifying empirical

This dissertation studies routing in small-world networks such as grids plus long-range edges and real networks. Kleinberg showed that geography-based greedy routing in a grid-based network takes an expected number of steps polylogarithmic in the network size, thus justifying empirical efficiency observed beginning with Milgram. A counterpart for the grid-based model is provided; it creates all edges deterministically and shows an asymptotically matching upper bound on the route length. The main goal is to improve greedy routing through a decentralized machine learning process. Two considered methods are based on weighted majority and an algorithm of de Farias and Megiddo, both learning from feedback using ensembles of experts. Tests are run on both artificial and real networks, with decentralized spectral graph embedding supplying geometric information for real networks where it is not intrinsically available. An important measure analyzed in this work is overpayment, the difference between the cost of the method and that of the shortest path. Adaptive routing overtakes greedy after about a hundred or fewer searches per node, consistently across different network sizes and types. Learning stabilizes, typically at overpayment of a third to a half of that by greedy. The problem is made more difficult by eliminating the knowledge of neighbors' locations or by introducing uncooperative nodes. Even under these conditions, the learned routes are usually better than the greedy routes. The second part of the dissertation is related to the community structure of unannotated networks. A modularity-based algorithm of Newman is extended to work with overlapping communities (including considerably overlapping communities), where each node locally makes decisions to which potential communities it belongs. To measure quality of a cover of overlapping communities, a notion of a node contribution to modularity is introduced, and subsequently the notion of modularity is extended from partitions to covers. The final part considers a problem of network anonymization, mostly by the means of edge deletion. The point of interest is utility preservation. It is shown that a concentration on the preservation of routing abilities might damage the preservation of community structure, and vice versa.