Matching Items (14)

150111-Thumbnail Image.png

Post-optimization: necessity analysis for combinatorial arrays

Description

Finding the optimal solution to a problem with an enormous search space can be challenging. Unless a combinatorial construction technique is found that also guarantees the optimality of the resulting

Finding the optimal solution to a problem with an enormous search space can be challenging. Unless a combinatorial construction technique is found that also guarantees the optimality of the resulting solution, this could be an infeasible task. If such a technique is unavailable, different heuristic methods are generally used to improve the upper bound on the size of the optimal solution. This dissertation presents an alternative method which can be used to improve a solution to a problem rather than construct a solution from scratch. Necessity analysis, which is the key to this approach, is the process of analyzing the necessity of each element in a solution. The post-optimization algorithm presented here utilizes the result of the necessity analysis to improve the quality of the solution by eliminating unnecessary objects from the solution. While this technique could potentially be applied to different domains, this dissertation focuses on k-restriction problems, where a solution to the problem can be presented as an array. A scalable post-optimization algorithm for covering arrays is described, which starts from a valid solution and performs necessity analysis to iteratively improve the quality of the solution. It is shown that not only can this technique improve upon the previously best known results, it can also be added as a refinement step to any construction technique and in most cases further improvements are expected. The post-optimization algorithm is then modified to accommodate every k-restriction problem; and this generic algorithm can be used as a starting point to create a reasonable sized solution for any such problem. This generic algorithm is then further refined for hash family problems, by adding a conflict graph analysis to the necessity analysis phase. By recoloring the conflict graphs a new degree of flexibility is explored, which can further improve the quality of the solution.

Contributors

Agent

Created

Date Created
  • 2011

150114-Thumbnail Image.png

A computational framework to model and learn context-specific gene regulatory networks from multi-source data

Description

Reverse engineering gene regulatory networks (GRNs) is an important problem in the domain of Systems Biology. Learning GRNs is challenging due to the inherent complexity of the real regulatory networks

Reverse engineering gene regulatory networks (GRNs) is an important problem in the domain of Systems Biology. Learning GRNs is challenging due to the inherent complexity of the real regulatory networks and the heterogeneity of samples in available biomedical data. Real world biological data are commonly collected from broad surveys (profiling studies) and aggregate highly heterogeneous biological samples. Popular methods to learn GRNs simplistically assume a single universal regulatory network corresponding to available data. They neglect regulatory network adaptation due to change in underlying conditions and cellular phenotype or both. This dissertation presents a novel computational framework to learn common regulatory interactions and networks underlying the different sets of relatively homogeneous samples from real world biological data. The characteristic set of samples/conditions and corresponding regulatory interactions defines the cellular context (context). Context, in this dissertation, represents the deterministic transcriptional activity within the specific cellular regulatory mechanism. The major contributions of this framework include - modeling and learning context specific GRNs; associating enriched samples with contexts to interpret contextual interactions using biological knowledge; pruning extraneous edges from the context-specific GRN to improve the precision of the final GRNs; integrating multisource data to learn inter and intra domain interactions and increase confidence in obtained GRNs; and finally, learning combinatorial conditioning factors from the data to identify regulatory cofactors. The framework, Expattern, was applied to both real world and synthetic data. Interesting insights were obtained into mechanism of action of drugs on analysis of NCI60 drug activity and gene expression data. Application to refractory cancer data and Glioblastoma multiforme yield GRNs that were readily annotated with context-specific phenotypic information. Refractory cancer GRNs also displayed associations between distinct cancers, not observed through only clustering. Performance comparisons on multi-context synthetic data show the framework Expattern performs better than other comparable methods.

Contributors

Agent

Created

Date Created
  • 2011

152172-Thumbnail Image.png

Scheduled medium access control in mobile ad hoc networks

Description

The primary function of the medium access control (MAC) protocol is managing access to a shared communication channel. From the viewpoint of transmitters, the MAC protocol determines each transmitter's persistence,

The primary function of the medium access control (MAC) protocol is managing access to a shared communication channel. From the viewpoint of transmitters, the MAC protocol determines each transmitter's persistence, the fraction of time it is permitted to spend transmitting. Schedule-based schemes implement stable persistences, achieving low variation in delay and throughput, and sometimes bounding maximum delay. However, they adapt slowly, if at all, to changes in the network. Contention-based schemes are agile, adapting quickly to changes in perceived contention, but suffer from short-term unfairness, large variations in packet delay, and poor performance at high load. The perfect MAC protocol, it seems, embodies the strengths of both contention- and schedule-based approaches while avoiding their weaknesses. This thesis culminates in the design of a Variable-Weight and Adaptive Topology Transparent (VWATT) MAC protocol. The design of VWATT first required answers for two questions: (1) If a node is equipped with schedules of different weights, which weight should it employ? (2) How is the node to compute the desired weight in a network lacking centralized control? The first question is answered by the Topology- and Load-Aware (TLA) allocation which defines target persistences that conform to both network topology and traffic load. Simulations show the TLA allocation to outperform IEEE 802.11, improving on the expectation and variation of delay, throughput, and drop rate. The second question is answered in the design of an Adaptive Topology- and Load-Aware Scheduled (ATLAS) MAC that computes the TLA allocation in a decentralized and adaptive manner. Simulation results show that ATLAS converges quickly on the TLA allocation, supporting highly dynamic networks. With these questions answered, a construction based on transversal designs is given for a variable-weight topology transparent schedule that allows nodes to dynamically and independently select weights to accommodate local topology and traffic load. The schedule maintains a guarantee on maximum delay when the maximum neighbourhood size is not too large. The schedule is integrated with the distributed computation of ATLAS to create VWATT. Simulations indicate that VWATT offers the stable performance characteristics of a scheduled MAC while adapting quickly to changes in topology and traffic load.

Contributors

Agent

Created

Date Created
  • 2013

151063-Thumbnail Image.png

Robust and efficient medium access despite jamming

Description

Interference constitutes a major challenge for communication networks operating over a shared medium where availability is imperative. This dissertation studies the problem of designing and analyzing efficient medium access protocols

Interference constitutes a major challenge for communication networks operating over a shared medium where availability is imperative. This dissertation studies the problem of designing and analyzing efficient medium access protocols which are robust against strong adversarial jamming. More specifically, four medium access (MAC) protocols (i.e., JADE, ANTIJAM, COMAC, and SINRMAC) which aim to achieve high throughput despite jamming activities under a variety of network and adversary models are presented. We also propose a self-stabilizing leader election protocol, SELECT, that can effectively elect a leader in the network with the existence of a strong adversary. Our protocols can not only deal with internal interference without the exact knowledge on the number of participants in the network, but they are also robust to unintentional or intentional external interference, e.g., due to co-existing networks or jammers. We model the external interference by a powerful adaptive and/or reactive adversary which can jam a (1 − ε)-portion of the time steps, where 0 < ε ≤ 1 is an arbitrary constant. We allow the adversary to be adaptive and to have complete knowledge of the entire protocol history. Moreover, in case the adversary is also reactive, it uses carrier sensing to make informed decisions to disrupt communications. Among the proposed protocols, JADE, ANTIJAM and COMAC are able to achieve Θ(1)-competitive throughput with the presence of the strong adversary; while SINRMAC is the first attempt to apply SINR model (i.e., Signal to Interference plus Noise Ratio), in robust medium access protocols design; the derived principles are also useful to build applications on top of the MAC layer, and we present SELECT, which is an exemplary study for leader election, which is one of the most fundamental tasks in distributed computing.

Contributors

Agent

Created

Date Created
  • 2012

156648-Thumbnail Image.png

Maximizing Routing Throughput with Applications to Delay Tolerant Networks

Description

Many applications require efficient data routing and dissemination in Delay Tolerant Networks (DTNs) in order to maximize the throughput of data in the network, such as providing healthcare to remote

Many applications require efficient data routing and dissemination in Delay Tolerant Networks (DTNs) in order to maximize the throughput of data in the network, such as providing healthcare to remote communities, and spreading related information in Mobile Social Networks (MSNs). In this thesis, the feasibility of using boats in the Amazon Delta Riverine region as data mule nodes is investigated and a robust data routing algorithm based on a fountain code approach is designed to ensure fast and timely data delivery considering unpredictable boat delays, break-downs, and high transmission failures. Then, the scenario of providing healthcare in Amazon Delta Region is extended to a general All-or-Nothing (Splittable) Multicommodity Flow (ANF) problem and a polynomial time constant approximation algorithm is designed for the maximum throughput routing problem based on a randomized rounding scheme with applications to DTNs. In an MSN, message content is closely related to users’ preferences, and can be used to significantly impact the performance of data dissemination. An interest- and content-based algorithm is developed where the contents of the messages, along with the network structural information are taken into consideration when making message relay decisions in order to maximize data throughput in an MSN. Extensive experiments show the effectiveness of the above proposed data dissemination algorithm by comparing it with state-of-the-art techniques.

Contributors

Agent

Created

Date Created
  • 2018

152849-Thumbnail Image.png

Infinite cacheflow: a rule-caching solution for software defined networks

Description

New OpenFlow switches support a wide range of network applications, such as firewalls, load balancers, routers, and traffic monitoring. While ternary content addressable memory (TCAM) allows switches to process packets

New OpenFlow switches support a wide range of network applications, such as firewalls, load balancers, routers, and traffic monitoring. While ternary content addressable memory (TCAM) allows switches to process packets at high speed based on multiple header fields, today's commodity switches support just thousands to tens of thousands of forwarding rules. To allow for finer-grained policies on this hardware, efficient ways to support the abstraction of a switch are needed with arbitrarily large rule tables. To do so, a hardware-software hybrid switch is designed that relies on rule caching to provide large rule tables at low cost. Unlike traditional caching solutions, neither individual rules are cached (to respect rule dependencies) nor compressed (to preserve the per-rule traffic counts). Instead long dependency chains are ``spliced'' to cache smaller groups of rules while preserving the semantics of the network policy. The proposed hybrid switch design satisfies three criteria: (1) responsiveness, to allow rapid changes to the cache with minimal effect on traffic throughput; (2) transparency, to faithfully support native OpenFlow semantics; (3) correctness, to cache rules while preserving the semantics of the original policy. The evaluation of the hybrid switch on large rule tables suggest that it can effectively expose the benefits of both hardware and software switches to the controller and to applications running on top of it.

Contributors

Agent

Created

Date Created
  • 2014

155047-Thumbnail Image.png

Covering arrays: algorithms and asymptotics

Description

Modern software and hardware systems are composed of a large number of components. Often different components of a system interact with each other in unforeseen and undesired ways to cause

Modern software and hardware systems are composed of a large number of components. Often different components of a system interact with each other in unforeseen and undesired ways to cause failures. Covering arrays are a useful mathematical tool for testing all possible t-way interactions among the components of a system.

The two major issues concerning covering arrays are explicit construction of a covering array, and exact or approximate determination of the covering array number---the minimum size of a covering array. Although these problems have been investigated extensively for the last couple of decades, in this thesis we present significant improvements on both of these questions using tools from the probabilistic method and randomized algorithms.

First, a series of improvements is developed on the previously known upper bounds on covering array numbers. An estimate for the discrete Stein-Lovász-Johnson bound is derived and the Stein- Lovász -Johnson bound is improved upon using an alteration strategy. Then group actions on the set of symbols are explored to establish two asymptotic upper bounds on covering array numbers that are tighter than any of the presently known bounds.

Second, an algorithmic paradigm, called the two-stage framework, is introduced for covering array construction. A number of concrete algorithms from this framework are analyzed, and it is shown that they outperform current methods in the range of parameter values that are of practical relevance. In some cases, a reduction in the number of tests by more than 50% is achieved.

Third, the Lovász local lemma is applied on covering perfect hash families to obtain an upper bound on covering array numbers that is tightest of all known bounds. This bound leads to a Moser-Tardos type algorithm that employs linear algebraic computation over finite fields to construct covering arrays. In some cases, this algorithm outperforms currently used methods by more than an 80% margin.

Finally, partial covering arrays are introduced to investigate a few practically relevant relaxations of the covering requirement. Using probabilistic methods, bounds are obtained on partial covering arrays that are significantly smaller than for covering arrays. Also, randomized algorithms are provided that construct such arrays in expected polynomial time.

Contributors

Agent

Created

Date Created
  • 2016

158544-Thumbnail Image.png

Improved Bi-criteria Approximation for the All-or-Nothing Multicommodity Flow Problem in Arbitrary Networks

Description

This thesis addresses the following fundamental maximum throughput routing problem: Given an arbitrary edge-capacitated n-node directed network and a set of k commodities, with source-destination pairs (s_i,t_i) and demands d_i>

This thesis addresses the following fundamental maximum throughput routing problem: Given an arbitrary edge-capacitated n-node directed network and a set of k commodities, with source-destination pairs (s_i,t_i) and demands d_i> 0, admit and route the largest possible number of commodities -- i.e., the maximum throughput -- to satisfy their demands.

The main contributions of this thesis are three-fold: First, a bi-criteria approximation algorithm is presented for this all-or-nothing multicommodity flow (ANF) problem. This algorithm is the first to achieve a constant approximation of the maximum throughput with an edge capacity violation ratio that is at most logarithmic in n, with high probability. The approach used is based on a version of randomized rounding that keeps splittable flows, rather than approximating those via a non-splittable path for each commodity: This allows it to work for arbitrary directed edge-capacitated graphs, unlike most of the prior work on the ANF problem. The algorithm also works if a weighted throughput is considered, where the benefit gained by fully satisfying the demand for commodity i is determined by a given weight w_i>0. Second, a derandomization of the algorithm is presented that maintains the same approximation bounds, using novel pessimistic estimators for Bernstein's inequality. In addition, it is shown how the framework can be adapted to achieve a polylogarithmic fraction of the maximum throughput while maintaining a constant edge capacity violation, if the network capacity is large enough. Lastly, one important aspect of the randomized and derandomized algorithms is their simplicity, which lends to efficient implementations in practice. The implementations of both randomized rounding and derandomized algorithms for the ANF problem are presented and show their efficiency in practice.

Contributors

Agent

Created

Date Created
  • 2020

154160-Thumbnail Image.png

Covering arrays: generation and post-optimization

Description

Exhaustive testing is generally infeasible except in the smallest of systems. Research

has shown that testing the interactions among fewer (up to 6) components is generally

sufficient while retaining the capability to

Exhaustive testing is generally infeasible except in the smallest of systems. Research

has shown that testing the interactions among fewer (up to 6) components is generally

sufficient while retaining the capability to detect up to 99% of defects. This leads to a

substantial decrease in the number of tests. Covering arrays are combinatorial objects

that guarantee that every interaction is tested at least once.

In the absence of direct constructions, forming small covering arrays is generally

an expensive computational task. Algorithms to generate covering arrays have been

extensively studied yet no single algorithm provides the smallest solution. More

recently research has been directed towards a new technique called post-optimization.

These algorithms take an existing covering array and attempt to reduce its size.

This thesis presents a new idea for post-optimization by representing covering

arrays as graphs. Some properties of these graphs are established and the results are

contrasted with existing post-optimization algorithms. The idea is then generalized to

close variants of covering arrays with surprising results which in some cases reduce

the size by 30%. Applications of the method to generation and test prioritization are

studied and some interesting results are reported.

Contributors

Agent

Created

Date Created
  • 2015

149501-Thumbnail Image.png

Detecting sybil nodes in static and dynamic networks

Description

Peer-to-peer systems are known to be vulnerable to the Sybil attack. The lack of a central authority allows a malicious user to create many fake identities (called Sybil nodes) pretending

Peer-to-peer systems are known to be vulnerable to the Sybil attack. The lack of a central authority allows a malicious user to create many fake identities (called Sybil nodes) pretending to be independent honest nodes. The goal of the malicious user is to influence the system on his/her behalf. In order to detect the Sybil nodes and prevent the attack, a reputation system is used for the nodes, built through observing its interactions with its peers. The construction makes every node a part of a distributed authority that keeps records on the reputation and behavior of the nodes. Records of interactions between nodes are broadcast by the interacting nodes and honest reporting proves to be a Nash Equilibrium for correct (non-Sybil) nodes. In this research is argued that in realistic communication schedule scenarios, simple graph-theoretic queries such as the computation of Strongly Connected Components and Densest Subgraphs, help in exposing those nodes most likely to be Sybil, which are then proved to be Sybil or not through a direct test executed by some peers.

Contributors

Agent

Created

Date Created
  • 2010