their data; yet while organizations are interested in protecting against
unauthorized data transfer, no comprehensive metric exists for
discriminating which data are at risk of leaking.
This thesis motivates the need for a quantitative leakage risk metric and
presents a risk assessment system, called Whispers, for computing it. Using
unsupervised machine learning techniques, Whispers uncovers themes in an
organization's document corpus, including previously unknown or unclassified
data. Then, by correlating each document with its authors, Whispers can
identify which data are easier to contain and, conversely, which are at risk.
Using the Enron email database, Whispers constructs a social network segmented
by topic themes. This graph uncovers communication channels within the
organization. Using this social network, Whispers determines the risk of each
topic by measuring the rate at which simulated leaks are not detected. For the
Enron set, Whispers identified 18 separate topic themes between January 1999
and December 2000. The highest-risk data emanated from the legal department,
with a leakage risk as high as 60%.
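The risk computation described above, simulating leaks over a topic-segmented social network and counting the ones that go undetected, can be illustrated with a toy model. The corpus, community structure, and detection rule below are invented for illustration and are not Whispers' actual design:

```python
import random

# Toy sketch: a leak of a topic's material sent along an edge that
# normally carries that topic blends into legitimate traffic, so a
# channel-based detector misses it. Risk = undetected fraction.

# observed messages: (sender, recipient, topic) -- invented data
messages = [
    ("ann", "bob", "legal"), ("ann", "cal", "legal"),
    ("bob", "cal", "legal"), ("bob", "dee", "legal"),
    ("dee", "eve", "trading"), ("eve", "dee", "trading"),
]
people = sorted({p for s, r, _ in messages for p in (s, r)})

def leak_risk(topic, trials=10_000, seed=0):
    """Fraction of simulated leaks the channel detector misses."""
    rng = random.Random(seed)
    channels = {(s, r) for s, r, t in messages if t == topic}
    authors = sorted({s for s, _ in channels})
    undetected = 0
    for _ in range(trials):
        sender = rng.choice(authors)
        recipient = rng.choice([p for p in people if p != sender])
        # a leak on a legitimate channel for this topic blends in
        if (sender, recipient) in channels:
            undetected += 1
    return undetected / trials

for t in ("legal", "trading"):
    print(f"{t}: {leak_risk(t):.2f}")
```

In this toy corpus the "legal" topic flows over more of its authors' edges than "trading" does, so its simulated leaks are harder to distinguish from normal traffic and its computed risk is higher.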
has shown that testing the interactions among fewer (up to 6) components is generally
sufficient while retaining the capability to detect up to 99% of defects. This leads to a
substantial decrease in the number of tests. Covering arrays are combinatorial objects
that guarantee that every interaction is tested at least once.
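The defining property, every t-way interaction appearing in at least one test, can be checked directly. The sketch below is a straightforward verifier over symbols 0..v-1, not an algorithm from the thesis:

```python
from itertools import combinations, product

def is_covering_array(rows, t, v):
    """True iff `rows` is a CA(N; t, k, v): every choice of t
    columns exhibits all v**t symbol tuples in at least one row."""
    k = len(rows[0])
    needed = set(product(range(v), repeat=t))
    for cols in combinations(range(k), t):
        seen = {tuple(row[c] for c in cols) for row in rows}
        if seen != needed:
            return False
    return True

# CA(4; 2, 3, 2): four tests cover every pairwise interaction
ca = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
print(is_covering_array(ca, t=2, v=2))
# dropping any row breaks pairwise coverage
print(is_covering_array(ca[:3], t=2, v=2))
```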
In the absence of direct constructions, forming small covering arrays is generally
an expensive computational task. Algorithms to generate covering arrays have
been extensively studied, yet no single algorithm provides the smallest
solution. More recently, research has been directed toward a new technique
called post-optimization: taking an existing covering array and attempting to
reduce its size.
This thesis presents a new approach to post-optimization that represents
covering arrays as graphs. Some properties of these graphs are established,
and the results are contrasted with those of existing post-optimization
algorithms. The approach is then generalized to close variants of covering
arrays, with surprising results that in some cases reduce the array size by
30%. Applications of the method to generation and test prioritization are
also studied, and several notable results are reported.
Many of the performance and scaling issues faced by ad-hoc wireless networks
in particular can be attributed to their use of the core IEEE 802.11 MAC
protocol, the distributed coordination function (DCF). Smoothed Airtime
Linear Tuning (SALT) is a new contention-window tuning algorithm proposed to
address some of the deficiencies of DCF in 802.11 ad-hoc networks. SALT works
alongside a new, optimized user-level implementation of REACT, a distributed
resource allocation protocol, to ensure that each node secures the amount of
airtime allocated to it by REACT.
The algorithm accomplishes this by tuning the contention window size
parameter used in the 802.11 backoff process. SALT converges more tightly on
airtime allocations than a contention-window tuning algorithm from previous
work, increasing fairness in transmission opportunities and reducing jitter
relative to both 802.11 DCF and the earlier tuning algorithm. REACT and SALT
were also extended to the multi-hop flow scenario with the introduction of a
new airtime reservation algorithm. With a reservation in place, multi-hop TCP
throughput increased when running SALT and REACT compared to 802.11 DCF, and
the combination of protocols maintained its fairness and jitter advantages.
All experiments were performed on a wireless testbed rather than in
simulation.
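A toy control loop can illustrate the kind of tuning the name Smoothed Airtime Linear Tuning suggests: smooth the measured airtime share, then nudge the 802.11 contention window linearly toward the REACT allocation. The constants, update rule, and channel response below are invented assumptions, not SALT's published algorithm:

```python
CW_MIN, CW_MAX = 15, 1023   # 802.11 contention window bounds

def tune_cw(cw, smoothed, measured, target, alpha=0.2, gain=200):
    """One tuning step: EWMA-smooth the airtime sample, then move
    the contention window linearly toward the allocation.
    (alpha and gain are illustrative constants.)"""
    smoothed = (1 - alpha) * smoothed + alpha * measured
    # using more airtime than allocated -> back off (larger window)
    cw = cw + gain * (smoothed - target)
    cw = max(CW_MIN, min(CW_MAX, cw))
    return cw, smoothed

# a node allocated 25% airtime but currently consuming ~40%
cw, sm = 63.0, 0.40
for _ in range(50):
    # toy channel model: airtime falls as the window grows
    measured = 0.40 - (cw - 63.0) / 2000
    cw, sm = tune_cw(cw, sm, measured, target=0.25)
print(round(cw), round(sm, 2))
```

Under this toy channel model the loop settles at the window size whose induced airtime equals the allocation; the smoothing term damps oscillation around that point, which is the kind of tight convergence the abstract attributes to SALT.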
Based on the findings of previous studies, there was speculation that two well-known experimental design software packages, JMP and Design Expert, produced different power outputs given the same design and user inputs. For context and scope, another popular experimental design package, Minitab® Statistical Software version 17, was added to the comparison. The study compared multiple test cases run on the three software packages, focusing on 2^k and 3^k factorial designs and varying the effect size (in standard deviations), the number of categorical factors, the number of levels, the number of factors, and the number of replicates. All six cases were run on all three programs, with one, two, and three replicates attempted for each. There was an issue at the one-replicate stage, however: Minitab does not allow full factorial designs with only one replicate, and Design Expert will not provide power outputs for a single replicate unless there are three or more factors. From the analysis of the results, it was concluded that the differences between JMP 13 and Design Expert 10 were well within the margin of error and likely caused by rounding. The differences between JMP 13 and Design Expert 10 on the one hand and Minitab 17 on the other, however, indicated a fundamental difference in how Minitab calculates power compared to the latest versions of JMP and Design Expert. This difference is likely caused by Minitab's use of dummy variable coding by default, in contrast to the orthogonal coding default of the other two packages. Although dummy variable and orthogonal coding of factorial designs produce equivalent fitted models, the choice of coding affects the power calculations. All three programs can be configured to use either coding method, but the exact instructions are difficult to find; a follow-up guide on changing the coding of factorial variables would address this issue.
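The coding effect can be illustrated numerically: under orthogonal (-1/+1) coding a regression coefficient implies twice the between-level difference that the same coefficient implies under dummy (0/1) coding, so a power routine that treats the user's "effect size" as the coefficient computes a different noncentrality, and hence different power, depending on its default coding. The Monte Carlo sketch below is an invented illustration of that mechanism for a single two-level factor, not any package's actual power algorithm:

```python
import math
import random

def mc_power(level_diff, trials=20_000, seed=1):
    """Monte Carlo power of a pooled two-sample t test for one
    two-level factor with 8 runs per level, sigma = 1, alpha = 0.05."""
    rng = random.Random(seed)
    n = 8
    t_crit = 2.145  # two-sided 0.05 critical value, df = 2n - 2 = 14
    hits = 0
    for _ in range(trials):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(level_diff, 1.0) for _ in range(n)]
        ma, mb = sum(a) / n, sum(b) / n
        va = sum((x - ma) ** 2 for x in a) / (n - 1)
        vb = sum((x - mb) ** 2 for x in b) / (n - 1)
        se = math.sqrt((va + vb) / 2 * (2 / n))
        if abs((mb - ma) / se) > t_crit:
            hits += 1
    return hits / trials

effect = 1.0  # user-supplied "effect size" of one standard deviation
p_dummy = mc_power(effect)           # 0/1 coding: coefficient = level difference
p_orthogonal = mc_power(2 * effect)  # -1/+1 coding: coefficient = half the difference
print(f"dummy-coded interpretation:      power = {p_dummy:.2f}")
print(f"orthogonal-coded interpretation: power = {p_orthogonal:.2f}")
```

The same user input yields substantially higher power under the orthogonal interpretation, which is consistent with the kind of systematic discrepancy the study attributes to differing coding defaults.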