Matching Items (16)
Filtering by
- Member of: Theses and Dissertations
- Status: Published
Description
Many web search improvements have been developed since the advent of the modern search engine, but one underrepresented area is the application of specific customizations of search results for educational websites. To address this issue and improve the relevance of search results in automated learning environments, this work integrates context-aware search principles with preference-based re-ranking and query modifications. This research investigates several aspects of context-aware search, specifically context-sensitive, preference-based re-ranking of results that takes user input about preferred content, and combines it with query modifications that automatically search for a variety of modified terms derived from the given query, integrating these results into the overall re-ranking for the context. The result of this work is a novel web search algorithm that could be applied to any online learning environment attempting to collect relevant resources for learning about a given topic. The algorithm has been evaluated through user studies comparing traditional search results to the context-aware results returned by the algorithm for a given topic. These studies explore how this integration of methods can improve the relevance of the returned results compared to other modern search engines.
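As a rough illustration of the pipeline described above, the Python sketch below merges results from a base query and several modified query variants, then re-ranks them against user-stated content preferences. The scoring formula, tags, and data are illustrative placeholders, not the thesis's algorithm.

from collections import defaultdict

def modified_queries(query, expansions):
    # expansions: hypothetical extra terms derived from the base query
    return [query] + [f"{query} {term}" for term in expansions]

def rerank(result_lists, preference):
    # preference: weights the learner assigns to content tags they prefer
    scores = defaultdict(float)
    for results in result_lists:
        for rank, (url, tags) in enumerate(results):
            base = 1.0 / (rank + 1)                      # reciprocal-rank contribution
            boost = sum(preference.get(tag, 0.0) for tag in tags)
            scores[url] += base * (1.0 + boost)
    return sorted(scores, key=scores.get, reverse=True)

# Toy usage: expand a query, then merge two result lists as if returned for two variants.
print(modified_queries("binary search", ["tutorial", "example"]))
print(rerank(
    [[("a.edu/intro", ["tutorial"]), ("b.com/ad", ["commercial"])],
     [("a.edu/intro", ["tutorial"]), ("c.org/paper", ["research"])]],
    preference={"tutorial": 0.5, "research": 0.3}))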
Contributors
Van Egmond, Eric (Author) / Burleson, Winslow (Thesis advisor) / Syrotiuk, Violet (Thesis advisor) / Nelson, Brian (Committee member) / Arizona State University (Publisher)
Created
2014
Description
Corporations invest considerable resources to create, preserve and analyze
their data; yet while organizations are interested in protecting against
unauthorized data transfer, there is no comprehensive metric to discriminate
which data are at risk of leaking.
This thesis motivates the need for a quantitative leakage risk metric, and
provides a risk assessment system, called Whispers, for computing it. Using
unsupervised machine learning techniques, Whispers uncovers themes in an
organization's document corpus, including previously unknown or unclassified
data. Then, by correlating each document with its authors, Whispers can
identify which data are easier to contain and, conversely, which are at risk.
Using the Enron email database, Whispers constructs a social network segmented
by topic themes. This graph uncovers communication channels within the
organization. Using this social network, Whispers determines the risk of each
topic by measuring the rate at which simulated leaks are not detected. For the
Enron set, Whispers identified 18 separate topic themes between January 1999
and December 2000. The highest-risk data emanated from the legal department,
with a leakage risk as high as 60%.
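A hedged sketch of the risk measurement idea, estimating per-topic risk as the fraction of simulated leaks that go undetected; the topic assignments, detection probabilities, and function names are assumptions for illustration, not Whispers' actual implementation.

import random

def leak_risk(topic_docs, detect_prob, trials=10_000, seed=0):
    # topic_docs: dict mapping a topic theme to the ids of documents about it
    # detect_prob: per-document probability that a monitoring layer catches a leak
    rng = random.Random(seed)
    risk = {}
    for topic, docs in topic_docs.items():
        undetected = sum(
            1 for _ in range(trials)
            if rng.random() >= detect_prob[rng.choice(docs)]
        )
        risk[topic] = undetected / trials        # rate of simulated leaks not detected
    return risk

# Toy example: documents in the "legal" theme are rarely caught by the detector.
print(leak_risk({"legal": ["d1", "d2"], "ops": ["d3"]},
                {"d1": 0.35, "d2": 0.45, "d3": 0.90}))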
Contributors
Wright, Jeremy (Author) / Syrotiuk, Violet (Thesis advisor) / Davulcu, Hasan (Committee member) / Yau, Stephen (Committee member) / Arizona State University (Publisher)
Created
2014
Description
This thesis proposes a novel approach to establishing a trust model in a social network scenario based on users' emails. Email is one of the most important forms of social connection today. By analyzing email exchange activities among users, a social-network trust model can be established to estimate the trust rate between any two users. The trust-checking process is divided into two steps: local checking and remote checking. Local checking directly contacts the email server and calculates the trust rate from the user's own email communication history. Remote checking is a distributed computation that enlists the user's social-network friends to build the trust rate together. The email-based trust model is built upon a cloud computing framework called MobiCloud. Inside MobiCloud, each user occupies a virtual machine that can communicate directly with the others. Based on this feature, the distributed trust model is implemented as a combination of local analysis and remote analysis in the cloud. Experimental results show that the trust evaluation model gives an accurate trust rate even in a small-scale social network with few social connections. With this trust model, security in both social network services and email communication could be improved.
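A minimal sketch of the two-step lookup described above; the 50-email saturation point, the 60/40 weighting, and the function names are illustrative assumptions rather than the thesis's formulas or the MobiCloud messaging interface.

def local_trust(history, peer):
    # history: (sender, recipient) pairs taken from the user's own mailbox
    exchanged = sum(1 for s, r in history if peer in (s, r))
    return min(1.0, exchanged / 50.0)            # saturate after 50 exchanged emails

def remote_trust(friend_scores):
    # friend_scores: trust rates for the peer reported back by friends' virtual machines
    return sum(friend_scores) / len(friend_scores) if friend_scores else 0.0

def trust_rate(history, peer, friend_scores, w_local=0.6):
    # blend the local and remote estimates into a single trust rate
    return w_local * local_trust(history, peer) + (1 - w_local) * remote_trust(friend_scores)

print(trust_rate([("alice", "bob")] * 20, "bob", friend_scores=[0.7, 0.4]))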
Contributors
Zhong, Yunji (Author) / Huang, Dijiang (Thesis advisor) / Dasgupta, Partha (Committee member) / Syrotiuk, Violet (Committee member) / Arizona State University (Publisher)
Created
2011
Description
Exhaustive testing is generally infeasible except in the smallest of systems. Research
has shown that testing the interactions among fewer (up to 6) components is generally
sufficient while retaining the capability to detect up to 99% of defects. This leads to a
substantial decrease in the number of tests. Covering arrays are combinatorial objects
that guarantee that every interaction is tested at least once.
In the absence of direct constructions, forming small covering arrays is generally
an expensive computational task. Algorithms to generate covering arrays have been
extensively studied, yet no single algorithm provides the smallest solution. More
recently, research has been directed toward a new technique called post-optimization.
These algorithms take an existing covering array and attempt to reduce its size.
This thesis presents a new idea for post-optimization by representing covering
arrays as graphs. Some properties of these graphs are established and the results are
contrasted with existing post-optimization algorithms. The idea is then generalized to
close variants of covering arrays, with surprising results that in some cases reduce
the size by 30%. Applications of the method to generation and test prioritization are
studied and some interesting results are reported.
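For orientation, the sketch below shows a naive row-removal post-optimization for a strength-2 covering array: a row is deleted whenever the remaining rows still cover every pairwise interaction. The thesis's graph-based method is different; this greedy check only makes the "shrink an existing array" idea concrete, and the small array is a made-up example.

from itertools import combinations

def pairs_covered(rows):
    # collect every (column, value, column, value) interaction present in the rows
    cov = set()
    for row in rows:
        for (i, a), (j, b) in combinations(enumerate(row), 2):
            cov.add((i, a, j, b))
    return cov

def post_optimize(rows):
    rows = list(rows)
    required = pairs_covered(rows)
    for idx in range(len(rows) - 1, -1, -1):
        if pairs_covered(rows[:idx] + rows[idx + 1:]) == required:
            del rows[idx]                        # row was redundant; drop it
    return rows

ca = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0), (1, 1, 1), (0, 0, 1)]
print(post_optimize(ca))                         # 4 rows suffice for this toy array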
Contributors
Karia, Rushang Vinod (Author) / Colbourn, Charles J (Thesis advisor) / Syrotiuk, Violet (Committee member) / Richa, Andréa W. (Committee member) / Arizona State University (Publisher)
Created
2015
Description
Medium access control (MAC) is a fundamental problem in wireless networks.
In ad-hoc wireless networks especially, many of the performance and scaling issues
these networks face can be attributed to their use of the core IEEE 802.11 MAC
protocol: distributed coordination function (DCF). Smoothed Airtime Linear Tuning
(SALT) is a new contention window tuning algorithm proposed to address some of the
deficiencies of DCF in 802.11 ad-hoc networks. SALT works alongside a new, optimized
user-level implementation of REACT, a distributed resource allocation protocol,
to ensure that each node secures the amount of airtime allocated to it by REACT.
The algorithm accomplishes this by tuning the contention window size parameter
that is part of the 802.11 backoff process. SALT converges more tightly on airtime
allocations than a contention window tuning algorithm from previous work; this
increases fairness in transmission opportunities and reduces jitter compared to both
802.11 DCF and the other tuning algorithm. REACT and SALT were also extended
to the multi-hop flow scenario with the introduction of a new airtime reservation
algorithm. With a reservation in place, multi-hop TCP throughput actually increased
when running SALT and REACT compared to 802.11 DCF, and the combination of
protocols still managed to maintain its fairness and jitter advantages. All experiments
were performed on a wireless testbed, not in simulation.
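A toy contention-window tuner in the spirit of what is described above; the smoothing constant, gain, and CW bounds are illustrative guesses, not SALT's actual parameters or update rule.

def tune_cw(cw, measured_airtime, allocated_airtime, smoothed_error,
            cw_min=15, cw_max=1023, alpha=0.8, gain=4.0):
    error = measured_airtime - allocated_airtime      # > 0 means the node uses too much airtime
    smoothed_error = alpha * smoothed_error + (1 - alpha) * error
    # A larger contention window makes the node back off longer, shrinking its airtime share.
    new_cw = int(cw * (1.0 + gain * smoothed_error))
    return max(cw_min, min(cw_max, new_cw)), smoothed_error

# One node's view: measured airtime drifts toward its REACT-style allocation of 0.25.
cw, err = 63, 0.0
for measured in (0.40, 0.35, 0.30, 0.26):
    cw, err = tune_cw(cw, measured, allocated_airtime=0.25, smoothed_error=err)
    print(cw)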
Contributors
Mellott, Matthew (Author) / Syrotiuk, Violet (Thesis advisor) / Colbourn, Charles (Committee member) / Tinnirello, Ilenia (Committee member) / Arizona State University (Publisher)
Created
2018
Description
As an example of "big data," we consider a repository of Arctic sea ice concentration data collected from satellites over the years 1979-2005. The data is represented by a graph, where vertices correspond to measurement points, and an edge is inserted between two vertices if the Pearson correlation coefficient between them exceeds a threshold. We investigate new questions about the structure of the graph related to betweenness, closeness centrality, vertex degrees, and characteristic path length. We also investigate whether an offset of weeks and years in graph generation results in a cosine similarity value that differs significantly from expected values. Finally, we relate the computational results to trends in Arctic ice.
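The graph construction lends itself to a short sketch; the threshold value, the random data, and the networkx dependency are placeholders for the actual correlation threshold and the satellite ice-concentration series.

import numpy as np
import networkx as nx

def correlation_graph(series, threshold=0.9):
    # series: (n_points, n_samples) array, one time series per measurement point
    corr = np.corrcoef(series)
    g = nx.Graph()
    n = series.shape[0]
    g.add_nodes_from(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            if corr[i, j] > threshold:           # edge when correlation exceeds the threshold
                g.add_edge(i, j)
    return g

g = correlation_graph(np.random.default_rng(0).random((20, 100)), threshold=0.15)
print(g.number_of_edges(), nx.degree_histogram(g))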
Contributors
Dougherty, Ryan Edward (Author) / Syrotiuk, Violet (Thesis director) / Colbourn, Charles (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Computer Science and Engineering Program (Contributor)
Created
2015-05
Description
When designing screening experiments for many factors, two problems quickly arise. The first is that testing all the different combinations of the factors and interactions creates an experiment that is too large to conduct in a practical amount of time. One way this problem is solved is with a combinatorial design called a locating array (LA), which can efficiently identify the factors and interactions most influential on a response. The second problem is how to ensure that prohibited combinations of factor levels are absent from the tests, a requirement that is common in real-world systems. This research proposes a solution to the second problem using constraint satisfaction.
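A small sketch of the constraint-handling step, assuming a hypothetical set of forbidden factor-level combinations: candidate tests containing any forbidden combination are rejected before they can enter the array. The factors, levels, and constraint below are made up, and the thesis's constraint-satisfaction formulation is not shown.

from itertools import product

# Hypothetical constraint: a test may not combine power=low with mode=turbo.
forbidden = [{("power", "low"), ("mode", "turbo")}]

def satisfies_constraints(test):
    settings = set(test.items())
    return not any(combo <= settings for combo in forbidden)

factors = {"power": ["low", "high"], "mode": ["eco", "turbo"], "fan": ["on", "off"]}
candidates = [dict(zip(factors, levels)) for levels in product(*factors.values())]
allowed = [t for t in candidates if satisfies_constraints(t)]
print(len(allowed), allowed)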
Contributors
Miller, Vincent Joseph (Author) / Syrotiuk, Violet (Thesis director) / Colbourn, Charles (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created
2019-05
Description
In software testing, components are tested individually to make sure each performs as expected. The next step is to confirm that two or more components are able to work together. This stage of testing is often difficult because there can be numerous configurations between just two components.
Covering arrays are one way to ensure a set of tests will cover every possible configuration at least once. However, on systems with many settings, it is computationally intensive to run every possible test. Test prioritization methods can identify tests of greater importance. This concept of test prioritization can help determine which tests can be removed with minimal impact to the overall testing of the system.
This thesis presents three algorithms that generate covering arrays that test the interaction of every pair of components at least twice. These algorithms extend an established greedy test prioritization method to ensure that important components are selected in earlier tests. The algorithms are tested on various inputs, and the results reveal that, on average, the resulting covering arrays are 40% to 50% smaller than a covering array generated through brute force.
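A toy version of the greedy prioritization idea: tests are reordered so that each successive test covers the largest total weight of not-yet-covered pairwise interactions, with the weights standing in for interactions of important components. The weights, the test set, and the scoring are illustrative, not the three algorithms from the thesis.

from itertools import combinations

def row_pairs(row):
    # every (column, value) pairing that this single test exercises
    return {((i, a), (j, b)) for (i, a), (j, b) in combinations(enumerate(row), 2)}

def prioritize(tests, weight):
    remaining, covered, ordered = list(tests), set(), []
    while remaining:
        best = max(remaining,
                   key=lambda r: sum(weight.get(p, 1) for p in row_pairs(r) - covered))
        remaining.remove(best)
        covered |= row_pairs(best)
        ordered.append(best)
    return ordered

tests = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
important = {((0, 1), (1, 1)): 5}                # interaction of factors 0 and 1, both at level 1
print(prioritize(tests, important))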
Contributors
Ang, Nicole (Author) / Syrotiuk, Violet (Thesis advisor) / Colbourn, Charles (Committee member) / Richa, Andrea (Committee member) / Arizona State University (Publisher)
Created
2015
Description
Robotic technology is advancing to the point where it will soon be feasible to deploy massive populations, or swarms, of low-cost autonomous robots to collectively perform tasks over large domains and time scales. Many of these tasks will require the robots to allocate themselves around the boundaries of regions or features of interest and achieve target objectives that derive from their resulting spatial configurations, such as forming a connected communication network or acquiring sensor data around the entire boundary. We refer to this spatial allocation problem as boundary coverage. Possible swarm tasks that will involve boundary coverage include cooperative load manipulation for applications in construction, manufacturing, and disaster response.
In this work, I address the challenges of controlling a swarm of resource-constrained robots to achieve boundary coverage, which I refer to as the problem of stochastic boundary coverage. I first examined an instance of this behavior in the biological phenomenon of group food retrieval by desert ants, and developed a hybrid dynamical system model of this process from experimental data. Subsequently, with the aid of collaborators, I used a continuum abstraction of swarm population dynamics, adapted from a modeling framework used in chemical kinetics, to derive stochastic robot control policies that drive a swarm to target steady-state allocations around multiple boundaries in a way that is robust to environmental variations.
Next, I determined the statistical properties of the random graph that is formed by a group of robots, each with the same capabilities, that have attached to a boundary at random locations. I also computed the probability density functions (pdfs) of the robot positions and inter-robot distances for this case.
I then extended this analysis to cases in which the robots have heterogeneous communication/sensing radii and attach to a boundary according to non-uniform, non-identical pdfs. I proved that these more general coverage strategies generate random graphs whose probability of connectivity is #P-hard to compute. Finally, I investigated possible approaches to validating our boundary coverage strategies in multi-robot simulations with realistic Wi-Fi communication.
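As a concrete, much simplified instance of the homogeneous case, the Monte Carlo sketch below estimates the probability that robots attached at uniform random points on a line-segment boundary form a connected chain under a common communication radius; the open boundary, the uniform pdf, and the parameter values are simplifying assumptions, not the configurations analyzed in the thesis.

import random

def connectivity_probability(n, length, radius, trials=5_000, seed=1):
    rng = random.Random(seed)
    connected = 0
    for _ in range(trials):
        pos = sorted(rng.uniform(0, length) for _ in range(n))
        # the chain is connected when every gap between neighbors is within the radius
        if all(b - a <= radius for a, b in zip(pos, pos[1:])):
            connected += 1
    return connected / trials

print(connectivity_probability(n=10, length=1.0, radius=0.25))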
Contributors
Peruvemba Kumar, Ganesh (Author) / Berman, Spring M (Thesis advisor) / Fainekos, Georgios (Thesis advisor) / Bazzi, Rida (Committee member) / Syrotiuk, Violet (Committee member) / Taylor, Thomas (Committee member) / Arizona State University (Publisher)
Created
2016
Description
As persistent non-volatile memory solutions become integrated into the computing ecosystem and landscape, commodity file systems architected and developed for traditional block-I/O-based memory solutions must be reevaluated. A majority of commodity file systems have been architected and designed with the goal of managing data on non-volatile storage devices such as hard disk drives (HDDs) and solid state drives (SSDs). HDDs and SSDs are attached to a computing system via a controller or I/O hub, often referred to as the southbridge. The point of HDD and SSD attachment creates multiple levels of translation for any data managed by the CPU that must be stored in non-volatile memory (NVM) on an HDD or SSD. Storage Class Memory (SCM) devices provide the ability to store data at the CPU and DRAM level of a computing system. A novel set of modifications to the ext2 and ext4 commodity file systems to address the needs of SCM will be presented and discussed. An in-depth analysis of many existing file systems, from multiple sources, will be presented, along with an analysis to identify the key modifications and extensions that would be necessary to execute a file system on SCM devices. From this analysis, modifications and extensions have been applied to the FAT commodity file system, and key functional tests will be presented to demonstrate the operation and execution of the file system extensions.
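Not drawn from the thesis: a tiny Python illustration of byte-addressable, memory-mapped access, the kind of load/store path that SCM exposes and that block-oriented file systems were not designed around. The file name and the single 4 KiB page are arbitrary choices.

import mmap
import os
import tempfile

path = os.path.join(tempfile.gettempdir(), "scm_demo.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)                     # back the mapping with one page of zeroes

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 4096) as m:
        m[0:5] = b"hello"                       # store bytes directly into the mapping
        m.flush()                               # analogous to forcing the update to persist
print(open(path, "rb").read(5))
os.remove(path)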
Contributors
Robles, Raymond (Author) / Syrotiuk, Violet (Thesis advisor) / Sohoni, Sohum (Committee member) / Wu, Carole-Jean (Committee member) / Arizona State University (Publisher)
Created
2016