Matching Items (24)

Description

Robotic technology is advancing to the point where it will soon be feasible to deploy massive populations, or swarms, of low-cost autonomous robots to collectively perform tasks over large domains and time scales. Many of these tasks will require the robots to allocate themselves around the boundaries of regions or features of interest and achieve target objectives that derive from their resulting spatial configurations, such as forming a connected communication network or acquiring sensor data around the entire boundary. We refer to this spatial allocation problem as boundary coverage. Possible swarm tasks that will involve boundary coverage include cooperative load manipulation for applications in construction, manufacturing, and disaster response.

In this work, I address the challenges of controlling a swarm of resource-constrained robots to achieve boundary coverage, which I refer to as the problem of stochastic boundary coverage. I first examined an instance of this behavior in the biological phenomenon of group food retrieval by desert ants, and developed a hybrid dynamical system model of this process from experimental data. Subsequently, with the aid of collaborators, I used a continuum abstraction of swarm population dynamics, adapted from a modeling framework used in chemical kinetics, to derive stochastic robot control policies that drive a swarm to target steady-state allocations around multiple boundaries in a way that is robust to environmental variations.

Next, I determined the statistical properties of the random graph that is formed by a group of robots, each with the same capabilities, that have attached to a boundary at random locations. I also computed the probability density functions (pdfs) of the robot positions and inter-robot distances for this case.

I then extended this analysis to cases in which the robots have heterogeneous communication/sensing radii and attach to a boundary according to non-uniform, non-identical pdfs. I proved that these more general coverage strategies generate random graphs whose probability of connectivity is #P-hard to compute. Finally, I investigated possible approaches to validating our boundary coverage strategies in multi-robot simulations with realistic Wi-Fi communication.
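
As a rough illustration of the connectivity question analyzed above (a simplified sketch, not taken from the thesis), the following Python snippet estimates by Monte Carlo simulation the probability that robots attached at uniformly random positions on a circular boundary of unit circumference form a connected communication graph; the robot counts and the communication radius r are arbitrary illustrative values.

import random

def connected_on_circle(n, r, trials=2000):
    """Monte Carlo estimate of the probability that n robots, attached at
    uniformly random arc positions on a boundary of unit circumference and
    communicating within arc distance r, form a connected graph.
    On a circle, the graph is connected iff every gap between consecutive
    robots (including the wrap-around gap) is at most r."""
    successes = 0
    for _ in range(trials):
        pos = sorted(random.random() for _ in range(n))
        gaps = [b - a for a, b in zip(pos, pos[1:])]
        gaps.append(1.0 - pos[-1] + pos[0])  # wrap-around gap
        if max(gaps) <= r:
            successes += 1
    return successes / trials

if __name__ == "__main__":
    for n in (10, 20, 40):
        print(n, connected_on_circle(n, r=0.1))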
Contributors: Peruvemba Kumar, Ganesh (Author) / Berman, Spring M (Thesis advisor) / Fainekos, Georgios (Thesis advisor) / Bazzi, Rida (Committee member) / Syrotiuk, Violet (Committee member) / Taylor, Thomas (Committee member) / Arizona State University (Publisher)
Created: 2016
Description

Several decades of transistor technology scaling have brought the threat of soft errors to modern embedded processors. Several techniques have been proposed to protect these systems from soft errors. However, their effectiveness in protecting the computation cannot be ascertained without accurate and quantitative estimation of system reliability. Vulnerability -- a metric that defines the probability of system failure (reliability) through analytical models -- is the most effective mechanism for our current estimation and early design space exploration needs. Previous vulnerability estimation tools are based around the Sim-Alpha simulator, which has been shown to have several limitations. In this thesis, I present gemV: an accurate and comprehensive vulnerability estimation tool based on gem5. gem5 is a popular cycle-accurate micro-architectural simulator that can model several different processors in close-to-real-hardware detail. gemV can be used for fast and early design space exploration and also to evaluate the protection afforded by commodity processors. gemV is comprehensive, since it models almost all sequential components of the processor. gemV is accurate because of fine-grain vulnerability tracking, accurate vulnerability modeling of squashed instructions, and accurate vulnerability modeling of shared data structures in gem5. gemV has been thoroughly validated against extensive fault injection experiments and achieves 97% accuracy with 95% confidence. A micro-architect can use gemV to discover micro-architectural variants of a processor that minimize vulnerability for an allowed performance penalty. A software developer can use gemV to explore the performance-vulnerability trade-off by choosing different algorithms and compiler optimizations, while the system designer can use gemV to explore the performance-vulnerability trade-offs of choosing different Instruction Set Architectures (ISAs).
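
The accounting idea behind vulnerability metrics of this kind can be illustrated with a toy sketch (this is not gemV's implementation; the structure size and occupancy intervals below are hypothetical): vulnerability is accumulated as bit-cycles during which a structure holds data that matters to correct execution, divided by the structure's total bit-cycles.

from dataclasses import dataclass

@dataclass
class Interval:
    """Hypothetical record: `bits` held architecturally required (ACE) data
    in some structure from cycle `start` to cycle `end`."""
    start: int
    end: int
    bits: int

def vulnerability(intervals, structure_bits, total_cycles):
    """Toy AVF-style estimate: fraction of bit-cycles occupied by ACE data.
    This only sketches the accounting idea; a real tool such as gemV tracks
    such intervals per micro-architectural structure inside the simulator."""
    ace_bit_cycles = sum(iv.bits * (iv.end - iv.start) for iv in intervals)
    return ace_bit_cycles / (structure_bits * total_cycles)

# Hypothetical register-file occupancy over 1000 cycles.
history = [Interval(0, 400, 64), Interval(250, 900, 32), Interval(500, 1000, 64)]
print(f"estimated vulnerability: {vulnerability(history, 2048, 1000):.4f}")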
Contributors: Tanikella, Srinivas Karthik (Author) / Shrivastava, Aviral (Thesis advisor) / Bazzi, Rida (Committee member) / Wu, Carole-Jean (Committee member) / Arizona State University (Publisher)
Created: 2016
Description

For the past three decades, the design of an effective strategy for generating poetry that matches a human’s creative capabilities and complexity has been an elusive goal in artificial intelligence (AI) and natural language generation (NLG) research, and among linguistic creativity researchers in particular. This thesis presents a novel approach to fixed-verse poetry generation using neural word embeddings. During the course of generation, a two-layered poetry classifier is developed. The first layer uses a lexicon-based method to classify poems into types based on form and structure, and the second layer uses a supervised classification method to classify poems into subtypes based on content, with an accuracy of 92%. The system then uses a two-layer neural network to generate poetry based on word similarities and word movements in a 50-dimensional vector space.

The verses generated by the system are evaluated using rhyme, rhythm, syllable counts, and stress patterns. These computational features of language are considered for generating haikus, limericks, and iambic pentameter verses. The generated poems are evaluated using a Turing test on both experts and non-experts. The user study finds that only 38% of the computer-generated poems were correctly identified by non-experts, while 65% were correctly identified by experts. Although the system does not pass the Turing test, the results suggest an improvement of over 17% when compared to previous methods that use Turing tests to evaluate poetry generators.
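
The generation step relies on word similarity in a 50-dimensional embedding space. The toy sketch below shows the basic retrieval operation such a system depends on, namely finding the words nearest to a seed word by cosine similarity; the vocabulary and the randomly generated vectors are stand-ins for real trained embeddings, not the thesis's data.

import math
import random

random.seed(0)
DIM = 50  # dimensionality mentioned in the abstract

# Hypothetical embeddings; a real system would load trained word vectors.
vocab = ["moon", "river", "silence", "autumn", "lantern", "sparrow"]
emb = {w: [random.gauss(0, 1) for _ in range(DIM)] for w in vocab}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def nearest(word, k=3):
    """Return the k vocabulary words most similar to `word` (excluding itself)."""
    scores = [(cosine(emb[word], emb[w]), w) for w in vocab if w != word]
    return [w for _, w in sorted(scores, reverse=True)[:k]]

print(nearest("moon"))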
Contributors: Magge, Arjun (Author) / Syrotiuk, Violet R. (Thesis advisor) / Baral, Chitta (Committee member) / Hogue, Cynthia (Committee member) / Bazzi, Rida (Committee member) / Arizona State University (Publisher)
Created: 2016
Description

The success of Bitcoin has generated significant interest in the financial community to understand whether the technological underpinnings of the cryptocurrency paradigm can be leveraged to improve the efficiency of financial processes in the existing infrastructure. Various alternative proposals, most notably Ripple and Ethereum, aim to provide solutions to the financial community in different ways. These proposals derive their security guarantees from either the computational hardness of proof-of-work or voting-based distributed consensus mechanisms, both of which can be computationally expensive. Furthermore, the financial audit requirements of participating financial institutions have not been suitably addressed.

This thesis presents a novel approach to constructing a non-consensus-based decentralized financial transaction processing model with a built-in, efficient audit structure. The problem of decentralized inter-bank payment processing is used for the model design. The two key insights used in this work are (1) to utilize a majority-signature-based replicated storage protocol for transaction authorization, and (2) to construct individual self-verifiable audit trails for each node as opposed to a common blockchain. Theoretical analysis shows that the model provides cryptographic security for transaction processing and that the presented audit structure facilitates financial auditing of individual nodes in time independent of the number of transactions.
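
Insight (1) can be sketched as follows (a hedged illustration only; the thesis's protocol operates over a replicated storage layer with real digital signatures, whereas here per-replica HMACs and an in-memory check stand in for them): a transaction is accepted once a strict majority of replicas have produced valid signatures over it.

import hmac, hashlib

# Hypothetical replica keys; a real deployment would use asymmetric signatures.
REPLICA_KEYS = {f"replica-{i}": f"secret-{i}".encode() for i in range(5)}

def sign(replica, tx: bytes) -> bytes:
    """Stand-in for a digital signature: HMAC over the serialized transaction."""
    return hmac.new(REPLICA_KEYS[replica], tx, hashlib.sha256).digest()

def verify(replica, tx: bytes, sig: bytes) -> bool:
    return hmac.compare_digest(sign(replica, tx), sig)

def authorized(tx: bytes, sigs: dict) -> bool:
    """Accept a transaction once a strict majority of replicas have
    produced valid signatures over it (insight (1) above, sketched)."""
    valid = sum(1 for r, s in sigs.items() if r in REPLICA_KEYS and verify(r, tx, s))
    return valid > len(REPLICA_KEYS) // 2

tx = b"transfer: bankA -> bankB : 100"
sigs = {r: sign(r, tx) for r in list(REPLICA_KEYS)[:3]}  # 3 of 5 replicas sign
print(authorized(tx, sigs))  # True: 3 > 5 // 2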
Contributors: Gupta, Saurabh (Author) / Bazzi, Rida (Thesis advisor) / Ahn, Gail-Joon (Committee member) / Herlihy, Maurice (Committee member) / Arizona State University (Publisher)
Created: 2016
Description

Phishing is a form of online fraud where a spoofed website tries to gain access to a user's sensitive information by tricking the user into believing that it is a benign website. There are several solutions to detect phishing attacks, such as educating users, using blacklists, or extracting characteristics found to exist in phishing attacks. In this thesis, we analyze approaches that extract features from phishing websites and train classification models on the extracted feature set to classify phishing websites. We create an exhaustive list of all features used in these approaches and categorize them into 6 broader categories and 33 finer categories. We extract 59 features from the URL, URL redirects, the hosting domain (WHOIS and DNS records), and the popularity of the website, and analyze their robustness in classifying a phishing website. Our emphasis is on determining the predictive performance of robust features. We evaluate the classification accuracy when using the entire feature set and when URL features or site popularity features are excluded from the feature set, and show how our approach can be used to effectively predict specific types of phishing attacks, such as those using shortened URLs and randomized URLs. Using both decision table classifiers and neural network classifiers, our results indicate that robust features seem to have enough predictive power to be used in practice.
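
The 59 features themselves are not reproduced here; the sketch below extracts a few representative URL-level features of the kind such classifiers typically use, which could then be fed to a decision-table or neural-network classifier. The example URL and feature names are illustrative, not the thesis's exact feature set.

import re
from urllib.parse import urlparse

def url_features(url: str) -> dict:
    """A few illustrative URL-level features; the thesis uses 59 features
    drawn from the URL, redirects, WHOIS/DNS records, and site popularity."""
    parsed = urlparse(url)
    host = parsed.netloc
    return {
        "url_length": len(url),
        "num_dots": url.count("."),
        "num_digits": sum(c.isdigit() for c in url),
        "has_at_symbol": "@" in url,
        "has_ip_host": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}(:\d+)?", host)),
        "num_hyphens_in_host": host.count("-"),
        "path_depth": len([p for p in parsed.path.split("/") if p]),
        "uses_https": parsed.scheme == "https",
    }

print(url_features("http://192.168.10.5/paypal.com-login/confirm?id=123"))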
Contributors: Namasivayam, Bhuvana Lalitha (Author) / Bazzi, Rida (Thesis advisor) / Zhao, Ziming (Committee member) / Liu, Huan (Committee member) / Arizona State University (Publisher)
Created: 2017
Description

Scientific workflows allow scientists to easily model and express an entire data processing pipeline, typically as a directed acyclic graph (DAG). These workflows are made up of a collection of tasks that usually take a long time to compute and that produce a considerable number of intermediate datasets. Because of the nature of scientific exploration, a workflow can be modified and re-run multiple times, and new workflows may be created that make use of past intermediate datasets. Storing intermediate datasets therefore has the potential to save computation time. Since storage is limited, one main problem that needs a solution is determining which intermediate datasets to save at creation time in order to minimize the computational time of the workflows to be run in the future. This thesis proposes the design and implementation of Pingo, a system that manages the computations of scientific workflows as well as the storage, provenance, and deletion of intermediate datasets. Pingo uses the history of workflows submitted to the system to predict which datasets are most likely to be needed in the future, and subjects the decision of dataset deletion to the optimization of the computational time of future workflows.
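
The abstract does not specify Pingo's prediction model, so the sketch below only illustrates one simple way such a storage decision could be made: score each intermediate dataset by the expected recomputation time it saves per gigabyte stored, then keep the highest-scoring datasets within a storage budget. The dataset names and numbers are hypothetical.

from dataclasses import dataclass

@dataclass
class Dataset:
    name: str
    size_gb: float          # storage cost
    compute_hours: float    # time needed to regenerate it
    reuse_prob: float       # predicted probability a future workflow needs it

def choose_datasets(candidates, budget_gb):
    """Greedy heuristic: keep datasets with the highest expected
    recomputation time saved per gigabyte, within the storage budget."""
    ranked = sorted(candidates,
                    key=lambda d: d.reuse_prob * d.compute_hours / d.size_gb,
                    reverse=True)
    kept, used = [], 0.0
    for d in ranked:
        if used + d.size_gb <= budget_gb:
            kept.append(d.name)
            used += d.size_gb
    return kept

candidates = [
    Dataset("aligned_reads", 120.0, 30.0, 0.7),
    Dataset("variant_calls",   2.0, 12.0, 0.9),
    Dataset("qc_report",       0.1,  0.5, 0.2),
]
print(choose_datasets(candidates, budget_gb=50.0))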
Contributors: de Armas, Jadiel (Author) / Bazzi, Rida (Thesis advisor) / Huang, Dijiang (Committee member) / Syrotiuk, Violet (Committee member) / Arizona State University (Publisher)
Created: 2017
Description

Shrinking device dimensions, increasing transistor densities, and smaller timing windows increase the vulnerability of processors to soft errors induced by charge-carrying particles. Since these factors are inevitable in the advancement of processor technology, the industry has been forced to improve reliability on general-purpose Chip Multiprocessors (CMPs). With the availability of increased hardware resources, redundancy-based techniques are the most promising methods for eradicating soft error failures in CMP systems. This work proposes a novel customizable and redundant CMP architecture (UnSync) that utilizes hardware-based detection mechanisms (most of which are readily available in the processor) to reduce overheads during error-free executions. In the presence of errors (which are infrequent), the always-forward-execution recovery mechanism provides resilience in the system. The UnSync architecture framework inherently supports customization of the redundancy and thereby provides a means to achieve performance-reliability trade-offs in many-core systems. This work designs a detailed RTL model of the UnSync architecture and performs hardware synthesis to measure the hardware (power/area) overheads incurred, which are then compared with those of the Reunion technique, a state-of-the-art redundant multi-core architecture. This work also performs cycle-accurate simulations over a wide range of SPEC2000 and MiBench benchmarks to evaluate the performance efficiency achieved over that of the Reunion architecture. Experimental results show that the UnSync architecture reduces power consumption by 34.5% and improves performance by up to 20% with 13.3% less area overhead, compared to the Reunion architecture at the same level of reliability.
Contributors: Hong, Fei (Author) / Shrivastava, Aviral (Thesis advisor) / Bazzi, Rida (Committee member) / Fainekos, Georgios (Committee member) / Arizona State University (Publisher)
Created: 2011
Description

Federated Learning (FL) is envisaged to be a promising solution for collaboratively training a machine learning model while keeping the training data decentralized and private. Instead of sharing raw data with the central entity, the participating client devices share focused updates for aggregation to ensure global convergence of the model. Owing to the shortcomings of manually handcrafted neural network architectures, the research community is striving to develop Neural Architecture Search (NAS) approaches that automatically search for optimal networks that fit the clients’ data. Despite the inaccessibility of clients’ data in an FL setting, the federated NAS literature has recently made great progress in applying these NAS techniques to the FL setting. However, one of the key bottlenecks of Federated Learning is the cost of communication between clients and the server, and state-of-the-art federated NAS techniques search for networks with millions of parameters that require several rounds of communication to find the optimal weight parameters. Also, deploying a network with millions of parameters on edge devices (which are the typical participants in an FL process) is infeasible due to their computational limitations and the added latency. Thus, this work proposes Weight-Agnostic Federated Neural Architecture Search (WFNAS), a novel evolutionary framework to search for well-performing and minimally connected weight-agnostic network architectures in an FL setting. Since the connectivity of the networks themselves constitutes the solution, there is no need for weight training or hyperparameter tuning, which reduces the communication overhead significantly. The experiments indicate a gain of nearly 40% for orthogonal (vertical FL) data distributions compared to local training. This work is the first federated NAS technique in the literature for vertical FL. Although the experiments are performed in a resource-constrained environment, the aim of this thesis is to show the FL community a new direction of research.
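
The WFNAS algorithm itself is not specified in the abstract; the toy sketch below only illustrates the combination of ideas it builds on: an evolutionary loop over connectivity masks, weight-agnostic evaluation with a few shared weight values instead of weight training, and aggregation of per-client fitness so raw data stays on the clients. The data, client setup, and simple (1+1) evolution strategy are illustrative assumptions, not the author's framework.

import random
random.seed(1)

# Toy federated setup with hypothetical data: each client holds local
# (features, label) pairs that never leave the client.
NUM_FEATURES = 8
THRESHOLD = 0.5

def make_client(n=50):
    data = []
    for _ in range(n):
        x = [random.uniform(-1, 1) for _ in range(NUM_FEATURES)]
        y = 1 if x[0] + x[3] > THRESHOLD else 0   # hidden "true" concept
        data.append((x, y))
    return data

CLIENTS = [make_client() for _ in range(4)]

def evaluate(mask, data, shared_weights=(0.5, 1.0, 2.0)):
    """Weight-agnostic fitness: accuracy averaged over a few shared weight
    values, so only the connectivity mask matters, not trained weights."""
    accs = []
    for w in shared_weights:
        correct = 0
        for x, y in data:
            score = w * sum(xi for xi, m in zip(x, mask) if m)
            correct += int((score > THRESHOLD) == (y == 1))
        accs.append(correct / len(data))
    return sum(accs) / len(accs)

def federated_fitness(mask):
    # The server aggregates only per-client fitness scores (no raw data).
    return sum(evaluate(mask, c) for c in CLIENTS) / len(CLIENTS)

def mutate(mask):
    child = list(mask)
    i = random.randrange(NUM_FEATURES)
    child[i] = 1 - child[i]  # flip one connection on/off
    return child

# Simple (1+1) evolutionary loop over connectivity masks.
best = [random.randint(0, 1) for _ in range(NUM_FEATURES)]
best_fit = federated_fitness(best)
for _ in range(300):
    child = mutate(best)
    fit = federated_fitness(child)
    if fit >= best_fit:
        best, best_fit = child, fit

print("best connectivity mask:", best, "fitness:", round(best_fit, 3))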
Contributors: Thakkar, Om (Author) / Bazzi, Rida (Thesis advisor) / Li, Baoxin (Committee member) / Zhang, Yu (Committee member) / Arizona State University (Publisher)
Created: 2021
Description

For systems in which computers are a significant component, it is a critical task to identify the potential threats that users of the system can present, whether from inside or outside the system. One of the most important factors that differentiate an insider from an outsider is that the insider, being a part of the system, holds privileges that grant him/her access to the resources and processes of the system through valid capabilities. An insider with malicious intent can potentially be more damaging than an outsider. These differences help define the notion and scope of an insider.

The significant loss to organizations due to the failure to detect and mitigate the insider threat has resulted in increased interest in insider threat detection. The well-studied, effective techniques proposed for defending against attacks by outsiders have not proven successful against insider attacks. Although a number of security policies and models to deal with the insider threat have been developed, the approach taken by most organizations is the use of audit logs after the attack has taken place. Such approaches are inspired by academic research proposals to address the problem by tracking the activities of the insider in the system. Although tracking and logging are important, it is argued that they are not sufficient. Thus, predicting the potential damage an insider can cause is considered here, to help build a stronger evaluation and mitigation strategy against insider attacks. The question this thesis seeks to answer is the following: 'Considering the relationships that exist between insiders and their roles, their access to the resources, and the resource set, what is the potential damage that an insider can cause?'

A general system model is introduced that can capture general insider attacks, including those documented by the Computer Emergency Response Team (CERT) at the Software Engineering Institute (SEI). Further, initial formulations of the damage potential for leakage and availability in the model are introduced. The model's usefulness is shown by expressing 14 actual attacks in the model and showing how, in each case, the attack could have been mitigated.
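
The thesis's damage-potential formulations are not reproduced in the abstract; the sketch below only illustrates the kind of computation the research question suggests: given insider-to-role and role-to-resource relations plus per-resource values, sum the values of everything an insider can reach as a crude upper bound on leakage damage. All names and values are hypothetical.

# Hypothetical relations; the thesis's actual model and its damage
# formulations for leakage and availability are richer than this sketch.
ROLES = {
    "alice": {"dba", "developer"},
    "bob":   {"developer"},
}
ROLE_RESOURCES = {
    "dba":       {"customer_db", "audit_logs"},
    "developer": {"source_repo", "build_server"},
}
RESOURCE_VALUE = {  # damage if the resource is leaked (arbitrary units)
    "customer_db": 100, "audit_logs": 40, "source_repo": 60, "build_server": 10,
}

def reachable_resources(insider):
    """Resources the insider can access through any of his/her roles."""
    resources = set()
    for role in ROLES.get(insider, ()):
        resources |= ROLE_RESOURCES.get(role, set())
    return resources

def leakage_damage_potential(insider):
    """Crude upper bound: total value of everything the insider can reach."""
    return sum(RESOURCE_VALUE[r] for r in reachable_resources(insider))

for who in ("alice", "bob"):
    print(who, sorted(reachable_resources(who)), leakage_damage_potential(who))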
Contributors: Nolastname, Sharad (Author) / Bazzi, Rida (Thesis advisor) / Sen, Arunabha (Committee member) / Doupe, Adam (Committee member) / Arizona State University (Publisher)
Created: 2019
Description

Distributed systems are prone to attacks, called Sybil attacks, wherein an adversary may generate an unbounded number of bogus identities to gain control over the system. In this thesis, an algorithm, DownhillFlow, for mitigating such attacks is presented and tested experimentally. The trust rankings produced by the algorithm are significantly better than those of the distributed SybilGuard protocol and only slightly worse than those of the best-known Sybil defense algorithm, ACL. The results obtained for ACL are consistent with those obtained in previous studies. The running times of the algorithms are also tested and two results are obtained: first, DownhillFlow's running time is found to be significantly faster than that of any existing algorithm, including ACL, terminating in slightly over one second on the 300,000-node DBLP graph. This allows it to be used as-is in settings such as dynamic networks, with no additional functionality needed. Second, when ACL is configured such that it matches DownhillFlow's speed, it fails to recognize large portions of the input graphs, and its accuracy on the portion of the graphs it does recognize becomes lower than that of DownhillFlow.
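
DownhillFlow is not specified in the abstract, so the sketch below shows only the general family of graph-based Sybil defenses it belongs to: trust is seeded at a known honest node and propagated along the edges of the social graph, so that a Sybil region attached by few edges accumulates little trust. The graph and the early-terminated propagation are illustrative assumptions; this is not the author's algorithm.

def propagate_trust(adj, seed, iterations=4):
    """Seed trust at a known honest node and repeatedly spread each node's
    trust equally among its neighbors, terminating early so trust has not
    yet fully mixed across the sparse cut into the Sybil region.
    (A generic illustration of graph-based Sybil defenses; this is NOT the
    DownhillFlow algorithm, whose details are not given in the abstract.)"""
    trust = {v: 0.0 for v in adj}
    trust[seed] = 1.0
    for _ in range(iterations):
        nxt = {v: 0.0 for v in adj}
        for v, score in trust.items():
            if adj[v]:
                share = score / len(adj[v])
                for u in adj[v]:
                    nxt[u] += share
            else:
                nxt[v] += score
        trust = nxt
    return trust

# Tiny hypothetical graph: honest nodes a-d are well connected; s1 and s2
# form a Sybil region attached to the honest region by a single edge.
adj = {
    "a": ["b", "c"], "b": ["a", "c", "d"], "c": ["a", "b", "d"],
    "d": ["b", "c", "s1"], "s1": ["d", "s2"], "s2": ["s1"],
}
ranking = propagate_trust(adj, seed="a")
print(sorted(ranking, key=ranking.get, reverse=True))  # Sybil nodes rank last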
Contributors: Bradley, Michael (Author) / Bazzi, Rida (Thesis advisor) / Richa, Andrea (Committee member) / Sen, Arunabha (Committee member) / Arizona State University (Publisher)
Created: 2018