Search Content

Client-driven dynamic database updates

Description

This thesis addresses the problem of online schema updates where the goal is to be able to update relational database schemas without reducing the database system's availability. Unlike some other work in this area, this thesis presents an approach which is completely client-driven and does not require specialized database management…

This thesis addresses the problem of online schema updates where the goal is to be able to update relational database schemas without reducing the database system's availability. Unlike some other work in this area, this thesis presents an approach which is completely client-driven and does not require specialized database management systems (DBMS). Also, unlike other client-driven work, this approach provides support for a richer set of schema updates including vertical split (normalization), horizontal split, vertical and horizontal merge (union), difference and intersection. The update process automatically generates a runtime update client from a mapping between the old the new schemas. The solution has been validated by testing it on a relatively small database of around 300,000 records per table and less than 1 Gb, but with limited memory buffer size of 24 Mb. This thesis presents the study of the overhead of the update process as a function of the transaction rates and the batch size used to copy data from the old to the new schema. It shows that the overhead introduced is minimal for medium size applications and that the update can be achieved with no more than one minute of downtime.

ContributorsTyagi, Preetika (Author) / Bazzi, Rida (Thesis advisor) / Candan, Kasim S (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)

Created2011

The design and analysis of hash families for use in broadcast encryption

Description

Broadcast Encryption is the task of cryptographically securing communication in a broadcast environment so that only a dynamically specified subset of subscribers, called the privileged subset, may decrypt the communication. In practical applications, it is desirable for a Broadcast Encryption Scheme (BES) to demonstrate resilience against attacks by colluding, unprivileged…

Broadcast Encryption is the task of cryptographically securing communication in a broadcast environment so that only a dynamically specified subset of subscribers, called the privileged subset, may decrypt the communication. In practical applications, it is desirable for a Broadcast Encryption Scheme (BES) to demonstrate resilience against attacks by colluding, unprivileged subscribers. Minimal Perfect Hash Families (PHFs) have been shown to provide a basis for the construction of memory-efficient t-resilient Key Pre-distribution Schemes (KPSs) from multiple instances of 1-resilient KPSs. Using this technique, the task of constructing a large t-resilient BES is reduced to finding a near-minimal PHF of appropriate parameters. While combinatorial and probabilistic constructions exist for minimal PHFs with certain parameters, the complexity of constructing them in general is currently unknown. This thesis introduces a new type of hash family, called a Scattering Hash Family (ScHF), which is designed to allow for the scalable and ingredient-independent design of memory-efficient BESs for large parameters, specifically resilience and total number of subscribers. A general BES construction using ScHFs is shown, which constructs t-resilient KPSs from other KPSs of any resilience ≤w≤t. In addition to demonstrating how ScHFs can be used to produce BESs , this thesis explores several ScHF construction techniques. The initial technique demonstrates a probabilistic, non-constructive proof of existence for ScHFs . This construction is then derandomized into a direct, polynomial time construction of near-minimal ScHFs using the method of conditional expectations. As an alternative approach to direct construction, representing ScHFs as a k-restriction problem allows for the indirect construction of ScHFs via randomized post-optimization. Using the methods defined, ScHFs are constructed and the parameters' effects on solution size are analyzed. For large strengths, constructive techniques lose significant performance, and as such, asymptotic analysis is performed using the non-constructive existential results. This work concludes with an analysis of the benefits and disadvantages of BESs based on the constructed ScHFs. Due to the novel nature of ScHFs, the results of this analysis are used as the foundation for an empirical comparison between ScHF-based and PHF-based BESs . The primary bases of comparison are construction efficiency, key material requirements, and message transmission overhead.

ContributorsO'Brien, Devon James (Author) / Colbourn, Charles J (Thesis advisor) / Bazzi, Rida (Committee member) / Richa, Andrea (Committee member) / Arizona State University (Publisher)

Created2012

Expansion Algorithms in Self-Organizing Particle Systems

Description

A primary goal in computer science is to develop autonomous systems. Usually, we provide computers with tasks and rules for completing those tasks, but what if we could extend this type of system to physical technology as well? In the field of programmable matter, researchers are tasked with developing synthetic…

A primary goal in computer science is to develop autonomous systems. Usually, we provide computers with tasks and rules for completing those tasks, but what if we could extend this type of system to physical technology as well? In the field of programmable matter, researchers are tasked with developing synthetic materials that can change their physical properties \u2014 such as color, density, and even shape \u2014 based on predefined rules or continuous, autonomous collection of input. In this research, we are most interested in particles that can perform computations, bond with other particles, and move. In this paper, we provide a theoretical particle model that can be used to simulate the performance of such physical particle systems, as well as an algorithm to perform expansion, wherein these particles can be used to enclose spaces or even objects.

ContributorsLaff, Miles (Author) / Richa, Andrea (Thesis director) / Bazzi, Rida (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor)

Created2015-05

UnSync: a soft error resilient redundant CMP architecture

Description

Reducing device dimensions, increasing transistor densities, and smaller timing windows, expose the vulnerability of processors to soft errors induced by charge carrying particles. Since these factors are inevitable in the advancement of processor technology, the industry has been forced to improve reliability on general purpose Chip Multiprocessors (CMPs). With the…

Reducing device dimensions, increasing transistor densities, and smaller timing windows, expose the vulnerability of processors to soft errors induced by charge carrying particles. Since these factors are inevitable in the advancement of processor technology, the industry has been forced to improve reliability on general purpose Chip Multiprocessors (CMPs). With the availability of increased hardware resources, redundancy based techniques are the most promising methods to eradicate soft error failures in CMP systems. This work proposes a novel customizable and redundant CMP architecture (UnSync) that utilizes hardware based detection mechanisms (most of which are readily available in the processor), to reduce overheads during error free executions. In the presence of errors (which are infrequent), the always forward execution enabled recovery mechanism provides for resilience in the system. The inherent nature of UnSync architecture framework supports customization of the redundancy, and thereby provides means to achieve possible performance-reliability trade-offs in many-core systems. This work designs a detailed RTL model of UnSync architecture and performs hardware synthesis to compare the hardware (power/area) overheads incurred. It then compares the same with those of the Reunion technique, a state-of-the-art redundant multi-core architecture. This work also performs cycle-accurate simulations over a wide range of SPEC2000, and MiBench benchmarks to evaluate the performance efficiency achieved over that of the Reunion architecture. Experimental results show that, UnSync architecture reduces power consumption by 34.5% and improves performance by up to 20% with 13.3% less area overhead, when compared to Reunion architecture for the same level of reliability achieved.

ContributorsHong, Fei (Author) / Shrivastava, Aviral (Thesis advisor) / Bazzi, Rida (Committee member) / Fainekos, Georgios (Committee member) / Arizona State University (Publisher)

Created2011

Why Students in Computer Science Courses Cheat?

DescriptionThe goal of this study is to equip administrators and instructors with a deeper understanding of the apparent cheating problem in Computer Science courses, with proposed solutions to lower academic dishonesty from the students’ perspective.

ContributorsAl Yasari, Farah (Co-author) / Alyasari, Farah (Co-author) / Tadayon-Navabi, Farideh (Thesis director) / Bazzi, Rida (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created2019-05

Design Patterns for Distributed Systems

Description

Implementing a distributed algorithm is more complicated than implementing a non-distributed algorithm. This is because distributed systems involve coordination of different processes each of which has a partial view of the global system state. The only way to share information in a distributed system is by message passing. Task that…

Implementing a distributed algorithm is more complicated than implementing a non-distributed algorithm. This is because distributed systems involve coordination of different processes each of which has a partial view of the global system state. The only way to share information in a distributed system is by message passing. Task that are straightforward in a non-distributed system, like deciding on the value of a global system state, can be quite complicated to achieve in a distributed system [1]. On top of the difficulties caused by the distributed nature of the computations, distributed systems typically need to be able to operate normally even if some of the nodes in the system are faulty which further adds to the uncertainty that processes have about the global state. Many factors make the implementation of a distributed algorithms difficult. Design patterns [2] are useful in simplifying the development of general algorithms. A design pattern describes a high level solution to a common, abstract problem that many systems may face. Common structural, creational, and behavioral problems are identified and elegantly solved by design patterns. By identifying features that an algorithm uses, and framing each feature as one of the common problems that a specific design pattern solves, designing a robust implementation of an algorithm becomes more manageable. In this way, design patterns can aid the implementation of algorithms. Unfortunately, design patterns are typically not discussed when developing distributed algorithms. Because correctly developing a distributed algorithm is difficult, many papers (eg. [1], [3], [4]) focus on verifying the correctness of the developed algorithm. Papers that are more practical ([5], [6]) establish the correctness of their algorithm and that their algorithm is efficient enough to be practical. However, papers on distributed algorithms usually make little mention of design patterns. The goal of this work was to gain experience implementing distributed systems including learning the application of design patterns and the application of related technical topics. This was achieved by implementing a currently unpublished algorithm that is tentatively called Bakery Consensus. Bakery Consensus is a replicated state-machine protocol that can tolerate servers with Byzantine faults, but assumes non-faulty clients. The algorithm also establishes non-skipping timestamps for each operation completed by the replicated state-machine. The design of the structure, communication, and creation of the different system parts depended heavily upon the book Design Patterns [2]. After implementing the system, the success of the in implementing its various parts was based upon their ability to satisfy the SOLID [7] principles as well as their ability to establish low coupling and high cohesion [8]. The rest of this paper is organized as follows. We begin by providing background information about distributed algorithms, including replicated state-machine protocols and the Bakery Consensus algorithm. Section 3 gives a background on several design patterns and software engineering principles that were used in the development process. Section 4 discusses the well designed parts of the system that used design patterns, and how these design patterns were chosen. Section 5 discusses well designed system parts that relied upon other technical topics. Section 6 discusses system parts that need redesign. The conclusion summarizes what was accomplished by the implementation process and the lessons learned about design patterns for distributed algorithms.

ContributorsStoutenburg, Tristan Kaleb (Author) / Bazzi, Rida (Thesis director) / Richa, Andrea (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created2017-05

WFNAS: Weight-Agnostic Federated Neural Architecture Search

Description

Federated Learning (FL) is envisaged to be a promising solution for collaboratively training a machine learning model while keeping the training data decentralized and private. Instead of sharing raw data to the central entity, the participating client devices share focused updates for aggregation to ensure global convergence of the model.…

Federated Learning (FL) is envisaged to be a promising solution for collaboratively training a machine learning model while keeping the training data decentralized and private. Instead of sharing raw data to the central entity, the participating client devices share focused updates for aggregation to ensure global convergence of the model. Owing to the shortcomings of manually handcrafted neural network architectures, the research community is striving to develop Neural Architecture Search (NAS) approaches to automatically search for optimal networks that fit the clients’ data. Despite the inaccessibility of clients’ data in an FL setting, the federated NAS literature has recently witnessed great progress to apply these NAS techniques to an FL setting. However, one of the key bottlenecks of Federated Learning is the cost of communication between clients and the server, and the state-of-the-art federated NAS techniques search for networks with millions of parameters that require several rounds of communication to find the optimal weight parameters. Also, deploying a network having millions of parameters on edge devices (which are the typical participants in an FL process) is infeasible due to its computational limitations and increased latency. Thus, this work proposes Weight-Agnostic Federated Neural Architecture Search (WFNAS), a novel evolutionary framework to search for well-performing and minimally connected weight-agnostic network architectures in an FL setting. As the connectivity of the networks themselves is the solution, there is no need for weight training and hyperparameter tuning, reducing the communication overhead significantly. The experiments indicate a gain of nearly 40% for orthogonal (vertical FL) data distributions compared to local training. This work is the first federated NAS technique in the literature for vertical FL. Although the experiments are performed in a resource-constrained environment, the aim of this thesis is to show a new direction of research to the FL community.

ContributorsThakkar, Om (Author) / Bazzi, Rida (Thesis advisor) / Li, Baoxin (Committee member) / Zhang, Yu (Committee member) / Arizona State University (Publisher)

Created2021

Analysis of Russian Apps for TSPU-Related Risks

Description

The landscape of internet freedom and surveillance is constantly evolving, with various countries employing technical measures to control online information and monitor citizens. Russia's internet ecosystem presents a unique case study, with the recent establishment of a domestic Trusted Root Certificate Authority (CA) and the ongoing utilization of the "Technical…

The landscape of internet freedom and surveillance is constantly evolving, with various countries employing technical measures to control online information and monitor citizens. Russia's internet ecosystem presents a unique case study, with the recent establishment of a domestic Trusted Root Certificate Authority (CA) and the ongoing utilization of the "Technical Measures to Combat Threats" (TSPU) devices with government-mandated deployment by internet service providers. This thesis investigates the potential risks associated with these developments, focusing on the vulnerability of Russian Android applications to targeted JavaScript attacks compromising the privacy and security of their users.This analysis of Russian Android applications reveals the existence of the Russian CA certificate embedded into the application packages, enabling the Russian government to intercept and manipulate encrypted TLS traffic. Simulating TSPU behavior with mitmproxy demonstrates the susceptibility of all tested applications to JavaScript injection attacks, allowing targeted government surveillance. This thesis proposes several mitigation strategies and highlights the need for a systemic solution to address the security risks associated with government-controlled CAs in applications, considering Google Play Market restrictions on such certificate inclusion. This thesis contributes to the evolving discussion on internet freedom and cybersecurity in Russia by exposing the unique vulnerabilities faced by users within the Russian digital ecosystem.

ContributorsKonyukhov, Vitaliy (Author) / Crandall, Jedidiah (Thesis advisor) / Wang, Fish (Committee member) / Bazzi, Rida (Committee member) / Arizona State University (Publisher)

Created2024

Predicting and Interpreting Students Performance using Supervised Learning and Shapley Additive Explanations

Description

Due to large data resources generated by online educational applications, Educational Data Mining (EDM) has improved learning effects in different ways: Students Visualization, Recommendations for students, Students Modeling, Grouping Students, etc. A lot of programming assignments have the features like automating submissions, examining the test cases to verify the correctness,…

Due to large data resources generated by online educational applications, Educational Data Mining (EDM) has improved learning effects in different ways: Students Visualization, Recommendations for students, Students Modeling, Grouping Students, etc. A lot of programming assignments have the features like automating submissions, examining the test cases to verify the correctness, but limited studies compared different statistical techniques with latest frameworks, and interpreted models in a unified approach.

In this thesis, several data mining algorithms have been applied to analyze students’ code assignment submission data from a real classroom study. The goal of this work is to explore

and predict students’ performances. Multiple machine learning models and the model accuracy were evaluated based on the Shapley Additive Explanation.

The Cross-Validation shows the Gradient Boosting Decision Tree has the best precision 85.93% with average 82.90%. Features like Component grade, Due Date, Submission Times have higher impact than others. Baseline model received lower precision due to lack of non-linear fitting.

ContributorsTian, Wenbo (Author) / Hsiao, Ihan (Thesis advisor) / Bazzi, Rida (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)

Created2019

An Investigation of Flow-based Algorithms for Sybil Defense

Description

Distributed systems are prone to attacks, called Sybil attacks, wherein an adversary may generate an unbounded number of bogus identities to gain control over the system. In this thesis, an algorithm, DownhillFlow, for mitigating such attacks is presented and

tested experimentally. The trust rankings produced by the algorithm are significantly better…

Distributed systems are prone to attacks, called Sybil attacks, wherein an adversary may generate an unbounded number of bogus identities to gain control over the system. In this thesis, an algorithm, DownhillFlow, for mitigating such attacks is presented and

tested experimentally. The trust rankings produced by the algorithm are significantly better than those of the distributed SybilGuard protocol and only slightly worse than those of the best-known Sybil defense algorithm, ACL. The results obtained for ACL are

consistent with those obtained in previous studies. The running times of the algorithms are also tested and two results are obtained: first, DownhillFlow’s running time is found to be significantly faster than any existing algorithm including ACL, terminating in

slightly over one second on the 300,000-node DBLP graph. This allows it to be used in settings such as dynamic networks as-is with no additional functionality needed. Second, when ACL is configured such that it matches DownhillFlow’s speed, it fails to recognize

large portions of the input graphs and its accuracy among the portion of the graphs it does recognize becomes lower than that of DownhillFlow.

ContributorsBradley, Michael (Author) / Bazzi, Rida (Thesis advisor) / Richa, Andrea (Committee member) / Sen, Arunabha (Committee member) / Arizona State University (Publisher)

Created2018

Filtering by