Search Content

Concurrent Checkpointing for Embedded Real-Time Systems

Description

The Internet of Things ecosystem has spawned a wide variety of embedded real-time systems that complicate the identification and resolution of bugs in software. The methods of concurrent checkpoint provide a means to monitor the application state with the ability to replay the execution on like hardware and software,…

The Internet of Things ecosystem has spawned a wide variety of embedded real-time systems that complicate the identification and resolution of bugs in software. The methods of concurrent checkpoint provide a means to monitor the application state with the ability to replay the execution on like hardware and software, without holding off and delaying the execution of application threads. In this thesis, it is accomplished by monitoring physical memory of the application using a soft-dirty page tracker and measuring the various types of overhead when employing concurrent checkpointing. The solution presented is an advancement of the Checkpoint and Replay In Userspace (CRIU) thereby eliminating the large stalls and parasitic operation for each successive checkpoint. Impact and performance is measured using the Parsec 3.0 Benchmark suite and 4.11.12-rt16+ Linux kernel on a MinnowBoard Turbot Quad-Core board.

ContributorsPrinke, Michael L (Author) / Lee, Yann-Hang (Thesis advisor) / Shrivastava, Aviral (Committee member) / Zhao, Ming (Committee member) / Arizona State University (Publisher)

Created2018

Exploration of Edge Machine Learning-based Stress Detection Using Wearable Devices

Description

Stress is one of the critical factors in daily lives, as it has a profound impact onperformance at work and decision-making processes. With the development of IoT technology, smart wearables can handle diverse operations, including networking and recording biometric signals. Also, it has become easier for individual users to selfdetect stress with…

Stress is one of the critical factors in daily lives, as it has a profound impact onperformance at work and decision-making processes. With the development of IoT technology, smart wearables can handle diverse operations, including networking and recording biometric signals. Also, it has become easier for individual users to selfdetect stress with recorded data since these wearables as well as their accompanying smartphones now have data processing capability. Edge computing on such devices enables real-time feedback and in turn preemptive identification of reactions to stress. This can provide an opportunity to prevent more severe consequences that might result if stress is unaddressed. From a system perspective, leveraging edge computing allows saving energy such as network bandwidth and latency since it processes data in proximity to the data source. It can also strengthen privacy by implementing stress prediction at local devices without transferring personal information to the public cloud. This thesis presents a framework for real-time stress prediction using Fitbit and machine learning with the support from cloud computing. Fitbit is a wearable tracker that records biometric measurements using optical sensors on the wrist. It also provides developers with platforms to design custom applications. I developed an application for the Fitbit and the user’s accompanying mobile device to collect heart rate fluctuations and corresponding stress levels entered by users. I also established the dataset collected from police cadets during their academy training program. Machine learning classifiers for stress prediction are built using classic models and TensorFlow in the cloud. Lastly, the classifiers are optimized using model compression techniques for deploying them on the smartphones and analyzed how efficiently stress prediction can be performed on the edge.

ContributorsSim, Sang-Hun (Author) / Zhao, Ming (Thesis advisor) / Roberts, Nicole (Committee member) / Zou, Jia (Committee member) / Arizona State University (Publisher)

Created2022

Towards Scalable Security State Management in The Cloud

Description

Modern data center networks require efficient and scalable security analysis approaches that can analyze the relationship between the vulnerabilities. Utilizing the Attack Representation Methods (ARMs) and Attack Graphs (AGs) enables the security administrator to understand the cloud network’s current security situation at the low-level. However, the AG approach suffers from…

Modern data center networks require efficient and scalable security analysis approaches that can analyze the relationship between the vulnerabilities. Utilizing the Attack Representation Methods (ARMs) and Attack Graphs (AGs) enables the security administrator to understand the cloud network’s current security situation at the low-level. However, the AG approach suffers from scalability challenges. It relies on the connectivity between the services and the vulnerabilities associated with the services to allow the system administrator to realize its security state. In addition, the security policies created by the administrator can have conflicts among them, which is often detected in the data plane of the Software Defined Networking (SDN) system. Such conflicts can cause security breaches and increase the flow rules processing delay. This dissertation addresses these challenges with novel solutions to tackle the scalability issue of Attack Graphs and detect security policy conflictsin the application plane before they are transmitted into the data plane for final installation. Specifically, it introduces a segmentation-based scalable security state (S3) framework for the cloud network. This framework utilizes the well-known divide-and-conquer approach to divide the large network region into smaller, manageable segments. It follows a well-known segmentation approach derived from the K-means clustering algorithm to partition the system into segments based on the similarity between the services. Furthermore, the dissertation presents unified intent rules that abstract the network administration from the underlying network controller’s format. It develops a networking service solution to use a bounded formal model for network service compliance checking that significantly reduces the complexity of flow rule conflict checking at the data plane level. The solution can be expended from a single SDN domain to multiple SDN domains and hybrid networks by applying network service function chaining (SFC) for inter-domain policy management.

ContributorsSabur, Abdulhakim (Author) / Zhao, Ming (Thesis advisor) / Xue, Guoliang (Committee member) / Davulcu, Hasan (Committee member) / Zhang, Yanchao (Committee member) / Arizona State University (Publisher)

Created2023

Understanding Social Media Influence, Semantic Network Analysis, and Thematic Campaign Campaign Classification Using Machine Learning.

Description

Individuals and organizations have greater access to the world's population than ever before. The effects of Social Media Influence have already impacted the behaviour and actions of the world's population. This research employed mixed methods to investigate the mechanisms to further the understand of how Social Media Influence Campaigns (SMIC)…

Individuals and organizations have greater access to the world's population than ever before. The effects of Social Media Influence have already impacted the behaviour and actions of the world's population. This research employed mixed methods to investigate the mechanisms to further the understand of how Social Media Influence Campaigns (SMIC) impact the global community as well as develop tools and frameworks to conduct analysis. The research has qualitatively examined the perceptions of Social Media, specifically how leadership believe it will change and it's role within future conflict. This research has developed and tested semantic ontological modelling to provide insights into the nature of network related behaviour of SMICs. This research also developed exemplar data sets of SMICs. The insights gained from initial research were used to train Machine Learning classifiers to identify thematically related campaigns. This work has been conducted in close collaboration with Alliance Plus Network partner, University of New South Wales and the Australian Defence Force.

ContributorsJohnson, Nathan (Author) / Reisslein, Martin (Thesis advisor) / Turnbull, Benjamin (Committee member) / Zhao, Ming (Committee member) / Zhang, Yanchao (Committee member) / Arizona State University (Publisher)

Created2022

Optimizing Memory and Storage Disaggregation for Data-intensive Systems

Description

Data-intensive systems such as big data and large machine learning (ML) systems experience serious scalability challenges due to the ever-increasing data demand from ML and analytics applications and the resource fragmentation caused by conventional monolithic server architecture. Memory and storage disaggregation emerges as a pivotal technology to address these challenges…

Data-intensive systems such as big data and large machine learning (ML) systems experience serious scalability challenges due to the ever-increasing data demand from ML and analytics applications and the resource fragmentation caused by conventional monolithic server architecture. Memory and storage disaggregation emerges as a pivotal technology to address these challenges by decoupling memory and storage resources from individual servers and managing and provisioning them to applications as a shared resource pool. This dissertation investigates several important aspects of memory and storage disaggregation and proposes novel solutions to support data-intensive applications.First, caching is a fundamental way to utilize disaggregated storage, but building a large disaggregated cache is challenging because the commonly-used fix-sized cache block allocation scheme is unable to provide good cache performance with low memory overhead for diverse cloud workloads with vastly different I/O patterns. The dissertation proposes a novel adaptive cache block allocation approach that dynamically adjusts cache block sizes based on changing I/O patterns. This approach significantly improves I/O performance while reducing memory usage, outperforming traditional fixed-size cache systems in diverse cloud workloads. Evaluation shows that it improves read latency by 20% and write latency by 9%. It also reduces the amount of I/O traffic to cloud block storage by up to 74% while achieving up to 41% memory savings with only 2 ms. Second, large ML applications such as large language model (LLM) inference are memory demanding, but to support them using disaggregated memory brings challenges to memory management since disaggregated memory has higher memory access latency compared to local memory. The dissertation proposes latency-aware memory aggregation which cautiously distributes memory accesses to minimize the latency gap between local and disaggregated memory. It also proposes NUMA-aligned tensor parallelism to further improve the computing efficiency. With these optimizations, LLM inference achieves substantial speedups. For example, first token latency improves by 61%, and end-to-end latency improves by 43% for a LLM inference task which uses a model of 66 billion parameters when the batch size is 8. Finally, to address the cost, power consumption, and volatility of DRAM, the dissertation proposes to incorporate flash memory into memory pools within the disaggregation framework. By establishing a tiered memory architecture which combines fast-tier local DRAM with slow-tier DRAM and flash memory in the memory pool and effectively migrates data based on hotness across memory tiers, this approach not only reduces expenses but also maintains the overall performance and scalability of data-intensive systems. For example, with 50% saving in memory cost, the performance degradation of training ResNet50 on ImageNet dataset is only 2.68%. Together, these contributions systematically optimize the use of memory and storage disaggregation to deliver more efficient, scalable, and cost-effective systems for supporting the data explosion in today’s and future computing systems.

ContributorsYang, Qirui (Author) / Zhao, Ming (Thesis advisor) / Shrivastava, Aviral (Committee member) / Ren, Fengbo (Committee member) / Zou, Jia (Committee member) / Arizona State University (Publisher)

Created2024

FPGA-Based Edge-Computing Acceleration

Description

The rapid growth of Internet-of-things (IoT) and artificial intelligence applications have called forth a new computing paradigm--edge computing. Edge computing applications, such as video surveillance, autonomous driving, and augmented reality, are highly computationally intensive and require real-time processing. Current edge systems are typically based on commodity general-purpose hardware such as…

The rapid growth of Internet-of-things (IoT) and artificial intelligence applications have called forth a new computing paradigm--edge computing. Edge computing applications, such as video surveillance, autonomous driving, and augmented reality, are highly computationally intensive and require real-time processing. Current edge systems are typically based on commodity general-purpose hardware such as Central Processing Units (CPUs) and Graphical Processing Units (GPUs) , which are mainly designed for large, non-time-sensitive jobs in the cloud and do not match the needs of the edge workloads. Also, these systems are usually power hungry and are not suitable for resource-constrained edge deployments. Such application-hardware mismatch calls forth a new computing backbone to support the high-bandwidth, low-latency, and energy-efficient requirements. Also, the new system should be able to support a variety of edge applications with different characteristics. This thesis addresses the above challenges by studying the use of Field Programmable Gate Array (FPGA) -based computing systems for accelerating the edge workloads, from three critical angles. First, it investigates the feasibility of FPGAs for edge computing, in comparison to conventional CPUs and GPUs. Second, it studies the acceleration of common algorithmic characteristics, identified as loop patterns, using FPGAs, and develops a benchmark tool for analyzing the performance of these patterns on different accelerators. Third, it designs a new edge computing platform using multiple clustered FPGAs to provide high-bandwidth and low-latency acceleration of convolutional neural networks (CNNs) widely used in edge applications. Finally, it studies the acceleration of the emerging neural networks, randomly-wired neural networks, on the multi-FPGA platform. The experimental results from this work show that the new generation of workloads requires rethinking the current edge-computing architecture. First, through the acceleration of common loops, it demonstrates that FPGAs can outperform GPUs in specific loops types up to 14 times. Second, it shows the linear scalability of multi-FPGA platforms in accelerating neural networks. Third, it demonstrates the superiority of the new scheduler to optimally place randomly-wired neural networks on multi-FPGA platforms with 81.1 times better throughput than the available scheduling mechanisms.

ContributorsBiookaghazadeh, Saman (Author) / Zhao, Ming (Thesis advisor) / Ren, Fengbo (Thesis advisor) / Li, Baoxin (Committee member) / Seo, Jae-Sun (Committee member) / Arizona State University (Publisher)

Created2021

The Necessity of Error Correction In The Quantum World

Description

Quantum computers provide a promising future, where computationally difficult
problems can be executed exponentially faster than the current classical computers we have in use today. While there is tremendous research and development in the creation of quantum computers, there is a fundamental challenge that exists in the quantum world. Due to…

Quantum computers provide a promising future, where computationally difficult
problems can be executed exponentially faster than the current classical computers we have in use today. While there is tremendous research and development in the creation of quantum computers, there is a fundamental challenge that exists in the quantum world. Due to the fragility of the quantum world, error correction methods have originated since 1995 to tackle the giant problem. Since the birth of the idea that these powerful computers can crunch and process numbers beyond the limit of the current computers, there exist several mathematical error correcting codes that could potentially give the required stability in the fragile and fault tolerant quantum world. While there has been a multitude of possible solutions, there is no one single error correcting code that is the key to solving the problem. Almost every solution presented has shared with it a limiting factor or an issue that prevents it from becoming the breakthrough that is desperately needed.

This paper gives an introductory knowledge of what is the quantum world and why there is a need for error correcting topologies. Finally, it introduces one recent topology that could be added to the list of possible solutions to this central problem. Rather than focusing on the mathematical frameworks, the paper introduces the main concepts so that most readers even outside the major field of computer science can understand what the main problem is and how this topology attempts to solve it.

ContributorsAhmed, Umer (Author) / Colbourn, Charles (Thesis director) / Zhao, Ming (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created2020-05

Filtering by

Concurrent Checkpointing for Embedded Real-Time Systems

Exploration of Edge Machine Learning-based Stress Detection Using Wearable Devices

Towards Scalable Security State Management in The Cloud

Understanding Social Media Influence, Semantic Network Analysis, and Thematic Campaign Campaign Classification Using Machine Learning.

Optimizing Memory and Storage Disaggregation for Data-intensive Systems

FPGA-Based Edge-Computing Acceleration

The Necessity of Error Correction In The Quantum World