On Processing Spatial Queries in Graph Database Management Systems

Description

Spatial data is fundamental to many applications, such as map services and land resource management. Spatial data also inherently comes with abundant context information, because spatial entities themselves possess different properties, e.g., graph or textual information. Among these compound forms of spatial data, geospatial graph data is one of the most challenging because of the complexity of graph data. Graph data is commonly used to model real scenarios, and searching for matching subgraphs is fundamental to retrieving and analyzing graph data. With the ubiquity of spatial data, vertexes or edges in graphs are enriched with spatial location attributes side by side with other non-spatial attributes. Graph-based applications integrate spatial data into the graph model and provide more spatially aware services. The co-existence of graph and spatial data in the same geospatial graph gives rise to new applications. To solve the new problems in these applications, existing solutions develop an integrated system that incorporates graph database and spatial database engines. However, existing approaches suffer from an architecture in which graph data and spatial data are isolated. In this dissertation, I will explain two indexing frameworks, GeoReach and RisoTree, which can significantly accelerate queries in geospatial graphs. GeoReach includes a query operator that adds spatial data awareness to a graph database management system. In GeoReach, the neighborhood spatial information is summarized and stored on each vertex in the graph. The summarization includes three different structures, chosen according to the location distribution. These spatial summaries are utilized to terminate the graph search early.

RisoTree is a hierarchical tree structure where each node is represented by a minimum bounding rectangle (MBR). The MBR of a node is a rectangle that encloses all its children. A key difference between RisoTree and the R-tree is that RisoTree attaches pre-materialized subgraph information to each index node. The subgraph information is utilized during the spatial index search phase to prune search paths that cannot satisfy the query graph pattern. The RisoTree index thus reduces the search space during the spatial filtering phase at relatively light cost.
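The pruning idea in this abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the `Node` class, the `labels` set (a simplified stand-in for the pre-materialized subgraph information), and the helper names are all assumptions made for illustration. Each parent's MBR encloses its children, and a search path is abandoned as soon as the MBR misses the query window or the node's summary shows the required label cannot appear below it.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

# (min_x, min_y, max_x, max_y)
Rect = Tuple[float, float, float, float]


def enclose(rects: List[Rect]) -> Rect:
    """Return the minimum bounding rectangle of a list of rectangles."""
    min_xs, min_ys, max_xs, max_ys = zip(*rects)
    return (min(min_xs), min(min_ys), max(max_xs), max(max_ys))


def intersects(a: Rect, b: Rect) -> bool:
    """True if the two rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


@dataclass
class Node:
    mbr: Rect
    # Hypothetical summary: labels reachable from the spatial objects below
    # this node. The real index stores richer subgraph information.
    labels: Set[str]
    children: List["Node"] = field(default_factory=list)


def make_parent(children: List[Node]) -> Node:
    """Build an index node whose MBR and summary cover all its children."""
    return Node(
        mbr=enclose([c.mbr for c in children]),
        labels=set().union(*(c.labels for c in children)),
        children=children,
    )


def search(node: Node, window: Rect, label: str) -> List[Rect]:
    """Collect leaf MBRs inside the window whose summary contains the label."""
    # Prune on either the spatial test or the graph-summary test.
    if not intersects(node.mbr, window) or label not in node.labels:
        return []
    if not node.children:
        return [node.mbr]
    hits: List[Rect] = []
    for child in node.children:
        hits.extend(search(child, window, label))
    return hits


a = Node((0, 0, 1, 1), {"school"})
b = Node((2, 2, 3, 3), {"park"})
c = Node((10, 10, 11, 11), {"school"})
root = make_parent([make_parent([a, b]), make_parent([c])])
matches = search(root, (0, 0, 5, 5), "school")
```

In this toy tree, the subtree holding `c` is skipped by the spatial test and leaf `b` is skipped by the label test, so only `a` is returned.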

Contributors: Sun, Yuhan (Author) / Sarwat, Mohamed (Thesis advisor) / Tong, Hanghang (Committee member) / Candan, Kasim S (Committee member) / Zhao, Ming (Committee member) / Arizona State University (Publisher)

Created: 2021

Database Storage Design for Model Serving Workloads

Description

The meteoric rise of Deep Neural Networks (DNN) has led to the development of various Machine Learning (ML) frameworks (e.g., TensorFlow, PyTorch). Every ML framework has a different way of handling DNN models, data types, the operations involved, and the internal representations stored on disk or in memory. There have been initiatives such as the Open Neural Network Exchange (ONNX) that aim for a more standardized approach to machine learning and better interoperability between the popular ML frameworks. Model Serving Platforms (MSP) (e.g., TensorFlow Serving, Clipper) are used for serving DNN models to applications and edge devices. These platforms have gained widespread use for their flexibility in serving DNN models created by various ML frameworks. They also have additional capabilities such as caching, automatic ensembling, and scheduling. However, few of these frameworks focus on optimizing the storage of these DNN models, some of which may take up to ∼130GB of storage space (“Turing-NLG: A 17-billion-parameter language model by Microsoft” 2020). These MSPs leave it to the ML frameworks to optimize the DNN model with various model compression techniques, such as quantization and pruning. This thesis investigates the viability of automatic cross-model compression using traditional deduplication techniques and storage optimizations. Scenarios are identified where different DNN models have shareable model weight parameters. “Chunking” a model into smaller pieces is explored as an approach for deduplication. This thesis also proposes a design for storage in a Relational Database Management System (RDBMS) that allows for automatic cross-model deduplication.
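As a rough illustration of chunk-based cross-model deduplication, the sketch below splits serialized weights into fixed-size chunks and stores each distinct chunk once, keyed by its SHA-256 digest. It is an assumption-laden toy, not the thesis's design: the `ChunkStore` class, the fixed chunk size, and the in-memory dictionaries (standing in for RDBMS tables) are all hypothetical.

```python
import hashlib
from typing import Dict, List


def chunk_bytes(data: bytes, chunk_size: int) -> List[bytes]:
    """Split data into consecutive fixed-size chunks (the last may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


class ChunkStore:
    """Toy content-addressed store; dicts stand in for database tables."""

    def __init__(self, chunk_size: int = 4) -> None:
        self.chunk_size = chunk_size
        self.chunks: Dict[str, bytes] = {}      # digest -> chunk payload
        self.models: Dict[str, List[str]] = {}  # model name -> ordered digests

    def put(self, name: str, weights: bytes) -> None:
        """Store a model, writing only chunks not already present."""
        digests = []
        for chunk in chunk_bytes(weights, self.chunk_size):
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # identical chunks stored once
            digests.append(digest)
        self.models[name] = digests

    def get(self, name: str) -> bytes:
        """Reassemble a model from its ordered list of chunk digests."""
        return b"".join(self.chunks[d] for d in self.models[name])


store = ChunkStore(chunk_size=4)
store.put("model_a", b"AAAABBBBCCCC")
store.put("model_b", b"AAAABBBBDDDD")  # shares its first two chunks with model_a
```

Here the two models contain six chunks in total, but only four distinct chunks are stored, and each model can still be reassembled exactly from its digest list.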

Contributors: Das, Amitabh (Author) / Zou, Jia (Thesis advisor) / Zhao, Ming (Thesis advisor) / Yang, Yingzhen (Committee member) / Arizona State University (Publisher)

Created: 2021

The Necessity of Error Correction In The Quantum World

Description

Quantum computers promise a future in which computationally difficult problems can be solved exponentially faster than on the classical computers in use today. While there is tremendous research and development in the creation of quantum computers, a fundamental challenge exists in the quantum world. Because of the fragility of the quantum world, error correction methods have been developed since 1995 to tackle this giant problem. Since the birth of the idea that these powerful computers can crunch and process numbers beyond the limits of current computers, several mathematical error correcting codes have been proposed that could potentially provide the required stability and fault tolerance in the fragile quantum world. While there has been a multitude of possible solutions, there is no single error correcting code that is the key to solving the problem. Almost every solution presented carries a limiting factor or an issue that prevents it from becoming the breakthrough that is desperately needed.

This paper gives an introduction to the quantum world and explains why error correcting topologies are needed. Finally, it introduces one recent topology that could be added to the list of possible solutions to this central problem. Rather than focusing on the mathematical frameworks, the paper introduces the main concepts so that most readers, even those outside the field of computer science, can understand what the main problem is and how this topology attempts to solve it.

Contributors: Ahmed, Umer (Author) / Colbourn, Charles (Thesis director) / Zhao, Ming (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2020-05