ASU Electronic Theses and Dissertations
This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations/theses includes degree information, committee members, an abstract, and any supporting data or media.
In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.
Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection, visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.
Filtering by
- Creators: Davulcu, Hasan
This research investigates the problem of preference-aware skyline processing, which consists of inferring a user's preferences and computing a skyline specific to that user that takes those preferences into account. It proposes a model that transforms the data from a given space into a user preferential space in which each attribute represents the user's preference. Two techniques, "Preferential Skyline Processing" and "Latent Skyline Processing", are proposed to efficiently compute preference-aware skylines in the user preferential space. Finally, extensive experiments and performance analysis confirm the correctness of the recommendations and the ability of the proposed algorithms to outperform the naïve ones.
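To make the idea of a skyline in a "user preferential space" concrete, here is a minimal sketch of the naïve baseline only; it is not the Preferential or Latent Skyline Processing technique, which the abstract does not describe. The transform used (absolute distance of each attribute from a user's ideal value) and the names `dominates`, `skyline_indices`, and `preference_skyline` are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every attribute and strictly
    better in at least one (smaller values are better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline_indices(points: List[Point]) -> List[int]:
    """Naive O(n^2) skyline: indices of points not dominated by any other."""
    return [
        i for i, p in enumerate(points)
        if not any(dominates(q, p) for q in points)
    ]


def preference_skyline(points: List[Point], ideal: Point) -> List[Point]:
    """Skyline computed in a hypothetical preference space.

    ASSUMPTION: each attribute is mapped to its absolute distance from
    the user's ideal value, so "smaller is better" in every dimension.
    The abstract does not specify the actual transform.
    """
    transformed = [tuple(abs(x - t) for x, t in zip(p, ideal)) for p in points]
    return [points[i] for i in skyline_indices(transformed)]


# Example: (price, distance) pairs; this user prefers a price near 120
# and a distance near 0.
hotels = [(50, 5), (100, 2), (150, 1), (120, 4)]
print(preference_skyline(hotels, ideal=(120, 0)))
```

In the raw space the point `(120, 4)` is dominated by `(100, 2)`, but once the user's ideal price is taken into account it enters the skyline and `(50, 5)` drops out, which is the kind of user-specific result the abstract refers to.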
Yet queries often come as part of a query workload. Therefore, there is a need for index structures that take into consideration the existence of multiple queries in a query workload and efficiently produce accurate results for the entire workload. These index structures should scale to large amounts of data as well as large query workloads.
The main objective of this dissertation is to design scalable index structures that are optimized for range query workloads. Range queries are an important type of query with wide-ranging applications. There are no existing index structures that are optimized for efficient execution of range query workloads. There are also unique challenges that need to be addressed for range queries in 1D, 2D, and high-dimensional spaces. In this work, I introduce novel cost models, index selection algorithms, and storage mechanisms that can tackle these challenges and efficiently process a given range query workload in 1D, 2D, and high-dimensional spaces. In particular, I introduce the index structures HCS (for 1D spaces), cSHB (for 2D spaces), and PSLSH (for high-dimensional spaces), which are designed specifically to handle range query workloads efficiently and to address the unique challenges arising from their respective spaces. I experimentally show the effectiveness of the proposed index structures by comparing them with state-of-the-art techniques.
processing, time series, and genome data. In higher dimensions, the curse of dimensionality undermines the effectiveness of most index structures, giving way to approximate methods such as Locality Sensitive Hashing (LSH) for answering similarity searches. In addition to range searches and k-nearest neighbor searches, there is a need to answer negative queries, formed by excluded regions, in high-dimensional data. Though there have been many variants of LSH that improve efficiency, reduce storage, and provide better accuracy, none of these techniques is capable of answering queries in the presence of excluded regions.
This thesis provides a novel approach to handle such negative queries. This is achieved by creating a prefix-based hierarchical index structure. First, the high-dimensional space is projected onto a lower-dimensional space. Then, a one-dimensional ordering is developed that retains the hierarchical traits. The algorithm prunes irrelevant candidates while answering queries in the presence of excluded regions. While naive LSH would need to filter the negative query results out of the main results, the new algorithm minimizes the need to fetch the redundant results in the first place. Experimental results show that this reduces the post-processing cost, thereby reducing the query processing time.
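The naive baseline mentioned above (retrieve LSH candidates first, then filter out anything that falls in an excluded region) can be sketched as follows. This is only an illustration of that baseline, not the prefix-based hierarchical index of the thesis. The class `NaiveLSH`, the function `query_with_exclusions`, the random-projection hash family, and all parameter values are assumptions chosen for the example.

```python
import math
import random
from collections import defaultdict


class NaiveLSH:
    """Random-projection LSH for Euclidean distance (illustrative only)."""

    def __init__(self, dim, num_tables=8, num_hashes=4, width=4.0, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.points = []
        self.tables = []
        for _ in range(num_tables):
            # Each hash function is (random direction, random offset).
            funcs = [
                ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, width))
                for _ in range(num_hashes)
            ]
            self.tables.append((funcs, defaultdict(list)))

    def _key(self, funcs, v):
        # Bucket id per function: floor((a . v + b) / width).
        return tuple(
            math.floor((sum(ai * x for ai, x in zip(a, v)) + b) / self.width)
            for a, b in funcs
        )

    def add(self, v):
        idx = len(self.points)
        self.points.append(tuple(v))
        for funcs, buckets in self.tables:
            buckets[self._key(funcs, v)].append(idx)

    def candidates(self, q):
        found = set()
        for funcs, buckets in self.tables:
            found.update(buckets.get(self._key(funcs, q), ()))
        return found


def query_with_exclusions(index, q, radius, excluded):
    """Approximate range query around q, then post-filter excluded balls.

    `excluded` is a list of (center, radius) pairs. Every candidate is
    fetched before the exclusion test runs, which is the redundant work
    the abstract says the proposed index avoids.
    """
    result = []
    for i in sorted(index.candidates(q)):
        p = index.points[i]
        if math.dist(p, q) > radius:
            continue  # false positive from hashing
        if any(math.dist(p, c) <= r for c, r in excluded):
            continue  # inside an excluded region
        result.append(p)
    return result


index = NaiveLSH(dim=3, seed=1)
for p in [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 5.0, 5.0)]:
    index.add(p)
print(query_with_exclusions(index, (0.0, 0.0, 0.0), 1.0, [((0.1, 0.0, 0.0), 0.05)]))
```

Because the exclusion test happens only after candidates are fetched and distance-checked, the cost of this baseline grows with the number of points that lie in excluded regions, even though none of them are returned.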
To address these challenges, I develop a robust multi-variate feature extraction algorithm over multi-dimensional temporal datasets, which helps in understanding and analyzing various real-world applications. Furthermore, to answer queries over these features, I develop a novel resource-aware indexing framework that approximately solves top-k queries by leveraging onion-layer indexing in conjunction with locality sensitive hashing. The proposed indexing scheme allows top-k queries to be answered by accessing only a bounded amount of data, in effect making big data small for queries.
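As a rough illustration of the onion-layer idea mentioned above, the sketch below peels a 2D point set into nested convex hulls and answers a top-k query under a linear scoring function by scanning only the first k layers. It covers only the onion-layer component, in two dimensions and without the locality sensitive hashing or resource bounds of the proposed framework. The function names (`convex_hull`, `onion_layers`, `top_k`) and the choice of a maximizing linear score are assumptions for the example.

```python
def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices only."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def onion_layers(points):
    """Peel convex hulls repeatedly: layer 0 is the outermost hull."""
    remaining = set(points)
    layers = []
    while remaining:
        hull = convex_hull(list(remaining))
        layers.append(hull)
        remaining -= set(hull)
    return layers


def top_k(layers, weights, k):
    """Top-k under a maximizing linear score, reading only k layers.

    A point on layer j lies inside the hull of every outer layer, so
    each outer layer holds a vertex scoring at least as high. Hence
    the k best scores are always found within the first k layers.
    """
    def score(p):
        return weights[0] * p[0] + weights[1] * p[1]

    candidates = [p for layer in layers[:k] for p in layer]
    return sorted(candidates, key=score, reverse=True)[:k]


points = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 5),
          (4, 6), (6, 4), (5, 9), (9, 5), (1, 1)]
layers = onion_layers(points)
print(top_k(layers, weights=(2, 3), k=3))
```

The layers are built once, independently of any particular weights, so the amount of data a query touches is bounded by the size of the first k layers rather than by the size of the whole dataset.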