Filtering by
- Creators: Barrett, The Honors College
This research investigates the problem of preference aware skyline processing which consists of inferring the preferences of users and computing a skyline specific to that user, taking into account his preferences. This research proposes a model that transforms the data from a given space to a user preferential space where each attribute represents the preference of the user. This study proposes two techniques "Preferential Skyline Processing" and "Latent Skyline Processing" to efficiently compute preference aware skylines in the user preferential space. Finally, through extensive experiments and performance analysis the correctness of the recommendations and the algorithm's ability to outperform the naïve ones is confirmed.
Efficient processing of skyline queries on static data sources, data streams and incomplete datasets
An assumption commonly made by many skyline algorithms is that a skyline query is applied to a single static data source or data stream. Unfortunately, this assumption does not hold in many applications in which a skyline query may involve attributes belonging to multiple data sources and requires a join operation to be performed before the skyline can be produced. Recently, various skyline-join algorithms have been proposed to address this problem in the context of static data sources. However, these algorithms suffer from several drawbacks: they often need to scan the data sources exhaustively to obtain the skyline-join results; moreover, the pruning techniques employed to eliminate tuples are largely based on expensive tuple-to-tuple comparisons. On the other hand, most data stream techniques focus on single stream skyline queries, thus rendering them unsuitable for skyline-join queries.
Another assumption typically made by most of the earlier skyline algorithms is that the data is complete and all skyline attribute values are available. Due to this constraint, these algorithms cannot be applied to incomplete data sources in which some of the attribute values are missing and are represented by NULL values. There exists a definition of dominance for incomplete data, but this leads to undesirable consequences such as non-transitive and cyclic dominance relations both of which are detrimental to skyline processing.
Based on the aforementioned observations, the main goal of the research described in this dissertation is the design and development of a framework of skyline operators that effectively handles three distinct types of skyline queries: 1) skyline-join queries on static data sources, 2) skyline-window-join queries over data streams, and 3) strata-skyline queries on incomplete datasets. This dissertation presents the unique challenges posed by these skyline queries and addresses the shortcomings of current skyline techniques by proposing efficient methods to tackle the added overhead in processing skyline queries on static data sources, data streams, and incomplete datasets.
Geology and its tangential studies, collectively known and referred to in this thesis as geosciences, have been paramount to the transformation and advancement of society, fundamentally changing the way we view, interact and live with the surrounding natural and built environment. It is important to recognize the value and importance of this interdisciplinary scientific field while reconciling its ties to imperial and colonizing extractive systems which have led to harmful and invasive endeavors. This intersection among geosciences, (environmental) justice studies, and decolonization is intended to promote inclusive pedagogical models through just and equitable methodologies and frameworks as to prevent further injustices and promote recognition and healing of old wounds. By utilizing decolonial frameworks and highlighting the voices of peoples from colonized and exploited landscapes, this annotated syllabus tackles the issues previously described while proposing solutions involving place-based education and the recentering of land within geoscience pedagogical models. (abstract)
The ASU COVID-19 testing lab process was developed to operate as the primary testing site for all ASU staff, students, and specified external individuals. Tests are collected at various collection sites, including a walk-in site at the SDFC and various drive-up sites on campus; analysis is conducted on ASU campus and results are distributed virtually to all patients via the Health Services patient portal. The following is a literature review on past implementations of various process improvement techniques and how they can be applied to the ABCTL testing process to achieve laboratory goals. (abstract)
for capturing different aspects of the real world attributes has also led to an increase in dimensionality from uni-variate to multi-variate time series. This has facilitated richer data representation but also has necessitated algorithms determining similarity between two multi-variate time series for search and analysis.
Various algorithms have been extended from uni-variate to multi-variate case, such as multi-variate versions of Euclidean distance, edit distance, dynamic time warping. However, it has not been studied how these algorithms account for asynchronous in time series. Human gestures, for example, exhibit asynchrony in their patterns as different subjects perform the same gesture with varying movements in their patterns at different speeds. In this thesis, we propose several algorithms (some of which also leverage metadata describing the relationships among the variates). In particular, we present several techniques that leverage the contextual relationships among the variates when measuring multi-variate time series similarities. Based on the way correlation is leveraged, various weighing mechanisms have been proposed that determine the importance of a dimension for discriminating between the time series as giving the same weight to each dimension can led to misclassification. We next study the robustness of the considered techniques against different temporal asynchronies, including shifts and stretching.
Exhaustive experiments were carried on datasets with multiple types and amounts of temporal asynchronies. It has been observed that accuracy of algorithms that rely on data to discover variate relationships can be low under the presence of temporal asynchrony, whereas in case of algorithms that rely on external metadata, robustness against asynchronous distortions tends to be stronger. Specifically, algorithms using external metadata have better classification accuracy and cluster separation than existing state-of-the-art work, such as EROS, PCA, and naive dynamic time warping.
In this thesis, I propose novel spatial partitioning techniques, Max Containment Clustering and Max Containment Clustering with Separation, to create load-balanced partitions of a range query workload. Each partition takes a similar amount of time to process the spatial queries and reduces the response latency by minimizing the disk access cost and optimizing the bitmap operations. The partitions created are processed in parallel using cSHB indexes. The proposed techniques utilize the block-based organization of bitmaps in the cSHB index and improve the performance of the cSHB index for processing a range query workload.