Matching Items (10)

Query System for epiDMS and EnergyPlus

Description

With the development of technology, there has been a dramatic increase in the number of machine learning programs. These complex programs make conclusions and can predict or perform actions based on models built from previous runs or input information. However, such programs require storing very large amounts of data. Queries allow users to extract only the information relevant to their investigation. The purpose of this thesis was to create a system with two important components: querying and visualization. Metadata was stored in Sedna as XML, and time series data was stored in OpenTSDB as JSON. To connect the two databases, the time series ID was stored as a metric in the XML metadata. Queries should be simple and flexible and should return all data that fits the query parameters; the query language used was an extension of XQuery FLWOR that added time series parameters. Visualization should be easy to understand and organized so that important information and details are easy to find. Because a query may return a large amount of data, a multivariate heat map was used to visualize the time series results. The two programs the system performed queries on were EnergyPlus and the Epidemic Simulation Data Management System (epiDMS). Such a system makes it easier for people in these fields to find the relationships between metadata that lead to the desired results over time. Over the course of the thesis project, the overall software was completed; however, it must still be optimized to handle the enormous amount of data expected from the system.
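The metadata-to-time-series link described above can be sketched as follows. This is only an illustration of the idea: the element names, the `tsid` field, and the sample data are assumptions for the sketch, not the thesis system's actual Sedna/OpenTSDB schema.

```python
# Hypothetical sketch: XML metadata (as in Sedna) joined to time series
# records (as in OpenTSDB) via a time-series ID stored in the metadata.
import xml.etree.ElementTree as ET

metadata_xml = """
<simulations>
  <run><building>OfficeA</building><tsid>ts1</tsid></run>
  <run><building>OfficeB</building><tsid>ts2</tsid></run>
</simulations>
"""

# Time series keyed by the ID stored in the XML metadata.
timeseries = {
    "ts1": [("2015-01-01T00:00", 21.5), ("2015-01-01T01:00", 22.0)],
    "ts2": [("2015-01-01T00:00", 19.8)],
}

def query(building, start, end):
    """FLWOR-style: 'for' each run, 'where' the metadata matches, 'return'
    only the time series points inside [start, end]."""
    root = ET.fromstring(metadata_xml)
    results = []
    for run in root.findall("run"):
        if run.findtext("building") == building:
            tsid = run.findtext("tsid")
            points = [(t, v) for t, v in timeseries.get(tsid, [])
                      if start <= t <= end]
            results.append((tsid, points))
    return results

print(query("OfficeA", "2015-01-01T00:00", "2015-01-01T00:30"))
# → [('ts1', [('2015-01-01T00:00', 21.5)])]
```

The time-range filter is the "time series parameter" added on top of the ordinary metadata predicate.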

Date Created
  • 2015-05

MosquitoDB

Description

Mosquito population data is a valuable resource for researchers and public health officials working to limit the spread of deadly zoonotic viruses such as Zika Virus and West Nile Virus. Unfortunately, this data is currently difficult to obtain and aggregate across the United States. Obtaining historical data often requires filing requests with individual states or counties and hoping for a response. Current online systems for accessing aggregated data lack essential features or are limited in scope. To make mosquito population data more accessible to United States researchers, epidemiologists, and public health officials, the MosquitoDB system has been developed. MosquitoDB consists of a JavaScript Web Application, connected to a SQL database, that makes submitting and retrieving United States mosquito population data much simpler and more straightforward than alternative systems. The MosquitoDB software project is open source and publicly available on GitHub, allowing community scrutiny and contributions to add or improve necessary features. For this Creative Project, the core MosquitoDB system was designed and developed with three main features: 1) a Web Interface for querying mosquito data; 2) a Web Interface for submitting mosquito data; 3) Web Services for querying/retrieving and submitting mosquito data. The Web Interface is essential for common end users, such as researchers and public health officials, to access historical data or submit new data. The Web Services provide building blocks that other developers can use to incorporate the data into new applications. The current MosquitoDB system is live at https://zodo.asu.edu/mosquito and the public code repository is available at https://github.com/developerDemetri/mosquitodb.
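The kind of SQL-backed aggregation a query web service like this might run can be sketched with an in-memory database. The table and column names below are assumptions for illustration, not MosquitoDB's actual schema.

```python
# Illustrative sketch of an aggregation query behind a population-data
# web service; schema names are hypothetical, not MosquitoDB's.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE trap_counts (
    county TEXT, species TEXT, year INTEGER, mosquito_count INTEGER)""")
conn.executemany(
    "INSERT INTO trap_counts VALUES (?, ?, ?, ?)",
    [("Maricopa", "Culex quinquefasciatus", 2016, 120),
     ("Maricopa", "Aedes aegypti", 2016, 45),
     ("Pima", "Culex quinquefasciatus", 2016, 30)])

# Aggregate counts per county for one species, as a query service might.
rows = conn.execute(
    """SELECT county, SUM(mosquito_count) FROM trap_counts
       WHERE species = ? GROUP BY county ORDER BY county""",
    ("Culex quinquefasciatus",)).fetchall()
print(rows)  # → [('Maricopa', 120), ('Pima', 30)]
```

A web service would wrap a parameterized query like this and return the rows as JSON.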

Date Created
  • 2016-12

The Coffee Hutch

Description

The Coffee Hutch project is derived from the field of Computer Science and consists of a website, a database, and a mobile application for Android devices. This three-tiered scheme is designed to support a point-of-sale payment system to be integrated with a standalone product dispensing machine. The website contains landing pages which provide navigation and functional capabilities for users. The site also features a variety of PHP web services which communicate with the database using SQL commands. The application, programmed in Java, makes use of these services in a simple, utilitarian design aimed at modifying user data stored in the database. This database, developed with MySQL and managed with the phpMyAdmin application, contains limited information in order to maximize the speed of read and write accesses from the website and Android app. Together, these three components comprise an effective payment management system model with mobile capabilities. All of the components of this project were built at no cost: the website hosting service is free, and the required third-party services (such as PayPal payment services) are simulated. These simulations allowed me to demonstrate the functionality of the three-tiered product without any monetary outlay. This thesis documents every aspect of the development and testing of The Coffee Hutch software components. Requirements for each function of the software are specified in one section and aligned with various pieces of the code in the source documentation. Test cases which address each requirement are outlined in another section of the thesis.

Date Created
  • 2016-12

Developing Inventory Control and Build Management Software for Spacecraft Engineering

Description

Engineering an object means engineering the process that creates the object. Today, software can make the task of tracking these processes robust and straightforward. When engineering requirements are strict and strenuous, software custom-built for such processes can prove essential. The work for this project was developing ICDB, an inventory control and build management system created for spacecraft engineers at ASU to record each step of their engineering processes. In-house development means ICDB is more precisely designed around its users' functionality and cost requirements than most off-the-shelf commercial offerings. By placing a complex relational database behind an intuitive web application, ICDB enables organizations and their users to create and store parts libraries, assembly designs, purchasing and location records for inventory items, and more.
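The relational structure behind a parts library and assembly design can be sketched as a parts table plus a bill-of-materials table. The table and column names here are hypothetical illustrations, not ICDB's actual schema.

```python
# Hypothetical sketch of an inventory/build relational structure:
# a parts library and an assembly (bill-of-materials) table.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE parts (part_no TEXT PRIMARY KEY, description TEXT)")
db.execute("""CREATE TABLE assemblies (
    assembly TEXT, part_no TEXT, qty INTEGER,
    FOREIGN KEY (part_no) REFERENCES parts(part_no))""")
db.executemany("INSERT INTO parts VALUES (?, ?)",
               [("R-100", "resistor"), ("C-200", "capacitor")])
db.executemany("INSERT INTO assemblies VALUES (?, ?, ?)",
               [("power-board", "R-100", 4), ("power-board", "C-200", 2)])

# Expand an assembly into its parts list, as a build-management view might.
bom = db.execute("""
    SELECT p.part_no, p.description, a.qty
    FROM assemblies a JOIN parts p ON p.part_no = a.part_no
    WHERE a.assembly = ? ORDER BY p.part_no""", ("power-board",)).fetchall()
print(bom)  # → [('C-200', 'capacitor', 2), ('R-100', 'resistor', 4)]
```

The web application layer would present views like this expansion while the database enforces the part references.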

Date Created
  • 2018-05

Asset Forfeitures in Arizona Law Enforcement: A Record of How They Seize and Spend

Description

This creative project is the first draft of a database of financial records from Arizona law enforcement's use of the state asset forfeiture program from fiscal 2011-2015. Asset forfeiture is a program by which law enforcement can seize property suspected to have been used in a crime and can then use the property, cash, or proceeds from the property's auction for its own purposes, raising questions of conflicts of interest. The paper explains the methodology and goals for the database, while the database itself represents more than 11,000 pages of financial records and is more than 70,300 cells large.

Date Created
  • 2016-05

Media of Two Worlds: The Influence That Media Has On Its Viewers

Description

My thesis is about media in both Italy and the United States, and how they evolved into the media we consume today. It draws on my Journalism and Communication major as well as my Italian minor. I have incorporated both areas of my studies in my thesis: the differences in how two different worlds cover and relay media to their viewers, the way in which media influences children, and how advancements such as social media affect journalism in today's society. Through my research, I was able to show that media exists all around the world, but the way it is relayed to its public changes and influences its audience. I conducted my research via peer-reviewed articles, journals, and accredited academic works, as well as personal and anonymous surveys. I used my interviews and surveys to build on the articles I found and reach a firm, strong conclusion. The sources used in my thesis were professionals who currently work, or have worked, with a credible and well-known news outlet. I also gathered information from elementary, middle, high-school, and college students. Having a variety of ages helped me gauge the influence media has on its consumers so that I could draw an accurate conclusion.

Date Created
  • 2014-05

Index-Based Similarity Joins

Description

Similarity Joins are among the most useful and powerful data processing techniques. They retrieve all pairs of data points between different data sets that are considered similar within a certain threshold. This operation is useful in many applications, such as record linkage and data cleaning. While many techniques to perform Similarity Joins have been proposed, one of the most useful methods is the use of indexing structures. After spending pre-processing time to construct an index over a given dataset, the index structure allows queries over that dataset to be performed significantly faster. Thus, if a dataset will have multiple Similarity Join queries performed over it, it can be beneficial to use index-based techniques for that dataset. We present an extension to a previously proposed index structure, the eD-Index, which provides support for Similarity Join operators. We evaluate the performance of the algorithms and investigate the configuration of parameters that maximizes the performance of the indexing structures. We also propose an algorithm for Multi-Way Similarity Joins using this index, which allows Similarity Join queries over more than two data sets at a time.
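The pay-off of index-based Similarity Joins can be shown with a minimal 1-D sketch: sorting one dataset acts as the index, so each probe scans only the points within the threshold instead of the whole set. (The eD-Index in the thesis targets metric spaces such as edit distance; this numeric version is only an illustration of the idea.)

```python
# Minimal index-based similarity join on 1-D numeric data: a sorted
# array plays the role of the index, bisect bounds the scan per probe.
import bisect

def similarity_join(r, s, eps):
    """Return all pairs (a, b) with a in r, b in s, and |a - b| <= eps."""
    index = sorted(s)                       # one-time preprocessing cost
    pairs = []
    for a in r:
        lo = bisect.bisect_left(index, a - eps)
        hi = bisect.bisect_right(index, a + eps)
        pairs.extend((a, b) for b in index[lo:hi])
    return pairs

print(similarity_join([1.0, 5.0], [1.2, 4.0, 9.0], eps=0.5))
# → [(1.0, 1.2)]
```

Each probe costs O(log n) plus the size of its result, versus O(n) for a full scan, which is why the preprocessing pays off once a dataset serves multiple join queries.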

Date Created
  • 2014-05

Client-driven dynamic database updates

Description

This thesis addresses the problem of online schema updates, where the goal is to update relational database schemas without reducing the database system's availability. Unlike some other work in this area, this thesis presents an approach that is completely client-driven and does not require a specialized database management system (DBMS). Also, unlike other client-driven work, this approach supports a richer set of schema updates, including vertical split (normalization), horizontal split, vertical and horizontal merge (union), difference, and intersection. The update process automatically generates a runtime update client from a mapping between the old and the new schemas. The solution has been validated by testing it on a relatively small database of around 300,000 records per table and less than 1 GB in size, but with a limited memory buffer size of 24 MB. This thesis presents a study of the overhead of the update process as a function of the transaction rate and the batch size used to copy data from the old schema to the new. It shows that the overhead introduced is minimal for medium-size applications and that the update can be achieved with no more than one minute of downtime.
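The batched copy at the heart of such an update can be sketched as follows: rows move from the old schema to the new one in small batches, each in its own short transaction, so regular workload can interleave with the migration. The table names, the vertical split chosen, and the batch mechanics are illustrative assumptions, not the thesis's actual update client.

```python
# Hedged sketch of a client-driven, batched schema migration:
# vertical split (normalization) of old_t into names + cities.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE old_t (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany("INSERT INTO old_t VALUES (?, ?, ?)",
                 [(i, f"user{i}", "Tempe") for i in range(10)])
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE cities (id INTEGER PRIMARY KEY, city TEXT)")

BATCH = 4
last_id = -1
while True:
    rows = conn.execute(
        "SELECT id, name, city FROM old_t WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, BATCH)).fetchall()
    if not rows:
        break
    with conn:  # each batch commits as its own short transaction
        conn.executemany("INSERT INTO names VALUES (?, ?)",
                         [(i, n) for i, n, _ in rows])
        conn.executemany("INSERT INTO cities VALUES (?, ?)",
                         [(i, c) for i, _, c in rows])
    last_id = rows[-1][0]

print(conn.execute("SELECT COUNT(*) FROM names").fetchone()[0])  # → 10
```

Keeping batches small bounds both the lock time per transaction and the memory needed by the update client, which is the trade-off the thesis's overhead study measures.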

Date Created
  • 2011

TensorDB and tensor-relational model (TRM) for efficient tensor-relational operations

Description

Multidimensional data have various representations. Thanks to their simplicity in modeling multidimensional data and the availability of various mathematical tools (such as tensor decompositions) that support multi-aspect analysis of such data, tensors are increasingly being used in many application domains, including scientific data management, sensor data management, and social network data analysis. The relational model, on the other hand, enables semantic manipulation of data using relational operators, such as projection, selection, Cartesian product, and set operators. For many multidimensional data applications, tensor operations as well as relational operations need to be supported throughout the data life cycle. In this thesis, we introduce a tensor-based relational data model (TRM), which enables both tensor-based data analysis and relational manipulation of multidimensional data, and we define tensor-relational operations on this model. We then introduce a tensor-relational data management system, called TensorDB. TensorDB is based on TRM and brings together relational algebraic operations (for data manipulation and integration) and tensor algebraic operations (for data analysis). We develop optimization strategies for tensor-relational operations in both in-memory and in-database TensorDB. The goal of TRM and TensorDB is to serve as a single environment that supports the entire life cycle of data; that is, data can be manipulated, integrated, processed, and analyzed.
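A toy version of mixing the two operator families can be sketched over a sparse tensor stored as a coordinate map: a relational selection on one mode followed by a tensor marginalization (summing out a mode). The `{(user, item, time): value}` representation and the operator definitions are illustrative, not TensorDB's internals.

```python
# Toy tensor-relational sketch: relational selection plus summing out
# a mode, over a sparse tensor stored as {(user, item, time): value}.
tensor = {
    ("alice", "itemA", 1): 2.0,
    ("alice", "itemB", 1): 1.0,
    ("bob",   "itemA", 2): 3.0,
}

def select(t, mode, value):
    """Relational selection: keep entries whose coordinate on `mode` matches."""
    return {k: v for k, v in t.items() if k[mode] == value}

def sum_out(t, mode):
    """Tensor marginalization: sum values over one mode, dropping it."""
    out = {}
    for k, v in t.items():
        nk = k[:mode] + k[mode + 1:]
        out[nk] = out.get(nk, 0.0) + v
    return out

alice = select(tensor, 0, "alice")   # selection on the user mode
print(sum_out(alice, 2))             # time mode summed away
# → {('alice', 'itemA'): 2.0, ('alice', 'itemB'): 1.0}
```

A system like TensorDB can reorder such operators (e.g., pushing the selection below the tensor operation, as above) to shrink the tensor before the expensive analysis step.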

Date Created
  • 2014

Unsupervised Bayesian data cleaning techniques for structured data

Description

Recent efforts in data cleaning have focused mostly on problems like data deduplication, record matching, and data standardization; few of these focus on fixing incorrect attribute values in tuples. Correcting values in tuples is typically performed by a minimum cost repair of tuples that violate static constraints like CFDs (which have to be provided by domain experts, or learned from a clean sample of the database). In this thesis, I provide a method for correcting individual attribute values in a structured database using a Bayesian generative model and a statistical error model learned from the noisy database directly. I thus avoid the necessity for a domain expert or master data. I also show how to efficiently perform consistent query answering using this model over a dirty database, in case write permissions to the database are unavailable. A Map-Reduce architecture to perform this computation in a distributed manner is also shown. I evaluate these methods over both synthetic and real data.
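The Bayesian repair idea above can be illustrated with a toy: for a possibly dirty value, pick the candidate maximizing P(candidate) × P(observed | candidate), with the prior estimated from the noisy column itself. The edit-distance error model and the `noise` parameter below are assumptions for illustration, not the thesis's learned error model.

```python
# Toy Bayesian attribute repair: prior from the noisy column itself,
# with a simple edit-distance-based likelihood as the error model.
from collections import Counter

def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (ca != cb))
    return dp[-1]

def repair(observed, column_values, noise=0.2):
    counts = Counter(column_values)
    total = sum(counts.values())
    def score(cand):
        prior = counts[cand] / total            # learned from noisy data
        likelihood = noise ** edit_distance(observed, cand)
        return prior * likelihood
    return max(counts, key=score)

cities = ["Phoenix"] * 8 + ["Tempe"] * 5 + ["Phenix"]  # noisy column
print(repair("Phenix", cities))  # → Phoenix
```

Even though "Phenix" matches itself exactly, its low prior is outweighed by the frequent "Phoenix" one edit away, which is the core of correcting values without a domain expert or master data.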

Date Created
  • 2014