Matching Items (3)

Filtering by

Clear all filters

154834-Thumbnail Image.png

A semantic framework for integrating and publishing linked data on the Web

Description

Semantic web is the web of data that provides a common framework and technologies for sharing and reusing data in various applications. In semantic web terminology, linked data is the term used to describe a method of exposing and connecting

Semantic web is the web of data that provides a common framework and technologies for sharing and reusing data in various applications. In semantic web terminology, linked data is the term used to describe a method of exposing and connecting data on the web from different sources. The purpose of linked data and semantic web is to publish data in an open and standard format and to link this data with existing data on the Linked Open Data Cloud. The goal of this thesis to come up with a semantic framework for integrating and publishing linked data on the web. Traditionally integrating data from multiple sources usually involves an Extract-Transform-Load (ETL) framework to generate datasets for analytics and visualization. The thesis proposes introducing a semantic component in the ETL framework to semi-automate the generation and publishing of linked data. In this thesis, various existing ETL tools and data integration techniques have been analyzed and deficiencies have been identified. This thesis proposes a set of requirements for the semantic ETL framework by conducting a manual process to integrate data from various sources such as weather, holidays, airports, flight arrival, departure and delays. The research questions that are addressed are: (i) to what extent can the integration, generation, and publishing of linked data to the cloud using a semantic ETL framework be automated; (ii) does use of semantic technologies produce a richer data model and integrated data. Details of the methodology, data collection, and application that uses the linked data generated are presented. Evaluation is done by comparing traditional data integration approach with semantic ETL approach in terms of effort involved in integration, data model generated and querying the data generated.

Contributors

Agent

Created

Date Created
2016

156331-Thumbnail Image.png

Graph Search as a Feature in Imperative/Procedural Programming Languages

Description

Graph theory is a critical component of computer science and software engineering, with algorithms concerning graph traversal and comprehension powering much of the largest problems in both industry and research. Engineers and researchers often have an accurate view of their

Graph theory is a critical component of computer science and software engineering, with algorithms concerning graph traversal and comprehension powering much of the largest problems in both industry and research. Engineers and researchers often have an accurate view of their target graph, however they struggle to implement a correct, and efficient, search over that graph.

To facilitate rapid, correct, efficient, and intuitive development of graph based solutions we propose a new programming language construct - the search statement. Given a supra-root node, a procedure which determines the children of a given parent node, and optional definitions of the fail-fast acceptance or rejection of a solution, the search statement can conduct a search over any graph or network. Structurally, this statement is modelled after the common switch statement and is put into a largely imperative/procedural context to allow for immediate and intuitive development by most programmers. The Go programming language has been used as a foundation and proof-of-concept of the search statement. A Go compiler is provided which implements this construct.

Contributors

Agent

Created

Date Created
2018

154800-Thumbnail Image.png

MOOCLink: linking and maintaining qulity of data provided by various MOOC providers

Description

The concept of Linked Data is gaining widespread popularity and importance. The method of publishing and linking structured data on the web is called Linked Data. Emergence of Linked Data has made it possible to make sense of huge data,

The concept of Linked Data is gaining widespread popularity and importance. The method of publishing and linking structured data on the web is called Linked Data. Emergence of Linked Data has made it possible to make sense of huge data, which is scattered all over the web, and link multiple heterogeneous sources. This leads to the challenge of maintaining the quality of Linked Data, i.e., ensuring outdated data is removed and new data is included. The focus of this thesis is devising strategies to effectively integrate data from multiple sources, publish it as Linked Data, and maintain the quality of Linked Data. The domain used in the study is online education. With so many online courses offered by Massive Open Online Courses (MOOC), it is becoming increasingly difficult for an end user to gauge which course best fits his/her needs.

Users are spoilt for choices. It would be very helpful for them to make a choice if there is a single place where they can visually compare the offerings of various MOOC providers for the course they are interested in. Previous work has been done in this area through the MOOCLink project that involved integrating data from Coursera, EdX, and Udacity and generation of linked data, i.e. Resource Description Framework (RDF) triples.

The research objective of this thesis is to determine a methodology by which the quality

of data available through the MOOCLink application is maintained, as there are lots of new courses being constantly added and old courses being removed by data providers. This thesis presents the integration of data from various MOOC providers and algorithms for incrementally updating linked data to maintain their quality and compare it against a naïve approach in order to constantly keep the users engaged with up-to-date data. A master threshold value was determined through experiments and analysis that quantifies one algorithm being better than the other in terms of time efficiency. An evaluation of the tool shows the effectiveness of the algorithms presented in this thesis.

Contributors

Agent

Created

Date Created
2016