Linked Data, the method of publishing and linking structured data on the web, is gaining widespread popularity and importance. Its emergence has made it possible to make sense of the huge volume of data scattered across the web and to link multiple heterogeneous sources. This in turn raises the challenge of maintaining the quality of Linked Data, i.e., ensuring that outdated data is removed and new data is included. The focus of this thesis is devising strategies to effectively integrate data from multiple sources, publish it as Linked Data, and maintain its quality. The domain used in the study is online education. With so many courses offered by Massive Open Online Course (MOOC) providers, it is becoming increasingly difficult for end users to gauge which course best fits their needs.
Users are spoilt for choice. Choosing would be much easier if there were a single place where they could visually compare the offerings of various MOOC providers for the course they are interested in. Previous work in this area was done through the MOOCLink project, which integrated data from Coursera, edX, and Udacity and generated Linked Data, i.e., Resource Description Framework (RDF) triples.
The research objective of this thesis is to determine a methodology for maintaining the quality of the data available through the MOOCLink application, as data providers constantly add new courses and remove old ones. This thesis presents the integration of data from various MOOC providers and algorithms for incrementally updating Linked Data to maintain its quality, and compares these algorithms against a naïve approach, with the aim of keeping users engaged with up-to-date data. A master threshold value, determined through experiments and analysis, quantifies when one algorithm is better than the other in terms of time efficiency. An evaluation of the tool shows the effectiveness of the algorithms presented in this thesis.
- MOOCLink: Linking and maintaining quality of data provided by various MOOC providers