Matching Items (2)
Description
Linked Data, the practice of publishing and linking structured data on the web, is gaining widespread popularity and importance. Its emergence has made it possible to make sense of large volumes of data scattered across the web and to link multiple heterogeneous sources. This leads to the challenge of maintaining the quality of Linked Data, i.e., ensuring that outdated data is removed and new data is included. The focus of this thesis is devising strategies to effectively integrate data from multiple sources, publish it as Linked Data, and maintain its quality. The domain used in the study is online education. With so many courses offered by Massive Open Online Course (MOOC) providers, it is becoming increasingly difficult for end users to gauge which course best fits their needs.

Users are spoilt for choice. It would help them choose if there were a single place where they could visually compare the offerings of various MOOC providers for the course they are interested in. Previous work in this area was done through the MOOCLink project, which integrated data from Coursera, edX, and Udacity and generated linked data, i.e., Resource Description Framework (RDF) triples.

The research objective of this thesis is to determine a methodology by which the quality of data available through the MOOCLink application is maintained, as new courses are constantly being added and old courses removed by data providers. This thesis presents the integration of data from various MOOC providers and algorithms for incrementally updating linked data to maintain its quality, and compares them against a naïve approach, so that users are kept engaged with up-to-date data. A master threshold value, determined through experiments and analysis, quantifies when one algorithm outperforms the other in terms of time efficiency. An evaluation of the tool shows the effectiveness of the algorithms presented in this thesis.
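The contrast between an incremental update and a naïve full reload can be illustrated with a minimal sketch. This is not the thesis's algorithm, which the abstract does not specify; it assumes triples are held in an in-memory set, and the `ex:` identifiers, course titles, and function names are all hypothetical.

```python
from typing import Set, Tuple

# A triple is (subject, predicate, object); plain strings stand in for RDF terms.
Triple = Tuple[str, str, str]


def diff_triples(old: Set[Triple], new: Set[Triple]) -> Tuple[Set[Triple], Set[Triple]]:
    """Return (to_add, to_remove) such that applying both turns old into new."""
    return new - old, old - new


def naive_update(store: Set[Triple], new: Set[Triple]) -> int:
    """Naive approach: discard everything and reload. Returns triples written."""
    store.clear()
    store.update(new)
    return len(new)


def incremental_update(store: Set[Triple], new: Set[Triple]) -> int:
    """Apply only the differences. Returns the number of triples touched."""
    to_add, to_remove = diff_triples(store, new)
    store.difference_update(to_remove)
    store.update(to_add)
    return len(to_add) + len(to_remove)


# Hypothetical snapshots of a provider's catalog (invented for illustration).
old_snapshot: Set[Triple] = {
    ("ex:course1", "ex:title", "Intro to RDF"),
    ("ex:course2", "ex:title", "Retired Course"),
}
new_snapshot: Set[Triple] = {
    ("ex:course1", "ex:title", "Intro to RDF"),
    ("ex:course3", "ex:title", "New Course"),
}

store = set(old_snapshot)
touched = incremental_update(store, new_snapshot)
```

In this toy setting the incremental path touches only the triples that changed, whereas the naïve path rewrites the whole snapshot; which is faster in practice depends on how much of the data changes between snapshots.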
Contributors: Dhekne, Chinmay (Author) / Bansal, Srividya (Thesis advisor) / Bansal, Ajay (Committee member) / Sohoni, Sohum (Committee member) / Arizona State University (Publisher)
Created: 2016
Description
While predicting completion in Massive Open Online Courses (MOOCs) has been an active area of research in recent years, predicting completion in self-paced MOOCs, the fastest growing segment of open online courses, has largely been ignored. Using learning analytics and educational data mining techniques, this study examined data generated by over 4,600 individuals working in a self-paced, open-enrollment college algebra MOOC over a period of eight months.

Although just 4% of these students completed the course, models were developed that could correctly predict, nearly 80% of the time, which students would complete the course and which would not, based on each student’s first day of work in the online course. Logistic regression was the primary tool used to predict completion; the analysis focused on variables associated with self-regulated learning (SRL) and on demographic variables available from survey information gathered as students begin courses on edX, the MOOC platform employed.

The strongest SRL predictor was the amount of time students spent in the course on their first day. The number of math skills obtained on the first day and the pace at which these skills were gained were also predictors, although pace was negatively correlated with completion. Prediction models using only SRL data obtained on the first day in the course correctly predicted course completion 70% of the time, whereas models based on first-day SRL and demographic data made correct predictions 79% of the time.
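For readers unfamiliar with the technique, the following is a minimal sketch of logistic regression fitted by batch gradient descent on a single predictor, first-day time in the course. It is not the study's model: the data points are invented purely for illustration, and the function names, learning rate, and epoch count are assumptions.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Logistic function mapping a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w: Sequence[float], x: Sequence[float]) -> float:
    """Probability of completion; w[0] is the bias, w[1:] the feature weights."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X: List[List[float]], y: List[int],
                 lr: float = 0.1, epochs: int = 2000) -> List[float]:
    """Fit weights by batch gradient descent on the mean log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


# Invented toy data: hours spent on the first day, and whether the
# (hypothetical) student later completed the course (1) or not (0).
hours_first_day = [[0.1], [0.2], [0.3], [0.5], [1.5], [2.0], [2.5], [3.0]]
completed = [0, 0, 0, 0, 1, 1, 1, 1]

weights = fit_logistic(hours_first_day, completed)
p_low = predict_proba(weights, [0.1])   # little time on day one
p_high = predict_proba(weights, [3.0])  # a lot of time on day one
```

On this toy data the fitted weight on first-day time is positive, so the predicted probability rises with time spent; additional SRL or demographic predictors would simply be extra columns in each feature row.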
Contributors: Cunningham, James Allan (Author) / Bitter, Gary (Thesis advisor) / Barber, Rebecca (Committee member) / Douglas, Ian (Committee member) / Arizona State University (Publisher)
Created: 2017