While predicting completion in Massive Open Online Courses (MOOCs) has been an active area of research in recent years, predicting completion in self-paced MOOCs, the fastest-growing segment of open online courses, has largely been ignored. Using learning analytics and educational data mining techniques, this study examined data generated by over 4,600 individuals working in a self-paced, open-enrollment college algebra MOOC over a period of eight months.
Although just 4% of these students completed the course, models were developed that correctly predicted, nearly 80% of the time, which students would complete the course and which would not, based on each student’s first day of work in the online course. Logistic regression was the primary tool used to predict completion. The models focused on variables associated with self-regulated learning (SRL) and on demographic variables drawn from survey information gathered as students begin courses on edX, the MOOC platform employed.
The strongest SRL predictor was the amount of time students spent in the course on their first day. The number of math skills acquired on the first day and the pace at which these skills were gained were also predictors, although pace was negatively correlated with completion. Prediction models using only SRL data from the first day in the course correctly predicted course completion 70% of the time, whereas models based on first-day SRL and demographic data made correct predictions 79% of the time.
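As a rough illustration of the kind of model described above, the following sketch fits a logistic regression to three first-day features using plain gradient descent. It is not the study's data or code. The feature names, the standardized synthetic values, and the generating coefficients are all assumptions made for this example. Only the signs of the coefficients (time and skills positive, pace negative) are chosen to mirror the directions reported in the text. The synthetic completion rate is also much higher than the 4% observed in the study.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    """Predicted probability of completion for one feature vector."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit weights and intercept by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# SYNTHETIC data, invented for illustration only (not from the study).
# Hypothetical standardized features, in order:
#   [time_first_day, skills_first_day, pace_first_day]
rng = random.Random(0)
TRUE_W, TRUE_B = [1.5, 0.8, -0.5], -1.0  # assumed generating coefficients

X, y = [], []
for _ in range(400):
    x = [rng.gauss(0.0, 1.0) for _ in range(3)]
    X.append(x)
    y.append(1 if rng.random() < predict_proba(TRUE_W, TRUE_B, x) else 0)

w, b = fit_logistic(X, y)
probs = [predict_proba(w, b, x) for x in X]

# Classify as "will complete" when the predicted probability is at least 0.5.
accuracy = sum((p >= 0.5) == (yi == 1) for p, yi in zip(probs, y)) / len(y)

print("weights:", [round(v, 2) for v in w], "intercept:", round(b, 2))
print("in-sample accuracy:", round(accuracy, 2))
```

The accuracy printed here is measured on the same synthetic records used for fitting, so it says nothing about the 70% and 79% figures reported above. It only shows how a fitted model turns first-day features into a completion prediction.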