Matching Items (343)

Description
One persisting problem in Massive Open Online Courses (MOOCs) is student dropout. Predicting student dropout from MOOC courses can identify the factors responsible for it, and it can inform intervention before dropout occurs to increase student success. Different approaches and various features are available for predicting student dropout in MOOC courses. This research considers data derived from the self-paced math course ‘College Algebra and Problem Solving’, offered by Arizona State University (ASU) on the MOOC platform Open edX from 2016 to 2020. The research aims to predict student dropout from a MOOC course given a set of features engineered from students’ daily learning. The Machine Learning (ML) model used is Random Forest (RF), and it is evaluated using validation metrics such as accuracy, precision, recall, F1-score, and the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC). The average rate of student learning progress was found to have more impact than the other features. The model can predict the dropout or continuation of a student on any given day of the MOOC course with an accuracy of 87.5%, AUC of 94.5%, precision of 88%, recall of 87.5%, and F1-score of 87.5%. The contributing features and their interactions were explained using Shapley values. The features engineered in this research are predictive of student dropout and could be applied to similar courses. The model can also inform interventions at critical times to help students succeed in the course.
Contributors: Dominic Ravichandran, Sheran Dass (Author) / Gary, Kevin (Thesis advisor) / Bansal, Ajay (Committee member) / Cunningham, James (Committee member) / Sannier, Adrian (Committee member) / Arizona State University (Publisher)
Created: 2021
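
Illustrative only: a minimal sketch of the random-forest training and evaluation loop this abstract describes, using scikit-learn with synthetic stand-in features (the actual per-day engineered features and Open edX data are not reproduced here); the Shapley-value step is indicated in comments and assumes the `shap` package.

```python
# Sketch of the RF dropout pipeline with synthetic placeholder data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))   # stand-ins for daily engineered learning features
y = (X[:, 0] + rng.normal(size=1000) > 0).astype(int)  # synthetic dropout labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

pred = clf.predict(X_te)
prob = clf.predict_proba(X_te)[:, 1]
print("accuracy :", accuracy_score(y_te, pred))
print("precision:", precision_score(y_te, pred))
print("recall   :", recall_score(y_te, pred))
print("F1       :", f1_score(y_te, pred))
print("AUC      :", roc_auc_score(y_te, prob))

# Shapley-value explanations (requires the `shap` package):
# import shap
# explainer = shap.TreeExplainer(clf)
# shap_values = explainer.shap_values(X_te)
```
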
Description
Matching or stratification is commonly used in observational studies to remove bias due to confounding variables. Analyzing matched data sets requires specific methods that handle dependency among observations within a stratum. Also, modern studies often include hundreds or thousands of variables. Traditional methods for matched data sets are challenged by high-dimensional settings, mixed-type variables (numerical and categorical), and nonlinear and interaction effects. Furthermore, machine learning research for such structured data is quite limited. This dissertation addresses this important gap and proposes machine learning models for identifying informative variables from high-dimensional matched data sets. The first part of this dissertation proposes a machine learning model to identify informative variables from high-dimensional matched case-control data sets. The outcome of interest in this study design is binary (case or control), and each stratum is assumed to have one unit from each outcome level. The proposed method, referred to as Matched Forest (MF), is effective for large numbers of variables and for identifying interaction effects. The second part of this dissertation proposes three enhancements of the MF algorithm. First, a regularization framework is proposed to improve variable-selection performance in excessively high-dimensional settings. Second, a classification method is proposed to classify unlabeled pairs of data. Third, two metrics are proposed to estimate the effects of important variables identified by MF. The third part proposes a machine learning model based on neural networks to identify important variables from a more general matched case-control data set in which each stratum has one unit from the case outcome level and more than one unit from the control outcome level. This method, referred to as Matched Neural Network (MNN), performs better than current algorithms at identifying variables with interaction effects. Lastly, a generalized machine learning model is proposed to identify informative variables from high-dimensional matched data sets where the outcome has more than two levels. This method outperforms existing algorithms in the literature in identifying variables with complex nonlinear and interaction effects.
Contributors: Shomal Zadeh, Nooshin (Author) / Runger, George (Thesis advisor) / Montgomery, Douglas (Committee member) / Shinde, Shilpa (Committee member) / Escobedo, Adolfo (Committee member) / Arizona State University (Publisher)
Created: 2021
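
As a hedged illustration of the 1:1 matched case-control structure (not the authors' Matched Forest algorithm), one simple way to respect the pairing is to difference covariates within each stratum before applying a standard variable-importance method; the data below are synthetic.

```python
# Toy matched-pair setup: each stratum has one case and one control.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n_strata, p = 500, 20
cases = rng.normal(size=(n_strata, p)); cases[:, 3] += 1.0  # variable 3 is informative
controls = rng.normal(size=(n_strata, p))

# One differenced row per stratum; randomize the sign so the label is non-trivial.
sign = rng.choice([-1, 1], size=n_strata)
X = sign[:, None] * (cases - controls)
y = (sign > 0).astype(int)   # 1 if the case came first in the difference

rf = RandomForestClassifier(n_estimators=300, random_state=1).fit(X, y)
print(np.argsort(rf.feature_importances_)[::-1][:5])  # variable 3 should rank high
```
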
Description
Machine learning (ML) and deep learning (DL) have become an intrinsic part of multiple fields. The ability to solve complex problems makes machine learning a panacea. In the last few years, there has been an explosion of data generation, which has greatly improved machine learning models. But this comes at the cost of heavy computation, which invariably increases power usage and hardware cost. In this thesis we explore applications of ML techniques in two completely different fields: arts, media, and theater on the one hand, and urban climate research on the other, both using low-cost, low-powered edge devices. The multi-modal chatbot uses different machine learning techniques, natural language processing (NLP) and computer vision (CV), to understand the user's inputs and accordingly perform in the play and interact with the audience. The system is also equipped with other interactive hardware, such as movable LED systems; together they provide an experiential theatrical play tailored to each user. I will discuss how I used edge devices to achieve this AI system, which has created a new genre of theatrical play. I will then discuss MaRTiny, an AI-based bio-meteorological system that calculates mean radiant temperature (MRT), an important parameter for urban climate research. It is also equipped with a vision system that performs machine learning tasks such as pedestrian and shade detection. The entire system costs around $200 and can potentially replace an existing setup worth $20,000. I will further discuss how I overcame the inaccuracies in MRT values caused by the system, using machine learning methods. Although these projects belong to two very different fields, both are implemented on edge devices and use similar ML techniques. In this thesis I will detail the techniques shared between these two projects and how they can be used in several other applications on edge devices.
Contributors: Kulkarni, Karthik Kashinath (Author) / Jayasuriya, Suren (Thesis advisor) / Middel, Ariane (Thesis advisor) / Yu, Hongbin (Committee member) / Arizona State University (Publisher)
Created: 2021
Description
Imitation learning is a promising methodology for teaching robots how to physically interact and collaborate with human partners. However, successful interaction requires complex coordination in time and space, i.e., knowing what to do as well as when to do it. This dissertation introduces Bayesian Interaction Primitives, a probabilistic imitation learning framework which establishes a conceptual and theoretical relationship between human-robot interaction (HRI) and simultaneous localization and mapping. In particular, it is established that HRI can be viewed through the lens of recursive filtering in time and space. In turn, this relationship allows one to leverage techniques from an existing, mature field and develop a powerful new formulation which enables multimodal spatiotemporal inference in collaborative settings involving two or more agents. Through the development of exact and approximate variations of this method, it is shown in this work that it is possible to learn complex real-world interactions in a wide variety of settings, including tasks such as handshaking, cooperative manipulation, catching, hugging, and more.
Contributors: Campbell, Joseph (Author) / Ben Amor, Heni (Thesis advisor) / Fainekos, Georgios (Thesis advisor) / Yamane, Katsu (Committee member) / Kambhampati, Subbarao (Committee member) / Arizona State University (Publisher)
Created: 2021
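
For readers unfamiliar with the recursive-filtering lens invoked above, a one-dimensional Kalman-style update is its simplest instance: a prediction is fused with each incoming observation. This toy sketch is only an analogy, not the dissertation's Bayesian Interaction Primitives formulation.

```python
# Minimal 1-D recursive filter: predict, then correct with each observation.
def kalman_step(mean, var, z, process_var=0.1, obs_var=0.5):
    # Predict: the latent state drifts, accumulating process noise.
    mean_pred, var_pred = mean, var + process_var
    # Update: weigh the observation z by the Kalman gain.
    gain = var_pred / (var_pred + obs_var)
    return mean_pred + gain * (z - mean_pred), (1 - gain) * var_pred

mean, var = 0.0, 1.0
for z in [0.9, 1.1, 1.0, 0.95]:   # e.g., noisy observations of a partner's motion
    mean, var = kalman_step(mean, var, z)
print(mean, var)
```
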
Description
Many real-world engineering problems require simulations to evaluate design objectives and constraints. Often, due to the complexity of the system model, simulations are prohibitive in terms of computation time. One approach to overcome this issue is to construct a surrogate model, which approximates the original model. The focus of this work is on data-driven surrogate models, in which empirical approximations of the output are made given the input parameters. Recently, neural networks (NNs) have re-emerged as a popular method for constructing data-driven surrogate models. Although NNs achieve excellent accuracy and are widely used, they pose their own challenges. This work addresses two common ones, the need for: (1) hardware acceleration and (2) uncertainty quantification (UQ) in the presence of input variability. The high demand for inference with deep NNs on cloud servers and edge devices calls for the design of low-power custom hardware accelerators. The first part of this work describes the design of an energy-efficient long short-term memory (LSTM) accelerator. The overarching goal is to aggressively reduce the power consumption and area of the LSTM components using approximate computing, and then to use architectural-level techniques to boost performance. The proposed design is synthesized and placed and routed as an application-specific integrated circuit (ASIC). The results demonstrate that this accelerator is 1.2X more energy-efficient and 3.6X more area-efficient than the baseline LSTM. In the second part of this work, a robust framework is developed based on an alternative data-driven surrogate model, the polynomial chaos expansion (PCE), to address UQ. In contrast to many existing approaches, no assumptions are made on the elements of the function space, and UQ is a function of the expansion coefficients. Moreover, the sensitivity of the output with respect to any subset of the input variables can be computed analytically by post-processing the PCE coefficients. This provides a systematic and incremental method for pruning or changing the order of the model. The framework is evaluated on several real-world applications from different domains and is extended to classification tasks as well.
Contributors: Azari, Elham (Author) / Vrudhula, Sarma (Thesis advisor) / Fainekos, Georgios (Committee member) / Ren, Fengbo (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created: 2021
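
A minimal sketch of the PCE idea described above, assuming two independent standard-normal inputs and probabilists' Hermite polynomials up to total degree 2; the `model` function is a hypothetical stand-in for an expensive simulation. It shows how Sobol-style sensitivities fall out of the fitted coefficients analytically, as the abstract notes.

```python
# Least-squares PCE surrogate with analytic variance decomposition.
import numpy as np
from math import factorial

def He(n, x):          # probabilists' Hermite: He0=1, He1=x, He2=x^2-1
    return {0: np.ones_like(x), 1: x, 2: x**2 - 1}[n]

def model(x1, x2):     # hypothetical stand-in for the expensive simulation
    return 1.0 + 2.0 * x1 + 0.5 * x2 + 0.3 * x1 * x2

rng = np.random.default_rng(2)
x = rng.normal(size=(400, 2))
y = model(x[:, 0], x[:, 1])

multi = [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]   # total degree <= 2
Phi = np.column_stack([He(a, x[:, 0]) * He(b, x[:, 1]) for a, b in multi])
c, *_ = np.linalg.lstsq(Phi, y, rcond=None)

# Each term's variance contribution is c^2 times the basis norm prod(n_i!).
norms = np.array([factorial(a) * factorial(b) for a, b in multi])
var_terms = c**2 * norms
total_var = var_terms[1:].sum()
S1 = var_terms[[1, 3]].sum() / total_var   # terms involving only x1
S2 = var_terms[[2, 5]].sum() / total_var   # terms involving only x2
print(S1, S2, var_terms[4] / total_var)    # first-order and interaction shares
```
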
Description
Data mining, also known as big data analysis, has been identified as a critical and challenging process for a variety of real-world applications. Numerous datasets are collected and generated every day to store information. The rise in data volume and data modality has increased the demand for data mining methods and strategies for finding anomalies, patterns, and correlations within large data sets to predict outcomes. Effective machine learning methods are widely adopted to build data mining pipelines for purposes such as business understanding, data understanding, data preparation, modeling, evaluation, and deployment. The major challenges to effectively and efficiently mining big data are (1) data heterogeneity and (2) missing data. Heterogeneity is a natural characteristic of big data, as the data is typically collected from different sources with diverse formats. Missing values are the most common issue in heterogeneous data analysis and result from a variety of factors, including the data collection process, user initiative, erroneous data entries, and so on. In response to these challenges, this thesis investigates three main research directions with application scenarios: (1) mining and formulating heterogeneous data, (2) missing-value imputation strategies for various application scenarios in both offline and online settings, and (3) missing-value imputation for multi-modality data. Multiple strategies with theoretical analysis are presented, and the effectiveness of the proposed algorithms is evaluated against state-of-the-art methods.
Contributors: Liu, Xu (Author) / He, Jingrui (Thesis advisor) / Xue, Guoliang (Thesis advisor) / Li, Baoxin (Committee member) / Tong, Hanghang (Committee member) / Arizona State University (Publisher)
Created: 2021
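
As context for the imputation theme (the thesis proposes its own algorithms, which are not reproduced here), a minimal sketch of one off-the-shelf strategy, k-nearest-neighbor imputation from scikit-learn:

```python
# Fill each missing entry from the most similar complete rows.
import numpy as np
from sklearn.impute import KNNImputer

X = np.array([[1.0, 2.0, np.nan],
              [3.0, np.nan, 6.0],
              [2.0, 3.0, 5.0],
              [np.nan, 4.0, 7.0]])

imputer = KNNImputer(n_neighbors=2)   # average each gap over the 2 nearest rows
print(imputer.fit_transform(X))
```
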
Description

Classification in machine learning is crucial to solving many of the problems the world faces today. Therefore, it is key to understand one's problem and develop an efficient model to achieve a solution. One technique for better model selection, and thus greater ease in problem solving, is estimation of the Bayes Error Rate. This paper develops and analyzes two methods for estimating the Bayes Error Rate of a given data set in order to evaluate performance. The first method takes a “global” approach, looking at the data as a whole, and the second is more “local,” partitioning the data at the outset and then building up to a Bayes Error estimate of the whole. One of the methods provides an accurate estimate of the true Bayes Error Rate when the data set is high-dimensional, while the other provides an accurate estimate at large sample sizes. The second conclusion, in particular, can have significant ramifications for “big data” problems, as one could characterize the distribution with an accurate estimate of the Bayes Error Rate by using this method.

Contributors: Lattus, Robert (Author) / Dasarathy, Gautam (Thesis director) / Berisha, Visar (Committee member) / Turaga, Pavan (Committee member) / Barrett, The Honors College (Contributor) / Electrical Engineering Program (Contributor)
Created: 2021-12
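
As background for the estimation task described above, a minimal sketch of a classical plug-in: the Cover-Hart result asymptotically brackets the two-class Bayes error R* by the 1-nearest-neighbor error (R_1NN/2 <= R* <= R_1NN). This textbook baseline is not one of the two estimators developed in the thesis; the data are synthetic.

```python
# Bracket the Bayes error with the cross-validated 1-NN error rate.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(3)
n = 2000
y = rng.integers(0, 2, size=n)
X = rng.normal(loc=y[:, None] * 1.5, size=(n, 2))  # two overlapping Gaussians

r_1nn = 1 - cross_val_score(KNeighborsClassifier(n_neighbors=1), X, y, cv=5).mean()
print(f"Bayes error is between {r_1nn / 2:.3f} and {r_1nn:.3f}")
```
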
Description

This project incorporates sentiment analysis into traditional stock analysis to enhance stock rating predictions by drawing on online opinion about various stocks. Headlines from eight major news publications and conversations from Yahoo! Finance’s “Conversations” feature were parsed with the Valence Aware Dictionary and sEntiment Reasoner (VADER) natural language processing package to determine numerical polarities representing positivity or negativity for a given stock ticker. These polarities were paired with stock metrics typically observed by stock analysts to form the feature set for a logistic regression machine learning model. The model was trained on roughly 1,500 major stocks to make a binary classification, a “Buy” or “Not Buy” rating, for each stock, and its results were inserted into the back end of the Agora Web UI, which emulates search-engine behavior specifically for stocks listed on the NYSE and NASDAQ. The model reported an accuracy of 82.5%, and for most major stocks its predictions correlated with stock analysts’ ratings. Given the volatility of the stock market and the propensity for hive-mind behavior in online forums, the logistic regression model's performance would benefit from incorporating historical stock data and more sources of opinion to balance any subjectivity in the model.

Contributors: Rao, Jayanth (Author) / Ramaraju, Venkat (Co-author) / Bansal, Ajay (Thesis director) / Smith, James (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor) / School of Mathematical and Statistical Sciences (Contributor)
Created: 2021-12
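
A minimal sketch of the described pipeline, assuming the `vaderSentiment` and scikit-learn packages; the headlines, metrics, and labels below are hypothetical placeholders, not the project's training data.

```python
# Headline polarity from VADER paired with stock metrics for a Buy/Not-Buy model.
import numpy as np
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
from sklearn.linear_model import LogisticRegression

analyzer = SentimentIntensityAnalyzer()
headlines = ["Company X beats earnings expectations",
             "Company Y faces regulatory probe"]
polarity = [analyzer.polarity_scores(h)["compound"] for h in headlines]  # in [-1, 1]

# Hypothetical per-ticker features: [sentiment, P/E ratio, 50-day momentum]
X = np.array([[polarity[0], 18.2, 0.05],
              [polarity[1], 32.7, -0.08]])
y = np.array([1, 0])                       # 1 = Buy, 0 = Not Buy (toy labels)

clf = LogisticRegression().fit(X, y)
print(clf.predict_proba(X)[:, 1])          # probability of a "Buy" rating
```
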
Description
The gas turbine engine for aircraft propulsion is one of the most physically complex and safety-critical systems in the world. Its failure diagnosis is challenging due to the complexity of the system model, the difficulty of practical testing, and the infeasibility of creating homogeneous diagnostic performance evaluation criteria across the diverse engine makes.

NASA has designed and published a standard benchmark problem for propulsion-engine gas-path diagnostics that enables comparison among different engine diagnostic approaches. Traditional model-based approaches and novel, purely data-driven approaches such as machine learning have been applied to this problem.

This study takes a different machine learning approach to the diagnostic problem. Some of the most common machine learning techniques, such as the support vector machine, multi-layer perceptron, and self-organizing map, are used to gain insight into the different engine failure modes from a big-data perspective. They are organically integrated to achieve good performance based on a sound understanding of the complex dataset.

The study presents a new hierarchical machine learning structure to enhance classification accuracy on NASA's engine diagnostic benchmark problem. The designed hierarchical structure produces an average diagnostic accuracy of 73.6%, which outperforms the most recently published comparable studies.
Contributors: Wu, Qiyu (Author) / Si, Jennie (Thesis advisor) / Wu, Teresa (Committee member) / Tsakalis, Konstantinos (Committee member) / Arizona State University (Publisher)
Created: 2015
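
A hedged sketch of the hierarchical idea described above: a coarse first-stage classifier routes each sample to a specialized second-stage model. The grouping, models, and data below are toy placeholders, not the study's actual engine-fault hierarchy.

```python
# Two-level hierarchical classifier: SVM routes, per-group MLPs decide.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(4)
X = rng.normal(size=(600, 8))
y = rng.integers(0, 4, size=600)           # 4 toy failure modes
group = y // 2                             # coarse grouping: modes {0,1} vs {2,3}

stage1 = SVC().fit(X, group)               # route to a failure-mode family
stage2 = {g: MLPClassifier(max_iter=500).fit(X[group == g], y[group == g])
          for g in (0, 1)}                 # one specialized model per family

def predict(x):
    g = stage1.predict(x)[0]               # coarse decision first
    return stage2[g].predict(x)[0]         # then the fine-grained model

print(predict(X[:1]))
```
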
Description
Machine learning is a powerful tool for processing and understanding the vast amounts of data produced by sensors every day. It has found use in a wide variety of fields, from making medical predictions through correlations invisible to the human eye to classifying images in computer vision applications. A wide range of machine learning algorithms has been developed to solve these problems, each with different accuracy, throughput, and energy efficiency. However, even after they are trained, these algorithms require substantial computation to make a prediction. General-purpose CPUs are not well optimized for this task, so other hardware solutions have developed over time, including GPUs, FPGAs, and ASICs.

This project considers FPGA implementations of multilayer perceptron (MLP) and convolutional neural network (CNN) feedforward inference. While FPGAs provide significant performance improvements, they come at substantial financial cost. We explore options for implementing these algorithms on a smaller budget. We successfully implement a multilayer perceptron that identifies handwritten digits from the MNIST dataset on a student-level DE10-Lite FPGA with a test accuracy of 91.99%. We also apply our trained network to external image data loaded through a webcam and a Raspberry Pi, but we observe lower test accuracy on these images. Finally, we consider the requirements for implementing a more elaborate convolutional neural network on the same FPGA. The study deems the CNN implementation feasible with respect to memory requirements and basic architecture. We suggest that a CNN implementation on the same FPGA is worthy of further exploration.
Contributors: Lythgoe, Zachary James (Author) / Allee, David (Thesis director) / Hartin, Olin (Committee member) / Electrical Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created: 2019-12
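
A small NumPy sketch of MLP feedforward with 8-bit fixed-point weights, the kind of arithmetic typically mapped onto an FPGA; the layer sizes are MNIST-style, but the weights are random placeholders, not the trained 91.99%-accuracy network from the project.

```python
# MLP feedforward with weights rounded to signed 8-bit fixed point.
import numpy as np

rng = np.random.default_rng(5)
W1, b1 = rng.normal(scale=0.1, size=(784, 32)), np.zeros(32)
W2, b2 = rng.normal(scale=0.1, size=(32, 10)), np.zeros(10)

def quantize(w, frac_bits=6):
    # Round to signed 8-bit fixed point: 1 sign bit, 1 integer bit, 6 fraction bits.
    return np.clip(np.round(w * 2**frac_bits), -128, 127) / 2**frac_bits

def feedforward(x):
    h = np.maximum(0, x @ quantize(W1) + b1)   # ReLU hidden layer
    logits = h @ quantize(W2) + b2
    return np.argmax(logits)                   # predicted digit 0-9

x = rng.random(784)                            # stand-in for a 28x28 pixel image
print(feedforward(x))
```
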