Personalized Learning in a Virtual Hands-on Lab Platform for Computer Science Education

Description
Personalized learning is gaining popularity in online computer science education because it paces learning progress and adapts the instructional approach to each individual learner from a diverse background. Among the various instructional methods in computer science education, hands-on labs have unique requirements for understanding learners' behavior and assessing learners' performance for personalization. Hands-on labs are a critical learning approach for cybersecurity education. They provide complex real-world problem scenarios and help learners develop a deeper understanding of knowledge and concepts while solving real-world problems. But there are unique challenges when using hands-on labs for cybersecurity education. Existing hands-on lab exercise materials are usually managed in a problem-centric fashion, and there is no coherent way to manage existing labs and provide productive lab exercise plans for cybersecurity learners. To address these challenges, a personalized learning platform called ThoTh Lab, specifically designed for computer science hands-on labs in a cloud environment, was established. ThoTh Lab can identify the learning style from student activities and adapt learning material accordingly. With awareness of student learning styles, instructors are able to use techniques more suitable for the specific student and hence improve the speed and quality of the learning process. ThoTh Lab also provides student performance prediction, which allows instructors to adjust the learning progress and take other measures to help students in a timely manner. A knowledge graph in the cybersecurity domain is also constructed using natural language processing (NLP) technologies, including word embedding and hyperlink-based concept mining. This knowledge graph is then utilized during the regular learning process to build a personalized lab recommendation system that suggests relevant labs based on students' past learning history to maximize their learning outcomes. To evaluate ThoTh Lab, several in-class experiments were carried out in cybersecurity classes for both graduate and undergraduate students at Arizona State University, and data were collected over several semesters. The case studies show that, by leveraging the personalized lab platform, students tend to be more absorbed in a lab project, show more interest in the cybersecurity area, spend more effort on the project, and gain enhanced learning outcomes.
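The abstract does not say how the recommendation system ranks labs, so the following is only a minimal sketch of one plausible approach: rank labs not yet completed by the cosine similarity between their embedding and the average embedding of the labs in a student's history. The lab names, the three-dimensional vectors, and the function names `cosine` and `recommend_labs` are hypothetical and do not come from ThoTh Lab.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend_labs(lab_vectors, completed, top_k=2):
    """Return up to top_k labs not yet completed, most similar to the history first.

    Assumption: a student's history is summarized as the mean of the
    embeddings of the labs they have completed.
    """
    done = [lab_vectors[name] for name in completed]
    if not done:
        return []
    dim = len(done[0])
    profile = [sum(vec[i] for vec in done) / len(done) for i in range(dim)]
    scored = [
        (cosine(profile, vec), name)
        for name, vec in lab_vectors.items()
        if name not in completed
    ]
    # Highest similarity first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]


# Hypothetical embeddings, invented purely for illustration.
LAB_VECTORS = {
    "firewall-basics": [0.9, 0.1, 0.0],
    "intrusion-detection": [0.8, 0.3, 0.1],
    "sql-injection": [0.1, 0.9, 0.2],
    "buffer-overflow": [0.0, 0.3, 0.9],
}

print(recommend_labs(LAB_VECTORS, ["firewall-basics"]))
```

In a real system the vectors would presumably be derived from the knowledge graph and word embeddings mentioned above rather than written by hand.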
Date Created
2021

OntoConnect: Domain-Agnostic Ontology Alignment using Neural Networks

Description
An ontology is a vocabulary that provides a formal description of entities within a domain and their relationships with other entities. Along with basic schema information, it also captures information in the form of metadata about cardinality, restrictions, hierarchy, and semantic meaning. With the rapid growth of semantic (RDF) data on the web, many organizations, such as DBpedia and Earth Science Information Partners (ESIP), are publishing more and more data in RDF format. The ontology alignment task aims at linking two or more different ontologies from the same domain or different domains. It is a process of finding the semantic relationship between two or more ontological entities and/or instances. Information and data sharing among different systems is quite limited because of differences in data syntax, structure, and semantics. Ontology alignment is used to overcome the limited semantic interoperability of the vast distributed systems currently available on the Web. In spite of the availability of large hierarchical domain-specific datasets, automated ontology mapping is still a complex problem. Over the years, many techniques have been proposed for ontology instance alignment, schema alignment, and link discovery. Most of the available approaches require human intervention or work within a specific domain. The first challenge involves representing an entity as a vector that encodes all context information of the entity, such as hierarchical information, properties, and constraints. The ontological representation is rich in comparison with a regular data schema because of metadata about various properties, constraints, and relationships to other entities within the domain. When finding similarities between entities, this metadata is often overlooked. The second challenge is that the comparison of two ontologies is an intensive operation and depends heavily on the domain and the language in which the ontologies are expressed. Most current methods require human intervention, which leads to a time-consuming and cumbersome process whose output is prone to human error. The proposed unsupervised recursive neural network technique achieves an F-measure of 80.3% on the Anatomy dataset, and the proposed graph neural network technique achieves an F-measure of 81.0% on the Anatomy dataset.
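For readers unfamiliar with the metric, the F-measure reported above is the harmonic mean of precision and recall over the predicted entity correspondences. The sketch below shows the standard calculation. The entity identifiers are hypothetical, and the example is not drawn from the Anatomy dataset or from OntoConnect's own evaluation code.

```python
def f_measure(predicted, reference):
    """Harmonic mean of precision and recall for a set of predicted alignments.

    Each alignment is a (source_entity, target_entity) pair.
    """
    predicted, reference = set(predicted), set(reference)
    correct = len(predicted & reference)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(reference) if reference else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference alignment and system output.
reference = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")}
predicted = {("a1", "b1"), ("a2", "b2"), ("a3", "b9")}

# Precision is 2/3 and recall is 2/4, so the F-measure is 4/7.
print(round(f_measure(predicted, reference), 4))
```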
Date Created
2021

Modeling the Complexity of Sankey Diagrams

Description
In this Barrett Honors Thesis, I developed a model to quantify the complexity of Sankey diagrams, a type of visualization that shows flow between groups. To do this, I created a carefully controlled dataset of synthetic Sankey diagrams of varying sizes as study stimuli. Then, a pair of online crowdsourced user studies was conducted and analyzed. User performance for Sankey diagrams of varying size and features (number of groups, number of timesteps, and number of flow crossings) was algorithmically modeled as a formula to quantify the complexity of these diagrams. Model accuracy was measured based on the performance of users in the second crowdsourced study. The results of my experiment conclusively demonstrate that the algorithmic complexity formula I created closely models the visual complexity of the Sankey diagrams in the dataset.
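The abstract does not give the complexity formula itself, but one of its inputs, the number of flow crossings, can be illustrated. The sketch below assumes each flow between two adjacent timesteps is described by the vertical positions of its source and target groups, and counts two flows as crossing when their source order and target order disagree. This representation and the function name `count_flow_crossings` are assumptions made for illustration, not the thesis's implementation.

```python
from itertools import combinations


def count_flow_crossings(flows):
    """Count crossing pairs among flows between two adjacent timesteps.

    Each flow is a (source_position, target_position) pair, where positions
    are the vertical ranks of the groups in their columns (0 = top).
    Two flows cross when one starts above the other but ends below it.
    """
    return sum(
        1
        for (s1, t1), (s2, t2) in combinations(flows, 2)
        if (s1 - s2) * (t1 - t2) < 0
    )


# Three flows: top group flows to the bottom, the other two shift up by one.
flows = [(0, 2), (1, 0), (2, 1)]
print(count_flow_crossings(flows))  # 2
```

Flows that share a source or a target group are not counted as crossing under this definition, since the product is then zero.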

Date Created
2021-05

Developing a Neural Network Based Adaptive Task Selection System for an Undergraduate Level Organic Chemistry Course

Description
In the last decade, the immense growth of computational power, enhanced data storage capabilities, and the increasing popularity of online learning systems have led to adaptive learning systems becoming more widely available. Parallel to infrastructure enhancements, more researchers have started to study adaptive task selection systems, concluding that suggesting tasks appropriate to students' needs may increase students' learning gains.

This work built an adaptive task selection system for undergraduate organic chemistry students using a deep learning algorithm. The proposed model is based on a recurrent neural network (RNN) architecture built with Long Short-Term Memory (LSTM) cells that recommends organic chemistry practice questions to students depending on their previous question selections.

For this study, educational data were collected from the Organic Chemistry Practice Environment (OPE) that is used in the Organic Chemistry course at Arizona State University. The OPE has more than three thousand questions. Each question is linked to one or more knowledge components (KCs) to enable recommendations that precisely address the knowledge that students need. Subject matter experts made the connection between questions and related KCs.

A linear model derived from students' exam results was used to identify skilled students. The neural network based recommendation system was trained using those skilled students' problem-solving attempt sequences so that the trained system recommends questions that will likely improve learning gains the most. The model was evaluated by measuring the predicted questions' accuracy against learners' actual task selections. The proposed model accurately predicted not only the learners' actual task selections but also the correctness of their answers.
Date Created
2020

Predicting Outcome of a Pitch Given the Type of Pitch for any Baseball Scenario

Description
This thesis serves as a baseline for the potential for prediction through machine learning (ML) in baseball. Hopefully, it will also serve as motivation for future work to expand and reach the potential of sabermetrics, advanced Statcast data, and machine learning. The problem this thesis attempts to solve is predicting the outcome of a pitch. Given proper pitch data and situational data, is it possible to predict the result or outcome of a pitch? The result or outcome refers to the specific outcome of a pitch beyond ball or strike: for example, if the hitter puts the ball in play for a double, this thesis shows how I attempted to predict that type of outcome. Before diving into my methods, I take a deep look into sabermetrics, advanced statistics, and the history of the two in Major League Baseball. After this, I describe my implemented machine learning experiment. First, I found a dataset that is suitable for training a pitch prediction model. I then analyzed the features and used some feature engineering to select a set of 16 features. Finally, I trained and tested a pair of ML models on the data: a decision tree classifier and a random forest classifier. Each classifier performed at around 60% accuracy. I also experimented with a neural network approach using a long short-term memory (LSTM) model to improve my score, but came up short; this approach requires more feature engineering to beat the simpler classifiers. In this thesis, I show examples of five hitters that I test the models on and the accuracy for each hitter. This work shows promise that advanced classification models (likely requiring more feature engineering) can provide even better prediction outcomes, perhaps with 70% accuracy or higher. There is much potential for future work to improve on this thesis, mainly through the proper construction of a neural network, more in-depth feature analysis/selection/extraction, and data visualization.
Date Created
2020-05

Providing Intelligent and Adaptive Support in Concept Map-based Learning Environments

Description
Concept maps are commonly used knowledge visualization tools and have been shown to have a positive impact on learning. The main drawbacks of concept mapping are the requirement of training, and lack of feedback support. Thus, prior research has attempted to provide support and feedback in concept mapping, such as by developing computer-based concept mapping tools, offering starting templates and navigational supports, as well as providing automated feedback. Although these approaches have achieved promising results, there are still challenges that remain to be solved. For example, there is a need to create a concept mapping system that reduces the extraneous effort of editing a concept map while encouraging more cognitively beneficial behaviors. Also, there is little understanding of the cognitive process during concept mapping. What’s more, current feedback mechanisms in concept mapping only focus on the outcome of the map, instead of the learning process.

This thesis work strives to answer the fundamental research question: How can computer technologies be leveraged to intelligently support concept mapping and promote meaningful learning? To approach this research question, I first present an intelligent concept mapping system, MindDot, that supports concept mapping via innovative integration of two features: hyperlink navigation and expert templates. The system reduces the effort of creating and modifying concept maps while encouraging beneficial activities such as comparing related concepts and establishing relationships among them. I then present the comparative strategy metric that models student learning by evaluating behavioral patterns and learning strategies. Lastly, I develop an adaptive feedback system that provides immediate diagnostic feedback in response to both the key learning behaviors during concept mapping and the correctness and completeness of the created maps.

Empirical evaluations indicated that the integrated navigational and template support in MindDot fostered effective learning behaviors and facilitated learning achievements. The comparative strategy model was shown to be highly representative of learning characteristics such as motivation, engagement, and misconceptions, and it predicted learning results. The feedback tutor also demonstrated positive impacts on supporting learning and assisting the development of effective learning strategies that prepare learners for future learning. This dissertation contributes to the field of supporting concept mapping with designs of technological affordances, a process-based student model, an adaptive feedback tutor, empirical evaluations of these proposed innovations, and implications for future support in concept mapping.
Date Created
2019

Explainable AI in Workflow Development and Verification Using Pi-Calculus

Description
Computer science education is an increasingly vital area of study, with various challenges that increase the difficulty level for new students, resulting in higher attrition rates. As part of an effort to resolve this issue, a new visual programming language environment was developed for this research, the Visual IoT and Robotics Programming Language Environment (VIPLE). VIPLE is based on computational thinking and flowcharts, which reduces the need to memorize the detailed syntax of text-based programming languages. VIPLE has been used at Arizona State University (ASU) in multiple years and sections of FSE100 as well as in universities worldwide. Another major issue with teaching large programming classes is the potential lack of qualified teaching assistants to grade and offer insight into a student’s programs at a level beyond output analysis.

In this dissertation, I propose a novel framework for performing semantic autograding, which analyzes student programs at a semantic level to help students learn with additional and systematic help. A general autograder is not practical for general programming languages, due to the flexibility of semantics. A practical autograder is possible in VIPLE, because of its simplified syntax and restricted options of semantics. The design of this autograder is based on the concept of theorem provers. To achieve this goal, I employ a modified version of Pi-Calculus to represent VIPLE programs and Hoare Logic to formalize program requirements. By building on the inference rules of Pi-Calculus and Hoare Logic, I am able to construct a theorem prover that can perform automated semantic analysis. Furthermore, building on this theorem prover enables me to develop a self-learning algorithm that can learn the conditions for a program’s correctness according to a given solution program.
Date Created
2020

Advancing Large-Scale Creativity through Adaptive Inspirations and Research in Context

Description
An old proverb claims that “two heads are better than one”. Crowdsourcing research and practice have taken this to heart, attempting to show that thousands of heads can be even better. This is not limited to leveraging a crowd’s knowledge, but also their creativity—the ability to generate something not only useful, but also novel. In practice, there are initiatives such as Free and Open Source Software communities developing innovative software. In research, the field of crowdsourced creativity, which attempts to design scalable support mechanisms, is blooming. However, both contexts still present many opportunities for advancement.

In this dissertation, I seek to advance both the knowledge of limitations in current technologies used in practice and the mechanisms that can be used for large-scale support. The overall research question I explore is: “How can we support large-scale creative collaboration in distributed online communities?” I first advance existing support techniques by evaluating the impact of active support on brainstorming performance. Furthermore, I leverage existing theoretical models of individual idea generation as well as recommender system techniques to design CrowdMuse, a novel adaptive large-scale idea generation system. CrowdMuse models users in order to adapt itself to each individual. I evaluate the system’s efficacy through two large-scale studies. I also advance knowledge of current large-scale practices by examining common communication channels under the lens of Creativity Support Tools, yielding a list of creativity bottlenecks brought about by the affordances of these channels. Finally, I connect both ends of this dissertation by deploying CrowdMuse in an Open Source online community for two weeks. I evaluate the community’s usage of the system as well as its perceived benefits and issues compared to traditional communication tools.

This dissertation makes the following contributions to the field of large-scale creativity: 1) the design and evaluation of a first-of-its-kind adaptive brainstorming system; 2) the evaluation of the effects of active inspirations compared to simple idea exposure; 3) the development and application of a set of creativity support design heuristics to uncover creativity bottlenecks; and 4) an exploration of large-scale brainstorming systems’ usefulness to online communities.
Date Created
2019

Empowering Women in Zambia through Computational Thinking Curriculum

Description
The nonprofit organization I Am Zambia works to give supplemental education to young women in Lusaka. I Am Zambia is creating sustainable change by educating these young women, who can then lift their families and communities out of poverty. The ultimate goal of this thesis was to explore and implement high-level systematic problem solving through a basic and specialized computational thinking curriculum at I Am Zambia in order to give these women an even larger stepping stone into a successful future.

To do this, a 4-week-long pilot curriculum was created, implemented, and tested through an optional class at I Am Zambia, available to women who had already graduated from the year-long I Am Zambia Academy program. A total of 18 women ages 18-24 chose to enroll in the course. There were a total of 10 lessons, taught over 20 class periods. These lessons covered four main computational thinking frameworks: introduction to computational thinking, algorithmic thinking, pseudocode, and debugging. Knowledge retention was tested through the use of a CS educational tool, QuizIt, created by the CSI Lab of the School of Computing, Informatics and Decision Systems Engineering at Arizona State University. Furthermore, pre- and post-tests were given to assess the success of the curriculum in teaching students the aforementioned concepts. Of the 18 students, 14 successfully completed both the pre- and post-tests.

Limitations of this study and suggestions for how to improve this curriculum in order to extend it into a year-long course are also presented at the conclusion of this paper.
Date Created
2019-05

Towards Addressing Key Visual Processing Challenges in Social Media Computing

Description
Visual processing in social media platforms is a key step in gathering and understanding information in the era of the Internet and big data. Online data is rich in content, but its processing faces many challenges, including varying scales for objects of interest, unreliable and/or missing labels, the inadequacy of single-modal data, and difficulty in analyzing high-dimensional data. Towards facilitating the processing and understanding of online data, this dissertation primarily focuses on three challenges that I feel are of great practical importance: handling scale differences in computer vision tasks, such as facial component detection and face retrieval; developing efficient classifiers using partially labeled data and noisy data; and employing multi-modal models and feature selection to improve multi-view data analysis. For the first challenge, I propose a scale-insensitive algorithm to detect facial landmarks quickly and accurately. For the second challenge, I propose two algorithms that can be used to learn from partially labeled data and noisy data, respectively. For the third challenge, I propose a new framework that incorporates feature selection modules into LDA models.
Date Created
2018