Using ML to Predict Online Course Ratings

Description

The pandemic that hit in 2020 boosted the growth of online learning, including the boom in Massive Open Online Courses (MOOCs). To support this situation, it would be helpful to have tools that help students choose between courses and help instructors understand what students need. One such tool is an online course ratings predictor. Using the predictor, online course instructors can learn the qualities that the majority of course takers deem important and adjust their lesson plans to fit those qualities. Meanwhile, students can use it to help choose a course by comparing the ratings. This research aims to find the best way to predict the rating of online courses using machine learning (ML). To create the ML model, different combinations of the following are used as inputs: the length of the course, the number of materials it contains, the price of the course, the number of students taking the course, the course’s difficulty level, the usage of jargon or technical terms in the course description, the instructors’ rating, the number of reviews the instructors received, and the number of classes the instructors have created on the same platform. The output of the model is the average rating of a course. Data from 350 courses are used for this model: 280 for training, 35 for testing, and the last 35 for validation. After trying out different machine learning models, the wide neural network model consistently gives the best training results, while the medium tree model gives the best testing results. However, further research needs to be conducted, as none of the results are accurate, with a test R-squared of 0.51 for the tree model.
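As a rough illustration of the evaluation setup described in this abstract, the sketch below splits 350 records into 280/35/35 and computes R-squared from predictions. It is a minimal sketch under assumptions: the function names, the random shuffling, and the seed are hypothetical and are not taken from the thesis, which does not state how the split was drawn.

```python
import random


def split_indices(n, n_train, n_test, n_val, seed=0):
    """Shuffle range(n) and cut it into train/test/validation index lists.

    Random shuffling with a fixed seed is an assumption for illustration.
    """
    if n_train + n_test + n_val != n:
        raise ValueError("split sizes must add up to n")
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return (
        idx[:n_train],
        idx[n_train:n_train + n_test],
        idx[n_train + n_test:],
    )


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Sizes from the abstract: 350 courses, 280 train, 35 test, 35 validation.
train, test, val = split_indices(350, 280, 35, 35)
```

An R-squared of 1.0 means the predictions match the ratings exactly, while 0.0 means the model does no better than always predicting the mean rating, which puts the reported 0.51 roughly halfway between the two.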

Contributors: Widodo, Herlina (Author) / VanLehn, Kurt (Thesis director) / Craig, Scotty (Committee member) / Barrett, The Honors College (Contributor) / Department of Management and Entrepreneurship (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2021-12

Effects of Training Dataset Variance on Artificial Intelligence Image Generation

Description

This research paper explores the effects of data variance on the quality of Artificial Intelligence image generation models and the impact on a viewer's perception of the generated images. The study examines how the quality and accuracy of the images produced by these models are influenced by factors such as the size, labeling, and format of the training data. The findings suggest that reducing the training dataset size can lead to a decrease in image coherence, indicating that AI models get worse as the training dataset gets smaller. Moreover, the study makes surprising discoveries regarding AI image generation models that are trained on highly varied datasets. In addition, the study involves a survey in which people were asked to rate the subjective realism of the generated images on a scale from 1 to 5 and to sort the images into their respective classes. The findings of this study emphasize the importance of considering dataset variance and size as a critical aspect of improving image generation models, as well as the implications of using AI technology in the future.

Contributors: Punyamurthula, Rushil (Author) / Carter, Lynn (Thesis director) / Sarmento, Rick (Committee member) / Barrett, The Honors College (Contributor) / School of Sustainability (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2023-05

An Introduction to Unstructured Case Management

Description

In the age of information, collecting and processing large amounts of data is an integral part of running a business. From training artificial intelligence to driving decision making, the applications of data are far-reaching. However, many types of data are difficult to process, unstructured data in particular. Unstructured data is “information that either does not have a predefined data model or is not organized in a pre-defined manner” (Balducci & Marinova 2018). Such data are difficult to put into spreadsheets and relational databases due to their lack of numeric values and often come in the form of text fields written by consumers (Wolff, R. 2020). The goal of this project is to help develop a machine learning model to aid CommonSpirit Health and ServiceNow, which is why an approach using unstructured data was selected. This paper provides a general overview of the process of unstructured data management and explores some existing implementations and their efficacy. It then discusses our approach to converting unstructured cases into usable data, which were used to develop an artificial intelligence model that is estimated to be worth $400,000 and to save CommonSpirit Health $1,200,000 in organizational impact.

Contributors: Bergsagel, Matteo (Author) / De Waard, Jan (Co-author) / Chavez-Echeagaray, Maria Elena (Thesis director) / Burns, Christopher (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2022-05

Examining Descriptiveness of Narratives Generated using Planning and Large Language Models

Description

Narrative generation is an important field due to the high demand for stories in video game design and in classroom learning tools. Because these stories should contain depth, it is desirable for them to be more descriptive. There are tools that help with the creation of these stories, such as planning, which requires a domain as input, or GPT-3, which requires an input prompt to generate the stories. However, other aspects to consider are the coherence and variation of stories. To save time and effort and create multiple possible stories, we combined planning and the Large Language Model (LLM) GPT-3, similar to how they were used in TattleTale, to generate such stories while examining whether descriptive input prompts to GPT-3 affect the output stories. The stories generated are readable by the general public and, overall, the prompts do not consistently affect the descriptiveness of outputs across all stories tested. For this work, three stories with three variants each were created and tested for descriptiveness. To do so, adjectives, adverbs, prepositional phrases, and subordinating conjunctions were counted using the Natural Language Processing (NLP) tool spaCy for Part Of Speech (POS) tagging. This work has shown that descriptiveness is highly correlated with the number of words in the story in general, so running GPT-3 to obtain longer stories is a feasible way to obtain more descriptive stories. The limitations of GPT-3 have an impact on the descriptiveness of the resulting stories due to GPT-3’s inconsistency and transformer architecture, and other methods of narrative generation, such as simple planning, could be more useful.
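The counting step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's code: it takes tokens that have already been tagged with Universal POS labels (in spaCy these would come from each token's `pos_` attribute), it treats every adposition (`ADP`) as a stand-in for a prepositional phrase, and the function names and the length-normalized ratio are hypothetical additions.

```python
from collections import Counter

# Universal POS tags used as descriptiveness markers in this sketch.
# Counting ADP tokens as a proxy for prepositional phrases is an assumption.
DESCRIPTIVE_TAGS = {"ADJ", "ADV", "ADP", "SCONJ"}


def descriptiveness_counts(tagged_tokens):
    """Count descriptive tags in a list of (text, pos) pairs."""
    counts = Counter(pos for _, pos in tagged_tokens if pos in DESCRIPTIVE_TAGS)
    return {tag: counts.get(tag, 0) for tag in sorted(DESCRIPTIVE_TAGS)}


def descriptive_ratio(tagged_tokens):
    """Share of non-punctuation tokens that carry a descriptive tag."""
    words = [pos for _, pos in tagged_tokens if pos != "PUNCT"]
    if not words:
        return 0.0
    return sum(pos in DESCRIPTIVE_TAGS for pos in words) / len(words)


# Hand-tagged example sentence; a tagger such as spaCy would supply these tags.
sample = [
    ("The", "DET"), ("old", "ADJ"), ("knight", "NOUN"), ("rode", "VERB"),
    ("slowly", "ADV"), ("through", "ADP"), ("the", "DET"), ("forest", "NOUN"),
    ("because", "SCONJ"), ("night", "NOUN"), ("fell", "VERB"), (".", "PUNCT"),
]
```

Because the abstract reports that raw descriptive counts rise with story length, a ratio like `descriptive_ratio` is one possible way to compare stories of different lengths on equal footing.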

Contributors: Dozier, Courtney (Author) / Chavez-Echeagaray, Maria Elena (Thesis director) / Benjamin, Victor (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2022-12

Probable Perils: An Analysis of the Workings and Social Impact of ChatGPT

Description

This thesis analyzes the potential issues of using ChatGPT since, despite its benefits, it raises concerns that may hinder societal progress. The thesis first provides insight into how ChatGPT generates text and how the process of generating its outputs can lead to a variety of issues, such as hallucinated and biased output. After explaining how these issues occur, the thesis focuses on the impact of these issues in important industries such as medicine, education, and security, comparing them to popular open-source models such as Llama and Falcon.

Contributors: Tsai, Brandon (Author) / Martin, Thomas (Thesis director) / Shakarian, Paulo (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2024-05

Applications of Machine Learning to Botanical Classification

Description

In the field of botany, it is often necessary for plants to be identified based on their phenotypical characteristics, whether in person or using previously collected image samples. This work can be tedious and challenging for a human botanist to complete, as datasets can be large and several species of plants strongly resemble each other. Various machine learning techniques, both supervised and unsupervised, can address this task with varying degrees of accuracy and efficiency thanks to their ability to identify subtle patterns in data. The objective of this research is both to review previous studies that measure the effectiveness of various machine learning methods for plant identification and to build and test various models in order to compare the accuracy and efficiency of the set of techniques. A review of the existing literature found that any of the studied machine learning techniques can yield a high level of accuracy when used in the correct situations and on a suitable dataset. The results gathered from the models built for this research show that, all else being equal, complex convolutional neural networks perform the best on this task, yielding an accuracy of 85.4% on the larger dataset. The other models tested, in descending order of accuracy on the same dataset, are k-nearest neighbors, random forest, k-means clustering, and a decision tree classifier.

Contributors: Olsen, Laela (Author) / Carter, Lynn Robert (Thesis director) / Bhargav, Vishnu (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2024-05

Artificial Intelligence with Graph Neural Networks Applied to a Risk-like Board Game

Description

This project aspires to develop an AI capable of playing on a variety of maps in a Risk-like board game. While AI has been successfully applied to many other board games, such as Chess and Go, most research is confined to a single board and is inflexible to topological changes. Further, almost all of these games are played on a rectangular grid. In contrast, this project develops an AI player, referred to as GG-net, to play the online strategy game Warzone, which is based on the classic board game Risk. Warzone is played on a wide variety of irregularly shaped maps. Prior research has struggled to create an effective AI for Risk-like games due to the immense branching factor. The most successful attempts tended to rely on manually restricting the set of actions the AI considered while also engineering useful features for the AI to consider. GG-net uses no human knowledge, but rather a genetic algorithm combined with a graph neural network. Together, these methods allow GG-net to perform competitively across a multitude of maps. GG-net outperformed the built-in rule-based AI by 413 Elo (representing an 80.7% chance of winning) and an approach based on AlphaZero using graph neural networks by 304 Elo (representing a 74.2% chance of winning). This same advantage holds across both seen and unseen maps. GG-net appears to be a strong opponent on both small and medium maps. However, on large maps with hundreds of territories, inefficiencies in GG-net become more significant and GG-net struggles against the rule-based approach. Overall, GG-net was able to successfully learn the game and generalize across maps of a similar size, although further work is required for GG-net to become more successful on large maps.

Contributors: Bauer, Andrew (Author) / Yang, Yezhou (Thesis director) / Harrison, Blake (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor) / School of Mathematical and Statistical Sciences (Contributor)

Created: 2022-05

Creative Frameworks: Developing Accessible Technological Frameworks for Creative Expression

Description

Artistic expression can be made more accessible through the use of technological interfaces such as auditory analysis, generative artificial intelligence models, and simplification of complicated systems, providing a way for human-driven creativity to serve as an input that allows users to creatively express themselves. Studies and testing were done with industry-standard performance technology and protocols to create an accessible interface for creative expression. Artificial intelligence models were created to generate art based on simple text inputs. Users were then invited to display their creativity using the software, and a comprehensive performance showcased the potential of the system for artistic expression.

Contributors: Pardhe, Joshua (Author) / Lim, Kang Yi (Co-author) / Meuth, Ryan (Thesis director) / Brian, Jennifer (Committee member) / Hermann, Kristen (Committee member) / Barrett, The Honors College (Contributor) / Dean, W.P. Carey School of Business (Contributor) / Watts College of Public Service & Community Solut (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2022-05

Creative Frameworks: Developing Accessible Technological Frameworks for Creative Expression

Description

Artistic expression can be made more accessible through the use of technological interfaces such as auditory analysis, generative artificial intelligence models, and simplification of complicated systems, providing a way for human-driven creativity to serve as an input that allows users to creatively express themselves. Studies and testing were done with industry-standard performance technology and protocols to create an accessible interface for creative expression. Artificial intelligence models were created to generate art based on simple text inputs. Users were then invited to display their creativity using the software, and a comprehensive performance showcased the potential of the system for artistic expression.

Contributors: Lim, Kang Yi (Author) / Pardhe, Joshua (Co-author) / Meuth, Ryan (Thesis director) / Brian, Jennifer (Committee member) / Hermann, Kristen (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2022-05

Investigating Stress Among Police Training Cadets Using Machine Learning

Description

As threats emerge, change, and grow, the life of a police officer continues to intensify. To help support police training curriculums and police cadets through this critical career juncture, this study proposes a state-of-the-art approach to stress prediction and intervention through wearable devices and machine learning models. As an integral first step of a larger study, the goal of this research is to provide relevant information to machine learning models to formulate a correlation between stress and police officers’ physiological responses on and off the job. Fitbit devices were leveraged for data collection and were complemented with a custom-built Fitbit application, called StressManager, and a study dashboard, termed StressWatch. This analysis uses data collected from 15 training cadets at the Phoenix Police Regional Training Academy over a 13-week span. Close collaboration with these participants was essential; the quality of data collection relied on consistent “syncing” and troubleshooting of the Fitbit devices. After the data were collected and cleaned, features related to steps, calories, movement, location, and heart rate were extracted from the Fitbit API and other supplemental resources and passed to empirically chosen machine learning models. From the results of these models, we conclude that events of increased intensity combined with physiological spikes contribute to the overall stress perception of a police training cadet.

Contributors: Paranjpe, Tara (Author) / Zhao, Ming (Thesis director) / Roberts, Nicole (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2022-05
