This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations/theses includes degree information, committee members, an abstract, supporting data or media.

In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.

Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection contact or visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.

Displaying 41 - 46 of 46
Filtering by

Clear all filters

161655-Thumbnail Image.png
Description
There has been a substantial development in the field of data transmission in the last two decades. One does not have to wait much for a high-definition video to load on the systems anymore. Data compression is one of the most important technologies that helped achieve this seamless data transmission

There has been a substantial development in the field of data transmission in the last two decades. One does not have to wait much for a high-definition video to load on the systems anymore. Data compression is one of the most important technologies that helped achieve this seamless data transmission experience. It helps to store or send more data using less memory or network resources. However, it appears that there is a limit on the amount of compression that can be achieved with the existing lossless data compression techniques because they rely on the frequency of characters or set of characters in the data. The thesis proposes a lossless data compression technique in which the data is compressed by representing it as a set of parameters that can reproduce the original data without any loss when given to the corresponding mathematical equation. The mathematical equation used in the thesis is the sum of the first N terms in a geometric series. Various changes are made to this mathematical equation so that any given data can be compressed and decompressed. According to the proposed technique, the whole data is taken as a single decimal number and replaced with one of the terms of the used equation. All the other terms of the equation are computed and stored as a compressed file. The performance of the developed technique is evaluated in terms of compression ratio, compression time and decompression time. The evaluation metrics are then compared with the other existing techniques of the same domain.
ContributorsGrewal, Karandeep Singh (Author) / Gonzalez Sanchez, Javier (Thesis advisor) / Bansal, Ajay (Committee member) / Findler, Michael (Committee member) / Arizona State University (Publisher)
Created2021
161626-Thumbnail Image.png
Description
Calculus as a math course is important subject students need to succeed in, in order to venture into STEM majors. This thesis focuses on the early detection of at-risk students in a calculus course which can provide the proper intervention that might help them succeed in the course. Calculus has

Calculus as a math course is important subject students need to succeed in, in order to venture into STEM majors. This thesis focuses on the early detection of at-risk students in a calculus course which can provide the proper intervention that might help them succeed in the course. Calculus has high failure rates which corroborates with the data collected from Arizona State University that shows that 40% of the 3266 students whose data were used failed in their calculus course.This thesis proposes to utilize educational big data to detect students at high risk of failure and their eventual early detection and subsequent intervention can be useful. Some existing studies similar to this thesis make use of open-scale data that are lower in data count and perform predictions on low-impact Massive Open Online Courses(MOOC) based courses. In this thesis, an automatic detection method of academically at-risk students by using learning management systems(LMS) activity data along with the student information system(SIS) data from Arizona State University(ASU) for the course calculus for engineers I (MAT 265) is developed. The method will detect students at risk by employing machine learning to identify key features that contribute to the success of a student. This thesis also proposes a new technique to convert this button click data into a button click sequence which can be used as inputs to classifiers. In addition, the advancements in Natural Language Processing field can be used by adopting methods such as part-of-speech (POS) tagging and tools such as Facebook Fasttext word embeddings to convert these button click sequences into numeric vectors before feeding them into the classifiers. The thesis proposes two preprocessing techniques and evaluates them on 3 different machine learning ensembles to determine their performance across the two modalities of the class.
ContributorsDileep, Akshay Kumar (Author) / Bansal, Ajay (Thesis advisor) / Cunningham, James (Committee member) / Acuna, Ruben (Committee member) / Arizona State University (Publisher)
Created2021
161629-Thumbnail Image.png
Description
One persisting problem in Massive Open Online Courses (MOOCs) is the issue of student dropout from these courses. The prediction of student dropout from MOOC courses can identify the factors responsible for such an event and it can further initiate intervention before such an event to increase student success in

One persisting problem in Massive Open Online Courses (MOOCs) is the issue of student dropout from these courses. The prediction of student dropout from MOOC courses can identify the factors responsible for such an event and it can further initiate intervention before such an event to increase student success in MOOC. There are different approaches and various features available for the prediction of student’s dropout in MOOC courses.In this research, the data derived from the self-paced math course ‘College Algebra and Problem Solving’ offered on the MOOC platform Open edX offered by Arizona State University (ASU) from 2016 to 2020 was considered. This research aims to predict the dropout of students from a MOOC course given a set of features engineered from the learning of students in a day. Machine Learning (ML) model used is Random Forest (RF) and this model is evaluated using the validation metrics like accuracy, precision, recall, F1-score, Area Under the Curve (AUC), Receiver Operating Characteristic (ROC) curve. The average rate of student learning progress was found to have more impact than other features. The model developed can predict the dropout or continuation of students on any given day in the MOOC course with an accuracy of 87.5%, AUC of 94.5%, precision of 88%, recall of 87.5%, and F1-score of 87.5% respectively. The contributing features and interactions were explained using Shapely values for the prediction of the model. The features engineered in this research are predictive of student dropout and could be used for similar courses to predict student dropout from the course. This model can also help in making interventions at a critical time to help students succeed in this MOOC course.
ContributorsDominic Ravichandran, Sheran Dass (Author) / Gary, Kevin (Thesis advisor) / Bansal, Ajay (Committee member) / Cunningham, James (Committee member) / Sannier, Adrian (Committee member) / Arizona State University (Publisher)
Created2021
161678-Thumbnail Image.png
Description
An ontology is a vocabulary that provides a formal description of entities within a domain and their relationships with other entities. Along with basic schema information, it also captures information in the form of metadata about cardinality, restrictions, hierarchy, and semantic meaning. With the rapid growth of semantic (RDF) data

An ontology is a vocabulary that provides a formal description of entities within a domain and their relationships with other entities. Along with basic schema information, it also captures information in the form of metadata about cardinality, restrictions, hierarchy, and semantic meaning. With the rapid growth of semantic (RDF) data on the web, many organizations like DBpedia, Earth Science Information Partners (ESIP) are publishing more and more data in RDF format. The ontology alignment task aims at linking two or more different ontologies from the same domain or different domains. It is a process of finding the semantic relationship between two or more ontological entities and/or instances. Information/data sharing among different systems is quite limited because of differences in data based on syntax, structures, and semantics. Ontology alignment is used to overcome the limitation of semantic interoperability of current vast distributed systems available on the Web. In spite of the availability of large hierarchical domain-specific datasets, automated ontology mapping is still a complex problem. Over the years, many techniques have been proposed for ontology instance alignment, schema alignment, and link discovery. Most of the available approaches require human intervention or work within a specific domain. The challenge involves representing an entity as a vector that encodes all context information of the entity such as hierarchical information, properties, constraints, etc. The ontological representation is rich in comparison with the regular data schema because of metadata about various properties, constraints, relationship to other entities within the domain, etc. While finding similarities between entities this metadata is often overlooked. The second challenge is that the comparison of two ontologies is an intense operation and highly depends on the domain and the language that the ontologies are expressed in. Most current methods require human intervention that leads to a time-consuming and cumbersome process and the output is prone to human errors. The proposed unsupervised recursive neural network technique achieves an F-measure of 80.3% on the Anatomy dataset and the proposed graph neural network technique achieves an F-measure of 81.0% on the Anatomy dataset.
ContributorsChakraborty, Jaydeep (Author) / Bansal, Srividya (Thesis advisor) / Sherif, Mohamed (Committee member) / Bansal, Ajay (Committee member) / Hsiao, Sharon (Committee member) / Arizona State University (Publisher)
Created2021
190879-Thumbnail Image.png
Description
Open Information Extraction (OIE) is a subset of Natural Language Processing (NLP) that constitutes the processing of natural language into structured and machine-readable data. This thesis uses data in Resource Description Framework (RDF) triple format that comprises of a subject, predicate, and object. The extraction of RDF triples from

Open Information Extraction (OIE) is a subset of Natural Language Processing (NLP) that constitutes the processing of natural language into structured and machine-readable data. This thesis uses data in Resource Description Framework (RDF) triple format that comprises of a subject, predicate, and object. The extraction of RDF triples from natural language is an essential step towards importing data into web ontologies as part of the linked open data cloud on the Semantic web. There have been a number of related techniques for extraction of triples from plain natural language text including but not limited to ClausIE, OLLIE, Reverb, and DeepEx. This proposed study aims to reduce the dependency on conventional machine learning models since they require training datasets, and the models are not easily customizable or explainable. By leveraging a context-free grammar (CFG) based model, this thesis aims to address some of these issues while minimizing the trade-offs on performance and accuracy. Furthermore, a deep-dive is conducted to analyze the strengths and limitations of the proposed approach.
ContributorsSingh, Varun (Author) / Bansal, Srividya (Thesis advisor) / Bansal, Ajay (Committee member) / Mehlhase, Alexandra (Committee member) / Arizona State University (Publisher)
Created2023
193524-Thumbnail Image.png
Description
Astronomy has a data de-noising problem. The quantity of data produced by astronomical instruments is immense, and a wide variety of noise is present in this data including artifacts. Many types of this noise are not easily filtered using traditional handwritten algorithms. Deep learning techniques present a potential solution to

Astronomy has a data de-noising problem. The quantity of data produced by astronomical instruments is immense, and a wide variety of noise is present in this data including artifacts. Many types of this noise are not easily filtered using traditional handwritten algorithms. Deep learning techniques present a potential solution to the identification and filtering of these more difficult types of noise. In this thesis, deep learning approaches to two astronomical data de-noising steps are attempted and evaluated. Pre-existing simulation tools are utilized to generate a high-quality training dataset for deep neural network models. These models are then tested on real-world data. One set of models masks diffraction spikes from bright stars in James Webb Space Telescope data. A second set of models identifies and masks regions of the sky that would interfere with sky surface brightness measurements. The results obtained indicate that many such astronomical data de-noising and analysis problems can use this approach of simulating a high-quality training dataset and then utilizing a deep learning model trained on that dataset.
ContributorsJeffries, Charles George (Author) / Bansal, Ajay (Thesis advisor) / Windhorst, Rogier (Committee member) / Acuna, Ruben (Committee member) / Arizona State University (Publisher)
Created2024