Maroon and Gold: Mobile Application

Description

Currently, students at Arizona State University are restricted to cards when using their college's local currency. This currency, Maroon and Gold dollars (M&G), is a primary source of meal plans for many students. Relying on card readers costs students both security and convenience. Security is at risk because every card carries the student's identification number, which never changes. If the student loses their card, their account information is permanently compromised. Convenience suffers because students must currently make a purchase in order to see their account balance. Another major issue is that businesses must purchase external hardware in order to use the M&G system. An online or mobile system would eliminate the need for a physical card and allow businesses to function without external card readers. Such a system would have access to the financial information of businesses and students at ASU. Thus, the system requires rigorous scrutiny by a well-trusted team of professionals before being implemented. My objective was to help bring such a system to life. To do this, I decided to make a mobile application prototype to serve as a baseline and to demonstrate the features of such a system. As a baseline, it needed to have a realistic, professional appearance and the ability to accurately demonstrate feature functionality. Before developing the app, I set out to determine the user interface and user experience (UI/UX) designs by conducting a series of informal interviews with local students and businesses. After the designs were finalized, I started implementing the actual application in Android Studio. This creative project consists of a mobile application, a contained database, a GUI (Graphical User Interface) prototype, and a technical document.

Contributors: Reigel, Justin Bryce (Author) / Bansal, Ajay (Thesis director) / Lindquist, Timothy (Committee member) / Software Engineering (Contributor) / Barrett, The Honors College (Contributor)

Created: 2016-05

Collaborative Computation in Self-Organizing Particle Systems

Description

Many forms of programmable matter have been proposed for various tasks. We use an abstract model of self-organizing particle systems for programmable matter which could be used for a variety of applications, including smart paint and coating materials for engineering or programmable cells for medical uses. Previous research using this model has focused on shape formation and other spatial configuration problems, including line formation, compression, and coating. In this work we study foundational computational tasks that exceed the capabilities of the individual constant memory particles described by the model. These tasks represent new ways to use these self-organizing systems, which, in conjunction with previous shape and configuration work, make the systems useful for a wider variety of tasks. We present an implementation of a counter using a line of particles, which makes it possible for the line of particles to count to and store values much larger than their individual capacities. We then present an algorithm that takes a matrix and a vector as input and then sets up and uses a rectangular block of particles to compute the matrix-vector multiplication. This setup also utilizes the counter implementation to store the resulting vector from the matrix-vector multiplication. Operations such as counting and matrix multiplication can leverage the distributed and dynamic nature of the self-organizing system to be more efficient and adaptable than on traditional linear computing hardware. Such computational tools also give the systems more power to make complex decisions when adapting to new situations or to analyze the data they collect, reducing reliance on a central controller for setup and output processing. Finally, we demonstrate an application of similar types of computations with self-organizing systems to image processing, with an implementation of an image edge detection algorithm.
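The counter idea can be illustrated with a minimal sketch. It assumes each particle stores a single bit plus a one-bit carry flag and can signal only its immediate neighbor in the line; the `Particle` class and the `increment` and `value` helpers are hypothetical names for illustration, not the implementation from the thesis.

```python
class Particle:
    """A particle with constant memory: one stored bit and one pending-carry flag."""

    def __init__(self):
        self.bit = 0
        self.carry_in = False


def increment(line):
    """Add one to the counter by injecting a carry at the least significant particle."""
    line[0].carry_in = True
    # Each particle inspects only its own state and signals its right-hand neighbor.
    for i, particle in enumerate(line):
        if not particle.carry_in:
            break  # no carry reached this particle, so the increment is complete
        particle.carry_in = False
        if particle.bit == 0:
            particle.bit = 1
        else:
            particle.bit = 0
            if i + 1 < len(line):
                line[i + 1].carry_in = True
            # Otherwise the counter overflows and the carry is dropped.


def value(line):
    """Read the counter from outside the system (for checking only)."""
    return sum(particle.bit << i for i, particle in enumerate(line))


line = [Particle() for _ in range(8)]
for _ in range(200):
    increment(line)
print(value(line))
```

Eight single-bit particles together hold values up to 255, although no individual particle ever stores more than two bits of state.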

Contributors: Porter, Alexandra Marie (Author) / Richa, Andrea (Thesis director) / Xue, Guoliang (Committee member) / School of Music (Contributor) / Computer Science and Engineering Program (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Barrett, The Honors College (Contributor)

Created: 2016-12

A Novel Historical Safety Metric for Evaluating Road Networks

Description

37,461 automobile accident fatalities occurred in the United States in 2016 ("Quick Facts 2016", 2017). Improving the safety of roads has traditionally been approached by governmental agencies including the National Highway Traffic Safety Administration and State Departments of Transportation. In past literature, automobile crash data is analyzed using time-series prediction techniques to identify road segments and/or intersections likely to experience future crashes (Lord & Mannering, 2010). After dangerous zones have been identified, road modifications can be implemented to improve public safety. This project introduces a historical safety metric for evaluating the relative danger of roads in a road network. The historical safety metric can be used to update the routing choices of individual drivers, improving public safety by avoiding historically more dangerous routes. The metric is constructed using crash frequency, severity, location, and traffic information. An analysis of publicly available crash and traffic data in Allegheny County, Pennsylvania is used to generate the historical safety metric for a specific road network. Methods for evaluating routes based on the presented historical safety metric are included, using the Mann-Whitney U test to evaluate the significance of routing decisions. The evaluation method presented requires that routes have at least 20 crashes to be compared with significance testing. The safety of the road network is visualized using a heatmap that presents the distribution of the metric throughout Allegheny County.
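The route comparison can be sketched as follows. This is a minimal illustration that assumes each route is represented by a list of per-crash metric scores (a hypothetical input format) and uses the normal approximation to the Mann-Whitney U distribution without tie or continuity correction; the project's actual procedure may differ. Only the 20-crash minimum comes from the abstract.

```python
import math

MIN_CRASHES = 20  # minimum crashes per route, as stated in the abstract


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(route_a, route_b):
    """Return (U, two-sided p-value) using the normal approximation."""
    n1, n2 = len(route_a), len(route_b)
    if n1 < MIN_CRASHES or n2 < MIN_CRASHES:
        raise ValueError("each route needs at least 20 crashes")
    ranks = average_ranks(list(route_a) + list(route_b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p
```

A small p-value suggests that the two routes' crash scores come from different distributions, which is the kind of evidence a routing decision could rest on.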

Contributors: Gupta, Ariel Meron (Author) / Bansal, Ajay (Thesis director) / Sodemann, Angela (Committee member) / Engineering Programs (Contributor) / Barrett, The Honors College (Contributor)

Created: 2017-12

ReL GoalD (Reinforcement Learning for Goal Dependencies)

Description

This project explores the use of deep neural networks for selecting actions to execute within an environment in order to achieve a goal. Scenarios like this are common in crafting-based games such as Terraria or Minecraft. Goals in these environments have recursive sub-goal dependencies which form a dependency tree. An agent operating within these environments has access to little data about the environment before interacting with it, so it is crucial that the agent can effectively use a tree of dependencies and its environmental surroundings to make judgements about which sub-goals are most efficient to pursue at any point in time. A successful agent aims to minimize cost when completing a given goal. A deep neural network in combination with Q-learning techniques was employed to act as the agent in this environment. This agent consistently performed better than agents using alternate models (models that used dependency tree heuristics or human-like approaches to make sub-goal oriented choices), with an average performance advantage of 33.86% (with a standard deviation of 14.69%) over the best alternate agent. This shows that machine learning techniques can be consistently employed to make goal-oriented choices within an environment with recursive sub-goal dependencies and little pre-known information.
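The Q-learning component can be illustrated with a tabular update rule. This is a deliberate simplification: the project used a deep neural network to approximate Q-values, whereas the sketch below stores them in a dictionary. The state and action names, the learning rate `alpha`, the discount `gamma`, and the choice to encode cost as negative reward are all assumptions made for illustration.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """Apply one Q-learning step:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).
    """
    # A terminal state has no next actions, so its future value is 0.
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
# Gathering wood costs 1 unit of effort, encoded here as a reward of -1.
q_update(q, "no_wood", "gather_wood", -1.0, "has_wood", ["craft_planks"])
```

Repeating such updates over many episodes lets the agent prefer the sub-goal whose action carries the highest learned value, which corresponds to the lowest expected cost under this encoding.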

Contributors: Koleber, Derek (Author) / Acuna, Ruben (Thesis director) / Bansal, Ajay (Committee member) / W.P. Carey School of Business (Contributor) / Software Engineering (Contributor) / Barrett, The Honors College (Contributor)

Created: 2018-05

Locating counting sensors in traffic network to estimate origin-destination volumes

Description

Improving the quality of Origin-Destination (OD) demand estimates increases the effectiveness of design, evaluation, and implementation of traffic planning and management systems. The associated bilevel Sensor Location Flow-Estimation problem considers two important research questions: (1) how to compute the best estimates of the flows of interest by using anticipated data from given candidate sensor locations; and (2) how to decide on the optimum subset of links where sensors should be located. In this dissertation, a decision framework is developed to optimally locate sensors and obtain high-quality OD volume estimates in vehicular traffic networks. The framework includes a traffic assignment model to load the OD traffic volumes on routes in a known choice set, a sensor location model to decide on which subset of links to locate counting sensors to observe traffic volumes, and an estimation model to obtain the best estimates of OD or route flow volumes. The dissertation first addresses the deterministic route flow estimation problem given a priori knowledge of route flows and their uncertainties. Two procedures are developed to locate "perfect" and "noisy" sensors respectively. Next, it addresses a stochastic route flow estimation problem. A hierarchical linear Bayesian model is developed, where the real route flows are assumed to be generated from a Multivariate Normal distribution with two parameters: "mean" and "variance-covariance matrix". The prior knowledge for the "mean" parameter is described by a probability distribution. When the "variance-covariance matrix" parameter is assumed to be known, a Bayesian A-optimal design is developed. When the "variance-covariance matrix" parameter is unknown, a Markov Chain Monte Carlo approach is used to estimate the a posteriori quantities. In all the sensor location models, the objective is to maximize the reduction in the variances of the distribution of the estimates of the OD volume.
The developed models are compared with other models available in the literature. The comparison showed that the developed models performed better than the available models.
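The variance-reduction objective can be written out for the textbook linear-Gaussian case. The notation below is assumed for illustration and is not taken from the dissertation, whose hierarchical model also places a prior on the mean: x denotes route flows, S a candidate set of sensor links, y_S the counts observed on S, A_S the corresponding link-route incidence rows, Σ the known prior covariance, and R_S the sensor noise covariance.

```latex
% Assumed notation: x = route flows, S = sensor set, y_S = counts on S,
% A_S = link-route incidence rows for S, \Sigma = prior covariance,
% R_S = sensor noise covariance.
\begin{align}
  x &\sim \mathcal{N}(\mu, \Sigma), \qquad
  y_S \mid x \sim \mathcal{N}(A_S x, R_S), \\
  \Sigma_{\text{post}}(S) &= \left( \Sigma^{-1} + A_S^{\top} R_S^{-1} A_S \right)^{-1}, \\
  S^{\star} &= \arg\max_{S} \left[ \operatorname{tr}(\Sigma)
      - \operatorname{tr}\!\left( \Sigma_{\text{post}}(S) \right) \right].
\end{align}
```

In this simplified setting the posterior covariance does not depend on the observed counts, so an A-optimal sensor set can be chosen before any data are collected.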

Contributors: Wang, Ning (Author) / Mirchandani, Pitu (Thesis advisor) / Murray, Alan (Committee member) / Pendyala, Ram (Committee member) / Runger, George C. (Committee member) / Zhang, Muhong (Committee member) / Arizona State University (Publisher)

Created: 2013

Early Detection of At-Risk Students Using LMS Data

Description

Calculus is an important math course that students need to succeed in to pursue STEM majors. This thesis focuses on the early detection of at-risk students in a calculus course, so that proper intervention can help them succeed in the course. Calculus has high failure rates, which is corroborated by data collected from Arizona State University showing that 40% of the 3266 students whose data were used failed their calculus course. This thesis proposes using educational big data to detect students at high risk of failure, so that early detection can be followed by useful intervention. Some existing studies similar to this thesis make use of open-scale data with lower data counts and perform predictions on low-impact Massive Open Online Course (MOOC) based courses. In this thesis, an automatic method for detecting academically at-risk students is developed using learning management system (LMS) activity data along with student information system (SIS) data from Arizona State University (ASU) for the course Calculus for Engineers I (MAT 265). The method detects students at risk by employing machine learning to identify key features that contribute to the success of a student. This thesis also proposes a new technique to convert LMS button click data into a button click sequence which can be used as input to classifiers. In addition, advancements in the Natural Language Processing field can be used by adopting methods such as part-of-speech (POS) tagging and tools such as Facebook fastText word embeddings to convert these button click sequences into numeric vectors before feeding them into the classifiers. The thesis proposes two preprocessing techniques and evaluates them on 3 different machine learning ensembles to determine their performance across the two modalities of the class.
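The conversion of click data into sequences can be sketched as below. The event format `(student_id, timestamp, button)` and the token normalization are assumptions for illustration; the thesis does not specify its log schema here. The resulting strings resemble sentences, which is what allows NLP tools such as word embeddings to be applied to them afterwards.

```python
from collections import defaultdict


def clicks_to_sequences(events):
    """Group (student_id, timestamp, button) events into one
    time-ordered, space-separated token string per student."""
    by_student = defaultdict(list)
    for student_id, timestamp, button in events:
        by_student[student_id].append((timestamp, button))

    sequences = {}
    for student_id, clicks in by_student.items():
        clicks.sort(key=lambda click: click[0])  # order clicks by time
        # Replace inner spaces so each button name becomes a single token.
        tokens = [b.strip().lower().replace(" ", "_") for _, b in clicks]
        sequences[student_id] = " ".join(tokens)
    return sequences


events = [
    ("s1", 3, "Submit Quiz"),
    ("s1", 1, "Open Module"),
    ("s2", 2, "View Grades"),
]
print(clicks_to_sequences(events))
```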

Contributors: Dileep, Akshay Kumar (Author) / Bansal, Ajay (Thesis advisor) / Cunningham, James (Committee member) / Acuna, Ruben (Committee member) / Arizona State University (Publisher)

Created: 2021

Predicting Student Dropout in Self-Paced MOOC Course

Description

One persistent problem in Massive Open Online Courses (MOOCs) is student dropout. Predicting student dropout from MOOCs can identify the factors responsible for it and can trigger intervention beforehand to increase student success. There are different approaches and various features available for predicting student dropout in MOOCs. This research considers data derived from the self-paced math course 'College Algebra and Problem Solving', offered by Arizona State University (ASU) on the MOOC platform Open edX from 2016 to 2020. The research aims to predict the dropout of students from a MOOC given a set of features engineered from the learning of students in a day. The Machine Learning (ML) model used is Random Forest (RF), and it is evaluated using validation metrics such as accuracy, precision, recall, F1-score, Area Under the Curve (AUC), and the Receiver Operating Characteristic (ROC) curve. The average rate of student learning progress was found to have more impact than other features. The model developed can predict the dropout or continuation of students on any given day in the MOOC with an accuracy of 87.5%, AUC of 94.5%, precision of 88%, recall of 87.5%, and F1-score of 87.5%. The contributing features and interactions behind the model's predictions were explained using Shapley values. The features engineered in this research are predictive of student dropout and could be used to predict dropout in similar courses. This model can also help in making interventions at a critical time to help students succeed in this MOOC.
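The threshold-based metrics named above follow directly from the confusion counts. The sketch below computes them for binary labels, treating dropout as the positive class `1` (an assumed encoding); the labels in the usage lines are made-up illustration data, not results from the study. AUC and the ROC curve need predicted scores rather than hard labels and are omitted.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = dropped out, 0 = continued.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(classification_metrics(y_true, y_pred))
```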

Contributors: Dominic Ravichandran, Sheran Dass (Author) / Gary, Kevin (Thesis advisor) / Bansal, Ajay (Committee member) / Cunningham, James (Committee member) / Sannier, Adrian (Committee member) / Arizona State University (Publisher)

Created: 2021

Learning Causality with Networked Observational Data

Description

This dissertation considers the question of how convenient access to copious networked observational data impacts our ability to learn causal knowledge. It investigates in what ways learning causality from such data is different from, or the same as, traditional causal inference, which often deals with small-scale i.i.d. data collected from randomized controlled trials. For example, how can we exploit network information for a series of tasks in the area of learning causality? To answer this question, the dissertation develops a suite of novel causal learning algorithms that offer actionable insights for a series of causal inference tasks with networked observational data. The work aims to benefit real-world decision-making across a variety of highly influential applications. The first part of the dissertation investigates the task of inferring individual-level causal effects from networked observational data. First, it presents a representation balancing-based framework for handling the influence of hidden confounders to achieve accurate estimates of causal effects. Second, it extends the framework with an adversarial learning approach to properly combine two types of existing heuristics: representation balancing and treatment prediction. The second part of the dissertation describes a framework for counterfactual evaluation of treatment assignment policies with networked observational data. A novel framework that captures patterns of hidden confounders is developed to provide more informative input for downstream counterfactual evaluation methods. The third part presents a framework for debiasing two-dimensional grid-based e-commerce search with observational search log data where there is an implicit network connecting neighboring products in a search result page.
A novel inverse propensity scoring framework that models user behavior patterns for two-dimensional display in e-commerce websites is developed, which aims to optimize the online performance of ranking algorithms with offline log data.
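The basic idea of inverse propensity scoring on a grid can be sketched as follows. This is a generic position-based estimator, not the dissertation's framework: it assumes a hypothetical log of `(product_id, row, col, clicked)` impressions and a known table of examination propensities per grid cell, whereas the dissertation models user behavior patterns to obtain such quantities.

```python
def ips_weighted_clicks(log, propensity):
    """Estimate position-debiased click rates per product.

    log:        iterable of (product_id, row, col, clicked) impressions
    propensity: dict mapping (row, col) to the probability that a user
                examines that grid cell (assumed known here)
    """
    totals = {}
    counts = {}
    for product, row, col, clicked in log:
        # A click in a rarely examined cell counts for more.
        weight = 1.0 / propensity[(row, col)]
        totals[product] = totals.get(product, 0.0) + (weight if clicked else 0.0)
        counts[product] = counts.get(product, 0) + 1
    return {product: totals[product] / counts[product] for product in totals}


propensity = {(0, 0): 1.0, (1, 1): 0.25}
log = [
    ("a", 0, 0, True), ("a", 0, 0, False),
    ("b", 1, 1, True), ("b", 1, 1, False),
    ("b", 1, 1, False), ("b", 1, 1, False),
]
print(ips_weighted_clicks(log, propensity))
```

In this toy log, product "b" has a lower raw click rate than "a" but a higher debiased estimate, because it was shown in a cell that users examine only a quarter of the time.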

Contributors: Guo, Ruocheng (Author) / Liu, Huan (Thesis advisor) / Candan, K. Selcuk (Committee member) / Xue, Guoliang (Committee member) / Kiciman, Emre (Committee member) / Arizona State University (Publisher)

Created: 2021

Transformer-based Automatic Mapping of Clinical Notes to Specific Clinical Concepts

Description

A significant proportion of medical errors exist in crucial medical information, and most stem from misinterpreting non-standardized clinical notes. The Clinical Skills exam offered by the United States Medical Licensing Examination (USMLE) was put in place to certify patient note-taking skills before medical students joined professional practices, offering the first line of defense in protecting patients from medical errors. Nonetheless, the exams were discontinued in 2021 following high costs and resource usage in scoring the exams. This thesis compares four transformer-based models, namely BERT (Bidirectional Encoder Representations from Transformers) Base Uncased, Emilyalsentzer Bio_ClinicalBERT, RoBERTa (Robustly Optimized BERT Pre-Training Approach), and DeBERTa (Decoding-enhanced BERT with disentangled attention), with the goal of mapping free text in patient notes to clinical concepts present in the exam rubric. The impact of context-specific embeddings on BERT was also studied to determine the need for a clinical BERT in the Clinical Skills exam. This thesis proposes the use of DeBERTa as a backbone model in patient note scoring for the USMLE Clinical Skills exam after comparing it with the three other transformer models. The disentangled attention and enhanced mask decoder integrated into DeBERTa were credited for its high performance compared to the other models. The effect of meta pseudo labeling was also investigated and found to further enhance DeBERTa's performance.

Contributors: Ganesh, Jay (Author) / Bansal, Ajay (Thesis advisor) / Mehlhase, Alexandra (Committee member) / Findler, Michael (Committee member) / Arizona State University (Publisher)

Created: 2022

Semantic Information Extraction From Natural Language Using a Learning and Rule-Based Approach

Description

Open Information Extraction (OIE) is a subset of Natural Language Processing (NLP) that processes natural language into structured, machine-readable data. This thesis uses data in Resource Description Framework (RDF) triple format, which consists of a subject, predicate, and object. The extraction of RDF triples from natural language is an essential step towards importing data into web ontologies as part of the linked open data cloud on the Semantic Web. There have been a number of related techniques for extracting triples from plain natural language text, including but not limited to ClausIE, OLLIE, Reverb, and DeepEx. This proposed study aims to reduce the dependency on conventional machine learning models, since they require training datasets and are not easily customizable or explainable. By leveraging a context-free grammar (CFG) based model, this thesis aims to address some of these issues while minimizing the trade-offs in performance and accuracy. Furthermore, a deep dive is conducted to analyze the strengths and limitations of the proposed approach.

Contributors: Singh, Varun (Author) / Bansal, Srividya (Thesis advisor) / Bansal, Ajay (Committee member) / Mehlhase, Alexandra (Committee member) / Arizona State University (Publisher)

Created: 2023