Matching Items (938)
Description
Computer vision and tracking have become an area of great interest for many reasons, including self-driving cars, identification of vehicles and drivers on roads, and security camera monitoring, all of which are expanding in the modern digital era. When working with practical systems that are constrained in multiple ways, such as video quality or viewing angle, algorithms that work well theoretically can have a high error rate in practice. This thesis studies several ways in which that error can be minimized, in the context of a practical system: detecting, tracking, and counting people entering different lanes at an airport security checkpoint, using CCTV videos as the primary source. This thesis improves an existing algorithm that is not optimized for this particular problem and has a high error rate when its counts are compared with the true volume of users. The high error rate is caused by many people crowding into security lanes at the same time. The camera from which footage was captured is located at a poor angle, so many of the people occlude each other and cause the existing algorithm to miss them. One solution is to count only heads: since heads are smaller than full bodies, they occlude less, and since the camera is angled from above, the heads of people in back appear higher in the frame and are not occluded by people in front. One of the primary improvements to the algorithm is therefore to combine person detections and head detections to improve accuracy. The proposed algorithm also improves the accuracy of the detections themselves. The existing algorithm used the COCO training dataset, which works well in scenarios where people are visible and not occluded. However, the available video quality in this project was poor, with people often blocking each other from the camera's view, so a different training set was needed that could detect people even in low-quality frames and under occlusion. This new training set is the first algorithmic improvement; although it occasionally performs worse, it reduced the error by 7.25% on average.
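As a rough illustration of the fusion idea, the sketch below counts a person for every body detection plus every head detection whose center falls outside all body boxes; the boxes, the containment rule, and the matching logic are hypothetical stand-ins, not the thesis's exact algorithm:

```python
# Hypothetical sketch: fuse person and head detections to count people
# despite body occlusion (heads of people in back sit higher in the frame).
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

def box_center(box: Box) -> Tuple[float, float]:
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def contains(box: Box, point: Tuple[float, float]) -> bool:
    x1, y1, x2, y2 = box
    x, y = point
    return x1 <= x <= x2 and y1 <= y <= y2

def fused_count(person_boxes: List[Box], head_boxes: List[Box]) -> int:
    """People = person detections + head detections not inside any person box."""
    extra_heads = [h for h in head_boxes
                   if not any(contains(p, box_center(h)) for p in person_boxes)]
    return len(person_boxes) + len(extra_heads)

# Two people stand in line; the rear person's body is occluded, but the
# elevated camera still sees their head above the front person's box.
people = [(100.0, 50.0, 220.0, 400.0)]
heads = [(130.0, 60.0, 170.0, 100.0),   # front person's head (inside the body box)
         (180.0, 10.0, 220.0, 45.0)]    # rear person's head (above the body box)
print(fused_count(people, heads))        # -> 2
```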
Contributors
Larsen, Andrei (Author) / Askin, Ronald (Thesis advisor) / Sefair, Jorge (Thesis advisor) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created
2021
Description
This work explores combining state-of-the-art model-based reinforcement learning (MBRL) algorithms, which focus on learning complex policies over large state spaces, with a distributional-reward perspective on reinforcement learning (RL). Distributional RL replaces the classic RL formulation's scalar return estimate with a probabilistic one, modeling the full distribution of returns. These probabilistic reward formulations help the agent choose highly risk-averse actions, which in turn makes learning more stable. To evaluate this idea, I experiment in simulation on complex high-dimensional environments under different noise conditions.
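A minimal sketch of the risk-averse action selection that a distributional return estimate enables; the quantile values stand in for what a distributional MBRL model would predict, and CVaR is used here as one common risk measure (the thesis may weigh risk differently):

```python
import numpy as np

def cvar(quantiles: np.ndarray, alpha: float = 0.25) -> float:
    """Conditional Value at Risk: mean of the worst alpha-fraction of quantiles."""
    q = np.sort(quantiles)
    k = max(1, int(len(q) * alpha))
    return float(q[:k].mean())

def risk_averse_action(return_quantiles: np.ndarray, alpha: float = 0.25) -> int:
    """Pick the action whose worst-case (CVaR) predicted return is highest."""
    scores = [cvar(return_quantiles[a], alpha) for a in range(len(return_quantiles))]
    return int(np.argmax(scores))

# Two actions x 8 predicted return quantiles: action 0 has the higher mean
# return but a heavy left tail; a risk-averse agent prefers action 1.
q = np.array([[-5.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
              [ 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]])
print(risk_averse_action(q))  # -> 1
```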
Contributors
Agarwal, Nikhil (Author) / Ben Amor, Heni (Thesis advisor) / Phielipp, Mariano (Committee member) / DV, Hemanth (Committee member) / Arizona State University (Publisher)
Created
2021
Description
Reverse engineers use decompilers to analyze binaries when their source code is unavailable. A binary decompiler attempts to transform binary programs into their corresponding high-level source code by recovering and inferring the information that was lost during the compilation process. One type of information lost during compilation is variable names, which are critical for reverse engineers to analyze and understand programs. Traditional binary decompilers generally use automatically generated placeholder variable names that are meaningless or have little correlation with their intended semantics. Having correct or meaningful variable names in decompiled code, instead of placeholders, greatly increases its readability. Decompiled Identifier Renaming Engine (DIRE) is a state-of-the-art, deep-learning-based solution that automatically predicts variable names in decompiled binary code. However, DIRE's predictions are far from perfect. The first goal of this research project is to take a close look at the current state-of-the-art solution for automated variable name prediction on decompilation output of binary code, assess the prediction quality, and understand how the prediction results can be improved. The second goal of this research project is to improve the prediction quality of variable names. With a thorough understanding of DIRE's issues, I focus on improving the quality of its training data. This thesis proposes a novel approach to improving training-data quality by normalizing variable names and converting their abbreviated forms to their full forms. I implemented and evaluated the proposed approach on data sets of over 10k and 20k binaries and showed improvements over DIRE.
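A toy sketch of the normalization idea, assuming a hypothetical abbreviation table (not the actual dictionary used in the thesis):

```python
# Illustrative training-data normalization: expand abbreviated identifier
# parts to full words so the model sees consistent variable names.
ABBREVIATIONS = {
    "cnt": "count", "buf": "buffer", "len": "length",
    "idx": "index", "ptr": "pointer", "str": "string",
}

def normalize_variable_name(name: str) -> str:
    """Expand each snake_case component of a decompiled variable name."""
    parts = name.lower().split("_")
    return "_".join(ABBREVIATIONS.get(p, p) for p in parts)

for raw in ["buf_len", "idx", "file_ptr", "user_cnt"]:
    print(raw, "->", normalize_variable_name(raw))
# buf_len -> buffer_length, idx -> index,
# file_ptr -> file_pointer, user_cnt -> user_count
```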
Contributors
Bajaj, Ati Priya (Author) / Wang, Ruoyu (Thesis advisor) / Baral, Chitta (Committee member) / Shoshitaishvili, Yan (Committee member) / Arizona State University (Publisher)
Created
2021
Description
One persisting problem in Massive Open Online Courses (MOOCs) is student dropout. Predicting student dropout from MOOC courses can identify the factors responsible for such an event, and it can further enable intervention before the event to increase student success. There are different approaches and various features available for predicting student dropout in MOOC courses. In this research, the data derived from the self-paced math course 'College Algebra and Problem Solving', offered on the MOOC platform Open edX by Arizona State University (ASU) from 2016 to 2020, was considered. This research aims to predict student dropout from a MOOC course given a set of features engineered from a student's learning in a day. The Machine Learning (ML) model used is Random Forest (RF), and it is evaluated using validation metrics such as accuracy, precision, recall, F1-score, Area Under the Curve (AUC), and the Receiver Operating Characteristic (ROC) curve. The average rate of student learning progress was found to have more impact than other features. The model developed can predict the dropout or continuation of students on any given day in the MOOC course with an accuracy of 87.5%, AUC of 94.5%, precision of 88%, recall of 87.5%, and F1-score of 87.5%. The contributing features and interactions were explained using Shapley values. The features engineered in this research are predictive of student dropout and could be used in similar courses to predict dropout. This model can also help in making interventions at critical times to help students succeed in the course.
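A minimal sketch of such a dropout classifier using scikit-learn; the synthetic features below are placeholders for the thesis's engineered per-day features (e.g., average learning-progress rate), and the numbers it prints are illustrative, not the reported results:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

# Synthetic stand-in for engineered per-day features (progress rate, etc.).
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + rng.normal(scale=0.5, size=1000) > 0).astype(int)  # 1 = dropout

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

proba = clf.predict_proba(X_te)[:, 1]
pred = clf.predict(X_te)
print("accuracy :", accuracy_score(y_te, pred))
print("precision:", precision_score(y_te, pred))
print("recall   :", recall_score(y_te, pred))
print("F1-score :", f1_score(y_te, pred))
print("AUC      :", roc_auc_score(y_te, proba))
```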
Contributors
Dominic Ravichandran, Sheran Dass (Author) / Gary, Kevin (Thesis advisor) / Bansal, Ajay (Committee member) / Cunningham, James (Committee member) / Sannier, Adrian (Committee member) / Arizona State University (Publisher)
Created
2021
Description
Augmented Reality (AR) has progressively demonstrated its helpfulness for novices learning highly complex and abstract concepts by visualizing details in an immersive environment. However, some studies show that similar results can also be obtained in environments that do not involve AR. To explore the potential of AR in advancing transformative engagement in education, I propose modeling facial expressions as implicit feedback while one is immersed in the environment. I developed a Unity application to record and log users' application operations and facial images. A neural network-based model, Visual Geometry Group 19 (VGG19; Simonyan and Zisserman, 2014), is adopted to recognize emotions from the captured facial images. A within-subject user study was designed and conducted to assess the differences in sentiment and user engagement between AR and non-AR tasks. To analyze the collected data, Dynamic Time Warping (DTW) was applied to identify the emotional similarities between the AR and non-AR environments. The results indicate that users showed more varied emotion patterns and more application operations in the AR tasks than in the non-AR tasks, suggesting that non-AR tasks provide less implicit feedback. The DTW analysis reveals that users' emotion-change patterns are more distant from neutral emotions in AR than in non-AR tasks. Succinctly put, users in the AR task made more active use of the application and exhibited a wider range of emotions while operating it.
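A self-contained sketch of the DTW comparison step, applied to two hypothetical per-frame emotion-intensity traces (the study's actual signals come from VGG19 predictions on captured facial images):

```python
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Classic O(len(a)*len(b)) DTW with absolute-difference local cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

ar_trace     = np.array([0.1, 0.4, 0.7, 0.9, 0.8])   # emotion intensity over time (AR)
non_ar_trace = np.array([0.1, 0.15, 0.2, 0.2, 0.25]) # flatter trace (non-AR)
print(dtw_distance(ar_trace, non_ar_trace))
```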
Contributors
Papakannu, Kushal Reddy (Author) / Hsiao, Ihan (Thesis advisor) / Bryan, Chris (Committee member) / Glenberg, Mina Johnson (Committee member) / Arizona State University (Publisher)
Created
2021
Description
An ontology is a vocabulary that provides a formal description of entities within a domain and their relationships with other entities. Along with basic schema information, it also captures metadata about cardinality, restrictions, hierarchy, and semantic meaning. With the rapid growth of semantic (RDF) data on the web, organizations such as DBpedia and Earth Science Information Partners (ESIP) are publishing more and more data in RDF format. The ontology alignment task aims at linking two or more different ontologies from the same domain or different domains. It is the process of finding the semantic relationship between two or more ontological entities and/or instances. Information and data sharing among different systems is quite limited because of differences in syntax, structure, and semantics. Ontology alignment is used to overcome the limited semantic interoperability of the vast distributed systems available on the Web. In spite of the availability of large hierarchical domain-specific datasets, automated ontology mapping is still a complex problem. Over the years, many techniques have been proposed for ontology instance alignment, schema alignment, and link discovery. Most of the available approaches require human intervention or work only within a specific domain. The first challenge is representing an entity as a vector that encodes all of its context, such as hierarchical information, properties, and constraints. The ontological representation is richer than a regular data schema because of metadata about properties, constraints, relationships to other entities within the domain, and so on; when finding similarities between entities, this metadata is often overlooked. The second challenge is that comparing two ontologies is a computationally intensive operation and depends heavily on the domain and the language in which the ontologies are expressed. Most current methods require human intervention, which leads to a time-consuming and cumbersome process whose output is prone to human error. The proposed unsupervised recursive neural network technique achieves an F-measure of 80.3% on the Anatomy dataset, and the proposed graph neural network technique achieves an F-measure of 81.0% on the same dataset.
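A hedged sketch of the final matching step shared by such embedding-based aligners: given entity vectors produced upstream by a recursive or graph neural network (not shown), entities are paired by cosine similarity and scored against a reference alignment; the embeddings and threshold below are illustrative:

```python
import numpy as np

def cosine_sim(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return A @ B.T

def align(emb_a: np.ndarray, emb_b: np.ndarray, threshold: float = 0.8) -> dict:
    """Match each source entity to its best target if similarity clears the threshold."""
    S = cosine_sim(emb_a, emb_b)
    return {i: int(S[i].argmax()) for i in range(len(emb_a)) if S[i].max() >= threshold}

def f_measure(predicted: dict, reference: dict) -> float:
    tp = sum(1 for k, v in predicted.items() if reference.get(k) == v)
    precision = tp / max(len(predicted), 1)
    recall = tp / max(len(reference), 1)
    return 2 * precision * recall / max(precision + recall, 1e-9)

# Toy embeddings: entities 0 and 1 correspond; entity 2 has no counterpart.
emb_a = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
emb_b = np.array([[0.9, 0.1], [0.1, 0.9]])
pred = align(emb_a, emb_b)
print(pred, f_measure(pred, {0: 0, 1: 1}))  # {0: 0, 1: 1} 1.0
```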
Contributors
Chakraborty, Jaydeep (Author) / Bansal, Srividya (Thesis advisor) / Sherif, Mohamed (Committee member) / Bansal, Ajay (Committee member) / Hsiao, Sharon (Committee member) / Arizona State University (Publisher)
Created
2021
Description
Image segmentation is an important and challenging area of research in computer vision with various applications in medical imaging. It refers to the process of partitioning an image into meaningful parts having similar attributes. Traditional manual segmentation approaches rely on human expertise to outline object boundaries in images, which is a tedious and expensive process. In recent years, deep convolutional neural networks have demonstrated excellent performance in tasks such as detection, localization, recognition, and segmentation of objects. However, these models require a large set of labeled training data, which is difficult to obtain for medical images. To address this problem, interactive segmentation techniques can serve as a trade-off between fully automated and manual approaches, keeping a human expert in the loop to provide guidance and refinement alongside deep neural networks. This thesis proposes an interactive training strategy for segmentation, where a robot-user is utilized during training to mimic an actual annotator and provide corrections to the predicted masks by drawing scribbles. These scribbles are then used as supervisory signals and fed to the network, which interactively refines the segmentation map over several iterations of training. Experiments with various heuristic click strategies demonstrate that user interaction in the form of curves inside the organ of interest achieves the best editing performance. Moreover, by using popular U-Net-based image segmentation architectures as base models, segmentation performance is further improved, signifying that the accuracy gain of the interactive correction conforms to the accuracy of the initial segmentation map.
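A hypothetical sketch of the robot-user correction step: scribble points are sampled from where the predicted mask disagrees with the reference mask and would then be stacked with the image as an extra input channel for the next training iteration (the network itself is omitted):

```python
import numpy as np

def robot_user_scribbles(pred: np.ndarray, gt: np.ndarray, n_points: int = 20,
                         seed: int = 0) -> np.ndarray:
    """Return a map with +1 scribbles on missed organ pixels, -1 on false positives."""
    rng = np.random.default_rng(seed)
    scribble = np.zeros_like(gt, dtype=np.int8)
    for label, region in ((1, (gt == 1) & (pred == 0)),    # under-segmentation
                          (-1, (gt == 0) & (pred == 1))):  # over-segmentation
        ys, xs = np.nonzero(region)
        if len(ys):
            pick = rng.choice(len(ys), size=min(n_points, len(ys)), replace=False)
            scribble[ys[pick], xs[pick]] = label
    return scribble

gt = np.zeros((64, 64), dtype=np.uint8); gt[20:40, 20:40] = 1   # reference organ mask
pred = np.zeros_like(gt); pred[25:45, 25:45] = 1                # shifted prediction
s = robot_user_scribbles(pred, gt)
print("positive clicks:", int((s == 1).sum()), "negative clicks:", int((s == -1).sum()))
```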
Contributors
Goyal, Diksha (Author) / Liang, Jianming Dr. (Thesis advisor) / Wang, Yalin Dr. (Committee member) / Demakethepalli Venkateswara, Hemanth Kumar Dr. (Committee member) / Arizona State University (Publisher)
Created
2021
Description
Machine learning models can pick up biases and spurious correlations from training data, and they project and amplify these biases during inference, posing significant challenges in real-world settings. One approach to mitigating this is a class of methods that identify and filter out bias-inducing samples from the training datasets so that models are never exposed to those biases. However, this filtering wastes considerable resources, as most of the dataset that was created is discarded as biased. This work addresses that wastage by identifying and quantifying the biases. I further elaborate on the implications of dataset filtering for robustness (to adversarial attacks) and generalization (to out-of-distribution samples). The findings suggest that while dataset filtering does help to improve out-of-distribution (OOD) generalization, it has a significant negative impact on robustness to adversarial attacks. They also show that transforming bias-inducing samples into adversarial samples (instead of eliminating them from the dataset) can significantly boost robustness without sacrificing generalization.
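A minimal sketch of the proposed alternative to filtering, using one-step FGSM (Fast Gradient Sign Method) as the adversarial transformation; the model and the bias flags are hypothetical placeholders for the upstream bias-identification step:

```python
import torch
import torch.nn as nn

def fgsm_transform(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
                   epsilon: float = 0.05) -> torch.Tensor:
    """One-step FGSM: move each input along the sign of its loss gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    return (x + epsilon * x.grad.sign()).detach()

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
x = torch.randn(8, 10)
y = torch.randint(0, 2, (8,))
biased = torch.tensor([1, 0, 1, 0, 0, 1, 0, 0], dtype=torch.bool)  # from a bias detector

# Replace only the bias-inducing samples with their adversarial versions;
# x_aug now mixes clean samples with perturbed ones instead of discarding any.
x_aug = x.clone()
x_aug[biased] = fgsm_transform(model, x[biased], y[biased])
```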
Contributors
Sachdeva, Bhavdeep Singh (Author) / Baral, Chitta (Thesis advisor) / Liu, Huan (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created
2021
Description
Applications over a gesture-based human-computer interface (HCI) require a new user login method with gestures because such interfaces lack traditional input devices. For example, a user may be asked to verify their identity to unlock a device on a mobile or wearable platform, or to sign in to a virtual site through a Virtual Reality (VR) or Augmented Reality (AR) headset, where no physical keyboard or touchscreen is available. This dissertation presents a unified user login framework and an identity input method using 3D In-Air-Handwriting (IAHW), where a user logs in to a virtual site by writing a passcode in the air quickly, like a signature. The presented research spans motion signal modeling, user authentication, user identification, template protection, and a thorough evaluation of both security and usability. The results show an Equal Error Rate (EER) of roughly 0.1% to 3% for user authentication under different conditions, as well as 93% accuracy for user identification, on a dataset with over 100 users and two types of gesture input devices. Moreover, current research in this area is severely limited by the availability of gesture input devices, datasets, and software tools. This study therefore provides an infrastructure for IAHW research with an open-source library and open datasets of more than 100K IAHW hand-movement signals. Additionally, the proposed user identity input method can be extended to a general word input method for both English and Chinese using limited training data. Hence, this dissertation can help the research community in both cybersecurity and HCI explore IAHW as a new direction, and potentially pave the way for practical adoption of such technologies in the future.
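A loose sketch of one plausible verification decision in such a framework: signals are resampled to a fixed length and accepted when close enough to the enrolled template; the resampling, RMS distance, and threshold here are illustrative simplifications of the dissertation's actual matching pipeline (whose threshold would be tuned, e.g., at the Equal Error Rate point):

```python
import numpy as np

def resample(signal: np.ndarray, length: int = 128) -> np.ndarray:
    """Linearly resample a (T, 3) hand-trajectory signal to (length, 3)."""
    t_old = np.linspace(0, 1, len(signal))
    t_new = np.linspace(0, 1, length)
    return np.stack([np.interp(t_new, t_old, signal[:, d]) for d in range(3)], axis=1)

def verify(attempt: np.ndarray, template: np.ndarray, threshold: float = 2.0) -> bool:
    a, t = resample(attempt), resample(template)
    distance = np.linalg.norm(a - t) / np.sqrt(a.size)   # RMS distance
    return bool(distance < threshold)

rng = np.random.default_rng(1)
template = rng.normal(size=(140, 3)).cumsum(axis=0)          # enrolled passcode trace
genuine = template + rng.normal(scale=0.1, size=template.shape)  # noisy repeat
forgery = rng.normal(size=(130, 3)).cumsum(axis=0)               # unrelated writing
print(verify(genuine, template), verify(forgery, template))  # expected: True False
```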
Contributors
Lu, Duo (Author) / Huang, Dijiang (Thesis advisor) / Li, Baoxin (Committee member) / Zhang, Junshan (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created
2021
Description
The rapid increase in the volume and complexity of data has accelerated Artificial Intelligence (AI) applications, primarily as intelligent machines, in everyday life. Providing explanations, which convey the rationale behind an AI agent's decision-making, is considered an imperative ability for an AI agent in a human-robot teaming framework. The validity of AI models is therefore constrained by their ability to explain their decision-making rationale. On the other hand, AI agents cannot perceive the social situations that human experts recognize using their background knowledge, specifically in cybersecurity and the military. Social behavior depends on situation awareness, and it relies on interpretability, transparency, and fairness if we envision efficient human-AI collaboration. Consequently, the human remains an essential element in planning, especially when the problem's constraints are difficult to express for an agent in a dynamic setting. This dissertation first develops different model-based explanation generation approaches that predict where the human teammate would misunderstand the plan and generate an explanation accordingly. The robot's generated explanation, or its interactively explicable behavior, maintains the human teammate's cognitive workload and increases overall team situation awareness throughout the human-robot interaction. It then focuses on a rule-based model that preserves the collaborative engagement of the team by exploring essential aspects of facilitator-agent design. In addition to recognizing where in the plan discrepancies might arise, focusing on the decision-making process provides insight into the reasons behind conflicts between human expectations and the robot's behavior. Employing a rule-based framework shifts the focus from assisting an individual (human) teammate to helping the team interactively while maintaining collaboration. Concentrating on teaming thus provides the opportunity to recognize cognitive biases that skew the teammate's expectations and affect interaction behavior. This dissertation investigates how to maintain collaboration engagement and cognitive readiness for collaborative planning tasks, and it lays out a planning framework centered on the human teammate's cognitive ability to understand machine-provided explanations while collaborating on a planning task. Consequently, it explores the design of an AI facilitator that helps a team plan collaboratively on a challenging task, mitigates teaming biases, and communicates effectively. It also investigates how certain cognitive biases affect the task outcome and shape the utility function. The facilitator's role is to support goal alignment, consensus on planning strategies, utility management, effective communication, and bias mitigation.
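As a loose, hypothetical illustration of model-based explanation generation in this spirit: the robot compares its own task model with its estimate of the human's model and explains exactly those differences that the current plan depends on (all names and facts below are invented for illustration):

```python
# Toy model-reconciliation sketch: the explanation is the set of model
# facts the human is missing that the plan's rationale actually rests on.
robot_model = {"door_A_locked", "has_key_B", "corridor_2_blocked"}
human_model = {"door_A_locked", "has_key_B"}             # human unaware of blockage
plan_dependencies = {"corridor_2_blocked", "has_key_B"}  # facts justifying the detour

def explain(robot: set, human: set, relevant: set) -> set:
    """Explanation = model differences that the plan's rationale depends on."""
    return (robot - human) & relevant

print(explain(robot_model, human_model, plan_dependencies))
# {'corridor_2_blocked'}: sharing this resolves the expected misunderstanding
```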
Contributors
Zakershahrak, Mehrdad (Author) / Cooke, Nancy (Thesis advisor) / Zhang, Yu (Thesis advisor) / Ben Amor, Heni (Committee member) / Srivastava, Siddharth (Committee member) / Hsiao, Sharon (Committee member) / Arizona State University (Publisher)
Created
2021