Matching Items (17)
Filtering by

Clear all filters

133397-Thumbnail Image.png
Description
Students learn in various ways \u2014 visualization, auditory, memorizing, or making analogies. Traditional lecturing in engineering courses and the learning styles of engineering students are inharmonious causing students to be at a disadvantage based on their learning style (Felder & Silverman, 1988). My study analyzes the traditional approach to learning

Students learn in various ways \u2014 visualization, auditory, memorizing, or making analogies. Traditional lecturing in engineering courses and the learning styles of engineering students are inharmonious causing students to be at a disadvantage based on their learning style (Felder & Silverman, 1988). My study analyzes the traditional approach to learning coding skills which is unnatural to engineering students with no previous exposure and examining if visual learning enhances introductory computer science education. Visual and text-based learning are evaluated to determine how students learn introductory coding skills and associated problem solving skills. My study was conducted to observe how the two types of learning aid the students in learning how to problem solve as well as how much knowledge can be obtained in a short period of time. The application used for visual learning was Scratch and Repl.it was used for text-based learning. Two exams were made to measure the progress made by each student. The topics covered by the exam were initialization, variable reassignment, output, if statements, if else statements, nested if statements, logical operators, arrays/lists, while loop, type casting, functions, object orientation, and sorting. Analysis of the data collected in the study allow us to observe whether the traditional method of teaching programming or block-based programming is more beneficial and in what topics of introductory computer science concepts.
ContributorsVidaure, Destiny Vanessa (Author) / Meuth, Ryan (Thesis director) / Yang, Yezhou (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created2018-05
155401-Thumbnail Image.png
Description
This work presents a communication paradigm, using a context-aware mixed reality approach, for instructing human workers when collaborating with robots. The main objective of this approach is to utilize the physical work environment as a canvas to communicate task-related instructions and robot intentions in the form of visual cues. A

This work presents a communication paradigm, using a context-aware mixed reality approach, for instructing human workers when collaborating with robots. The main objective of this approach is to utilize the physical work environment as a canvas to communicate task-related instructions and robot intentions in the form of visual cues. A vision-based object tracking algorithm is used to precisely determine the pose and state of physical objects in and around the workspace. A projection mapping technique is used to overlay visual cues on tracked objects and the workspace. Simultaneous tracking and projection onto objects enables the system to provide just-in-time instructions for carrying out a procedural task. Additionally, the system can also inform and warn humans about the intentions of the robot and safety of the workspace. It was hypothesized that using this system for executing a human-robot collaborative task will improve the overall performance of the team and provide a positive experience to the human partner. To test this hypothesis, an experiment involving human subjects was conducted and the performance (both objective and subjective) of the presented system was compared with a conventional method based on printed instructions. It was found that projecting visual cues enabled human subjects to collaborate more effectively with the robot and resulted in higher efficiency in completing the task.
ContributorsKalpagam Ganesan, Ramsundar (Author) / Ben Amor, Hani (Thesis advisor) / Yang, Yezhou (Committee member) / Zhang, Yu (Committee member) / Arizona State University (Publisher)
Created2017
155809-Thumbnail Image.png
Description
Light field imaging is limited in its computational processing demands of high

sampling for both spatial and angular dimensions. Single-shot light field cameras

sacrifice spatial resolution to sample angular viewpoints, typically by multiplexing

incoming rays onto a 2D sensor array. While this resolution can be recovered using

compressive sensing, these iterative solutions are slow

Light field imaging is limited in its computational processing demands of high

sampling for both spatial and angular dimensions. Single-shot light field cameras

sacrifice spatial resolution to sample angular viewpoints, typically by multiplexing

incoming rays onto a 2D sensor array. While this resolution can be recovered using

compressive sensing, these iterative solutions are slow in processing a light field. We

present a deep learning approach using a new, two branch network architecture,

consisting jointly of an autoencoder and a 4D CNN, to recover a high resolution

4D light field from a single coded 2D image. This network decreases reconstruction

time significantly while achieving average PSNR values of 26-32 dB on a variety of

light fields. In particular, reconstruction time is decreased from 35 minutes to 6.7

minutes as compared to the dictionary method for equivalent visual quality. These

reconstructions are performed at small sampling/compression ratios as low as 8%,

allowing for cheaper coded light field cameras. We test our network reconstructions

on synthetic light fields, simulated coded measurements of real light fields captured

from a Lytro Illum camera, and real coded images from a custom CMOS diffractive

light field camera. The combination of compressive light field capture with deep

learning allows the potential for real-time light field video acquisition systems in the

future.
ContributorsGupta, Mayank (Author) / Turaga, Pavan (Thesis advisor) / Yang, Yezhou (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)
Created2017
168714-Thumbnail Image.png
Description
Deep neural network-based methods have been proved to achieve outstanding performance on object detection and classification tasks. Deep neural networks follow the ``deeper model with deeper confidence'' belief to gain a higher recognition accuracy. However, reducing these networks' computational costs remains a challenge, which impedes their deployment on embedded devices.

Deep neural network-based methods have been proved to achieve outstanding performance on object detection and classification tasks. Deep neural networks follow the ``deeper model with deeper confidence'' belief to gain a higher recognition accuracy. However, reducing these networks' computational costs remains a challenge, which impedes their deployment on embedded devices. For instance, the intersection management of Connected Autonomous Vehicles (CAVs) requires running computationally intensive object recognition algorithms on low-power traffic cameras. This dissertation aims to study the effect of a dynamic hardware and software approach to address this issue. Characteristics of real-world applications can facilitate this dynamic adjustment and reduce the computation. Specifically, this dissertation starts with a dynamic hardware approach that adjusts itself based on the toughness of input and extracts deeper features if needed. Next, an adaptive learning mechanism has been studied that use extracted feature from previous inputs to improve system performance. Finally, a system (ARGOS) was proposed and evaluated that can be run on embedded systems while maintaining the desired accuracy. This system adopts shallow features at inference time, but it can switch to deep features if the system desires a higher accuracy. To improve the performance, ARGOS distills the temporal knowledge from deep features to the shallow system. Moreover, ARGOS reduces the computation furthermore by focusing on regions of interest. The response time and mean average precision are adopted for the performance evaluation to evaluate the proposed ARGOS system.
ContributorsFarhadi, Mohammad (Author) / Yang, Yezhou (Thesis advisor) / Vrudhula, Sarma (Committee member) / Wu, Carole-Jean (Committee member) / Ren, Yi (Committee member) / Arizona State University (Publisher)
Created2022
171818-Thumbnail Image.png
Description
Recent advances in autonomous vehicle (AV) technologies have ensured that autonomous driving will soon be present in real-world traffic. Despite the potential of AVs, many studies have shown that traffic accidents in hybrid traffic environments (where both AVs and human-driven vehicles (HVs) are present) are inevitable because of the unpredictability

Recent advances in autonomous vehicle (AV) technologies have ensured that autonomous driving will soon be present in real-world traffic. Despite the potential of AVs, many studies have shown that traffic accidents in hybrid traffic environments (where both AVs and human-driven vehicles (HVs) are present) are inevitable because of the unpredictability of human-driven vehicles. Given that eliminating accidents is impossible, an achievable goal of designing AVs is to design them in a way so that they will not be blamed for any accident in which they are involved in. This work proposes BlaFT – a Blame-Free motion planning algorithm in hybrid Traffic. BlaFT is designed to be compatible with HVs and other AVs, and will not be blamed for accidents in a structured road environment. Also, it proves that no accidents will happen if all AVs are using the BlaFT motion planner and that when in hybrid traffic, the AV using BlaFT will be blame-free even if it is involved in a collision. The work instantiated scores of BlaFT and HV vehicles in an urban road scape loop in the 'Simulation of Urban MObility', ran the simulation for several hours, and observe that as the percentage of BlaFT vehicles increases, the traffic becomes safer. Adding BlaFT vehicles to HVs also increases the efficiency of traffic as a whole by up to 34%.
ContributorsPark, Sanggu (Author) / Shrivastava, Aviral (Thesis advisor) / Wang, Ruoyu (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created2022
156281-Thumbnail Image.png
Description
Currently, one of the biggest limiting factors for long-term deployment of autonomous systems is the power constraints of a platform. In particular, for aerial robots such as unmanned aerial vehicles (UAVs), the energy resource is the main driver of mission planning and operation definitions, as everything revolved around flight time.

Currently, one of the biggest limiting factors for long-term deployment of autonomous systems is the power constraints of a platform. In particular, for aerial robots such as unmanned aerial vehicles (UAVs), the energy resource is the main driver of mission planning and operation definitions, as everything revolved around flight time. The focus of this work is to develop a new method of energy storage and charging for autonomous UAV systems, for use during long-term deployments in a constrained environment. We developed a charging solution that allows pre-equipped UAV system to land on top of designated charging pads and rapidly replenish their battery reserves, using a contact charging point. This system is designed to work with all types of rechargeable batteries, focusing on Lithium Polymer (LiPo) packs, that incorporate a battery management system for increased reliability. The project also explores optimization methods for fleets of UAV systems, to increase charging efficiency and extend battery lifespans. Each component of this project was first designed and tested in computer simulation. Following positive feedback and results, prototypes for each part of this system were developed and rigorously tested. Results show that the contact charging method is able to charge LiPo batteries at a 1-C rate, which is the industry standard rate, maintaining the same safety and efficiency standards as modern day direct connection chargers. Control software for these base stations was also created, to be integrated with a fleet management system, and optimizes UAV charge levels and distribution to extend LiPo battery lifetimes while still meeting expected mission demand. Each component of this project (hardware/software) was designed for manufacturing and implementation using industry standard tools, making it ideal for large-scale implementations. This system has been successfully tested with a fleet of UAV systems at Arizona State University, and is currently being integrated into an Arizona smart city environment for deployment.
ContributorsMian, Sami (Author) / Panchanathan, Sethuraman (Thesis advisor) / Berman, Spring (Committee member) / Yang, Yezhou (Committee member) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)
Created2018
171862-Thumbnail Image.png
Description
Deep neural networks have been shown to be vulnerable to adversarial attacks. Typical attack strategies alter authentic data subtly so as to obtain adversarial samples that resemble the original but otherwise would cause a network's misbehavior such as a high misclassification rate. Various attack approaches have been reported, with some

Deep neural networks have been shown to be vulnerable to adversarial attacks. Typical attack strategies alter authentic data subtly so as to obtain adversarial samples that resemble the original but otherwise would cause a network's misbehavior such as a high misclassification rate. Various attack approaches have been reported, with some showing state-of-the-art performance in attacking certain networks. In the meanwhile, many defense mechanisms have been proposed in the literature, some of which are quite effective for guarding against typical attacks. Yet, most of these attacks fail when the targeted network modifies its architecture or uses another set of parameters and vice versa. Moreover, the emerging of more advanced deep neural networks, such as generative adversarial networks (GANs), has made the situation more complicated and the game between the attack and defense is continuing. This dissertation aims at exploring the venerability of the deep neural networks by investigating the mechanisms behind the success/failure of the existing attack and defense approaches. Therefore, several deep learning-based approaches have been proposed to study the problem from different perspectives. First, I developed an adversarial attack approach by exploring the unlearned region of a typical deep neural network which is often over-parameterized. Second, I proposed an end-to-end learning framework to analyze the images generated by different GAN models. Third, I developed a defense mechanism that can secure the deep neural network against adversarial attacks with a defense layer consisting of a set of orthogonal kernels. Substantial experiments are conducted to unveil the potential factors that contribute to attack/defense effectiveness. This dissertation also concludes with a discussion of possible future works of achieving a robust deep neural network.
ContributorsDing, Yuzhen (Author) / Li, Baoxin (Thesis advisor) / Davulcu, Hasan (Committee member) / Venkateswara, Hemanth Kumar Demakethepalli (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created2022
157673-Thumbnail Image.png
Description
In this thesis, I present two new datasets and a modification to the existing models in the form of a novel attention mechanism for Natural Language Inference (NLI). The new datasets have been carefully synthesized from various existing corpora released for different tasks.

The task of NLI is to determine the

In this thesis, I present two new datasets and a modification to the existing models in the form of a novel attention mechanism for Natural Language Inference (NLI). The new datasets have been carefully synthesized from various existing corpora released for different tasks.

The task of NLI is to determine the possibility of a sentence referred to as “Hypothesis” being true given that another sentence referred to as “Premise” is true. In other words, the task is to identify whether the “Premise” entails, contradicts or remains neutral with regards to the “Hypothesis”. NLI is a precursor to solving many Natural Language Processing (NLP) tasks such as Question Answering and Semantic Search. For example, in Question Answering systems, the question is paraphrased to form a declarative statement which is treated as the hypothesis. The options are treated as the premise. The option with the maximum entailment score is considered as the answer. Considering the applications of NLI, the importance of having a strong NLI system can't be stressed enough.

Many large-scale datasets and models have been released in order to advance the field of NLI. While all of these models do get good accuracy on the test sets of the datasets they were trained on, they fail to capture the basic understanding of “Entities” and “Roles”. They often make the mistake of inferring that “John went to the market.” from “Peter went to the market.” failing to capture the notion of “Entities”. In other cases, these models don't understand the difference in the “Roles” played by the same entities in “Premise” and “Hypothesis” sentences and end up wrongly inferring that “Peter drove John to the stadium.” from “John drove Peter to the stadium.”

The lack of understanding of “Roles” can be attributed to the lack of such examples in the various existing datasets. The reason for the existing model’s failure in capturing the notion of “Entities” is not just due to the lack of such examples in the existing NLI datasets. It can also be attributed to the strict use of vector similarity in the “word-to-word” attention mechanism being used in the existing architectures.

To overcome these issues, I present two new datasets to help make the NLI systems capture the notion of “Entities” and “Roles”. The “NER Changed” (NC) dataset and the “Role-Switched” (RS) dataset contains examples of Premise-Hypothesis pairs that require the understanding of “Entities” and “Roles” respectively in order to be able to make correct inferences. This work shows how the existing architectures perform poorly on the “NER Changed” (NC) dataset even after being trained on the new datasets. In order to help the existing architectures, understand the notion of “Entities”, this work proposes a modification to the “word-to-word” attention mechanism. Instead of relying on vector similarity alone, the modified architectures learn to incorporate the “Symbolic Similarity” as well by using the Named-Entity features of the Premise and Hypothesis sentences. The new modified architectures not only perform significantly better than the unmodified architectures on the “NER Changed” (NC) dataset but also performs as well on the existing datasets.
ContributorsShrivastava, Ishan (Author) / Baral, Chitta (Thesis advisor) / Anwar, Saadat (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created2019
157602-Thumbnail Image.png
Description
Reasoning with commonsense knowledge is an integral component of human behavior. It is due to this capability that people know that a weak person may not be able to lift someone. It has been a long standing goal of the Artificial Intelligence community to simulate such commonsense reasoning abilities in

Reasoning with commonsense knowledge is an integral component of human behavior. It is due to this capability that people know that a weak person may not be able to lift someone. It has been a long standing goal of the Artificial Intelligence community to simulate such commonsense reasoning abilities in machines. Over the years, many advances have been made and various challenges have been proposed to test their abilities. The Winograd Schema Challenge (WSC) is one such Natural Language Understanding (NLU) task which was also proposed as an alternative to the Turing Test. It is made up of textual question answering problems which require resolution of a pronoun to its correct antecedent.

In this thesis, two approaches of developing NLU systems to solve the Winograd Schema Challenge are demonstrated. To this end, a semantic parser is presented, various kinds of commonsense knowledge are identified, techniques to extract commonsense knowledge are developed and two commonsense reasoning algorithms are presented. The usefulness of the developed tools and techniques is shown by applying them to solve the challenge.
ContributorsSharma, Arpita (Author) / Baral, Chitta (Thesis advisor) / Lee, Joohyung (Committee member) / Papotti, Paolo (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created2019
157886-Thumbnail Image.png
Description
Visual navigation is a multi-disciplinary field across computer vision, machine learning and robotics. It is of great significance in both research and industrial applications. An intelligent agent with visual navigation ability will be capable of performing the following tasks: actively explore in environments, distinguish and localize a requested target and

Visual navigation is a multi-disciplinary field across computer vision, machine learning and robotics. It is of great significance in both research and industrial applications. An intelligent agent with visual navigation ability will be capable of performing the following tasks: actively explore in environments, distinguish and localize a requested target and approach the target following acquired strategies. Despite a variety of advances in mobile robotics, enabling an autonomous with above-mentioned abilities is still a challenging and complex task. However, the solution to the task is very likely to accelerate the landing of assistive robots.

Reinforcement learning is a method that trains autonomous robot based on rewarding desired behaviors to help it obtain an action policy that maximizes rewards while the robot interacting with the environment. Through trial and error, an agent learns sophisticated and skillful strategies to handle complex tasks in the environment. Inspired by navigation procedures of human beings that when navigating through environments, humans reason about accessible spaces and geometry of the environment a lot based on first-person view, figure out the destination and then ease over, this work develops a model that maps from pixels to actions and inherently estimate the target as well as the free-space map. The model has three major constituents: (i) a cognitive mapper that maps the topologic free-space map from first-person view images, (ii) a target recognition network that locates a desired object and (iii) an action policy deep reinforcement learning network. Further, a planner model with cascade architecture based on multi-scale semantic top-down occupancy map input is proposed.
ContributorsZheng, Shibin (Author) / Yang, Yezhou (Thesis advisor) / Zhang, Wenlong (Committee member) / Ren, Yi (Committee member) / Arizona State University (Publisher)
Created2019