Matching Items (16)

Description
This paper presents the design and evaluation of a haptic interface for augmenting human-human interpersonal interactions by delivering facial expressions of an interaction partner to an individual who is blind, using a visual-to-tactile mapping of facial action units and emotions. Pancake shaftless vibration motors are mounted on the back of a chair to provide vibrotactile stimulation in the context of a dyadic (one-on-one) interaction across a table. This work explores the design of spatiotemporal vibration patterns that can be used to convey the basic building blocks of facial movements according to the Facial Action Coding System. A behavioral study was conducted to explore the factors that influence the naturalness of conveying affect using vibrotactile cues.
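To make the pattern design concrete, below is a minimal sketch of how a spatiotemporal vibration pattern of the kind described above might be represented and played back. The 4x4 motor grid, the AU12 ("lip corner puller") example, and the set_motor driver callback are illustrative assumptions, not the chair's actual design.

```python
# A minimal sketch of a spatiotemporal vibrotactile pattern: a sequence of
# (motor index, intensity, duration) steps played across a back-mounted
# motor array. Grid size, the AU12 example, and the driver interface are
# hypothetical placeholders.
import time

GRID_W, GRID_H = 4, 4  # hypothetical 4x4 motor array on the chair back


def play_pattern(steps, set_motor):
    """steps: list of (motor_index, intensity_0_to_1, duration_s).
    set_motor: callback that drives one pancake motor's PWM level."""
    for motor, intensity, duration in steps:
        set_motor(motor, intensity)   # turn the motor on at the given level
        time.sleep(duration)          # hold for the step's duration
        set_motor(motor, 0.0)         # turn it off before the next step


# Example: an upward-outward sweep on both sides of the lower grid,
# loosely sketching AU12 (lip corner puller, i.e., a smile).
smile_steps = [(13, 0.8, 0.15), (9, 0.8, 0.15),
               (14, 0.8, 0.15), (10, 0.8, 0.15)]
```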
Contributors: Bala, Shantanu (Author) / Panchanathan, Sethuraman (Thesis director) / McDaniel, Troy (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor) / Department of Psychology (Contributor)
Created: 2014-05

Description
Speech nasality disorders are characterized by abnormal resonance in the nasal cavity. Hypernasal speech is of particular interest: it is characterized by an inability to prevent improper nasalization of vowels and poor articulation of plosive and fricative consonants, and it can lead to negative communicative and social consequences. It can be associated with a range of conditions, including cleft lip or palate, velopharyngeal dysfunction (a physical or neurological defective closure of the soft palate that regulates resonance between the oral and nasal cavities), dysarthria, or hearing impairment, and can also be an early indicator of developing neurological disorders such as ALS. Hypernasality is typically scored perceptually by a Speech Language Pathologist (SLP). Misdiagnosis can lead to inadequate treatment plans and poor treatment outcomes for a patient, and for some applications, particularly screening for early neurological disorders, the use of an SLP is not practical. Hence, this work demonstrates a data-driven approach to objective assessment of hypernasality through the use of Goodness of Pronunciation (GOP) features. These features capture the overall precision of articulation of a speaker on a phoneme-by-phoneme basis, allowing the demonstrated models to achieve a Pearson correlation coefficient of 0.88 on low-nasality speakers, the population of most interest for this sort of technique. These results are comparable to milestone methods in this domain.
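As a concrete illustration of the feature, here is a minimal sketch of a common Goodness of Pronunciation formulation: the average log-ratio between the posterior of the intended phoneme and the best competing phoneme over a segment's frames. The array shapes and names are illustrative assumptions, not the thesis's actual pipeline.

```python
# A minimal sketch of Goodness of Pronunciation (GOP) scoring, assuming
# frame-level phoneme posteriors from an acoustic model and a forced
# alignment of each phoneme to its frames.
import numpy as np


def gop_score(posteriors: np.ndarray, phoneme_idx: int) -> float:
    """GOP for one aligned phoneme segment.

    posteriors: (num_frames, num_phonemes) frame-level posterior matrix
                for the frames aligned to this phoneme.
    phoneme_idx: index of the canonical (intended) phoneme.
    """
    # Log posterior of the intended phoneme vs. the best competing phoneme,
    # averaged over the segment's frames. Scores near 0 indicate precise
    # articulation; large negative values indicate mispronunciation.
    target = np.log(posteriors[:, phoneme_idx] + 1e-10)
    best = np.log(posteriors.max(axis=1) + 1e-10)
    return float(np.mean(target - best))
```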
Contributors: Saxon, Michael Stephen (Author) / Berisha, Visar (Thesis director) / McDaniel, Troy (Committee member) / Electrical Engineering Program (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created: 2018-05

Description
The impact of Artificial Intelligence (AI) on daily life has increased significantly. AI is taking big strides into areas of life that are critical, such as healthcare, but also into areas such as entertainment and leisure. Deep neural networks have been pivotal in making all these advancements possible, but a well-known problem with deep neural networks is the lack of explanations for the choices they make. To combat this, several methods have been explored in the research community. One example is ranking individual input features by how influential they are in the decision-making process. In contrast, a newer class of methods is built on Concept Activation Vectors (CAVs), which extract higher-level concepts from the trained model, capturing information as a mixture of several features rather than just one. The goal of this thesis is to employ concepts in a novel domain: to explain how a deep learning model uses computer vision to classify music into different genres. Given the advances in deep learning for computer vision classification tasks, it is now standard practice to convert an audio clip into a corresponding spectrogram and use the spectrogram as an image input to the deep learning model. Thus, a pre-trained model can classify the spectrogram images (representing songs) into musical genres. The proposed explanation system, called "Why Pop?", tries to answer certain questions about the classification process, such as which parts of the spectrogram influence the model the most, which concepts were extracted, and how they differ across classes. These explanations help the user gain insight into the model's learnings, biases, and decision-making process.
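For readers unfamiliar with concept-based explanations, the sketch below shows the core CAV recipe in the spirit of TCAV (Kim et al., 2018): fit a linear boundary between a concept's activations and random activations, take its normal as the concept vector, and measure a class logit's sensitivity along that direction. The data, layer choice, and finite-difference approximation are illustrative assumptions, not the "Why Pop?" system's implementation.

```python
# A minimal sketch of extracting a Concept Activation Vector (CAV) from
# a model's internal activations; shapes and inputs are placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression


def compute_cav(concept_acts: np.ndarray, random_acts: np.ndarray) -> np.ndarray:
    """Fit a linear boundary between concept and random activations;
    the CAV is the (unit-normalized) normal to that boundary."""
    X = np.vstack([concept_acts, random_acts])
    y = np.array([1] * len(concept_acts) + [0] * len(random_acts))
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    cav = clf.coef_[0]
    return cav / np.linalg.norm(cav)


def concept_sensitivity(logit_fn, activation, cav, eps=1e-2):
    """Directional derivative of a class logit along the CAV,
    approximated with a finite difference."""
    return (logit_fn(activation + eps * cav) - logit_fn(activation)) / eps
```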
Contributors: Sharma, Shubham (Author) / Bryan, Chris (Thesis advisor) / McDaniel, Troy (Committee member) / Sarwat, Mohamed (Committee member) / Arizona State University (Publisher)
Created: 2022

Description
Currently, one of the biggest limiting factors for long-term deployment of autonomous systems is the power constraint of a platform. In particular, for aerial robots such as unmanned aerial vehicles (UAVs), the energy resource is the main driver of mission planning and operation definitions, as everything revolves around flight time. The focus of this work is to develop a new method of energy storage and charging for autonomous UAV systems, for use during long-term deployments in a constrained environment. We developed a charging solution that allows pre-equipped UAV systems to land on top of designated charging pads and rapidly replenish their battery reserves using a contact charging point. The system is designed to work with all types of rechargeable batteries, focusing on Lithium Polymer (LiPo) packs that incorporate a battery management system for increased reliability. The project also explores optimization methods for fleets of UAV systems to increase charging efficiency and extend battery lifespans. Each component of this project was first designed and tested in computer simulation. Following positive feedback and results, prototypes for each part of the system were developed and rigorously tested. Results show that the contact charging method is able to charge LiPo batteries at a 1C rate, the industry-standard rate, while maintaining the same safety and efficiency standards as modern-day direct-connection chargers. Control software for the base stations was also created, to be integrated with a fleet management system; it optimizes UAV charge levels and distribution to extend LiPo battery lifetimes while still meeting expected mission demand. Each component of this project (hardware and software) was designed for manufacturing and implementation using industry-standard tools, making it ideal for large-scale implementations. The system has been successfully tested with a fleet of UAV systems at Arizona State University and is currently being integrated into an Arizona smart city environment for deployment.
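For context on the 1C figure, charging at 1C means the charge current numerically equals the pack's capacity, so a full charge takes roughly an hour. The sketch below works that arithmetic through an illustrative 5000 mAh pack, which is an assumed example rather than the thesis's hardware spec.

```python
# A back-of-the-envelope sketch of the 1C charging rate mentioned above;
# the pack capacity is an illustrative value, not the project's hardware.
def one_c_charge(capacity_mah: float) -> tuple[float, float]:
    """Return (charge current in A, approximate full-charge time in hours)
    for a LiPo pack charged at 1C."""
    current_a = capacity_mah / 1000.0  # 1C: current numerically equals capacity
    hours = 1.0                        # charging at 1C takes roughly one hour
    return current_a, hours


# Example: a 5000 mAh pack at 1C charges at 5 A, in about an hour.
print(one_c_charge(5000))  # (5.0, 1.0)
```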
Contributors: Mian, Sami (Author) / Panchanathan, Sethuraman (Thesis advisor) / Berman, Spring (Committee member) / Yang, Yezhou (Committee member) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)
Created: 2018

Description
The reality of smart cities is here and now. The issues of data privacy in tech applications are apparent in smart cities. Privacy, an issue raised by many and addressed by few, remains critical for smart cities' success. It is the common responsibility of smart cities, tech application makers, and users to embark on the journey to solutions. Privacy is an individual problem that smart cities need to provide a collective solution for. This research focuses on understanding users' data privacy preferences, what information they consider private, and what they need to protect. It identifies the data security loopholes, data privacy roadblocks, and common opportunities for change needed to implement a proactive, privacy-driven tech solution that addresses and resolves tech-induced data privacy concerns among citizens. This dissertation addresses the issue of data privacy in tech applications using known methodologies to address the concerns such applications raise. Through this research, a data privacy survey on tech applications was conducted, and the results reveal users' desire to become part of the solution by becoming aware of and taking control of their data privacy while using tech applications. The dissertation therefore gives an overview of the data privacy issues in tech, discusses the available data privacy foundations, elaborates on the different steps needed to create a robust remedy to data privacy concerns by enabling users' awareness and control, and proposes two privacy applications: one as a data privacy awareness solution and the other as a representation of the privacy control framework to address data privacy concerns in smart cities.
Contributors: Musafiri Mimo, Edgard (Author) / McDaniel, Troy (Thesis advisor) / Michael, Katina (Committee member) / Sullivan, Kenneth (Committee member) / Arizona State University (Publisher)
Created: 2022

Description
Visual Odometry is one of the key aspects of robotic localization and mapping. Visual Odometry encompasses many geometric approaches that convert visual data (images) into estimates of the robot's pose in space. The classical geometric methods have shown promising results; they are carefully crafted and built explicitly for these tasks. However, such geometric methods require extreme fine-tuning and extensive prior knowledge to set up for different scenarios. Classical geometric approaches also require significant post-processing and optimization to minimize the error between the estimated pose and the ground truth. In this body of work, a deep learning model was formed by combining SuperPoint and SuperGlue. The resulting model does not require any prior fine-tuning and has been trained to operate in both outdoor and indoor settings. The proposed deep learning model is applied to the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) dataset alongside classical geometric visual odometry models. The deep learning model was not trained on the KITTI dataset; it first encounters the dataset during experimentation. The experiment uses the monocular grayscale images from the KITTI visual odometry files to test the viability of the models on different sequences. The experiment was performed on eight different sequences, recording the Absolute Trajectory Error and the computation time for each sequence. From the results, inferences are drawn comparing the classical and deep learning approaches.
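For reference, the Absolute Trajectory Error reported above is typically computed by rigidly aligning the estimated trajectory to the ground truth and taking the RMSE of the residual positions. Below is a minimal sketch of that computation, with KITTI-style (N, 3) position arrays assumed for illustration; it is not the thesis's evaluation code.

```python
# A minimal sketch of Absolute Trajectory Error (ATE): align the estimated
# trajectory to ground truth with a rigid least-squares fit (Umeyama/Kabsch),
# then take the RMSE of the residuals.
import numpy as np


def ate_rmse(est: np.ndarray, gt: np.ndarray) -> float:
    """est, gt: (N, 3) arrays of estimated and ground-truth positions."""
    # Center both trajectories and solve for the best rotation via SVD.
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    H = (est - mu_e).T @ (gt - mu_g)
    U, _, Vt = np.linalg.svd(H)
    # Reflection guard: keep the solution a proper rotation.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    aligned = (R @ (est - mu_e).T).T + mu_g
    # RMSE of the per-pose translational residuals.
    return float(np.sqrt(np.mean(np.sum((aligned - gt) ** 2, axis=1))))
```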
Contributors: Vaidyanathan, Venkatesh (Author) / Venkateswara, Hemanth (Thesis advisor) / McDaniel, Troy (Thesis advisor) / Michael, Katina (Committee member) / Arizona State University (Publisher)
Created: 2022

Description
As people begin to live longer and the population shifts to having more older adults on Earth than young children, radical solutions will be needed to ease the burden on society. It will be essential to develop technology that can age with the individual. One solution is to keep older adults in their homes longer through smart home and smart living technology, allowing them to age in place. People have many choices when deciding where to age in place, including their own homes, assisted living facilities, nursing homes, or with family members. No matter where people choose to age, they may face isolation and financial hardship, so it is crucial to keep finances in mind when developing smart home technology. Smart home technologies seek to allow individuals to stay in their homes for as long as possible, yet little work looks at how we can use technology across different life stages. Robots are poised to impact society and ease burdens at home and in the workforce, and special attention has been given to social robots as a way to ease isolation. As social robots become accepted into society, researchers need to understand how these robots should mimic natural conversation. My work attempts to answer this question within social robotics by investigating how to make conversational robots natural and reciprocal. I investigated this through a 2x2 Wizard of Oz between-subjects user study. The study lasted four months and tested four different levels of interactivity with the robot. None of the levels was significantly different from the others, an unexpected result. I then investigated how the robot's personality, the participant's trust, and the participant's acceptance of the robot influenced the study.
Contributors: Miller, Jordan (Author) / McDaniel, Troy (Thesis advisor) / Michael, Katina (Committee member) / Cooke, Nancy (Committee member) / Bryan, Chris (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)
Created: 2022

Description
With an aging population, later-in-life health-related incidents such as stroke stand to become more prevalent. Unfortunately, the majority of those who are most at risk for debilitating health episodes are either uninsured or underinsured when it comes to long-term physical/occupational therapy. As insurance companies lower coverage and/or raise the prices of plans with sufficient coverage, the proportion of uninsured/underinsured to fully insured people can be expected to rise. To address this, lower-cost alternative methods of treatment must be developed so people can obtain the treatment required for a sufficient recovery. The presented robotic glove employs low-cost fabric soft pneumatic actuators driven by a closed-loop feedback controller based on readings from embedded soft sensors. This provides the device with proprioceptive abilities for the dynamic control of each independent actuator. Force and fatigue tests were performed to determine the viability of the actuator design, and a Box and Block test along with a motion capture study was completed to evaluate the performance of the device. This paper presents the design and characterization of a soft robotic glove with a feedback controller as an at-home stroke rehabilitation device.
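As an illustration of the closed-loop control described above, here is a minimal PID sketch that could track a bend setpoint for one actuator from its embedded soft sensor reading. The gains, sensor scaling, and valve interface are hypothetical placeholders, not the glove's actual firmware.

```python
# A minimal PID sketch for per-actuator closed-loop control: the error
# between a bend setpoint and the soft sensor reading drives the
# pneumatic valve command. Gains are illustrative, not tuned values.
class PIDController:
    def __init__(self, kp: float, ki: float, kd: float):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint: float, measurement: float, dt: float) -> float:
        """setpoint/measurement: bend angle from the embedded soft sensor;
        returns a valve command (e.g., PWM duty cycle) for this actuator."""
        error = setpoint - measurement
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# One controller per independent actuator, updated at the control rate.
index_finger = PIDController(kp=1.2, ki=0.05, kd=0.1)
command = index_finger.update(setpoint=45.0, measurement=30.0, dt=0.01)
```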
Contributors: Axman, Reed C (Author) / Zhang, Wenlong (Thesis advisor) / Santello, Marco (Committee member) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)
Created: 2022

Description
Visual object recognition has achieved great success with advancements in deep learning technologies. Notably, existing recognition models have achieved human-level performance on many recognition tasks. However, these models are data-hungry, and their performance is constrained by the amount of training data. Inspired by the human ability to recognize object categories based on textual descriptions of objects and prior visual knowledge, the research community has extensively pursued the area of zero-shot learning. In this area of research, machine vision models are trained to recognize object categories that are not observed during training. Zero-shot learning models leverage textual information to transfer visual knowledge from seen object categories in order to recognize unseen object categories.

Generative models have recently gained popularity as they synthesize unseen visual features and convert zero-shot learning into a classical supervised learning problem. These generative models are trained using seen classes and are expected to implicitly transfer the knowledge from seen to unseen classes. However, their performance is stymied by overfitting towards seen classes, which leads to substandard performance in generalized zero-shot learning. To address this concern, this dissertation proposes a novel generative model that leverages the semantic relationship between seen and unseen categories and explicitly performs knowledge transfer from seen categories to unseen categories. Experiments were conducted on several benchmark datasets to demonstrate the efficacy of the proposed model for both zero-shot learning and generalized zero-shot learning. The dissertation also provides a unique Student-Teacher based generative model for zero-shot learning and concludes with future research directions in this area.
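To make the generative recipe concrete, the sketch below shows the common core of such models: a conditional generator maps class semantic embeddings plus noise to synthetic visual features, which lets unseen classes be handled with an ordinary supervised classifier trained on the synthesized features. The dimensions and architecture are illustrative assumptions, not the dissertation's proposed model.

```python
# A minimal PyTorch sketch of feature generation for zero-shot learning:
# synthesize visual features for unseen classes from their semantic
# embeddings, then train a standard classifier on them. Dimensions are
# illustrative (e.g., 85-d attributes, 2048-d ResNet-style features).
import torch
import torch.nn as nn


class FeatureGenerator(nn.Module):
    def __init__(self, attr_dim=85, noise_dim=64, feat_dim=2048):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(attr_dim + noise_dim, 1024),
            nn.LeakyReLU(0.2),
            nn.Linear(1024, feat_dim),
            nn.ReLU(),  # visual features (e.g., ResNet) are non-negative
        )

    def forward(self, attrs, noise):
        # Condition the generator on the class semantic embedding.
        return self.net(torch.cat([attrs, noise], dim=1))


# Synthesize features for unseen classes; a softmax classifier can then
# be trained on (synthetic unseen + real seen) features as usual.
gen = FeatureGenerator()
unseen_attrs = torch.rand(32, 85)      # semantic vectors of unseen classes
noise = torch.randn(32, 64)
fake_feats = gen(unseen_attrs, noise)  # (32, 2048) synthetic features
```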
Contributors: Vyas, Maunil Rohitbhai (Author) / Panchanathan, Sethuraman (Thesis advisor) / Venkateswara, Hemanth (Thesis advisor) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)
Created: 2020

Description
Humans perceive the environment using multiple modalities such as vision, speech (language), touch, taste, and smell. The knowledge obtained from one modality usually complements the others, and learning through several modalities helps in constructing an accurate model of the environment. Most current vision and language models are modality-specific and, in many cases, make extensive use of deep-learning-based attention mechanisms for learning powerful representations. This work discusses the role of attention in associating vision and language to generate a shared representation. The Language Image Transformer (LIT) is proposed for learning multi-modal representations of the environment. It uses a training objective based on Contrastive Predictive Coding (CPC) to maximize the Mutual Information (MI) between the visual and linguistic representations, and it learns the relationship between the modalities using the proposed cross-modal attention layers. It is trained and evaluated on the captioning datasets MS COCO and Conceptual Captions. The results and analysis offer a perspective on the use of Mutual Information Maximization (MIM) for generating generalizable representations across multiple modalities.
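For readers unfamiliar with CPC-style objectives, the sketch below shows an InfoNCE loss between paired image and caption embeddings, which lower-bounds their mutual information. The batch construction, temperature, and symmetric form are illustrative assumptions, not LIT's exact implementation.

```python
# A minimal sketch of a CPC-style InfoNCE objective over paired image and
# caption embeddings: matched pairs are positives (the diagonal), all
# other in-batch pairs are negatives.
import torch
import torch.nn.functional as F


def info_nce(img_emb: torch.Tensor, txt_emb: torch.Tensor, tau: float = 0.07):
    """img_emb, txt_emb: (B, D) L2-normalized embeddings of paired
    images and captions; row i of each tensor forms a matched pair."""
    logits = img_emb @ txt_emb.t() / tau     # (B, B) similarity matrix
    targets = torch.arange(img_emb.size(0))  # positives on the diagonal
    # Symmetric contrastive loss over image-to-text and text-to-image.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```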
Contributors: Ramakrishnan, Raghavendran (Author) / Panchanathan, Sethuraman (Thesis advisor) / Venkateswara, Hemanth Kumar (Thesis advisor) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)
Created: 2020