Resonant microbeam high resolution vibrotactile haptic display

Description

One type of assistive device for the blind attempts to convert visual information into information that can be perceived through another sense, such as touch or hearing. A vibrotactile haptic display assistive device consists of an array of vibrating elements placed against the skin, allowing the blind individual to receive visual information through touch. However, these approaches face two significant technical challenges: the large size of the vibration elements and the number of microcontroller pins required for vibration control, both of which severely limit the resolution of the device. Here, I propose and investigate a type of high-resolution vibrotactile haptic display which overcomes these challenges by utilizing a ‘microbeam’ as the vibrating element. These microbeams can then be actuated using only one microcontroller pin connected to a speaker or surface transducer. This approach could solve the low-resolution problem currently present in all haptic displays. In this paper, the results of an investigation into the manufacturability of such a device, simulation of the vibrational characteristics, and prototyping and experimental validation of the device concept are presented. The possible reasons for the frequency shift between the forced or free response of the beams and the frequency calculated from a lumped mass approximation are investigated. It is found that one important reason for the frequency shift is the size effect, that is, the dependency of the elastic modulus on the size and kind of material. This size effect is investigated for micro- to meso-scale A2 tool steel cantilever beams in the proposed system.
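As a rough illustration of the lumped mass calculation mentioned above, the sketch below compares a single-degree-of-freedom estimate of a cantilever's first resonant frequency with the Euler-Bernoulli value, and shows how a measured frequency shift could be read as an apparent elastic modulus. It is not the thesis's model: the function names, the rectangular cross-section, the Rayleigh effective-mass factor of 33/140, and the nominal steel properties are all assumptions chosen for illustration.

```python
import math


def cantilever_first_mode_hz(E, rho, length, width, thickness):
    """Return (lumped, exact) first-mode frequencies in Hz.

    Assumes a uniform rectangular cantilever (hypothetical geometry).
    """
    area = width * thickness
    inertia = width * thickness**3 / 12.0           # second moment of area
    mass = rho * area * length

    # Lumped model: tip stiffness 3EI/L^3 with Rayleigh effective mass (33/140) m.
    stiffness = 3.0 * E * inertia / length**3
    effective_mass = 33.0 / 140.0 * mass
    f_lumped = math.sqrt(stiffness / effective_mass) / (2.0 * math.pi)

    # Euler-Bernoulli first mode: (beta*L)^2 = 1.8751^2 for a clamped-free beam.
    beta_l = 1.8751040687
    f_exact = beta_l**2 / (2.0 * math.pi) * math.sqrt(
        E * inertia / (rho * area * length**4)
    )
    return f_lumped, f_exact


def apparent_modulus(E_nominal, f_measured, f_calculated):
    """Frequency scales with sqrt(E), so E_apparent = E * (f_meas / f_calc)^2."""
    return E_nominal * (f_measured / f_calculated) ** 2


# Assumed nominal steel values and beam dimensions, not taken from the thesis.
f_lumped, f_exact = cantilever_first_mode_hz(
    E=200e9, rho=7860.0, length=10e-3, width=1e-3, thickness=0.1e-3
)
print(f"lumped: {f_lumped:.1f} Hz, Euler-Bernoulli: {f_exact:.1f} Hz")
```

Under these assumptions the lumped estimate sits a little above the continuous-beam value, so part of any observed shift comes from the approximation itself; only the remainder would be attributable to a size-dependent modulus.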

Contributors: Wi, Daehan (Author) / Sodemann, Angela A (Thesis advisor) / Redkar, Sangram (Committee member) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)

Created: 2019

Exploring the Design of Vibrotactile Cues for Visio-Haptic Sensory Substitution

Description

This paper presents the design and evaluation of a haptic interface for augmenting human-human interpersonal interactions by delivering facial expressions of an interaction partner to an individual who is blind using a visual-to-tactile mapping of facial action units and emotions. Pancake shaftless vibration motors are mounted on the back of a chair to provide vibrotactile stimulation in the context of a dyadic (one-on-one) interaction across a table. This work explores the design of spatiotemporal vibration patterns that can be used to convey the basic building blocks of facial movements according to the Facial Action Unit Coding System. A behavioral study was conducted to explore the factors that influence the naturalness of conveying affect using vibrotactile cues.

Contributors: Bala, Shantanu (Author) / Panchanathan, Sethuraman (Thesis director) / McDaniel, Troy (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor) / Department of Psychology (Contributor)

Created: 2014-05

Using Goodness of Pronunciation Features for Spoken Nasality Detection

Description

Speech nasality disorders are characterized by abnormal resonance in the nasal cavity. Hypernasal speech is of particular interest. It is characterized by an inability to prevent improper nasalization of vowels and by poor articulation of plosive and fricative consonants, and it can lead to negative communicative and social consequences. It can be associated with a range of conditions, including cleft lip or palate, velopharyngeal dysfunction (a physical or neurological defective closure of the soft palate that regulates resonance between the oral and nasal cavity), dysarthria, or hearing impairment, and can also be an early indicator of developing neurological disorders such as ALS. Hypernasality is typically scored perceptually by a Speech Language Pathologist (SLP). Misdiagnosis could lead to inadequate treatment plans and poor treatment outcomes for a patient. Also, for some applications, particularly screening for early neurological disorders, the use of an SLP is not practical. Hence this work demonstrates a data-driven approach to objective assessment of hypernasality through the use of Goodness of Pronunciation features. These features capture the overall precision of articulation of a speaker on a phoneme-by-phoneme basis, allowing the demonstrated models to achieve a Pearson correlation coefficient of 0.88 on low-nasality speakers, the population of most interest for this sort of technique. These results are comparable to milestone methods in this domain.

Contributors: Saxon, Michael Stephen (Author) / Berisha, Visar (Thesis director) / McDaniel, Troy (Committee member) / Electrical Engineering Program (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Barrett, The Honors College (Contributor)

Created: 2018-05

Fused Filament Fabrication of Prosthetic Components for Trans-Humeral Upper Limb Prosthetics

Description

Presented below is the design and fabrication of prosthetic components, consisting of attachment, tactile sensing, and actuator systems, using the Fused Filament Fabrication (FFF) technique. The attachment system is a thermoplastic osseointegrated upper limb prosthesis for average adult trans-humeral amputation with mechanical properties greater than upper limb skeletal bone. The designed prosthetic has a one-step surgical process and large cavities for bone tissue ingrowth, uses a material with an elastic modulus less than that of skeletal bone, and can be fabricated on one system.

The FFF osseointegration screw is an improvement upon the current two-part osseointegrated prosthetics, which are composed of a fixture and an abutment. The current prosthetic design requires two invasive surgeries for implantation and is made of titanium, which has an elastic modulus greater than that of bone. An elastic modulus greater than that of bone causes stress shielding and over time can cause loosening of the prosthetic.

The tactile sensor is a thermoplastic piezo-resistive sensor for daily activities for a prosthetic’s feedback system. The tactile sensor is manufactured from a low elastic modulus composite composed of a compressible thermoplastic elastomer and conductive carbon. The carbon is in graphite form and is added in high filler ratios. The printed sensors were compared to sensors fabricated in a gravity mold to highlight the differences between FFF sensors and molded sensors. The 3D printed tactile sensor has a thickness and feel similar to human skin, has a simple fabrication technique, can detect forces needed for daily activities, and can be manufactured into user-specific geometries.

Lastly, a biomimicking skeletal muscle actuator for prosthetics was developed. The actuator is manufactured with Fused Filament Fabrication using a shape memory polymer composite and has non-linear contractile and passive forces, contractile forces and strains comparable to mammalian skeletal muscle, a reaction time under one second, a low operating temperature, and low mass, volume, and material cost. The actuator improves upon current prosthetic actuators, which provide rigid, linear force with high weight, cost, and noise.

Contributors: Lathers, Steven (Author) / La Belle, Jeffrey (Thesis advisor) / Vowels, David (Committee member) / Lockhart, Thurmon (Committee member) / Abbas, James (Committee member) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)

Created: 2017

Why Pop? A System to Explain How Deep Learning Models Classify Music

Description

The impact of Artificial Intelligence (AI) has increased significantly in daily life. AI is taking big strides into critical areas of life such as healthcare, but also into areas such as entertainment and leisure. Deep neural networks have been pivotal in making all these advancements possible. However, a well-known problem with deep neural networks is the lack of explanations for the choices they make. To combat this, several methods have been tried in the research community. One example is ranking individual features by how influential they are in the decision-making process. In contrast, a newer class of methods focuses on Concept Activation Vectors (CAVs), which extract higher-level concepts from the trained model to capture more information as a mixture of several features rather than just one. The goal of this thesis is to employ concepts in a novel domain: to explain how a deep learning model uses computer vision to classify music into different genres. Due to the advances in the field of computer vision with deep learning for classification tasks, it is now fairly standard practice to convert an audio clip into corresponding spectrograms and use those spectrograms as image inputs to the deep learning model. Thus, a pre-trained model can classify the spectrogram images (representing songs) into musical genres. The proposed explanation system, called “Why Pop?”, tries to answer certain questions about the classification process, such as which parts of the spectrogram influence the model the most, what concepts were extracted, and how they differ across classes. These explanations help the user gain insights into the model’s learnings, biases, and decision-making process.
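The audio-to-spectrogram step described above can be sketched with a plain short-time Fourier transform. This is a minimal illustration only: the abstract does not say which spectrogram variant, frame length, or hop size the thesis uses, so the function name `log_spectrogram`, the Hann window, and all parameter values here are assumptions.

```python
import numpy as np


def log_spectrogram(signal, frame_len=1024, hop=512):
    """Return a log-magnitude spectrogram of shape (freq_bins, n_frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Small offset avoids log(0); transpose so rows are frequency bins.
    return 20.0 * np.log10(magnitude + 1e-10).T


# One second of a 440 Hz test tone at an assumed 22.05 kHz sample rate.
sample_rate = 22050
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2.0 * np.pi * 440.0 * t)

spec = log_spectrogram(tone)
peak_bin = int(np.argmax(spec.mean(axis=1)))
print(spec.shape, peak_bin * sample_rate / 1024)  # array shape, peak frequency in Hz
```

The resulting 2-D array can be treated as a single-channel image, which is what lets an image classifier be reused on audio.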

Contributors: Sharma, Shubham (Author) / Bryan, Chris (Thesis advisor) / McDaniel, Troy (Committee member) / Sarwat, Mohamed (Committee member) / Arizona State University (Publisher)

Created: 2022

A Proactive Systematic Approach to Enhance and Preserve Users’ Tech Applications Data Privacy Awareness and Control in Smart Cities

Description

The reality of smart cities is here and now. The issues of data privacy in tech applications are apparent in smart cities. Privacy, an issue raised by many and addressed by few, remains critical for smart cities’ success. It is the common responsibility of smart cities, tech application makers, and users to embark on the journey to solutions. Privacy is an individual problem for which smart cities need to provide a collective solution. The research focuses on understanding users’ data privacy preferences, what information they consider private, and what they need to protect. The research identifies the data security loopholes, data privacy roadblocks, and common opportunities for change to implement a proactive privacy-driven tech solution necessary to address and resolve tech-induced data privacy concerns among citizens. This dissertation aims at addressing the issue of data privacy in tech applications based on known methodologies to address the concerns they allow. Through this research, a data privacy survey on tech applications was conducted, and the results reveal users’ desire to become part of the solution by becoming aware and taking control of their data privacy while using tech applications. This dissertation therefore gives an overview of the data privacy issues in tech, discusses the available data privacy basis, elaborates on the steps needed to create a robust remedy to data privacy concerns by enabling users’ awareness and control, and proposes two privacy applications to address data privacy concerns in smart cities: one as a data privacy awareness solution and the other as a representation of the privacy control framework.

Contributors: Musafiri Mimo, Edgard (Author) / McDaniel, Troy (Thesis advisor) / Michael, Katina (Committee member) / Sullivan, Kenneth (Committee member) / Arizona State University (Publisher)

Created: 2022

Zero Shot Learning for Visual Object Recognition with Generative Models

Description

Visual object recognition has achieved great success with advancements in deep learning technologies. Notably, the existing recognition models have gained human-level performance on many of the recognition tasks. However, these models are data hungry, and their performance is constrained by the amount of training data. Inspired by the human ability to recognize object categories based on textual descriptions of objects and previous visual knowledge, the research community has extensively pursued the area of zero-shot learning. In this area of research, machine vision models are trained to recognize object categories that are not observed during the training process. Zero-shot learning models leverage textual information to transfer visual knowledge from seen object categories in order to recognize unseen object categories.

Generative models have recently gained popularity as they synthesize unseen visual features and convert zero-shot learning into a classical supervised learning problem. These generative models are trained using seen classes and are expected to implicitly transfer the knowledge from seen to unseen classes. However, their performance is stymied by overfitting towards seen classes, which leads to substandard performance in generalized zero-shot learning. To address this concern, this dissertation proposes a novel generative model that leverages the semantic relationship between seen and unseen categories and explicitly performs knowledge transfer from seen categories to unseen categories. Experiments were conducted on several benchmark datasets to demonstrate the efficacy of the proposed model for both zero-shot learning and generalized zero-shot learning. The dissertation also provides a unique Student-Teacher based generative model for zero-shot learning and concludes with future research directions in this area.

Contributors: Vyas, Maunil Rohitbhai (Author) / Panchanathan, Sethuraman (Thesis advisor) / Venkateswara, Hemanth (Thesis advisor) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)

Created: 2020

Language Image Transformer

Description

Humans perceive the environment using multiple modalities like vision, speech (language), touch, taste, and smell. The knowledge obtained from one modality usually complements the other. Learning through several modalities helps in constructing an accurate model of the environment. Most of the current vision and language models are modality-specific and, in many cases, extensively use deep-learning-based attention mechanisms for learning powerful representations. This work discusses the role of attention in associating vision and language for generating a shared representation. Language Image Transformer (LIT) is proposed for learning multi-modal representations of the environment. It uses a training objective based on Contrastive Predictive Coding (CPC) to maximize the Mutual Information (MI) between the visual and linguistic representations. It learns the relationship between the modalities using the proposed cross-modal attention layers. It is trained and evaluated using the captioning datasets MS COCO and Conceptual Captions. The results and the analysis offer a perspective on the use of Mutual Information Maximisation (MIM) for generating generalizable representations across multiple modalities.
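To make the CPC-style objective concrete, the following is a minimal sketch of the InfoNCE contrastive loss commonly associated with Contrastive Predictive Coding, applied to paired image and text embeddings. It does not reproduce LIT's architecture or its exact loss: the function name `info_nce_loss`, the cosine-similarity scoring, and the temperature value are assumptions for illustration.

```python
import numpy as np


def info_nce_loss(image_emb, text_emb, temperature=0.1):
    """InfoNCE loss where row i of each matrix forms a matching pair.

    Other rows in the batch act as negatives. Minimizing the loss raises
    a lower bound on mutual information: MI >= log(batch_size) - loss.
    """
    # Cosine similarity via L2-normalized embeddings (an assumed scoring choice).
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = image_emb @ text_emb.T / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The correct pairing lies on the diagonal.
    return -np.mean(np.diag(log_probs))


rng = np.random.default_rng(0)
embeddings = rng.normal(size=(8, 16))
loss_aligned = info_nce_loss(embeddings, embeddings)
loss_shuffled = info_nce_loss(embeddings, np.roll(embeddings, 1, axis=0))
print(loss_aligned, loss_shuffled)
```

Correctly paired embeddings give a much lower loss than mismatched ones, which is the signal that pulls the two modalities toward a shared representation.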

Contributors: Ramakrishnan, Raghavendran (Author) / Panchanathan, Sethuraman (Thesis advisor) / Venkateswara, Hemanth Kumar (Thesis advisor) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)

Created: 2020

Accessible Retail Shopping For The Visually Impaired Using Deep Learning

Description

Over the past decade, advancements in neural networks have been instrumental in achieving remarkable breakthroughs in the field of computer vision. One of the applications is in creating assistive technology to improve the lives of visually impaired people by making the world around them more accessible. Extensive research in convolutional neural networks has led to human-level performance in different vision tasks including image classification, object detection, instance segmentation, semantic segmentation, panoptic segmentation and scene text recognition. All the aforementioned tasks, individually or in combination, have been used to create assistive technologies to improve accessibility for the blind.

This dissertation outlines various applications to improve accessibility and independence for visually impaired people during shopping by helping them identify products in retail stores. The dissertation includes the following contributions: (i) a dataset containing images of breakfast-cereal products and a classifier using a deep neural network (ResNet); (ii) a dataset for training a text detection and scene-text recognition model; (iii) a model for text detection and scene-text recognition to identify product images using a user-controlled camera; (iv) a dataset of twenty thousand products with product information and related images that can be used to train and test a system designed to identify products.

Contributors: Patel, Akshar (Author) / Panchanathan, Sethuraman (Thesis advisor) / Venkateswara, Hemanth (Thesis advisor) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)

Created: 2020

Incremental Learning With Sample Generation From Pretrained Networks

Description

In the last decade, deep-learning-based models have revolutionized machine learning and computer vision applications. However, these models are data-hungry and training them is a time-consuming process. In addition, when deep neural networks are updated to augment their prediction space with new data, they run into the problem of catastrophic forgetting, where the model forgets previously learned knowledge as it overfits to the newly available data. Incremental learning algorithms enable deep neural networks to prevent catastrophic forgetting by retaining knowledge of previously observed data while also learning from newly available data.

This thesis presents three models for incremental learning: (i) an algorithm for generative incremental learning using a pre-trained deep neural network classifier; (ii) a hashing-based clustering algorithm for efficient incremental learning; (iii) a student-teacher coupled neural network to distill knowledge for incremental learning. The proposed algorithms were evaluated using popular vision datasets for classification tasks. The thesis concludes with a discussion about the feasibility of using these techniques to transfer information between networks and also for incremental learning applications.
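The student-teacher idea in item (iii) can be illustrated with the standard temperature-softened distillation loss, in which a student network is penalized for diverging from a frozen teacher's output distribution on previously learned classes. This is a generic sketch rather than the thesis's coupled network: the function names, the temperature of 2.0, and the example logits are assumptions.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Row-wise softmax with temperature scaling."""
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)    # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean KL(teacher || student) on softened outputs, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = np.sum(
        p_teacher * (np.log(p_teacher + 1e-12) - np.log(p_student + 1e-12)),
        axis=1,
    )
    return temperature**2 * kl.mean()


# Hypothetical logits over three old classes for two samples.
teacher = np.array([[4.0, 1.0, 0.5], [0.2, 3.0, 0.1]])
drifted_student = np.array([[1.0, 1.0, 3.0], [2.5, 0.3, 0.2]])

loss_same = distillation_loss(teacher, teacher)
loss_drifted = distillation_loss(drifted_student, teacher)
print(loss_same, loss_drifted)
```

In an incremental setting this term would typically be added to the ordinary classification loss on the new data, so the student learns new classes while being discouraged from drifting on the old ones.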

Contributors: Patil, Rishabh (Author) / Venkateswara, Hemanth (Thesis advisor) / Panchanathan, Sethuraman (Thesis advisor) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)

Created: 2020