Matching Items (30)
Filtering by
- All Subjects: Computer Science
- All Subjects: Engineering
Description
Machine learning models convert raw data in the form of video, images, audio, text, etc. into feature representations that are convenient for computational processing. Deep neural networks have proven to be very efficient feature extractors for a variety of machine learning tasks. Generative models based on deep neural networks introduce constraints on the feature space to learn transferable and disentangled representations. Transferable feature representations help in training machine learning models that are robust across different distributions of data. For example, with the application of transferable features in domain adaptation, models trained on a source distribution can be applied to data from a target distribution even though the distributions may differ. In style transfer and image-to-image translation, disentangled representations allow style and content to be separated when translating images.
This thesis examines learning transferable data representations in novel deep generative models. The Semi-Supervised Adversarial Translator (SAT) utilizes adversarial methods and cross-domain weight sharing in a neural network to extract transferable representations. These transferable representations can then be decoded into the original image or a similar image in another domain. The Explicit Disentangling Network (EDN) utilizes generative methods to disentangle images into their core attributes and then segments sets of related attributes. The EDN can separate these attributes by controlling the flow of information using a novel combination of losses and network architecture. This separation of attributes allows precise modifications to specific components of the data representation, boosting the performance of machine learning tasks. The effectiveness of these models is evaluated across domain adaptation, style transfer, and image-to-image translation tasks.
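The weight-sharing idea behind the SAT — one encoder shared across domains, feeding domain-specific decoders — can be illustrated with a rough sketch. This is not the SAT architecture itself: the tiny linear maps, dimensions, and sample values below are arbitrary stand-ins for real convolutional networks, and the adversarial training loop is omitted entirely.

```python
import random

random.seed(0)

def linear(weights, x):
    """Apply a weight matrix (list of rows) to a vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def rand_matrix(rows, cols):
    """A randomly initialized weight matrix, standing in for a trained layer."""
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

# One shared encoder: samples from either domain map into the same
# feature space, which is what makes the representation transferable.
shared_encoder = rand_matrix(2, 4)

# Domain-specific decoders map the shared features back into each domain.
decoder_a = rand_matrix(4, 2)
decoder_b = rand_matrix(4, 2)

x_a = [0.5, -0.2, 0.1, 0.9]        # a sample from domain A
z = linear(shared_encoder, x_a)     # transferable representation
recon_a = linear(decoder_a, z)      # reconstruct within domain A
trans_b = linear(decoder_b, z)      # translate into domain B
```

In the real model, an adversarial loss would push the encoder to produce features indistinguishable across domains; here only the routing structure is shown.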
ContributorsEusebio, Jose Miguel Ang (Author) / Panchanathan, Sethuraman (Thesis advisor) / Davulcu, Hasan (Committee member) / Venkateswara, Hemanth (Committee member) / Arizona State University (Publisher)
Created2018
Description
One type of assistive device for the blind attempts to convert visual information into information that can be perceived through another sense, such as touch or hearing. A vibrotactile haptic display consists of an array of vibrating elements placed against the skin, allowing a blind individual to receive visual information through touch. However, these approaches face two significant technical challenges: large vibration element size and the number of microcontroller pins required for vibration control, both of which cause excessively low resolution of the device. Here, I propose and investigate a type of high-resolution vibrotactile haptic display which overcomes these challenges by utilizing a ‘microbeam’ as the vibrating element. These microbeams can be actuated using only one microcontroller pin connected to a speaker or surface transducer. This approach could solve the low-resolution problem currently present in all haptic displays. In this paper, the results of an investigation into the manufacturability of such a device, simulation of its vibrational characteristics, and prototyping and experimental validation of the device concept are presented. Possible reasons for the shift between the frequencies observed in the forced and free responses of the beams and the frequency calculated from a lumped-mass approximation are investigated. It is found that one important reason for the frequency shift is the size effect: the dependency of the elastic modulus on the size and kind of material. This size effect is investigated for micro/meso-scale A2 tool steel cantilever beams for the proposed system.
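The lumped-mass approximation mentioned above can be sketched as follows. The formulas used (tip stiffness k = 3EI/L³, effective mass 33/140 of the beam mass, f = (1/2π)·√(k/m)) are the standard textbook cantilever approximation, not an excerpt of the thesis's analysis; the material constants are nominal handbook values for A2 tool steel, and the beam dimensions are assumed for illustration only.

```python
import math

def cantilever_natural_frequency(E, rho, L, b, h):
    """First natural frequency (Hz) of a rectangular cantilever beam,
    via the lumped-mass approximation f = (1/2pi) * sqrt(k / m_eff)."""
    I = b * h**3 / 12.0              # second moment of area, m^4
    k = 3.0 * E * I / L**3           # tip stiffness of a cantilever, N/m
    m_beam = rho * L * b * h         # total beam mass, kg
    m_eff = (33.0 / 140.0) * m_beam  # effective tip mass (Rayleigh estimate)
    return math.sqrt(k / m_eff) / (2.0 * math.pi)

# Example: a 5 mm x 0.5 mm x 0.1 mm steel microbeam (assumed dimensions),
# with nominal A2 tool steel properties E ~ 203 GPa, rho ~ 7860 kg/m^3.
f = cantilever_natural_frequency(E=203e9, rho=7860.0, L=5e-3, b=0.5e-3, h=0.1e-3)
```

The size effect discussed in the abstract would enter this calculation as a change in the effective value of E at micro/meso scales, shifting the predicted frequency away from this bulk-property estimate.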
ContributorsWi, Daehan (Author) / Sodemann, Angela A (Thesis advisor) / Redkar, Sangram (Committee member) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)
Created2019
Description
Parkinson's disease is a neurodegenerative disorder of the central nervous system that affects a host of daily activities and involves a variety of symptoms, including tremors, slurred speech, and rigid muscles. It is the second most common movement disorder globally. In Stage 3 of Parkinson's, afflicted individuals begin to develop an abnormal gait pattern known as freezing of gait (FoG), which is characterized by decreased step length, shuffling, and eventually complete loss of movement; the individual is unable to move, which often results in a fall. Surface electromyography (sEMG) is a diagnostic tool that measures electrical activity in the muscles to assess overall muscle function. Most conventional EMG systems, however, are bulky, tethered to a single location, expensive, and primarily used in a lab or clinical setting. This project explores an affordable, open-source, and portable platform called Open Brain-Computer Interface (OpenBCI). The purpose of the proposed device is to detect gait patterns by leveraging sEMG signals from the OpenBCI and to help a patient overcome an episode using haptic feedback mechanisms. Previously designed devices with similar intended purposes utilize accelerometry as a method of detection as well as audio and visual feedback mechanisms in their design.
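As a hedged sketch of how muscle-activity events might be flagged from a surface-EMG stream (the abstract does not specify the project's actual detection algorithm), the snippet below computes a sliding-window RMS envelope and reports threshold crossings; the window length, threshold, and sample values are all illustrative.

```python
import math

def rms_envelope(signal, window):
    """Sliding-window RMS of an EMG sample sequence."""
    out = []
    for i in range(len(signal) - window + 1):
        chunk = signal[i:i + window]
        out.append(math.sqrt(sum(s * s for s in chunk) / window))
    return out

def detect_events(signal, window=4, threshold=0.5):
    """Return envelope indices where muscle activity exceeds the threshold."""
    env = rms_envelope(signal, window)
    return [i for i, v in enumerate(env) if v > threshold]

# Low-amplitude baseline activity with a burst in the middle.
quiet = [0.05, -0.04, 0.06, -0.05, 0.04, -0.06]
burst = quiet + [0.9, -0.8, 0.85, -0.9] + quiet
events = detect_events(burst, window=4, threshold=0.5)
```

A deployed system would of course need per-user calibration and a classifier robust to motion artifacts; this only illustrates the envelope-and-threshold building block.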
ContributorsAnantuni, Lekha (Author) / McDaniel, Troy (Thesis director) / Tadayon, Arash (Committee member) / Harrington Bioengineering Program (Contributor) / School of Human Evolution and Social Change (Contributor) / Barrett, The Honors College (Contributor)
Created2016-05
Description
This paper presents work that was done to create a system capable of facial expression recognition (FER) using deep convolutional neural networks (CNNs) and to test multiple configurations and methods. CNNs are able to extract powerful information about an image using multiple layers of generic feature detectors. The extracted information can be used to understand the image better through recognizing different features present within the image. Deep CNNs, however, require training sets that can be larger than a million pictures in order to fine-tune their feature detectors. No facial expression datasets of that size are available. Due to this limited availability of data required to train a new CNN, the idea of using naïve domain adaptation is explored. Instead of creating and using a new CNN trained specifically to extract features related to FER, a previously trained CNN originally trained for another computer vision task is used. Work for this research involved creating a system that can run a CNN, extract feature vectors from the CNN, and classify these extracted features. Once this system was built, different aspects of it were tested and tuned: the pre-trained CNN that was used, the layer from which features were extracted, the normalization applied to input images, and the training data for the classifier. Once properly tuned, the created system returned results more accurate than previous attempts at facial expression recognition. Based on these positive results, naïve domain adaptation is shown to successfully leverage the advantages of deep CNNs for facial expression recognition.
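The transfer pipeline described above — a frozen pre-trained network supplying feature vectors, with only a light classifier trained on top — can be sketched minimally. The fixed random projection below is a stand-in for a real CNN layer activation, and a nearest-centroid rule stands in for whichever classifier the paper actually tuned; the toy "images" are hypothetical.

```python
import math
import random

random.seed(1)

# Stand-in for a frozen, pre-trained CNN layer: a fixed linear projection
# from 6-dimensional "images" to 3-dimensional feature vectors.
PROJECTION = [[random.uniform(-1.0, 1.0) for _ in range(6)] for _ in range(3)]

def pretrained_features(image):
    """Extract a feature vector without updating any weights."""
    return [sum(w * p for w, p in zip(row, image)) for row in PROJECTION]

def train_centroids(samples):
    """Train the light classifier: per-label mean of extracted features.
    samples is a list of (image, label) pairs."""
    sums, counts = {}, {}
    for image, label in samples:
        f = pretrained_features(image)
        acc = sums.setdefault(label, [0.0] * len(f))
        sums[label] = [a + b for a, b in zip(acc, f)]
        counts[label] = counts.get(label, 0) + 1
    return {lb: [v / counts[lb] for v in vec] for lb, vec in sums.items()}

def classify(image, centroids):
    """Assign the label whose centroid is nearest in feature space."""
    f = pretrained_features(image)
    return min(centroids, key=lambda lb: math.dist(f, centroids[lb]))

happy = [1, 1, 0, 0, 1, 0]   # hypothetical pixel data
sad = [0, 0, 1, 1, 0, 1]
centroids = train_centroids([(happy, "happy"), (sad, "sad")])
```

Only `train_centroids` ever "learns" anything; the feature extractor stays fixed, which is the essence of the naïve domain adaptation strategy.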
ContributorsEusebio, Jose Miguel Ang (Author) / Panchanathan, Sethuraman (Thesis director) / McDaniel, Troy (Committee member) / Venkateswara, Hemanth (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created2016-05
Description
This paper presents the design and evaluation of a haptic interface for augmenting human-human interpersonal interactions by delivering facial expressions of an interaction partner to an individual who is blind using a visual-to-tactile mapping of facial action units and emotions. Pancake shaftless vibration motors are mounted on the back of a chair to provide vibrotactile stimulation in the context of a dyadic (one-on-one) interaction across a table. This work explores the design of spatiotemporal vibration patterns that can be used to convey the basic building blocks of facial movements according to the Facial Action Unit Coding System. A behavioral study was conducted to explore the factors that influence the naturalness of conveying affect using vibrotactile cues.
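The mapping from Facial Action Coding System units to spatiotemporal vibration sequences can be illustrated with a small sketch. The patterns below are hypothetical, not the ones designed in the study: each is an ordered list of (motor index, duration in ms) steps, and the AU names follow the standard FACS labels.

```python
# Hypothetical spatiotemporal patterns: ordered (motor_index, duration_ms)
# steps played on a back-mounted vibration motor grid.
AU_PATTERNS = {
    "AU12_lip_corner_puller": [(0, 100), (1, 100), (2, 100)],  # outward sweep
    "AU4_brow_lowerer":       [(5, 150), (4, 150), (3, 150)],  # downward sweep
    "AU1_inner_brow_raiser":  [(3, 150), (4, 150), (5, 150)],  # upward sweep
}

def play_pattern(au, activate):
    """Render one action unit; `activate` is a (motor_index, duration_ms)
    callback supplied by the hardware layer driving the motors."""
    for motor, duration in AU_PATTERNS[au]:
        activate(motor, duration)

# With a logging callback in place of real hardware:
log = []
play_pattern("AU12_lip_corner_puller", lambda m, d: log.append((m, d)))
```

Separating the pattern table from the playback routine lets the spatiotemporal designs be revised (as in the behavioral study above) without touching the motor-driving code.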
ContributorsBala, Shantanu (Author) / Panchanathan, Sethuraman (Thesis director) / McDaniel, Troy (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor) / Department of Psychology (Contributor)
Created2014-05
Description
This paper presents a system to deliver automated, noninvasive, and effective fine motor rehabilitation through a rhythm-based game using a Leap Motion Controller. The system is a rhythm game in which hand gestures are used as input and must match the rhythm and gestures shown on screen, allowing a physical therapist to represent an exercise session involving the user's hand and finger joints as a series of patterns. Fine motor rehabilitation plays an important role in recovery from the effects of stroke, Parkinson's disease, multiple sclerosis, and more. Individuals with these conditions possess a wide range of impairment in terms of fine motor movement. The serious game developed takes this into account and is designed to work with individuals with different levels of impairment. In a pilot study, under partnership with South West Advanced Neurological Rehabilitation (SWAN Rehab) in Phoenix, Arizona, we compared the performance of individuals with fine motor impairment to individuals without this impairment to determine whether a human-centered approach that adapts to a user's range of motion can allow an individual with fine motor impairment to perform at a similar level as a non-impaired user.
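The human-centered adaptation described above — scaling gesture input to each user's measured range of motion (ROM) — can be sketched as follows. The calibration procedure and values are illustrative assumptions, not the study's actual parameters.

```python
def calibrate(samples):
    """Derive a user's ROM (lo, hi) from raw sensor readings recorded
    during a short calibration session."""
    return min(samples), max(samples)

def normalize(value, rom):
    """Map a raw gesture value into [0, 1] relative to the user's ROM,
    clamping values outside the calibrated range."""
    lo, hi = rom
    if hi == lo:
        return 0.0
    return max(0.0, min(1.0, (value - lo) / (hi - lo)))

# A user with a narrow ROM (raw units 10-15) still spans the full
# in-game amplitude, so game targets stay reachable despite impairment.
impaired_rom = calibrate([10.0, 12.0, 15.0, 11.0])
score = normalize(13.0, impaired_rom)
```

Because scoring operates on the normalized value, an impaired and a non-impaired user performing at the edges of their own ROMs receive comparable in-game feedback, which is the comparison the pilot study examines.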
ContributorsShah, Vatsal Nimishkumar (Author) / McDaniel, Troy (Thesis director) / Tadayon, Ramin (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created2018-05
Description
This paper presents an overview of The Dyadic Interaction Assistant for Individuals with Visual Impairments with a focus on the software component. The system is designed to communicate facial information (facial Action Units, facial expressions, and facial features) to an individual with visual impairments in a dyadic interaction between two people sitting across from each other. Comprised of (1) a webcam, (2) software, and (3) a haptic device, the system can also be described as a series of input, processing, and output stages, respectively. The processing stage of the system builds on the open source FaceTracker software and the application Computer Expression Recognition Toolbox (CERT). While these two sources provide the facial data, the program developed through the IDE Qt Creator and several AppleScripts are used to adapt the information to a Graphical User Interface (GUI) and output the data to a comma-separated values (CSV) file. It is the first software to convey all three types of facial information at once in real time. Future work includes testing and evaluating the quality of the software with human subjects (both sighted and blind/low vision), integrating the haptic device to complete the system, and evaluating the entire system with human subjects (sighted and blind/low vision).
ContributorsBrzezinski, Chelsea Victoria (Author) / Balasubramanian, Vineeth (Thesis director) / McDaniel, Troy (Committee member) / Venkateswara, Hemanth (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)
Created2013-05
Description
Ophthalmoscopes are integral to diagnosing various eye conditions; however, they often come at a hefty cost and are not generally portable, limiting access. With the increase in the prevalence of smart devices and improvements to their imaging capabilities, these devices have the potential to benefit areas where specialized imaging infrastructure is not well established. Smart device cameras alone cannot replace an ophthalmoscope. However, with the addition of lenses and optics, it becomes possible to take diagnostic-quality images. The goal is to design a modular system that acts as an adapter to a smart device, enabling any user to take retinal and corneal images with little to no previous experience. The device should be cost-effective, reliable, and easy to use. The device is not meant to replace conventional funduscopes but to act in areas where current units fail. Applications in non-optimal settings, low-resource areas, or areas that currently receive suboptimal care due to geographic or socioeconomic barriers are examples of where this device could be used. The introduction of screening programs run by nonspecialized medical personnel with devices that can capture and transmit quality eye images minimizes the long-term complications of degenerative eye conditions.
ContributorsSpyres, Dean (Author) / McDaniel, Troy (Thesis advisor) / Patel, Dave (Committee member) / Gintz, Jerry (Committee member) / Arizona State University (Publisher)
Created2022
Description
The Oasis app is a self-appraisal tool for potential or current problem gamblers to take control of their habits by providing periodic check-in notifications during a gambling session and allowing users to see their progress over time. Oasis is backed by substantial background research surrounding addiction intervention methods, especially in the field of self-appraisal messaging, and applies this messaging in a familiar mobile notification form that can effectively change users’ behavior. User feedback was collected and used to improve the app, and the results show a promising tool that could help those who need it in the future.
ContributorsBlunt, Thomas (Author) / Meuth, Ryan (Thesis director) / McDaniel, Troy (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)
Created2023-05
Description
The impact of Artificial Intelligence (AI) on daily life has increased significantly. AI is taking big strides into critical areas of life such as healthcare, but also into areas such as entertainment and leisure. Deep neural networks have been pivotal in making all these advancements possible, but a well-known problem with deep neural networks is the lack of explanations for the choices they make. To combat this, several methods have been explored in the research literature. One example is ranking individual features by how influential they are in the decision-making process. In contrast, a newer class of methods based on Concept Activation Vectors (CAVs) extracts higher-level concepts from the trained model, capturing information as a mixture of several features rather than just one. The goal of this thesis is to employ concepts in a novel domain: to explain how a deep learning model uses computer vision to classify music into different genres. Given the advances in deep learning for image classification tasks, it is now standard practice to convert an audio clip into corresponding spectrograms and use those spectrograms as image inputs to the deep learning model. Thus, a pre-trained model can classify the spectrogram images (representing songs) into musical genres. The proposed explanation system, called “Why Pop?”, tries to answer certain questions about the classification process, such as which parts of the spectrogram influence the model the most, what concepts were extracted, and how they differ across classes. These explanations help the user gain insight into the model’s learning, biases, and decision-making process.
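The audio-to-spectrogram preprocessing step described above can be sketched with a windowed DFT. Real pipelines typically use an FFT library and mel-scaled log spectrograms; this stdlib version, with an arbitrarily short frame length, is purely illustrative of the transformation that turns a song into an "image."

```python
import cmath
import math

def spectrogram(signal, frame=8, hop=4):
    """Magnitude spectrogram: one row of |DFT| values per windowed frame,
    keeping only the non-negative frequency bins."""
    frames = []
    for start in range(0, len(signal) - frame + 1, hop):
        chunk = signal[start:start + frame]
        row = []
        for k in range(frame // 2 + 1):
            s = sum(x * cmath.exp(-2j * math.pi * k * n / frame)
                    for n, x in enumerate(chunk))
            row.append(abs(s))
        frames.append(row)
    return frames

# A pure tone aligned with bin 2 of an 8-sample frame concentrates its
# energy in that bin, which would appear as a horizontal stripe in the
# spectrogram image fed to the classifier.
tone = [math.cos(2 * math.pi * 2 * n / 8) for n in range(16)]
spec = spectrogram(tone)
```

The resulting time-frequency grid is what the pre-trained image model consumes, and it is also the canvas on which saliency- or concept-based explanations like those in “Why Pop?” are displayed.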
ContributorsSharma, Shubham (Author) / Bryan, Chris (Thesis advisor) / McDaniel, Troy (Committee member) / Sarwat, Mohamed (Committee member) / Arizona State University (Publisher)
Created2022