Matching Items (2)
Description
The widespread adoption of computer vision models is often constrained by the issue of domain mismatch. Models trained on data from one distribution perform poorly when tested on data from a different distribution. Variations in vision-based data can be attributed to differences in image quality (resolution, brightness, occlusion, and color), changes in camera perspective, dissimilar backgrounds, and the inherent diversity of the samples themselves. Machine learning techniques like transfer learning are employed to adapt computational models across distributions. Domain adaptation is a special case of transfer learning, where knowledge from a source domain is transferred to a target domain in the form of learned models and efficient feature representations.

The dissertation outlines novel domain adaptation approaches across different feature spaces: (i) a linear Support Vector Machine model for domain alignment; (ii) a nonlinear kernel-based approach that embeds domain-aligned data for enhanced classification; (iii) a hierarchical model, implemented using deep learning, that estimates domain-aligned hash values for the source and target data; and (iv) a proposed feature selection technique to reduce cross-domain disparity. These adaptation procedures are tested and validated across a range of computer vision applications, such as object classification, facial expression recognition, digit recognition, and activity recognition. The dissertation also provides a unique perspective on the domain adaptation literature from the point of view of linear, nonlinear, and hierarchical feature spaces. The dissertation concludes with a discussion of future directions for research that highlight the role of domain adaptation in an era of rapid advancements in artificial intelligence.
Contributors: Demakethepalli Venkateswara, Hemanth (Author) / Panchanathan, Sethuraman (Thesis advisor) / Li, Baoxin (Committee member) / Davulcu, Hasan (Committee member) / Ye, Jieping (Committee member) / Chakraborty, Shayok (Committee member) / Arizona State University (Publisher)
Created: 2017
Description
Societal infrastructure is built with vision at the forefront of daily life. For those with severe visual impairments, this creates countless barriers to the participation in and enjoyment of life's opportunities. Technological progress has been both a blessing and a curse in this regard. Digital text, together with screen readers and refreshable Braille displays, has made whole libraries readily accessible, and rideshare technology has made independent mobility more attainable. Simultaneously, screen-based interactions and experiences have only grown in pervasiveness and importance, excluding many of those with visual impairments.

Sensory Substitution, the process of substituting an unavailable modality with another one, has shown promise as an alternative to accommodation, but in recent years meaningful strides in Sensory Substitution for vision have declined in frequency. Given recent advances in Computer Vision, this stagnation is especially disconcerting. Designing Sensory Substitution Devices (SSDs) for vision for use in interactive settings that leverage modern Computer Vision techniques presents a variety of challenges, including perceptual bandwidth, human-computer interaction, and person-centered machine learning considerations. To surmount these barriers, an approach called Personal Foveated Haptic Gaze (PFHG) is introduced. PFHG consists of two primary components: Foveated Haptic Gaze (FHG), an interaction paradigm inspired by the human visual system that is intuitive and flexible enough to generalize to a variety of applications, and a person-centered learning component that addresses the expressivity limitations of most SSDs. This component is called One-Shot Object Detection by Data Augmentation (1SODDA), a one-shot object detection approach that allows a user to specify the objects they are interested in locating visually and, with minimal effort, obtain an object detection model that does so effectively.

The Personal Foveated Haptic Gaze framework was realized in a virtual and a real-world application: playing a 3D, interactive, first-person video game (DOOM) and finding user-specified real-world objects. User study results found Foveated Haptic Gaze to be an effective and intuitive interface for interacting with a dynamic visual world using solely haptics. Additionally, 1SODDA achieves performance competitive with few-shot object detection methods and with high-framerate many-shot object detectors. Together, these pave the way for modern Sensory Substitution Devices for vision.
Contributors: Fakhri, Bijan (Author) / Panchanathan, Sethuraman (Thesis advisor) / McDaniel, Troy L (Committee member) / Venkateswara, Hemanth (Committee member) / Amor, Heni (Committee member) / Arizona State University (Publisher)
Created: 2020