Modern Sensory Substitution for Vision in Dynamic Environments

Description

Societal infrastructure is built with vision at the forefront of daily life. For those with
severe visual impairments, this creates countless barriers to participating in and
enjoying life’s opportunities. Technological progress has been both a blessing and
a curse in this regard. Digital text, together with screen readers and refreshable Braille
displays, has made whole libraries readily accessible, and rideshare technology has made
independent mobility more attainable. Simultaneously, screen-based interactions and
experiences have only grown in pervasiveness and importance, excluding many of
those with visual impairments.

Sensory Substitution, the process of substituting an unavailable modality with
another one, has shown promise as an alternative to accommodation, but in recent
years meaningful strides in Sensory Substitution for vision have become less frequent.
Given recent advances in Computer Vision, this stagnation is especially disconcerting.
Designing Sensory Substitution Devices (SSDs) for vision that leverage modern Computer
Vision techniques for use in interactive settings presents a variety of challenges,
including perceptual bandwidth, human-computer interaction, and person-centered
machine learning considerations. To surmount these barriers, an approach called
Personal Foveated Haptic Gaze (PFHG) is introduced. PFHG consists of two primary
components. The first, Foveated Haptic Gaze (FHG), is an interaction paradigm inspired
by the human visual system that is intuitive and flexible enough to generalize to a
variety of applications. The second is a person-centered learning component that
addresses the expressivity limitations of most SSDs: One-Shot Object Detection by
Data Augmentation (1SODDA), a one-shot object detection approach that allows a
user to specify the objects they are interested in locating visually and, with minimal
effort, obtain an object detection model that does so effectively.

The Personal Foveated Haptic Gaze framework was realized in both a virtual and a
real-world application: playing a 3D, interactive, first-person video game (DOOM) and
finding user-specified real-world objects. User study results showed Foveated Haptic
Gaze to be an effective and intuitive interface for interacting with a dynamic visual
world using solely haptics. Additionally, 1SODDA achieves competitive performance
among few-shot object detection methods and high-framerate many-shot object
detectors. Together, these contributions pave the way for modern Sensory Substitution
Devices for vision.