Matching Items (13)

Audiovisual perception of dysarthric speech in older adults compared to younger adults

Description

Everyday speech communication typically takes place face-to-face. Accordingly, the task of perceiving speech is a multisensory phenomenon involving both auditory and visual information. The current investigation examines how visual information influences recognition of dysarthric speech. It also explores whether the influence of visual information is dependent upon age. Forty adults participated in the study, which measured intelligibility (percent words correct) of dysarthric speech in auditory versus audiovisual conditions. Participants were then separated into two groups, older adults (ages 47 to 68) and young adults (ages 19 to 36), to examine the influence of age. Findings revealed that all participants, regardless of age, improved their ability to recognize dysarthric speech when visual speech was added to the auditory signal. The magnitude of this benefit, however, was greater for older adults than for younger adults. These results inform our understanding of how visual speech information influences understanding of dysarthric speech.

Date Created
  • 2014

Subjective and objective evaluation of visual attention models

Description

Visual attention (VA) is the study of mechanisms that allow the human visual system (HVS) to selectively process relevant visual information. This work focuses on the subjective and objective evaluation of computational VA models for the distortion-free case as well as in the presence of image distortions.

Existing VA models are traditionally evaluated by using VA metrics that quantify the match between predicted saliency and fixation data obtained from eye-tracking experiments on human observers. Though there is a considerable number of objective VA metrics, no prior study validates that these metrics are adequate for evaluating VA models. This work constructs a VA Quality (VAQ) Database by subjectively assessing the prediction performance of VA models on distortion-free images. Additionally, shortcomings in existing metrics are discussed through illustrative examples, and a new metric, which uses local weights based on fixation density and overcomes these flaws, is proposed. The proposed VA metric outperforms all other popular existing metrics in terms of correlation with subjective ratings.
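
As a rough illustration only (not the thesis's actual formulation), a metric with local weights derived from fixation density could weight agreement between the predicted saliency map and the ground-truth fixation density map more heavily in densely fixated regions; all names below are hypothetical:

```python
import numpy as np

def weighted_saliency_similarity(saliency, fixation_density, eps=1e-8):
    """Toy locally weighted comparison of a predicted saliency map against a
    ground-truth fixation density map (illustrative only, not the thesis metric)."""
    # Normalize both maps to [0, 1]
    s = (saliency - saliency.min()) / (saliency.max() - saliency.min() + eps)
    f = (fixation_density - fixation_density.min()) / (fixation_density.max() - fixation_density.min() + eps)
    # Local weights proportional to fixation density, so disagreement where
    # observers actually looked is penalized more
    w = f + eps
    w = w / w.sum()
    # Score in [0, 1]: 1 minus the weighted mean absolute difference
    return 1.0 - float(np.sum(w * np.abs(s - f)))
```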

In practice, image quality is affected by a host of factors at several stages of the image processing pipeline, such as acquisition, compression, and transmission. However, no existing study has discussed the subjective and objective evaluation of visual saliency models in the presence of distortion. In this work, a Distortion-based Visual Attention Quality (DVAQ) subjective database is constructed to evaluate the quality of VA maps for images in the presence of distortions. To create this database, saliency maps obtained from images subjected to various types of distortions (including blur, noise, and compression) at varying levels of severity were rated by human observers in terms of their visual resemblance to the corresponding ground-truth fixation density maps. The performance of traditionally used as well as recently proposed VA metrics is evaluated by correlating their scores with the human subjective ratings. In addition, an objective evaluation of 20 state-of-the-art VA models is performed using the top-performing VA metrics, together with a study of how the models’ prediction performance changes with different types and levels of distortion.
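
A minimal sketch, with placeholder data, of the kind of correlation analysis described above (the real study uses the DVAQ subjective ratings and the actual VA metric scores):

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(0)
# Placeholder data: one value per (image, distortion type, severity) condition
subjective_ratings = rng.uniform(1.0, 5.0, size=60)                  # e.g., mean opinion scores
metric_scores = subjective_ratings + rng.normal(0.0, 0.5, size=60)   # scores from some VA metric

plcc, _ = pearsonr(metric_scores, subjective_ratings)    # linear correlation
srocc, _ = spearmanr(metric_scores, subjective_ratings)  # rank-order correlation
print(f"PLCC = {plcc:.3f}, SROCC = {srocc:.3f}")
```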

Date Created
  • 2016

Eye movements and the label feedback effect: speaking modulates visual search, but probably not visual perception

Description

The label-feedback hypothesis (Lupyan, 2007) proposes that language can modulate low- and high-level visual processing, such as “priming” a visual object. Lupyan and Swingley (2012) found that repeating target names facilitates visual search, resulting in shorter reaction times (RTs) and higher accuracy. However, a design limitation made their results challenging to assess. This study evaluated whether self-directed speech influences locating the target (i.e., attentional guidance) or identifying the target once it is located (i.e., decision time), testing whether the Label Feedback Effect reflects changes in visual attention or some other mechanism (e.g., template maintenance in working memory). Across three experiments, search RTs and eye movements were analyzed from four within-subject conditions in which people spoke target names, nonwords, irrelevant (absent) object names, or irrelevant (present) object names. Speaking target names weakly facilitated visual search, whereas speaking other names strongly inhibited it. The most parsimonious account is that language affects target maintenance during search, rather than visual perception.

Date Created
  • 2016

Spatial-temporal characteristics of multisensory integration

Description

We experience spatial separation and temporal asynchrony between visual and haptic information in many virtual-reality, augmented-reality, or teleoperation systems. Three studies were conducted to examine the spatial and temporal characteristics of multisensory integration. Participants interacted with virtual springs using both visual and haptic senses, and their perception of stiffness and ability to differentiate stiffness were measured. The results revealed that a constant visual delay increased perceived stiffness, while a variable visual delay made participants depend more on haptic sensations in stiffness perception. We also found that participants judged springs to be stiffer when they interacted with them at faster speeds, and that interaction speed was positively correlated with stiffness overestimation. In addition, participants could learn an association between visual and haptic inputs even when the two were spatially separated, resulting in improved typing performance. These results expose limitations of the Maximum-Likelihood Estimation model and suggest that a Bayesian inference model should be used instead.
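
For reference, the Maximum-Likelihood Estimation model referred to here is the standard reliability-weighted cue-combination rule, in which the visual (V) and haptic (H) estimates are averaged in proportion to their reliabilities:

```latex
\hat{s}_{VH} = w_V\,\hat{s}_V + w_H\,\hat{s}_H,
\qquad
w_V = \frac{1/\sigma_V^{2}}{1/\sigma_V^{2} + 1/\sigma_H^{2}},
\quad
w_H = \frac{1/\sigma_H^{2}}{1/\sigma_V^{2} + 1/\sigma_H^{2}},
\qquad
\sigma_{VH}^{2} = \frac{\sigma_V^{2}\,\sigma_H^{2}}{\sigma_V^{2} + \sigma_H^{2}}
```

Fixed reliability weights of this kind do not readily capture the reported shift toward haptic information under variable visual delay, which is the sort of limitation that motivates the move to a Bayesian inference model.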

Date Created
  • 2017

Efficient perceptual super-resolution

Description

Super-Resolution (SR) techniques are widely developed to increase image resolution by fusing several Low-Resolution (LR) images of the same scene, in order to overcome sensor hardware limitations and reduce media impairments in a cost-effective manner. When choosing a solution for the SR problem, there is always a trade-off between computational efficiency and High-Resolution (HR) image quality. Existing SR approaches suffer from extremely high computational requirements due to the large number of unknowns to be estimated in the solution of the SR inverse problem. This thesis proposes efficient iterative SR techniques based on Visual Attention (VA) and perceptual modeling of the human visual system. In the first part of this thesis, an efficient ATtentive-SELective Perceptual-based (AT-SELP) SR framework is presented, in which only a subset of perceptually significant active pixels is selected for processing by the SR algorithm, based on a local contrast sensitivity threshold model and a proposed low-complexity saliency detector. The proposed saliency detector utilizes a probability-of-detection rule inspired by concepts of luminance masking and visual attention. The second part of this thesis further improves the efficiency of selective SR approaches by presenting an ATtentive (AT) SR framework that is driven entirely by VA region detectors. Additionally, different VA techniques that combine several low-level features, such as center-surround differences in intensity and orientation, patch luminance and contrast, bandpass outputs of patch luminance and contrast, and difference of Gaussians of luminance intensity, are integrated and analyzed to illustrate the effectiveness of the proposed selective SR frameworks. The proposed AT-SELP SR and AT-SR frameworks proved to be flexible by integrating a Maximum A Posteriori (MAP)-based SR algorithm as well as a fast two-stage Fusion-Restoration (FR) SR estimator. By adopting the proposed selective SR frameworks, simulation results show a significant reduction in computational complexity on average, with comparable visual quality in terms of quantitative metrics such as PSNR, SNR, or MAE gains, and in subjective assessment. The third part of this thesis proposes a Perceptually Weighted (PW) SR technique that incorporates unequal weighting parameters in the cost function of iterative SR problems. The proposed approach is inspired by the unequal processing of different local image features by the Human Visual System (HVS). Simulation results show enhanced reconstruction quality and faster convergence rates when the technique is applied to the MAP-based and FR-based SR schemes.
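
For context, a generic MAP-based multi-frame SR cost function can be written as below; the perceptually weighted variant described in the third part replaces the uniform data-fidelity term with per-pixel weights. The operators and symbols here are illustrative assumptions rather than the thesis's exact notation:

```latex
\hat{\mathbf{x}} = \arg\min_{\mathbf{x}}
\sum_{k=1}^{K}
\left( \mathbf{y}_k - \mathbf{D}\mathbf{H}\mathbf{F}_k \mathbf{x} \right)^{\top}
\mathbf{W}_k
\left( \mathbf{y}_k - \mathbf{D}\mathbf{H}\mathbf{F}_k \mathbf{x} \right)
+ \lambda\, \Omega(\mathbf{x})
```

Here x is the HR image, y_k the k-th LR frame, F_k a geometric warp, H a blur, D a downsampling operator, Ω a prior term, and W_k a diagonal weight matrix (the identity in the conventional, unweighted case). Concentrating weight on perceptually significant pixels focuses the iterative solver on errors that are actually visible, which is consistent with the faster convergence reported above.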

Date Created
  • 2011

Motion supports object recognition: insight into possible interactions between the two primary pathways of the human visual system

Description

The present study explores the role of motion in the perception of form from dynamic occlusion, employing color to help isolate the contributions of both visual pathways. Although the cells that respond to color cues in the environment usually feed into the ventral stream, humans can perceive motion based on chromatic cues. The current study was designed to use grey, green, and red stimuli to successively limit the amount of information available to the dorsal stream pathway, while providing roughly equal information to the ventral system. Twenty-one participants identified shapes that were presented in grey, green, and red and were defined by dynamic occlusion. The shapes were then presented again in a static condition where the maximum occlusions were presented as before, but without motion. Results showed an interaction between the motion and static conditions in that when the speed of presentation increased, performance in the motion conditions became significantly less accurate than in the static conditions. The grey and green motion conditions crossed static performance at the same point, whereas the red motion condition crossed at a much slower speed. These data are consistent with a model of neural processing in which the main visual systems share information. Moreover, they support the notion that presenting stimuli in specific colors may help isolate perceptual pathways for scientific investigation. Given the potential for chromatic cues to target specific visual systems in the performance of dynamic object recognition, exploring these perceptual parameters may help our understanding of human visual processing.

Date Created
  • 2011

The perceptual-motor effects of the Ebbinghaus illusion on golf putting

Description

Previous research has shown that perceptual illusions can enhance golf putting performance, and the effect has been explained as being due to enhanced expectancies. The present study was designed to further understand this effect by measuring putting in three additional variations of the Ebbinghaus illusion and by measuring putting kinematics. Nineteen ASU students with minimal golf experience putted under the following illusion conditions: a target alone, a target surrounded by small circles, a target surrounded by large circles, a target surrounded by both large and small circles, no target surrounded by small circles, and no target surrounded by large circles. Neither perceived target size nor putting error was significantly affected by the illusion conditions. Time to peak speed was significantly greater for the two conditions with no target and lowest for the condition with the target by itself. Suggestions for future research include using separate groups with and without perceived performance feedback, as well as general performance feedback. The size conditions used in this study should continue to be explored so that more consistent data can be collected within groups.

Date Created
  • 2019

Audiovisual sentence recognition in bimodal and bilateral cochlear implant users

Description

The present study describes audiovisual sentence recognition in normal hearing listeners, bimodal cochlear implant (CI) listeners and bilateral CI listeners. This study explores a new set of sentences (the AzAV sentences) that were created to have equal auditory intelligibility and equal gain from visual information.

The aims of Experiment I were to (i) compare the lip-reading difficulty of the AzAV sentences to that of other sentence materials, (ii) compare the speech-reading ability of CI listeners to that of normal-hearing listeners, and (iii) assess the gain in speech understanding when listeners have both auditory and visual information from easy-to-lip-read and difficult-to-lip-read sentences. In addition, the sentence lists were subjected to a multi-level text analysis to determine the factors that make sentences easy or difficult to speech-read.

The results of Experiment I showed that (i) the AzAV sentences were relatively difficult to lip read, (ii) CI listeners and normal-hearing listeners did not differ in lip-reading ability, and (iii) sentences with low lip-reading intelligibility (10–15% correct) provide about a 30 percentage point improvement in speech understanding when added to the acoustic stimulus, while sentences with high lip-reading intelligibility (30–60% correct) provide about a 50 percentage point improvement in the same comparison. The multi-level text analyses showed that the familiarity of phrases in the sentences was the primary factor affecting lip-reading difficulty.

The aim of Experiment II was to investigate the value, when visual information is present, of bimodal hearing and bilateral cochlear implants. The results of Experiment II showed that when visual information is present, low-frequency acoustic hearing can be of value to speech understanding for patients fit with a single CI. However, when visual information was available no gain was seen from the provision of a second CI, i.e., bilateral CIs. As was the case in Experiment I, visual information provided about a 30 percentage point improvement in speech understanding.

Date Created
  • 2015

Visual recognition for dynamic scenes

Description

Recognition memory was investigated for naturalistic dynamic scenes. Although visual recognition for static objects and scenes has been investigated previously and found to be extremely robust in terms of fidelity and retention, visual recognition for dynamic scenes has received much less attention. In four experiments, participants viewed a number of clips from novel films and were then asked to complete a recognition test containing frames from the previously viewed films along with difficult foil frames. Recognition performance was good when foils were taken from other parts of the same film (Experiment 1), but degraded greatly when foils were taken from unseen gaps within the viewed footage (Experiments 3 and 4). Removing all non-target frames had a serious effect on recognition performance (Experiment 2). Across all experiments, presenting the films as a random series of clips seemed to have no effect on recognition performance. Patterns of accuracy and response latency in Experiments 3 and 4 appear to be the result of a serial-search process. It is concluded that visual representations of dynamic scenes may be stored as units of events, and that participants' old/new judgments of individual frames were better characterized by a cued-recall paradigm than by traditional recognition judgments.

Date Created
  • 2014

The effects of implied motion training on general cortical processing

Description

Current research has identified a specific type of visual experience that leads to faster cortical processing: perceptual learning of a directional-motion task. This is important on two levels. First, cortical processing speed is positively correlated with cognitive functions and inversely related to age, frontal lobe lesions, and some cognitive disorders. Second, temporal processing has been shown to be relatively stable over time. To expand on this line of research, we examined the effects of a different but relevant visual experience (i.e., implied motion) on cortical processing. Previous fMRI studies have indicated that static images that imply motion activate area V5, or the middle temporal/medial superior temporal complex (MT/MST+), of the visual cortex, the same brain region that is activated in response to real motion. Therefore, we hypothesized that visual experience of implied motion may parallel the positive relationship between real directional motion and cortical processing. Seven subjects participated in a visual task of implied motion for four days, with pre- and post-tests of cortical processing. The results indicated that performance on implied motion is systematically different from performance on a dot-motion task. Despite individual differences in performance, overall cortical processing speed increased from day 1 to day 4.

Date Created
  • 2014