Matching Items (26)
152941-Thumbnail Image.png
Description
Head movement is known to have the benefit of improving the accuracy of sound localization for humans and animals. Marmoset is a small bodied New World monkey species and it has become an emerging model for studying the auditory functions. This thesis aims to detect the horizontal and vertical

Head movement is known to have the benefit of improving the accuracy of sound localization for humans and animals. Marmoset is a small bodied New World monkey species and it has become an emerging model for studying the auditory functions. This thesis aims to detect the horizontal and vertical rotation of head movement in marmoset monkeys.

Experiments were conducted in a sound-attenuated acoustic chamber. Head movement of marmoset monkey was studied under various auditory and visual stimulation conditions. With increasing complexity, these conditions are (1) idle, (2) sound-alone, (3) sound and visual signals, and (4) alert signal by opening and closing of the chamber door. All of these conditions were tested with either house light on or off. Infra-red camera with a frame rate of 90 Hz was used to capture of the head movement of monkeys. To assist the signal detection, two circular markers were attached to the top of monkey head. The data analysis used an image-based marker detection scheme. Images were processed using the Computation Vision Toolbox in Matlab. The markers and their positions were detected using blob detection techniques. Based on the frame-by-frame information of marker positions, the angular position, velocity and acceleration were extracted in horizontal and vertical planes. Adaptive Otsu Thresholding, Kalman filtering and bound setting for marker properties were used to overcome a number of challenges encountered during this analysis, such as finding image segmentation threshold, continuously tracking markers during large head movement, and false alarm detection.

The results show that the blob detection method together with Kalman filtering yielded better performances than other image based techniques like optical flow and SURF features .The median of the maximal head turn in the horizontal plane was in the range of 20 to 70 degrees and the median of the maximal velocity in horizontal plane was in the range of a few hundreds of degrees per second. In comparison, the natural alert signal - door opening and closing - evoked the faster head turns than other stimulus conditions. These results suggest that behaviorally relevant stimulus such as alert signals evoke faster head-turn responses in marmoset monkeys.
ContributorsSimhadri, Sravanthi (Author) / Zhou, Yi (Thesis advisor) / Turaga, Pavan (Thesis advisor) / Berisha, Visar (Committee member) / Arizona State University (Publisher)
Created2014
153418-Thumbnail Image.png
Description
This study consisted of several related projects on dynamic spatial hearing by both human and robot listeners. The first experiment investigated the maximum number of sound sources that human listeners could localize at the same time. Speech stimuli were presented simultaneously from different loudspeakers at multiple time intervals. The maximum

This study consisted of several related projects on dynamic spatial hearing by both human and robot listeners. The first experiment investigated the maximum number of sound sources that human listeners could localize at the same time. Speech stimuli were presented simultaneously from different loudspeakers at multiple time intervals. The maximum of perceived sound sources was close to four. The second experiment asked whether the amplitude modulation of multiple static sound sources could lead to the perception of auditory motion. On the horizontal and vertical planes, four independent noise sound sources with 60° spacing were amplitude modulated with consecutively larger phase delay. At lower modulation rates, motion could be perceived by human listeners in both cases. The third experiment asked whether several sources at static positions could serve as "acoustic landmarks" to improve the localization of other sources. Four continuous speech sound sources were placed on the horizontal plane with 90° spacing and served as the landmarks. The task was to localize a noise that was played for only three seconds when the listener was passively rotated in a chair in the middle of the loudspeaker array. The human listeners were better able to localize the sound sources with landmarks than without. The other experiments were with the aid of an acoustic manikin in an attempt to fuse binaural recording and motion data to localize sounds sources. A dummy head with recording devices was mounted on top of a rotating chair and motion data was collected. The fourth experiment showed that an Extended Kalman Filter could be used to localize sound sources in a recursive manner. The fifth experiment demonstrated the use of a fitting method for separating multiple sounds sources.
ContributorsZhong, Xuan (Author) / Yost, William (Thesis advisor) / Zhou, Yi (Committee member) / Dorman, Michael (Committee member) / Helms Tillery, Stephen (Committee member) / Arizona State University (Publisher)
Created2015
153939-Thumbnail Image.png
Description
Sound localization can be difficult in a reverberant environment. Fortunately listeners can utilize various perceptual compensatory mechanisms to increase the reliability of sound localization when provided with ambiguous physical evidence. For example, the directional information of echoes can be perceptually suppressed by the direct sound to achieve a single, fused

Sound localization can be difficult in a reverberant environment. Fortunately listeners can utilize various perceptual compensatory mechanisms to increase the reliability of sound localization when provided with ambiguous physical evidence. For example, the directional information of echoes can be perceptually suppressed by the direct sound to achieve a single, fused auditory event in a process called the precedence effect (Litovsky et al., 1999). Visual cues also influence sound localization through a phenomenon known as the ventriloquist effect. It is classically demonstrated by a puppeteer who speaks without visible lip movements while moving the mouth of a puppet synchronously with his/her speech (Gelder and Bertelson, 2003). If the ventriloquist is successful, sound will be “captured” by vision and be perceived to be originating at the location of the puppet. This thesis investigates the influence of vision on the spatial localization of audio-visual stimuli. Participants seated in a sound-attenuated room indicated their perceived locations of either ISI or level-difference stimuli in free field conditions. Two types of stereophonic phantom sound sources, created by modulating the inter-stimulus time interval (ISI) or level difference between two loudspeakers, were used as auditory stimuli. The results showed that the light cues influenced auditory spatial perception to a greater extent for the ISI stimuli than the level difference stimuli. A binaural signal analysis further revealed that the greater visual bias for the ISI phantom sound sources was correlated with the increasingly ambiguous binaural cues of the ISI signals. This finding suggests that when sound localization cues are unreliable, perceptual decisions become increasingly biased towards vision for finding a sound source. These results support the cue saliency theory underlying cross-modal bias and extend this theory to include stereophonic phantom sound sources.
ContributorsMontagne, Christopher (Author) / Zhou, Yi (Thesis advisor) / Buneo, Christopher A (Thesis advisor) / Yost, William A. (Committee member) / Arizona State University (Publisher)
Created2015
155938-Thumbnail Image.png
Description
Music training is associated with measurable physiologic changes in the auditory pathway. Benefits of music training have also been demonstrated in the areas of working memory, auditory attention, and speech perception in noise. The purpose of this study was to determine whether long-term auditory experience secondary to music

Music training is associated with measurable physiologic changes in the auditory pathway. Benefits of music training have also been demonstrated in the areas of working memory, auditory attention, and speech perception in noise. The purpose of this study was to determine whether long-term auditory experience secondary to music training enhances the ability to detect, learn, and recall new words.

Participants consisted of 20 young adult musicians and 20 age-matched non-musicians. In addition to completing word recognition and non-word detection tasks, each participant learned 10 nonsense words in a rapid word-learning task. All tasks were completed in quiet and in multi-talker babble. Next-day retention of the learned words was examined in isolation and in context. Cortical auditory evoked responses to vowel stimuli were recorded to obtain latencies and amplitudes for the N1, P2, and P3a components. Performance was compared across groups and listening conditions. Correlations between the behavioral tasks and the cortical auditory evoked responses were also examined.

No differences were found between groups (musicians vs. non-musicians) on any of the behavioral tasks. Nor did the groups differ in cortical auditory evoked response latencies or amplitudes, with the exception of P2 latencies, which were significantly longer in musicians than in non-musicians. Performance was significantly poorer in babble than in quiet on word recognition and non-word detection, but not on word learning, learned-word retention, or learned-word detection. CAEP latencies collapsed across group were significantly longer and amplitudes were significantly smaller in babble than in quiet. P2 latencies in quiet were positively correlated with word recognition in quiet, while P3a latencies in babble were positively correlated with word recognition and learned-word detection in babble. No other significant correlations were observed between CAEPs and performance on behavioral tasks.

These results indicated that, for young normal-hearing adults, auditory experience resulting from long-term music training did not provide an advantage for learning new information in either favorable (quiet) or unfavorable (babble) listening conditions. Results of the present study suggest that the relationship between music training and the strength of cortical auditory evoked responses may be more complex or too weak to be observed in this population.
ContributorsStewart, Elizabeth (Author) / Pittman, Andrea (Thesis advisor) / Cone, Barbara (Committee member) / Zhou, Yi (Committee member) / Arizona State University (Publisher)
Created2017
156081-Thumbnail Image.png
Description
Auditory scene analysis (ASA) is the process through which listeners parse and organize their acoustic environment into relevant auditory objects. ASA functions by exploiting natural regularities in the structure of auditory information. The current study investigates spectral envelope and its contribution to the perception of changes in pitch and loudness.

Auditory scene analysis (ASA) is the process through which listeners parse and organize their acoustic environment into relevant auditory objects. ASA functions by exploiting natural regularities in the structure of auditory information. The current study investigates spectral envelope and its contribution to the perception of changes in pitch and loudness. Experiment 1 constructs a perceptual continuum of twelve f0- and intensity-matched vowel phonemes (i.e. a pure timbre manipulation) and reveals spectral envelope as a primary organizational dimension. The extremes of this dimension are i (as in “bee”) and Ʌ (“bun”). Experiment 2 measures the strength of the relationship between produced f0 and the previously observed phonetic-pitch continuum at three different levels of phonemic constraint. Scat performances and, to a lesser extent, recorded interviews were found to exhibit changes in accordance with the natural regularity; specifically, f0 changes were correlated with the phoneme pitch-height continuum. The more constrained case of lyrical singing did not exhibit the natural regularity. Experiment 3 investigates participant ratings of pitch and loudness as stimuli vary in f0, intensity, and the phonetic-pitch continuum. Psychophysical functions derived from the results reveal that moving from i to Ʌ is equivalent to a .38 semitone decrease in f0 and a .75 dB decrease in intensity. Experiment 4 examines the potentially functional aspect of the pitch, loudness, and spectral envelope relationship. Detection thresholds of stimuli in which all three dimensions change congruently (f0 increase, intensity increase, Ʌ to i) or incongruently (no f0 change, intensity increase, i to Ʌ) are compared using an objective version of the method of limits. Congruent changes did not provide a detection benefit over incongruent changes; however, when the contribution of phoneme change was removed, congruent changes did offer a slight detection benefit, as in previous research. While this relationship does not offer a detection benefit at threshold, there is a natural regularity for humans to produce phonemes at higher f0s according to their relative position on the pitch height continuum. Likewise, humans have a bias to detect pitch and loudness changes in phoneme sweeps in accordance with the natural regularity.
ContributorsPatten, K. Jakob (Author) / Mcbeath, Michael K (Thesis advisor) / Amazeen, Eric L (Committee member) / Glenberg, Arthur W (Committee member) / Zhou, Yi (Committee member) / Arizona State University (Publisher)
Created2017
133916-Thumbnail Image.png
Description
The purpose of the present study was to determine if vocabulary knowledge is related to degree of hearing loss. A 50-question multiple-choice vocabulary test comprised of old and new words was administered to 43 adults with hearing loss (19 to 92 years old) and 51 adults with normal hearing (20

The purpose of the present study was to determine if vocabulary knowledge is related to degree of hearing loss. A 50-question multiple-choice vocabulary test comprised of old and new words was administered to 43 adults with hearing loss (19 to 92 years old) and 51 adults with normal hearing (20 to 40 years old). Degree of hearing loss ranged from mild to moderately-severe as determined by bilateral pure-tone thresholds. Education levels ranged from some high school to graduate degrees. It was predicted that knowledge of new words would decrease with increasing hearing loss, whereas knowledge of old words would be unaffected. The Test of Contemporary Vocabulary (TCV) was developed for this study and contained words with old and new definitions. The vocabulary scores were subjected to repeated-measures ANOVA with definition type (old and new) as the within-subjects factor. Hearing level and education were between-subjects factors, while age was entered as a covariate. The results revealed no main effect of age or education level, while a significant main effect of hearing level was observed. Specifically, performance for new words decreased significantly as degree of hearing loss increased. A similar effect was not observed for old words. These results indicate that knowledge of new definitions is inversely related to degree of hearing loss.
ContributorsMarzan, Nicole Ann (Author) / Pittman, Andrea (Thesis director) / Azuma, Tamiko (Committee member) / Wexler, Kathryn (Committee member) / Department of Speech and Hearing Science (Contributor) / Barrett, The Honors College (Contributor)
Created2018-05
135844-Thumbnail Image.png
Description
Head turning is a common sound localization strategy in primates. A novel system that can track head movement and acoustic signals received at the entrance to the ear canal was tested to obtain binaural sound localization information during fast head movement of marmoset monkey. Analysis of binaural information was conducted

Head turning is a common sound localization strategy in primates. A novel system that can track head movement and acoustic signals received at the entrance to the ear canal was tested to obtain binaural sound localization information during fast head movement of marmoset monkey. Analysis of binaural information was conducted with a focus on inter-aural level difference (ILD) and inter-aural time difference (ITD) at various head positions over time. The results showed that during fast head turns, the ITDs showed significant and clear changes in trajectory in response to low frequency stimuli. However, significant phase ambiguity occurred at frequencies greater than 2 kHz. Analysis of ITD and ILD information with animal vocalization as the stimulus was also tested. The results indicated that ILDs may provide more information in understanding the dynamics of head movement in response to animal vocalizations in the environment. The primary significance of this experimentation is the successful implementation of a system capable of simultaneously recording head movement and acoustic signals at the ear canals. The collected data provides insight into the usefulness of ITD and ILD as binaural cues during head movement.
ContributorsLabban, Kyle John (Author) / Zhou, Yi (Thesis director) / Buneo, Christopher (Committee member) / Dorman, Michael (Committee member) / Harrington Bioengineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created2016-05
134484-Thumbnail Image.png
Description
The purpose of the present study was to determine if an automated speech perception task yields results that are equivalent to a word recognition test used in audiometric evaluations. This was done by testing 51 normally hearing adults using a traditional word recognition task (NU-6) and an automated Non-Word Detection

The purpose of the present study was to determine if an automated speech perception task yields results that are equivalent to a word recognition test used in audiometric evaluations. This was done by testing 51 normally hearing adults using a traditional word recognition task (NU-6) and an automated Non-Word Detection task. Stimuli for each task were presented in quiet as well as in six signal-to-noise ratios (SNRs) increasing in 3 dB increments (+0 dB, +3 dB, +6 dB, +9 dB, + 12 dB, +15 dB). A two one-sided test procedure (TOST) was used to determine equivalency of the two tests. This approach required the performance for both tasks to be arcsine transformed and converted to z-scores in order to calculate the difference in scores across listening conditions. These values were then compared to a predetermined criterion to establish if equivalency exists. It was expected that the TOST procedure would reveal equivalency between the traditional word recognition task and the automated Non-Word Detection Task. The results confirmed that the two tasks differed by no more than 2 test items in any of the listening conditions. Overall, the results indicate that the automated Non-Word Detection task could be used in addition to, or in place of, traditional word recognition tests. In addition, the features of an automated test such as the Non-Word Detection task offer additional benefits including rapid administration, accurate scoring, and supplemental performance data (e.g., error analyses) beyond those obtained in traditional speech perception measures.
ContributorsStahl, Amy Nicole (Author) / Pittman, Andrea (Thesis director) / Boothroyd, Arthur (Committee member) / McBride, Ingrid (Committee member) / School of Human Evolution and Social Change (Contributor) / Department of Speech and Hearing Science (Contributor) / Barrett, The Honors College (Contributor)
Created2017-05
137603-Thumbnail Image.png
Description
The purpose of this study was to explore the effects of word type, phonotactic probability, word frequency, and neighborhood density on the vocabularies of children with mild-to-moderate hearing loss compared to children with normal hearing. This was done by assigning values for these parameters to each test item on the

The purpose of this study was to explore the effects of word type, phonotactic probability, word frequency, and neighborhood density on the vocabularies of children with mild-to-moderate hearing loss compared to children with normal hearing. This was done by assigning values for these parameters to each test item on the Peabody Picture Vocabulary Test (Version III, Form B) to quantify and characterize the performance of children with hearing loss relative to that of children with normal hearing. It was expected that PPVT IIIB scores would: 1) Decrease as the degree of hearing loss increased. 2) Increase as a function of age 3) Be more positively related to nouns than to verbs or attributes. 4) Be negatively related to phonotactic probability. 5) Be negatively related to word frequency 6) Be negatively related to neighborhood density. All but one of the expected outcomes was observed. PPVT IIIB performance decreased as hearing loss increased, and increased with age. Performance for nouns, verbs, and attributes increased with PPVT IIIB performance, whereas neighborhood density decreased. Phonotactic probability was expected to decrease as PPVT IIIB performance increased, but instead it increased due to the confounding effects of word length and the order of words on the test. Age and hearing level were rejected by the multiple regression analyses as contributors to PPVT IIIB performance for the children with hearing loss. Overall, the results indicate that there is a 2-year difference in vocabulary age between children with normal hearing and children with hearing loss, and that this may be due to factors external to the child (such as word frequency and phonotactic probability) rather than the child's age and hearing level. This suggests that children with hearing loss need continued clinical services (amplification) as well as additional support services in school throughout childhood.
ContributorsLatto, Allison Renee (Author) / Pittman, Andrea (Thesis director) / Gray, Shelley (Committee member) / Brinkley, Shara (Committee member) / Barrett, The Honors College (Contributor) / Department of Speech and Hearing Science (Contributor)
Created2013-05
135494-Thumbnail Image.png
Description
Hearing and vision are two senses that most individuals use on a daily basis. The simultaneous presentation of competing visual and auditory stimuli often affects our sensory perception. It is often believed that vision is the more dominant sense over audition in spatial localization tasks. Recent work suggests that visual

Hearing and vision are two senses that most individuals use on a daily basis. The simultaneous presentation of competing visual and auditory stimuli often affects our sensory perception. It is often believed that vision is the more dominant sense over audition in spatial localization tasks. Recent work suggests that visual information can influence auditory localization when the sound is emanating from a physical location or from a phantom location generated through stereophony (the so-called "summing localization"). The present study investigates the role of cross-modal fusion in an auditory localization task. The focuses of the experiments are two-fold: (1) reveal the extent of fusion between auditory and visual stimuli and (2) investigate how fusion is correlated with the amount of visual bias a subject experiences. We found that fusion often occurs when light flash and "summing localization" stimuli were presented from the same hemifield. However, little correlation was observed between the magnitude of visual bias and the extent of perceived fusion between light and sound stimuli. In some cases, subjects reported distinctive locations for light and sound and still experienced visual capture.
ContributorsBalderas, Leslie Ann (Author) / Zhou, Yi (Thesis director) / Yost, William (Committee member) / Department of Speech and Hearing Science (Contributor) / Barrett, The Honors College (Contributor)
Created2016-05