This collection includes both ASU Theses and Dissertations, submitted by graduate students, and Barrett, The Honors College theses, submitted by undergraduate students.

Description
The marmoset monkey (Callithrix jacchus) is a New World primate species native to South American rainforests. Because they rely on vocal communication to navigate and survive, marmosets have emerged as a promising primate model for studying vocal production, perception, cognition, and social interaction. The purpose of this project is to provide an initial assessment of the vocal repertoire of a marmoset colony raised at Arizona State University and of the call types the monkeys use in different social conditions. The vocal production of a colony of 16 marmoset monkeys was recorded in three conditions, with three repeats of each condition. The positive condition involved a caretaker distributing food, the negative condition involved an experimenter taking a marmoset out of its cage to a different room, and the control condition was the normal state of the colony with no human interference. A total of 5,396 call samples were collected during 256 minutes of audio recordings. Call types were analyzed with semi-automated computer programs developed in the Laboratory of Auditory Computation and Neurophysiology. Five major call types were identified, and their variants in different social conditions were analyzed. The results showed that both the total number of calls and the types of calls produced differed across the three social conditions, suggesting that marmoset vocalization both signals and depends on the social context.
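A minimal sketch of the kind of condition-by-call-type tally described above, assuming the calls have already been labeled; the records, call-type names, and the chi-square comparison are illustrative placeholders, not the thesis pipeline:

```python
from collections import Counter
from scipy.stats import chi2_contingency

# Hypothetical (condition, call_type) labels standing in for the colony recordings.
records = [
    ("positive", "phee"), ("positive", "twitter"), ("positive", "trill"),
    ("negative", "tsik"), ("negative", "phee"), ("negative", "tsik"),
    ("control", "trill"), ("control", "phee"), ("control", "twitter"),
]

conditions = sorted({c for c, _ in records})
call_types = sorted({t for _, t in records})

# Build a condition x call-type contingency table of call counts.
counts = {c: Counter(t for cond, t in records if cond == c) for c in conditions}
table = [[counts[c][t] for t in call_types] for c in conditions]

# Test whether call-type usage differs across the three social conditions.
chi2, p, _, _ = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```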
Contributors: Fernandez, Jessmin Natalie (Author) / Zhou, Yi (Thesis director) / Berisha, Visar (Committee member) / School of International Letters and Cultures (Contributor) / Department of Psychology (Contributor) / School of Life Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created: 2019-05
Description
Multisensory integration is the process by which information from different sensory modalities is integrated by the nervous system. This process is important not only from a basic science perspective but also for translational reasons, e.g., for the development of closed-loop neural prosthetic systems. A mixed virtual reality platform was developed to study the neural mechanisms of multisensory integration for the upper limb during motor planning. The platform allows for selection of different arms and manipulation of the locations of physical and virtual target cues in the environment. The system was tested with two non-human primates (NHPs) trained to reach to multiple virtual targets. Arm kinematic data as well as neural spiking data from primary motor cortex (M1) and dorsal premotor cortex (PMd) were collected. The task involved manipulating visual information about initial arm position by rendering the virtual avatar arm either in its actual position (veridical (V) condition) or in a shifted (e.g., small vs. large shift) position (perturbed (P) condition) prior to movement. Tactile feedback was modulated in blocks by placing or removing the physical start cue on the table (tactile (T) and no-tactile (NT) conditions, respectively). Behaviorally, errors in initial movement direction were larger when the physical start cue was absent. Slightly larger directional errors were found in the P condition than in the V condition for some movement directions. Both effects were consistent with the idea that erroneous or reduced information about initial hand location led to movement direction-dependent reach planning errors. Neural correlates of these behavioral effects were probed using population decoding techniques. For small shifts in the visual position of the arm, no differences in decoding accuracy between the T and NT conditions were observed in either M1 or PMd. However, for larger visual shifts, decoding accuracy decreased in the NT condition, but only in PMd. Thus, activity in PMd, but not M1, may reflect the uncertainty in reach planning that results when sensory cues regarding initial hand position are erroneous or absent.
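A minimal sketch of the population decoding step mentioned above: predicting the reach target from trial-wise spike counts with a cross-validated linear decoder. The synthetic data, neuron counts, and classifier choice are assumptions for illustration, not the decoding method actually used in the dissertation:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials, n_neurons, n_targets = 200, 50, 8

# Synthetic spike counts: each target evokes a different (random) population pattern.
targets = rng.integers(0, n_targets, size=n_trials)
tuning = rng.normal(size=(n_targets, n_neurons))
rates = np.clip(5 + tuning[targets], 0.1, None)
spike_counts = rng.poisson(rates)                      # trials x neurons

# Cross-validated decoding accuracy (chance = 1/8). Comparing this accuracy
# across tactile/no-tactile blocks or shift sizes mirrors the analysis above.
acc = cross_val_score(LinearDiscriminantAnalysis(), spike_counts, targets, cv=5)
print(f"decoding accuracy: {acc.mean():.2f} +/- {acc.std():.2f}")
```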
Contributors: Phataraphruk, Preyaporn Kris (Author) / Buneo, Christopher A (Thesis advisor) / Zhou, Yi (Committee member) / Helms Tillery, Steve (Committee member) / Greger, Bradley (Committee member) / Santello, Marco (Committee member) / Arizona State University (Publisher)
Created: 2023
Description
Callithrix jacchus, also known as the common marmoset, is native to the New World. These marmosets possess a wide vocal repertoire that is interesting to observe for the purpose of understanding their group communication and their fight-or-flight responses to the environment around them. In this project, I continued the work of a previous student, Jasmin, to collect more data for her study. For the most part, my project entailed recording the marmosets' calls and labeling them by call type.
Contributors: Tran, Anh (Author) / Zhou, Yi (Thesis director) / Berisha, Visar (Committee member) / Barrett, The Honors College (Contributor)
Created: 2021-05
Description
Diffusion tensor imaging (DTI) may be used to understand brain differences in Parkinson's disease (PD). Within the last couple of decades there has been an explosion of development in neuroimaging techniques. Today, it is possible to monitor and track where the brain needs blood during a specific task, without much delay, using functional magnetic resonance imaging (fMRI). It is also possible to track and visualize where, and at which orientation, water molecules in the brain are moving, as in DTI. Data on diseases such as PD have grown considerably, and it is now known that people with PD can be assessed with cognitive tests in combination with neuroimaging to determine whether they have cognitive decline in addition to any decline in motor ability. The Montreal Cognitive Assessment (MoCA), Modified Semantic Fluency Test (MSF), and Mini-Mental State Exam (MMSE) are the primary tools and are often combined with fMRI or DTI to diagnose whether people with PD also have a mild cognitive impairment (MCI). The current thesis explored a cohort of PD patients classified based on their MoCA, MSF, and lexical fluency (LF) scores. The results indicate specific brain differences between PD patients who scored low and those who scored high on LF and MoCA. The current study's findings add to the existing literature suggesting that DTI may be more sensitive in detecting differences based on clinical scores.
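A minimal sketch of the group comparison implied above: splitting patients into low and high scorers on a clinical test and comparing a DTI metric such as fractional anisotropy (FA) between the groups. The data, the MoCA cutoff of 26, and the single-tract t-test are illustrative assumptions, not the thesis analysis:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical MoCA scores and tract-averaged FA values for 40 PD patients.
moca = rng.integers(18, 31, size=40)
fa = 0.45 + 0.004 * (moca - 24) + rng.normal(scale=0.02, size=40)

# Split into high/low scorers (26 is a commonly used MoCA cutoff; assumed here).
high, low = fa[moca >= 26], fa[moca < 26]

# Welch's t-test on FA between the two groups.
t, p = stats.ttest_ind(high, low, equal_var=False)
print(f"FA, high vs. low MoCA scorers: t = {t:.2f}, p = {p:.3f}")
```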
Contributors: Andrade, Eric (Author) / Oforoi, Edward (Thesis advisor) / Zhou, Yi (Committee member) / Liss, Julie (Committee member) / Arizona State University (Publisher)
Created: 2022
Description
The ability to detect and correct errors during and after speech production is essential for maintaining accuracy and avoiding disruption in communication. Thus, it is crucial to understand the basic mechanisms by which the speech-motor system evaluates different errors and correspondingly corrects them. This study aims to explore the impact of three features of errors, introduced by formant perturbations, on corrective and adaptive responses: (1) the magnitude of errors, (2) the direction of errors, and (3) the extent of exposure to errors. Participants were asked to produce the vowel /ε/ in the context of consonant-vowel-consonant words. Participant-specific formant perturbations were applied at three magnitudes (0.5, 1, and 1.5) along the /ε-æ/ line in two directions: a simultaneous F1-F2 shift (i.e., a shift in the /ε-æ/ direction) and a shift to outside the vowel space. Perturbations were applied randomly in a compensation paradigm, so each perturbed trial was preceded and followed by several unperturbed trials. It was observed that (1) corrective and adaptive responses were larger for larger-magnitude errors, (2) corrective and adaptive responses were larger for errors in the /ε-æ/ direction, (3) corrective and adaptive responses were generally in the /ε-ɪ/ direction regardless of perturbation direction and magnitude, and (4) corrective responses were larger for perturbations in the earlier trials of the experiment.
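A minimal sketch of how participant-specific perturbations along the /ε-æ/ line could be computed from a speaker's own vowel centers; the (F1, F2) values below are placeholders, and the study's actual perturbation software is not shown:

```python
import numpy as np

# Hypothetical participant-specific vowel centers in (F1, F2) Hz.
eh = np.array([580.0, 1800.0])   # /ε/
ae = np.array([700.0, 1650.0])   # /æ/

# Unit vector from /ε/ toward /æ/ and the distance between the two vowels.
direction = (ae - eh) / np.linalg.norm(ae - eh)
distance = np.linalg.norm(ae - eh)

# Perturbation magnitudes are expressed as fractions of the /ε-æ/ distance.
for magnitude in (0.5, 1.0, 1.5):
    perturbed = eh + magnitude * distance * direction   # formants fed back to the speaker
    print(f"magnitude {magnitude}: F1 = {perturbed[0]:.0f} Hz, F2 = {perturbed[1]:.0f} Hz")
```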
Contributors: Sreedhar, Anuradha Jyothi (Author) / Daliri, Ayoub (Thesis advisor) / Rogalsky, Corianne (Committee member) / Zhou, Yi (Committee member) / Arizona State University (Publisher)
Created: 2024
Description
Cochlear implants (CIs) restore hearing to nearly one million individuals with severe-to-profound hearing loss. However, with limited spectral and temporal resolution, CI users may rely heavily on top-down processing using cognitive resources for speech recognition in noise, and may change the weighting of different acoustic cues for pitch-related listening tasks such as Mandarin tone recognition. While auditory training is known to improve CI users' performance in these tasks as measured by percent-correct scores, the effects of training on cue weighting, listening effort, and untrained tasks need to be better understood in order to maximize the training benefits. This dissertation addressed these questions by training normal-hearing (NH) listeners on CI simulations. Study 1 examined whether Mandarin tone recognition training with enhanced amplitude envelope cues would improve tone recognition scores and increase the weighting of amplitude envelope cues over fundamental frequency (F0) contours. Compared to no training or natural-amplitude-envelope training, enhanced-amplitude-envelope training increased the benefits of amplitude envelope enhancement for tone recognition but did not increase the weighting of amplitude or F0 cues. Listeners attending more to amplitude envelope cues in the pre-test improved more in tone recognition after enhanced-amplitude-envelope training. Study 2 extended Study 1 to compare the generalization effects of tone recognition training alone, vowel recognition training alone, and combined tone and vowel recognition training. The results showed that tone recognition training did not improve vowel recognition or vice versa, although tones and vowels are always produced together in Mandarin. Only combined tone and vowel recognition training improved sentence recognition, showing that both suprasegmental (i.e., tone) and segmental (i.e., vowel) cues were essential for sentence recognition in Mandarin. Study 3 investigated the impact of phoneme recognition training on the listening effort of sentence recognition in noise, as measured by a dual-task paradigm, pupillometry, and subjective ratings. It was found that phoneme recognition training improved sentence recognition in noise. The dual-task paradigm and pupillometry indicated that, from pre-test to post-test, listening effort decreased in the control group without training but remained unchanged in the training group. This suggests that training may have motivated listeners to stay focused on the challenging task of sentence recognition in noise. Overall, non-clinical measures such as cue weighting and listening effort can enrich our understanding of training-induced perceptual and cognitive effects, and allow us to better predict and assess training outcomes.
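A minimal sketch of extracting the amplitude envelope that the enhanced-envelope training in Study 1 emphasizes, using a Hilbert transform followed by low-pass smoothing. The toy stimulus, the 50 Hz cutoff, and the squaring used to exaggerate the contour are assumptions, not the dissertation's actual processing:

```python
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

fs = 16000
t = np.arange(0, 0.5, 1 / fs)

# Toy stimulus: a rising-F0 carrier with a slowly varying amplitude envelope.
carrier = np.sin(2 * np.pi * (150 + 100 * t) * t)
signal = (0.5 + 0.5 * np.sin(2 * np.pi * 3 * t)) * carrier

# Amplitude envelope: magnitude of the analytic signal, then 50 Hz low-pass.
envelope = np.abs(hilbert(signal))
b, a = butter(4, 50 / (fs / 2))
envelope = filtfilt(b, a, envelope)

# One crude way to "enhance" (exaggerate) the envelope contour before
# re-imposing it on the carrier.
enhanced = envelope ** 2
enhanced *= envelope.max() / enhanced.max()
enhanced_signal = enhanced * carrier
```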
Contributors: Kim, Seeon (Author) / Luo, Xin (Thesis advisor) / Azuma, Tamiko (Committee member) / Zhou, Yi (Committee member) / Arizona State University (Publisher)
Created: 2024
Description
The poor spectral and temporal resolution of cochlear implants (CIs) limits their users' music enjoyment. Remixing music by boosting vocals while attenuating spectrally complex instruments has been shown to benefit the music enjoyment of postlingually deaf CI users. However, the effectiveness of music remixing for prelingually deaf CI users is still unknown. This study compared the music-remixing preferences of nine postlingually deaf, late-implanted CI users and seven prelingually deaf, early-implanted CI users, as well as their ratings of song familiarity and vocal pleasantness. Twelve songs were selected from the most-streamed tracks on Spotify for testing. There were six remixed versions of each song: Original, Music-6 (6-dB attenuation of all instruments), Music-12 (12-dB attenuation of all instruments), Music-3-3-12 (3-dB attenuation of bass and drums and 12-dB attenuation of other instruments), Vocals-6 (6-dB attenuation of vocals), and Vocals-12 (12-dB attenuation of vocals). It was found that the prelingual group preferred the Music-6 and Original versions over the other versions, while the postlingual group preferred the Vocals-12 version over the Music-12 version. The prelingual group was more familiar with the songs than the postlingual group. However, the song familiarity rating did not significantly affect the patterns of preference ratings in each group. The prelingual group also gave higher vocal pleasantness ratings than the postlingual group. For the prelingual group, higher vocal pleasantness led to higher preference ratings for the Music-12 version. For the postlingual group, the overall preference for the Vocals-12 version was driven by preference ratings for songs with very unpleasant vocals. These results suggest that the patient factor of auditory experience and the stimulus factor of vocal pleasantness may affect the music-remixing preferences of CI users. As such, the music-remixing strategy needs to be customized for individual patients and songs.
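A minimal sketch of how the remixed versions could be generated from separated stems, assuming stem separation has already been done. The stem names, the helper functions, and the random placeholder audio are illustrative; only the dB values follow the conditions listed above:

```python
import numpy as np

def db_to_gain(db: float) -> float:
    """Convert a level change in dB to a linear amplitude gain."""
    return 10.0 ** (db / 20.0)

def remix(stems: dict, attenuation_db: dict) -> np.ndarray:
    """Sum stems after applying per-stem attenuation (positive dB = quieter)."""
    return sum(stems[name] * db_to_gain(-attenuation_db.get(name, 0.0))
               for name in stems)

# Placeholder stems (1 s of noise each) standing in for separated audio tracks.
fs, n = 44100, 44100
stems = {k: np.random.randn(n) * 0.1 for k in ("vocals", "bass", "drums", "other")}

# Example conditions from the study: Music-3-3-12 and Vocals-12.
music_3_3_12 = remix(stems, {"bass": 3, "drums": 3, "other": 12})
vocals_12 = remix(stems, {"vocals": 12})
```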
Contributors: Vecellio, Amanda Paige (Author) / Luo, Xin (Thesis advisor) / Ringenbach, Shannon (Committee member) / Berisha, Visar (Committee member) / Zhou, Yi (Committee member) / Arizona State University (Publisher)
Created: 2024
Description
Speech intelligibility measures how well a speaker can be understood by a listener. Traditional measures of intelligibility, such as word accuracy, are not sufficient to reveal the reasons for intelligibility degradation. This dissertation investigates the underlying sources of intelligibility degradation from the perspectives of both the speaker and the listener. Measures of segmental phoneme errors and suprasegmental lexical boundary errors are developed to reveal the perceptual strategies of the listener. A comprehensive set of automated acoustic measures is developed to quantify variations in the acoustic signal along three perceptual dimensions: articulation, prosody, and vocal quality. The developed measures have been validated on a dysarthric speech dataset spanning a range of severities. Multiple regression analysis is employed to show that the developed measures can predict perceptual ratings reliably. The relationship between the acoustic measures and the listening errors is investigated to show the interaction between speech production and perception. The hypothesis is that segmental phoneme errors are mainly caused by imprecise articulation, while suprasegmental lexical boundary errors are due to unreliable phonemic information as well as abnormal rhythm and prosody patterns. To test the hypothesis, within-speaker variations are simulated in different speaking modes. Significant changes were detected in both the acoustic signals and the listening errors. Results of the regression analysis support the hypothesis by showing that changes in the articulation-related acoustic features are important in predicting changes in phoneme errors, while changes in both articulation- and prosody-related features are important in predicting changes in lexical boundary errors. Moreover, significant correlations were achieved in the cross-validation experiment, which indicates that it is possible to predict intelligibility variations from the acoustic signal.
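A minimal sketch of the regression step described above: predicting listener ratings from automated acoustic measures and checking the prediction by cross-validation. The feature columns, the synthetic data, and the plain linear model are placeholders, not the dissertation's actual measures:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(2)
n_speakers = 60

# Columns stand in for articulation-, prosody-, and voice-quality-related measures.
X = rng.normal(size=(n_speakers, 3))
ratings = 2.0 + 1.5 * X[:, 0] + 0.8 * X[:, 1] + rng.normal(scale=0.5, size=n_speakers)

# Cross-validated predictions of the perceptual ratings from the acoustic measures.
predicted = cross_val_predict(LinearRegression(), X, ratings, cv=5)
r = np.corrcoef(predicted, ratings)[0, 1]
print(f"cross-validated correlation with perceptual ratings: r = {r:.2f}")
```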
Contributors: Jiao, Yishan (Author) / Berisha, Visar (Thesis advisor) / Liss, Julie (Thesis advisor) / Zhou, Yi (Committee member) / Arizona State University (Publisher)
Created: 2019
Description
Head movement is known to improve the accuracy of sound localization for humans and animals. The marmoset is a small-bodied New World monkey species that has become an emerging model for studying auditory function. This thesis aims to detect the horizontal and vertical rotation of head movement in marmoset monkeys.

Experiments were conducted in a sound-attenuated acoustic chamber. Head movement of marmoset monkeys was studied under various auditory and visual stimulation conditions. In order of increasing complexity, these conditions were (1) idle, (2) sound alone, (3) sound and visual signals, and (4) an alert signal produced by opening and closing the chamber door. All of these conditions were tested with the house light either on or off. An infrared camera with a frame rate of 90 Hz was used to capture the head movement of the monkeys. To assist signal detection, two circular markers were attached to the top of the monkey's head. The data analysis used an image-based marker detection scheme. Images were processed using the Computer Vision Toolbox in MATLAB. The markers and their positions were detected using blob detection techniques. Based on the frame-by-frame marker positions, angular position, velocity, and acceleration were extracted in the horizontal and vertical planes. Adaptive Otsu thresholding, Kalman filtering, and bounds on marker properties were used to overcome a number of challenges encountered during this analysis, such as finding the image segmentation threshold, continuously tracking markers during large head movements, and rejecting false detections.

The results show that the blob detection method together with Kalman filtering yielded better performance than other image-based techniques such as optical flow and SURF features. The median maximal head turn in the horizontal plane was in the range of 20 to 70 degrees, and the median maximal velocity in the horizontal plane was on the order of a few hundred degrees per second. In comparison, the natural alert signal, the door opening and closing, evoked faster head turns than the other stimulus conditions. These results suggest that behaviorally relevant stimuli such as alert signals evoke faster head-turn responses in marmoset monkeys.
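A minimal sketch of the tracking idea described in this abstract: a constant-velocity Kalman filter smoothing the blob-detected marker centroid frame by frame. The thesis used MATLAB's Computer Vision Toolbox; this Python version, with assumed noise parameters, is only illustrative:

```python
import numpy as np

dt = 1 / 90.0                                   # 90 Hz camera frame rate
F = np.array([[1, 0, dt, 0],                    # constant-velocity model,
              [0, 1, 0, dt],                    # state = [x, y, vx, vy]
              [0, 0, 1, 0],
              [0, 0, 0, 1]])
H = np.array([[1, 0, 0, 0],                     # only (x, y) is observed
              [0, 1, 0, 0]])
Q = np.eye(4) * 1e-2                            # process noise (assumed)
R = np.eye(2) * 2.0                             # measurement noise (assumed)

x = np.zeros(4)                                 # initial state and covariance
P = np.eye(4) * 100.0

def kalman_step(z):
    """One predict/update cycle for a new marker detection z = (x_pix, y_pix)."""
    global x, P
    x, P = F @ x, F @ P @ F.T + Q               # predict
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)              # Kalman gain
    x = x + K @ (np.asarray(z, dtype=float) - H @ x)
    P = (np.eye(4) - K @ H) @ P
    return x[:2]                                # filtered marker position

# Example: feed in a few (noisy) blob centroids.
for z in [(100, 200), (102, 203), (105, 207)]:
    print(kalman_step(z))
```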
Contributors: Simhadri, Sravanthi (Author) / Zhou, Yi (Thesis advisor) / Turaga, Pavan (Thesis advisor) / Berisha, Visar (Committee member) / Arizona State University (Publisher)
Created: 2014
Description
This study consisted of several related projects on dynamic spatial hearing by both human and robot listeners. The first experiment investigated the maximum number of sound sources that human listeners could localize at the same time. Speech stimuli were presented simultaneously from different loudspeakers at multiple time intervals. The maximum number of perceived sound sources was close to four. The second experiment asked whether the amplitude modulation of multiple static sound sources could lead to the perception of auditory motion. On the horizontal and vertical planes, four independent noise sources with 60° spacing were amplitude modulated with consecutively larger phase delays. At lower modulation rates, motion could be perceived by human listeners in both cases. The third experiment asked whether several sources at static positions could serve as "acoustic landmarks" to improve the localization of other sources. Four continuous speech sources were placed on the horizontal plane with 90° spacing and served as the landmarks. The task was to localize a noise that was played for only three seconds while the listener was passively rotated in a chair in the middle of the loudspeaker array. The human listeners were better able to localize the sound sources with landmarks than without. The remaining experiments used an acoustic manikin in an attempt to fuse binaural recordings and motion data to localize sound sources. A dummy head with recording devices was mounted on top of a rotating chair, and motion data were collected. The fourth experiment showed that an extended Kalman filter could be used to localize sound sources recursively. The fifth experiment demonstrated the use of a fitting method for separating multiple sound sources.
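A minimal sketch of the stimulus idea behind the second experiment: four static noise sources amplitude-modulated with progressively delayed phases so that, at low modulation rates, the summed percept can appear to move. The sampling rate, duration, and modulation depth are assumptions, not the study's exact parameters:

```python
import numpy as np

fs = 44100
dur = 2.0
t = np.arange(int(fs * dur)) / fs
mod_rate = 1.0                                   # Hz; motion is heard at low rates
n_speakers = 4                                   # 60-degree spacing in the study

rng = np.random.default_rng(3)
channels = []
for k in range(n_speakers):
    noise = rng.normal(size=t.size)              # independent noise per loudspeaker
    phase = 2 * np.pi * k / n_speakers           # consecutively larger phase delay
    am = 0.5 * (1 + np.sin(2 * np.pi * mod_rate * t - phase))
    channels.append(am * noise)

stimulus = np.stack(channels)                    # one row per loudspeaker channel
```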
Contributors: Zhong, Xuan (Author) / Yost, William (Thesis advisor) / Zhou, Yi (Committee member) / Dorman, Michael (Committee member) / Helms Tillery, Stephen (Committee member) / Arizona State University (Publisher)
Created: 2015