Matching Items (6)
Filtering by

Clear all filters

156177-Thumbnail Image.png
Description
The activation of the primary motor cortex (M1) is common in speech perception tasks that involve difficult listening conditions. Although the challenge of recognizing and discriminating non-native speech sounds appears to be an instantiation of listening under difficult circumstances, it is still unknown if M1 recruitment is facilitatory of second

The activation of the primary motor cortex (M1) is common in speech perception tasks that involve difficult listening conditions. Although the challenge of recognizing and discriminating non-native speech sounds appears to be an instantiation of listening under difficult circumstances, it is still unknown if M1 recruitment is facilitatory of second language speech perception. The purpose of this study was to investigate the role of M1 associated with speech motor centers in processing acoustic inputs in the native (L1) and second language (L2), using repetitive Transcranial Magnetic Stimulation (rTMS) to selectively alter neural activity in M1. Thirty-six healthy English/Spanish bilingual subjects participated in the experiment. The performance on a listening word-to-picture matching task was measured before and after real- and sham-rTMS to the orbicularis oris (lip muscle) associated M1. Vowel Space Area (VSA) obtained from recordings of participants reading a passage in L2 before and after real-rTMS, was calculated to determine its utility as an rTMS aftereffect measure. There was high variability in the aftereffect of the rTMS protocol to the lip muscle among the participants. Approximately 50% of participants showed an inhibitory effect of rTMS, evidenced by smaller motor evoked potentials (MEPs) area, whereas the other 50% had a facilitatory effect, with larger MEPs. This suggests that rTMS has a complex influence on M1 excitability, and relying on grand-average results can obscure important individual differences in rTMS physiological and functional outcomes. Evidence of motor support to word recognition in the L2 was found. Participants showing an inhibitory aftereffect of rTMS on M1 produced slower and less accurate responses in the L2 task, whereas those showing a facilitatory aftereffect of rTMS on M1 produced more accurate responses in L2. In contrast, no effect of rTMS was found on the L1, where accuracy and speed were very similar after sham- and real-rTMS. The L2 VSA measure was indicative of the aftereffect of rTMS to M1 associated with speech production, supporting its utility as an rTMS aftereffect measure. This result revealed an interesting and novel relation between cerebral motor cortex activation and speech measures.
ContributorsBarragan, Beatriz (Author) / Liss, Julie (Thesis advisor) / Berisha, Visar (Committee member) / Rogalsky, Corianne (Committee member) / Restrepo, Adelaida (Committee member) / Arizona State University (Publisher)
Created2018
157084-Thumbnail Image.png
Description
Cognitive deficits often accompany language impairments post-stroke. Past research has focused on working memory in aphasia, but attention is largely underexplored. Therefore, this dissertation will first quantify attention deficits post-stroke before investigating whether preserved cognitive abilities, including attention, can improve auditory sentence comprehension post-stroke. In Experiment 1a, three components of

Cognitive deficits often accompany language impairments post-stroke. Past research has focused on working memory in aphasia, but attention is largely underexplored. Therefore, this dissertation will first quantify attention deficits post-stroke before investigating whether preserved cognitive abilities, including attention, can improve auditory sentence comprehension post-stroke. In Experiment 1a, three components of attention (alerting, orienting, executive control) were measured in persons with aphasia and matched-controls using visual and auditory versions of the well-studied Attention Network Test. Experiment 1b then explored the neural resources supporting each component of attention in the visual and auditory modalities in chronic stroke participants. The results from Experiment 1a indicate that alerting, orienting, and executive control are uniquely affected by presentation modality. The lesion-symptom mapping results from Experiment 1b associated the left angular gyrus with visual executive control, the left supramarginal gyrus with auditory alerting, and Broca’s area (pars opercularis) with auditory orienting attention post-stroke. Overall, these findings indicate that perceptual modality may impact the lateralization of some aspects of attention, thus auditory attention may be more susceptible to impairment after a left hemisphere stroke.

Prosody, rhythm and pitch changes associated with spoken language may improve spoken language comprehension in persons with aphasia by recruiting intact cognitive abilities (e.g., attention and working memory) and their associated non-lesioned brain regions post-stroke. Therefore, Experiment 2 explored the relationship between cognition, two unique prosody manipulations, lesion location, and auditory sentence comprehension in persons with chronic stroke and matched-controls. The combined results from Experiment 2a and 2b indicate that stroke participants with better auditory orienting attention and a specific left fronto-parietal network intact had greater comprehension of sentences spoken with sentence prosody. For list prosody, participants with deficits in auditory executive control and/or short-term memory and the left angular gyrus and globus pallidus relatively intact, demonstrated better comprehension of sentences spoken with list prosody. Overall, the results from Experiment 2 indicate that following a left hemisphere stroke, individuals need good auditory attention and an intact left fronto-parietal network to benefit from typical sentence prosody, yet when cognitive deficits are present and this fronto-parietal network is damaged, list prosody may be more beneficial.
ContributorsLaCroix, Arianna (Author) / Rogalsky, Corianne (Thesis advisor) / Azuma, Tamiko (Committee member) / Braden, B. Blair (Committee member) / Liss, Julie (Committee member) / Arizona State University (Publisher)
Created2019
157359-Thumbnail Image.png
Description
Speech intelligibility measures how much a speaker can be understood by a listener. Traditional measures of intelligibility, such as word accuracy, are not sufficient to reveal the reasons of intelligibility degradation. This dissertation investigates the underlying sources of intelligibility degradations from both perspectives of the speaker and the listener. Segmental

Speech intelligibility measures how much a speaker can be understood by a listener. Traditional measures of intelligibility, such as word accuracy, are not sufficient to reveal the reasons of intelligibility degradation. This dissertation investigates the underlying sources of intelligibility degradations from both perspectives of the speaker and the listener. Segmental phoneme errors and suprasegmental lexical boundary errors are developed to reveal the perceptual strategies of the listener. A comprehensive set of automated acoustic measures are developed to quantify variations in the acoustic signal from three perceptual aspects, including articulation, prosody, and vocal quality. The developed measures have been validated on a dysarthric speech dataset with various severity degrees. Multiple regression analysis is employed to show the developed measures could predict perceptual ratings reliably. The relationship between the acoustic measures and the listening errors is investigated to show the interaction between speech production and perception. The hypothesize is that the segmental phoneme errors are mainly caused by the imprecise articulation, while the sprasegmental lexical boundary errors are due to the unreliable phonemic information as well as the abnormal rhythm and prosody patterns. To test the hypothesis, within-speaker variations are simulated in different speaking modes. Significant changes have been detected in both the acoustic signals and the listening errors. Results of the regression analysis support the hypothesis by showing that changes in the articulation-related acoustic features are important in predicting changes in listening phoneme errors, while changes in both of the articulation- and prosody-related features are important in predicting changes in lexical boundary errors. Moreover, significant correlation has been achieved in the cross-validation experiment, which indicates that it is possible to predict intelligibility variations from acoustic signal.
ContributorsJiao, Yishan (Author) / Berisha, Visar (Thesis advisor) / Liss, Julie (Thesis advisor) / Zhou, Yi (Committee member) / Arizona State University (Publisher)
Created2019
158464-Thumbnail Image.png
Description
In many biological research studies, including speech analysis, clinical research, and prediction studies, the validity of the study is dependent on the effectiveness of the training data set to represent the target population. For example, in speech analysis, if one is performing emotion classification based on speech, the performance of

In many biological research studies, including speech analysis, clinical research, and prediction studies, the validity of the study is dependent on the effectiveness of the training data set to represent the target population. For example, in speech analysis, if one is performing emotion classification based on speech, the performance of the classifier is mainly dependent on the number and quality of the training data set. For small sample sizes and unbalanced data, classifiers developed in this context may be focusing on the differences in the training data set rather than emotion (e.g., focusing on gender, age, and dialect).

This thesis evaluates several sampling methods and a non-parametric approach to sample sizes required to minimize the effect of these nuisance variables on classification performance. This work specifically focused on speech analysis applications, and hence the work was done with speech features like Mel-Frequency Cepstral Coefficients (MFCC) and Filter Bank Cepstral Coefficients (FBCC). The non-parametric divergence (D_p divergence) measure was used to study the difference between different sampling schemes (Stratified and Multistage sampling) and the changes due to the sentence types in the sampling set for the process.
ContributorsMariajohn, Aaquila (Author) / Berisha, Visar (Thesis advisor) / Spanias, Andreas (Committee member) / Liss, Julie (Committee member) / Arizona State University (Publisher)
Created2020
168768-Thumbnail Image.png
Description
Diffusion Tensor Imaging may be used to understand brain differences within PD. Within the last couple of decades there has been an explosion of learning and development in neuroimaging techniques. Today, it is possible to monitor and track where a brain is needing blood during a specific task without much

Diffusion Tensor Imaging may be used to understand brain differences within PD. Within the last couple of decades there has been an explosion of learning and development in neuroimaging techniques. Today, it is possible to monitor and track where a brain is needing blood during a specific task without much delay such as when using functional Magnetic Resonance Imaging (fMRI). It is also possible to track and visualize where and at which orientation water molecules in the brain are moving like in Diffusion Tensor Imaging (DTI). Data on certain diseases such as Parkinson’s Disease (PD) has grown considerably, and it is now known that people with PD can be assessed with cognitive tests in combination with neuroimaging to diagnose whether people with PD have cognitive decline in addition to any motor ability decline. The Montreal Cognitive Assessment (MoCA), Modified Semantic Fluency Test (MSF) and Mini-Mental State Exam (MMSE) are the primary tools and are often combined with fMRI or DTI for diagnosing if people with PD also have a mild cognitive impairment (MCI). The current thesis explored a group of cohort of PD patients and classified based on their MoCA, MSF, and Lexical Fluency (LF) scores. The results indicate specific brain differences in whether PD patients were low or high scorers on LF and MoCA scores. The current study’s findings adds to the existing literature that DTI may be more sensitive in detecting differences based on clinical scores.
ContributorsAndrade, Eric (Author) / Oforoi, Edward (Thesis advisor) / Zhou, Yi (Committee member) / Liss, Julie (Committee member) / Arizona State University (Publisher)
Created2022
190765-Thumbnail Image.png
Description
Speech analysis for clinical applications has emerged as a burgeoning field, providing valuable insights into an individual's physical and physiological state. Researchers have explored speech features for clinical applications, such as diagnosing, predicting, and monitoring various pathologies. Before presenting the new deep learning frameworks, this thesis introduces a study on

Speech analysis for clinical applications has emerged as a burgeoning field, providing valuable insights into an individual's physical and physiological state. Researchers have explored speech features for clinical applications, such as diagnosing, predicting, and monitoring various pathologies. Before presenting the new deep learning frameworks, this thesis introduces a study on conventional acoustic feature changes in subjects with post-traumatic headache (PTH) attributed to mild traumatic brain injury (mTBI). This work demonstrates the effectiveness of using speech signals to assess the pathological status of individuals. At the same time, it highlights some of the limitations of conventional acoustic and linguistic features, such as low repeatability and generalizability. Two critical characteristics of speech features are (1) good robustness, as speech features need to generalize across different corpora, and (2) high repeatability, as speech features need to be invariant to all confounding factors except the pathological state of targets. This thesis presents two research thrusts in the context of speech signals in clinical applications that focus on improving the robustness and repeatability of speech features, respectively. The first thrust introduces a deep learning framework to generate acoustic feature embeddings sensitive to vocal quality and robust across different corpora. A contrastive loss combined with a classification loss is used to train the model jointly, and data-warping techniques are employed to improve the robustness of embeddings. Empirical results demonstrate that the proposed method achieves high in-corpus and cross-corpus classification accuracy and generates good embeddings sensitive to voice quality and robust across different corpora. The second thrust introduces using the intra-class correlation coefficient (ICC) to evaluate the repeatability of embeddings. A novel regularizer, the ICC regularizer, is proposed to regularize deep neural networks to produce embeddings with higher repeatability. This ICC regularizer is implemented and applied to three speech applications: a clinical application, speaker verification, and voice style conversion. The experimental results reveal that the ICC regularizer improves the repeatability of learned embeddings compared to the contrastive loss, leading to enhanced performance in downstream tasks.
ContributorsZhang, Jianwei (Author) / Jayasuriya, Suren (Thesis advisor) / Berisha, Visar (Thesis advisor) / Liss, Julie (Committee member) / Spanias, Andreas (Committee member) / Arizona State University (Publisher)
Created2023