Matching Items (12)

Description
Two groups of cochlear implant (CI) listeners were tested for sound source localization and for speech recognition in complex listening environments. One group (n=11) wore bilateral CIs and, potentially, had access to interaural level difference (ILD) cues but not interaural timing difference (ITD) cues. The second group (n=12) wore a single CI and had low-frequency acoustic hearing in both the ear contralateral to the CI and in the implanted ear. These 'hearing preservation' listeners potentially had access to ITD cues but not to ILD cues. At issue in this dissertation was the value of the two types of information about sound sources, ITDs and ILDs, for localization and for speech perception when speech and noise sources were separated in space. For Experiment 1, normal-hearing (NH) listeners and the two groups of CI listeners were tested for sound source localization using a 13-loudspeaker array. The mean RMS localization error was 7 degrees for the NH listeners, 20 degrees for the bilateral CI listeners, and 23 degrees for the hearing preservation listeners. The scores for the two CI groups did not differ significantly; thus, both CI groups showed equivalent, but poorer than normal, localization. This outcome, obtained with filtered noise bands for the normal-hearing listeners, suggests that ILD and ITD cues can support equivalent levels of localization. For Experiment 2, the two groups of CI listeners were tested for speech recognition in noise when the noise sources and targets were spatially separated in a simulated 'restaurant' environment and in two versions of a 'cocktail party' environment. At issue was whether either CI group would show benefits from binaural hearing, i.e., better performance when the noise and targets were separated in space. Neither CI group showed spatial release from masking. However, both groups showed a significant binaural advantage (a combination of squelch and summation) that was maintained when the target and noise were spatially separated, indicating the presence of some binaural processing or 'unmasking' of speech in noise. Finally, localization ability in Experiment 1 was not correlated with binaural advantage in Experiment 2.
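For reference, the RMS error reported above is the root-mean-square deviation between target and response azimuths. A minimal sketch of the computation, using made-up azimuths rather than the study's data:

```python
import numpy as np

# Hypothetical target vs. perceived azimuths (degrees) for a subset of
# positions on a 13-loudspeaker array; the values are illustrative only.
targets = np.array([-60, -45, -30, -15, 0, 15, 30, 45, 60])
responses = np.array([-52, -40, -33, -10, 4, 11, 36, 50, 55])

# RMS error: square root of the mean squared target-response deviation.
rms_error = np.sqrt(np.mean((responses - targets) ** 2))
print(f"RMS localization error: {rms_error:.1f} degrees")
```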
Contributors: Loiselle, Louise (Author) / Dorman, Michael F. (Thesis advisor) / Yost, William A. (Thesis advisor) / Azuma, Tamiko (Committee member) / Liss, Julie (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Stroke is the leading cause of long-term disability in the U.S., with up to 60% of strokes causing speech loss. Individuals with severe stroke, who require the most frequent, intense speech therapy, often cannot adhere to treatments due to high cost and low success rates. Therefore, the ability to make functionally significant changes in individuals with severe post-stroke aphasia remains a key challenge for the rehabilitation community. This dissertation aimed to evaluate the efficacy of Startle Adjuvant Rehabilitation Therapy (START), a tele-enabled, low-cost treatment, to improve quality of life and speech in individuals with moderate-to-severe stroke. START is the exposure to startling acoustic stimuli during practice of motor tasks in individuals with stroke. START increases the speed and intensity of practice in severely impaired post-stroke reaching, eliciting muscle activity 2-3 times higher than maximum voluntary contraction. Voluntary reaching distance, onset, and final accuracy increased after a session of START, suggesting a rehabilitative effect. However, START has not been evaluated during impaired speech. The objective of this study was to determine whether impaired speech can be elicited by startling acoustic stimuli, and whether three days of START training can enhance clinical measures of moderate-to-severe post-stroke aphasia and apraxia of speech. This dissertation evaluates START in 42 individuals with post-stroke speech impairment via telehealth in a Phase 0 clinical trial. Results suggest that impaired speech can be elicited by startling acoustic stimuli and that START benefits individuals with moderate-to-severe post-stroke impairments in both linguistic and motor speech domains. This fills an important gap in aphasia care, as many speech therapies remain ineffective and financially inaccessible for patients with severe deficits. START is effective, remotely delivered, and may serve as an affordable adjuvant to traditional therapy for those who have poor access to quality care.
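As context for the maximum-voluntary-contraction comparison above, startle-elicited muscle activity is commonly expressed as a percentage of MVC by normalizing EMG amplitude; a minimal sketch with synthetic signals (not the study's recordings or its analysis pipeline):

```python
import numpy as np

def rms(signal: np.ndarray) -> float:
    """Root-mean-square amplitude of an EMG segment."""
    return float(np.sqrt(np.mean(signal ** 2)))

rng = np.random.default_rng(1)
# Synthetic EMG segments (mV): an MVC trial and a startle-elicited burst.
mvc_trial = rng.normal(scale=1.0, size=2000)
startle_burst = rng.normal(scale=2.5, size=2000)

# Express the startle response as a percentage of MVC amplitude.
percent_mvc = 100 * rms(startle_burst) / rms(mvc_trial)
print(f"Startle EMG: {percent_mvc:.0f}% of MVC")  # > 100% means it exceeds MVC
```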
Contributors: Swann, Zoe Elisabeth (Author) / Honeycutt, Claire F. (Thesis advisor) / Daliri, Ayoub (Committee member) / Rogalsky, Corianne (Committee member) / Liss, Julie (Committee member) / Schaefer, Sydney (Committee member) / Arizona State University (Publisher)
Created: 2022
Description
Speech analysis for clinical applications has emerged as a burgeoning field, providing valuable insights into an individual's physical and physiological state. Researchers have explored speech features for clinical applications, such as diagnosing, predicting, and monitoring various pathologies. Before presenting the new deep learning frameworks, this thesis introduces a study on conventional acoustic feature changes in subjects with post-traumatic headache (PTH) attributed to mild traumatic brain injury (mTBI). This work demonstrates the effectiveness of using speech signals to assess the pathological status of individuals. At the same time, it highlights some of the limitations of conventional acoustic and linguistic features, such as low repeatability and generalizability. Two critical characteristics of speech features are (1) good robustness, as speech features need to generalize across different corpora, and (2) high repeatability, as speech features need to be invariant to all confounding factors except the pathological state of targets. This thesis presents two research thrusts in the context of speech signals in clinical applications that focus on improving the robustness and repeatability of speech features, respectively. The first thrust introduces a deep learning framework to generate acoustic feature embeddings sensitive to vocal quality and robust across different corpora. A contrastive loss combined with a classification loss is used to train the model jointly, and data-warping techniques are employed to improve the robustness of embeddings. Empirical results demonstrate that the proposed method achieves high in-corpus and cross-corpus classification accuracy and generates good embeddings sensitive to voice quality and robust across different corpora. The second thrust introduces using the intra-class correlation coefficient (ICC) to evaluate the repeatability of embeddings. A novel regularizer, the ICC regularizer, is proposed to regularize deep neural networks to produce embeddings with higher repeatability. This ICC regularizer is implemented and applied to three speech applications: a clinical application, speaker verification, and voice style conversion. The experimental results reveal that the ICC regularizer improves the repeatability of learned embeddings compared to the contrastive loss, leading to enhanced performance in downstream tasks.
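For context, the intra-class correlation coefficient named above has a standard one-way random-effects form, ICC(1,1). The sketch below computes it for a hypothetical speakers-by-sessions matrix of a single embedding dimension; the dissertation's ICC regularizer itself (a differentiable training loss) is not reproduced here.

```python
import numpy as np

def icc_1_1(x: np.ndarray) -> float:
    """One-way random-effects ICC(1,1) for an (n_subjects, k_sessions) matrix."""
    n, k = x.shape
    grand_mean = x.mean()
    row_means = x.mean(axis=1)
    # Between-subjects and within-subjects mean squares.
    ms_between = k * np.sum((row_means - grand_mean) ** 2) / (n - 1)
    ms_within = np.sum((x - row_means[:, None]) ** 2) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical: one embedding dimension for 4 speakers over 3 sessions.
emb = np.array([[0.9, 1.0, 0.8],
                [0.2, 0.1, 0.3],
                [0.5, 0.6, 0.5],
                [1.4, 1.3, 1.5]])
print(f"ICC(1,1) = {icc_1_1(emb):.2f}")  # near 1: highly repeatable embedding
```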
Contributors: Zhang, Jianwei (Author) / Jayasuriya, Suren (Thesis advisor) / Berisha, Visar (Thesis advisor) / Liss, Julie (Committee member) / Spanias, Andreas (Committee member) / Arizona State University (Publisher)
Created: 2023
Description
Speech intelligibility measures how well a speaker can be understood by a listener. Traditional measures of intelligibility, such as word accuracy, are not sufficient to reveal the reasons for intelligibility degradation. This dissertation investigates the underlying sources of intelligibility degradation from the perspectives of both the speaker and the listener. Segmental phoneme errors and suprasegmental lexical boundary errors are analyzed to reveal the perceptual strategies of the listener. A comprehensive set of automated acoustic measures is developed to quantify variations in the acoustic signal along three perceptual dimensions: articulation, prosody, and vocal quality. The measures are validated on a dysarthric speech dataset spanning a range of severity levels. Multiple regression analysis shows that the developed measures can predict perceptual ratings reliably. The relationship between the acoustic measures and the listening errors is then investigated to reveal the interaction between speech production and perception. The hypothesis is that segmental phoneme errors are mainly caused by imprecise articulation, while suprasegmental lexical boundary errors are due to unreliable phonemic information as well as abnormal rhythm and prosody patterns. To test this hypothesis, within-speaker variations are elicited in different speaking modes. Significant changes are detected in both the acoustic signals and the listening errors. Results of the regression analysis support the hypothesis: changes in articulation-related acoustic features are important in predicting changes in phoneme errors, while changes in both articulation- and prosody-related features are important in predicting changes in lexical boundary errors. Moreover, significant correlations are obtained in a cross-validation experiment, indicating that it is possible to predict intelligibility variations from the acoustic signal.
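As an illustration of the kind of analysis described, here is a minimal multiple-regression sketch predicting perceptual ratings from automated acoustic measures; the feature matrix and ratings are synthetic placeholders, not the dissertation's data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic placeholder data: rows are speakers, columns are automated
# acoustic measures (e.g., articulation-, prosody-, and voice-related).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))                       # 40 speakers, 3 measures
ratings = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=40)

# Cross-validated fit, analogous to the cross-validation experiment above.
r2 = cross_val_score(LinearRegression(), X, ratings, cv=5, scoring="r2")
print(f"Mean cross-validated R^2: {r2.mean():.2f}")
```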
Contributors: Jiao, Yishan (Author) / Berisha, Visar (Thesis advisor) / Liss, Julie (Thesis advisor) / Zhou, Yi (Committee member) / Arizona State University (Publisher)
Created: 2019
Description
Cognitive deficits often accompany language impairments post-stroke. Past research has focused on working memory in aphasia, but attention remains largely underexplored. Therefore, this dissertation first quantifies attention deficits post-stroke before investigating whether preserved cognitive abilities, including attention, can improve auditory sentence comprehension post-stroke. In Experiment 1a, three components of attention (alerting, orienting, and executive control) were measured in persons with aphasia and matched controls using visual and auditory versions of the well-studied Attention Network Test. Experiment 1b then explored the neural resources supporting each component of attention in the visual and auditory modalities in chronic stroke participants. The results of Experiment 1a indicate that alerting, orienting, and executive control are each uniquely affected by presentation modality. The lesion-symptom mapping results of Experiment 1b associated the left angular gyrus with visual executive control, the left supramarginal gyrus with auditory alerting, and Broca's area (pars opercularis) with auditory orienting attention post-stroke. Overall, these findings indicate that perceptual modality may affect the lateralization of some aspects of attention; thus, auditory attention may be more susceptible to impairment after a left hemisphere stroke.
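For readers unfamiliar with the task, the Attention Network Test conventionally derives the three attention components from reaction-time subtractions (Fan et al., 2002); a sketch with illustrative reaction times rather than the study's data:

```python
# Conventional Attention Network Test contrasts (Fan et al., 2002),
# with illustrative mean reaction times in milliseconds.
rt = {"no_cue": 620, "double_cue": 585,       # alerting contrast
      "center_cue": 600, "spatial_cue": 560,  # orienting contrast
      "congruent": 570, "incongruent": 675}   # executive-control contrast

alerting = rt["no_cue"] - rt["double_cue"]        # benefit of temporal warning
orienting = rt["center_cue"] - rt["spatial_cue"]  # benefit of spatial cueing
executive = rt["incongruent"] - rt["congruent"]   # cost of conflict resolution

print(alerting, orienting, executive)  # 35 40 105
```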

Prosody, the rhythm and pitch changes associated with spoken language, may improve spoken language comprehension in persons with aphasia by recruiting intact cognitive abilities (e.g., attention and working memory) and their associated non-lesioned brain regions post-stroke. Therefore, Experiment 2 explored the relationship between cognition, two unique prosody manipulations, lesion location, and auditory sentence comprehension in persons with chronic stroke and matched controls. The combined results of Experiments 2a and 2b indicate that stroke participants with better auditory orienting attention and a specific left fronto-parietal network intact had greater comprehension of sentences spoken with sentence prosody. For list prosody, participants with deficits in auditory executive control and/or short-term memory, but with the left angular gyrus and globus pallidus relatively intact, demonstrated better comprehension of sentences spoken with list prosody. Overall, the results of Experiment 2 indicate that following a left hemisphere stroke, individuals need good auditory attention and an intact left fronto-parietal network to benefit from typical sentence prosody; when cognitive deficits are present and this fronto-parietal network is damaged, list prosody may be more beneficial.
Contributors: LaCroix, Arianna (Author) / Rogalsky, Corianne (Thesis advisor) / Azuma, Tamiko (Committee member) / Braden, B. Blair (Committee member) / Liss, Julie (Committee member) / Arizona State University (Publisher)
Created: 2019
Description
The present study describes audiovisual sentence recognition in normal-hearing listeners, bimodal cochlear implant (CI) listeners, and bilateral CI listeners. It explores a new set of sentences (the AzAV sentences) that were created to have equal auditory intelligibility and equal gain from visual information.

The aims of Experiment I were to (i) compare the lip-reading difficulty of the AzAV sentences to that of other sentence materials, (ii) compare the speech-reading ability of CI listeners to that of normal-hearing listeners, and (iii) assess the gain in speech understanding when listeners have both auditory and visual information from easy-to-lip-read and difficult-to-lip-read sentences. In addition, the sentence lists were subjected to a multi-level text analysis to determine the factors that make sentences easy or difficult to speech read.

The results of Experiment I showed that (i) the AzAV sentences were relatively difficult to lip read, (ii) CI listeners and normal-hearing listeners did not differ in lip-reading ability, and (iii) sentences with low lip-reading intelligibility (10-15% correct) provide about a 30 percentage point improvement in speech understanding when added to the acoustic stimulus, while sentences with high lip-reading intelligibility (30-60% correct) provide about a 50 percentage point improvement in the same comparison. The multi-level text analyses showed that the familiarity of phrases in the sentences was the primary factor affecting lip-reading difficulty.

The aim of Experiment II was to investigate the value of bimodal hearing and bilateral cochlear implants when visual information is present. The results of Experiment II showed that when visual information is present, low-frequency acoustic hearing can be of value to speech understanding for patients fit with a single CI. However, when visual information was available, no gain was seen from the provision of a second CI, i.e., bilateral CIs. As was the case in Experiment I, visual information provided about a 30 percentage point improvement in speech understanding.
Contributors: Wang, Shuai (Author) / Dorman, Michael (Thesis advisor) / Berisha, Visar (Committee member) / Liss, Julie (Committee member) / Arizona State University (Publisher)
Created: 2015
Description
The activation of the primary motor cortex (M1) is common in speech perception tasks that involve difficult listening conditions. Although recognizing and discriminating non-native speech sounds appears to be an instantiation of listening under difficult circumstances, it is still unknown whether M1 recruitment facilitates second language speech perception. The purpose of this study was to investigate the role of M1 speech motor centers in processing acoustic input in the native (L1) and second (L2) languages, using repetitive transcranial magnetic stimulation (rTMS) to selectively alter neural activity in M1. Thirty-six healthy English/Spanish bilingual subjects participated in the experiment. Performance on a listening word-to-picture matching task was measured before and after real and sham rTMS over the M1 representation of the orbicularis oris (lip) muscle. Vowel space area (VSA), calculated from recordings of participants reading a passage in the L2 before and after real rTMS, was evaluated to determine its utility as a measure of rTMS aftereffects. The aftereffect of the rTMS protocol on the lip muscle varied considerably across participants. Approximately 50% of participants showed an inhibitory effect of rTMS, evidenced by smaller motor evoked potential (MEP) areas, whereas the other 50% showed a facilitatory effect, with larger MEPs. This suggests that rTMS has a complex influence on M1 excitability, and that relying on grand-average results can obscure important individual differences in rTMS physiological and functional outcomes. Evidence of motor support for word recognition in the L2 was found: participants showing an inhibitory aftereffect of rTMS on M1 produced slower and less accurate responses in the L2 task, whereas those showing a facilitatory aftereffect produced more accurate responses in the L2. In contrast, no effect of rTMS was found on the L1, where accuracy and speed were very similar after sham and real rTMS. The L2 VSA measure was indicative of the aftereffect of rTMS on M1 associated with speech production, supporting its utility as an rTMS aftereffect measure. This result revealed an interesting and novel relation between cerebral motor cortex activation and speech measures.
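By convention, vowel space area is the area of the polygon spanned by the corner vowels in F1-F2 space; a minimal sketch under that standard definition, with textbook-style illustrative formant values rather than the study's measurements:

```python
import numpy as np

def vowel_space_area(formants: np.ndarray) -> float:
    """Shoelace-formula polygon area of corner vowels in F1-F2 space.

    `formants` is an (n_vowels, 2) array of (F1, F2) pairs in Hz,
    ordered around the perimeter of the vowel polygon.
    """
    f1, f2 = formants[:, 0], formants[:, 1]
    return 0.5 * abs(np.dot(f1, np.roll(f2, -1)) - np.dot(f2, np.roll(f1, -1)))

# Illustrative corner-vowel formants (Hz) for /i/, /ae/, /a/, /u/.
corners = np.array([[270, 2290],   # /i/
                    [660, 1720],   # /ae/
                    [730, 1090],   # /a/
                    [300,  870]])  # /u/
print(f"VSA: {vowel_space_area(corners):,.0f} Hz^2")
```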
Contributors: Barragan, Beatriz (Author) / Liss, Julie (Thesis advisor) / Berisha, Visar (Committee member) / Rogalsky, Corianne (Committee member) / Restrepo, Adelaida (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
Military veterans have a significantly higher incidence of mild traumatic brain injury (mTBI), depression, and post-traumatic stress disorder (PTSD) than civilians. Military veterans also represent a rapidly growing subgroup of college students, due in part to the robust and financially incentivizing educational benefits of the Post-9/11 GI Bill. The overlapping, cognitively impactful symptoms of service-related conditions, combined with the underreporting of mTBI and psychiatric conditions, make accurate assessment of cognitive performance in military veterans challenging. Recent research findings provide conflicting information on cognitive performance patterns in military veterans. The purpose of this study was to determine whether service-related conditions and self-assessments predict performance on complex working memory and executive function tasks in military veteran college students. Sixty-one military veteran college students attending classes at Arizona State University campuses completed clinical neuropsychological tasks and experimental working memory and executive function tasks. The results revealed that a history of mTBI significantly predicted poorer performance on verbal working memory and decision-making measures. Depression significantly predicted poorer performance on executive function tasks related to serial updating. In contrast, the commonly used clinical neuropsychological tasks were not sensitive to service-related conditions, including mTBI, PTSD, and depression. The differing performance patterns observed between the clinical tasks and the more complex experimental tasks indicate that researchers and clinicians should use tests that sufficiently tax verbal working memory and executive function when evaluating the subtle, higher-order cognitive deficits associated with mTBI and depression.
Contributors: Gallagher, Karen Louise (Author) / Azuma, Tamiko (Thesis advisor) / Liss, Julie (Committee member) / Lavoie, Michael (Committee member) / Arizona State University (Publisher)
Created: 2017
Description
Children with cleft palate with or without cleft lip (CP+/-L) often demonstrate disordered speech. Clinicians and researchers share the goal that children with CP+/-L demonstrate typical speech when entering kindergarten; however, this benchmark is not routinely met. There is a large body of previous research examining speech articulation skills in this clinical population; however, questions remain regarding the severity of articulation deficits in children with CP+/-L, especially at the age of school entry. This dissertation aimed to provide additional information on speech accuracy and speech error usage in children with CP+/-L between the ages of four and seven years. Additionally, it explored individual and treatment characteristics that may influence articulation skills. Finally, it examined the relationship between speech accuracy during a sentence repetition task and during a single-word naming task.

Children with CP+/-L presented with speech accuracy that differed according to manner of production. Speech accuracy for fricative phonemes was influenced by severity of hypernasality, although age and status of secondary surgery did not influence speech accuracy for fricatives. For place of articulation, children with CP+/-L demonstrated strongest accuracy of production for bilabial and velar phonemes, while alveolar and palatal phonemes were produced with lower accuracy. Children with clefting that involved the lip and alveolus demonstrated reduced speech accuracy for alveolar phonemes compared to children with clefts involving the hard and soft palate only.

Participants used a variety of speech error types, with developmental/phonological errors, anterior oral cleft speech characteristics, and compensatory errors occurring most frequently across the sample. Several factors impacted the type of speech errors used, including cleft type, severity of hypernasality, and age.

The results from this dissertation project support previous research findings and provide additional information regarding the severity of speech articulation deficits according to manner and place of consonant production and according to different speech error categories. This study adds information on individual and treatment characteristics that influenced speech accuracy and speech error usage.
Contributors: Lien, Kari (Author) / Scherer, Nancy J. (Thesis advisor) / Nett Cordero, Kelly (Committee member) / Liss, Julie (Committee member) / Sitzman, Thomas (Committee member) / Arizona State University (Publisher)
Created: 2020
Description
Studies in Second Language Acquisition and Neurolinguistics have argued that adult learners, when dealing with certain phonological features of the L2, such as segmental and suprasegmental ones, face problems of articulatory placement (Esling, 2006; Abercrombie, 1967) and somatosensory stimulation (Guenther, Ghosh, & Tourville, 2006; Waldron, 2010). These studies have argued that adult phonological acquisition is a complex matter that needs to be informed by a specialized sensorimotor theory of speech acquisition. They have further suggested that traditional pronunciation pedagogy needs to be enhanced by an approach that offers learners fundamental, practical sensorimotor tools to improve the quality of L2 speech acquisition.

This foundational study designs a sensorimotor approach to pronunciation pedagogy and tests its effect on the L2 speech of five adult (late) learners of American English. Throughout an eight-week classroom experiment, participants from different first language backgrounds received instruction on articulatory settings (Honikman, 1964) and the sensorimotor mechanism of speech acquisition (Waldron, 2010; Guenther et al., 2006). In addition, they attended five adapted lessons of the Feldenkrais technique (Feldenkrais, 1972) designed to develop sensorimotor awareness of the vocal apparatus and improve the quality of L2 speech movement. I hypothesize that such sensorimotor learning triggers overall positive changes in the way L2 learners engage their speech articulators in the L2 and that, over time, they develop better pronunciation.

After approximately eight hours of intervention, analysis of the results shows improvement in participants' speech rate, degree of accentedness, and speaking confidence, but mixed changes in word intelligibility and vowel space area. Although not statistically significant (p > .05), these results suggest that such a sensorimotor approach to L2 phonological acquisition warrants further consideration and investigation for use in the L2 classroom.
Contributors: Lima, J. Alberto S., Jr. (Author) / Pruitt, Kathryn (Thesis advisor) / Gelderen, Elly van (Thesis advisor) / Liss, Julie (Committee member) / James, Mark (Committee member) / Arizona State University (Publisher)
Created: 2015