Matching Items (5)

137447-Thumbnail Image.png

Vowel Normalization in Dysarthria

Description

In this study, the Bark transform and Lobanov method were used to normalize vowel formants in speech produced by persons with dysarthria. The computer classification accuracy of these normalized data

In this study, the Bark transform and Lobanov method were used to normalize vowel formants in speech produced by persons with dysarthria. The computer classification accuracy of these normalized data were then compared to the results of human perceptual classification accuracy of the actual vowels. These results were then analyzed to determine if these techniques correlated with the human data.

Contributors

Agent

Created

Date Created
  • 2013-05

152801-Thumbnail Image.png

Audiovisual perception of dysarthric speech in older adults compared to younger adults

Description

Everyday speech communication typically takes place face-to-face. Accordingly, the task of perceiving speech is a multisensory phenomenon involving both auditory and visual information. The current investigation examines how visual information

Everyday speech communication typically takes place face-to-face. Accordingly, the task of perceiving speech is a multisensory phenomenon involving both auditory and visual information. The current investigation examines how visual information influences recognition of dysarthric speech. It also explores where the influence of visual information is dependent upon age. Forty adults participated in the study that measured intelligibility (percent words correct) of dysarthric speech in auditory versus audiovisual conditions. Participants were then separated into two groups: older adults (age range 47 to 68) and young adults (age range 19 to 36) to examine the influence of age. Findings revealed that all participants, regardless of age, improved their ability to recognize dysarthric speech when visual speech was added to the auditory signal. The magnitude of this benefit, however, was greater for older adults when compared with younger adults. These results inform our understanding of how visual speech information influences understanding of dysarthric speech.

Contributors

Agent

Created

Date Created
  • 2014

158318-Thumbnail Image.png

Characterizing Dysarthric Speech with Transfer Learning

Description

Speech is known to serve as an early indicator of neurological decline, particularly in motor diseases. There is significant interest in developing automated, objective signal analytics that detect clinically-relevant changes

Speech is known to serve as an early indicator of neurological decline, particularly in motor diseases. There is significant interest in developing automated, objective signal analytics that detect clinically-relevant changes and in evaluating these algorithms against the existing gold-standard: perceptual evaluation by trained speech and language pathologists. Hypernasality, the result of poor control of the velopharyngeal flap---the soft palate regulating airflow between the oral and nasal cavities---is one such speech symptom of interest, as precise velopharyngeal control is difficult to achieve under neuromuscular disorders. However, a host of co-modulating variables give hypernasal speech a complex and highly variable acoustic signature, making it difficult for skilled clinicians to assess and for automated systems to evaluate. Previous work in rating hypernasality from speech relies on either engineered features based on statistical signal processing or machine learning models trained end-to-end on clinical ratings of disordered speech examples. Engineered features often fail to capture the complex acoustic patterns associated with hypernasality, while end-to-end methods tend to overfit to the small datasets on which they are trained. In this thesis, I present a set of acoustic features, models, and strategies for characterizing hypernasality in dysarthric speech that split the difference between these two approaches, with the aim of capturing the complex perceptual character of hypernasality without overfitting to the small datasets available. The features are based on acoustic models trained on a large corpus of healthy speech, integrating expert knowledge to capture known perceptual characteristics of hypernasal speech. They are then used in relatively simple linear models to predict clinician hypernasality scores. These simple models are robust, generalizing across diseases and outperforming comprehensive set of baselines in accuracy and correlation. This novel approach represents a new state-of-the-art in objective hypernasality assessment.

Contributors

Agent

Created

Date Created
  • 2020

150496-Thumbnail Image.png

Degraded vowel acoustics and the perceptual consequences in dysarthria

Description

Distorted vowel production is a hallmark characteristic of dysarthric speech, irrespective of the underlying neurological condition or dysarthria diagnosis. A variety of acoustic metrics have been used to study the

Distorted vowel production is a hallmark characteristic of dysarthric speech, irrespective of the underlying neurological condition or dysarthria diagnosis. A variety of acoustic metrics have been used to study the nature of vowel production deficits in dysarthria; however, not all demonstrate sensitivity to the exhibited deficits. Less attention has been paid to quantifying the vowel production deficits associated with the specific dysarthrias. Attempts to characterize the relationship between naturally degraded vowel production in dysarthria with overall intelligibility have met with mixed results, leading some to question the nature of this relationship. It has been suggested that aberrant vowel acoustics may be an index of overall severity of the impairment and not an "integral component" of the intelligibility deficit. A limitation of previous work detailing perceptual consequences of disordered vowel acoustics is that overall intelligibility, not vowel identification accuracy, has been the perceptual measure of interest. A series of three experiments were conducted to address the problems outlined herein. The goals of the first experiment were to identify subsets of vowel metrics that reliably distinguish speakers with dysarthria from non-disordered speakers and differentiate the dysarthria subtypes. Vowel metrics that capture vowel centralization and reduced spectral distinctiveness among vowels differentiated dysarthric from non-disordered speakers. Vowel metrics generally failed to differentiate speakers according to their dysarthria diagnosis. The second and third experiments were conducted to evaluate the relationship between degraded vowel acoustics and the resulting percept. In the second experiment, correlation and regression analyses revealed vowel metrics that capture vowel centralization and distinctiveness and movement of the second formant frequency were most predictive of vowel identification accuracy and overall intelligibility. The third experiment was conducted to evaluate the extent to which the nature of the acoustic degradation predicts the resulting percept. Results suggest distinctive vowel tokens are better identified and, likewise, better-identified tokens are more distinctive. Further, an above-chance level agreement between nature of vowel misclassification and misidentification errors was demonstrated for all vowels, suggesting degraded vowel acoustics are not merely an index of severity in dysarthria, but rather are an integral component of the resultant intelligibility disorder.

Contributors

Agent

Created

Date Created
  • 2012

150607-Thumbnail Image.png

Free classification of dysarthric speech

Description

Often termed the "gold standard" in the differential diagnosis of dysarthria, the etiology-based Mayo Clinic classification approach has been used nearly exclusively by clinicians since the early 1970s. However, the

Often termed the "gold standard" in the differential diagnosis of dysarthria, the etiology-based Mayo Clinic classification approach has been used nearly exclusively by clinicians since the early 1970s. However, the current descriptive method results in a distinct overlap of perceptual features across various etiologies, thus limiting the clinical utility of such a system for differential diagnosis. Acoustic analysis may provide a more objective measure for improvement in overall reliability (Guerra & Lovely, 2003) of classification. The following paper investigates the potential use of a taxonomical approach to dysarthria. The purpose of this study was to identify a set of acoustic correlates of perceptual dimensions used to group similarly sounding speakers with dysarthria, irrespective of disease etiology. The present study utilized a free classification auditory perceptual task in order to identify a set of salient speech characteristics displayed by speakers with varying dysarthria types and perceived by listeners, which was then analyzed using multidimensional scaling (MDS), correlation analysis, and cluster analysis. In addition, discriminant function analysis (DFA) was conducted to establish the feasibility of using the dimensions underlying perceptual similarity in dysarthria to classify speakers into both listener-derived clusters and etiology-based categories. The following hypothesis was identified: Because of the presumed predictive link between the acoustic correlates and listener-derived clusters, the DFA classification results should resemble the perceptual clusters more closely than the etiology-based (Mayo System) classifications. Results of the present investigation's MDS revealed three dimensions, which were significantly correlated with 1) metrics capturing rate and rhythm, 2) intelligibility, and 3) all of the long-term average spectrum metrics in the 8000 Hz band, which has been linked to degree of phonemic distinctiveness (Utianski et al., February 2012). A qualitative examination of listener notes supported the MDS and correlation results, with listeners overwhelmingly making reference to speaking rate/rhythm, intelligibility, and articulatory precision while participating in the free classification task. Additionally, acoustic correlates revealed by the MDS and subjected to DFA indeed predicted listener group classification. These results beget acoustic measurement as representative of listener perception, and represent the first phase in supporting the use of a perceptually relevant taxonomy of dysarthria.

Contributors

Agent

Created

Date Created
  • 2012