Incorporating auditory models in speech/audio applications

Description

Following the success in incorporating perceptual models in audio coding algorithms, their application in other speech/audio processing systems is expanding. In general, all perceptual speech/audio processing algorithms involve minimization of an objective function that directly or indirectly incorporates properties of human perception. This dissertation primarily investigates the problems associated with directly embedding an auditory model in the objective function formulation and proposes possible solutions to overcome high complexity issues for use in real-time speech/audio algorithms. Specific problems addressed in this dissertation include: 1) the development of approximate but computationally efficient auditory model implementations that are consistent with the principles of psychoacoustics, and 2) the development of a mapping scheme that allows synthesizing a time/frequency domain representation from its equivalent auditory model output. The first problem is aimed at addressing the high computational complexity involved in solving perceptual objective functions that require repeated application of the auditory model to evaluate different candidate solutions. In this dissertation, frequency pruning and detector pruning algorithms are developed that efficiently implement the various auditory model stages. The performance of the pruned model is compared to that of the original auditory model for different types of test signals in the SQAM database. Experimental results indicate only a 4-7% relative error in loudness while attaining up to an 80-90% reduction in computational complexity. Similarly, a hybrid algorithm is developed specifically for use with sinusoidal signals; it employs the proposed auditory pattern combining technique together with a look-up table that stores representative auditory patterns.
The second problem involves obtaining an estimate of the auditory representation that minimizes a perceptual objective function and transforming the auditory pattern back to its equivalent time/frequency representation. This avoids the repeated application of auditory model stages to test different candidate time/frequency vectors when minimizing perceptual objective functions. In this dissertation, a constrained mapping scheme is developed by linearizing certain auditory model stages, which ensures that a time/frequency mapping corresponding to the estimated auditory representation is obtained. This paradigm was successfully incorporated in a perceptual speech enhancement algorithm and a sinusoidal component selection task.

Contributors: Krishnamoorthi, Harish (Author) / Spanias, Andreas (Thesis advisor) / Papandreou-Suppappola, Antonia (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Tsakalis, Konstantinos (Committee member) / Arizona State University (Publisher)

Created: 2011

Interactions between Pitch and Timbre Perception in Normal-hearing Listeners and Cochlear Implant Users

Description

Pitch and timbre perception are two important dimensions of auditory perception. These aspects of sound aid the understanding of our environment and contribute to normal everyday functioning. It is therefore important to determine the nature of the perceptual interaction between these two dimensions of sound. This study tested the interactions between pitch perception associated with the fundamental frequency (F0) and sharpness perception associated with the spectral slope of harmonic complex tones in normal-hearing (NH) listeners and cochlear implant (CI) users. Pitch and sharpness ranking were measured without changes in the non-target dimension (Experiment 1), with different amounts of unrelated changes in the non-target dimension (Experiment 2), and with congruent/incongruent changes of similar perceptual salience in the non-target dimension (Experiment 3). The results showed that CI users had significantly worse pitch and sharpness ranking thresholds than NH listeners. Pitch and sharpness perception had symmetric interactions in NH listeners. However, for CI users, spectral slope changes significantly affected pitch ranking, while F0 changes had no significant effect on sharpness ranking. CI users' pitch ranking sensitivity was significantly better with congruent than with incongruent spectral slope changes. These results have important implications for CI processing strategies to better transmit pitch and timbre cues to CI users.

Contributors: Soslowsky, Samara Miranda (Author) / Luo, Xin (Thesis director) / Yost, William (Committee member) / Dorman, Michael (Committee member) / Department of Speech and Hearing Science (Contributor) / Barrett, The Honors College (Contributor)

Created: 2016-12