This collection includes both ASU theses and dissertations, submitted by graduate students, and Barrett, The Honors College theses, submitted by undergraduate students.


Description

Digital sound synthesis allows the creation of a great variety of sounds. Focusing on interesting or ecologically valid sounds for music, simulation, aesthetics, or other purposes limits the otherwise vast digital audio palette. Tools for creating such sounds vary from arbitrary methods of altering recordings to precise simulations of vibrating objects. In this work, methods of sound synthesis by re-sonification are considered. Re-sonification, herein, refers to the general process of analyzing, possibly transforming, and resynthesizing or reusing recorded sounds in meaningful ways, to convey information. Applied to soundscapes, re-sonification is presented as a means of conveying activity within an environment. Applied to the sounds of objects, this work examines modeling the perception of objects as well as their physical properties and the ability to simulate interactive events with such objects.

To create soundscapes to re-sonify geographic environments, a method of automated soundscape design is presented. Using recorded sounds that are classified based on acoustic, social, semantic, and geographic information, this method produces stochastically generated soundscapes to re-sonify selected geographic areas. Drawing on prior knowledge, local sounds and those deemed similar comprise a locale's soundscape.

In the context of re-sonifying events, this work examines processes for modeling and estimating the excitations of sounding objects. These include plucking, striking, rubbing, and any interaction that imparts energy into a system, affecting the resultant sound. A method of estimating a linear system's input, constrained to a signal subspace, is presented and applied toward improving the estimation of percussive excitations for re-sonification.

To work toward robust recording-based modeling and re-sonification of objects, new implementations of banded waveguide (BWG) models are proposed for object modeling and sound synthesis. Previous implementations of BWGs use arbitrary model parameters and may produce a range of simulations that do not match digital waveguide or modal models of the same design. Subject to linear excitations, some models proposed here behave identically to other equivalently designed physical models. Under nonlinear interactions, such as bowing, many of the proposed implementations exhibit improvements in the attack characteristics of synthesized sounds.
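
To make the subspace-constrained input estimation concrete, here is a minimal NumPy sketch of the general idea (the basis, impulse response, and all parameter values below are illustrative assumptions, not the thesis's actual model): the unknown excitation is modeled as a combination of a few smooth basis vectors, which regularizes the deconvolution.

```python
import numpy as np

# Minimal sketch of input estimation constrained to a signal subspace.
# The unknown excitation x is modeled as x = B @ c for a known basis B,
# and c is found by least squares against the observed output y = h * x.

def estimate_excitation(y, h, B):
    """y: observed output; h: impulse response; B: (n, k) excitation basis."""
    # Response of the system to each basis vector, i.e., the columns of H @ B.
    HB = np.column_stack([np.convolve(h, B[:, k]) for k in range(B.shape[1])])
    m = min(len(y), HB.shape[0])
    c, *_ = np.linalg.lstsq(HB[:m], y[:m], rcond=None)
    return B @ c  # estimated excitation, restricted to the subspace

# Toy usage: one decaying resonant mode excited by a short smooth pulse.
n = np.arange(2048)
h = np.exp(-n / 300.0) * np.sin(2 * np.pi * 440 * n / 44100)
B = np.column_stack([np.hanning(64) * np.cos(np.pi * k * np.linspace(-1, 1, 64))
                     for k in range(4)])  # smooth low-dimensional pulse basis
y = np.convolve(h, B @ np.array([1.0, 0.3, -0.2, 0.05]))
x_hat = estimate_excitation(y, h, B)
```
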
Contributors: Fink, Alex M (Author) / Spanias, Andreas S (Thesis advisor) / Cook, Perry R. (Committee member) / Turaga, Pavan (Committee member) / Tsakalis, Konstantinos (Committee member) / Arizona State University (Publisher)
Created: 2013
Description

Head movement is known to improve the accuracy of sound localization for humans and animals. The marmoset is a small-bodied New World monkey species that has become an emerging model for studying auditory function. This thesis aims to detect the horizontal and vertical rotation of head movement in marmoset monkeys.

Experiments were conducted in a sound-attenuated acoustic chamber. Head movement of marmoset monkeys was studied under various auditory and visual stimulation conditions. In order of increasing complexity, these conditions were (1) idle, (2) sound alone, (3) sound and visual signals, and (4) an alert signal produced by opening and closing the chamber door. All conditions were tested with the house light either on or off. An infrared camera with a frame rate of 90 Hz was used to capture the head movement of the monkeys. To assist signal detection, two circular markers were attached to the top of the monkey's head. The data analysis used an image-based marker detection scheme. Images were processed using the Computer Vision Toolbox in MATLAB. The markers and their positions were detected using blob detection techniques. Based on the frame-by-frame marker positions, angular position, velocity, and acceleration were extracted in the horizontal and vertical planes. Adaptive Otsu thresholding, Kalman filtering, and bound setting for marker properties were used to overcome several challenges encountered during this analysis, such as finding the image segmentation threshold, continuously tracking markers during large head movements, and rejecting false alarms.
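
For readers who want a concrete picture of this pipeline, here is a minimal Python/OpenCV analogue of the same ideas (the thesis used MATLAB's Computer Vision Toolbox; all thresholds, area bounds, and noise covariances below are illustrative assumptions):

```python
import cv2
import numpy as np

# Otsu thresholding to segment the two head markers, blob centroids with
# area bounds to reject false alarms, and a constant-velocity Kalman filter
# to track a marker across frames.

def detect_markers(gray, min_area=20, max_area=2000):
    _, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    n, _, stats, centroids = cv2.connectedComponentsWithStats(bw)
    return [centroids[i] for i in range(1, n)  # label 0 is the background
            if min_area < stats[i, cv2.CC_STAT_AREA] < max_area]

def make_tracker():
    kf = cv2.KalmanFilter(4, 2)  # state [x, y, vx, vy], measurement [x, y]
    kf.transitionMatrix = np.array([[1, 0, 1, 0], [0, 1, 0, 1],
                                    [0, 0, 1, 0], [0, 0, 0, 1]], np.float32)
    kf.measurementMatrix = np.eye(2, 4, dtype=np.float32)
    kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2
    kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1
    return kf

def head_azimuth(p1, p2):
    # Horizontal head angle from the line joining the two markers; angular
    # velocity then follows as np.gradient(angles) * 90 at 90 frames/s.
    return np.degrees(np.arctan2(p2[1] - p1[1], p2[0] - p1[0]))
```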

The results show that the blob detection method together with Kalman filtering yielded better performance than other image-based techniques such as optical flow and SURF features. The median maximal head turn in the horizontal plane was in the range of 20 to 70 degrees, and the median maximal velocity in the horizontal plane was in the range of a few hundred degrees per second. The natural alert signal (door opening and closing) evoked faster head turns than the other stimulus conditions. These results suggest that behaviorally relevant stimuli, such as alert signals, evoke faster head-turn responses in marmoset monkeys.
Contributors: Simhadri, Sravanthi (Author) / Zhou, Yi (Thesis advisor) / Turaga, Pavan (Thesis advisor) / Berisha, Visar (Committee member) / Arizona State University (Publisher)
Created: 2014
Description

This study consisted of several related projects on dynamic spatial hearing by both human and robot listeners. The first experiment investigated the maximum number of sound sources that human listeners could localize at the same time. Speech stimuli were presented simultaneously from different loudspeakers at multiple time intervals. The maximum number of perceived sound sources was close to four. The second experiment asked whether the amplitude modulation of multiple static sound sources could lead to the perception of auditory motion. On the horizontal and vertical planes, four independent noise sources with 60° spacing were amplitude modulated with consecutively increasing phase delays. At lower modulation rates, human listeners could perceive motion in both planes. The third experiment asked whether several sources at static positions could serve as "acoustic landmarks" to improve the localization of other sources. Four continuous speech sources were placed on the horizontal plane with 90° spacing and served as the landmarks. The task was to localize a noise that was played for only three seconds while the listener was passively rotated in a chair in the middle of the loudspeaker array. The listeners localized the sound sources better with landmarks than without. The remaining experiments used an acoustic manikin in an attempt to fuse binaural recordings and motion data to localize sound sources. A dummy head with recording devices was mounted on top of a rotating chair, and motion data were collected. The fourth experiment showed that an extended Kalman filter could be used to localize sound sources in a recursive manner. The fifth experiment demonstrated the use of a fitting method for separating multiple sound sources.
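
The abstract does not give the filter equations, but a minimal sketch of how an extended Kalman filter can recursively estimate a static source's azimuth from interaural time differences during known rotation might look as follows (the ear spacing, noise levels, and the sinusoidal ITD model are assumptions for illustration, not the thesis's implementation):

```python
import numpy as np

# State: world-frame azimuth theta of a static source. Measurement: an
# interaural time difference z = (d / c) * sin(theta - phi), where phi is
# the known chair rotation at each frame.

D_EARS, C_SOUND = 0.18, 343.0  # assumed ear spacing (m), speed of sound (m/s)

def ekf_azimuth(itds, phis, theta0=0.0, P0=1.0, q=1e-6, r=1e-12):
    theta, P = theta0, P0
    for z, phi in zip(itds, phis):
        P += q                                        # predict (static source)
        h = (D_EARS / C_SOUND) * np.sin(theta - phi)  # predicted ITD
        H = (D_EARS / C_SOUND) * np.cos(theta - phi)  # measurement Jacobian
        K = P * H / (H * P * H + r)                   # Kalman gain
        theta += K * (z - h)                          # correct with innovation
        P *= 1 - K * H
    return theta

# Toy usage: a source at 60 degrees, observed while the chair rotates 180 degrees.
phis = np.linspace(0, np.pi, 200)
z = (D_EARS / C_SOUND) * np.sin(np.radians(60) - phis) + 1e-6 * np.random.randn(200)
theta_hat = np.degrees(ekf_azimuth(z, phis))
```
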
Contributors: Zhong, Xuan (Author) / Yost, William (Thesis advisor) / Zhou, Yi (Committee member) / Dorman, Michael (Committee member) / Helms Tillery, Stephen (Committee member) / Arizona State University (Publisher)
Created: 2015
Description

Sound localization can be difficult in a reverberant environment. Fortunately, listeners can utilize various perceptual compensatory mechanisms to increase the reliability of sound localization when provided with ambiguous physical evidence. For example, the directional information of echoes can be perceptually suppressed by the direct sound to achieve a single, fused auditory event in a process called the precedence effect (Litovsky et al., 1999). Visual cues also influence sound localization through a phenomenon known as the ventriloquist effect. It is classically demonstrated by a puppeteer who speaks without visible lip movements while moving the mouth of a puppet synchronously with his/her speech (Gelder and Bertelson, 2003). If the ventriloquist is successful, sound will be "captured" by vision and be perceived to originate at the location of the puppet. This thesis investigates the influence of vision on the spatial localization of audio-visual stimuli. Two types of stereophonic phantom sound sources, created by modulating either the inter-stimulus time interval (ISI) or the level difference between two loudspeakers, were used as auditory stimuli. Participants seated in a sound-attenuated room indicated the perceived locations of these ISI or level-difference stimuli under free-field conditions. The results showed that light cues influenced auditory spatial perception to a greater extent for the ISI stimuli than for the level-difference stimuli. A binaural signal analysis further revealed that the greater visual bias for the ISI phantom sound sources was correlated with the increasingly ambiguous binaural cues of the ISI signals. This finding suggests that when sound localization cues are unreliable, perceptual decisions become increasingly biased towards vision for finding a sound source. These results support the cue saliency theory underlying cross-modal bias and extend this theory to include stereophonic phantom sound sources.
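
One standard way to quantify how ambiguous a binaural cue is, broadly in the spirit of the analysis mentioned above (though not necessarily the thesis's exact method), is the normalized interaural cross-correlation function: its peak lag estimates the ITD, and a low or flat peak indicates an ambiguous cue.

```python
import numpy as np

# Illustrative sketch: normalized interaural cross-correlation over a
# physiologically plausible range of lags (up to ~800 microseconds).

def interaural_cross_correlation(left, right, fs, max_itd=800e-6):
    max_lag = int(max_itd * fs)
    denom = np.sqrt(np.dot(left, left) * np.dot(right, right))
    lags = np.arange(-max_lag, max_lag + 1)
    iacc = np.array([np.dot(left[max(0, -l): len(left) - max(0, l)],
                            right[max(0, l): len(right) - max(0, -l)])
                     for l in lags]) / denom
    itd = lags[np.argmax(iacc)] / fs   # best-matching interaural lag (seconds)
    return itd, float(iacc.max())      # ITD estimate and peak height (cue strength)
```
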
Contributors: Montagne, Christopher (Author) / Zhou, Yi (Thesis advisor) / Buneo, Christopher A (Thesis advisor) / Yost, William A. (Committee member) / Arizona State University (Publisher)
Created: 2015
Description

Music players have evolved to support multi-dimensional and surround sound systems. These audio players are implemented as software applications for different audio hardware systems. Digital formats and wireless networks allow audio content to be readily accessible on smart networked devices. As a result, different audio output platforms, ranging from multi-speaker high-end surround systems to single-unit Bluetooth speakers, have been developed. A large body of research has been carried out in audio processing, beamforming, and sound fields, and new formats have been developed to create realistic audio experiences.

An emerging trend is seen towards high-definition AV systems, virtual reality gear, and gaming applications with multi-dimensional audio. Next-generation media technology is concentrating on virtual reality experiences and devices, with applications not only in gaming but also in medicine, entertainment, engineering, and education. All such systems require realistic audio that corresponds with the visuals.

In the project presented in this thesis, a new portable audio hardware system is designed and developed along with a dedicated Android mobile application to render immersive surround sound experiences with real-time audio effects. The tablet or mobile phone allows the user to control or "play" with sound directionality and apply various audio effects, including sound rotation, spatialization, and other immersive effects. The thesis describes the hardware and software design, provides the theory of the sound effects, and presents demonstrations of the sound application that was created.
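
As a rough illustration of what a "sound rotation" effect can look like in code (a generic amplitude-panning sketch, not the thesis's hardware or DSP design), a mono source can be swept around a circular loudspeaker array with time-varying, power-normalized gains:

```python
import numpy as np

# Rotate a mono source around a ring of loudspeakers. Raised-cosine gains
# peak as the pan angle passes each speaker; per-sample normalization keeps
# the total radiated power roughly constant during the rotation.

def rotate_source(mono, fs, n_speakers=4, rev_per_sec=0.5):
    t = np.arange(len(mono)) / fs
    pan = 2 * np.pi * rev_per_sec * t                      # rotating pan angle
    speaker_az = 2 * np.pi * np.arange(n_speakers) / n_speakers
    gains = np.clip(np.cos(pan[:, None] - speaker_az[None, :]), 0.0, None)
    gains /= np.linalg.norm(gains, axis=1, keepdims=True)  # constant total power
    return gains * mono[:, None]                           # (samples, speakers)

# Toy usage: rotate a 1 s, 440 Hz tone once around a 4-speaker ring.
fs = 44100
tone = 0.5 * np.sin(2 * np.pi * 440 * np.arange(fs) / fs)
channels = rotate_source(tone, fs, n_speakers=4, rev_per_sec=1.0)
```
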
Contributors: Dharmadhikari, Chinmay (Author) / Spanias, Andreas (Thesis advisor) / Turaga, Pavan (Committee member) / Ingalls, Todd (Committee member) / Arizona State University (Publisher)
Created: 2016
Description

Spatial awareness (i.e., the sense of the space that we are in) involves the integration of auditory, visual, vestibular, and proprioceptive sensory information about environmental events. Hearing impairment has negative effects on spatial awareness and can result in deficits in communication and in the overall aesthetic experience of life, especially in noisy or reverberant environments. This deficit occurs because hearing impairment reduces the signal strength needed for auditory spatial processing and changes how auditory information is combined with other sensory inputs (e.g., vision). The influence of multisensory processing on spatial awareness in listeners with normal and impaired hearing is not assessed in clinical evaluations, and patients' everyday sensory experiences are currently not directly measurable. This dissertation investigated the role of vision in auditory localization in listeners with normal and impaired hearing in a naturalistic stimulus setting, using natural gaze orienting responses. Experiments examined two behavioral outcomes, response accuracy and response time, based on eye movements in response to simultaneously presented auditory and visual stimuli. The first set of experiments examined the effects of stimulus spatial saliency on response accuracy and response time, and the extent of visual dominance in both metrics in auditory localization. The results indicate that vision can significantly influence both the speed and accuracy of auditory localization, especially when auditory stimuli are ambiguous. The influence of vision is shown for both normal-hearing and hearing-impaired listeners. The second set of experiments examined the effect of frontal visual stimulation on localizing an auditory target presented from in front of or behind a listener. The results show domain-specific effects of visual capture on both response time and response accuracy. These results support previous findings that auditory-visual interactions are not limited by the spatial rule of proximity. They further suggest a strong influence of vision on both the processing and the decision-making stages of sound source localization for listeners with normal and impaired hearing.
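
As an illustration of how the two outcome measures named above could be computed from a gaze trace (a generic sketch under assumed conventions; the dissertation's actual analysis pipeline is not reproduced here):

```python
import numpy as np

# Response time: first moment gaze velocity exceeds a saccade threshold.
# Response accuracy: final angular error between gaze and the target azimuth.
# The 30 deg/s threshold is an illustrative assumption.

def gaze_outcomes(gaze_az_deg, fs, target_az_deg, vel_thresh_dps=30.0):
    vel = np.abs(np.gradient(gaze_az_deg)) * fs        # gaze speed in deg/s
    onsets = np.flatnonzero(vel > vel_thresh_dps)      # candidate saccade samples
    rt = onsets[0] / fs if onsets.size else np.nan     # response time (s)
    err = abs(gaze_az_deg[-1] - target_az_deg)         # localization error (deg)
    return rt, err
```
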
Contributors: Clayton, Colton (Author) / Zhou, Yi (Thesis advisor) / Azuma, Tamiko (Committee member) / Daliri, Ayoub (Committee member) / Arizona State University (Publisher)
Created: 2021