Re-sonification of objects, events, and environments

Description

Digital sound synthesis allows the creation of a great variety of sounds. Focusing on interesting or ecologically valid sounds for music, simulation, aesthetics, or other purposes limits the otherwise vast digital audio palette. Tools for creating such sounds vary from arbitrary methods of altering recordings to precise simulations of vibrating objects. In this work, methods of sound synthesis by re-sonification are considered. Re-sonification, herein, refers to the general process of analyzing, possibly transforming, and resynthesizing or reusing recorded sounds in meaningful ways, to convey information. Applied to soundscapes, re-sonification is presented as a means of conveying activity within an environment. Applied to the sounds of objects, this work examines modeling the perception of objects as well as their physical properties and the ability to simulate interactive events with such objects. To create soundscapes to re-sonify geographic environments, a method of automated soundscape design is presented. Using recorded sounds that are classified based on acoustic, social, semantic, and geographic information, this method produces stochastically generated soundscapes to re-sonify selected geographic areas. Drawing on prior knowledge, local sounds and those deemed similar comprise a locale's soundscape. In the context of re-sonifying events, this work examines processes for modeling and estimating the excitations of sounding objects. These include plucking, striking, rubbing, and any interaction that imparts energy into a system, affecting the resultant sound. A method of estimating a linear system's input, constrained to a signal-subspace, is presented and applied toward improving the estimation of percussive excitations for re-sonification. To work toward robust recording-based modeling and re-sonification of objects, new implementations of banded waveguide (BWG) models are proposed for object modeling and sound synthesis. 
Previous implementations of BWGs use arbitrary model parameters and may produce a range of simulations that do not match digital waveguide or modal models of the same design. Subject to linear excitations, some models proposed here behave identically to other equivalently designed physical models. Under nonlinear interactions, such as bowing, many of the proposed implementations exhibit improvements in the attack characteristics of synthesized sounds.

Contributors: Fink, Alex M (Author) / Spanias, Andreas S (Thesis advisor) / Cook, Perry R. (Committee member) / Turaga, Pavan (Committee member) / Tsakalis, Konstantinos (Committee member) / Arizona State University (Publisher)

Created: 2013

Head rotation detection in marmoset monkeys

Description

Head movement is known to improve the accuracy of sound localization in humans and animals. The marmoset is a small-bodied New World monkey species that has become an emerging model for studying auditory function. This thesis aims to detect the horizontal and vertical rotation of head movement in marmoset monkeys.

Experiments were conducted in a sound-attenuated acoustic chamber. Head movement of marmoset monkeys was studied under various auditory and visual stimulation conditions. In order of increasing complexity, these conditions were (1) idle, (2) sound alone, (3) sound and visual signals, and (4) an alert signal produced by opening and closing the chamber door. All of these conditions were tested with the house light either on or off. An infrared camera with a frame rate of 90 Hz was used to capture the head movement of the monkeys. To assist signal detection, two circular markers were attached to the top of the monkey's head. The data analysis used an image-based marker detection scheme. Images were processed using the Computer Vision Toolbox in MATLAB. The markers and their positions were detected using blob detection techniques. From the frame-by-frame marker positions, the angular position, velocity, and acceleration were extracted in the horizontal and vertical planes. Adaptive Otsu thresholding, Kalman filtering, and bounds on marker properties were used to overcome a number of challenges encountered during this analysis, such as finding the image segmentation threshold, continuously tracking markers during large head movements, and rejecting false alarms.
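As a rough illustration of the step from frame-by-frame marker positions to angular position and velocity, the following Python sketch computes the in-image angle of the line joining two markers and differentiates it at the 90 Hz frame rate given above. It is not the thesis's MATLAB implementation. The function names, the (x, y) pixel-coordinate convention, and the finite-difference velocity estimate are assumptions made for this example, and it covers only rotation within the image plane.

```python
import math

FRAME_RATE_HZ = 90.0  # camera frame rate stated in the abstract


def marker_angle_deg(front, back):
    """Angle (degrees) of the line from the back marker to the front marker.

    `front` and `back` are hypothetical (x, y) pixel coordinates of the two
    head markers in one frame.
    """
    dx = front[0] - back[0]
    dy = front[1] - back[1]
    return math.degrees(math.atan2(dy, dx))


def unwrap_deg(angles):
    """Remove artificial 360-degree jumps from a sequence of angles."""
    unwrapped = [angles[0]]
    for angle in angles[1:]:
        # Smallest signed difference to the previous unwrapped angle.
        delta = (angle - unwrapped[-1] + 180.0) % 360.0 - 180.0
        unwrapped.append(unwrapped[-1] + delta)
    return unwrapped


def angular_velocity_deg_per_s(angles_deg, frame_rate_hz=FRAME_RATE_HZ):
    """First-order finite difference of angle, scaled to degrees per second."""
    return [(b - a) * frame_rate_hz for a, b in zip(angles_deg, angles_deg[1:])]


if __name__ == "__main__":
    # Made-up marker tracks for three consecutive frames.
    fronts = [(10.0, 0.0), (10.0, 1.0), (10.0, 2.0)]
    backs = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
    angles = unwrap_deg([marker_angle_deg(f, b) for f, b in zip(fronts, backs)])
    print(angular_velocity_deg_per_s(angles))
```

Angular acceleration could be obtained by applying the same finite difference to the velocity sequence. In practice the marker positions would first be smoothed (the abstract mentions Kalman filtering), since differencing amplifies noise.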

The results show that the blob detection method together with Kalman filtering yielded better performance than other image-based techniques such as optical flow and SURF features. The median of the maximal head turn in the horizontal plane was in the range of 20 to 70 degrees, and the median of the maximal velocity in the horizontal plane was in the range of a few hundred degrees per second. In comparison, the natural alert signal (door opening and closing) evoked faster head turns than the other stimulus conditions. These results suggest that behaviorally relevant stimuli such as alert signals evoke faster head-turn responses in marmoset monkeys.

Contributors: Simhadri, Sravanthi (Author) / Zhou, Yi (Thesis advisor) / Turaga, Pavan (Thesis advisor) / Berisha, Visar (Committee member) / Arizona State University (Publisher)

Created: 2014

Development of hardware and software for a game-like wireless spatial sound distribution system

Description

Several music players have evolved to support multi-dimensional and surround sound systems. These audio players are implemented as software applications for different audio hardware systems. Digital formats and wireless networks make audio content readily accessible on smart networked devices. As a result, audio output platforms ranging from multispeaker high-end surround systems to single-unit Bluetooth speakers have been developed. A large body of research has been carried out in audio processing, beamforming, and sound fields, among other areas, and new formats are being developed to create realistic audio experiences.

There is an emerging trend towards high-definition AV systems, virtual reality gear, and gaming applications with multidimensional audio. Next-generation media technology is concentrating on virtual reality experiences and devices, which have applications not only in gaming but also in other fields including medicine, entertainment, engineering, and education. All such systems also require realistic audio that corresponds with the visuals.

In the project presented in this thesis, a new portable audio hardware system is designed and developed along with a dedicated mobile Android application to render immersive surround sound experiences with real-time audio effects. The tablet and mobile phone allow the user to control or “play” with sound directionality and to apply various audio effects including sound rotation, spatialization, and other immersive experiences. The thesis describes the hardware and software design, provides the theory of the sound effects, and presents demonstrations of the sound application that was created.

Contributors: Dharmadhikari, Chinmay (Author) / Spanias, Andreas (Thesis advisor) / Turaga, Pavan (Committee member) / Ingalls, Todd (Committee member) / Arizona State University (Publisher)

Created: 2016

Language in Trauma: A Pilot Study of Pause Frequency as a Predictor of Cognitive Change Due to Post Traumatic Stress Disorder

Description

With the rise of Posttraumatic Stress Disorder (PTSD) among adults in the United States, understanding the processes of trauma, trauma-related disorders, and the long-term impact of living with them is an area of continued focus for researchers. This is especially a concern in the case of current and former military service members (veterans), whose work activities and deployment cycles place them at an increased risk of exposure to trauma-inducing experiences but who have a low rate of self-referral to healthcare professionals. There is thus an urgent need to develop procedures for early diagnosis and treatment. The present study examines how the tools and findings of the field of linguistics may contribute to the field of trauma research. Previous research has shown that cognition and language production are closely linked. This study focuses on the role of prosody in PTSD and pilots a procedure for data collection and analysis. The data consist of monologic talk from a sample of student-veterans, analyzed with speech software (Praat) for pauses greater than 250 milliseconds per 100 words. The pause frequency was compared to a PCL-5 score, an assessment used to check for PTSD symptoms and to evaluate the need for further assessment and possible diagnosis of PTSD. This pilot study found that the methods successfully elicited data that could be used to measure and test the research questions. Although the findings of the study were inconclusive due to limitations of the participant pool, the research model proved effective as a model for future linguistic research on trauma.
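The pause-frequency measure described above (pauses greater than 250 milliseconds, normalized per 100 words) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the study's Praat procedure. The function name and the input format (a list of pause durations in seconds plus a word count) are assumptions made for this example.

```python
PAUSE_THRESHOLD_S = 0.250  # 250 ms threshold stated in the abstract


def pauses_per_100_words(pause_durations_s, word_count,
                         threshold_s=PAUSE_THRESHOLD_S):
    """Count pauses longer than the threshold, normalized per 100 words.

    `pause_durations_s` is a hypothetical list of silent-interval durations
    (in seconds) exported from an annotation tool; `word_count` is the number
    of words in the same speech sample.
    """
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    long_pauses = sum(1 for d in pause_durations_s if d > threshold_s)
    return 100.0 * long_pauses / word_count


if __name__ == "__main__":
    # Made-up sample: four silent intervals in a 200-word monologue.
    print(pauses_per_100_words([0.10, 0.30, 0.25, 0.50], word_count=200))
```

Note that the sketch treats a pause of exactly 250 ms as below the threshold, following the "greater than" wording. Whether the study counted such boundary cases is not stated.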

Contributors: Southee, Richard Aaron (Author) / Prior, Matthew T. (Thesis advisor) / Pruitt, Kathryn (Committee member) / Pereira, Jennifer (Committee member) / Arizona State University (Publisher)

Created: 2020

Pause for Thought: A Pilot Comparative Study of Pause Placement Amongst Native, Heritage, and Non-Native Speakers

Description

Temporal features and frequency of pauses have been studied extensively in the literature, but interest in the syntactic location of pauses is a more recent development. While previous research has studied the pause patterns of L1 and L2 speakers as well as the effects of pause location on perceptions of fluency, these studies have all taken a binary approach to the categorization of pauses, treating them as occurring either between or within clauses or major constituent boundaries. This research examines pause placement with a finer distinction of pause location, including junctures that occur between and within phrases. To accomplish this, two experiments were conducted. The first experiment gathered read-aloud speech samples from native, non-native, and heritage speakers of Mandarin Chinese, which were then manipulated in Praat to contain only a single pause that occurred either between or within phrases. The samples were presented to native Chinese speakers, who rated perceived fluency under each pause location condition. This preliminary pilot study did not find a significant correlation between pause location and perceptions of fluency at the phrasal level. The second experiment gathered spontaneous speech samples from the same speaker population as Experiment 1. The pauses that occurred in the samples were coded according to a system developed by the author to account for eight different syntactic junctions, and the percentage of pauses at each location was calculated. Analysis showed a significant correlation between pause location and percentage of pauses (p < 0.01), as well as a statistically significant interaction between the effects of speaker status and pause location on percentage of pauses (p = 0.011). 
The findings of this study are limited due to the small population size, but research in this fine-grained analysis of pause location within a clause has implications in the fields of L2 acquisition, psycholinguistics, and natural language processing.
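The per-location percentage used in the second experiment can be illustrated with a short Python sketch. The author's eight-category coding system is not described in this abstract, so the location labels below are hypothetical placeholders, as is the function name. The sketch shows only the counting and normalization step.

```python
from collections import Counter


def pause_location_percentages(coded_pauses):
    """Return the percentage of pauses falling at each coded location.

    `coded_pauses` is a hypothetical list with one location label per pause,
    e.g. the output of manually coding a transcript.
    """
    counts = Counter(coded_pauses)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {location: 100.0 * n / total for location, n in counts.items()}


if __name__ == "__main__":
    # Placeholder labels; the study's actual eight categories are not given here.
    sample = ["between_clauses"] * 3 + ["within_phrase"]
    print(pause_location_percentages(sample))
```

Percentages computed this way for each speaker could then be compared across speaker groups. The statistical tests behind the reported p-values are not specified in the abstract and are not reproduced here.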

Contributors: Kennedy, Mary Kathryn (Author) / Van Gelderen, Elly (Thesis advisor) / Pruitt, Kathryn (Committee member) / Prior, Matthew T (Committee member) / Arizona State University (Publisher)

Created: 2021