Theses and Dissertations
Filtering by
- All Subjects: engineering
- Creators: Berisha, Visar
Spatial audio can be especially useful for directing human attention. However, delivering spatial audio through speakers, rather than headphones that deliver audio directly to the ears, produces the issue of crosstalk, where sounds from each of the two speakers reach the opposite ear, inhibiting the spatialized effect. A research team at Meteor Studio has developed an algorithm called Xblock that solves this issue using a crosstalk cancellation technique. This thesis project expands upon the existing Xblock IoT system by providing a way to test the accuracy of the directionality of sounds generated with spatial audio. More specifically, the objective is to determine whether the usage of Xblock with smart speakers can provide generalized audio localization, which refers to the ability to detect a general direction of where a sound might be coming from. This project also expands upon the existing Xblock technique to integrate voice commands, where users can verbalize the name of a lost item using the phrase, “Find [item]”, and the IoT system will use spatial audio to guide them to it.
Spatial audio can be especially useful for directing human attention. However, delivering spatial audio through speakers, rather than headphones that deliver audio directly to the ears, produces the issue of crosstalk, where sounds from each of the two speakers reach the opposite ear, inhibiting the spatialized effect. A research team at Meteor Studio has developed an algorithm called Xblock that solves this issue using a crosstalk cancellation technique. This thesis project expands upon the existing Xblock IoT system by providing a way to test the accuracy of the directionality of sounds generated with spatial audio. More specifically, the objective is to determine whether the usage of Xblock with smart speakers can provide generalized audio localization, which refers to the ability to detect a general direction of where a sound might be coming from. This project also expands upon the existing Xblock technique to integrate voice commands, where users can verbalize the name of a lost item using the phrase, “Find [item]”, and the IoT system will use spatial audio to guide them to it.
This dissertation focuses on how dialogue systems can build rapport utilizing acoustic-prosodic entrainment. Acoustic-prosodic entrainment occurs when individuals adapt their acoustic-prosodic features of speech, such as tone of voice or loudness, to one another over the course of a conversation. Correlated with liking and task success, a dialogue system which entrains may enhance rapport. Entrainment, however, is very challenging to model. People entrain on different features in many ways and how to design entrainment to build rapport is unclear. The first goal of this dissertation is to explore how acoustic-prosodic entrainment can be modeled to build rapport.
Towards this goal, this work presents a series of studies comparing, evaluating, and iterating on the design of entrainment, motivated and informed by human-human dialogue. These models of entrainment are implemented in the dialogue system of a robotic learning companion. Learning companions are educational agents that engage students socially to increase motivation and facilitate learning. As a learning companion’s ability to be socially responsive increases, so do vital learning outcomes. A second goal of this dissertation is to explore the effects of entrainment on concrete outcomes such as learning in interactions with robotic learning companions.
This dissertation results in contributions both technical and theoretical. Technical contributions include a robust and modular dialogue system capable of producing prosodic entrainment and other socially-responsive behavior. One of the first systems of its kind, the results demonstrate that an entraining, social learning companion can positively build rapport and increase learning. This dissertation provides support for exploring phenomena like entrainment to enhance factors such as rapport and learning and provides a platform with which to explore these phenomena in future work.