The main goal of this thesis is to explore how spoken dialogue varies in face-to-face and remote collaborative learning settings. Speech data is collected from student participants solving mathematical problems collaboratively on a tablet. Spoken dialogue is analyzed based on conversational and acoustic features in both the settings. Looking for collaborative differences of transactivity and dialogue initiative, both settings are compared in detail using machine learning classification techniques based on acoustic and prosodic features of speech. Transactivity is defined as a joint construction of knowledge by peers. The main contributions of this thesis are: a speech corpus to analyze spoken dialogue in face-to-face and remote settings and an empirical analysis of conversation, collaboration, and speech prosody in both the settings. The results from the experiments show that amount of overlap is lower in remote dialogue than in the face-to-face setting. There is a significant difference in transactivity among strangers. My research benefits the computer-supported collaborative learning community by providing an analysis that can be used to build more efficient tools for supporting remote collaborative learning.
Consequently, the Winograd Schema Challenge has been suggested as an alternative to the Turing test. As opposed to deciding the intelligent behavior with the help of chat servers, like it was done in the Turing test, the Winograd Schema Challenge is a question answering test. It consists of sentence and question pairs such that the answer to the question depends on the resolution of a definite pronoun or adjective in the sentence. The answers are fairly intuitive for humans but they are difficult for machines because it requires some sort of background or commonsense knowledge about the sentence.
In this thesis, I propose a novel technique to solve the Winograd Schema Challenge. The technique has three basic modules at its disposal, namely, a Semantic Parser that parses the English text (both sentences and questions) into a formal representation, an Automatic Background Knowledge Extractor that extracts the Background Knowledge pertaining to the given Winograd sentence, and an Answer Set Programming Reasoning Engine that reasons on the given Winograd sentence and the corresponding Background Knowledge. The applicability of the technique is illustrated by solving a subset of Winograd Schema Challenge pertaining to a certain type of Background Knowledge. The technique is evaluated on the subset and a notable accuracy is achieved.
This dissertation focuses on how dialogue systems can build rapport utilizing acoustic-prosodic entrainment. Acoustic-prosodic entrainment occurs when individuals adapt their acoustic-prosodic features of speech, such as tone of voice or loudness, to one another over the course of a conversation. Correlated with liking and task success, a dialogue system which entrains may enhance rapport. Entrainment, however, is very challenging to model. People entrain on different features in many ways and how to design entrainment to build rapport is unclear. The first goal of this dissertation is to explore how acoustic-prosodic entrainment can be modeled to build rapport.
Towards this goal, this work presents a series of studies comparing, evaluating, and iterating on the design of entrainment, motivated and informed by human-human dialogue. These models of entrainment are implemented in the dialogue system of a robotic learning companion. Learning companions are educational agents that engage students socially to increase motivation and facilitate learning. As a learning companion’s ability to be socially responsive increases, so do vital learning outcomes. A second goal of this dissertation is to explore the effects of entrainment on concrete outcomes such as learning in interactions with robotic learning companions.
This dissertation results in contributions both technical and theoretical. Technical contributions include a robust and modular dialogue system capable of producing prosodic entrainment and other socially-responsive behavior. One of the first systems of its kind, the results demonstrate that an entraining, social learning companion can positively build rapport and increase learning. This dissertation provides support for exploring phenomena like entrainment to enhance factors such as rapport and learning and provides a platform with which to explore these phenomena in future work.
Conversational entrainment, a pervasive communication phenomenon in which dialogue partners adapt their behaviors to align more closely with one another, is considered essential for successful spoken interaction. While well-established in other disciplines, this phenomenon has received limited attention in the field of speech pathology and the study of communication breakdowns in clinical populations. The current study examined acoustic-prosodic entrainment, as well as a measure of communicative success, in three distinctly different dialogue groups: (i) healthy native vs. healthy native speakers (Control), (ii) healthy native vs. foreign-accented speakers (Accented), and (iii) healthy native vs. dysarthric speakers (Disordered). Dialogue group comparisons revealed significant differences in how the groups entrain on particular acoustic–prosodic features, including pitch, intensity, and jitter. Most notably, the Disordered dialogues were characterized by significantly less acoustic-prosodic entrainment than the Control dialogues. Further, a positive relationship between entrainment indices and communicative success was identified. These results suggest that the study of conversational entrainment in speech pathology will have essential implications for both scientific theory and clinical application in this domain.