Abstract
Spoken Dialogue Systems (SDSs) often lack the natural coordination mechanisms used in human conversation, leading to inefficient interactions. This research investigates verbal and non-verbal cues that support dialogue coordination and explores their application to human-computer interaction. Studies examined differences between computer-directed and human-directed speech, finding that speakers talk more loudly when addressing computers, and identified gaze as a key, though error-prone, cue for addressee perception. Analysis of disfluencies revealed that “um” is listener-oriented, while “uh” is speaker-oriented. Inter-turn gap length was also found to influence response timing and the likelihood of disfluency. Finally, reinforcement learning simulations demonstrated that incorporating such coordination cues can improve SDS dialogue policies.