Intonation provides a means to convey information in speech that is independent of the words and their sounds. Finding a way to describe this non-verbal information automatically is important for developing sophisticated speech technology applications. One leading way to model intonation is the superpositional approach, which assumes that intonation has a hierarchical structure and models it by decomposing its physical representation (the F0 contour) into component curves with simpler intonation patterns at multiple levels. However, it is not clear what the set of component curves should be, or how they can be defined with few free parameters in a way that allows them to be used in the analysis and synthesis of English for a wide range of tasks.

  • 2020
