Using conditional adversarial networks for intelligibility improvement for dysarthric speech and laryngectomees

Dinh, Tuan; Kain, Alexander

doi:10.6083/4b29b666b

2020

Abstract

We explored voice conversion systems to improve the speech intelligibility of 1) dysarthric speech and 2) laryngectomees. In the first case, we explored the potential of conditional generative adversarial networks (cGANs) to learn the mapping from habitual speech to clear speech. We evaluated the performance of cGANs in three tasks: 1) speaker-dependent one-to-one mappings, 2) speaker-independent many-to-one mappings, and 3) speaker-independent many-to-many mappings. In the first task, cGANs outperformed a traditional deep neural network (DNN) mapping in terms of average keyword recall accuracy and the number of speakers with improved intelligibility. In the second task, we showed that, without clear speech, we can significantly improve the intelligibility of the habitual speech of one of three speakers. In the third task, which is the most challenging, we improved the keyword recall accuracy for two of three speakers. In the second case, we aimed to improve the speech of laryngectomees in terms of intelligibility and naturalness. We predicted the voicing and voicing degree for laryngectomees from speech spectra using a deep neural network. We used a logarithmically falling synthetic F0 for statement phrases. Spectra were converted to synthetic target spectra using a cGAN.
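
The synthetic F0 mentioned above could be sketched roughly as follows. This is only an illustration, not the authors' implementation: the abstract does not define "logarithmically falling", so the sketch assumes one plausible reading, a contour that falls linearly in the log-frequency domain. The function names, the start and end frequencies (140 Hz and 90 Hz), and the per-frame voicing flags are all hypothetical; in the described system the voicing decisions would come from the deep neural network.

```python
import math


def falling_f0(n_frames, f0_start_hz=140.0, f0_end_hz=90.0):
    """Return an F0 contour (Hz) that falls linearly in log-frequency.

    The start and end values are illustrative assumptions, not values
    taken from the abstract.
    """
    if n_frames < 1:
        raise ValueError("n_frames must be at least 1")
    if n_frames == 1:
        return [f0_start_hz]
    log_start = math.log(f0_start_hz)
    log_end = math.log(f0_end_hz)
    # Equal steps in log-frequency from the first frame to the last.
    step = (log_end - log_start) / (n_frames - 1)
    return [math.exp(log_start + i * step) for i in range(n_frames)]


def apply_voicing(f0_contour, voiced_flags):
    """Set F0 to 0.0 in frames flagged as unvoiced (hypothetical convention)."""
    return [f0 if voiced else 0.0 for f0, voiced in zip(f0_contour, voiced_flags)]


if __name__ == "__main__":
    contour = falling_f0(5)
    # Pretend a voicing predictor marked the third frame as unvoiced.
    print(apply_voicing(contour, [True, True, False, True, True]))
```

Falling linearly in log-frequency means each frame lowers the pitch by the same ratio rather than the same number of hertz; other readings of the abstract's phrase are possible.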

Details

Title

Using conditional adversarial networks for intelligibility improvement for dysarthric speech and laryngectomees

Creator

Dinh, Tuan : Oregon Health and Science University
Kain, Alexander : Oregon Health and Science University

Meeting Name

Research Week, Oregon Health and Science University, 2020

Publisher

Oregon Health and Science University

Date

2020

Subjects

Speech Intelligibility
Speech Acoustics
Deep Learning
Laryngectomy

Keywords

laryngectomees; conditional adversarial nets; dysarthric speech; voice conversion

DOI

https://doi.org/10.6083/4b29b666b

Language

English

Content Type

Abstract

Department

Department of Computer Science and Electrical Engineering

School

School of Medicine

Copyright Status

In copyright - joint owners

Usage Statement

CC BY

Record ID

8295

Record Created

2023-06-29
