TY - GEN
T1 - Using conditional adversarial networks for intelligibility improvement for dysarthric speech and laryngectomees
AU - Dinh, Tuan
AU - Kain, Alexander
AB - We explored voice conversion systems to improve the speech intelligibility of 1) dysarthric speakers and 2) laryngectomees. In the first case, we explore the potential of conditional generative adversarial networks (cGANs) to learn the mapping from habitual speech to clear speech. We evaluated the performance of cGANs on three tasks: 1) speaker-dependent one-to-one mappings, 2) speaker-independent many-to-one mappings, and 3) speaker-independent many-to-many mappings. In the first task, cGANs outperformed a traditional deep neural network (DNN) mapping in terms of average keyword recall accuracy and the number of speakers with improved intelligibility. In the second task, we showed that, even without clear speech from a speaker, we can significantly improve the intelligibility of that speaker's habitual speech for one of three speakers. In the third and most challenging task, we improved the keyword recall accuracy for two of three speakers. In the second case, we aim to improve the speech of laryngectomees in terms of intelligibility and naturalness. We predict the voicing and degree of voicing for laryngectomees from speech spectra using a deep neural network. We use a logarithmically falling synthetic F0 for statement phrases. Spectra are converted to synthetic target spectra using a cGAN.
AD - Oregon Health and Science University
PB - Oregon Health and Science University
DA - 2020
PY - 2020
DO - 10.6083/4b29b666b
ID - 8295
KW - Speech Intelligibility
KW - Speech Acoustics
KW - Deep Learning
KW - Laryngectomy
KW - laryngectomees
KW - conditional adversarial nets
KW - dysarthric speech
KW - voice conversion
LA - eng
UR - https://digitalcollections.ohsu.edu/record/8295/files/Tuan-Dinh.pdf
UR - https://digitalcollections.ohsu.edu/record/8295/files/Dinh_Presentation.pdf
ER -