Abstract

Automatic speech recognition (ASR) is an essential component of automatic cognitive assessment systems designed to monitor older adults' cognitive status. Although remarkable results have been reported in the ASR field on publicly available academic datasets, two under-explored problems matter for building such systems: ASR performance on aging voices and accuracy in transcribing keywords. Both must be addressed to deliver transcriptions of high enough quality for assessment purposes. In this dissertation, we focus on developing transfer learning techniques to build ASR systems that perform well on older adults with possible cognitive impairment. First, we present a transfer learning technique that improves an open-source ASR system's performance on older adults (80+ years old) using limited data (about 10 hours of audio recordings). We demonstrate that the aging voice dramatically degrades an ASR system's performance and that adapting the system to older adults' recordings through fine-tuning can improve it. We propose a transfer learning technique that uses intermediate outputs to make fine-tuning more efficient when training data are limited. This technique achieves better performance than standard fine-tuning.
