
Title: Weighted finite-state transducer-based dysarthric speech recognition error correction using context-dependent pronunciation variation modelling

Authors: Woo Kyeong Seong; Ji Hun Park

Addresses: School of Information and Communications, Gwangju Institute of Science and Technology (GIST), Gwangju 500-712, Korea; School of Information and Communications, Gwangju Institute of Science and Technology (GIST), Gwangju 500-712, Korea

Abstract: In this paper, an error correction method is proposed to improve the performance of dysarthric automatic speech recognition (ASR). To this end, context-dependent pronunciation variations are modelled by using a weighted Kullback-Leibler (KL) distance between acoustic models of ASR. Then, the context-dependent pronunciation variation model is converted into a weighted finite-state transducer (WFST) and combined with a lexicon and a language model. It is shown from ASR experiments that average word error rate (WER) of a WFST-based ASR system with the proposed error correction method is relatively reduced by 19.73%, compared to an ASR system without error correction. Moreover, it is shown that the error correction method using a weighted KL distance relatively reduces average WER by 3.81%, compared to that using a KL distance.
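
Illustration: the abstract relies on two quantities, a KL distance between acoustic models and a relative WER reduction. The following is only a minimal sketch of both, not the authors' implementation. It assumes each acoustic model is summarised by a single diagonal-covariance Gaussian, and it uses the plain symmetric KL distance, because the abstract does not specify the weighting scheme. All function names and numeric inputs are hypothetical.

```python
import math


def kl_diag_gaussian(mu_p, var_p, mu_q, var_q):
    """KL(p || q) for two diagonal-covariance Gaussians given as lists."""
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        # Per-dimension closed form for univariate Gaussians.
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total


def symmetric_kl(mu_p, var_p, mu_q, var_q):
    """Symmetric KL distance: KL(p || q) + KL(q || p). Unweighted (assumption)."""
    return (kl_diag_gaussian(mu_p, var_p, mu_q, var_q)
            + kl_diag_gaussian(mu_q, var_q, mu_p, var_p))


def relative_wer_reduction(baseline_wer, corrected_wer):
    """Relative WER reduction in percent, measured against the baseline WER."""
    return 100.0 * (baseline_wer - corrected_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical two-dimensional models, not values from the paper.
    distance = symmetric_kl([0.0, 1.0], [1.0, 2.0], [0.5, 1.0], [1.0, 1.0])
    print("symmetric KL distance:", distance)

    # Hypothetical WERs: a drop from 50% to 40% is a 20% relative reduction.
    print("relative WER reduction (%):", relative_wer_reduction(50.0, 40.0))
```

Read this way, the reported 19.73% is a reduction relative to the WER of the system without error correction, not a drop of 19.73 absolute percentage points.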

Keywords: dysarthric ASR system; weighted finite-state transducers; WFST; Kullback-Leibler distance; context-dependent confusion matrix; pronunciation variation model; error correction; dysarthric speech recognition; modelling.

DOI: 10.1504/IJESMS.2014.058418

International Journal of Engineering Systems Modelling and Simulation, 2014 Vol.6 No.1/2, pp.4 - 11

Received: 10 Jun 2013
Accepted: 25 Aug 2013
Published online: 17 Jun 2014


