Title: Continuous pronunciation error recognition of English vocabulary based on dual modal fusion features
Authors: Lin Wu
Addresses: School of Foreign Language, Changchun University of Technology, Changchun, 130000, China
Abstract: To reduce the word error rate and character error rate in English pronunciation recognition, a continuous pronunciation error recognition method for English vocabulary based on dual modal fusion features is proposed. First, the continuous speech data of English vocabulary are pre-processed: the mouth region of interest (ROI) is selected from the visual information, followed by colour normalisation and horizontal flipping. Second, the short-time Fourier transform is used to extract audio features, which are normalised to ensure temporal consistency of the data. Then, for the pre-processed continuous speech data, the visual and auditory features of speech are fused using kernel principal component analysis. Finally, the fused features are used as input to construct a pronunciation error recognition model for continuous English vocabulary speech. Experimental results show that the proposed method achieves a word error rate below 7.9% and a character error rate below 6.4% for continuous pronunciation errors in English vocabulary.
Keywords: dual modal fusion features; English vocabulary; continuous speech; pronunciation error; intelligent recognition.
International Journal of Biometrics, 2026 Vol.18 No.1/2/3, pp.182 - 197
Received: 13 Feb 2025
Accepted: 24 Apr 2025
Published online: 13 Jan 2026
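
The word error rate and character error rate reported in the abstract are conventionally computed as the edit distance between a reference transcript and the recognised output, divided by the length of the reference. The abstract does not state the exact scoring protocol used, so the following is only a minimal sketch of the standard definitions. The function names and the example sentences are hypothetical, and counting spaces as characters in the character error rate is an assumption.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Counts the minimum number of substitutions, insertions and
    deletions needed to turn `ref` into `hyp`.
    """
    # prev[j] holds the distance between the ref prefix processed
    # so far and the first j items of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Edit distance over whitespace-separated words / reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def character_error_rate(reference, hypothesis):
    """Edit distance over characters / reference length.

    Assumption: spaces are counted as characters.
    """
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only.
    reference = "the cat sat on the mat"
    hypothesis = "the cat sit on mat"
    print(f"WER: {word_error_rate(reference, hypothesis):.3f}")
    print(f"CER: {character_error_rate(reference, hypothesis):.3f}")
```

Both rates can exceed 1.0 when the recognised output contains many insertions, which is why they are normalised by the reference length rather than by the longer of the two sequences.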