Title: Personalised English listening teaching design based on natural language processing and speech synthesis
Authors: Yanling Han
Addresses: Teaching Department of Public Courses, Hunan Communication Polytechnic, Changsha 410132, China
Abstract: Traditional English listening instruction tends to use a one-size-fits-all model, making it difficult to meet individualised learning needs. To address this, this paper first analyses English listening teaching texts using natural language processing (NLP), and designs an encoder and decoder based on multi-head long- and short-term self-attention to convert the text feature sequences into Mel spectrograms. The Mel spectrograms are then converted into speech waveforms by the generative model of an improved generative adversarial network (GAN), while a grouped convolution-based discriminative model distinguishes between real and synthesised speech, prompting the generative model to synthesise more realistic speech waveforms. Finally, a personalised application model of the proposed text-to-speech (TTS) method in English listening teaching is constructed. The experimental results indicate that the proposed method not only improves students' performance but also produces synthesised speech with a high degree of naturalness.
Keywords: English listening instruction; natural language processing; NLP; text to speech; TTS; attention mechanism; generative adversarial network.
DOI: 10.1504/IJICT.2025.146104
International Journal of Information and Communication Technology, 2025 Vol.26 No.11, pp.21 - 37
Received: 10 Mar 2025
Accepted: 22 Mar 2025
Published online: 06 May 2025