
Title: A corpus-based study on the characteristics of the use of spoken English chunks

Authors: Rong Hu

Addresses: College of Humanities, Ningbo University of Finance & Economics, Ningbo, 315175, China

Abstract: This study constructs SELL, a corpus of spoken English, proposes a spoken English recognition technique for chunk use based on a CNN-LSTM-SA model, and analyses the results from both the SELL corpus and the recognition model. The model's loss falls sharply and then rises slowly. At 300 iterations, both the loss and the accuracy curves reach an inflection point; from there the accuracy tends to converge, with training accuracy close to 88%, significantly higher than that of the other algorithms compared. The CNN-LSTM network performs best with the ReLU and tanh activation functions selected for the study, with MAE and RMSE of 26.54 and 36.11, respectively. The model outperforms the other algorithms at all six complexity levels, with a margin of at least about 4%, a very stable performance advantage.

Keywords: corpus; spoken English; chunk features; self-attentive mechanism; CNN-LSTM.

DOI: 10.1504/IJCSYSE.2024.137444

International Journal of Computational Systems Engineering, 2024 Vol.8 No.1/2, pp.40 - 47

Received: 08 Aug 2022
Accepted: 30 Jan 2023
Published online: 19 Mar 2024
