
Title: LSB-DSN: sensor-assisted deep learning for robust English speech recognition

Authors: Aili Tang; Dezhi Zeng

Addresses: School of Liberal Education, Liuzhou Railway Vocational Technical College, Liuzhou Guangxi, 545616, China; Guangxi Taiji Kensijie Information System Consulting Co., Ltd., Liuzhou Guangxi, 545026, China

Abstract: English speech recognition systems are essential to modern communication, supporting a personalised user experience and global communication. Such systems combine sensor devices with deep learning techniques to remain robust across diverse environments. The efficiency of traditional systems is reduced by accents, varying pronunciations, and limited contextual considerations. This paper introduces lion-swarm boosted deep sesame networks (LSB-DSN), a new speech recognition framework that fuses sensor technologies and deep learning to improve accuracy and robustness. The model combines acoustic signals with sensor-based inputs from accelerometers, gyroscopes, and electromyography devices to capture subtle speech-related modulations for better recognition across diverse environments. A hierarchical attention mechanism and lion swarm optimisation (LSO) enable optimal feature selection, reducing recognition error and computational overhead. In experiments, the framework achieves a 9.5% word error rate in clean conditions, a cross-entropy loss of 0.65, and a processing latency of 100 ms, results that are far superior to those of baseline models for noisy environments. The proposed framework can adapt to different accents and pronunciations, making it a strong solution for real-world speech recognition applications.

Keywords: English speech recognition; signal modulation; sensor devices; subnetworks; lion swarm optimisation; LSO; similarity measures; deep sesame networks.

DOI: 10.1504/IJSNET.2025.148452

International Journal of Sensor Networks, 2025 Vol.49 No.1, pp.50 - 68

Received: 27 Feb 2025
Accepted: 11 Mar 2025
Published online: 05 Sep 2025 *


