
Title: The speech emotion recognition in multi-languages using an ensemble deep learning-based technique

Authors: Putta Aruna Kumari; Balusu Nandini

Addresses: Department of Computer Science, Telangana Social Welfare Residential Institution for Women, Nizamabad, Telangana State, 503001, India (both authors)

Abstract: Accurate emotion detection from speech signals is essential for enhancing human-computer interaction (HCI) systems. However, existing speech emotion recognition (SER) methods often suffer from poor feature representation and limited dataset diversity, resulting in suboptimal performance. To address these challenges, this paper proposes an advanced deep bottleneck residual convolutional neural network (DBR-CNN) integrated with the SEResNeXt-101 feature extraction framework and optimised using the coati optimisation algorithm (COA). The model is trained and evaluated on four diverse benchmark datasets: URDU, EMO-DB, EMOVO, and SAVEE. In the pre-processing phase, speech signals undergo noise reduction and normalisation to enhance data quality. The SEResNeXt-101 extractor then captures high-level features with reduced complexity, and these features are subsequently processed by the DBR-CNN to classify emotions with greater accuracy. The COA fine-tunes the model to improve classification efficiency. Experimental results demonstrate that the proposed model significantly outperforms state-of-the-art (SOTA) methods. The optimised DBR-CNN framework provides robust, scalable, and highly accurate emotion recognition.
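To make the pipeline described in the abstract concrete, the following is a minimal, self-contained PyTorch sketch, not the authors' implementation: it combines a squeeze-and-excitation gate (the "SE" in SEResNeXt) with bottleneck residual blocks (the core idea behind a DBR-CNN) to classify log-mel spectrograms. All layer widths, the single-channel spectrogram input, the depth, and the seven emotion classes are illustrative assumptions; the COA hyperparameter tuning stage is mentioned in a comment but not implemented.

    # Minimal sketch (not the paper's exact architecture): SE-gated bottleneck
    # residual blocks over log-mel spectrograms for speech emotion recognition.
    # COA would tune hyperparameters such as width/depth; that step is omitted.
    import torch
    import torch.nn as nn

    class SEBlock(nn.Module):
        """Squeeze-and-excitation: reweight channels by global context."""
        def __init__(self, channels: int, reduction: int = 16):
            super().__init__()
            self.fc = nn.Sequential(
                nn.Linear(channels, channels // reduction),
                nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels),
                nn.Sigmoid(),
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            b, c, _, _ = x.shape
            w = self.fc(x.mean(dim=(2, 3)))    # squeeze: global average pool
            return x * w.view(b, c, 1, 1)      # excite: per-channel gain

    class BottleneckResidual(nn.Module):
        """1x1 -> 3x3 -> 1x1 bottleneck with an SE gate and identity skip."""
        def __init__(self, channels: int, bottleneck: int):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(channels, bottleneck, 1, bias=False),
                nn.BatchNorm2d(bottleneck), nn.ReLU(inplace=True),
                nn.Conv2d(bottleneck, bottleneck, 3, padding=1, bias=False),
                nn.BatchNorm2d(bottleneck), nn.ReLU(inplace=True),
                nn.Conv2d(bottleneck, channels, 1, bias=False),
                nn.BatchNorm2d(channels),
                SEBlock(channels),
            )
            self.act = nn.ReLU(inplace=True)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.act(x + self.body(x))  # residual connection

    class TinySERNet(nn.Module):
        """Toy SER classifier: stem -> stacked SE bottleneck blocks -> head."""
        def __init__(self, n_classes: int = 7, width: int = 64, depth: int = 3):
            super().__init__()
            self.stem = nn.Sequential(
                nn.Conv2d(1, width, 3, stride=2, padding=1, bias=False),
                nn.BatchNorm2d(width), nn.ReLU(inplace=True),
            )
            self.blocks = nn.Sequential(
                *[BottleneckResidual(width, width // 4) for _ in range(depth)]
            )
            self.head = nn.Sequential(
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(width, n_classes),
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.head(self.blocks(self.stem(x)))

    if __name__ == "__main__":
        # Batch of 4 log-mel spectrograms: (batch, 1 channel, 64 mel bins, 200 frames).
        mel = torch.randn(4, 1, 64, 200)
        logits = TinySERNet()(mel)
        print(logits.shape)                    # torch.Size([4, 7])

The sketch uses a plain residual skip rather than SEResNeXt's grouped (cardinality-based) convolutions, and a toy classifier head in place of the full DBR-CNN; it is intended only to illustrate how the abstract's components fit together.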

Keywords: speech emotion recognition; SER; feature extraction; deep bottleneck residual convolutional neural network; DBR-CNN; noise reduction; coati optimisation algorithm; COA; speech emotion datasets; SEResNeXt-101; deep learning.

DOI: 10.1504/IJAPR.2024.146817

International Journal of Applied Pattern Recognition, 2024, Vol. 7, No. 3/4, pp. 263-295

Received: 09 Dec 2024
Accepted: 30 Apr 2025

Published online: 19 Jun 2025
