Title: Robust time domain scalogram filter bank feature learning model for speech depression detection with metaheuristic spatio temporal residual BIGRU model

Authors: Uma Jaishankar; Jagannath H. Nirmal; Girish Gidaye

Addresses: Department of Electronics, KJ Somaiya College of Engineering, Vidyanagar, Vidya Vihar East, Vidyavihar, Mumbai, Maharashtra 400077, India; KJ Somaiya College of Engineering, Vidyanagar, Vidya Vihar East, Vidyavihar, Mumbai, Maharashtra 400077, India; Vidyalankar Institute of Technology, Sangam Nagar, Mumbai, Maharashtra 400037, India

Abstract: Speech patterns have become a viable biometric for detecting depressive disorders, but existing methods struggle to model temporal dependencies and to extract reliable features from speech data. To overcome these challenges, this study develops a time-domain scalogram filter bank feature-learning model that incorporates nonlinear transformation, increased scalogram downsampling, and time-domain filtering to improve feature extraction. By integrating spatial and temporal attention mechanisms with residual learning, the convolutional spatial and temporal attention-based residual Gazelle bidirectional gated recurrent unit (BIGRU) model (CSTAResGBIGRU) is proposed. The datasets used in this study are the Distress Analysis Interview Corpus/Wizard-of-Oz set (DAIC-WOZ) and the Emotional Audio-Textual Corpus (EATD-Corpus). Furthermore, multiple learning-curve analyses and ablation studies are carried out to demonstrate the efficacy of the proposed model. In the experiments, the proposed model outperforms state-of-the-art techniques, attaining 99.31% and 99.5% accuracy on DAIC-WOZ and EATD-Corpus, respectively.

Keywords: speech depression; SD; classification; benchmark dataset; scalogram filter; performance metrics; pre-processing.

DOI: 10.1504/IJBET.2025.145219

International Journal of Biomedical Engineering and Technology, 2025 Vol.47 No.4, pp.348 - 382

Received: 12 Aug 2024
Accepted: 31 Oct 2024

Published online: 26 Mar 2025
