Title: An approach for speaker diarisation using whale-anti coronavirus optimisation integrated deep fuzzy clustering
Authors: K. Vijay Kumar; Ramisetty Rajeswara Rao
Addresses: Department of CSE, Srinivasa Institute of Engineering and Technology, Cheyyeru, Andhra Pradesh, India; Department of Computer Science and Engineering, JNTU-GV College of Engineering, Vizianagaram, India
Abstract: In this paper, the Anticorona whale optimisation (ACWOA) method is developed for speaker diarisation and is used to train the deep fuzzy clustering (DFC) algorithm that performs the final clustering. The input audio is fed into a feature extraction procedure to obtain relevant characteristics, such as Mel frequency cepstral coefficients (MFCCs), line spectral frequencies (LSF), and line prediction cepstral coefficients (LPCCs). Music and silence removal are used in the speech activity detection (SAD) stage. After the speech activity has been identified, the speakers are segmented using a Bayesian inference criterion (BIC) score. The ACWOA-based DFC outperformed the other methods, with the best testing accuracy of 0.891 and the lowest diarisation error, false discovery rate (FDR), false negative rate (FNR) and false positive rate (FPR) of 0.618, 0.289, 0.148, and 0.130, respectively. In terms of testing accuracy for test case 1, the proposed approach outperforms the existing approaches (active learning, DE+K-means, LSTM, MCGAN, and ANN-ABC-LA) by 9.31%, 7.40%, 6.73%, 5.49%, and 3.59%, respectively.
Keywords: speaker diarisation; deep fuzzy clustering; DFC; Bayesian inference criterion; BIC; speech activity detection; SAD; speaker segmentation; Mel frequency cepstral coefficients; MFCCs; line prediction cepstral coefficients; LPCCs.
DOI: 10.1504/IJCVR.2025.144776
International Journal of Computational Vision and Robotics, 2025 Vol.15 No.2, pp.177 - 197
Received: 25 Nov 2021
Accepted: 15 Mar 2023
Published online: 03 Mar 2025