Title: Identifying optimised speaker identification model using hybrid GRU-CNN feature extraction technique

Authors: Md. Iftekharul Alam Efat; Md. Shazzad Hossain; Shuvra Aditya; Jahanggir Hossain Setu; K.M. Imtiaz-Ud-Din

Addresses: Institute of Information Technology (IIT), Noakhali Science and Technology University, Noakhali, Bangladesh; Banglalink Digital Communications Ltd., Bangladesh; Institute of Information Technology (IIT), Noakhali Science and Technology University, Noakhali, Bangladesh; Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh; Department of Computer Science, American International University-Bangladesh (AIUB), Dhaka, Bangladesh

Abstract: Extracting robust and discriminative features and selecting an appropriate classifier to identify speakers from voice clips are challenging tasks. We therefore considered signal processing techniques and deep neural networks for feature extraction, with state-of-the-art machine-learning models as classifiers. We also introduced a hybrid gated recurrent unit (GRU) and convolutional neural network (CNN) as a novel feature extractor that optimises the subspace loss to extract the best feature vector. In addition, space and time costs are treated as computational parameters for finding the optimal speaker identification pipeline. We then evaluated the pipelines on the large-scale VoxCeleb dataset, comprising 6,000 real-world speakers with multiple voice clips. GRU-CNN + R-CNN achieved the highest accuracy and F1-score, GRU-CNN + CNN the maximum precision, and LPC + KNN the highest recall. LPCC + R-CNN and MFCC + R-CNN were optimal in terms of memory usage and time, respectively.
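
The abstract compares every feature extractor and classifier pairing on accuracy as well as on time and memory. A minimal sketch of that kind of comparison is given below. It is not the authors' implementation: the toy extractors, classifiers, clips and labels are hypothetical stand-ins, and `profile_pipeline` and `compare_pipelines` are illustrative names. Real extractors (for example MFCC, LPCC or GRU-CNN) and classifiers (for example KNN or R-CNN) would take the place of the toy functions.

```python
import time
import tracemalloc
from itertools import product

# Hypothetical stand-ins for real feature extractors and classifiers.
def mean_feature(clip):
    return [sum(clip) / len(clip)]

def energy_feature(clip):
    return [sum(x * x for x in clip)]

def threshold_classifier(features):
    return int(features[0] > 0.5)

def sign_classifier(features):
    return int(features[0] >= 0.0)

def profile_pipeline(extract, classify, clips, labels):
    """Run one extractor + classifier pair; return accuracy, seconds, peak bytes."""
    tracemalloc.start()
    start = time.perf_counter()
    predictions = [classify(extract(clip)) for clip in clips]
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    correct = sum(p == y for p, y in zip(predictions, labels))
    return {"accuracy": correct / len(labels), "seconds": elapsed, "peak_bytes": peak}

def compare_pipelines(extractors, classifiers, clips, labels):
    """Profile every extractor/classifier combination, keyed by 'extractor + classifier'."""
    results = {}
    for (e_name, e), (c_name, c) in product(extractors.items(), classifiers.items()):
        results[f"{e_name} + {c_name}"] = profile_pipeline(e, c, clips, labels)
    return results

# Toy data, invented purely for illustration (not taken from VoxCeleb).
clips = [[0.9, 0.8, 0.7], [0.1, 0.2, 0.0], [0.6, 0.9, 0.6], [0.2, 0.1, 0.3]]
labels = [1, 0, 1, 0]

results = compare_pipelines(
    {"mean": mean_feature, "energy": energy_feature},
    {"threshold": threshold_classifier, "sign": sign_classifier},
    clips,
    labels,
)

# A different pipeline may win on each criterion.
best_accuracy = max(results, key=lambda k: results[k]["accuracy"])
fastest = min(results, key=lambda k: results[k]["seconds"])
leanest = min(results, key=lambda k: results[k]["peak_bytes"])
```

Selecting a separate winner per criterion mirrors how the abstract reports different optimal pairings for accuracy, precision, recall, memory usage and time.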

Keywords: computational complexity; deep learning; feature extraction; speaker identification; VoxCeleb dataset.

DOI: 10.1504/IJCVR.2022.126508

International Journal of Computational Vision and Robotics, 2022 Vol.12 No.6, pp.662 - 685

Received: 11 Apr 2021
Accepted: 10 Oct 2021

Published online: 27 Oct 2022
