Title: Improved harmonic spectral envelope extraction for singer classification with hybridised model
Authors: Balachandra Kumaraswamy
Addresses: B.M.S. College of Engineering, Bangalore, India
Abstract: The singing voice affects listeners through the addition of expression, lyrics, and instruments. Human beings can easily distinguish the singing voice in a given audio clip owing to their perceptual abilities and auditory physiology. On the other hand, without human intervention, it is not simple to identify non-vocal portions, vocal portions, emotions, and singers from the signal owing to its intrinsic complexity. This paper proposes a new singer classification mechanism with four stages: 'pre-processing, vocal segmentation, feature extraction, and classification'. Initially, an 'improved convolutional neural network (CNN)' is deployed to segment the vocal part. Further, features like 'zero crossing rate (ZCR), Mel-frequency cepstral coefficients (MFCCs), vibration estimation and improved harmonic spectral envelope' are derived and passed to a 'bidirectional gated recurrent unit (BI-GRU) and long short-term memory (LSTM)'. The median of the LSTM and BI-GRU outputs is taken as the final result.
Keywords: singer classification; zero crossing rate; ZCR; convolutional neural network; improved CNN; bidirectional gated recurrent unit; BI-GRU; long short-term memory; LSTM; Mel-frequency cepstral coefficients; MFCCs.
DOI: 10.1504/IJBIC.2024.141676
International Journal of Bio-Inspired Computation, 2024 Vol.24 No.3, pp.150 - 163
Received: 03 Jan 2023
Accepted: 15 Jul 2023
Published online: 30 Sep 2024
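
Two of the steps named in the abstract, the zero crossing rate feature and the median fusion of the LSTM and BI-GRU outputs, can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so this assumes the standard ZCR definition (fraction of adjacent samples whose signs differ) and assumes each model emits one score per singer class. The function names are hypothetical and are not taken from the paper.

```python
from statistics import median


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (standard ZCR; assumed)."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def fuse_scores(lstm_scores, bigru_scores):
    """Element-wise median of two per-class score vectors (assumed fusion rule).

    With exactly two models, the median of each pair equals its arithmetic mean.
    """
    if len(lstm_scores) != len(bigru_scores):
        raise ValueError("both models must score the same number of classes")
    return [median(pair) for pair in zip(lstm_scores, bigru_scores)]


def predict_singer(lstm_scores, bigru_scores):
    """Index of the singer class with the highest fused score."""
    fused = fuse_scores(lstm_scores, bigru_scores)
    return max(range(len(fused)), key=fused.__getitem__)
```

For example, a frame that alternates in sign at every sample has a ZCR of 1.0, and with only two classifiers the fused score for each class is simply the midpoint of the two model scores.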