Title: Optimisation of training samples in recognition of overlapping speech and identification of speaker in a two speakers situation

Authors: S. Shanthi Therese; C. Lingam

Addresses: Ramrao Adik Institute of Technology, Nerul, Navi Mumbai 400706, India; Thadomal Shahani Engineering College, 32nd Road, TPS III, Bandra West, Mumbai, 400050, India; Affiliated to: University of Mumbai, India ' Pillai HOC College of Engineering and Technology, Khalapur, HOC Colony Rd, Taluka, Rasayani, Maharashtra 410207, India; Affiliated to: University of Mumbai, India

Abstract: Recognition of overlapping speech remains a challenging problem in automatic speech recognition (ASR). In this paper, we propose a technique for overlapping speech recognition integrated with the firefly optimization technique. Overlapped segments are thoroughly analysed for the different dominant frequencies involved in the mixture. We create an audio splitting function. The split audio signals are converted into mel cepstral coefficients, and the intensity variations of the signals are indicated by their cepstrum. Phoneme density updated cepstrum (PDUC) features are extracted from both spectrum feature analysis and mel frequency cepstral coefficients (MFCC). Further, the firefly optimization technique is used for clustering and for selecting the most relevant features. Datasets from the speech separation challenge (SSC) corpus are used to evaluate the results. From the results, we conclude that as few as 20% to 30% of the training samples are sufficient to achieve recognition accuracy above 90%.
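
As an illustration of the optimisation step named in the abstract, the following is a minimal sketch of a generic firefly algorithm minimising a toy objective. It is not the authors' implementation: the function names `firefly_minimize` and `sphere`, the parameters `alpha`, `beta0` and `gamma`, the search bounds and the toy objective are all assumptions made for this example, and the abstract does not say how the algorithm is applied to PDUC features.

```python
import math
import random


def firefly_minimize(objective, dim, n_fireflies=15, n_iter=50,
                     alpha=0.2, beta0=1.0, gamma=1.0,
                     bounds=(-5.0, 5.0), seed=0):
    """Minimise `objective` with a basic firefly algorithm (illustrative only)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions; a lower cost means a "brighter" firefly.
    swarm = [[rng.uniform(lo, hi) for _ in range(dim)]
             for _ in range(n_fireflies)]
    cost = [objective(x) for x in swarm]

    best_idx = min(range(n_fireflies), key=cost.__getitem__)
    best_x, best_f = list(swarm[best_idx]), cost[best_idx]
    history = []  # best cost seen so far, recorded once per iteration

    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # j is brighter, so i moves towards j
                    r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                    # Attractiveness decays with squared distance.
                    beta = beta0 * math.exp(-gamma * r2)
                    swarm[i] = [
                        min(hi, max(lo, a + beta * (b - a)
                                    + alpha * (rng.random() - 0.5)))
                        for a, b in zip(swarm[i], swarm[j])
                    ]
                    cost[i] = objective(swarm[i])
                    if cost[i] < best_f:
                        best_x, best_f = list(swarm[i]), cost[i]
        alpha *= 0.97  # slowly shrink the random step
        history.append(best_f)

    return best_x, best_f, history


def sphere(x):
    """Toy objective (assumed for illustration): sum of squares."""
    return sum(v * v for v in x)


best_x, best_f, history = firefly_minimize(sphere, dim=2)
print(best_x, best_f)
```

In a feature-selection setting, the objective would typically score a candidate feature subset or cluster assignment instead of the toy `sphere` function; that mapping is left open here because the abstract does not specify it.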

Keywords: overlapping speech recognition; audio split; spectrum feature analysis; clustering using firefly algorithm.

DOI: 10.1504/IJAIP.2020.108773

International Journal of Advanced Intelligence Paradigms, 2020 Vol.17 No.1/2, pp.159 - 177

Received: 09 Dec 2017
Accepted: 25 Apr 2018

Published online: 03 Aug 2020
