Title: Phoneme dependent inter-session variability reduction for speaker verification

Authors: Haoze Lu; Wenbin Zhang; Yasuo Horiuchi; Shingo Kuroiwa

Addresses: Graduate School of Advanced Integration Science, Chiba University, Chiba 2638522, Japan (all authors)

Abstract: GMM-UBM super-vectors can model speakers poorly in speaker verification because of inter-session variability, especially when only a small number of training utterances are available. In this study, we propose a phoneme-dependent method to suppress inter-session variability. A speaker's model is represented by several phoneme-specific Gaussian mixture models. Each covers an individual phoneme, whose inter-session variability is constrained to an inter-session independent subspace constructed by principal component analysis (PCA) from a corpus uttered by a single speaker and recorded over a long period. SVM-based experiments on a large corpus, constructed by the National Research Institute of Police Science (NRIPS) to evaluate Japanese speaker recognition, demonstrate the improvements gained from the proposed method.
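The PCA step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the session subspace is estimated from the leading principal directions of one speaker's per-session vectors and then projected out, and it uses synthetic toy data. The function names `session_subspace` and `remove_session_variability`, the dimensions, and the number of components are all hypothetical. In a phoneme-dependent setting, one such basis would presumably be estimated per phoneme.

```python
import numpy as np


def session_subspace(session_vectors, n_components):
    """Estimate an orthonormal basis (dim x k) for session-to-session variation.

    Assumption: every row is a feature vector (e.g. a super-vector) from the
    SAME speaker, one row per recording session.
    """
    X = np.asarray(session_vectors, dtype=float)
    # Subtracting the speaker's mean leaves only session-dependent variation.
    centered = X - X.mean(axis=0)
    # PCA via SVD: rows of vt are principal directions, largest variance first.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:n_components].T


def remove_session_variability(vector, basis):
    """Project a vector onto the orthogonal complement of the session subspace."""
    v = np.asarray(vector, dtype=float)
    return v - basis @ (basis.T @ v)


# Toy data (synthetic, for illustration only): one speaker, 20 sessions,
# with strong session variation along the first coordinate.
rng = np.random.default_rng(0)
dim = 6
speaker_mean = rng.normal(size=dim)
nuisance = np.zeros(dim)
nuisance[0] = 1.0
sessions = (
    speaker_mean
    + rng.normal(scale=3.0, size=(20, 1)) * nuisance
    + rng.normal(scale=0.01, size=(20, dim))
)

basis = session_subspace(sessions, n_components=1)
cleaned = np.array([remove_session_variability(s, basis) for s in sessions])

print("variance before:", sessions.var(axis=0).sum())
print("variance after: ", cleaned.var(axis=0).sum())
```

Vectors cleaned this way could then be passed to an SVM classifier. How the subspace dimension is chosen and how the per-phoneme models are combined are not stated in the abstract, so both are left open here.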

Keywords: inter-session variability; phoneme; speaker verification; principal component analysis; PCA; Gaussian mixture models; modelling; SVM; support vector machines; Japanese speakers; speaker recognition.

DOI: 10.1504/IJBM.2015.070922

International Journal of Biometrics, 2015 Vol.7 No.2, pp.83 - 96

Received: 03 Oct 2014
Accepted: 13 Jan 2015

Published online: 31 Jul 2015
