Title: Lips tracking biometrics for speaker recognition

Authors: Waleed H. Abdulla, Paul W.T. Yu, Paul Calverly

Addresses: Department of Electrical and Computer Engineering, The University of Auckland, Private Bag 92019, Auckland, New Zealand (all authors)

Abstract: A novel approach to extracting effective biometrics from visual images of the mouth is presented in this paper. Visual features are extracted from a sequence of images of speakers' lips while speaking. These features comprise the shape and pixel intensity around the lip edges, as well as their dynamics. The features are extracted using the particle filter technique to track lip movements. The lip tracker shows adequate accuracy and maintains lock across different speaking scenarios. Speaker models based on these features are built using Gaussian Mixture Models (GMMs) trained with the Expectation-Maximisation (EM) algorithm. Satisfactory results are obtained for text-independent speaker recognition carried out on a video database of 35 individuals: a recognition rate of 82.8% for speaker identification and an equal error rate of 18% for speaker verification are achieved using this technique.
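The paper itself does not include code, but the GMM/EM modelling stage described in the abstract can be sketched as follows. This is a hypothetical, minimal illustration only: it fits one diagonal-covariance GMM per speaker via EM and identifies a test sequence by maximum log-likelihood. The feature dimensionality, component count, and synthetic "lip feature" data are assumptions, not the paper's actual configuration.

```python
import numpy as np

def fit_gmm(X, k=2, iters=50, seed=0):
    """Fit a k-component diagonal-covariance GMM to feature vectors X via EM."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X[rng.choice(n, k, replace=False)]       # initialise means from samples
    var = np.tile(X.var(axis=0) + 1e-6, (k, 1))   # per-dimension variances
    w = np.full(k, 1.0 / k)                       # mixture weights
    for _ in range(iters):
        # E-step: component responsibilities, computed in the log domain
        logp = (np.log(w)
                - 0.5 * np.sum(np.log(2 * np.pi * var), axis=1)
                - 0.5 * np.sum((X[:, None, :] - mu) ** 2 / var, axis=2))
        logp -= logp.max(axis=1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances
        nk = r.sum(axis=0) + 1e-12
        w = nk / n
        mu = (r.T @ X) / nk[:, None]
        var = (r.T @ (X ** 2)) / nk[:, None] - mu ** 2 + 1e-6
    return w, mu, var

def loglik(X, model):
    """Total log-likelihood of the feature sequence X under a fitted GMM."""
    w, mu, var = model
    logp = (np.log(w)
            - 0.5 * np.sum(np.log(2 * np.pi * var), axis=1)
            - 0.5 * np.sum((X[:, None, :] - mu) ** 2 / var, axis=2))
    m = logp.max(axis=1, keepdims=True)
    return float(np.sum(m[:, 0] + np.log(np.exp(logp - m).sum(axis=1))))

if __name__ == "__main__":
    # Synthetic stand-ins for per-frame lip feature vectors of two speakers
    rng = np.random.default_rng(1)
    d = 6
    models = {
        "speaker_A": fit_gmm(rng.normal(0.0, 1.0, (200, d))),
        "speaker_B": fit_gmm(rng.normal(3.0, 1.0, (200, d))),
    }
    # Identification: pick the speaker model with the highest log-likelihood
    test_seq = rng.normal(0.0, 1.0, (50, d))
    best = max(models, key=lambda s: loglik(test_seq, models[s]))
    print(best)
```

In a text-independent setup like the paper's, each enrolled speaker gets one such model, and verification thresholds the log-likelihood ratio against a background model rather than comparing across speakers.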

Keywords: biometrics; lips tracking; particle filters; speaker modelling; speaker recognition; image analysis; mouth visual images; speaking; feature extraction.

DOI: 10.1504/IJBM.2009.024275

International Journal of Biometrics, 2009 Vol.1 No.3, pp.288 - 306

Published online: 30 Mar 2009
