Title: Comparison of VQ and GMM approach for identifying Indian languages

Authors: Pinki Roy; Pradip K. Das

Addresses: Department of Computer Science, National Institute of Technology, Silchar, Assam-788010, P.O-REC Silchar, India; Department of Computer Science and Engineering, Indian Institute of Technology, Guwahati, Assam-781039, India

Abstract: Language identification (LID) is the process of converting an acoustic signal captured by microphone or telephone into a set of words of a particular language in real time, thus controlling the computer by the use of spoken commands. The software for LID generally requires initial training with appropriate classification algorithms to teach it to recognise the language spoken by the user. The LID system we have designed here covers four languages spoken in India: Assamese, Bengali, English and Hindi. Experiments were carried out on our own recorded standard database of 50 speakers, with speech features extracted as Mel frequency cepstral coefficients (MFCCs). Vector quantisation (VQ) and Gaussian mixture model (GMM) techniques were then applied for classification, followed by testing. Results show that for the VQ approach, accuracy increases almost uniformly with codebook size and is highest at the larger codebook sizes. Comparing the two approaches at higher mixture orders shows that VQ performs better than GMM for all four languages.

Keywords: language identification; LID; Mel frequency cepstral coefficients; MFCCs; vector quantisation; VQ; Gaussian mixture model; GMM; accuracy; Indian languages; Assamese; Bengali; English; Hindi; speech features; feature extraction; classification; codebook size.

DOI: 10.1504/IJAPR.2013.052337

International Journal of Applied Pattern Recognition, 2013 Vol.1 No.1, pp.99 - 107

Received: 02 May 2012
Accepted: 13 Jul 2012
Published online: 31 Jul 2014
