Title: A comparison of three spectral features for phone recognition in sub-optimal environments

Authors: Sushanta Kabir Dutta; L. Joyprakash Singh

Addresses: Department of Electronics and Communication Engineering, North Eastern Hill University, Shillong, Meghalaya, India; Department of Electronics and Communication Engineering, North Eastern Hill University, Shillong, Meghalaya, India

Abstract: This paper presents a comparison of three spectral features for automatic phone recognition in sub-optimal environments. The study is carried out exclusively with a phone recognition system, called a phonetic engine (PE), developed for the Manipuri language. Manipuri is a scheduled Indian language and the official language of the State of Manipur. However, no standard database of the language exists so far, and a PE has therefore been built for it. Phonetic transcriptions are prepared, and each phonetic unit is then modelled using a hidden Markov model (HMM). Speech feature extraction is a very important stage in the development of such a PE. The phone recognition accuracies of the PE obtained with three dominant spectral features, MFCC, PLP and LPCC, are analysed here. PLP and MFCC features are found to outperform LPCC features under all circumstances.

Keywords: mel-frequency cepstrum coefficients; MFCC; perceptual linear prediction; PLP; linear prediction cepstral coefficients; LPCC; speech features; phonetic engine; hidden Markov model; HMM; HTK toolkit.

DOI: 10.1504/IJAPR.2018.092522

International Journal of Applied Pattern Recognition, 2018 Vol.5 No.2, pp.137 - 148

Received: 14 Sep 2017
Accepted: 07 Mar 2018

Published online: 23 Jun 2018
