Authors: Mitar Milacic; A.P. James; Sima Dimitrijev
Addresses: Griffith School of Engineering, Griffith University, Nathan, Qld. 4111, Australia; School of CS and IT, IIITM-Kerala, IIITM-K Building, Technopark Campus, Trivandrum, Kerala 695581, India; Griffith School of Engineering, Griffith University, Nathan, Qld. 4111, Australia
Abstract: Formants are regarded as the basic building blocks of vowels; however, they are very rarely used as features for difficult automatic speech recognition tasks. Formant-based research generally focuses on formant extraction, on the assumption that a better extraction method is the only way to increase the effectiveness of formants. We challenge this assumption by investigating a different use of formants after their extraction. By combining formants according to the same principles observed in speech perception studies, we create features that show good recognition performance under noisy testing conditions. The improved recognition performance of the proposed formant features is demonstrated by comparison with Mel-frequency cepstral coefficients and perceptual linear predictive coding features on a hidden Markov model-based automatic speech recognition system.
Keywords: robust speech recognition; formants; signal processing; biologically inspired features; phoneme recognition; automatic speech recognition; speech perception; feature recognition; formant features; bio-inspired features.
International Journal of Machine Intelligence and Sensory Signal Processing, 2013, Vol. 1, No. 1, pp. 46-54
Received: 14 Feb 2012
Accepted: 14 Mar 2012
Published online: 19 Mar 2013