Authors: Astik Biswas; P.K. Sahu; Mahesh Chandra
Addresses: Department of Electrical Engineering, National Institute of Technology, Rourkela-769008, India; Department of Electrical Engineering, National Institute of Technology, Rourkela-769008, India; Department of Electronics and Communication Engineering, Birla Institute of Technology, Mesra, Ranchi – 835215, India
Abstract: Automatic speech recognition (ASR) systems perform well under restricted conditions, but their performance degrades in noisy environments. Audio-visual features play an important role in ASR systems in the presence of noise. In this paper, a Hindi isolated digit recognition system is designed using audio-visual features. The visual features of the lip region are integrated with the audio features to obtain better recognition performance in noisy environments. Colour intensity and pseudo hue methods have been used for lip localisation, with a hidden Markov model (HMM) as the classifier. Recognition performance using the HMM is better than that of the linear discriminant analysis (LDA) recogniser. Principal component analysis (PCA) has been used for image compression.
Keywords: audio-visual speech recognition; AVSR; Bark frequency cepstral coefficient; BFCC; discrete wavelet transform; DWT; discrete cosine transform; DCT; hidden Markov model; HMM; linear discriminant analysis; LDA; Hindi digit recognition; Hindi isolated digits; colour intensity; pseudo hue; lip localisation; image compression; principal component analysis; PCA; India.
International Journal of Computational Vision and Robotics, 2015 Vol.5 No.3, pp.320 - 334
Accepted: 20 Sep 2014
Published online: 20 Aug 2015