Title: Optimum integration weight for decision fusion audio-visual speech recognition

Authors: R. Rajavel; P.S. Sathidevi

Addresses: Department of ECE, SSN College of Engineering, Kalavakkam-603110, Tamilnadu, India; Department of ECE, National Institute of Technology Calicut, Calicut 673601, India

Abstract: Automatic speech recognition (ASR) technologies have been successfully applied to several real-world applications. However, several problems must still be solved before these technologies can be applied more widely. One such problem is the noise robustness of recognition performance. Recently, audio-visual speech recognition (AVSR) has received attention as a solution to this problem. In AVSR, visual speech information is used together with the acoustic signal for speech recognition in noisy environments. This paper presents a new decision fusion AVSR system in which the integration weight applied to the classifiers' decisions is optimised using a genetic algorithm (GA). As a result, the optimally fused AVSR system produces robust recognition accuracy at all SNR conditions. To evaluate the performance of the proposed scheme, the recognition results are compared with those of an equal-weight bimodal AVSR system and with another state-of-the-art method, namely the compression and Mel sub-band spectral subtraction (CMSBS)-based noise compensation method for speech recognition in noise. Further, to show the effectiveness of the proposed optimisation method, the recognition results are compared with those of a similar method, the directed grid search method, which also optimises the integration weight against the recognition accuracy.
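
As an illustration of what an integration weight does in decision (late) fusion, the following is a minimal sketch and not the authors' implementation. It assumes the common weighted-sum form, in which per-class log-likelihoods from an audio classifier and a visual classifier are combined as weight * audio + (1 - weight) * visual. The abstract does not state the exact fusion rule, so this form is an assumption. The weight is chosen here by a plain exhaustive grid search, which is neither the GA nor the directed grid search described in the paper. The function names and toy scores are hypothetical.

```python
import numpy as np


def fuse_scores(audio_loglik, visual_loglik, weight):
    """Weighted late fusion (assumed form): weight on audio, 1 - weight on visual."""
    return weight * audio_loglik + (1.0 - weight) * visual_loglik


def accuracy(audio_loglik, visual_loglik, labels, weight):
    """Fraction of utterances whose highest fused score matches the true label."""
    fused = fuse_scores(audio_loglik, visual_loglik, weight)
    return float(np.mean(np.argmax(fused, axis=1) == labels))


def grid_search_weight(audio_loglik, visual_loglik, labels, steps=21):
    """Exhaustive search over weights in [0, 1]; returns (best weight, best accuracy)."""
    weights = np.linspace(0.0, 1.0, steps)
    accs = [accuracy(audio_loglik, visual_loglik, labels, w) for w in weights]
    best = int(np.argmax(accs))  # first weight reaching the best accuracy
    return float(weights[best]), accs[best]


# Toy scores (hypothetical): rows are utterances, columns are classes.
# The audio stream is unreliable here, as it might be at low SNR.
audio = np.array([[-1.0, -2.0], [-1.0, -2.0], [-2.0, -1.0], [-2.0, -1.0]])
visual = np.array([[-1.0, -3.0], [-3.0, -1.0], [-1.0, -3.0], [-3.0, -1.0]])
labels = np.array([0, 1, 0, 1])

best_weight, best_acc = grid_search_weight(audio, visual, labels)
```

In practice the weight would be tuned on held-out data, typically once per noise condition, since the reliability of the audio stream changes with SNR. A GA would replace the exhaustive loop with a population-based search over the same accuracy objective.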

Keywords: audio-visual speech recognition; AVSR; side face visual features; feature extraction; decision fusion; modality weight optimisation; late integration; acoustic signals; genetic algorithms; automatic speech recognition; ASR.

DOI: 10.1504/IJCSE.2015.067044

International Journal of Computational Science and Engineering, 2015 Vol.10 No.1/2, pp.145 - 154

Received: 04 Feb 2012
Accepted: 09 Apr 2012

Published online: 25 Jan 2015
