Authors: Amina Makhlouf; Lilia Lazli; Bachir Bensaker
Addresses: LRI (Laboratory of Computer Research), Department of Computer Science, University of Badji Mokhtar, BP. 12, Annaba, Algeria; LRI (Laboratory of Computer Research), Department of Computer Science, University of Badji Mokhtar, BP. 12, Annaba, Algeria; Department of Electronics, University of Badji Mokhtar, BP. 12, Annaba, Algeria
Abstract: In this paper, we present an audio-visual automatic speech recognition system that combines acoustic and visual data. To model the multimodal data, the proposed algorithm hybridises a Hidden Markov Model (HMM) with a Genetic Algorithm (GA), which determines the optimal structure of the HMM. The GA is combined with the Baum-Welch algorithm, which allows an effective re-estimation of the HMM probabilities. Our experiments show that the most promising audio-visual system, based on the combined GA/HMM model, outperforms the traditional HMM.
Keywords: automatic speech recognition; computer vision; HMM; hidden Markov models; GAs; genetic algorithms; hybrid models; signal processing; audio-visual fusion; AV fusion; Arabic; multimodal data modelling.
International Journal of Signal and Imaging Systems Engineering, 2016, Vol. 9, No. 1, pp. 55-66
Available online: 08 Feb 2016