Title: Multimodal information fusion for selected multimedia applications

Authors: Ling Guan, Yongjin Wang, Rui Zhang, Yun Tie, Adrian Bulzacki, Muhammad Talal Ibrahim

Addresses: Ryerson Multimedia Research Laboratory, Ryerson University, 350 Victoria Street, Toronto, Ontario, M5B 2K3, Canada (all authors); Department of Electrical and Computer Engineering, University of Toronto, 10 King's College Road, Toronto, Ontario, M5S 3G4, Canada (second author).

Abstract: The effective interpretation and integration of multiple information sources are important for the efficacious utilisation of multimedia in a wide variety of application contexts. The major challenge in multimodal information fusion lies in identifying complementary and discriminatory representations from the individual channels, and in efficiently fusing the resulting information for the targeted application problem. This paper outlines several multimedia systems that adopt a multimodal approach, and provides a comprehensive review of the state of the art in related areas, including emotion recognition, image annotation and retrieval, and biometrics. Data collected from diverse sources or sensors are employed to improve recognition or classification accuracy. It is shown that combining multimodal information provides a more complete and effective description of the intrinsic characteristics of a given pattern, and yields better system performance than any single modality alone. In addition, we present a facial fiducial point detection system and a gesture recognition system, both of which can be incorporated into a multimodal framework. The issues and challenges in the research and development of multimodal systems are discussed, and a cutting-edge application of multimodal information fusion to intelligent robotic systems is presented.
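
Multimodal fusion of this kind is commonly carried out at, for example, the feature level or the decision (score) level. As a minimal, purely illustrative sketch that is not taken from the paper, the following Python example fuses the posterior scores of two hypothetical single-modality classifiers with fixed weights; the modality names, the weights, and the synthetic data are all assumptions made for illustration.

```python
# Illustrative sketch only: generic decision-level (score) fusion of two
# hypothetical modality classifiers. Modality names, weights and data are
# placeholders, not taken from the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic "audio" and "visual" features describing the same 200 samples.
n = 200
labels = rng.integers(0, 2, size=n)
audio_feats = rng.normal(size=(n, 8)) + labels[:, None] * 0.8
visual_feats = rng.normal(size=(n, 12)) + labels[:, None] * 0.5

# Train one classifier per modality on the first 150 samples.
train, test = slice(0, 150), slice(150, n)
audio_clf = LogisticRegression(max_iter=1000).fit(audio_feats[train], labels[train])
visual_clf = LogisticRegression(max_iter=1000).fit(visual_feats[train], labels[train])

# Decision-level fusion: weighted sum of per-modality posterior scores
# (fixed, hypothetical weights; in practice they could be learned or tuned).
w_audio, w_visual = 0.6, 0.4
fused_scores = (w_audio * audio_clf.predict_proba(audio_feats[test])
                + w_visual * visual_clf.predict_proba(visual_feats[test]))
fused_pred = fused_scores.argmax(axis=1)

print("fused accuracy:", (fused_pred == labels[test]).mean())
```

The weighted-sum rule is only one of several common combination strategies (product, max, or a trained fusion classifier are alternatives); the appropriate choice depends on the application and on the reliability of each modality.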

Keywords: biometrics; emotion recognition; facial fiducial points; gesture recognition; image retrieval; image annotation; multimodal information fusion; multimedia; multimodality; pattern recognition; intelligent robots; robotics intelligence.

DOI: 10.1504/IJMIS.2010.035969

International Journal of Multimedia Intelligence and Security, 2010 Vol.1 No.1, pp.5 - 32

Published online: 11 Oct 2010
