Title: A multimodal framework using audio, visible and infrared imagery for surveillance and security applications

Authors: Praveen Kumar, Ankush Mittal, Padam Kumar

Addresses: Department of Electronics and Computer Engineering, Indian Institute of Technology Roorkee, Roorkee 247667, India; Department of Electronics and Computer Engineering, Indian Institute of Technology Roorkee, Roorkee 247667, India; Department of Electronics and Computer Engineering, Indian Institute of Technology Roorkee, Roorkee 247667, India

Abstract: This paper presents a low-cost framework for combining multimodal information (visible, infrared (IR) and audio signals) for small-area surveillance and security applications. The system uses audio and video information to capture different aspects of the environment, and infrared imagery is used in low-light conditions. The visual processing module uses a motion-based approach for detecting objects and a Kalman filter for tracking them. Environmental sounds are recognised by extracting Mel-Frequency Cepstral Coefficient (MFCC) audio features, which are then classified using the Dynamic Time Warping (DTW) technique. Experiments on some typical sequences show promising results.
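
As an illustration of the audio classification step described in the abstract, the following is a minimal sketch of nearest-template classification with DTW. It is not the authors' implementation: the function names `dtw_distance` and `classify`, the Euclidean frame distance, and the idea of one stored template per sound class are all assumptions made for this example, and the feature sequences are assumed to be already-extracted MFCC frames (lists of equal-length float vectors).

```python
import math


def dtw_distance(a, b):
    """Return the DTW alignment cost between two sequences of feature vectors.

    a, b: lists of frames, each frame a list of floats of the same dimension
    (assumed here to stand in for MFCC vectors).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats its frame
                cost[i][j - 1],      # b advances, a repeats its frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify(query, templates):
    """Return the label of the template with the smallest DTW cost to query.

    templates: dict mapping a (hypothetical) class label to one reference
    sequence of frames.
    """
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))
```

Because DTW lets frames repeat, a query that is a slower or faster rendition of a template can still align to it at low cost, which is the usual motivation for using it on sounds of varying duration.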

Keywords: video surveillance; visual object detection; audio classification; tracking; event analysis; multimodal information; infrared imagery; video security; Kalman filter.

DOI: 10.1504/IJSISE.2008.026797

International Journal of Signal and Imaging Systems Engineering, 2008 Vol.1 No.3/4, pp.255 - 263

Published online: 26 Jun 2009
