Title: Multimodal emotion recognition based on combined deep learning networks
Authors: Zhenzhen Wang; Yu Ji; Rui Sun; Qi Liu
Addresses: Department of Information Engineering, Hebei Institute of Mechanical and Electrical Technology, Xingtai, 054000, China; Organization Department, Hebei Institute of Mechanical and Electrical Technology, Xingtai, 054000, China; Department of Information Engineering, Hebei Institute of Mechanical and Electrical Technology, Xingtai, 054000, China; Department of Information Engineering, Hebei Institute of Mechanical and Electrical Technology, Xingtai, 054000, China
Abstract: Aiming to address the low accuracy, low F1 score, and long task completion time of traditional multimodal emotion recognition methods, a multimodal emotion recognition method based on a combined deep learning network is proposed. First, EEG signals, eye movement data, and facial expression images are collected, and features are extracted from each modality. Then, a multimodal feature fusion model is built from a modal attention module, a weighted operation, and a decision module, and the extracted features serve as model inputs to produce the multimodal feature fusion results. Finally, the fusion results are fed into the combined deep learning network to achieve multimodal emotion recognition. The experimental results show that the proposed method achieves a maximum accuracy of 99.1% for multimodal emotion recognition, an F1 value of no less than 0.947, and a task completion time as low as 56.8 ms, demonstrating high precision and efficiency.
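The abstract describes a fusion model built from a modal attention module, a weighted operation, and a decision stage. The paper's actual architecture is not given here, so the following is only a minimal sketch of how such attention-weighted multimodal fusion is commonly realized; the function names, the shared feature dimension across modalities, and the attention parameter vector `score_w` are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over modality scores
    e = np.exp(x - np.max(x))
    return e / e.sum()

def modal_attention_fusion(features, score_w):
    """Weight each modality's feature vector by an attention score
    and sum them into one fused representation (assumed design)."""
    # one scalar score per modality via a hypothetical learned vector
    scores = np.array([f @ score_w for f in features])
    weights = softmax(scores)  # modal attention weights, sum to 1
    fused = sum(w * f for w, f in zip(weights, features))
    return fused, weights

# stand-ins for extracted EEG, eye-movement, and facial-expression features,
# assumed already projected to a common dimension d
rng = np.random.default_rng(0)
d = 8
eeg, eye, face = rng.normal(size=(3, d))
score_w = rng.normal(size=d)  # hypothetical attention parameters

fused, weights = modal_attention_fusion([eeg, eye, face], score_w)
```

In a trained system the attention parameters would be learned jointly with the downstream decision module, so that more informative modalities receive larger weights; here they are random placeholders.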
Keywords: combined deep learning network; multimodal; emotion recognition; EEG signals; eye movement data; facial expression images.
International Journal of Biometrics, 2026 Vol.18 No.1/2/3, pp.108 - 127
Received: 14 Jan 2025
Accepted: 23 Mar 2025
Published online: 13 Jan 2026