Title: An image semantic understanding model based on double-layer LSTM with information gain
Authors: Chen Li
Addresses: School of Computer Science, Luoyang Institute of Science and Technology, Luoyang, 471023, China
Abstract: In the era of big data, efficient semantic parsing of multi-modal data is crucial for intelligent service systems. However, existing image semantic understanding methods face issues such as cross-modal semantic gaps and insufficient modelling of long-range dependencies. To address these challenges, this paper proposes a novel hybrid network architecture that combines convolutional neural networks, recursive auto-encoders, and a dual-layer long short-term memory (LSTM) network guided by information gain. The proposed model achieves the highest semantic description score of 0.168 and improves type-agnostic accuracy and type-aware accuracy to 0.932 and 0.901, respectively, outperforming three baseline methods. Compared with the original model, it raises these two accuracies by 0.016 and 0.010, respectively. This architecture effectively bridges cross-modal gaps and enhances feature selection and long-term dependency modelling. The model demonstrates strong potential for deployment in cloud services, semantic web platforms, and virtualised infrastructures to support fault detection, resource optimisation, and intelligent quality management.
Keywords: information gain; long short-term memory; LSTM; convolutional neural networks; CNN; cross-modal fusion; semantic understanding; recursive auto-encoder; RAE.
International Journal of Cloud Computing, 2025 Vol.14 No.4, pp.390 - 408
Received: 17 Apr 2025
Accepted: 31 Jul 2025
Published online: 14 Jan 2026
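The abstract's distinguishing element is feature selection guided by information gain before the dual-layer LSTM. The paper's own formulation is not shown here, but the standard information-gain criterion it presumably builds on, IG(Y; X) = H(Y) − H(Y | X), can be sketched as follows. All function names (`entropy`, `information_gain`, `select_top_k`) are illustrative, not from the paper:

```python
import numpy as np

def entropy(labels):
    # Shannon entropy H(Y) of a discrete label array, in bits
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def information_gain(feature, labels):
    # IG(Y; X) = H(Y) - sum_v P(X = v) * H(Y | X = v)
    cond = 0.0
    for v in np.unique(feature):
        mask = feature == v
        cond += mask.mean() * entropy(labels[mask])
    return entropy(labels) - cond

def select_top_k(features, labels, k):
    # Rank feature columns by information gain and keep
    # the k most informative, as a gating step before a
    # downstream sequence model such as a two-layer LSTM.
    gains = np.array([information_gain(features[:, j], labels)
                      for j in range(features.shape[1])])
    return np.argsort(gains)[::-1][:k]
```

A perfectly class-aligned feature yields IG = 1.0 bit for balanced binary labels, while a feature independent of the labels yields IG = 0, so ranking by this score discards uninformative inputs before sequence modelling.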