Forthcoming Articles

International Journal of Computer Applications in Technology (IJCAT)

Forthcoming articles have been peer-reviewed and accepted for publication but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.

Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.

Online First articles are also listed here. Online First articles are fully citeable, complete with a DOI. They can be cited, read, and downloaded. Online First articles are published as Open Access (OA) articles to make the latest research available as early as possible.

Articles marked with this Open Access icon are Online First articles. They are freely available and openly accessible to all without any restriction except the ones stated in their respective CC licenses.

International Journal of Computer Applications in Technology (33 papers in press)

Regular Issues

  •   Digital media video image data processing based on computer vision
    ( Free Full-text Access ) CC-BY-NC-ND
    by Xiawei Lu 
    Abstract: This paper describes the application direction of DMVI processing technology and the acquisition and post-processing of ultra-high-definition quality data, explores the application of DMVI processing technology in image analysis, proposes a method for obtaining ultra-high definition quality video data, and discusses the reconstruction of ultra-high-definition quality video. According to the research results, satisfaction with the introduction of the five-dimensional light field function algorithm and CV technology reached over 21%; at 4K resolution, the processing time of the five-dimensional light field was 1.05
    Keywords: digital media video image; image data processing; computer vision; ultra clear picture quality image.
    DOI: 10.1504/IJCAT.2025.10073493
     
  •   Research on image enhancement of smart home product layout scene based on virtual reality
    ( Free Full-text Access ) CC-BY-NC-ND
    by Zhongmei Liu 
    Abstract: To elevate the quality of image enhancement for smart home product layout scenes and expedite processing times, a study focused on virtual reality-based enhancement of these scenes has been undertaken. Initially, a virtual reality framework is employed to create an indoor environment for smart homes, with the ISGSA algorithm model utilized to generate this environment. Subsequently, the attributes of each constituent element are amalgamated and fed into a generator to produce a novel indoor scene. Ultimately, a conditional generative adversarial network is devised to formulate a composite loss function, integrating channel color loss, structural feature loss, and smoothness loss. This loss function is instrumental in achieving image enhancement. Experimental findings reveal that the proposed method attains an average information entropy of 8.846, with an image enhancement processing duration of merely 3.9 s.
    Keywords: virtual reality; smart home products; layout scene; image enhancement; ISGSA algorithm; attention learning module; channel colour loss.
    DOI: 10.1504/IJCAT.2025.10073932
     
  •   Mean-shift-based moving target tracking algorithm in complex industrial environments
    ( Free Full-text Access ) CC-BY-NC-ND
    by Zhongming Liao, Zhaosheng Xu, Xiuhong Xu, Azlan Ismail 
    Abstract: This paper proposes an improved moving target tracking algorithm (TTA) based on the mean-shift (MS) method, which is suitable for complex industrial environments. The improved algorithm introduces the YOLO (You Only Look Once) model for moving target detection and uses its results as tracking input. In addition, the algorithm introduces a Siamese network (SN) to extract the deep features of the target for re-identification after occlusion. To further improve tracking stability, a Kalman filter is introduced to predict the next motion state of the target. Stability analysis shows that the algorithm achieves the best multiple object tracking accuracy (MOTA) index in various complex environments, outperforming other tracking methods and showing good multi-target tracking stability. In summary, the algorithm overcomes the limitations of the traditional MS method and provides a novel solution for moving target tracking in industrial environments. The algorithm has practical value and provides a reference for future research on moving target tracking in dynamic and complex environments.
    Keywords: moving target tracking; mean-shift algorithm; YOLO model; Siamese network; Kalman Filter.
    DOI: 10.1504/IJCAT.2025.10074104
     
  •   Research on multi-modal teaching resource association resource mining under MOOC ideological and political learning
    ( Free Full-text Access ) CC-BY-NC-ND
    by Hui Wang 
    Abstract: To overcome the limitations of current mining algorithms and improve the effectiveness of resource mining, this paper proposes a multimodal teaching resource association resource mining algorithm for MOOC ideological and political learning. Firstly, the features of text, image, and audio modalities are extracted using the bag of words model, VGG16 network, and Mel frequency cepstral coefficient method. Secondly, the feature vectors of each modality are concatenated and fused. Due to the high dimensionality after fusion, principal component analysis is used for dimensionality reduction. Finally, feature fusion, dimensionality reduction, and association rule mining are used to optimize the association of multimodal teaching resources, and dynamic association rules are introduced to adapt to the dynamic needs of students' learning process, thereby improving the effectiveness of MOOC ideological and political learning. The experimental results show that the mining results of the proposed algorithm have diversity and strong correlation with the target topic.
    Keywords: MOOC; ideological and political education; multimodal; teaching resources; resource mining; principal component analysis; association rules.
    DOI: 10.1504/IJCAT.2025.10074404
     
  •   Study on Road to Waterway model for medium to long-distance cargo transportation considering transportation efficiency
    ( Free Full-text Access ) CC-BY-NC-ND
    by Zengli Fang, Yali Liang, Gaoling Li, Xiang Tang 
    Abstract: This paper studies a "Road to Waterway" model for medium and long-distance cargo transportation with consideration of transport efficiency. First, addressing the time-sensitive requirements of high-value-added cargo transportation faced by multimodal operators, a "Road to Waterway" model for medium and long-distance transportation is developed. Second, through cost analysis that quantifies various expenses while establishing objective functions and constraints, the model ensures reasonable transportation mode selection, transit connections, and flow balance. Finally, employing genetic algorithms to generate initial solutions and maintain population diversity, combined with ant colony algorithm's positive feedback mechanism for optimal solution search, the model demonstrates significantly improved solving efficiency and time performance. Experimental results indicate a stable on-time arrival rate exceeding 97.7% and cost savings reaching 9.3%.
    Keywords: transportation efficiency; medium to long distance; freight transportation; ‘Road to Waterway’ model.
    DOI: 10.1504/IJCAT.2025.10074405
     
  •   Pose estimation technology of electronic components based on point cloud segmentation algorithm
    ( Free Full-text Access ) CC-BY-NC-ND
    by Wei Shen 
    Abstract: In actual manufacturing environments, electronic components often face occlusion problems, which makes it difficult for traditional point cloud segmentation methods to estimate the pose of objects accurately. To address this challenge, this paper introduces the multi-scale feature learning capability provided by PointNet++ to extract deep collective feature information in local areas of different scales and understand the overall morphology of components in a global context. According to experimental analysis, under the same occlusion level, PointNet++ outperforms the PointNet model, the RANSAC (Random Sample Consensus) algorithm, and the voxelisation method Point-Voxel CNN in terms of segmentation accuracy. The pose estimation method of electronic components studied in this paper is highly applicable in actual mechanical manufacturing environments, can process large-scale data, and meets real-time requirements. It provides the theoretical basis and technical support for solving the positioning and assembly problems of components in actual industrial production.
    Keywords: point cloud segmentation; pose estimation; PointNet++ Model; occlusion problems; mechanical manufacturing; random sample consensus.
    DOI: 10.1504/IJCAT.2025.10074466
     
  •   Cloud computing based construction and empirical evaluation of the security risk early warning evaluation system of digital economy
    ( Free Full-text Access ) CC-BY-NC-ND
    by Yanan Wu 
    Abstract: Traditional static risk assessment methods struggle to meet real-time processing demands for large-scale, multi-source heterogeneous data, showing sluggish responsiveness to emergencies and abnormal transactions. These approaches often suffer from poor early-warning accuracy and frequent false or missed alerts. To address these challenges, this study proposes a cloud-based security risk warning evaluation system for the digital economy. The system first establishes a multi-level risk indicator framework, utilizing fuzzy hierarchical analysis and information entropy to calculate weighted metrics that integrate qualitative and quantitative indicators. It then employs grey prediction algorithms for short-term risk trend forecasting. Through a cloud computing distributed architecture, the system achieves real-time collection, processing, and risk assessment of multi-source heterogeneous data, ensuring instant precision in warnings. Experimental results demonstrate that this method consistently outperforms existing approaches in both warning accuracy and Recall metrics, with significantly reduced average response time while maintaining reasonable control over false alarm rates and resource consumption. This research provides a practical technical solution for digital economy security risk management, offering both theoretical value and practical significance.
    Keywords: risk early warning; evaluation system construction; digital economy; economic security.
    DOI: 10.1504/IJCAT.2025.10075047
     
  • FPGA implementation and Multisim simulation of a new four-dimensional two-scroll hyperchaotic system with coexisting attractors
    by Sundarapandian Vaidyanathan, Esteban Tlelo-Cuautle, Khaled Benkouider, Aceng Sambas, Ciro Fabian Bermudez-Marquez, Samy Abdelwahab Safaan 
    Abstract: Field-programmable gate array (FPGA) design of a new four-dimensional two-scroll hyperchaotic system is investigated in this work. The work begins with a detailed modelling of the new system and its hyperchaotic attractor using phase plots, followed by a bifurcation study of the new system. Special dynamic properties such as multistability and symmetry are also investigated for the new system. Using Multisim software, a circuit model is designed and simulated for the new hyperchaotic system. FPGA design and Multisim simulation of the new system enable practical applications in science and engineering. The FPGA design in this work is implemented by applying two numerical schemes, viz. the Forward Euler and Trapezoidal methods. Experimental attractors observed on the oscilloscope show a good match with the MATLAB signal plots. The FPGA hardware resources are detailed for both numerical methods.
    Keywords: hyperchaos; bifurcation; symmetry; phase plots; hyperchaotic system; parameters; stability; multistability; circuit model; FPGA implementation.

  • Improving hybrid-layer convolutional neural network system for lung cancer nodule classification using enhanced weight optimisation algorithm
    by Vikul Pawar, P. Premchand 
    Abstract: In recent times, lung cancer has emerged as a highly life-threatening disease. According to the WHO, lung cancer is the second largest cause of death among all types of cancer. Available technology is gaining ground in medical science through Computer Assisted Diagnosis (CAD), where image processing plays a crucial role in detecting cancerous nodules in computed tomography (CT) images. Augmenting machine learning techniques with image processing algorithms allows a more comprehensive examination of the disease in the proposed CAD systems. This paper describes a heuristic approach for lung cancer nodule detection; the proposed model consists mainly of the following tasks: image enhancement, segmentation of the Region of Interest (ROI), feature extraction, and nodule classification. In pre-processing, the Adaptive Median Filter (AMF) is first applied to eliminate speckle noise from input CT images of the Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI) dataset (LIDC-IDRI), and the quality of the input image is improved by applying the Histogram Equalization (HE) technique with a Contrast-Limited Adaptive (CLA) approach. Secondly, the Improved Level-Set (ILS) algorithm is used to segment the ROI. Thirdly, learnable texture features and statistical features are extracted from the segmented ROI. In the subsequent classification stage, the extracted features are fed to the Hybrid-Layer Convolutional Neural Network (HL-CNN) architecture to classify the lung cancer nodule as either benign or malignant. The research contributes to each of these stages; in particular, the improved HL-CNN selects optimal weights using the Enhanced Cat Swarm Optimisation (ECSO) algorithm. The proposed HL-CNN with ECSO weight optimisation achieves an accuracy of 93%, which compares favourably with existing models such as DBN, SVM, CNN, WOA, MFO, and CSO. Moreover, the proposed model gives a conclusive decision on the detected nodule as either benign or malignant.
    Keywords: Computer Assisted Diagnosis (CAD); Computer Vision; Cancer Diagnosis; Image Classification; Image Enhancement; Image Segmentation; Feature Extraction.

  • Prediction model for total amount of coke oven gas generation based on FCM-RBF
    by Lili Feng, Jun Peng, Zhaojun Huang 
    Abstract: The rational use of Coke Oven Gas (COG) is of great significance for improving the economic efficiency of enterprises. In this paper, a COG generation prediction model based on fuzzy C-means clustering (FCM) and a radial basis function (RBF) neural network is proposed to address the difficulty of accurately modelling the COG generation process and of predicting flow in real time. Firstly, the coke oven production process is analysed and correlation analysis is used to select the influencing factors. Secondly, FCM is used to classify the working conditions of the coke oven, and the appropriate number of working conditions is selected through experiments. Finally, prediction models for the different working conditions are established separately using RBF. The experiments were carried out using actual industrial production data, and the results showed that the model could provide a reference to guide dispatchers.
    Keywords: coking oven process; fuzzy C-means clustering; prediction model; radial basis function neural network.

  • Hie-Graph-YOLOv9: a hierarchical YOLOv9 model with graph-based SE attention mechanism for vehicle detection in complex background
    by T. Selvamuthukumar, K. Vijayalakshmi, P. Dhanalakshmi, R. Abinaya 
    Abstract: Advanced vehicle detection algorithms are key to Intelligent Transportation Systems (ITS), enabling real-time traffic analysis, congestion and security management. Existing models like YOLOv9 face challenges in feature selection and learning, especially in dynamic or cluttered environments. To address these limitations, this research proposes Hie-Graph-YOLOv9, an extended version of YOLOv9 that improves feature selection, feature learning and the loss function by incorporating Hiera Transformers, a Graph-based GAN-SE attention mechanism and a Geometric-based Weighted Smooth L1 loss function. Hiera Transformers, integrated into the backbone network across four stages, refine multi-scale feature learning, ensuring robust representation of fine-grained and global patterns. The Graph-based GAN-SE, embedded in the bottleneck module, emphasises critical regions of feature maps, enhancing detection accuracy. Additionally, a Geometric-based Weighted Smooth L1 loss function is employed for bounding box regression, improving convergence speed and training stability. Experimental evaluations demonstrate the superiority of Hie-Graph-YOLOv9, which achieves an AP (0.5) of 79.5%, converges faster by 120 epochs and reaches an increased inference speed of 41.95 FPS, outperforming state-of-the-art models. This work offers a significant step forward in vehicle detection under complex real-world conditions.
    Keywords: object detection; YOLO; vehicle; Hiera; graph; squeeze and excitation.

  • Hie-Graph-YOLOv9: A Hierarchical YOLOv9 model with Graph-based SE attention mechanism for vehicle detection in complex background
    by T. Selvamuthukumar, K. Vijayalakshmi, P. Dhanalakshmi, R. Abinaya 
    Abstract: Advanced vehicle detection algorithms are key to intelligent transportation systems (ITS), enabling real-time traffic analysis, congestion and security management. The proposed Hie-Graph-YOLOv9 method is an extended version of YOLOv9 that improves feature selection, feature learning and the loss function. In this YOLO architecture, we introduced Hiera Transformers into the backbone network in four stages to improve feature learning. We also introduced the Graph-based GAN-SE attention mechanism in the bottleneck module to give attention to essential feature map regions, and used a Geometric-based Weighted Smooth L1 loss function for bounding box prediction to obtain faster convergence, training stability and improved accuracy.
    Keywords: object detection; YOLO; vehicle; Hiera; graph; Squeeze and Excitation.
    DOI: 10.1504/IJCAT.2025.10072853
     
  • Building a tourism decision support system based on big data
    by Li Fu, Yi Yao 
    Abstract: This paper studies the construction of a tourism decision support system based on Big Data (BD) technology and deep learning models. Apache Kafka serves as a pipeline for real-time data streams, carrying data from different sources to the processing system. Apache Flink is a stream processing engine that processes and analyses the incoming real-time data streams and identifies emergencies. The Long Short-Term Memory (LSTM) network model receives data streams from Flink and performs time series prediction based on the user's historical data and real-time information. The prediction results are used for travel recommendations through a collaborative filtering algorithm. The research results show that the retention rate of the system implemented in this paper is higher than that of both the rules-based and the collaborative filtering systems. This study enhances the personalisation and real-time response capabilities of tourism decision support systems.
    Keywords: tourism decision support system; big data technology; deep learning models; real-time response; personalised recommendations.
    DOI: 10.1504/IJCAT.2025.10073933
     
  • Virtual reality data visualisation design based on model predictive control in metaverse
    by Tiankuo Yu, Lei Ding, Xiaocheng Zhou, Gaofeng Han 
    Abstract: In response to the problem of slow data updates caused by a large amount of static data display and neglect of real-time dynamic interaction in visual design, this study developed a framework based on MPC (Model Predictive Control) to address the limitations of static display and promote real-time interaction. In the article, a data acquisition and processing module is constructed, combined with linear regression and LSTM (Long Short Term Memory) models, optimized and integrated into a VR (Virtual Reality) system. Multiple interaction methods are designed, and reinforcement learning is introduced to improve prediction performance, data display effectiveness, and multi-user synchronization accuracy. The results showed that the average accuracy of the method reached 93.17%, with response delay, frame rate, and update frequency of 6.97 milliseconds, 101 frames per second, and 67 hertz, respectively. These results demonstrate the effectiveness of the framework in VR applications.
    Keywords: data visualisation design; model predictive control; virtual reality; art design; system architecture design.
    DOI: 10.1504/IJCAT.2025.10073937
     
  • Design and research of IIoT intelligent automatic production line security monitoring system based on digital twin
    by Mengjia Lian, Lanqing Li, Shiyu Wang, Chunxiao Wang, Mingshi Li 
    Abstract: The paper proposes a security monitoring method for intelligent automatic production lines to address issues such as the inability to proactively predict instrument failures and inconvenient daily maintenance, and establishes a corresponding security monitoring architecture. The architecture comprises four parts: the physical model of the production line, the virtual model of the production line, the twin data of the production line and the digital twin service platform. Furthermore, the twin data of the production line are analysed using a hybrid fault prediction method, which can predict possible faults and identify existing security risks while the production line is running. The digital-twin-based security monitoring method can predict possible faults in the production line and support their maintenance while ensuring normal production and processing, which can improve the stability of the production line.
    Keywords: industrial internet of things; intelligent automatic production line; security monitoring; failure prediction.
    DOI: 10.1504/IJCAT.2025.10073941
     
  • Application of neural network technology in English speech recognition and its impact on English speaking teaching
    by Fengxiang Zhang, Feifei Wang 
    Abstract: In order to improve the accuracy of English speech recognition and promote pronunciation accuracy in English oral teaching, this paper studies the application of neural network technology in English speech recognition and its impact on oral teaching. Mel frequency cepstral coefficients are used to extract audio features from English speech signals. With the extracted audio features as input, a BP neural network is used to construct an English speech recognition model, which outputs the recognition result with the minimum cumulative residual. The impact of this technology on English oral teaching is analysed from four aspects: improving pronunciation accuracy, achieving personalised learning, enhancing interactivity and expanding learning resources. The experimental results show that the accuracy of the proposed English speech recognition method always remains above 92%, which can improve the accuracy of English oral pronunciation.
    Keywords: neural network technology; English speech recognition; English speaking teaching; Mel frequency cepstral coefficient.
    DOI: 10.1504/IJCAT.2025.10074406
     
  • Evaluation of sound perception using a wireless sensor network for individuals with normal hearing
    by Xinfei Shen, Wei Wei 
    Abstract: Tone perception depends on reliable frequency cues, yet wireless-sensing implants often convey imprecise pitch because of electrode length, channel limits, and speech-coding strategies. To enhance robustness in wireless sensor network (WSN) sound-source localisation, we introduce a linear-programming sequential localisation algorithm (LPSBL). The method models sequential signal arrival-time constraints across nodes as a linear program and embeds relaxation to compensate for measurement errors, preventing localisation failure under noise. We also examined pitch outcomes in children using hearing technologies. Average tone-perception scores for normal-hearing children with unilateral WSN hearing aids remained at chance, whereas children fitted bimodally (implant + acoustic aid) showed modest pitch recognition that was nevertheless low overall. These findings indicate that, while LPSBL strengthens WSN localisation robustness, bimodal assistance yields only limited improvements in pitch perception, underscoring the need for refined acoustic-electric processing and targeted training.
    Keywords: teaching effect; normal hearing people; music perception; wireless sound sensor network.
    DOI: 10.1504/IJCAT.2025.10074445
     
  • A monitoring and early warning of respiratory infectious disease symptoms based on multi-source information data fusion
    by Shengcong Tao, Yirong Guo 
    Abstract: An oversight and alert methodology grounded in multi-source information data amalgamation is proposed to address the issues of elevated root mean square error and suboptimal alert efficacy in respiratory infectious disease symptom monitoring. First, manifestation data characteristics are delineated through time series analysis, and Support Vector Machines (SVM) are employed for feature extraction. Wavelet transformation technology is utilised to eliminate noise and rectify missing data. Subsequently, data level, feature level and decision level are progressively integrated to consolidate multi-source data characteristics, while Markov chain models are amalgamated to determine alert zones. The experimental results demonstrate that the proposed method achieves optimal performance in the root mean square error test of multi-source respiratory infectious disease symptom data fusion, with a minimum error of 0.11%. In the absolute accuracy value test for symptom monitoring and warning, the highest accuracy is observed to approach 100%.
    Keywords: data fusion; time series definition; SVM; decision level fusion; Markov chain.
    DOI: 10.1504/IJCAT.2025.10074468
     
  • A theoretical framework for integrating federated learning and transfer learning: advancing optimisation in decentralised systems
    by Mohammed Abdul Wajeed, Annavarapu Chandra Sekhara Rao 
    Abstract: Federated Learning (FL) has transformed decentralised model training by enabling collaborative learning while protecting data privacy. Key challenges include non-iid data distributions, slow convergence and limited understanding of combining FL with other paradigms. This research introduces a theoretical framework establishing foundations for incorporating Transfer Learning (TL) into FL to address these issues. The Federated Transfer Optimisation (FTO) framework expands FL optimisation theories by introducing transfer-invariant initialisation metrics for efficient use of pre-trained models. We introduce a Transfer Learning Augmented Loss (TLAL) function combining global objectives and local transfer dynamics to control knowledge retention during fine-tuning. The framework presents adaptive task-alignment kernels to balance global and client-specific objectives in heterogeneous scenarios. Experimental evaluations on text classification data sets show FTO achieves better accuracy, reduced communication overhead and faster convergence compared to existing FL methods. This study provides a principled basis for integrating TL, enabling efficient learning systems for privacy-sensitive applications.
    Keywords: federated learning; transfer learning; federated transfer optimisation; distributed optimisation; adaptive task-alignment kernels; transfer learning augmented loss; TLAL; integrate federated transfer learning; text classification.
    DOI: 10.1504/IJCAT.2025.10074663
     
  • Research on intelligent data collection and quality evaluation of computer science and art education systems based on systemic multi-information fusion approach
    by Ziming Wang 
    Abstract: This paper presents an intelligent system framework tailored to systematically evaluate student learning outcomes and teaching quality in computer science education using a multi-methodological approach. By leveraging multi-information fusion technology in conjunction with advanced data collection techniques, the framework integrates tools such as the Internet of Things (IoT), big data analytics, and machine learning to enhance the accuracy and precision of data acquisition. Addressing the specific demands of computer science education, this research proposes a comprehensive, multi-dimensional evaluation model designed to holistically assess both student learning effectiveness and instructional performance. The study seeks to provide structured, objective methodologies to evaluate and improve the quality of computer science education through in-depth systemic analysis.
    Keywords: multi-information fusion technology; systemic data collection; big data analytics; systemic teaching evaluation; computer science education systems.
    DOI: 10.1504/IJCAT.2025.10074664
     
  • Industrial phased array ultrasonic imaging data processing and defect recognition technology based on deep learning
    by Dawen Yao, Peiwen Meng, Jinggang Xu, Shuqi Li 
    Abstract: This paper innovatively applies deep learning technology and uses deep convolutional neural network (CNN) to automatically extract key features from ultrasonic imaging data and perform defect recognition. The ultrasonic imaging data is denoised, normalized, and data augmented, and a deep CNN model is constructed. Image features are automatically extracted through multi-layer convolution and pooling layers. The model is trained and optimized using the back propagation algorithm and cross-entropy loss function. The trained model is used to realize real-time defect detection and precise positioning of new ultrasonic images. In the defect classification and positioning model comparison experiment, it is compared with different CNN architectures such as ResNet (Residual Network), CBAM-CNN (Convolutional Block Attention Module CNN), and Hybrid CNN. The accuracy of the proposed method reaches 93.10%, and the detection speed is 500 images per second, which is significantly better than the detection precision and efficiency of other models.
    Keywords: deep learning; industrial phased array ultrasonic imaging; defect recognition; deep convolutional neural network; data denoising and augmentation.
    DOI: 10.1504/IJCAT.2025.10074854
     
  • Design of online error monitoring system for capacitive voltage transformer based on LightGBM and PSO   Order a copy of this article
    by Pengcheng Li, Zhiyi Qu, Longxiang Wei, Xiujiang Yang 
    Abstract: This article introduces a new online error monitoring system for Capacitive Voltage Transformers (CVTs) that uses Light Gradient Boosting Machines (LightGBM) and Particle Swarm Optimisation (PSO). The LightGBM model adopts mutually exclusive feature grouping and gradient-based instance selection strategies, which can capture the complicated non-linear relationships between CVT measurement errors and their influencing factors. The PSO algorithm adds a self-adaptive inertia weight strategy to optimise the weights of each model and obtain the best possible error estimates. Under the same conditions, the proposed method achieved higher prediction accuracy than the other methods compared, while also consuming less time, being more robust to noise disturbances and offering better extensibility. The proposed online error monitoring system offers a reliable and efficient solution for ensuring the accuracy and stability of electrical energy measurement in power systems, enabling proactive maintenance strategies and enhancing the overall reliability of the power grid.
    Keywords: capacitive voltage transformer; online error monitoring; LightGBM; particle swarm optimisation; combined prediction model.
    DOI: 10.1504/IJCAT.2025.10074675
     
  • Sports injury prediction based on sensor information fusion and neural network   Order a copy of this article
    by Ying Song 
    Abstract: This paper proposes a sensor information fusion method for sports injury prediction. The hole effect is eliminated by accumulating multi-frame differences. On this basis, accurate motion regions are determined by fusing sensor information to monitor motion in different scenes. The non-stationary signals in the monitoring results are analysed using wavelet analysis to obtain motion injury characteristics. Machine learning algorithms can be trained on this sensor data to develop predictive models for sports injuries. Sensor information fusion is combined with a wavelet radial basis function (RBF) neural network to obtain the wavelet eigenvector of all sensors. The RBF neural network outputs a value when its input data match a given risk level, thereby predicting sports injury. The results show that the proposed model performs well in prediction accuracy and running time, and can provide real-time feedback to athletes and coaches.
    Keywords: sports injury; sensor information fusion; RBF; wavelet; neural network.
    DOI: 10.1504/IJCAT.2025.10073307
     
  • Adaptive constraint multi-objective evolutionary computation industrial economic optimisation in smart city   Order a copy of this article
    by Yao Lv, Zimeng Guo 
    Abstract: This paper introduces an Adaptive Constraint Multi-objective Evolutionary Algorithm for Smart City Industrial Economics (ACMEA-SCIE). ACMEA-SCIE employs a dual reproduction strategy, evolving two complementary populations: a main population for exploring diverse industrial configurations and an archive population for preserving high-quality solutions. Additionally, a dynamic fitness allocation function adaptively balances objective optimisation and constraint handling, while an innovative archive update mechanism maintains solution diversity. The algorithm's performance was evaluated on three benchmark sets: smart city resource allocation, industrial ecosystem optimisation and dynamic urban industrial planning. Experimental results demonstrate ACMEA-SCIE's superior performance compared to state-of-the-art algorithms, achieving significant improvements in both inverted generational distance and hypervolume metrics. Additional analyses, including convergence performance and solution distribution, further validate ACMEA-SCIE's effectiveness. The proposed algorithm shows remarkable adaptability across various problem types, enhanced constraint handling and improved multi-objective balancing.
    Keywords: smart city; industrial economic; evolutionary computation; multi-objective optimisation.
    DOI: 10.1504/IJCAT.2025.10073306
     
  • Evaluation of ultra-large-scale English translation mechanism based on Bi-LSTM   Order a copy of this article
    by Yafei Bi 
    Abstract: How to effectively extract and utilise syntactic features in the model is an issue worthy of further study in the current translation quality estimation task. This paper introduces a Bi-directional Long Short-Term Memory (Bi-LSTM)-based English translation mechanism evaluation model aimed at providing fast and accurate feedback to enhance machine translation systems. The proposed model incorporates the following strategies. First, we utilise the Skip-gram model and the Continuous Bag of Words (CBOW) model of Word2Vec to preprocess text data before feature extraction. Second, we utilise three types of translation features to improve the performance of translation evaluation: word prediction features, word-embedding features and syntactic structure features. Third, we design an English translation mechanism evaluation model based on the Bi-LSTM model by fusing the three types of extracted features. The experimental results demonstrate that the proposed approach achieves favourable evaluation performance.
    Keywords: machine learning; English translation; evaluation model; neural network; feature extraction.
    DOI: 10.1504/IJCAT.2025.10073308
     
  • A large-scale high-definition music performance strategy based on the combination of reality and Metaverse   Order a copy of this article
    by Minglong Wang, Shimanqi Kong, Daohua Pan 
    Abstract: In this paper, we explore the intersection of the Metaverse, music generation, deep learning and performance strategy. Deep learning techniques have shown promise in generating music, and can be applied to create personalised soundscapes for users in the Metaverse. However, creating music with deep learning is a complex process that requires careful consideration of performance strategy. Factors such as data quality, model selection and training methodology can significantly influence the quality of generated music. We propose a method for large-scale high-definition music generation and dance performance that combines Metaverse and deep learning techniques. First, we use the Transformer model to generate polyphonic music. Then, we use a Variational Autoencoder (VAE) model to encode dance movements. Finally, we use a joint attention mechanism to map music to dance performances. Experimental results and comparative analysis show the effectiveness of the proposed method.
    Keywords: reality and metaverse; deep learning; music generation; large-scale creation.
    DOI: 10.1504/IJCAT.2025.10073494
     
  • Preschool education video image optimisation mechanism based on deep evolutionary learning in smart city   Order a copy of this article
    by Junqing Fan 
    Abstract: Preschool education is an important stage of basic education, and the richness and quality of its teaching resources directly affect the growth and development of children. To better optimise the processing of preschool education video images and improve their clarity, this paper proposes a deep evolutionary learning method based on the Improved Whale Optimisation Algorithm and Bi-directional Long Short-Term Memory (IWOA-BiLSTM). BiLSTM utilises the temporal information between adjacent frames of preschool education video images to preserve the time series output in the feature map of the images. This allows the model to fully learn the information between adjacent frames, so that the optimised image contains richer information. IWOA is used to optimise the key parameters of BiLSTM and improve its optimisation performance. Finally, experiments show that IWOA-BiLSTM can effectively optimise preschool education video images in smart cities.
    Keywords: deep learning; image optimisation; evolutionary algorithms; preschool education video; smart city.
    DOI: 10.1504/IJCAT.2025.10073495
     
  • Graphic design optimisation mechanism based on deep learning in smart cities   Order a copy of this article
    by Yinan Chen 
    Abstract: Set against the background of smart cities, this article analyses and optimises urban graphic design using deep learning, and proposes an improved UNet model based on coordinate attention (CA-IUN). First, we improve the model based on UNet. The Improved UNet model (IUN) replaces some traditional convolutions in the encoding and decoding stages with dilated convolutions. Then, transposed convolution is used for upsampling, replacing traditional linear interpolation. We also design multi-scale fusion using phantom convolution and SENet. CA-IUN adds a coordinate attention module to the encoder and decoder of IUN to focus on the specific positions of features. In addition, this article combines the perceptual loss and smooth L1 loss functions to train the network. Finally, experiments show that CA-IUN outperforms other models in optimising graphic design, indicating that CA-IUN can effectively achieve more refined and efficient graphic design optimisation in smart cities.
    Keywords: deep learning; graphic design optimisation; smart cities.
    DOI: 10.1504/IJCAT.2025.10073305
     
  • Digital media art design mechanism based on reinforcement learning in smart city   Order a copy of this article
    by Xin He 
    Abstract: Digital media art has emerged as a pivotal domain that intersects technology, culture and urban life, transforming public spaces and offering novel forms of interaction and expression. In this paper, we propose a novel framework that leverages Generative Adversarial Networks (GANs) and Reinforcement Learning (RL) for 3D face reconstruction in digital media art design. We train and evaluate our model with rigorous experiments on a public data set, comparing its performance against several state-of-the-art methods. Our proposed model demonstrates superior performance in two metrics. Additionally, we conduct convergence analysis and input-noise robustness experiments to further validate our approach. The results highlight the effectiveness of our method in producing high-quality, realistic and robust 3D face reconstructions, underscoring its potential for enhancing digital media art installations in smart cities.
    Keywords: digital media; smart city; reinforcement learning.
    DOI: 10.1504/IJCAT.2025.10073496
     
  • Adaptability analysis of artificial intelligence and evolutionary computation in modelling and prediction of complex economic systems   Order a copy of this article
    by Na Tao 
    Abstract: The rapid advancements in Artificial Intelligence (AI) and Evolutionary Computation (EC) have paved the way for innovative solutions to complex economic modelling and prediction challenges. In this paper, we present a novel approach that integrates Deep Belief Networks (DBNs) with Particle Swarm Optimisation (PSO) to enhance the accuracy and robustness of exchange rate predictions in the Forex market. The proposed hybrid DBN-PSO model leverages the deep learning capabilities of DBNs to capture intricate data patterns, while PSO optimises the hyperparameters to achieve optimal performance. Extensive experiments on historical Forex data demonstrate that the DBN-PSO model significantly outperforms the compared models on four metrics. Visual analyses further illustrate the close alignment between predicted and actual exchange rates, underscoring the model's predictive accuracy and reliability. This research contributes to the advancement of economic forecasting by providing a robust and efficient tool for modelling and predicting complex economic systems.
    Keywords: artificial intelligence; evolutionary computation; complex economic systems; adaptability analysis.
    DOI: 10.1504/IJCAT.2025.10073633
     
  • Deep learning-powered automatic assessment mechanism in enhancing spoken English fluency   Order a copy of this article
    by Chunyan Xu 
    Abstract: Spoken English is essential for individuals who wish to work or study in an English-speaking environment. It is the primary means of communication for many professions, including business, education and healthcare. To improve the efficiency of spoken English learning, an end-to-end automatic English assessment method based on deep learning is designed. At the input level, the words are represented as a sequence tensor, where each position corresponds to the pre-trained word vector, and the high-level information is obtained using a bi-directional Long Short-Term Memory (LSTM) network. The attention mechanism is integrated into the network in the acoustic model layer to improve the method's efficiency. In the output layer, the expression of words is connected with the spoken English expression, and the Softmax function is used to predict the grades. Simulation results show that the proposed method performs better than traditional LSTM and gated recurrent unit models.
    Keywords: spoken English; automatic assessment; deep learning; LSTM.
    DOI: 10.1504/IJCAT.2025.10073634
     
  • Dual-phase temporal attention framework for energy-aware music recommendation   Order a copy of this article
    by Long Tang 
    Abstract: Personalised music recommendation systems build preference models based on users' listening history to suggest music aligned with their interests. As music streaming data volumes increase exponentially, energy consumption has become a critical concern in processing these recommendations. This paper introduces a novel energy-conscious approach to music recommendation. First, we propose a sequential preference framework that captures both enduring and recent user preferences using temporal attention networks. Second, we develop a cascaded decomposition technique to address data sparsity and imbalance challenges in large-scale music interaction data sets. Finally, we implement an energy-aware computation strategy that optimises resource utilisation during recommendation processing. Our experimental results demonstrate that the proposed framework outperforms baseline methods across multiple evaluation metrics while reducing energy consumption by up to 25%. Ablation studies confirm each component's effectiveness in enhancing recommendation quality and energy efficiency.
    Keywords: music recommendation; temporal attention; energy-aware computation; sequential preference.
    DOI: 10.1504/IJCAT.2025.10073545
     
  • Deep learning driven dynamic image processing in film and television animation   Order a copy of this article
    by Ran Zhang 
    Abstract: The continuing growth of film and television animation has greatly increased the volume of dynamic image data and made dynamic image processing more complex. This paper proposes an HDR-Net method for dynamic image processing that draws on multiple fields, including image processing, film and television animation, and deep learning. HDR-Net is built as an improvement on U-Net, which requires less computation. At the same time, global feature modules are added to handle large weak-texture areas. In addition, this paper conducts extensive simulation experiments on HDR-Net and on the traditional SegNet and U-SegNet methods, and carries out functional and performance tests of these methods on the data set. The results show that HDR-Net achieves higher prediction accuracy and better sensory effects when used to process dynamic images in film and television animation.
    Keywords: image processing; deep learning; HDR-Net method; film and television animation.
    DOI: 10.1504/IJCAT.2025.10074851