Title: Integrating convolution and transformer for enhanced diabetic retinopathy detection
Authors: Xinrong Cao; Jie Lin; Xiaozhi Gao; Zuoyong Li
Addresses: College of Computer and Data Science/College of Software, Fuzhou University, China and Fujian Provincial Key Laboratory of Information Processing and Intelligent Control, College of Computer and Control Engineering, Minjiang University, Fuzhou, Fujian, China; College of Computer and Data Science/College of Software, Fuzhou University, China and Fujian Provincial Key Laboratory of Information Processing and Intelligent Control, College of Computer and Control Engineering, Minjiang University, Fuzhou, Fujian, China; School of Computing, University of Eastern Finland, Finland; Fujian Provincial Key Laboratory of Information Processing and Intelligent Control, College of Computer and Control Engineering, Minjiang University, Fuzhou, Fujian, China
Abstract: Diabetic retinopathy (DR) is a common complication of diabetes that can cause irreversible blindness. Deep learning models have been developed to automatically classify the severity of retinopathy. However, these methods face challenges such as a lack of long-range connections, weak interactions between images, and mismatches between lesion details and receptive fields, which limit classification accuracy. In our research, we propose a deep learning model with three main components. Firstly, a transformer structure is incorporated into a convolutional neural network to effectively utilise both local and long-range information. Secondly, disease details are aggregated from multiple images before applying self-attention, improving inter-image interactions and reducing overfitting. Lastly, an attention-based approach is proposed to filter information from different stages of feature maps and adaptively capture lesion-related details. Our experiments achieved a 5-class accuracy of 85.96% on the APTOS dataset and a 2-class accuracy of 95.33% on the Messidor dataset, surpassing recent methods.
Keywords: diabetic retinopathy; DR; convolutional neural network; transformer; cross attention; deep feature aggregation.
DOI: 10.1504/IJBIC.2024.139257
International Journal of Bio-Inspired Computation, 2024 Vol.23 No.4, pp.225 - 235
Received: 12 Nov 2023
Accepted: 18 Dec 2023
Published online: 28 Jun 2024
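The abstract describes a hybrid architecture in which a convolutional backbone captures local lesion detail while a transformer models long-range dependencies over the resulting feature tokens. The sketch below is a minimal, illustrative PyTorch example of that general idea, not the authors' published model: the class name HybridDRClassifier, the two-stage convolutional stem, and all layer sizes are assumptions made purely for demonstration.

```python
# Minimal sketch (assumed, not the paper's code) of a CNN + transformer
# classifier for 5-grade DR severity prediction.
import torch
import torch.nn as nn


class HybridDRClassifier(nn.Module):
    def __init__(self, num_classes: int = 5, embed_dim: int = 256):
        super().__init__()
        # Convolutional stem: local texture/lesion features, downsampled 8x.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=4, padding=3),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, embed_dim, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(embed_dim),
            nn.ReLU(inplace=True),
        )
        # Transformer encoder: long-range interactions between spatial tokens.
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=8, batch_first=True
        )
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=2)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feat = self.cnn(x)                        # (B, C, H, W) local features
        tokens = feat.flatten(2).transpose(1, 2)  # (B, H*W, C) token sequence
        tokens = self.transformer(tokens)         # global context across tokens
        return self.head(tokens.mean(dim=1))      # average pool, then classify


if __name__ == "__main__":
    model = HybridDRClassifier()
    logits = model(torch.randn(2, 3, 224, 224))   # two dummy fundus images
    print(logits.shape)                           # torch.Size([2, 5])
```

Running the script prints the logit shape for a batch of two dummy 224x224 images, confirming the five-grade output head; the paper's cross-image aggregation and multi-stage attention modules are not reproduced here.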