Title: Multimodal comparative learning chip defect detection algorithm based on GLIP guidance

Authors: Ziyi He; Bingqi Wang; Li Ma; Jingjing Fang

Addresses: School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, China; Shanghai Radio Equipment Research Institute, 1555 Zhongchun Road, Minhang District, Shanghai, 201109, China; Naval Medical Centre, The Second Military Medical University, Shanghai, 200433, China; Naval Medical Centre, The Second Military Medical University, Shanghai, 200433, China

Abstract: In semiconductor chip production, both the process technology and the working environment affect chip quality, so detecting defects on the chip surface is crucial. In real-world settings, however, it is difficult to collect a sufficiently large and representative set of defect samples. In this paper, we propose a multimodal comparative learning approach that uses GLIP for location guidance and multi-scale fusion modules to localise defects of different shapes and sizes. In the testing phase, samples from different chip types in the training set were used to demonstrate the generalisation ability and accuracy of our model. Our method was also evaluated on the MVTec dataset to demonstrate its superiority. Image-level and pixel-level accuracies on our privately owned chip dataset reach 91.3 and 92.6, respectively, and the pixel-level accuracy on MVTec is 92.3.

Keywords: defect detection; visual language model; zero-sample inference; comparative learning; transfer learning.

DOI: 10.1504/IJSCC.2025.147351

International Journal of Systems, Control and Communications, 2025 Vol.16 No.3, pp.194 - 209

Received: 29 Jan 2025
Accepted: 02 Mar 2025

Published online: 14 Jul 2025
