Title: FCOS3Dformer: enhancing monocular 3D object detection through transformer-assisted fusion of depth information
Authors: Bingsen Hao; Zhaoxue Deng; Mingze Liu; Can Liu
Addresses: School of Mechatronics and Vehicle Engineering, Chongqing Jiaotong University, Chongqing, 400074, China (all authors)
Abstract: Existing monocular 3D object detection schemes for autonomous driving rely predominantly on local features, which prevents them from capturing global depth context and thus hinders precise 3D recognition. To address this challenge, this paper proposes FCOS3Dformer, an approach that leverages a Transformer-assisted depth information fusion scheme. First, a Transformer-based depth encoder establishes a global depth-guided region over features refined by an adaptive channel-space coordinate attention module, encapsulating both distant and nearby spatial depth information in the image. A depth decoder then facilitates interactions among object queries and between queries and the global depth features, enabling each object query to estimate its global depth from the depth-guided region. Additionally, we introduce a multi-object bounding box module that uses pseudo-labels to relax the strict constraints of the original hard labels, improving monocular depth estimation. Experimental evaluations on the KITTI dataset demonstrate the effectiveness of the proposed FCOS3Dformer method.
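To make the described pipeline concrete, the following PyTorch sketch illustrates the kind of architecture the abstract outlines: image features refined by a coordinate-attention stand-in, a Transformer depth encoder that builds a global depth-guided region, and a depth decoder in which object queries interact with each other and with the global depth features. All module names, layer counts, and dimensions here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Toy stand-in for the adaptive channel-space coordinate attention module."""
    def __init__(self, channels):
        super().__init__()
        self.fc_h = nn.Conv2d(channels, channels, kernel_size=1)
        self.fc_w = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):                                   # x: (B, C, H, W)
        # Direction-aware pooling gives height- and width-wise context maps,
        # which reweight the feature map along both spatial coordinates.
        a_h = torch.sigmoid(self.fc_h(x.mean(dim=3, keepdim=True)))  # (B, C, H, 1)
        a_w = torch.sigmoid(self.fc_w(x.mean(dim=2, keepdim=True)))  # (B, C, 1, W)
        return x * a_h * a_w

class DepthFusionHead(nn.Module):
    """Hypothetical Transformer-assisted depth fusion head (not the paper's code)."""
    def __init__(self, channels=256, num_queries=50, nhead=8, num_layers=2):
        super().__init__()
        self.coord_attn = CoordinateAttention(channels)
        enc_layer = nn.TransformerEncoderLayer(d_model=channels, nhead=nhead, batch_first=True)
        dec_layer = nn.TransformerDecoderLayer(d_model=channels, nhead=nhead, batch_first=True)
        self.depth_encoder = nn.TransformerEncoder(enc_layer, num_layers)
        self.depth_decoder = nn.TransformerDecoder(dec_layer, num_layers)
        self.queries = nn.Embedding(num_queries, channels)
        self.depth_head = nn.Linear(channels, 1)             # per-query depth estimate

    def forward(self, feat):                                 # feat: (B, C, H, W)
        feat = self.coord_attn(feat)
        tokens = feat.flatten(2).transpose(1, 2)             # (B, H*W, C)
        # Depth encoder: global self-attention over all spatial tokens builds
        # the depth-guided region covering near and far image regions.
        depth_region = self.depth_encoder(tokens)
        # Depth decoder: queries interact with each other (self-attention) and
        # with the global depth features (cross-attention).
        q = self.queries.weight.unsqueeze(0).expand(feat.size(0), -1, -1)
        q = self.depth_decoder(q, depth_region)
        return self.depth_head(q)                            # (B, num_queries, 1)

# Example usage on a dummy feature map from a monocular backbone.
head = DepthFusionHead()
depths = head(torch.randn(2, 256, 24, 78))
print(depths.shape)  # torch.Size([2, 50, 1])
```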
Keywords: autonomous driving; 3D object detection; local features; Transformer; depth information fusion; adaptive; pseudo-label; depth estimation.
DOI: 10.1504/IJVSMT.2024.142156
International Journal of Vehicle Systems Modelling and Testing, 2024 Vol.18 No.3, pp.228 - 244
Received: 17 Dec 2023
Accepted: 28 May 2024
Published online: 10 Oct 2024