
BEV-radar: bidirectional radar-camera fusion for 3D object detection


    Abstract: Exploring millimeter-wave radar data as a complement to RGB images for improving 3D object detection has become an emerging trend in autonomous driving. However, existing radar-camera fusion methods are highly dependent on prior camera detection results, which leaves overall performance unsatisfactory. In this paper, we propose BEV-radar, a bidirectional fusion scheme in the bird's-eye view (BEV) that does not depend on prior camera detection results. To fuse features from the two modalities, which originate in different domains, our method adopts a bidirectional attention-based fusion strategy. Specifically, building on a BEV-based 3D detection framework, our method uses a bidirectional transformer to embed information from both modalities and enforces local spatial relationships through subsequent convolution blocks. The fused BEV features are then decoded by a 3D object prediction head. We evaluate our method on the nuScenes dataset, achieving 48.2 mAP and 57.6 NDS. The results show considerable improvements over the camera-only baseline, especially in velocity prediction. The code is available at https://github.com/Etah0409/BEV-Radar.
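    To make the fusion scheme concrete, below is a minimal PyTorch sketch of bidirectional cross-attention between camera and radar BEV feature maps, followed by a convolution block that restores local spatial structure. It illustrates the general idea only: the class name, dimensions, normalization, and single-block layout are assumptions, not the paper's actual architecture (the authors' code is at the repository linked above).

    # Illustrative sketch only (assumed details): bidirectional attention-based
    # fusion of camera and radar BEV features, as described in the abstract.
    import torch
    import torch.nn as nn

    class BidirectionalBEVFusion(nn.Module):
        def __init__(self, dim=256, num_heads=8):
            super().__init__()
            # Each modality queries the other ("bidirectional" attention).
            self.cam_to_radar = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            self.radar_to_cam = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            self.norm_cam = nn.LayerNorm(dim)
            self.norm_radar = nn.LayerNorm(dim)
            # A convolution over the fused BEV grid enforces local spatial
            # relationships that token-wise attention alone does not capture.
            self.conv = nn.Sequential(
                nn.Conv2d(2 * dim, dim, kernel_size=3, padding=1),
                nn.BatchNorm2d(dim),
                nn.ReLU(inplace=True),
            )

        def forward(self, cam_bev, radar_bev):
            # cam_bev, radar_bev: (B, C, H, W) feature maps on a shared BEV grid.
            b, c, h, w = cam_bev.shape
            cam_seq = cam_bev.flatten(2).transpose(1, 2)      # (B, H*W, C)
            radar_seq = radar_bev.flatten(2).transpose(1, 2)  # (B, H*W, C)
            cam_upd, _ = self.cam_to_radar(cam_seq, radar_seq, radar_seq)
            radar_upd, _ = self.radar_to_cam(radar_seq, cam_seq, cam_seq)
            cam_seq = self.norm_cam(cam_seq + cam_upd)        # residual + norm
            radar_seq = self.norm_radar(radar_seq + radar_upd)
            cam_map = cam_seq.transpose(1, 2).reshape(b, c, h, w)
            radar_map = radar_seq.transpose(1, 2).reshape(b, c, h, w)
            # Concatenate the updated maps and fuse; the output would feed a
            # 3D object prediction head downstream.
            return self.conv(torch.cat([cam_map, radar_map], dim=1))

    Calling BidirectionalBEVFusion() on two (B, 256, H, W) tensors yields a fused (B, 256, H, W) BEV map. The attention step aligns the two modalities globally, while the convolution block reimposes locality, mirroring the transformer-plus-convolution split described in the abstract.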

     
