YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection (EN)
This is my paper accepted at ICRA 2021. The open-source code is at https://github.com/Owen-Liuyuxuan/visualDet3D .
The basic idea is to train a stereo 3D detection model "like" a monocular one, obtaining fast inference speed with reasonable performance. Multiple modules are introduced and merged to make this work.
Reproducing the stereo/monocular results of this paper should be rather stable with the open-source repo.
Core Operations and Code Placement
- Precomputing statistics for anchors: script (GitHub page)
- Using the statistics for anchors: head (GitHub page)
- Matching Module: lib (GitHub page)
- Ghost Module: lib (GitHub page)
- Multi-Scale Fusion: core (GitHub page)
- Disparity Loss: loss (GitHub page)
- To obtain the monocular results in this paper, simply train with the monocular 3D settings as in the GAC paper and remove the additional modules (DeformConv/GAC/...).
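The anchor-statistics idea above can be illustrated with a minimal numpy sketch. The assumption here (function and variable names are hypothetical, not from the repo) is that each ground-truth box is assigned to its closest anchor scale by 2D height, per-anchor depth mean/std are precomputed offline, and the head then regresses only a normalized depth residual that is de-normalized at decode time:

```python
import numpy as np

def precompute_anchor_statistics(gt_heights_2d, gt_depths, anchor_heights):
    """For each anchor scale, collect the depths of ground-truth boxes whose
    2D height is closest to that anchor, and return per-anchor mean/std."""
    anchor_heights = np.asarray(anchor_heights, dtype=np.float64)
    # index of the closest anchor scale for every ground-truth box
    assignment = np.abs(
        np.asarray(gt_heights_2d)[:, None] - anchor_heights[None, :]
    ).argmin(axis=1)
    means = np.zeros_like(anchor_heights)
    stds = np.ones_like(anchor_heights)
    for i in range(len(anchor_heights)):
        depths = np.asarray(gt_depths)[assignment == i]
        if len(depths) > 0:
            means[i] = depths.mean()
            stds[i] = max(depths.std(), 1e-3)  # guard against zero std
    return means, stds

def decode_depth(network_output, anchor_index, means, stds):
    """De-normalize the regressed residual with the precomputed statistics."""
    return network_output * stds[anchor_index] + means[anchor_index]

# Toy example: tall boxes are near (small depth), short boxes are far.
gt_heights = [100.0, 98.0, 30.0, 32.0]
gt_depths = [8.0, 9.0, 40.0, 44.0]
means, stds = precompute_anchor_statistics(gt_heights, gt_depths, [100.0, 30.0])
near = decode_depth(0.0, 0, means, stds)  # mean depth of the "tall" anchor
far = decode_depth(1.0, 1, means, stds)   # one std beyond the "short" anchor mean
```

The point of the statistics is that the network output stays roughly zero-mean/unit-variance across anchor scales, which is what lets the stereo head be trained with the same recipe as a monocular one.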
Results for the published model:
| Benchmark | Easy | Moderate | Hard |
| --- | --- | --- | --- |
| Car Detection | 94.75 % | 84.50 % | 62.13 % |
| Car Orientation | 93.65 % | 82.88 % | 60.92 % |
| Car 3D Detection | 65.77 % | 40.71 % | 29.99 % |
| Car Bird's Eye View | 74.00 % | 49.54 % | 36.30 % |
| Pedestrian Detection | 58.34 % | 49.54 % | 36.30 % |
| Pedestrian Orientation | 50.41 % | 36.81 % | 31.51 % |
| Pedestrian 3D Detection | 31.03 % | 20.67 % | 18.34 % |
| Pedestrian Bird's Eye View | 32.52 % | 22.74 % | 19.16 % |