YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection (EN)
This is my paper accepted at ICRA 2021. The open-source code is at https://github.com/Owen-Liuyuxuan/visualDet3D .
The basic idea is to train a stereo 3D detection model "like" a monocular one, obtaining fast inference speed with reasonable performance. Multiple modules are introduced and merged to make this work.
Reproducing the stereo/monocular results of this paper should be rather stable with the open-source repo.
Core Operations and Code Placement
- Precomputing statistics for anchors: script (GitHub page)
- Using the statistics for anchors: head (GitHub page)
- Matching Module: lib (GitHub page)
- Ghost Module: lib (GitHub page)
- Multi-Scale Fusion: core (GitHub page)
- Disparity Loss: loss (GitHub page)
- To obtain the monocular results in this paper, simply train with the monocular 3D settings as in the GAC paper and remove the additional modules (DeformConv/GAC/...).
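The anchor-statistics idea above can be illustrated with a minimal numpy sketch. The assumption here (function and variable names are hypothetical, not from the repo) is that each ground-truth box is assigned to its closest anchor scale by 2D height, per-anchor depth mean/std are precomputed offline, and the head then regresses only a normalized depth residual that is de-normalized at decode time:

```python
import numpy as np

def precompute_anchor_statistics(gt_heights_2d, gt_depths, anchor_heights):
    """For each anchor scale, collect the depths of ground-truth boxes whose
    2D height is closest to that anchor, and return per-anchor mean/std."""
    anchor_heights = np.asarray(anchor_heights, dtype=np.float64)
    # index of the closest anchor scale for every ground-truth box
    assignment = np.abs(
        np.asarray(gt_heights_2d)[:, None] - anchor_heights[None, :]
    ).argmin(axis=1)
    means = np.zeros_like(anchor_heights)
    stds = np.ones_like(anchor_heights)
    for i in range(len(anchor_heights)):
        depths = np.asarray(gt_depths)[assignment == i]
        if len(depths) > 0:
            means[i] = depths.mean()
            stds[i] = max(depths.std(), 1e-3)  # guard against zero std
    return means, stds

def decode_depth(network_output, anchor_index, means, stds):
    """De-normalize the regressed residual with the precomputed statistics."""
    return network_output * stds[anchor_index] + means[anchor_index]

# Toy example: tall boxes are near (small depth), short boxes are far.
gt_heights = [100.0, 98.0, 30.0, 32.0]
gt_depths = [8.0, 9.0, 40.0, 44.0]
means, stds = precompute_anchor_statistics(gt_heights, gt_depths, [100.0, 30.0])
near = decode_depth(0.0, 0, means, stds)  # mean depth of the "tall" anchor
far = decode_depth(1.0, 1, means, stds)   # one std beyond the "short" anchor mean
```

The point of the statistics is that the network output stays roughly zero-mean/unit-variance across anchor scales, which is what lets the stereo head be trained with the same recipe as a monocular one.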
Results for the published model:
| Benchmark | Easy | Moderate | Hard |
| --- | --- | --- | --- |
| Car Detection | 94.75 % | 84.50 % | 62.13 % |
| Car Orientation | 93.65 % | 82.88 % | 60.92 % |
| Car 3D Detection | 65.77 % | 40.71 % | 29.99 % |
| Car Bird's Eye View | 74.00 % | 49.54 % | 36.30 % |
| Pedestrian Detection | 58.34 % | 49.54 % | 36.30 % |
| Pedestrian Orientation | 50.41 % | 36.81 % | 31.51 % |
| Pedestrian 3D Detection | 31.03 % | 20.67 % | 18.34 % |
| Pedestrian Bird's Eye View | 32.52 % | 22.74 % | 19.16 % |