Abstract
3D multi-object tracking (MOT) has witnessed numerous novel benchmarks and approaches in recent years, especially those under the “tracking-by-detection” paradigm. Despite their progress and usefulness, an in-depth analysis of their strengths and weaknesses is not yet available. In this paper, we summarize current 3D MOT methods into a unified framework by decomposing them into four constituent parts: pre-processing of detection, association, motion model, and life cycle management. We then ascribe the failure cases of existing algorithms to each component and investigate them in detail. Based on these analyses, we propose corresponding improvements that lead to a strong yet simple baseline: SimpleTrack. Comprehensive experimental results on Waymo Open Dataset and nuScenes demonstrate that our final method achieves new state-of-the-art results with minor modifications. Furthermore, we take additional steps and rethink whether current benchmarks authentically reflect the ability of algorithms to handle real-world challenges. We delve into the details of existing benchmarks and find some intriguing facts. Finally, we analyze the distribution and causes of the remaining failures in SimpleTrack and propose future directions for 3D MOT. Our code is at https://github.com/tusen-ai/SimpleTrack.
Z. Pang: This work was completed during the first author’s internship at TuSimple.
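The four-part decomposition named in the abstract (pre-processing of detection, association, motion model, and life cycle management) can be illustrated with a toy 2D tracker. The following is a minimal sketch under stated assumptions, not the SimpleTrack implementation: the `Track` class, the function names, the score threshold, the greedy center-distance matching, the constant-velocity model, and the miss-count rule are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Track:
    """Hypothetical track state: 2D position, velocity, and life cycle counters."""
    track_id: int
    x: float
    y: float
    vx: float = 0.0
    vy: float = 0.0
    hits: int = 1
    misses: int = 0


def preprocess(detections, min_score=0.3):
    """Pre-processing: keep detections (x, y, score) above an assumed threshold."""
    return [d for d in detections if d[2] >= min_score]


def predict(track, dt=1.0):
    """Motion model: constant-velocity prediction of the track position."""
    return track.x + track.vx * dt, track.y + track.vy * dt


def associate(tracks, detections, max_dist=2.0):
    """Association: greedy matching on distance between predictions and detections."""
    candidates = []
    for ti, track in enumerate(tracks):
        px, py = predict(track)
        for di, (x, y, _) in enumerate(detections):
            candidates.append((hypot(px - x, py - y), ti, di))
    candidates.sort()
    matches, used_t, used_d = [], set(), set()
    for dist, ti, di in candidates:
        if dist > max_dist:
            break  # remaining candidates are even farther apart
        if ti not in used_t and di not in used_d:
            matches.append((ti, di))
            used_t.add(ti)
            used_d.add(di)
    return matches, used_t, used_d


def step(tracks, detections, next_id, max_misses=2):
    """Process one frame and return the updated tracks and the next free ID."""
    detections = preprocess(detections)
    matches, used_t, used_d = associate(tracks, detections)
    for ti, di in matches:
        track, (x, y, _) = tracks[ti], detections[di]
        track.vx, track.vy = x - track.x, y - track.y  # displacement over dt = 1
        track.x, track.y = x, y
        track.hits += 1
        track.misses = 0
    for ti, track in enumerate(tracks):
        if ti not in used_t:
            track.x, track.y = predict(track)  # coast on the motion model
            track.misses += 1
    # Life cycle management: drop stale tracks, start new ones.
    tracks = [t for t in tracks if t.misses <= max_misses]
    for di, (x, y, _) in enumerate(detections):
        if di not in used_d:
            tracks.append(Track(next_id, x, y))
            next_id += 1
    return tracks, next_id


# Three toy frames of (x, y, score) detections.
frames = [
    [(0.0, 0.0, 0.9), (10.0, 0.0, 0.8)],
    [(1.0, 0.0, 0.9), (10.0, 1.0, 0.8), (5.0, 5.0, 0.1)],
    [(2.0, 0.0, 0.9)],
]
tracks, next_id = [], 0
for dets in frames:
    tracks, next_id = step(tracks, dets, next_id)
```

In this toy run the low-score detection in the second frame is filtered out before association, and the second object, which has no detection in the third frame, is kept alive by coasting on its predicted position rather than being deleted immediately.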
Notes
1. Please see Sect. 5.1 for our 10 Hz settings on nuScenes.
2. We use py-motmetrics [11] for the analysis.
3. Validation split comparisons are in the supplementary.
4. Because of the submission time limits of the nuScenes test set, we are only able to report the “10 Hz-One” variant in Table 5. It will be updated to “10 Hz-Two” once we have the chance.
5. The ID-Switch increases because we output more bounding boxes and IDs. The 0.003 false positives in pedestrians are caused by boxes matched with the same GT box in crowded scenes.
References
Bar-Shalom, Y., Fortmann, T.E., Cable, P.G.: Tracking and data association (1990)
Baser, E., Balasubramanian, V., Bhattacharyya, P., Czarnecki, K.: FANTrack: 3D multi-object tracking with feature association network. In: IV (2019)
Benbarka, N., Schröder, J., Zell, A.: Score refinement for confidence-based 3D multi-object tracking. In: IROS (2021)
Berclaz, J., Fleuret, F., Turetken, E., Fua, P.: Multiple object tracking using k-shortest paths optimization. IEEE Trans. Pattern Anal. Mach. Intell. 33(9), 1806–1819 (2011)
Bergmann, P., Meinhardt, T., Leal-Taixe, L.: Tracking without bells and whistles. In: ICCV (2019)
Bewley, A., Ge, Z., Ott, L., Ramos, F., Upcroft, B.: Simple online and realtime tracking. In: ICIP (2016)
Braso, G., Leal-Taixe, L.: Learning a neural solver for multiple object tracking. In: CVPR (2020)
Caesar, H., et al.: nuScenes: a multimodal dataset for autonomous driving. In: CVPR (2020)
Chiu, H., Li, J., Ambrus, R., Bohg, J.: Probabilistic 3D multi-modal, multi-object tracking for autonomous driving. In: ICRA (2021)
Chiu, H.k., Prioletti, A., Li, J., Bohg, J.: Probabilistic 3D multi-object tracking for autonomous driving. arXiv:2001.05673 (2020)
py-motmetrics Contributors: py-motmetrics. https://github.com/cheind/py-motmetrics
Fleuret, F., Berclaz, J., Lengagne, R., Fua, P.: Multicamera people tracking with a probabilistic occupancy map. IEEE Trans. Pattern Anal. Mach. Intell. 30(2), 267–282 (2007)
Gautam, S., Meyer, G.P., Vallespi-Gonzalez, C., Becker, B.C.: SDVTracker: real-time multi-sensor association and tracking for self-driving vehicles. arXiv preprint arXiv:2003.04447 (2020)
Genovese, A.F.: The interacting multiple model algorithm for accurate state estimation of maneuvering targets. J. Hopkins APL Tech. Dig. 22(4), 614–623 (2001)
He, J., Huang, Z., Wang, N., Zhang, Z.: Learnable graph matching: incorporating graph partitioning with deep feature learning for multiple object tracking. In: CVPR (2021)
Hornakova, A., Henschel, R., Rosenhahn, B., Swoboda, P.: Lifted disjoint paths with application in multiple object tracking. In: ICML (2020)
Jiang, X., Li, P., Li, Y., Zhen, X.: Graph neural based end-to-end data association framework for online multiple-object tracking. arXiv preprint arXiv:1907.05315 (2019)
Kim, A., Osep, A., Leal-Taixé, L.: EagerMOT: 3D multi-object tracking via sensor fusion. arXiv:2104.14682 (2021)
Kim, C., Li, F., Ciptadi, A., Rehg, J.M.: Multiple hypothesis tracking revisited. In: ICCV (2015)
Lan, L., Tao, D., Gong, C., Guan, N., Luo, Z.: Online multi-object tracking by quadratic pseudo-boolean optimization. In: IJCAI (2016)
Leal-Taixé, L., Canton-Ferrer, C., Schindler, K.: Learning by tracking: siamese CNN for robust target association. In: CVPR Workshops (2016)
Li, J., Gao, X., Jiang, T.: Graph networks for multiple object tracking. In: WACV (2020)
Liang, T., Lan, L., Luo, Z.: Enhancing the association in multi-object tracking via neighbor graph. arXiv preprint arXiv:2007.00265 (2020)
Liu, Q., Chu, Q., Liu, B., Yu, N.: GSM: graph similarity model for multi-object tracking. In: IJCAI (2020)
Lu, Z., Rathod, V., Votel, R., Huang, J.: RetinaTrack: online single stage joint detection and tracking. In: CVPR (2020)
Pang, J., et al.: Quasi-dense similarity learning for multiple object tracking. In: CVPR (2021)
Pang, Z., Li, Z., Wang, N.: Model-free vehicle tracking and state estimation in point cloud sequences. In: IROS (2021)
Peng, J., et al.: TPM: multiple object tracking with tracklet-plane matching. Pattern Recogn. 107, 107480 (2020)
Pirsiavash, H., Ramanan, D., Fowlkes, C.C.: Globally-optimal greedy algorithms for tracking a variable number of objects. In: CVPR (2011)
Pöschmann, J., Pfeifer, T., Protzel, P.: Factor graph based 3D multi-object tracking in point clouds. In: IROS (2020)
Qi, C.R., et al.: Offboard 3D object detection from point cloud sequences. In: CVPR (2021)
Rangesh, A., Trivedi, M.M.: No blind spots: full-surround multi-object tracking for autonomous vehicles using cameras and lidars. In: IV (2019)
Reid, D.: An algorithm for tracking multiple targets. IEEE Trans. Autom. Control 24(6), 843–854 (1979)
Rezatofighi, H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I., Savarese, S.: Generalized intersection over union. In: CVPR (2019)
Rezatofighi, S.H., Milan, A., Zhang, Z., Shi, Q., Dick, A., Reid, I.: Joint probabilistic data association revisited. In: ICCV (2015)
Sadeghian, A., Alahi, A., Savarese, S.: Tracking the untrackable: learning to track multiple cues with long-term dependencies. In: ICCV (2017)
Sun, P., et al.: Scalability in perception for autonomous driving: Waymo Open Dataset. arXiv:1912.04838 (2019)
Tang, S., Andres, B., Andriluka, M., Schiele, B.: Subgraph decomposition for multi-target tracking. In: CVPR (2015)
Tang, S., Andres, B., Andriluka, M., Schiele, B.: Multi-person tracking by multicut and deep matching. In: Hua, G., Jégou, H. (eds.) ECCV 2016. LNCS, vol. 9914, pp. 100–111. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-48881-3_8
Weng, X., Wang, J., Held, D., Kitani, K.: 3D multi-object tracking: a baseline and new evaluation metrics. In: IROS (2020)
Weng, X., Wang, Y., Man, Y., Kitani, K.: GNN3DMOT: graph neural network for 3D multi-object tracking with 2D-3D multi-feature learning. In: CVPR (2020)
Wojke, N., Bewley, A., Paulus, D.: Simple online and realtime tracking with a deep association metric. In: ICIP (2017)
Xu, Y., et al.: How to train your deep multi-object tracker. In: CVPR (2020)
Yang, B., Bai, M., Liang, M., Zeng, W., Urtasun, R.: Auto4D: learning to label 4D objects from sequential point clouds. arXiv:2101.06586 (2021)
Yang, B., Huang, C., Nevatia, R.: Learning affinities and dependencies for multi-target tracking using a CRF model. In: CVPR (2011)
Yin, T., Zhou, X., Krähenbühl, P.: Center-based 3D object detection and tracking. In: CVPR (2021)
Zaech, J., Dai, D., Liniger, A., Danelljan, M., Gool, L.V.: Learnable online graph representations for 3D multi-object tracking. arXiv:2104.11747 (2021)
Roshan Zamir, A., Dehghan, A., Shah, M.: GMCP-tracker: global multi-object tracking using generalized minimum clique graphs. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7573, pp. 343–356. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33709-3_25
Zhang, L., Li, Y., Nevatia, R.: Global data association for multi-object tracking using network flows. In: CVPR (2008)
Zhang, Y., et al.: ByteTrack: multi-object tracking by associating every detection box. arXiv preprint arXiv:2110.06864 (2021)
Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: FairMOT: on the fairness of detection and re-identification in multiple object tracking. Int. J. Comput. Vis. 129, 1–19 (2021)
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Pang, Z., Li, Z., Wang, N. (2023). SimpleTrack: Understanding and Rethinking 3D Multi-object Tracking. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13801. Springer, Cham. https://doi.org/10.1007/978-3-031-25056-9_43
DOI: https://doi.org/10.1007/978-3-031-25056-9_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25055-2
Online ISBN: 978-3-031-25056-9
eBook Packages: Computer Science (R0)