DOI: 10.1007/978-3-031-54605-1_29

Article

COOLer: Class-Incremental Learning for Appearance-Based Multiple Object Tracking

Published: 19 September 2023

Abstract

Continual learning allows a model to learn multiple tasks sequentially while retaining old knowledge without access to the training data of preceding tasks. This paper extends the scope of continual learning research to class-incremental learning for multiple object tracking (MOT), which is desirable to accommodate the continuously evolving needs of autonomous systems. Previous solutions for continual learning of object detectors do not address the data association stage of appearance-based trackers, leading to catastrophic forgetting of previous classes’ re-identification features. We introduce COOLer, a COntrastive- and cOntinual-Learning-based tracker, which incrementally learns to track new categories while preserving past knowledge by training on a combination of currently available ground-truth labels and pseudo-labels generated by the past tracker. To further encourage the disentanglement of instance representations, we introduce a novel contrastive class-incremental instance representation learning technique. Finally, we propose a practical evaluation protocol for continual learning for MOT and conduct experiments on the BDD100K and SHIFT datasets. Experimental results demonstrate that COOLer continually learns while effectively addressing catastrophic forgetting of both tracking and detection. The project page is available at https://www.vis.xyz/pub/cooler.
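The contrastive instance representation learning mentioned in the abstract builds on the general supervised-contrastive idea: embeddings of detections belonging to the same instance identity are pulled together, while embeddings of other identities are pushed apart. The following is a minimal numpy sketch of a generic SupCon-style loss over instance embeddings; the function name, shapes, and temperature value are illustrative, and COOLer's actual class-incremental formulation differs in its details (see the paper and references [3, 4, 21]).

```python
import numpy as np

def _logsumexp(x, axis):
    # Numerically stable log-sum-exp; -inf entries contribute zero mass.
    m = np.max(x, axis=axis, keepdims=True)
    return m + np.log(np.sum(np.exp(x - m), axis=axis, keepdims=True))

def supervised_contrastive_loss(embeddings, ids, temperature=0.1):
    """Generic SupCon-style loss over instance embeddings.

    embeddings: (N, D) array of per-detection appearance embeddings.
    ids:        length-N instance-identity labels.
    Returns the mean negative log-probability of positive pairs.
    """
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = (z @ z.T) / temperature            # pairwise scaled cosine similarity
    n = len(ids)
    self_mask = np.eye(n, dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)  # exclude each anchor from its own softmax
    log_prob = sim - _logsumexp(sim, axis=1)
    ids = np.asarray(ids)
    pos = (ids[:, None] == ids[None, :]) & ~self_mask
    has_pos = pos.any(axis=1)
    # Mean log-probability of the positives for each anchor that has any.
    mean_pos = np.where(pos, log_prob, 0.0).sum(axis=1) / np.maximum(pos.sum(axis=1), 1)
    return float(-mean_pos[has_pos].mean())
```

With well-separated embeddings that agree with the identity labels the loss approaches zero, while mismatched labels drive it up, which is the property a tracker's association head relies on when matching detections across frames.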

References

[1] Aharon, N., Orfaig, R., Bobrovsky, B.Z.: BoT-SORT: robust associations multi-pedestrian tracking. arXiv preprint arXiv:2206.14651 (2022)
[2] Bernardin, K., Stiefelhagen, R.: Evaluating multiple object tracking performance: the CLEAR MOT metrics. EURASIP J. Image Video Process. 2008, 1–10 (2008)
[3] Cha, H., Lee, J., Shin, J.: Co2L: contrastive continual learning. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 9516–9525 (2021)
[4] Chen, T., Kornblith, S., Norouzi, M., Hinton, G.: A simple framework for contrastive learning of visual representations. In: International Conference on Machine Learning, pp. 1597–1607. PMLR (2020)
[5] Chen, Z., Liu, B.: Lifelong machine learning. Synth. Lect. Artif. Intell. Mach. Learn. 12(3), 1–207 (2018)
[6] Dave, A., Khurana, T., Tokmakov, P., Schmid, C., Ramanan, D.: TAO: a large-scale benchmark for tracking any object. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds.) Computer Vision – ECCV 2020, pp. 436–454. Springer, Cham (2020)
[7] Dendorfer, P., et al.: MOT20: a benchmark for multi object tracking in crowded scenes. arXiv preprint arXiv:2003.09003 (2020)
[8] Du, Y., et al.: StrongSORT: make DeepSORT great again. IEEE Trans. Multimedia 25, 8725–8737 (2023)
[9] Fan, H., Zheng, L., Yan, C., Yang, Y.: Unsupervised person re-identification: clustering and fine-tuning. ACM Trans. Multimedia Comput. Commun. Appl. 14(4), 1–18 (2018)
[10] He, K., Fan, H., Wu, Y., Xie, S., Girshick, R.: Momentum contrast for unsupervised visual representation learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9729–9738 (2020)
[11] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
[12] Joseph, K., Khan, S., Khan, F.S., Balasubramanian, V.N.: Towards open world object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5830–5840 (2021)
[13] Karthik, S., Prabhu, A., Gandhi, V.: Simple unsupervised multi-object tracking. arXiv preprint arXiv:2006.02609 (2020)
[14] Kirkpatrick, J., et al.: Overcoming catastrophic forgetting in neural networks. Proc. Natl. Acad. Sci. 114(13), 3521–3526 (2017)
[15] Li, M., Zhu, X., Gong, S.: Unsupervised tracklet person re-identification. IEEE Trans. Pattern Anal. Mach. Intell. 42(7), 1770–1782 (2019)
[16] Li, Z., Hoiem, D.: Learning without forgetting. IEEE Trans. Pattern Anal. Mach. Intell. 40(12), 2935–2947 (2017)
[17] Liu, Y., Schiele, B., Vedaldi, A., Rupprecht, C.: Continual detection transformer for incremental object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 23799–23808 (2023)
[18] Mai, Z., Li, R., Kim, H., Sanner, S.: Supervised contrastive replay: revisiting the nearest class mean classifier in online class-incremental continual learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3589–3599 (2021)
[19] Mallya, A., Lazebnik, S.: PackNet: adding multiple tasks to a single network by iterative pruning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7765–7773 (2018)
[20] McCloskey, M., Cohen, N.J.: Catastrophic interference in connectionist networks: the sequential learning problem. In: Psychology of Learning and Motivation, vol. 24, pp. 109–165. Elsevier (1989)
[21] Pang, J., et al.: Quasi-dense similarity learning for multiple object tracking. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 164–173 (2021)
[22] Peng, C., Zhao, K., Lovell, B.C.: Faster ILOD: incremental learning for object detectors based on Faster R-CNN. Pattern Recogn. Lett. 140, 109–115 (2020)
[23] Ramanan, D., Forsyth, D.A.: Finding and tracking people from the bottom up. In: 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, p. II. IEEE (2003)
[24] Rebuffi, S.A., Kolesnikov, A., Sperl, G., Lampert, C.H.: iCaRL: incremental classifier and representation learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2001–2010 (2017)
[25] Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. 28 (2015)
[26] Rusu, A.A., et al.: Progressive neural networks. arXiv preprint arXiv:1606.04671 (2016)
[27] Segu, M., Schiele, B., Yu, F.: DARTH: holistic test-time adaptation for multiple object tracking. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (2023)
[28] Shmelkov, K., Schmid, C., Alahari, K.: Incremental learning of object detectors without catastrophic forgetting. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3400–3409 (2017)
[29] Sun, T., et al.: SHIFT: a synthetic driving dataset for continuous multi-task domain adaptation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 21371–21382 (2022)
[30] Wang, Y.H.: SMILEtrack: similarity learning for multiple object tracking. arXiv preprint arXiv:2211.08824 (2022)
[31] Wojke, N., Bewley, A., Paulus, D.: Simple online and realtime tracking with a deep association metric. In: 2017 IEEE International Conference on Image Processing, pp. 3645–3649. IEEE (2017)
[32] Wu, G., Zhu, X., Gong, S.: Tracklet self-supervised learning for unsupervised person re-identification. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 12362–12369 (2020)
[33] Xie, J., Yan, S., He, X.: General incremental learning with domain-aware categorical representations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14351–14360 (2022)
[34] Yu, F., et al.: BDD100K: a diverse driving dataset for heterogeneous multitask learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2636–2645 (2020)
[35] Zhang, Y., et al.: ByteTrack: multi-object tracking by associating every detection box. In: European Conference on Computer Vision, pp. 1–21. Springer (2022)
[36] Zheng, K., Chen, C.: Contrast R-CNN for continual learning in object detection. arXiv preprint arXiv:2108.04224 (2021)
[37] Zhou, W., Chang, S., Sosa, N., Hamann, H., Cox, D.: Lifelong object detection. arXiv preprint arXiv:2009.01129 (2020)
[38] Zhu, X., Su, W., Lu, L., Li, B., Wang, X., Dai, J.: Deformable DETR: deformable transformers for end-to-end object detection. arXiv preprint arXiv:2010.04159 (2020)

Cited By

  • SLAck: Semantic, Location, and Appearance Aware Open-Vocabulary Tracking. In: Computer Vision – ECCV 2024, pp. 1–18. DOI: 10.1007/978-3-031-73383-3_1 (published 29 September 2024)


Published In

Pattern Recognition: 45th DAGM German Conference, DAGM GCPR 2023, Heidelberg, Germany, September 19–22, 2023, Proceedings
Sep 2023
647 pages
ISBN:978-3-031-54604-4
DOI:10.1007/978-3-031-54605-1
Editors: Ullrich Köthe, Carsten Rother

Publisher

Springer-Verlag

Berlin, Heidelberg


Author Tags

  1. Continual learning
  2. Multiple object tracking
  3. Re-Identification

