Talisman: Targeted Active Learning for Object Detection with Rare Classes and Slices Using Submodular Mutual Information

  • Conference paper
Computer Vision – ECCV 2022 (ECCV 2022)

Abstract

Deep neural network based object detectors have shown great success in a variety of domains such as autonomous vehicles and biomedical imaging; however, their success depends on the availability of a large amount of data from the domain of interest. While deep models perform well in terms of overall accuracy, they often struggle on rare yet critical data slices, for example, detecting objects in slices like “motorcycles at night” or “bicycles at night” for self-driving applications. Active learning (AL) is a paradigm to incrementally and adaptively build training datasets with a human in the loop. However, current AL-based acquisition functions are not well equipped to mine rare slices of data from large real-world datasets, since they are based on uncertainty scores or global descriptors of the image. We propose Talisman, a novel framework for Targeted Active Learning for object detectIon with rare slices using Submodular MutuAl iNformation. Our method uses submodular mutual information functions instantiated using features of the region of interest (RoI) to efficiently target and acquire images with rare slices. We evaluate our framework on the standard PASCAL VOC07+12 [8] and on BDD100K [31], a real-world large-scale driving dataset. We observe that Talisman consistently outperforms a wide range of AL methods by \(\approx\) 5%–14% in terms of average precision on rare slices, and \(\approx\) 2%–4% in terms of mAP. The code for Talisman is available here: https://github.com/surajkothawade/talisman.
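
To make the idea of targeting with RoI features concrete, the following is a minimal sketch, not the authors' implementation. It assumes (a) that the similarity between an unlabeled image and a query image of the rare slice is the maximum cosine similarity over all pairs of their RoI feature vectors, and (b) that the submodular mutual information function is a facility-location variant from the submodular information measures literature, \(I(A;Q) = \sum_{q \in Q} \max_{i \in A} s_{iq} + \eta \sum_{i \in A} \max_{q \in Q} s_{iq}\), maximized greedily. The abstract does not say which function or aggregation is used, and all function names, the parameter `eta`, and the toy features below are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def image_query_similarity(image_rois, query_rois):
    """Assumed aggregation: best match over all (image RoI, query RoI) pairs."""
    return max(cosine(r, q) for r in image_rois for q in query_rois)


def smi_value(selected, sim, eta=1.0):
    """Facility-location style mutual information between `selected` and the queries.

    sim[i][q] is the similarity of unlabeled image i to query image q.
    """
    if not selected:
        return 0.0
    n_queries = len(sim[0])
    # How well each query image is represented by the selected set.
    query_cover = sum(max(sim[i][q] for i in selected) for q in range(n_queries))
    # How relevant each selected image is to its closest query image.
    relevance = sum(max(sim[i]) for i in selected)
    return query_cover + eta * relevance


def greedy_select(sim, budget, eta=1.0):
    """Greedily pick `budget` images with the largest marginal gain."""
    selected = []
    remaining = set(range(len(sim)))
    for _ in range(min(budget, len(sim))):
        base = smi_value(selected, sim, eta)
        best = max(
            sorted(remaining),
            key=lambda i: smi_value(selected + [i], sim, eta) - base,
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy 2-D RoI features (hypothetical): one query image of the rare slice
# and three unlabeled images, each given as a list of RoI feature vectors.
query_images = [[[1.0, 0.0]]]
pool = [
    [[0.9, 0.1], [0.0, 1.0]],  # image 0: one RoI close to the query
    [[0.0, 1.0]],              # image 1: unrelated RoI only
    [[1.0, 0.0]],              # image 2: RoI identical to the query
]
sim = [[image_query_similarity(img, q) for q in query_images] for img in pool]
picked = greedy_select(sim, budget=2)
```

In this toy pool the greedy step first takes image 2 and then image 0, and never the unrelated image 1. Using the maximum over RoI pairs means a single matching object is enough to make an image relevant, which a global image descriptor would tend to dilute.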

Notes

  1. An unfortunate example of this is the self-driving car crash with Uber: https://www.theverge.com/2019/11/6/20951385/uber-self-driving-crash-death-reason-ntsb-dcouments where the self-driving car did not detect a pedestrian on a highway at night, resulting in a fatal accident.

  2. See torch.tensordot.

References

  1. Arthur, D., Vassilvitskii, S.: k-means++: the advantages of careful seeding. In: SODA 2007: Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, pp. 1027–1035. Society for Industrial and Applied Mathematics, Philadelphia (2007)

  2. Ash, J.T., Zhang, C., Krishnamurthy, A., Langford, J., Agarwal, A.: Deep batch active learning by diverse, uncertain gradient lower bounds. arXiv preprint arXiv:1906.03671 (2019)

  3. Bach, F.: Learning with submodular functions: a convex optimization perspective. arXiv preprint arXiv:1111.6453 (2011)

  4. Bach, F.: Submodular functions: from discrete to continuous domains. Math. Program. 175(1), 419–459 (2019)

  5. Beck, N., Sivasubramanian, D., Dani, A., Ramakrishnan, G., Iyer, R.: Effective evaluation of deep active learning on image classification tasks. arXiv preprint arXiv:2106.15324 (2021)

  6. Brust, C.A., Käding, C., Denzler, J.: Active learning for deep object detection. arXiv preprint arXiv:1809.09875 (2018)

  7. Desai, S.V., Chandra, A.L., Guo, W., Ninomiya, S., Balasubramanian, V.N.: An adaptive supervision framework for active learning in object detection. arXiv preprint arXiv:1908.02454 (2019)

  8. Everingham, M., Van Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The pascal visual object classes (VOC) challenge. Int. J. Comput. Vision 88(2), 303–338 (2010)

  9. Fujishige, S.: Submodular functions and optimization. Elsevier (2005)

  10. Haussmann, E., et al.: Scalable active learning for object detection. In: 2020 IEEE Intelligent Vehicles Symposium (IV), pp. 1430–1435. IEEE (2020)

  11. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

  12. Iyer, R., Bilmes, J.: A memoization framework for scaling submodular optimization to large scale problems. In: The 22nd International Conference on Artificial Intelligence and Statistics, pp. 2340–2349. PMLR (2019)

  13. Iyer, R., Khargonkar, N., Bilmes, J., Asnani, H.: Submodular combinatorial information measures with applications in machine learning. In: Algorithmic Learning Theory, pp. 722–754. PMLR (2021)

  14. Iyer, R.K.: Submodular optimization and machine learning: Theoretical results, unifying and scalable algorithms, and applications. Ph.D. thesis (2015)

  15. Kao, C.-C., Lee, T.-Y., Sen, P., Liu, M.-Y.: Localization-aware active learning for object detection. In: Jawahar, C.V., Li, H., Mori, G., Schindler, K. (eds.) ACCV 2018. LNCS, vol. 11366, pp. 506–522. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-20876-9_32

  16. Kaushal, V., Iyer, R., Kothawade, S., Mahadev, R., Doctor, K., Ramakrishnan, G.: Learning from less data: a unified data subset selection and active learning framework for computer vision. In: 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1289–1299. IEEE (2019)

  17. Kaushal, V., Kothawade, S., Ramakrishnan, G., Bilmes, J., Iyer, R.: Prism: a unified framework of parameterized submodular information measures for targeted data subset selection and summarization. arXiv preprint arXiv:2103.00128 (2021)

  18. Kirsch, A., Van Amersfoort, J., Gal, Y.: Batchbald: efficient and diverse batch acquisition for deep bayesian active learning. Adv. Neural. Inf. Process. Syst. 32, 7026–7037 (2019)

  19. Kothawade, S., Beck, N., Killamsetty, K., Iyer, R.: Similar: submodular information measures based active learning in realistic scenarios. arXiv preprint arXiv:2107.00717 (2021)

  20. Kothyari, M., Mekala, A.R., Iyer, R., Ramakrishnan, G., Jyothi, P.: Personalizing ASR with limited data using targeted subset selection. arXiv preprint arXiv:2110.04908 (2021)

  21. Mirzasoleiman, B., Badanidiyuru, A., Karbasi, A., Vondrák, J., Krause, A.: Lazier than lazy greedy. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 29 (2015)

  22. Nemhauser, G.L., Wolsey, L.A., Fisher, M.L.: An analysis of approximations for maximizing submodular set functions-i. Math. Program. 14(1), 265–294 (1978)

  23. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. Adv. Neural. Inf. Process. Syst. 28, 91–99 (2015)

  24. Roth, D., Small, K.: Margin-based active learning for structured output spaces. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds.) ECML 2006. LNCS (LNAI), vol. 4212, pp. 413–424. Springer, Heidelberg (2006). https://doi.org/10.1007/11871842_40

  25. Roy, S., Unmesh, A., Namboodiri, V.P.: Deep active learning for object detection. In: BMVC, vol. 362, p. 91 (2018)

  26. Sener, O., Savarese, S.: Active learning for convolutional neural networks: a core-set approach. arXiv preprint arXiv:1708.00489 (2017)

  27. Settles, B.: Active learning literature survey (2009)

  28. Tohidi, E., Amiri, R., Coutino, M., Gesbert, D., Leus, G., Karbasi, A.: Submodularity in action: from machine learning to signal processing applications. IEEE Signal Process. Mag. 37(5), 120–133 (2020)

  29. Wang, D., Shang, Y.: A new active labeling method for deep learning. In: 2014 International Joint Conference on Neural Networks (IJCNN), pp. 112–119. IEEE (2014)

  30. Wei, K., Iyer, R., Bilmes, J.: Submodularity in data subset selection and active learning. In: International Conference on Machine Learning, pp. 1954–1963. PMLR (2015)

  31. Yu, F., Xian, W., Chen, Y., Liu, F., Liao, M., Madhavan, V., Darrell, T.: BDD100K: a diverse driving video database with scalable annotation tooling. arXiv preprint arXiv:1805.04687 (2018)

Acknowledgments.

This work is supported by the National Science Foundation under Grant No. IIS-2106937, a startup grant from UT Dallas, research awards from Google, Adobe, and Amazon, and an Adobe data science award.

Author information

Corresponding author

Correspondence to Suraj Kothawade.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 1575 KB)

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Kothawade, S., Ghosh, S., Shekhar, S., Xiang, Y., Iyer, R. (2022). Talisman: Targeted Active Learning for Object Detection with Rare Classes and Slices Using Submodular Mutual Information. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13698. Springer, Cham. https://doi.org/10.1007/978-3-031-19839-7_1

  • DOI: https://doi.org/10.1007/978-3-031-19839-7_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19838-0

  • Online ISBN: 978-3-031-19839-7

  • eBook Packages: Computer Science (R0)
