Abstract
Many real-world mobile and robotic vision systems encounter occlusion or unfavorable viewpoints while performing their tasks. A remedy is active vision: physically moving the camera, or employing an additional camera, to provide other viewpoints that may yield more information for the task at hand. In object recognition, an active vision system can offer a classification decision from another viewpoint when the confidence in the current view is low. A natural question, then, is which next viewpoint is most effective at improving recognition performance. To determine the next best view, previous approaches require multiple captures of the same object in specified poses, training datasets of 3D objects, or the construction of occupancy grids, making them computation-, data-, or observation-intensive. In this paper, we propose a next-best-view method for object recognition that needs no information about objects in other viewpoints, no 3D shape models, and no multiple prior observations. The proposed approach analyzes the object’s appearance and foreshortening in the current view to rapidly decide where to look next. Test results show its efficacy in selecting the viewpoints that improve object recognition performance the most.
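To illustrate the idea of deciding where to look next from a single view, the following is a minimal, hypothetical sketch. The foreshortening measure (bounding-box aspect ratio), the 0.5 threshold, and the candidate-viewpoint scoring are illustrative assumptions, not the algorithm published in the paper.

```python
# Hypothetical sketch of single-glance next-best-view selection.
# The heuristic here (bounding-box aspect ratio as a foreshortening
# cue) is an assumption for illustration only.

def foreshortening(width: float, height: float) -> float:
    """Ratio of the shorter to the longer side of the object's
    bounding box; values near 0 suggest a strongly foreshortened view."""
    lo, hi = sorted((width, height))
    return lo / hi if hi > 0 else 0.0

def next_best_view(bbox, candidates):
    """Pick a candidate viewpoint (name -> angular offset in degrees
    from the current view). If the current view looks foreshortened,
    favor a large camera move; otherwise favor a small one."""
    f = foreshortening(*bbox)
    if f < 0.5:  # squashed appearance: move far to see another face
        return max(candidates, key=candidates.get)
    return min(candidates, key=candidates.get)

views = {"left_30": 30.0, "left_60": 60.0, "top_90": 90.0}
print(next_best_view((200.0, 60.0), views))  # strongly foreshortened view
```

In this toy version the decision is a single threshold; the appeal of a one-shot scheme is precisely that it avoids occupancy grids or repeated observations, at the cost of relying on appearance cues from the current view alone.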
Data Availability
The dataset is available at https://github.com/pouryahoseini/Next-Best-View-Dataset.
Code Availability
The code is available at https://github.com/pouryahoseini/Next-Best-View.
Acknowledgements
This work has been supported in part by the Office of Naval Research award N00014-16-1-2312 and US Army Research Laboratory (ARO) award W911NF-20-2-0084.
Author information
Contributions
Conceptualization: PH, MN, MN; methodology: PH, MN, SKP; formal analysis and investigation: PH, MN, MN; writing—original draft preparation: PH; writing—review and editing: PH, SKP, MN; funding acquisition: MN, MN; resources: MN, MN, PH, SKP; supervision: MN, MN.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Ethics approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Additional information
This article is part of the topical collection “Computer Vision, Imaging and Computer Graphics Theory and Applications” guest edited by Jose Braz, A. Augusto Sousa, Alexis Paljic, Christophe Hurter and Giovanni Maria Farinella.
Cite this article
Hoseini, P., Paul, S.K., Nicolescu, M. et al. Next Best View Planning in a Single Glance: An Approach to Improve Object Recognition. SN COMPUT. SCI. 4, 51 (2023). https://doi.org/10.1007/s42979-022-01454-w