
Embodied Navigation at the Art Gallery

  • Conference paper
Image Analysis and Processing – ICIAP 2022 (ICIAP 2022)

Abstract

Embodied agents, trained to explore and navigate indoor photorealistic environments, have achieved impressive results on standard datasets and benchmarks. So far, experiments and evaluations have involved domestic and working scenes such as offices, flats, and houses. In this paper, we build and release a new 3D space with unique characteristics: that of a complete art museum. We name this environment ArtGallery3D (AG3D). Compared with existing 3D scenes, the collected space is larger, richer in visual features, and provides very sparse occupancy information. This is challenging for occupancy-based agents, which are usually trained in cluttered domestic environments with plenty of occupancy information. Additionally, we annotate the coordinates of the main points of interest inside the museum, such as paintings, statues, and other items. Thanks to this manual process, we deliver a new benchmark for PointGoal navigation inside this new space. Trajectories in this dataset are far more complex and lengthy than existing ground-truth paths for navigation in Gibson and Matterport3D. We carry out an extensive experimental evaluation using our new space and show that existing methods hardly adapt to this scenario. As such, we believe that the availability of this 3D model will foster future research and help improve existing solutions.
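PointGoal navigation episodes like those in this benchmark are conventionally scored with Success weighted by Path Length (SPL), the metric proposed by Anderson et al. (arXiv:1807.06757). The sketch below illustrates the formula only; the episode tuples are made-up values, not taken from the AG3D release:

```python
def spl(episodes):
    """Mean SPL over (success, shortest_path_len, agent_path_len) tuples.

    A successful episode contributes shortest / max(taken, shortest),
    reaching 1.0 only when the agent followed the geodesic shortest
    path; a failed episode contributes 0.
    """
    total = 0.0
    for success, shortest, taken in episodes:
        if success:
            total += shortest / max(taken, shortest)
    return total / len(episodes)

# Hypothetical episodes: (reached goal?, shortest path (m), path taken (m)).
episodes = [
    (True, 10.0, 12.5),   # success with a detour -> 0.8
    (True, 8.0, 8.0),     # optimal path -> 1.0
    (False, 15.0, 30.0),  # failure -> 0.0
]
print(round(spl(episodes), 3))  # -> 0.6
```

Note that the denominator `max(taken, shortest)` keeps the per-episode score in [0, 1] even when measurement noise makes the agent's path appear shorter than the geodesic one.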



Notes

  1. The dataset has been collected at the Galleria Estense museum of Modena and can be found at https://github.com/aimagelab/ag3d.

  2. https://matterport.com/it/cameras/pro2-3D-camera.

References

  1. Anderson, P., et al.: On evaluation of embodied navigation agents. arXiv preprint arXiv:1807.06757 (2018)

  2. Anderson, P., et al.: Sim-to-real transfer for vision-and-language navigation. In: CoRL (2021)

  3. Anderson, P., et al.: Vision-and-language navigation: interpreting visually-grounded navigation instructions in real environments. In: CVPR (2018)

  4. Batra, D., et al.: ObjectNav revisited: on evaluation of embodied agents navigating to objects. arXiv preprint arXiv:2006.13171 (2020)

  5. Bigazzi, R., Landi, F., Cascianelli, S., Baraldi, L., Cornia, M., Cucchiara, R.: Focus on impact: indoor exploration with intrinsic motivation. RA-L (2022)

  6. Bigazzi, R., Landi, F., Cornia, M., Cascianelli, S., Baraldi, L., Cucchiara, R.: Explore and explain: self-supervised navigation and recounting. In: ICPR (2020)

  7. Bigazzi, R., Landi, F., Cornia, M., Cascianelli, S., Baraldi, L., Cucchiara, R.: Out of the box: embodied navigation in the real world. In: CAIP (2021)

  8. Cascianelli, S., Costante, G., Ciarfuglia, T.A., Valigi, P., Fravolini, M.L.: Full-GRU natural language video description for service robotics applications. RA-L 3(2), 841–848 (2018)

  9. Chang, A., et al.: Matterport3D: learning from RGB-D data in indoor environments. In: 3DV (2017)

  10. Chaplot, D.S., Gandhi, D., Gupta, S., Gupta, A., Salakhutdinov, R.: Learning to explore using active neural SLAM. In: ICLR (2019)

  11. Chen, T., Gupta, S., Gupta, A.: Learning exploration policies for navigation. In: ICLR (2019)

  12. Cornia, M., Baraldi, L., Cucchiara, R.: SMArT: training shallow memory-aware transformers for robotic explainability. In: ICRA (2020)

  13. Das, A., Datta, S., Gkioxari, G., Lee, S., Parikh, D., Batra, D.: Embodied question answering. In: CVPR (2018)

  14. Irshad, M.Z., Ma, C.Y., Kira, Z.: Hierarchical cross-modal agent for robotics vision-and-language navigation. In: ICRA (2021)

  15. Kadian, A., et al.: Sim2Real predictivity: does evaluation in simulation predict real-world performance? RA-L 5(4), 6670–6677 (2020)

  16. Krantz, J., Wijmans, E., Majumdar, A., Batra, D., Lee, S.: Beyond the nav-graph: vision-and-language navigation in continuous environments. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12373, pp. 104–120. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58604-1_7

  17. Landi, F., Baraldi, L., Cornia, M., Corsini, M., Cucchiara, R.: Multimodal attention networks for low-level vision-and-language navigation. CVIU 210, 103255 (2021)

  18. Niroui, F., Zhang, K., Kashino, Z., Nejat, G.: Deep reinforcement learning robot for search and rescue applications: exploration in unknown cluttered environments. RA-L 4(2), 610–617 (2019)

  19. Pathak, D., Agrawal, P., Efros, A.A., Darrell, T.: Curiosity-driven exploration by self-supervised prediction. In: ICML (2017)

  20. Ramakrishnan, S.K., Al-Halah, Z., Grauman, K.: Occupancy anticipation for efficient exploration and navigation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12350, pp. 400–418. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58558-7_24

  21. Ramakrishnan, S.K., Jayaraman, D., Grauman, K.: An exploration of embodied visual exploration. Int. J. Comput. Vis. 129(5), 1616–1649 (2021). https://doi.org/10.1007/s11263-021-01437-z

  22. Ramakrishnan, S.K., et al.: Habitat-Matterport 3D dataset (HM3D): 1000 large-scale 3D environments for embodied AI. In: NeurIPS (2021). https://openreview.net/forum?id=-v4OuqNs5P

  23. Savva, M., et al.: Habitat: a platform for embodied AI research. In: ICCV (2019)

  24. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017)

  25. Straub, J., et al.: The Replica dataset: a digital replica of indoor spaces. arXiv preprint arXiv:1906.05797 (2019)

  26. Xia, F., Zamir, A.R., He, Z., Sax, A., Malik, J., Savarese, S.: Gibson Env: real-world perception for embodied agents. In: CVPR (2018)

  27. Ye, J., Batra, D., Das, A., Wijmans, E.: Auxiliary tasks and exploration enable ObjectNav. In: ICCV (2021)

  28. Ye, J., Batra, D., Wijmans, E., Das, A.: Auxiliary tasks speed up learning point goal navigation. In: CoRL (2021)

  29. Zhu, Y., et al.: Target-driven visual navigation in indoor scenes using deep reinforcement learning. In: ICRA (2017)


Acknowledgement

This work has been supported by “Fondazione di Modena” and the “European Training Network on PErsonalized Robotics as SErvice Oriented applications” (PERSEO) MSCA-ITN-2020 project.

Author information

Correspondence to Roberto Bigazzi.



Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Bigazzi, R., Landi, F., Cascianelli, S., Cornia, M., Baraldi, L., Cucchiara, R. (2022). Embodied Navigation at the Art Gallery. In: Sclaroff, S., Distante, C., Leo, M., Farinella, G.M., Tombari, F. (eds) Image Analysis and Processing – ICIAP 2022. ICIAP 2022. Lecture Notes in Computer Science, vol 13231. Springer, Cham. https://doi.org/10.1007/978-3-031-06427-2_61


  • DOI: https://doi.org/10.1007/978-3-031-06427-2_61


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-06426-5

  • Online ISBN: 978-3-031-06427-2

  • eBook Packages: Computer Science (R0)
