
Transport Object Detection in Street View Imagery Using Decomposed Convolutional Neural Networks

  • Conference paper
Advances in Computational Intelligence Systems (UKCI 2022)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1454)


Abstract

Deep learning has achieved great success in many visual recognition tasks, including object detection. Nevertheless, existing deep networks are computationally expensive and memory intensive, hindering their deployment in resource-constrained environments such as the mobile or embedded devices widely used by city travellers. Recently, a case study with Google Street View (GSV) has shown that estimating city-level travel patterns from street imagery is a potentially valid approach, which makes transport object detection a critical challenge. This paper presents a compressed deep network that uses tensor decomposition to detect transport objects in GSV images, making the approach more sustainable and eco-friendly. In particular, a new dataset named Transport Mode Share-Tokyo (TMS-Tokyo) is created and made available to the public for transport object detection. It is built by selecting and filtering 32,555 images from the GSV imagery of Tokyo, containing 50,827 visible transport objects (cars, pedestrians, buses, trucks, motors, vans, cyclists and parked bicycles). A compressed convolutional neural network, termed SVDet, is then proposed for street view object detection by applying tensor train decomposition to a given baseline detector. Experimental results on the TMS-Tokyo dataset demonstrate that SVDet achieves promising performance in comparison with conventional deep detection networks.
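The central technical idea of SVDet is to compress a baseline detector by applying tensor train (TT) decomposition to its convolutional kernels. The exact layers factorized, the TT ranks and the fine-tuning procedure are given in the full text; the sketch below is only a generic illustration of the standard TT-SVD algorithm applied to a single 4-way convolution kernel. The helper name tt_svd, the kernel shape and the ranks are illustrative assumptions, not taken from the paper.

import numpy as np

def tt_svd(tensor, ranks):
    # Standard TT-SVD: decompose a d-way tensor into d cores via sequential
    # truncated SVDs; ranks lists the d-1 internal TT ranks to keep.
    shape, d = tensor.shape, tensor.ndim
    cores, r_prev = [], 1
    unfolding = tensor.reshape(shape[0], -1)
    for k in range(d - 1):
        U, S, Vt = np.linalg.svd(unfolding, full_matrices=False)
        r = min(ranks[k], S.size)                          # clip rank if necessary
        cores.append(U[:, :r].reshape(r_prev, shape[k], r))
        unfolding = (S[:r, None] * Vt[:r]).reshape(r * shape[k + 1], -1)
        r_prev = r
    cores.append(unfolding.reshape(r_prev, shape[-1], 1))
    return cores

# Hypothetical example: compress one 3x3 convolution kernel with 256 input
# and 256 output channels; the ranks below are illustrative only.
W = np.random.randn(256, 256, 3, 3).astype(np.float32)
cores = tt_svd(W, ranks=[64, 16, 4])
print("original parameters:", W.size)
print("TT parameters:      ", sum(core.size for core in cores))

Contracting the cores in order reconstructs an approximation of the original kernel; in a compressed detector the factorized layers are typically fine-tuned afterwards to recover accuracy.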



Acknowledgements

This work is supported in part by the Strategic Partner Acceleration Award (80761-AU201), funded under the Ser Cymru II programme, UK. The first author is supported with a full International PhD Scholarship awarded by Aberystwyth University.

Author information


Corresponding author

Correspondence to Yunpeng Bai.



Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Bai, Y., Shang, C., Li, Y., Shen, L., Zeng, X., Shen, Q. (2024). Transport Object Detection in Street View Imagery Using Decomposed Convolutional Neural Networks. In: Panoutsos, G., Mahfouf, M., Mihaylova, L.S. (eds) Advances in Computational Intelligence Systems. UKCI 2022. Advances in Intelligent Systems and Computing, vol 1454. Springer, Cham. https://doi.org/10.1007/978-3-031-55568-8_34
