Abstract
Deep learning has achieved great success in many visual recognition tasks, including object detection. Nevertheless, existing deep networks are computationally expensive and memory intensive, hindering their deployment in resource-constrained environments such as the mobile or embedded devices widely used by city travellers. Recently, a case study using Google Street View (GSV) has shown that street imagery can be a valid means of estimating city-level travel patterns, addressing a critical challenge in transport object detection. This paper presents a compressed deep network that uses tensor decomposition to detect transport objects in GSV images in a sustainable and eco-friendly manner. In particular, a new dataset named Transport Mode Share-Tokyo (TMS-Tokyo) is created to serve the public in transport object detection. It is built by selecting and filtering 32,555 acquired images containing 50,827 visible transport objects (including cars, pedestrians, buses, trucks, motorcycles, vans, cyclists and parked bicycles) from the GSV imagery of Tokyo. A compressed convolutional neural network, termed SVDet, is then proposed for street-view object detection by applying tensor train decomposition to a given baseline detector. Experimental results on the TMS-Tokyo dataset demonstrate that SVDet achieves promising performance in comparison with conventional deep detection networks.
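The abstract states that SVDet compresses a baseline detector via tensor train (TT) decomposition. As background, the generic TT-SVD procedure can be sketched as follows; this is not the authors' SVDet implementation, and the function names, toy tensor and ranks below are purely illustrative. A reshaped weight tensor is factored into a chain of small three-way cores by sequential truncated SVDs:

```python
import numpy as np

def tt_svd(tensor, max_ranks):
    """TT-SVD: factor a d-way tensor into d three-way TT cores."""
    shape = tensor.shape
    cores, C, r_prev = [], tensor, 1
    for k in range(len(shape) - 1):
        C = C.reshape(r_prev * shape[k], -1)          # unfold along mode k
        U, S, Vt = np.linalg.svd(C, full_matrices=False)
        r = min(max_ranks[k], S.size)                 # truncate to the target TT-rank
        cores.append(U[:, :r].reshape(r_prev, shape[k], r))
        C = S[:r, None] * Vt[:r]                      # carry the residual factor forward
        r_prev = r
    cores.append(C.reshape(r_prev, shape[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the TT cores back into the full tensor."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.squeeze(axis=(0, -1))

# Demo: decompose a small 4-way tensor at full TT-ranks (exact reconstruction).
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4, 4, 4))
cores = tt_svd(W, [4, 16, 4])
W_hat = tt_reconstruct(cores)
```

When the ranks are truncated below their maximal values, the cores store far fewer parameters than the dense tensor (at the cost of an approximation error), which is the source of the memory and compute savings that compressed detectors of this kind target.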
Acknowledgements
This work is supported in part by the Strategic Partner Acceleration Award (80761-AU201), funded under the Ser Cymru II programme, UK. The first author is supported by a full International PhD Scholarship awarded by Aberystwyth University.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Bai, Y., Shang, C., Li, Y., Shen, L., Zeng, X., Shen, Q. (2024). Transport Object Detection in Street View Imagery Using Decomposed Convolutional Neural Networks. In: Panoutsos, G., Mahfouf, M., Mihaylova, L.S. (eds) Advances in Computational Intelligence Systems. UKCI 2022. Advances in Intelligent Systems and Computing, vol 1454. Springer, Cham. https://doi.org/10.1007/978-3-031-55568-8_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-55567-1
Online ISBN: 978-3-031-55568-8
eBook Packages: Intelligent Technologies and Robotics