LDCM-MVIT: A Lightweight Depth Completion Model Based on MViT

Yarui Chen, Qixiu Wu, Lianlong Sun, Shiwei Wu, Mengchen Zhang, Bingqi Wang & Tingting Zhao

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14876)

Included in the following conference series:

International Conference on Intelligent Computing

Abstract

In computer vision, many perception methods rely on depth information captured by depth cameras. However, depth maps are often incomplete because light is reflected and refracted by transparent objects. Existing depth completion methods are often impractical because of depth estimation errors or unacceptably slow inference. To address this challenge, we propose a lightweight depth completion model based on the Mobile Vision Transformer (LDCM-MViT), which uses a Mobile Guide Block (MGB). The MGB efficiently fuses features from RGB and depth maps with a limited number of parameters. Furthermore, we provide two fusion strategies for processing RGB and depth features to produce the final depth map. Finally, we compare the performance of LDCM-MViT with the DDC-SRGBD and GuideFormer models on the Matterport3D and KITTI datasets. Experimental results show that, with limited parameters, our model achieves higher accuracy than traditional methods, especially on edge devices.

References

Ma, F., Cavalheiro, G.V., Karaman, S.: Self-supervised sparse-to-dense: self-supervised depth completion from LiDAR and monocular camera. In: International Conference on Robotics and Automation (ICRA), pp. 3288–3295. IEEE (2019)
Cheng, X., Wang, P., Guan, C., et al.: Learning context and resource aware convolutional spatial propagation networks for depth completion. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 07, pp. 10615–10622 (2020)
Zhao, Y., Bai, L., Zhang, Z., et al.: A surface geometry model for LiDAR depth completion. IEEE Robot. Autom. Lett. 6(3), 4457–4464 (2021)
Wu, X., Peng, L., Yang, H., et al.: Sparse fuse dense: towards high quality 3D detection with depth completion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5418–5427 (2022)
Song, Z., Lu, J., Yao, Y., et al.: Self-supervised depth completion from direct visual-LiDAR odometry in autonomous driving. IEEE Trans. Intell. Transp. Syst. 23(8), 11654–11665 (2021)
Jaritz, M., de Charette, R., Wirbel, E., et al.: Sparse and dense data with CNNs: depth completion and semantic segmentation. In: International Conference on 3D Vision (3DV), pp. 52–60. IEEE (2018)
Tang, J., Tian, P., Feng, W., et al.: Learning guided convolutional network for depth completion. IEEE Trans. Image Process. 30, 1116–1129 (2020)
Vaswani, A., Shazeer, N., Parmar, N., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
Mehta, S., Rastegari, M.: MobileViT: light-weight, general-purpose, and mobile-friendly vision transformer. arXiv preprint arXiv:2110.02178 (2021)
Terven, J., Cordova, M., Ramirez, A., et al.: Loss functions and metrics in deep learning. a review. arXiv preprint arXiv:2307.02694 (2023)
Zhang, Y., Funkhouser, T.: Deep depth completion of a single RGB-D image. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 175–185 (2018)
Rho, K., Ha, J., Kim, Y.: GuideFormer: transformers for image guided depth completion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6250–6259 (2022)

Author information

Authors and Affiliations

College of Artificial Intelligence, Tianjin University of Science and Technology, Tianjin, 300457, China
Yarui Chen, Qixiu Wu, Lianlong Sun, Shiwei Wu, Mengchen Zhang, Bingqi Wang & Tingting Zhao

Corresponding author

Correspondence to Tingting Zhao.

Editor information

Editors and Affiliations

Eastern Institute of Technology, Ningbo, China
De-Shuang Huang
Tianjin University of Science and Technology, Tianjin, China
Chuanlei Zhang
Eastern Institute of Technology, Ningbo, China
Yijie Pan

About this paper

Cite this paper

Chen, Y. et al. (2024). LDCM-MVIT: A Lightweight Depth Completion Model Based on MViT. In: Huang, DS., Zhang, C., Pan, Y. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2024. Lecture Notes in Computer Science, vol 14876. Springer, Singapore. https://doi.org/10.1007/978-981-97-5666-7_41

DOI: https://doi.org/10.1007/978-981-97-5666-7_41
Published: 01 August 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-5665-0
Online ISBN: 978-981-97-5666-7
eBook Packages: Computer Science, Computer Science (R0)
