Abstract
Visual Simultaneous Localization and Mapping (VSLAM) technology can provide reliable visual localization and mapping capabilities for critical tasks. Existing VSLAM systems can extract accurate feature points in static environments for matching and pose estimation, and then build an environmental map. In dynamic environments, however, the feature points extracted by the VSLAM system become unreliable as objects move, which not only leads to tracking failure but also seriously degrades the accuracy of the environmental map. To address these challenges, we propose a dynamic target-aware optical flow tracking method based on YOLOv8. First, we use YOLOv8 to identify moving targets in the environment and propose a method to eliminate dynamic points within the dynamic contour region. Second, we use an optical flow mask to identify dynamic feature points lying outside the object detection bounding boxes. Third, we eliminate all of the dynamic feature points identified by the two cues. Finally, we combine the geometric and semantic information of the static map points to construct a semantic map of the environment. Using Absolute Trajectory Error (ATE) and Relative Pose Error (RPE) as evaluation metrics, we compared our method with the original method on the TUM dataset. The accuracy of our method is significantly improved, by as much as 96.92% on the walking_xyz sequence. The experimental results show that the proposed method can significantly improve the overall performance of VSLAM systems in highly dynamic environments.
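To make the two-cue filtering pipeline described above concrete, the sketch below combines YOLOv8 detections with a Lucas-Kanade optical flow consistency check to discard dynamic feature points. It is a minimal illustration, not the authors' implementation: the ultralytics and OpenCV calls are real APIs, but the filter_dynamic_points helper, the weights file, the flow threshold, and the dynamic class list are all our assumptions, and the bounding-box test is a simplification of the paper's contour-region check.

```python
# Hypothetical sketch of dynamic feature-point filtering (not the paper's code).
# Assumptions: ultralytics YOLOv8 weights, OpenCV, a grayscale frame pair,
# and ORB-style cv2.KeyPoint inputs.
import cv2
import numpy as np
from ultralytics import YOLO

model = YOLO("yolov8n.pt")      # assumed pretrained weights; the paper's model may differ
DYNAMIC_CLASSES = {"person"}    # assumed set of classes treated as potentially moving
FLOW_THRESHOLD = 2.0            # assumed threshold (px) on deviation from median flow

def filter_dynamic_points(prev_gray, cur_gray, cur_bgr, keypoints):
    """Keep only the keypoints judged static by both the semantic and flow cues."""
    if not keypoints:
        return []
    pts = np.float32([kp.pt for kp in keypoints]).reshape(-1, 1, 2)

    # Semantic cue: bounding boxes of potentially dynamic objects from YOLOv8
    # (a simplification of the paper's contour-region elimination).
    boxes = []
    for result in model(cur_bgr, verbose=False):
        for box in result.boxes:
            if model.names[int(box.cls)] in DYNAMIC_CLASSES:
                boxes.append(box.xyxy[0].cpu().numpy())  # [x1, y1, x2, y2]

    # Geometric cue: pyramidal Lucas-Kanade flow from the previous frame.
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray, pts, None)
    flow = np.linalg.norm(nxt.reshape(-1, 2) - pts.reshape(-1, 2), axis=1)
    ok = status.ravel() == 1
    median_flow = np.median(flow[ok]) if ok.any() else 0.0

    static = []
    for kp, mag, valid in zip(keypoints, flow, ok):
        x, y = kp.pt
        in_box = any(x1 <= x <= x2 and y1 <= y <= y2 for x1, y1, x2, y2 in boxes)
        # A point is kept only if it lies outside all detected boxes and its
        # flow is consistent with the median flow induced by camera ego-motion.
        if valid and not in_box and abs(mag - median_flow) <= FLOW_THRESHOLD:
            static.append(kp)
    return static
```

The surviving static points would then feed the usual pose estimation and mapping stages, which is where the geometric and semantic information of the static map points is fused into the semantic map.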
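The abstract reports its gains in terms of ATE, which on the TUM RGB-D benchmark is the RMSE of translational residuals after rigidly aligning the estimated trajectory to ground truth (Horn/Umeyama alignment). The sketch below is a minimal version assuming the two trajectories are already associated by timestamp into N x 3 position arrays; the ate_rmse name is ours, not the benchmark's.

```python
# Minimal ATE (Absolute Trajectory Error) sketch in the spirit of the TUM RGB-D
# evaluation: align estimated positions to ground truth with the closed-form
# Umeyama/Kabsch method, then report the RMSE of the residuals.
import numpy as np

def ate_rmse(gt, est):
    """gt, est: (N, 3) arrays of time-associated positions. Returns ATE RMSE."""
    gt_c = gt - gt.mean(axis=0)            # center both trajectories
    est_c = est - est.mean(axis=0)
    # Optimal rotation from the SVD of the cross-covariance matrix.
    U, _, Vt = np.linalg.svd(est_c.T @ gt_c)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:               # fix a reflection, keep a proper rotation
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = gt.mean(axis=0) - R @ est.mean(axis=0)
    aligned = (R @ est.T).T + t
    residuals = np.linalg.norm(aligned - gt, axis=1)
    return np.sqrt(np.mean(residuals ** 2))
```

RPE is computed analogously over relative motions between frame pairs at a fixed time offset; the official TUM benchmark scripts (associate.py, evaluate_ate.py, evaluate_rpe.py) implement both metrics.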
Code and Data Availability
The code that supports the findings of this study is available from the corresponding author, Dr. Zhuhua Hu, upon reasonable request. The datasets used are publicly available at https://vision.in.tum.de/data/datasets/rgbd-dataset. The DFT-VSLAM demo video can be accessed at the following link: https://www.bilibili.com/video/BV16W421A7t4/?spm_id_from=333.999.0.0.
Acknowledgements
This research was supported by the National Natural Science Foundation of China (Grant no. 62161010), the Key Research and Development Project of Hainan Province (Grant no. ZDYF2022GXJS348 and Grant no. ZDYF2022SHFZ039), and the Hainan Province Natural Science Foundation (623RC446). The authors would like to thank the referees for their constructive suggestions.
Author information
Contributions
All authors contributed to the conception or design of the work, the analysis and interpretation of the data, and the draft of the manuscript.
Ethics declarations
Conflicts of Interest
No conflict of interest exists in the submission of this manuscript.
Ethics Approval
Not applicable (this article does not contain any studies with human participants or animals performed by any of the authors).
Consent to Participate
All authors participated in the conception and design of the work or in the analysis and interpretation of the data; in drafting the article or revising it critically for important intellectual content; and in approving the final version.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Cai, D., Li, S., Qi, W. et al. DFT-VSLAM: A Dynamic Optical Flow Tracking VSLAM Method. J Intell Robot Syst 110, 135 (2024). https://doi.org/10.1007/s10846-024-02171-7
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s10846-024-02171-7