Abstract
Manual dataset annotation involves a heavy workload, and uneven data quality and high expertise requirements have long been problems. Building on the idea of semi-automatic annotation, this article investigates interactive methods for obtaining accurate object annotations. We propose a human–machine interactive object annotation method based on one-click guidance: the annotator clicks a point close to the center of the object, and the prior information carried by this point guides the model. The advantages of our method are fourfold: (1) the simulated-click scheme is transferable and supports annotation across datasets; (2) clicks help eliminate irrelevant areas within the bounding box; (3) the operation is more convenient, requiring no manually drawn boxes, only the location information of a single click; (4) our method supports additional click annotations for further correction. To verify the effectiveness of the proposed method, we conducted extensive experiments on the KITTI and PASCAL VOC2012 datasets; the results show that our model improves average IoU by 18.1% and 14.6% over Anno-Mage and CVAT, respectively. Our method focuses on improving annotation accuracy and efficiency, and offers a new approach to semi-automatic annotation.
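The abstract describes guiding the model with the prior information of a single click near the object center. One common way to realize such guidance, shown in the minimal sketch below, is to encode the click as a Gaussian heatmap and feed it to the network as an extra input channel alongside the RGB image; the function names and the sigma value are illustrative assumptions, not the paper's exact implementation. A correction click could be handled the same way, by adding or updating a guidance channel.

```python
# Minimal sketch (not the authors' exact implementation): encode a single
# center click as a 2D Gaussian heatmap and concatenate it to the RGB image
# as a fourth input channel. `encode_click` and sigma=10.0 are assumptions.
import numpy as np

def encode_click(height, width, click_xy, sigma=10.0):
    """Return an (H, W) Gaussian heatmap centered on the clicked pixel."""
    cx, cy = click_xy
    ys, xs = np.mgrid[0:height, 0:width]
    dist_sq = (xs - cx) ** 2 + (ys - cy) ** 2
    return np.exp(-dist_sq / (2.0 * sigma ** 2)).astype(np.float32)

def add_click_channel(image, click_xy, sigma=10.0):
    """Stack the click heatmap onto an (H, W, 3) image, giving (H, W, 4)."""
    h, w = image.shape[:2]
    guide = encode_click(h, w, click_xy, sigma)
    return np.concatenate([image.astype(np.float32), guide[..., None]], axis=-1)

# Example: a click near the center of a toy 256x256 image.
image = np.zeros((256, 256, 3), dtype=np.uint8)
guided_input = add_click_channel(image, click_xy=(130, 120))
print(guided_input.shape)  # (256, 256, 4)
```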
References
Real E, Shlens J, Mazzocchi S, Pan X, Vanhoucke V (2017) YouTube-BoundingBoxes: a large high-precision human-annotated data set for object detection in video. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 5296–5305
Nandhini P, Kuppuswami S, Malliga S, DeviPriya R (2022) Enhanced rank attack detection algorithm (E-RAD) for securing RPL-based IoT networks by early detection and isolation of rank attackers. J Supercomput 1–24
Suseendran G, Akila D, Vijaykumar H, Jabeen TN, Nirmala R, Nayyar A (2022) Multi-sensor information fusion for efficient smart transport vehicle tracking and positioning based on deep learning technique. J Supercomput 1–26
Varga V, Lőrincz A (2020) Reducing human efforts in video segmentation annotation with reinforcement learning. Neurocomputing 405:247–258
Kishorekumar R, Deepa P (2020) A framework for semantic image annotation using LEGION algorithm. J Supercomput 76(6):4169–4183
Pham T-N, Nguyen V-H, Huh J-H (2023) Integration of improved YOLOv5 for face mask detector and auto-labeling to generate dataset for fighting against COVID-19. J Supercomput 1–27
Boukthir K, Qahtani AM, Almutiry O, Dhahri H, Alimi AM (2022) Reduced annotation based on deep active learning for Arabic text detection in natural scene images. Pattern Recogn Lett 157:42–48
Russell BC, Torralba A, Murphy KP, Freeman WT (2008) LabelMe: a database and web-based tool for image annotation. Int J Comput Vis 77(1–3):157–173
Su H, Deng J, Fei-Fei L (2012) Crowdsourcing annotations for visual object detection. In: Workshops at the Twenty-Sixth AAAI Conference on Artificial Intelligence
Acuna D, Ling H, Kar A, Fidler S (2018) Efficient interactive annotation of segmentation datasets with Polygon-RNN++. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 859–868
Vondrick C, Patterson D, Ramanan D (2013) Efficiently scaling up crowdsourced video annotation. Int J Comput Vis 101(1):184–204
Mottaghi R, Chen X, Liu X, Cho N-G, Lee S-W, Fidler S, Urtasun R, Yuille A (2014) The role of context for object detection and semantic segmentation in the wild. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 891–898
Zhang S, Liew JH, Wei Y, Wei S, Zhao Y (2020) Interactive object segmentation with inside–outside guidance. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 12234–12244
Pacha S, Murugan SR, Sethukarasi R (2020) Semantic annotation of summarized sensor data stream for effective query processing. J Supercomput 76(6):4017–4039
Schembera B (2021) Like a rainbow in the dark: metadata annotation for hpc applications in the age of dark data. J Supercomput 77(8):8946–8966
Ling H, Gao J, Kar A, Chen W, Fidler S (2019) Fast interactive object annotation with Curve-GCN. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 5257–5266
Gao X, Zhang G, Xiong Y (2022) Multi-scale multi-modal fusion for object detection in autonomous driving based on selective kernel. Measurement 194:111001
Everingham M, Van Gool L, Williams CK, Winn J, Zisserman A (2010) The Pascal visual object classes (VOC) challenge. Int J Comput Vis 88(2):303–338
Geiger A, Lenz P, Urtasun R (2012) Are we ready for autonomous driving? The KITTI vision benchmark suite. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, pp 3354–3361
Geiger A, Lenz P, Stiller C, Urtasun R (2013) Vision meets robotics: the KITTI dataset. Int J Robot Res 32(11):1231–1237
Tzutalin: LabelImg. https://github.com/tzutalin/labelImg (2015)
Dutta A, Zisserman A (2019) The VIA annotation software for images, audio and video. In: Proceedings of the 27th ACM International Conference on Multimedia, pp 2276–2279
Yu F, Xian W, Chen Y, Liu F, Liao M, Madhavan V, Darrell T (2018) BDD100K: a diverse driving video database with scalable annotation tooling. arXiv preprint arXiv:1805.04687
christopher5106: FastAnnotationTool. https://github.com/christopher5106/FastAnnotationTool (2016)
virajmavani: Anno-Mage. https://github.com/virajmavani/semi-auto-image-annotation-tool (2018)
OpenVINO: CVAT. https://github.com/openvinotoolkit/cvat (2020)
Wang B, Wu V, Wu B, Keutzer K (2019) LATTE: accelerating LiDAR point cloud annotation via sensor fusion, one-click annotation, and tracking. In: 2019 IEEE Intelligent Transportation Systems Conference (ITSC). IEEE, pp 265–272
Piewak F, Pinggera P, Schafer M, Peter D, Schwarz B, Schneider N, Enzweiler M, Pfeiffer D, Zollner M (2018) Boosting LiDAR-based semantic labeling by cross-modal training data generation. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshops
Yue X, Wu B, Seshia SA, Keutzer K, Sangiovanni-Vincentelli AL (2018) A LiDAR point cloud generator: from a virtual world to autonomous driving. In: Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, pp 458–464
Dosovitskiy A, Ros G, Codevilla F, Lopez A, Koltun V (2017) CARLA: an open urban driving simulator. In: Conference on Robot Learning. PMLR, pp 1–16
Maninis K-K, Caelles S, Pont-Tuset J, Van Gool L (2018) Deep extreme cut: from extreme points to object segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 616–625
Papadopoulos DP, Uijlings JR, Keller F, Ferrari V (2017) Extreme clicking for efficient object annotation. In: Proceedings of the IEEE International Conference on Computer Vision, pp 4930–4939
Fails JA, Olsen Jr DR (2003) Interactive machine learning. In: Proceedings of the 8th International Conference on Intelligent User Interfaces, pp 39–45
Ren S, He K, Girshick R, Sun J (2015) Faster R-CNN: towards real-time object detection with region proposal networks. Adv Neural Inf Process Syst 28:91–99
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 770–778
Lin T-Y, Dollár P, Girshick R, He K, Hariharan B, Belongie S (2017) Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 2117–2125
Li X, Wang W, Hu X, Yang J (2019) Selective kernel networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 510–519
Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 7132–7141
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Xiong, Y., Gao, X. & Zhang, G. Interactive object annotation based on one-click guidance. J Supercomput 79, 16098–16117 (2023). https://doi.org/10.1007/s11227-023-05279-z