research-article

Coarse-to-Fine Annotation Enrichment for Semantic Segmentation Learning

Authors:

Cong Zhao

CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management

Pages 237 - 246

https://doi.org/10.1145/3269206.3271672

Published: 17 October 2018

Abstract

High-quality annotated data is critical for learning semantic segmentation, yet acquiring dense, pixel-wise ground truth is both labor-intensive and time-consuming. Coarse annotations (e.g., scribbles, coarse polygons) offer an economical alternative, but networks trained on them alone rarely reach satisfactory performance. To generate high-quality annotated data at a low time cost, we propose a novel annotation enrichment strategy that expands the existing coarse annotations of the training data to a finer scale. Extensive experiments on the Cityscapes and PASCAL VOC 2012 benchmarks show that neural networks trained with the enriched annotations from our framework yield a significant improvement over those trained with the original coarse labels, and are highly competitive with networks trained on dense human annotations. The proposed method also outperforms other state-of-the-art weakly-supervised segmentation methods.

Index Terms

Coarse-to-Fine Annotation Enrichment for Semantic Segmentation Learning
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Image segmentation
      2. Computer vision representations
        Appearance and texture representations

Information & Contributors

Information

Published In

CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management

October 2018

2362 pages

ISBN:9781450360142

DOI:10.1145/3269206

General Chair:
Alfredo Cuzzocrea (University of Trieste, Italy)

Program Chairs:
James Allan (University of Massachusetts, USA)
Norman Paton (University of Manchester, United Kingdom)
Divesh Srivastava (AT&T Labs Research, USA)
Rakesh Agrawal (Data Insights Lab, USA)
Andrei Broder (Google Research, USA)
Mohammed Zaki (Rensselaer Polytechnic Institute, USA)
Selcuk Candan (Arizona State University, USA)
Alexandros Labrinidis (University of Pittsburgh, USA)
Assaf Schuster (Technion, Israel)
Haixun Wang (Google Research, USA)

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

CIKM '18

CIKM '18: The 27th ACM International Conference on Information and Knowledge Management

October 22 - 26, 2018

Torino, Italy

Acceptance Rates

CIKM '18 Paper Acceptance Rate 147 of 826 submissions, 18%;

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Article Metrics

Total Citations: 12
Total Downloads: 438
Downloads (last 12 months): 13
Downloads (last 6 weeks): 0

Reflects downloads up to 05 Mar 2025.

Cited By

Zhang Y, Tian X (2025). Consistent prompt learning for vision-language models. Knowledge-Based Systems 310 (112974). Online publication date: Feb-2025.
https://doi.org/10.1016/j.knosys.2025.112974

Liu X, He Y, Li J, Yan R, Li X, Huang H (2024). A Comparative Review on Enhancing Visual Simultaneous Localization and Mapping with Deep Semantic Segmentation. Sensors 24:11 (3388). Online publication date: 24-May-2024.
https://doi.org/10.3390/s24113388

Cui F, Yang X, Wu C, Xiao L, Tian X, Cai J, Kankanhalli M, Prabhakaran B, Boll S, Subramanian R, Zheng L, Singh V, Cesar P, Xie L, Xu D (2024). Advancing Prompt Learning through an External Layer. Proceedings of the 32nd ACM International Conference on Multimedia (807-816). Online publication date: 28-Oct-2024.
https://dl.acm.org/doi/10.1145/3664647.3680953

Prunella M, Scardigno R, Buongiorno D, Brunetti A, Longo N, Carli R, Dotoli M, Bevilacqua V (2023). Deep Learning for Automatic Vision-Based Recognition of Industrial Surface Defects: A Survey. IEEE Access 11 (43370-43423). Online publication date: 2023.
https://doi.org/10.1109/ACCESS.2023.3271748

Sayez N, Vleeschouwer C (2022). Accelerating the creation of instance segmentation training sets through bounding box annotation. 2022 26th International Conference on Pattern Recognition (ICPR) (252-258). Online publication date: 21-Aug-2022.
https://doi.org/10.1109/ICPR56361.2022.9956321

Ravi A, Repakula S, Dutta U, Parmar M (2021). Buy Me That Look: An Approach for Recommending Similar Fashion Products. 2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR) (97-103). Online publication date: Sep-2021.
https://doi.org/10.1109/MIPR51284.2021.00022

Fang Y, Zhu D, Zhou N, Liu L, Yao J (2021). PiPo-Net: A Semi-automatic and Polygon-based Annotation Method for Pathological Images. 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (2978-2984). Online publication date: 27-Sep-2021.
https://doi.org/10.1109/IROS51168.2021.9636146

Huang Y, Shen Q, Fu Y, You S (2021). Weakly-supervised Semantic Segmentation in Cityscape via Hyperspectral Image. 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) (1117-1126). Online publication date: Oct-2021.
https://doi.org/10.1109/ICCVW54120.2021.00131

Li Y, Luo Y, Huang Z (2020). Fashion Recommendation with Multi-relational Representation Learning. Advances in Knowledge Discovery and Data Mining (3-15). Online publication date: 6-May-2020.
https://doi.org/10.1007/978-3-030-47426-3_1

Wang Z, Huang Z, Luo Y (2020). PAIC: Parallelised Attentive Image Captioning. Databases Theory and Applications (16-28). Online publication date: 21-Jan-2020.
https://doi.org/10.1007/978-3-030-39469-1_2
