Computer Science > Computer Vision and Pattern Recognition

arXiv:2402.05589 (cs)

[Submitted on 8 Feb 2024 (v1), last revised 11 Feb 2024 (this version, v2)]

Title: RESMatch: Referring Expression Segmentation in a Semi-Supervised Manner

Authors: Ying Zang, Chenglong Fu, Runlong Cao, Didi Zhu, Min Zhang, Wenjun Hu, Lanyun Zhu, Tianrun Chen

Abstract: Referring expression segmentation (RES), a task that involves localizing specific instance-level objects based on free-form linguistic descriptions, has emerged as a crucial frontier in human-AI interaction. It demands an intricate understanding of both visual and textual contexts and often requires extensive training data. This paper introduces RESMatch, the first semi-supervised learning (SSL) approach for RES, aimed at reducing reliance on exhaustive data annotation. Extensive validation on multiple RES datasets demonstrates that RESMatch significantly outperforms baseline approaches, establishing a new state-of-the-art. Although existing SSL techniques are effective in image segmentation, we find that they fall short in RES. To address challenges that include comprehending free-form linguistic descriptions and the variability of object attributes, RESMatch introduces a trifecta of adaptations: revised strong perturbation, text augmentation, and adjustments for pseudo-label quality and strong-weak supervision. This pioneering work lays the groundwork for future research in semi-supervised learning for referring expression segmentation.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2402.05589 [cs.CV]
	(or arXiv:2402.05589v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2402.05589

Submission history

From: Tianrun Chen
[v1] Thu, 8 Feb 2024 11:40:50 UTC (32,510 KB)
[v2] Sun, 11 Feb 2024 10:27:04 UTC (32,506 KB)
