Computer Science > Computer Vision and Pattern Recognition

arXiv:2207.06604 (cs)

[Submitted on 14 Jul 2022]

Title: Rethinking Super-Resolution as Text-Guided Details Generation

Authors: Chenxi Ma, Bo Yan, Qing Lin, Weimin Tan, Siming Chen

Abstract: Deep neural networks have greatly improved the performance of single image super-resolution (SISR). Conventional methods still restore a single high-resolution (HR) solution based only on the image modality. However, image-level information alone is insufficient to predict adequate details and photo-realistic visual quality at large upscaling factors (x8, x16). In this paper, we propose a new perspective that regards SISR as a semantic image detail enhancement problem, with the goal of generating semantically reasonable HR images that are faithful to the ground truth. To enhance the semantic accuracy and the visual quality of the reconstructed image, we explore multi-modal fusion learning in SISR by proposing a Text-Guided Super-Resolution (TGSR) framework, which can effectively utilize the information from the text and image modalities. Different from existing methods, the proposed TGSR can generate HR image details that match the text descriptions through a coarse-to-fine process. Extensive experiments and ablation studies demonstrate the effectiveness of TGSR, which exploits the text reference to recover realistic images.

Comments:	10 pages, 11 figures, ACM MM 2022
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2207.06604 [cs.CV]
	(or arXiv:2207.06604v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2207.06604

Submission history

From: Cx Ma
[v1] Thu, 14 Jul 2022 01:46:38 UTC (9,406 KB)
