arXiv:2212.10562 (cs)

[Submitted on 20 Dec 2022 (v1), last revised 3 May 2023 (this version, v2)]

Title: Character-Aware Models Improve Visual Text Rendering

Authors: Rosanne Liu, Dan Garrette, Chitwan Saharia, William Chan, Adam Roberts, Sharan Narang, Irina Blok, RJ Mical, Mohammad Norouzi, Noah Constant

Abstract: Current image generation models struggle to reliably produce well-formed visual text. In this paper, we investigate a key contributing factor: popular text-to-image models lack character-level input features, making it much harder to predict a word's visual makeup as a series of glyphs. To quantify this effect, we conduct a series of experiments comparing character-aware vs. character-blind text encoders. In the text-only domain, we find that character-aware models provide large gains on a novel spelling task (WikiSpell). Applying our learnings to the visual domain, we train a suite of image generation models, and show that character-aware variants outperform their character-blind counterparts across a range of novel text rendering tasks (our DrawText benchmark). Our models set a much higher state-of-the-art on visual spelling, with 30+ point accuracy gains over competitors on rare words, despite training on far fewer examples.

Subjects:	Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2212.10562 [cs.CL]
	(or arXiv:2212.10562v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2212.10562

Submission history

From: Noah Constant
[v1] Tue, 20 Dec 2022 18:59:23 UTC (15,263 KB)
[v2] Wed, 3 May 2023 16:36:38 UTC (6,196 KB)
