Computer Science > Computer Vision and Pattern Recognition

arXiv:1804.08588 (cs)

[Submitted on 23 Apr 2018 (v1), last revised 19 Nov 2018 (this version, v2)]

Title:Large Scale Scene Text Verification with Guided Attention

Authors:Dafang He, Yeqing Li, Alexander Gorban, Derrall Heath, Julian Ibarz, Qian Yu, Daniel Kifer, C. Lee Giles


Abstract:Many tasks require determining whether a particular text string exists in an image. In this work, we propose a new framework that learns this task end to end. The framework takes an image and a text string as input and outputs the probability that the text string is present in the image. This is the first end-to-end framework that learns such relationships between text and images in the scene text area. The framework does not require explicit scene text detection or recognition, so it needs no bounding box annotations. It is also the first work in the scene text area that tackles such a weakly labeled problem. Based on this framework, we developed a model called Guided Attention. Our model achieves much better results than several state-of-the-art solutions based on scene text reading on a challenging Street View Business Matching task. The task is to find the correct business names for storefront images, and the dataset we collected for it is substantially larger and more challenging than existing scene text datasets. This new real-world task provides a new perspective for studying scene text related problems. We also demonstrate the uniqueness of our task via a comparison between our problem and a typical Visual Question Answering problem.
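
The input/output contract described in the abstract (an image and a text string in, a presence probability out) can be illustrated with a small sketch. This is not the authors' Guided Attention model: the text encoder `embed_text`, the scoring function `verify`, and the untrained `weight` and `bias` parameters are all hypothetical, and the image is assumed to be already reduced to a list of region feature vectors. It only shows how a text query might weight image regions before a single probability is produced, with no bounding boxes involved.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def embed_text(text, dim):
    """Hypothetical text encoder: mean of fixed pseudo-random character vectors."""
    vectors = []
    for ch in text.lower():
        rng = random.Random(ord(ch))  # same character always maps to the same vector
        vectors.append([rng.uniform(-1.0, 1.0) for _ in range(dim)])
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def verify(image_features, text, weight=1.0, bias=0.0):
    """Return a value in (0, 1) standing in for P(text is present in image).

    image_features: list of region feature vectors (assumed precomputed).
    weight, bias: placeholders for parameters that would be learned from
    image-level labels only.
    """
    dim = len(image_features[0])
    query = embed_text(text, dim)
    # Text-guided attention: regions that align with the query get more weight.
    attention = softmax([dot(query, region) for region in image_features])
    context = [
        sum(a * region[i] for a, region in zip(attention, image_features))
        for i in range(dim)
    ]
    logit = weight * dot(query, context) + bias
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    regions = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]  # toy features, not real data
    print(verify(regions, "cafe"))
```

Because the parameters here are untrained, the returned number carries no meaning; in a trained system of this shape, supervision would come only from whether the string matches the image, which is the weak labeling the abstract refers to.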

Comments:	18 pages, ACCV 2019
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1804.08588 [cs.CV]
	(or arXiv:1804.08588v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1804.08588

Submission history

From: Dafang He
[v1] Mon, 23 Apr 2018 17:30:49 UTC (4,993 KB)
[v2] Mon, 19 Nov 2018 01:01:52 UTC (8,909 KB)
