Article

Scene Text Recognition: An Overview

Authors:

Jun Tan

Pattern Recognition and Artificial Intelligence: Third International Conference, ICPRAI 2022, Paris, France, June 1–3, 2022, Proceedings, Part I

Pages 323–334

https://doi.org/10.1007/978-3-031-09037-0_27

Published: 01 June 2022

Abstract

Recent years have witnessed increasing interest, in both academia and industry, in recognizing text in natural scenes, owing to the rich semantic information that text carries. With the rapid development of deep learning, text recognition in natural scenes, also known as scene text recognition (STR), has made breakthrough progress. However, noise interference in natural scenes, such as extreme illumination and occlusion, together with other factors, still poses huge challenges. Recent research has shown promising results in terms of accuracy and efficiency. To present a complete picture of the field of STR, this paper tries to: 1) summarize the fundamental problems of STR and the progress of representative STR algorithms in recent years; 2) analyze and compare their advantages and disadvantages; 3) point out directions for future work to inspire further research.

Index Terms

Scene Text Recognition: An Overview
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
  2. Machine learning

Index terms have been assigned to the content through auto-classification.

Information

Published In

Pattern Recognition and Artificial Intelligence: Third International Conference, ICPRAI 2022, Paris, France, June 1–3, 2022, Proceedings, Part I

Jun 2022

718 pages

ISBN: 978-3-031-09036-3

DOI: 10.1007/978-3-031-09037-0

Editors:
Mounîm El Yacoubi, Télécom SudParis, Palaiseau, France
Eric Granger, École de Technologie Supérieure, Montreal, QC, Canada
Pong Chi Yuen, Hong Kong Baptist University, Kowloon, Hong Kong
Umapada Pal, Indian Statistical Institute, Kolkata, India
Nicole Vincent, Université Paris Cité, Paris, France

© Springer Nature Switzerland AG 2022.

Publisher

Springer-Verlag

Berlin, Heidelberg
