Abstract
Text detection in natural and video scene images remains challenging because of the unpredictable nature of scene text. This paper presents a new method based on Cloud of Line Distribution (COLD) and a Random Forest classifier for text detection in both natural and video images. The proposed method captures the distinctive shapes of text components by studying the relationship between dominant points (for example, on straight or cursive segments) along the contours of text components; this relationship, represented in the polar domain, is called COLD. Edge components are treated as text candidates if the corresponding components in the Canny and Sobel edge images of the input share the COLD property. For each text candidate, we further study its COLD distribution at the component level to extract statistical and angle-oriented features. These features are then fed to a random forest classifier to eliminate false text candidates, which yields representatives. We then group the representatives into text lines based on the distances between edge components in the edge image. Finally, the statistical and angle-oriented features are extracted at the word level to eliminate false positives, which gives the final text detection result. The proposed method is tested on standard databases, namely SVT, ICDAR 2015 scene, and ICDAR 2013 scene and video, to show its effectiveness and usefulness compared with existing methods.
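As a rough illustration of the COLD idea described above, the following Python sketch represents each line joining consecutive dominant points of a contour by its angle and length in polar coordinates, then bins those pairs into a normalized histogram. It is a minimal sketch under stated assumptions, not the authors' implementation: the dominant points are assumed to be already detected, and the function names (`cold_pairs`, `cold_histogram`), the bin counts, and the log scaling of the radius are hypothetical choices made only for this example.

```python
import math


def cold_pairs(dominant_points, k=1):
    """Return (theta, rho) for the line joining each dominant point to the k-th next one.

    dominant_points: list of (x, y) tuples ordered along a contour (assumed given).
    """
    pairs = []
    for i in range(len(dominant_points) - k):
        x1, y1 = dominant_points[i]
        x2, y2 = dominant_points[i + k]
        dx, dy = x2 - x1, y2 - y1
        rho = math.hypot(dx, dy)      # line length
        theta = math.atan2(dy, dx)    # line orientation in [-pi, pi]
        pairs.append((theta, rho))
    return pairs


def cold_histogram(pairs, n_angle_bins=8, n_radius_bins=3):
    """Bin (theta, rho) pairs into a normalized angle-by-log-radius histogram.

    The binning scheme is an assumption made for illustration.
    """
    hist = [[0.0] * n_radius_bins for _ in range(n_angle_bins)]
    if not pairs:
        return hist
    max_log = math.log1p(max(rho for _, rho in pairs))
    for theta, rho in pairs:
        a = int((theta + math.pi) / (2 * math.pi) * n_angle_bins) % n_angle_bins
        if max_log == 0:
            r = 0
        else:
            r = min(int(math.log1p(rho) / max_log * n_radius_bins), n_radius_bins - 1)
        hist[a][r] += 1.0
    total = len(pairs)
    return [[count / total for count in row] for row in hist]


# Toy contour: the four corners of a square, closed back to the start.
square = [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)]
pairs = cold_pairs(square)
hist = cold_histogram(pairs)
```

In a pipeline like the one outlined in the abstract, a flattened histogram of this kind could serve as one of the per-component feature vectors passed to a classifier; how the paper actually derives its statistical and angle-oriented features is not specified here.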
Acknowledgements
This work was supported by the Natural Science Foundation of China under Grant 61672273, Grant 61272218, and Grant 61321491, the Science Foundation for Distinguished Young Scholars of Jiangsu under Grant BK20160021, the Science Foundation of Jiangsu under Grant BK20170892, the Fundamental Research Funds for the Central Universities under Grant 2013/B16020141, and the Open Project of the National Key Lab for Novel Software Technology in NJU under Grant KFKT2017B05.
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Wang, W., Wu, Y., Shivakumara, P., Lu, T. (2018). Cloud of Line Distribution and Random Forest Based Text Detection from Natural/Video Scene Images. In: Schoeffmann, K., et al. MultiMedia Modeling. MMM 2018. Lecture Notes in Computer Science, vol 10705. Springer, Cham. https://doi.org/10.1007/978-3-319-73600-6_5
DOI: https://doi.org/10.1007/978-3-319-73600-6_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73599-3
Online ISBN: 978-3-319-73600-6
eBook Packages: Computer Science, Computer Science (R0)