Abstract
Scene text in indoor environments usually preserves and communicates important contextual information which can significantly enhance the independent travel of blind and visually impaired people. In this paper, we present an assistive text spotting navigation system based on an RGB-D mobile device for blind or severely visually impaired people. Specifically, we propose a novel spatial-temporal text localization algorithm that localizes and prunes text regions by integrating stroke-specific features with a subsequent text tracking process. The density of extracted text-specific feature points serves as an efficient text indicator to guide the user closer to text-likely regions for better recognition performance. Next, detected text regions are binarized and recognized by off-the-shelf optical character recognition methods. Significant non-text indicator signage can also be matched to provide additional environment information. Both kinds of recognized results are then converted to speech feedback for user interaction. Our proposed video text localization approach is quantitatively evaluated on the ICDAR 2013 dataset, and the experimental results demonstrate the effectiveness of our proposed method.
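To make the idea of a density-based text indicator concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes that text-specific feature points are already available as (x, y) pixel coordinates, bins them into a regular grid, and turns the densest cell into a coarse direction hint. The function names `densest_cell` and `guidance_hint`, the grid cell size, and the three-way left/ahead/right split are all hypothetical choices made for illustration.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

Point = Tuple[float, float]
Cell = Tuple[int, int]


def densest_cell(points: Iterable[Point], cell_size: float,
                 min_count: int = 1) -> Optional[Tuple[Cell, int]]:
    """Bin points into square grid cells and return the fullest cell.

    Returns ((col, row), count), or None when there are no points or the
    fullest cell holds fewer than min_count points.
    """
    counts = Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in points
    )
    if not counts:
        return None
    cell, count = counts.most_common(1)[0]
    return (cell, count) if count >= min_count else None


def guidance_hint(cell: Cell, cell_size: float, frame_width: float) -> str:
    """Map a grid cell to a coarse direction (hypothetical three-way split)."""
    center_x = (cell[0] + 0.5) * cell_size
    if center_x < frame_width / 3:
        return "left"
    if center_x > 2 * frame_width / 3:
        return "right"
    return "ahead"


if __name__ == "__main__":
    # Assumed feature points in a 320-pixel-wide frame (illustrative values).
    pts = [(10, 10), (12, 14), (15, 11), (200, 50)]
    best = densest_cell(pts, cell_size=32)
    if best is not None:
        cell, count = best
        print(cell, count, guidance_hint(cell, 32, 320))
```

In a real system the hint would feed the speech feedback stage, and the `min_count` threshold would need tuning so that sparse, non-text feature points do not trigger guidance.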
Acknowledgements
This work was supported in part by U.S. Federal Highway Administration (FHWA) grant DTFH 61-12-H-00002; National Science Foundation (NSF) grants CBET-1160046, EFRI-1137172, and IIP-1343402; and National Institutes of Health (NIH) grant EY023483.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Rong, X., Li, B., Muñoz, J.P., Xiao, J., Arditi, A., Tian, Y. (2016). Guided Text Spotting for Assistive Blind Navigation in Unfamiliar Indoor Environments. In: Bebis, G., et al. (eds.) Advances in Visual Computing. ISVC 2016. Lecture Notes in Computer Science, vol. 10073. Springer, Cham. https://doi.org/10.1007/978-3-319-50832-0_2
DOI: https://doi.org/10.1007/978-3-319-50832-0_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50831-3
Online ISBN: 978-3-319-50832-0
eBook Packages: Computer Science, Computer Science (R0)