Abstract
We present a strategy, called Seq2Seg, that achieves precise and accurate recognition and segmentation of children's handwritten words. High performance on both tasks is necessary to give personalized feedback to children who are learning to write. The first contribution is to combine the predictions of an accurate Seq2Seq model with those of an R-CNN object detector. The second is to refine the bounding boxes predicted by the detector using a segmentation lattice computed from the online signal. An ablation study shows that both contributions are relevant, and that their combination is efficient enough for immediate feedback and achieves state-of-the-art results, even compared to more informed systems.
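To make the second contribution more concrete, the following minimal sketch illustrates one way a detector's bounding box could be refined against a segmentation lattice. It is an illustration under strong assumptions, not the method of the paper: the lattice is assumed to be a sorted list of candidate cut positions along the x-axis, refinement is assumed to mean snapping the left and right box edges to the nearest cut, and the names `snap` and `refine_box` are hypothetical.

```python
from bisect import bisect_left


def snap(value, cuts):
    """Return the element of the sorted list `cuts` closest to `value`."""
    i = bisect_left(cuts, value)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = cuts[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda c: abs(c - value))


def refine_box(box, cuts):
    """Snap the left/right edges of (x_min, y_min, x_max, y_max) to lattice cuts.

    ASSUMPTION: `cuts` is a sorted list of x positions standing in for the
    segmentation lattice; the actual lattice in the paper may be richer.
    """
    if not cuts:
        return box  # no lattice information: keep the detector's box
    x_min, y_min, x_max, y_max = box
    new_min, new_max = snap(x_min, cuts), snap(x_max, cuts)
    if new_min >= new_max:
        return box  # snapping would collapse the box: keep the original
    return (new_min, y_min, new_max, y_max)


# Hypothetical values: a detector box and candidate cut positions.
cuts = [0, 12, 25, 40]
refined = refine_box((10, 0, 27, 30), cuts)
print(refined)
```

In this sketch, the vertical extent of the box is left untouched and the detector's box is kept whenever snapping would produce an empty region; both are design choices made for the illustration only.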
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Corbillé, S., Anquetil, É., Fromont, É. (2023). Precise Segmentation for Children Handwriting Analysis by Combining Multiple Deep Models with Online Knowledge. In: Fink, G.A., Jain, R., Kise, K., Zanibbi, R. (eds) Document Analysis and Recognition - ICDAR 2023. ICDAR 2023. Lecture Notes in Computer Science, vol 14190. Springer, Cham. https://doi.org/10.1007/978-3-031-41685-9_15
DOI: https://doi.org/10.1007/978-3-031-41685-9_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-41684-2
Online ISBN: 978-3-031-41685-9
eBook Packages: Computer Science, Computer Science (R0)