DOI: 10.1145/3591106.3592259
research-article

Symbol Location-Aware Network for Improving Handwritten Mathematical Expression Recognition

Published: 12 June 2023

Abstract

Most recent handwritten mathematical expression recognition (HMER) methods adopt the attention-based encoder-decoder framework, which generates LaTeX sequences from input images. However, the accuracy of the attention mechanism limits the performance of HMER models, and the lack of global context information during decoding is a further challenge. Some methods adopt symbol-level counting to localize symbols and improve model performance, but they do not work well. In this paper, we propose SLAN, short for Symbol Location-Aware Network, to solve the HMER problem. Specifically, we propose an advanced relation-level counting method to detect symbols in the image. We address the lack of global context with a new global context-aware decoder. To improve the accuracy of attention, we design a novel attention alignment loss function based on a dynamic programming algorithm, which can learn attention alignment directly without pixel-level labels. We conducted extensive experiments on the CROHME dataset to demonstrate the effectiveness of each part of SLAN and achieved state-of-the-art performance.
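
To make the idea of symbol-level counting concrete, the following is a minimal sketch of how per-symbol count targets could be derived from a tokenized LaTeX label. It illustrates the general notion of symbol-level counting mentioned in the abstract, not SLAN's relation-level counting, whose details the abstract does not give. The function name `symbol_counts`, the set `STRUCTURAL_TOKENS`, and the whitespace tokenization are assumptions made for illustration only.

```python
from collections import Counter

# Assumption: these structural LaTeX tokens have no visible glyph in the image,
# so they are excluded from the count targets. The exact set is hypothetical.
STRUCTURAL_TOKENS = {"{", "}", "^", "_"}


def symbol_counts(latex_tokens):
    """Count how often each visible symbol occurs in a tokenized LaTeX label."""
    return Counter(tok for tok in latex_tokens if tok not in STRUCTURAL_TOKENS)


# Example label, tokenized on whitespace (an assumption about the label format).
tokens = r"\frac { x ^ { 2 } + 1 } { x }".split()
counts = symbol_counts(tokens)

for symbol, n in sorted(counts.items()):
    print(symbol, n)
```

Count vectors of this kind need only the LaTeX label, not symbol positions, which is why counting can serve as a weak supervision signal alongside the decoder.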

Published In

ICMR '23: Proceedings of the 2023 ACM International Conference on Multimedia Retrieval
June 2023
694 pages
ISBN:9798400701788
DOI:10.1145/3591106
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. dynamic programming
  2. global context
  3. handwritten mathematical expression recognition
  4. symbol counting

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICMR '23

Acceptance Rates

Overall Acceptance Rate 254 of 830 submissions, 31%

Cited By

  • (2025) Can question-texts improve the recognition of handwritten mathematical expressions in respondents’ solutions? Knowledge-Based Systems 307 (112731). DOI: 10.1016/j.knosys.2024.112731. Online publication date: Jan-2025.
  • (2024) Character Relationship Refinement Network for Handwritten Mathematical Expression Recognition. 2024 International Joint Conference on Neural Networks (IJCNN), 1-8. DOI: 10.1109/IJCNN60899.2024.10651043. Online publication date: 30-Jun-2024.
  • (2024) Generating Handwritten Mathematical Expressions From Symbol Graphs: An End-to-End Pipeline. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 15675-15685. DOI: 10.1109/CVPR52733.2024.01484. Online publication date: 16-Jun-2024.
  • (2024) A tree-based model with branch parallel decoding for handwritten mathematical expression recognition. Pattern Recognition 149:C. DOI: 10.1016/j.patcog.2023.110220. Online publication date: 25-Jun-2024.
  • (2024) DGNet: A Handwritten Mathematical Formula Recognition Network Based on Deformable Convolution and Global Context Attention. Mobile Networks and Applications. DOI: 10.1007/s11036-024-02315-x. Online publication date: 10-May-2024.
  • (2023) Elevating Handwritten Mathematical Expression Recognition: Unveiling 2D Structural Insights Through Weak Supervision. 2023 2nd International Conference on Machine Learning, Cloud Computing and Intelligent Mining (MLCCIM), 352-357. DOI: 10.1109/MLCCIM60412.2023.00057. Online publication date: 25-Jul-2023.
