
Using Large Language Models to Automate Annotation and Part-of-Math Tagging of Math Equations

  • Conference paper
  • First Online:
Intelligent Computer Mathematics (CICM 2024)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14960)


Abstract

This paper explores the potential of Large Language Models (LLMs) for automated annotation and Part-of-Math (POM) tagging of equations. Traditional methods for math-term annotation and POM tagging rely heavily on manually crafted rules and limited datasets, which leads to scalability problems and poor adaptability to new domains. LLMs, with their vast knowledge and advanced natural-language understanding, offer a promising alternative. Our methodology crafts prompts that elicit LLM answers readable as key-value pairs, where the keys are math terms and the values are the corresponding annotations. We also investigate how LLM performance changes when the prompt supplies different levels of context, such as the sentence or paragraph containing the input equation. Performance is evaluated by the consistency between the ground truth and the LLM output; consistency itself is assessed in a separate LLM session with a different prompt. Our results show that adding context raises the consistency rate of binary classification from 14.8% to 24.5% and the rate of favorable outcomes in multi-class classification from 47.1% to 77.5%. We conclude by discussing the implications of these findings for mathematical knowledge management: LLMs could play a key role in automating the annotation and tagging of mathematical content, enhancing the accessibility and utility of mathematical knowledge in digital libraries and beyond.
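To make the approach concrete, the following is a minimal sketch, in Python, of the two-stage setup the abstract outlines: one LLM call annotates the terms of an equation as key-value pairs, optionally with sentence- or paragraph-level context in the prompt, and a separate session with a different prompt judges whether a candidate annotation is consistent with the ground truth. This is an illustration under stated assumptions, not the authors' implementation: the prompt wording, the gpt-4 model name, and the JSON output convention are placeholders of ours, and the sketch assumes the OpenAI Python SDK (openai >= 1.0) with an API key in the environment.

import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

ANNOTATE_PROMPT = (
    "Annotate each mathematical term in the equation below. Return only a "
    "JSON object whose keys are the math terms and whose values are short "
    "natural-language annotations.\n"
    "Context: {context}\n"
    "Equation: {equation}"
)

JUDGE_PROMPT = (
    "You are given a ground-truth annotation and a candidate annotation for "
    "the same math term. Answer with exactly one word: 'consistent' or "
    "'inconsistent'.\n"
    "Ground truth: {truth}\n"
    "Candidate: {candidate}"
)

def annotate(equation: str, context: str = "") -> dict:
    """Ask the model for term -> annotation pairs. `context` may be empty,
    the containing sentence, or the containing paragraph -- the levels of
    context whose effect the paper measures."""
    resp = client.chat.completions.create(
        model="gpt-4",  # placeholder; the paper's exact models may differ
        messages=[{"role": "user",
                   "content": ANNOTATE_PROMPT.format(context=context,
                                                     equation=equation)}],
    )
    # Models do not always emit well-formed JSON; a production version
    # would validate the output and retry on parse failure.
    return json.loads(resp.choices[0].message.content)

def is_consistent(truth: str, candidate: str) -> bool:
    """Judge consistency in a separate session (no shared message history)
    and with a different prompt, as in the abstract's evaluation setup."""
    resp = client.chat.completions.create(
        model="gpt-4",  # placeholder model name
        messages=[{"role": "user",
                   "content": JUDGE_PROMPT.format(truth=truth,
                                                  candidate=candidate)}],
    )
    return resp.choices[0].message.content.strip().lower().startswith("consistent")

# Example: annotate with sentence-level context, then check one term.
annotations = annotate(
    equation=r"\Gamma(z) = \int_0^\infty t^{z-1} e^{-t} \, dt",
    context="The gamma function extends the factorial to complex arguments.",
)
print(is_consistent("the gamma function", annotations.get(r"\Gamma(z)", "")))

Because the consistency judge runs in a fresh session with no shared message history, its verdict is not influenced by the annotation conversation, mirroring the evaluation protocol the abstract describes.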



Notes

  1. https://github.com/RuochengShan/math_annotation_dataset_for_llm.


Author information


Corresponding author

Correspondence to Ruocheng Shan.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Shan, R., Youssef, A. (2024). Using Large Language Models to Automate Annotation and Part-of-Math Tagging of Math Equations. In: Kohlhase, A., Kovács, L. (eds) Intelligent Computer Mathematics. CICM 2024. Lecture Notes in Computer Science, vol 14960. Springer, Cham. https://doi.org/10.1007/978-3-031-66997-2_1


  • DOI: https://doi.org/10.1007/978-3-031-66997-2_1

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-66996-5

  • Online ISBN: 978-3-031-66997-2

  • eBook Packages: Computer Science; Computer Science (R0)
