DOI: 10.1145/3639856.3639895
short-paper

Towards reducing hallucination in extracting information from financial reports using Large Language Models

Published: 17 May 2024

Abstract

For a financial analyst, the question and answer (Q&A) segment of a company's financial report is a crucial input to analysis and investment decisions. Extracting insights from the Q&A section is difficult, however. Conventional methods such as detailed reading and note-taking do not scale and are susceptible to human error, while Optical Character Recognition (OCR) and similar techniques struggle to process unstructured transcript text accurately and often miss the subtle linguistic nuances that drive investor decisions. Here, we demonstrate the use of Large Language Models (LLMs) to extract information from earnings report transcripts quickly and with high accuracy. Our approach transforms the extraction process and reduces hallucination by combining retrieval-augmented generation (RAG) with metadata. We evaluate the outputs of several LLMs with and without the proposed approach, using objective metrics for evaluating Q&A systems, and empirically demonstrate the superiority of our method.
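The abstract does not spell out how retrieval and metadata are combined, so the following is only a minimal sketch of one common pattern, not the authors' implementation: transcript chunks are first filtered by metadata and then ranked by similarity to the question before a grounded prompt is built. The `Chunk` class, the `company` and `quarter` metadata keys, the sample sentences, and the bag-of-words similarity (a stand-in for a real embedding model) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    metadata: dict  # hypothetical keys, e.g. {"company": "ACME", "quarter": "Q2-2023"}


def tokenize(text):
    # Lowercase word tokens; a stand-in for a real embedding model.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(chunks, question, filters, k=2):
    # Step 1: keep only chunks whose metadata matches every filter.
    candidates = [
        c for c in chunks
        if all(c.metadata.get(key) == val for key, val in filters.items())
    ]
    # Step 2: rank the surviving chunks by similarity to the question.
    q = Counter(tokenize(question))
    ranked = sorted(candidates, key=lambda c: cosine(q, Counter(tokenize(c.text))), reverse=True)
    return ranked[:k]


def build_prompt(question, retrieved):
    # Tag each passage with its metadata and instruct the model to stay within the context.
    context = "\n".join(
        f"[{c.metadata['company']} {c.metadata['quarter']}] {c.text}" for c in retrieved
    )
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Invented sample data, for illustration only.
chunks = [
    Chunk("Revenue grew 12 percent driven by cloud subscriptions.", {"company": "ACME", "quarter": "Q2-2023"}),
    Chunk("We expect margin pressure from higher input costs.", {"company": "ACME", "quarter": "Q2-2023"}),
    Chunk("Revenue grew 30 percent on strong hardware sales.", {"company": "GLOBEX", "quarter": "Q2-2023"}),
]

question = "How much did revenue grow?"
top = retrieve(chunks, question, {"company": "ACME", "quarter": "Q2-2023"}, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In this sketch the metadata filter keeps a similar-sounding passage from another company out of the prompt, which is one plausible way metadata could limit hallucinated or misattributed answers. The resulting prompt would then be passed to whichever LLM is under evaluation.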


Cited By

  • (2024) Evaluating Retrieval-Augmented Generation Models for Financial Report Question and Answering. Applied Sciences 14:20 (9318). DOI: 10.3390/app14209318. Online publication date: 12-Oct-2024
  • (2024) HybridRAG: Integrating Knowledge Graphs and Vector Retrieval Augmented Generation for Efficient Information Extraction. Proceedings of the 5th ACM International Conference on AI in Finance (608-616). DOI: 10.1145/3677052.3698671. Online publication date: 14-Nov-2024

Published In

AIMLSystems '23: Proceedings of the Third International Conference on AI-ML Systems
October 2023
381 pages
ISBN:9798400716492
DOI:10.1145/3639856

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Earning Call Transcripts
  2. Financial Markets
  3. Large Language Models
  4. Natural Language Processing

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Conference

AIMLSystems 2023

Article Metrics

  • Downloads (Last 12 months): 133
  • Downloads (Last 6 weeks): 17
Reflects downloads up to 04 Jan 2025
