DOI: 10.1145/3604237.3626869
research-article
Open access

Large Language Models in Finance: A Survey

Published: 25 November 2023

Abstract

Recent advances in large language models (LLMs) have opened new possibilities for artificial intelligence applications in finance. In this paper, we provide a practical survey focused on two key aspects of utilizing LLMs for financial tasks: existing solutions and guidance for adoption.
First, we review current approaches employing LLMs in finance, including leveraging pretrained models via zero-shot or few-shot learning, fine-tuning on domain-specific data, and training custom LLMs from scratch. We summarize key models and evaluate their performance improvements on financial natural language processing tasks.
Second, we propose a decision framework to guide financial professionals in selecting the appropriate LLM solution based on their use case constraints around data, compute, and performance needs. The framework provides a pathway from lightweight experimentation to heavy investment in customized LLMs.
Lastly, we discuss limitations and challenges around leveraging LLMs in financial applications. Overall, this survey aims to synthesize the state-of-the-art and provide a roadmap for responsibly applying LLMs to advance financial AI.

Published In

ICAIF '23: Proceedings of the Fourth ACM International Conference on AI in Finance
November 2023
697 pages
ISBN: 9798400702402
DOI: 10.1145/3604237
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. Finance
2. Generative AI
3. Large Language Models
4. Natural Language Processing

Qualifiers

• Research-article
• Research
• Refereed limited

Conference

ICAIF '23

Article Metrics

• Downloads (last 12 months): 9,737
• Downloads (last 6 weeks): 1,085

Reflects downloads up to 13 Dec 2024.

Cited By

• (2024) Disruptive Factors in Product Portfolio Management: An Exploratory Study in B2B Manufacturing for Sustainable Transition. Sustainability 16:11 (4402). DOI: 10.3390/su16114402. Online publication date: 23-May-2024
• (2024) Adaptive Control of Retrieval-Augmented Generation for Large Language Models Through Reflective Tags. Electronics 13:23 (4643). DOI: 10.3390/electronics13234643. Online publication date: 25-Nov-2024
• (2024) HELIOS Approach: Utilizing AI and LLM for Enhanced Homogeneity Identification in Real Estate Market Analysis. Applied Sciences 14:14 (6135). DOI: 10.3390/app14146135. Online publication date: 15-Jul-2024
• (2024) ECC Analyzer: Extracting Trading Signal from Earnings Conference Calls using Large Language Model for Stock Volatility Prediction. Proceedings of the 5th ACM International Conference on AI in Finance (257-265). DOI: 10.1145/3677052.3698689. Online publication date: 14-Nov-2024
• (2024) TAT-LLM: A Specialized Language Model for Discrete Reasoning over Financial Tabular and Textual Data. Proceedings of the 5th ACM International Conference on AI in Finance (310-318). DOI: 10.1145/3677052.3698685. Online publication date: 14-Nov-2024
• (2024) Transformers and attention-based networks in quantitative trading: a comprehensive survey. Proceedings of the 5th ACM International Conference on AI in Finance (822-830). DOI: 10.1145/3677052.3698684. Online publication date: 14-Nov-2024
• (2024) Analyzing Cascading Outbreak of GameStop Event: A Practical Approach Using Network Analysis and Large Language Models. Proceedings of the 5th ACM International Conference on AI in Finance (428-436). DOI: 10.1145/3677052.3698636. Online publication date: 14-Nov-2024
• (2024) Mechanistic interpretability of large language models with applications to the financial services industry. Proceedings of the 5th ACM International Conference on AI in Finance (660-668). DOI: 10.1145/3677052.3698612. Online publication date: 14-Nov-2024
• (2024) Unifying Corroborative and Contributive Attributions in Large Language Models. 2024 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML) (665-683). DOI: 10.1109/SaTML59370.2024.00039. Online publication date: 9-Apr-2024
• (2024) LLM-Based Edge Intelligence: A Comprehensive Survey on Architectures, Applications, Security and Trustworthiness. IEEE Open Journal of the Communications Society 5 (5799-5856). DOI: 10.1109/OJCOMS.2024.3456549. Online publication date: 2024
