DOI: 10.1145/3627673.3679233
short-paper

GongBu: Easily Fine-tuning LLMs for Domain-specific Adaptation

Published: 21 October 2024

Abstract

Parameter-Efficient Fine-Tuning (PEFT) adapts large language models (LLMs) to specific domains by updating only a small fraction of their parameters. To make such adaptation easy and efficient, we present GongBu, a no-code fine-tuning platform that supports nine PEFT methods and a range of open-source LLMs. GongBu lets users fine-tune LLMs through a user-friendly GUI, eliminating the need to write any code. Its features include data selection, accelerated training, decoupled deployment, performance monitoring, and error-log analysis. The demonstration video is available at https://www.youtube.com/watch?v=QuDR_WNoB9o.
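
To illustrate the kind of boilerplate that GongBu's GUI replaces, the sketch below shows a typical hand-written PEFT (LoRA) fine-tuning script using the Hugging Face transformers, peft, and datasets libraries. The base model name, dataset file, and hyperparameters are illustrative assumptions only, not GongBu's internals or defaults.

    from datasets import load_dataset
    from peft import LoraConfig, get_peft_model
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    # Placeholder base model and data file; substitute any open-source LLM
    # and a domain-specific instruction dataset.
    base_model = "mistralai/Mistral-7B-v0.1"
    tokenizer = AutoTokenizer.from_pretrained(base_model)
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained(base_model)

    # Wrap the model with LoRA adapters so that only the low-rank matrices
    # (a small fraction of the parameters) are trained.
    lora_config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                             target_modules=["q_proj", "v_proj"],
                             task_type="CAUSAL_LM")
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()

    # Tokenize the raw text field of the domain dataset.
    dataset = load_dataset("json", data_files="domain_instructions.json")["train"]
    dataset = dataset.map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
        batched=True)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="lora-out",
                               per_device_train_batch_size=2,
                               num_train_epochs=3,
                               learning_rate=2e-4),
        train_dataset=dataset,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()
    model.save_pretrained("lora-out")   # saves only the LoRA adapter weights

In GongBu, the equivalent choices (base model, PEFT method, dataset, and training hyperparameters) are made through form fields in the GUI, and the resulting adapter can be deployed separately from the training environment.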



Information

Published In

CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management
October 2024
5705 pages
ISBN: 9798400704369
DOI: 10.1145/3627673
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 21 October 2024


Author Tags

  1. LLM
  2. no-code platform
  3. parameter-efficient fine-tuning

Qualifiers

  • Short-paper

Funding Sources

  • the National Key R&D Program of China
  • the Special Funding Program of Shandong Taishan Scholars Project
  • the China Scholarship Council
  • Harbin Institute of Technology Graduate Teaching Reform Project

Conference

CIKM '24

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

