DOI: 10.1145/3627673.3679233
short-paper

GongBu: Easily Fine-tuning LLMs for Domain-specific Adaptation

Published: 21 October 2024

Abstract

Parameter-Efficient Fine-Tuning (PEFT) adapts large language models (LLMs) to specific domains by updating only a small fraction of their parameters. To make such adaptation easy and efficient, we present GongBu, a no-code fine-tuning platform that supports nine PEFT methods and a range of open-source LLMs. GongBu lets users fine-tune LLMs through a user-friendly GUI, eliminating the need to write any code. Its features include data selection, accelerated training, decoupled deployment, performance monitoring, and error-log analysis. The demonstration video is available at https://www.youtube.com/watch?v=QuDR_WNoB9o.
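
To illustrate the kind of boilerplate that GongBu's GUI replaces, the sketch below shows a typical hand-written PEFT (LoRA) fine-tuning script using the Hugging Face transformers, peft, and datasets libraries. The base model name, dataset file, and hyperparameters are illustrative assumptions only, not GongBu's internals or defaults.

    from datasets import load_dataset
    from peft import LoraConfig, get_peft_model
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    # Placeholder base model and data file; substitute any open-source LLM
    # and a domain-specific instruction dataset.
    base_model = "mistralai/Mistral-7B-v0.1"
    tokenizer = AutoTokenizer.from_pretrained(base_model)
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained(base_model)

    # Wrap the model with LoRA adapters so that only the low-rank matrices
    # (a small fraction of the parameters) are trained.
    lora_config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                             target_modules=["q_proj", "v_proj"],
                             task_type="CAUSAL_LM")
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()

    # Tokenize the raw text field of the domain dataset.
    dataset = load_dataset("json", data_files="domain_instructions.json")["train"]
    dataset = dataset.map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
        batched=True)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="lora-out",
                               per_device_train_batch_size=2,
                               num_train_epochs=3,
                               learning_rate=2e-4),
        train_dataset=dataset,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()
    model.save_pretrained("lora-out")   # saves only the LoRA adapter weights

In GongBu, the equivalent choices (base model, PEFT method, dataset, and training hyperparameters) are made through form fields in the GUI, and the resulting adapter can be deployed separately from the training environment.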



Information

Published In

CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management
October 2024
5705 pages
ISBN: 9798400704369
DOI: 10.1145/3627673
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 21 October 2024


Author Tags

  1. LLM
  2. no-code platform
  3. parameter-efficient fine-tuning

Qualifiers

  • Short-paper

Funding Sources

  • the National Key R&D Program of China
  • the Special Funding Program of Shandong Taishan Scholars Project
  • the China Scholarship Council
  • Harbin Institute of Technology Graduate Teaching Reform Project

Conference

CIKM '24

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

