
Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild

License: MIT

Code for the paper "Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild".

  • Authors: Xinyu Zhao*, Guoheng Sun*, Ruisi Cai*, Yukun Zhou*, Pingzhi Li*, Peihao Wang*, Bowen Tan, Yexiao He, Li Chen, Yi Liang, Beidi Chen, Binhang Yuan, Hongyi Wang^, Ang Li^, Zhangyang Wang^, Tianlong Chen^
  • * Equal contribution, ^ Equal supervision

Overview

As Large Language Models (LLMs) excel across tasks and specialized domains, scaling LLMs based on existing models has garnered significant attention, yet combining disparate models often degrades performance. Various techniques have been proposed for aggregating pre-trained LLMs, including model merging, Mixture-of-Experts, and stacking. Despite their merits, a comprehensive comparison of these techniques and their synergistic application to a diverse model zoo has yet to be adequately addressed. In light of this research gap, this paper introduces Model-GLUE, a holistic LLM scaling guideline. Our work starts with a benchmarking of existing LLM scaling techniques, especially selective merging and variants of mixture. Using the insights from the benchmark results, we formulate an optimal strategy for selecting and aggregating a heterogeneous model zoo whose models differ in architecture and initialization. Our methodology involves clustering mergeable models, selecting a merging strategy for each cluster, and integrating the clusters through a model mixture. Finally, as evidenced by our experiments on a diverse Llama-2-based model zoo, Model-GLUE shows an average performance enhancement of 5.61%, achieved without additional training.
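To make the three-stage recipe above concrete, the sketch below outlines the control flow in minimal Python: cluster mergeable models, merge within each cluster, and mix the merged clusters. All helper names (model_glue, similarity, merge_cluster, build_mixture) and the greedy clustering heuristic are hypothetical placeholders for illustration, not the repository's actual API; the scripts in the sections below are the real entry points.

```python
# Conceptual sketch of the Model-GLUE pipeline (hypothetical helpers,
# not the repository's actual API). Stage 1 clusters models that are
# safe to merge, stage 2 merges each cluster, stage 3 mixes clusters.
from typing import Callable, List

def model_glue(
    zoo: List[str],                               # model identifiers
    similarity: Callable[[str, str], float],      # mergeability heuristic
    merge_cluster: Callable[[List[str]], str],    # e.g. a mergekit merge
    build_mixture: Callable[[List[str]], str],    # e.g. an MoE-style mixture
    threshold: float = 0.9,
) -> str:
    # Stage 1: greedily cluster mutually mergeable models.
    clusters: List[List[str]] = []
    for model in zoo:
        for cluster in clusters:
            if all(similarity(model, m) >= threshold for m in cluster):
                cluster.append(model)
                break
        else:
            clusters.append([model])

    # Stage 2: merge each cluster into a single checkpoint.
    merged = [merge_cluster(c) for c in clusters]

    # Stage 3: combine the merged clusters through a model mixture.
    return build_mixture(merged)
```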

Setup

  1. Set up the environment
conda create -n modelglue python=3.10
conda activate modelglue

pip install -r requirements.txt
  2. Install mergekit
git clone https://github.com/arcee-ai/mergekit.git -b mixtral
cd mergekit

pip install -e . 
  3. Install lm-eval
git clone -b offset_by_id https://github.com/s1ghhh/lm-evaluation-harness.git
cd lm-evaluation-harness
pip install --editable ./
  4. Install bigcode-eval
git clone https://github.com/bigcode-project/bigcode-evaluation-harness.git
cd bigcode-evaluation-harness && git checkout 00967d1
pip install --editable ./
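
As a quick sanity check after installation, you can verify that the environment loads a Llama-2-family checkpoint. The model identifier below is only an example (and may require Hugging Face access approval); any model from your zoo will do.

```python
# Environment sanity check: load an example Llama-2-family checkpoint.
# The model identifier is illustrative; substitute any model in your zoo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # example; gated on Hugging Face
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

inputs = tokenizer("Model merging combines", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=16)[0]))
```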

Model Merging

. scripts/run_heuristic_merge.sh
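
The script above wraps mergekit. If you prefer to drive mergekit directly, a minimal linear-merge configuration looks roughly like the sketch below; the two model identifiers and their weights are placeholders, not the model zoo used in the paper.

```python
# Minimal sketch: drive mergekit directly with a linear-merge config.
# Model names and weights are placeholders, not the paper's model zoo.
import subprocess

config = """\
merge_method: linear
dtype: float16
models:
  - model: meta-llama/Llama-2-7b-hf
    parameters:
      weight: 0.5
  - model: lmsys/vicuna-7b-v1.5
    parameters:
      weight: 0.5
"""

with open("linear_merge.yml", "w") as f:
    f.write(config)

# mergekit provides the mergekit-yaml entry point (see Setup step 2).
subprocess.run(["mergekit-yaml", "linear_merge.yml", "./merged-model"], check=True)
```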

Model Mixture

. scripts/run_mixture.sh
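
run_mixture.sh builds on the mixtral branch of mergekit installed in Setup step 2, which also exposes a mergekit-moe entry point. A hand-rolled Mixture-of-Experts configuration looks roughly like the sketch below; the base model, expert models, and routing prompts are placeholders.

```python
# Minimal sketch: build an MoE-style mixture with mergekit-moe (from the
# mixtral branch installed in Setup step 2). All model identifiers and
# routing prompts below are placeholders.
import subprocess

config = """\
base_model: meta-llama/Llama-2-7b-hf
gate_mode: hidden
dtype: float16
experts:
  - source_model: lmsys/vicuna-7b-v1.5
    positive_prompts:
      - "general chat"
  - source_model: codellama/CodeLlama-7b-hf
    positive_prompts:
      - "write code"
"""

with open("mixture.yml", "w") as f:
    f.write(config)

subprocess.run(["mergekit-moe", "mixture.yml", "./mixture-model"], check=True)
```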

Model-GLUE: Mixture of Selectively Merged Model Clusters

An example model cluster can be found here.

Evaluation

Please refer to eval_tools.py.
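
eval_tools.py drives the two harnesses installed above. For a quick standalone check, pre-0.4 versions of lm-evaluation-harness, on which the offset_by_id fork appears to be based, expose a simple_evaluate API, roughly as sketched below; the model path and task list are placeholders.

```python
# Quick standalone evaluation sketch using the lm-evaluation-harness fork
# installed in Setup step 3. simple_evaluate follows the pre-0.4 harness
# API; the model path and task list are placeholders.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf-causal",
    model_args="pretrained=./merged-model,dtype=float16",
    tasks=["arc_challenge", "hellaswag"],
    num_fewshot=0,
    batch_size=8,
)
print(results["results"])
```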
