Parameter Competition Balancing for Model Merging (PCB-Merging) (NeurIPS 2024)
This is the source code to reproduce the experiments for "Parameter Competition Balancing for Model Merging" by Guodong Du, Junlin Lee, Jing Li, Runhua Jiang, Shuyang Yu, Yifei Guo, Hanting Liu, Sim Kuan Goh, Ho-Kin Tang.
Our key contributions include:
- We re-examine existing model merging methods, highlighting the critical role of parameter competition awareness;
- We introduce a novel approach called PCB-Merging, which effectively adjusts parameter coefficients through balancing parameter competition;
- Our proposed method stabilizes and enhances model merging performance across various application scenarios without additional training.
We release the source code for cross-task merging experiments, including nlp_src and vision_src. More code will be released as soon as possible.
The details of our method are shown in the file pcb-merging.py. Additionally, you can check the application in different scenarios with evolutionary strategies in pcb_ES.py (found in the vision_source_code directory) or merging.py (in the nlp_source_code directory). You can obtain the model population by executing run_finetuning.sh and try different merging methods using run_merging.sh.
We release an implementation of the key steps in our proposed method, including intra-balancing, inter-balancing, drop and rescale, as shown in pcb-merging.py. In addition, we include the implementation of evolutionary strategies.
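As a rough illustration of the steps named above (intra-balancing, inter-balancing, drop and rescale), here is a minimal NumPy sketch. It is not the code in pcb-merging.py: the function names, the keep_ratio parameter, the min-max normalisation, and the choice of softmax axes are all assumptions made for this example, and the released implementation may differ in every one of these details.

```python
import numpy as np

def minmax(x, axis):
    """Min-max normalise x to [0, 1] along `axis`; constant slices map to 0."""
    lo = x.min(axis=axis, keepdims=True)
    hi = x.max(axis=axis, keepdims=True)
    return (x - lo) / np.maximum(hi - lo, 1e-12)

def softmax(x, axis):
    """Numerically stable softmax along `axis`."""
    z = np.exp(x - x.max(axis=axis, keepdims=True))
    return z / z.sum(axis=axis, keepdims=True)

def pcb_merge_sketch(task_vectors, keep_ratio=0.5):
    """Hypothetical sketch: merge task vectors of shape (n_tasks, n_params).

    A task vector is assumed to be (fine-tuned weights - pretrained weights).
    """
    tv = np.asarray(task_vectors, dtype=float)
    n, d = tv.shape

    # Intra-balancing (assumed form): score each parameter within its own
    # task by its squared magnitude, sharpened by the number of tasks.
    intra = softmax(n * minmax(tv * tv, axis=1), axis=1)

    # Inter-balancing (assumed form): score each parameter by how well it
    # agrees with the same parameter in every task, via pairwise products.
    pair = tv[:, None, :] * tv[None, :, :]           # (n, n, d): tv_i * tv_j
    inter = softmax(minmax(pair, axis=0), axis=0).sum(axis=1)

    score = intra * inter                             # (n, d)

    # Drop: per task, keep only the top `keep_ratio` fraction of scores.
    k = min(d, max(1, int(round(keep_ratio * d))))
    thresh = np.sort(score, axis=1)[:, -k][:, None]   # k-th largest per task
    beta = np.where(score >= thresh, score, 0.0)

    # Rescale: weighted average of task vectors using the surviving scores.
    # Parameters dropped in every task end up with a merged value of 0.
    return (beta * tv).sum(axis=0) / np.maximum(beta.sum(axis=0), 1e-12)

# Toy usage: two task vectors over four parameters (made-up numbers).
theta_pre = np.zeros(4)                               # stand-in pretrained weights
tvs = np.array([[0.2, -0.1, 0.0, 0.5],
                [0.3,  0.4, -0.2, 0.1]])
lam = 1.0                                             # hypothetical scaling coefficient
theta_merged = theta_pre + lam * pcb_merge_sketch(tvs, keep_ratio=0.5)
```

With a single task, or with identical task vectors and keep_ratio=1.0, the sketch returns the input task vector unchanged, which is a convenient sanity check when experimenting with it.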
- Create a virtual environment and activate it.
python3 -m venv env
source env/bin/activate
- Install dependencies
python -m pip install -r requirements.txt -f https://download.pytorch.org/whl/cu113/torch_stable.html
- Download the Story Cloze Dataset and update its path in the StoryClozeReader class in data/dataset_readers.py.
- Set the path to where the finetuned models are stored in utils/merge_utils.py.
We downloaded the released IA3 checkpoints from TIES_Merging.
Please cite our paper if you use our models in your work:
@inproceedings{guodong24neurips,
  title = {Parameter Competition Balancing for Model Merging},
  author = {Guodong Du and Junlin Lee and Jing Li and Runhua Jiang and Yifei Guo and Shuyang Yu and Hanting Liu and Sim Kuan Goh and Ho-Kin Tang and Daojing He and Min Zhang},
  booktitle = {The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS)},
  year = {2024}
}