While training GLM-4 on multi-turn agent dialogue tasks, I found two bugs in the finetune.py script.
In the process_batch function, the following code is broken:
    for message in conv:
        message = process_message(message)
        loss_mask_val = False if message['role'] in ('system', 'user', 'observation') else True
        new_input_ids = tokenizer.apply_chat_template([message], tokenize=True, return_dict=False)[2:]
The same flaw exists in the process_batch_eval function:
    for message in conv:
        if len(input_ids) >= max_input_length:
            break
        else:
            message = process_message(message)
            new_input_ids = tokenizer.apply_chat_template([message], tokenize=True, return_dict=False)[2:]
Tracing back to the code at the end of the apply_chat_template function:
    if tokenize:
        output = self.batch_encode_plus(
            [result] if isinstance(result[0], int) else result,
            padding=padding,
            truncation=truncation,
            max_length=max_length,
            return_tensors=return_tensors,
            is_split_into_words=True,
            add_special_tokens=False
        )
        if return_dict:
            return output
        else:
            return output["input_ids"]
From this it follows that tokenizer.apply_chat_template([message], tokenize=True, return_dict=False) returns a nested list containing a single token-id list. Slicing it directly with [2:] therefore slices the outer one-element list and yields an empty list, so the training data is lost entirely and training is ineffective. What we actually want is the encoded token vector of the current message, with the two leading special tokens ([gMASK] prefix) stripped. The fix is to change:
new_input_ids = tokenizer.apply_chat_template([message], tokenize=True, return_dict=False)[2:]
to:
new_input_ids = tokenizer.apply_chat_template([message], tokenize=True, return_dict=False)[0][2:]
After this change, multi-turn dialogue training works correctly.
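The slicing pitfall can be reproduced without any tokenizer; a minimal sketch with a stand-in nested list (the ids below are placeholders, not real GLM-4 vocabulary ids):

```python
# With return_dict=False, apply_chat_template returns a list containing one
# token-id list, e.g. [[gmask_id, sop_id, role_id, tok1, tok2]].
output = [[151331, 151333, 151336, 1001, 1002]]

broken = output[2:]    # slices the OUTER one-element list -> []
fixed = output[0][2:]  # slices the inner token list -> drops the 2 prefix tokens

print(broken)  # []
print(fixed)   # [151336, 1001, 1002]
```

With the outer slice, every message contributes zero tokens, which is exactly why the loss never sees any data.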
The second bug involves this function from the finetune_demo/finetune.py script:
    def process_message(message):
        if 'tools' in message and message['role'] == 'system':
            for tool in message['tools']:
                parameters = tool['function']['parameters']['properties']
                tool['function']['parameters']['properties'] = {
                    k: v for k, v in parameters.items() if v is not None}
        elif 'tools' in message:
            del message['tools']
and, in the process_batch function:
    else:
        for message in conv:
            message = process_message(message)
            loss_mask_val = False if message['role'] in ('system', 'user', 'observation') else True
            new_input_ids = tokenizer.apply_chat_template([message], tokenize=True, return_dict=False)[0][2:]
            input_ids += new_input_ids
            loss_masks += [loss_mask_val] * len(new_input_ids)
Here, the assistant's tool-call message in the training data looks like:
    {
        "role": "assistant",
        "content": "{\"name\": \"get_recommended_books\", \"arguments\": {\"interests\": [\"history\", \"science fiction\"]}}"
    },
Inference tests with GLM-4 agent chat show that the model emits its API planning/dispatch data in the format: functionName\n{"param": "p"}
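Given that format, splitting a generated tool call back into its parts is a one-line partition on the first newline (parse_tool_call is a hypothetical helper, not part of the repo):

```python
import json

def parse_tool_call(text):
    # "functionName\n{json arguments}" -> (name, arguments dict)
    name, _, args_json = text.partition('\n')
    return name, json.loads(args_json)

name, args = parse_tool_call(
    'get_recommended_books\n{"interests": ["history", "science fiction"]}')
print(name)  # get_recommended_books
```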
Tracing the data-modeling logic further into apply_chat_template:
    input = self.build_single_message(
        item["role"],
        item.get("metadata", ""),
        item["content"],
        tokenize=tokenize
    )
and into the function:
    def build_single_message(self, role, metadata, message, tokenize=True):
        """ tokens of "" <|{role}|>{metadata}\n{message} """
        assert role in ["system", "user", "assistant", "observation"], role
        if tokenize:
            role_tokens = [self.convert_tokens_to_ids(f"<|{role}|>")] + self.tokenizer.encode(f"{metadata}\n",
                                                                                              disallowed_special=())
            message_tokens = self.tokenizer.encode(message, disallowed_special=())
            tokens = role_tokens + message_tokens
            return tokens
        else:
            return str(f"<|{role}|>{metadata}\n{message}")
From this we can deduce the GLM-4 agent chat training-data template:
[gMASK]<|system|>\n tools messages<|user|>\nQ<|assistant|>\nPrefix response<|assistant|>functionName\n{"param":"p"}<|observation|>\n tool return<|assistant|>Answer
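The template can be reproduced by mirroring the tokenize=False branch of build_single_message; the helper below is a sketch of that branch, assembled over a hypothetical conversation:

```python
def build_single_message_str(role, metadata, message):
    # Mirrors the tokenize=False branch: <|{role}|>{metadata}\n{message}
    return f"<|{role}|>{metadata}\n{message}"

parts = [
    build_single_message_str('system', '', 'tools messages'),
    build_single_message_str('user', '', 'Q'),
    # Tool call: metadata carries the function name, content the arguments.
    build_single_message_str('assistant', 'get_recommended_books',
                             '{"interests": ["history"]}'),
    build_single_message_str('observation', '', 'tool return'),
    build_single_message_str('assistant', '', 'Answer'),
]
prompt = '[gMASK]' + ''.join(parts)
print(prompt)
```

Note how the function name lands between `<|assistant|>` and the newline, exactly matching the `functionName\n{...}` output observed at inference time.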
The above demonstrates that, for agent chat training, the assistant's tool-call message
    {
        "role": "assistant",
        "content": "{\"name\": \"get_recommended_books\", \"arguments\": {\"interests\": [\"history\", \"science fiction\"]}}"
    }
must be converted into the format:
    {
        "role": "assistant",
        "metadata": <function name>,
        "content": <arguments JSON object>
    }
The fix is to extend process_message to perform this conversion (the splitting branch below is my patch, keyed to the template above; it requires `import json`):

    def process_message(message):
        if 'tools' in message and message['role'] == 'system':
            for tool in message['tools']:
                parameters = tool['function']['parameters']['properties']
                tool['function']['parameters']['properties'] = {
                    k: v for k, v in parameters.items() if v is not None}
        elif 'tools' in message:
            del message['tools']
        # Split an assistant tool call into metadata (function name) and
        # content (arguments JSON), matching <|assistant|>{metadata}\n{content}.
        if message['role'] == 'assistant' and message['content'].lstrip().startswith('{'):
            try:
                tool_call = json.loads(message['content'])
                message['metadata'] = tool_call['name']
                message['content'] = json.dumps(tool_call['arguments'], ensure_ascii=False)
            except (json.JSONDecodeError, KeyError):
                pass
        return message
Result: agent chat multi-turn tool-call training now completes successfully.