Question about Hammer-2.1-7b chat template · Issue #7 · MadeAgents/Hammer · GitHub

Question about Hammer-2.1-7b chat template #7

jc-ryan opened this issue Dec 13, 2024 · 11 comments

@jc-ryan commented Dec 13, 2024

Congratulations on your new progress with Hammer-2.1! I have a question: I noticed that when using vllm serve, the tool call parser being used is hermes, but your training data uses a custom tool output format. How does the hermes XML template implement the parsing?

Additionally, I noticed that you modified the chat template of Qwen-2.5-coder-7B. I'm curious why you didn't directly use Qwen-2.5's own chat template for training (which already supports tool calls and appropriate Hermes output format)? Are there any additional considerations behind this decision?
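
For context, the setup in question is roughly the following; a minimal sketch, assuming the server is started with something like `vllm serve <model> --enable-auto-tool-choice --tool-call-parser hermes` (the endpoint URL, model id, and example tool are illustrative, not taken from this issue):

```python
# Minimal sketch (illustrative, not from this issue): querying a vLLM
# OpenAI-compatible server that was launched with the hermes tool-call parser.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # assumed local endpoint

# Hypothetical tool schema used only to exercise the parser.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="MadeAgents/Hammer2.1-7b",  # assumed model id served by vLLM
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)

# If the parser recognizes the model's output, structured calls land in
# tool_calls; otherwise the raw text stays in content (see the replies below).
print(response.choices[0].message.tool_calls)
print(response.choices[0].message.content)
```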

@jc-ryan (Author) commented Dec 13, 2024

Additionally, I'd like to ask: I noticed that the Model Card mentions "better multi-turn" capabilities, but the current training data, custom chat template, and the LlamaFactory training format used ({ "instruction": content, "input": "", "output": label}) don't seem to support multi-turn function calling? (Indeed, using Qwen2.5's own chat template would have made it easier to train for multi-turn function calling)

@jc-ryan (Author) commented Dec 13, 2024

> Congratulations on your new progress with Hammer-2.1! I have a question: I noticed that when using vllm serve, the tool call parser being used is hermes, but your training data uses a custom tool output format. How does the hermes XML template implement the parsing?
>
> Additionally, I noticed that you modified the chat template of Qwen-2.5-coder-7B. I'm curious why you didn't directly use Qwen-2.5's own chat template for training (which already supports tool calls and appropriate Hermes output format)? Are there any additional considerations behind this decision?

Indeed, it doesn't work.
[screenshot attached]

@linqq9 (Contributor) commented Dec 13, 2024

> Congratulations on your new progress with Hammer-2.1! I have a question: I noticed that when using vllm serve, the tool call parser being used is hermes, but your training data uses a custom tool output format. How does the hermes XML template implement the parsing?
>
> Additionally, I noticed that you modified the chat template of Qwen-2.5-coder-7B. I'm curious why you didn't directly use Qwen-2.5's own chat template for training (which already supports tool calls and appropriate Hermes output format)? Are there any additional considerations behind this decision?

Hi, we found that to have vLLM parse Hammer's tool output format we might need to submit a Pull Request (PR) to vLLM to add our own parser. We tried it out and found that the hermes parser outputs the tool-call format we defined directly, without any extra processing, so we chose it.

Regarding why we didn't use Qwen-2.5's own chat template for fine-tuning: when we were training Hammer 1.0, we found that switching to Qwen's own tool-call prompt didn't bring any benefits, so we didn't continue with that approach.

@linqq9 (Contributor) commented Dec 13, 2024

> Additionally, I'd like to ask: I noticed that the Model Card mentions "better multi-turn" capabilities, but the current training data, custom chat template, and the LlamaFactory training format used ({ "instruction": content, "input": "", "output": label}) don't seem to support multi-turn function calling? (Indeed, using Qwen2.5's own chat template would have made it easier to train for multi-turn function calling)

For Hammer 2.1, we mixed some multi-turn data into the training, and we will also open-source this data in the future.

@linqq9 (Contributor) commented Dec 13, 2024

> Congratulations on your new progress with Hammer-2.1! I have a question: I noticed that when using vllm serve, the tool call parser being used is hermes, but your training data uses a custom tool output format. How does the hermes XML template implement the parsing?
> Additionally, I noticed that you modified the chat template of Qwen-2.5-coder-7B. I'm curious why you didn't directly use Qwen-2.5's own chat template for training (which already supports tool calls and appropriate Hermes output format)? Are there any additional considerations behind this decision?
>
> Indeed, it doesn't work. [screenshot attached]

Sorry, I misunderstood. Currently the built-in parser cannot be used directly, but you can complete the parsing with json.loads().
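
A minimal sketch of that approach, assuming the model emits its tool calls as a JSON list of {"name", "arguments"} objects (the raw output string below is illustrative):

```python
import json

# Raw completion text returned by the model (illustrative example output,
# assuming tool calls are emitted as a JSON list of {"name", "arguments"} objects).
raw_output = '[{"name": "get_weather", "arguments": {"city": "Paris"}}]'

try:
    tool_calls = json.loads(raw_output.strip())
except json.JSONDecodeError:
    tool_calls = []  # fall back to treating the reply as plain text

for call in tool_calls:
    print(call["name"], call["arguments"])
```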

@linqq9 (Contributor) commented Dec 13, 2024

And the current custom chat template supports multi-turn function calling.

@jc-ryan (Author) commented Dec 13, 2024

> And the current custom chat template supports multi-turn function calling.

Oh I see, so the data_processing file should also be updated accordingly to use tokenizer.apply_chat_template(), right?

Also, was the multi-turn function calling fine-tuned using llama_factory as well? If so, it presumably no longer uses the Alpaca data format that Hammer 2.0 did, right?

@linqq9 (Contributor) commented Dec 13, 2024

Currently, we only use the data-processing files to augment single-turn data. Going forward, we will also extend them to multi-turn scenarios.

Our multi-turn function calling was also fine-tuned with Llama_factory. Initially we planned to train on the ShareGPT format, but during the process we found that Llama_factory appears to require even-numbered turns to be the assistant role. Since our data may contain more diverse roles, we ultimately trained with the Alpaca data format, with the only modification being to the template.
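
To make the layout concrete, a multi-turn record in that Alpaca format could look roughly like the sketch below (only the field names come from this discussion; the role markers, serialized turns, and tool-call label are invented for illustration):

```python
# Hypothetical multi-turn record in the Alpaca layout described above.
# The field names ("instruction", "input", "output") come from this thread;
# the role markers, serialized turns, and tool-call label are illustrative only.
# The real serialization is defined by the data_processing/template code.
record = {
    "instruction": (
        "USER: Book a table for two tonight.\n"
        'ASSISTANT: [{"name": "find_restaurants", "arguments": {"party_size": 2}}]\n'
        'TOOL: ["Bistro A", "Bistro B"]\n'
        "USER: Bistro A, please."
    ),
    "input": "",
    "output": '[{"name": "book_table", "arguments": {"restaurant": "Bistro A", "party_size": 2}}]',
}
```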

@jc-ryan (Author) commented Dec 13, 2024

> Currently, we only use the data-processing files to augment single-turn data. Going forward, we will also extend them to multi-turn scenarios.
>
> Our multi-turn function calling was also fine-tuned with Llama_factory. Initially we planned to train on the ShareGPT format, but during the process we found that Llama_factory appears to require even-numbered turns to be the assistant role. Since our data may contain more diverse roles, we ultimately trained with the Alpaca data format, with the only modification being to the template.

Sorry, what I meant to say was that the template in data_processing.py is not exactly the same as the one in chat_template - for example, the template in data_processing.py doesn't include roles. Shouldn't we construct the fine-tuning data based on chat_template + tokenizer.apply_chat_template()?

@linqq9 (Contributor) commented Dec 13, 2024

You're right. Constructing fine-tuning data based on chat_template + tokenizer.apply_chat_template() is indeed a great approach.
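
As a minimal sketch of that approach (assuming a recent transformers release whose apply_chat_template accepts a tools argument; the model id, message, and tool schema below are illustrative):

```python
# Minimal sketch: building prompt text via the model's own chat template.
# The model id, message, and tool schema are illustrative assumptions.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("MadeAgents/Hammer2.1-7b")  # assumed repo id

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Paris?"}]

prompt = tokenizer.apply_chat_template(
    messages,
    tools=tools,
    add_generation_prompt=True,  # drop this and append the label text when building SFT targets
    tokenize=False,
)
print(prompt)
```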

@jc-ryan (Author) commented Dec 16, 2024

Ok, thanks, looking forward to the updated data_processing logic and vllm tool parser support~
(p.s. If using Qwen's chat template won't cause significant performance impact, you can save the effort of submitting a separate pull request for a hammer tool parser to the vllm official repo)
