function call with llama cpp python · Issue #13 · MadeAgents/Hammer · GitHub

function call with llama cpp python #13


Open
HuntZhaozq opened this issue Apr 8, 2025 · 7 comments

Comments

@HuntZhaozq

Does this model support function calling using llama-cpp-python?

@linqq9
Contributor
linqq9 commented Apr 8, 2025

I haven't tried this, so I'm not sure whether llama-cpp-python supports calling the tokenizer's chat_template directly, the way the Hugging Face Transformers library does. If it doesn't, the option might be to render and concatenate the prompt yourself before passing it to llama-cpp-python.
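
If someone wants to try that manual route, a rough sketch would look like the code below. The repo id, .gguf path, and tool schema are placeholders, and it assumes a recent transformers version where apply_chat_template accepts a tools argument; if the model's template doesn't take tools, the schemas would need to be embedded in the prompt by hand.

from llama_cpp import Llama
from transformers import AutoTokenizer

# Render the prompt with the original tokenizer's chat_template
# (the repo id and tool schema below are placeholders).
tokenizer = AutoTokenizer.from_pretrained("MadeAgents/Hammer2.1-3b")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Paris?"}]

prompt = tokenizer.apply_chat_template(
    messages,
    tools=tools,
    tokenize=False,
    add_generation_prompt=True,
)

# Pass the pre-rendered string to llama-cpp-python as a plain completion
# (the .gguf path is a placeholder).
llm = Llama(model_path="hammer2.1-3b-Q4.gguf", n_ctx=4096)
out = llm(prompt, max_tokens=256, stop=[tokenizer.eos_token])
print(out["choices"][0]["text"])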

@HuntZhaozq
Author

llama-cpp-python does support function calling; see https://llama-cpp-python.readthedocs.io/en/latest/server/#function-calling. I tried hammer2.1-3b-Q4.gguf this way, but the results were not good. Can you have a try?
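
For reference, the server route from that link looks roughly like the sketch below (the model path, port, alias, and tool schema are placeholders, not my exact setup):

# Start the OpenAI-compatible server first, e.g.:
#   python -m llama_cpp.server --model hammer2.1-3b-Q4.gguf
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-no-key-required")

response = client.chat.completions.create(
    model="hammer2.1-3b-Q4",  # placeholder model alias
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    tool_choice="auto",
)
print(response.choices[0].message)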

@nickvdw
nickvdw commented Apr 8, 2025

I had success by specifying the chatml-function-calling chat format in the model instantiation. Specifically,

  • If I don't specify a chat format, the model will not run inference, as it leverages the default chat format, which leads to a TypeError: can only concatenate str (not "list") to str.
  • If I specify the chatml chat format, the model will run inference but the generated response will be empty (e.g., \n[]\n).
  • If I select the chatml-function-calling chat format, the model will run inference and the generated response will be a function call from my list of specified tools.

My code ends up looking something like the snippet below:

from llama_cpp import Llama

# Instantiate the model.
llm = Llama.from_pretrained(
    repo_id="mradermacher/Hammer2.1-3b-i1-GGUF",
    filename="**i1-Q4_K_M**",
    chat_format="chatml-function-calling",
)

# Define the user prompt.
user_prompt = ...

# Specify the messages and list of tools.
messages = [
    {
        "role": "system",
        "content": [
            {
                "type": "text",
                "text": SYSTEM_PROMPT,
            },
        ],
    },
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": user_prompt,
            },
        ],
    },
]

tools = [
    {
        "type": "function",
        "function": {
            "name": "dummy_tool",
            "description": "A dummy tool for this MVP.",
            "parameters": {
                "type": "object",
                "properties": {
                    "input_text": {"type": "string", "description": "Text as input for the function."}
                },
                "required": ["input_text"],
            },
        },
    },
    ...
]

# Run inference.
response = llm.create_chat_completion(
    messages=messages,
    tools=tools,
)
>>> ```\nAction: {"name": "dummy_tool", "arguments": {"input_text": "Example of input text."}}\n```

For the system prompt, I'm leveraging smolagent's ToolCallingAgent system prompt (reference).

I haven't tried this on a larger scale yet, so I'm not fully sure whether it'll work robustly, but I hope this helps!

@HuntZhaozq
Author
HuntZhaozq commented Apr 8, 2025

@nickvdw Thank you! What about asking it its name? For general queries like that (not about the tools), I found it responds quite badly. It seems to always call a tool.

@linqq9
Contributor
linqq9 commented Apr 8, 2025

@HuntZhaozq Hi, maybe you can refer to this document, llama-cpp-python chat-completion, to set tokenizer.chat_template.
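
A rough sketch of that approach (the repo id and .gguf path are placeholders; Jinja2ChatFormatter is the helper llama-cpp-python provides for custom Jinja chat templates):

from llama_cpp import Llama
from llama_cpp.llama_chat_format import Jinja2ChatFormatter
from transformers import AutoTokenizer

# Take the chat_template string from the original tokenizer
# (the repo id is a placeholder for illustration).
hf_tokenizer = AutoTokenizer.from_pretrained("MadeAgents/Hammer2.1-3b")

chat_handler = Jinja2ChatFormatter(
    template=hf_tokenizer.chat_template,
    eos_token=hf_tokenizer.eos_token,
    bos_token=hf_tokenizer.bos_token or "",  # Qwen-style tokenizers have no BOS
).to_chat_handler()

llm = Llama(
    model_path="hammer2.1-3b-Q4.gguf",  # placeholder path
    chat_handler=chat_handler,
    n_ctx=4096,
)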

@nickvdw
nickvdw commented Apr 8, 2025

> Thank you! What about asking it its name? For general queries like that (not about the tools), I found it responds quite badly. It seems to always call a tool.

Without specifying any tools or a system prompt for mradermacher/Hammer2.1-3b-i1-GGUF (i1-Q4_K_M), the model generates the following text:

// Input prompt:
What is your name?

// With the `chatml` chat format:
>>> Hello! How can I assist you today?

// With the `chatml-function-calling` chat format:
>>> I am an AI assistant.

// Input prompt:
Describe the meaning of life.

// With the `chatml` chat format:
>>> Hello! How can I assist you today?

// With the `chatml-function-calling` chat format:
>>> The meaning of life is a philosophical question that has been debated for centuries. It is a fundamental question that seeks to understand the purpose and significance of existence. Different people have different beliefs and perspectives on the meaning of life, but some common themes include the search for happiness, fulfillment, and a connection to something greater than oneself. Some argue that the meaning of life is to seek knowledge and understanding, while others believe it is to contribute to society and make a positive impact. Ultimately, the meaning of life is a personal and subjective question that can vary greatly from person to person.

@HuntZhaozq
Author

@nickvdw But when you specify the tools in the request and set tool_choice="auto", the output is abnormal for these general queries. When I try this, it outputs null content and calls no tools.
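
A minimal sketch of what I mean, reusing the model and tools from your snippet above:

# Reuses `llm` and `tools` from the earlier snippet; the query is a
# general one that should not need a tool.
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is your name?"}],
    tools=tools,
    tool_choice="auto",
)
# The returned message has null content and no tool_calls here.
print(response["choices"][0]["message"])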
