function call with llama cpp python #13
Does this model support function calling using llama-cpp-python?
I haven't tried this, so I'm not sure whether llama-cpp-python supports calling the tokenizer's chat_template directly, the way the Hugging Face Transformers library does. If it doesn't, an option might be to concatenate the prompt yourself before passing it to llama-cpp-python.
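For what it's worth, a minimal sketch of that concatenation approach could look like the following; the repo id, GGUF path, and prompt below are illustrative assumptions, not something confirmed in this thread:

```python
# Sketch: render the prompt with the Hugging Face tokenizer's chat_template,
# then hand the resulting string to llama-cpp-python as a plain completion.
from transformers import AutoTokenizer
from llama_cpp import Llama

tokenizer = AutoTokenizer.from_pretrained("MadeAgents/Hammer2.1-3b")  # assumed repo id
messages = [{"role": "user", "content": "What's the weather in Paris?"}]

# Render the chat template to a single prompt string.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

llm = Llama(model_path="hammer2.1-3b-Q4.gguf")  # placeholder path
response = llm(prompt, max_tokens=256)
print(response["choices"][0]["text"])
```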
llama-cpp-python does support function calling; see https://llama-cpp-python.readthedocs.io/en/latest/server/#function-calling. I tried hammer2.1-3b-Q4.gguf this way, but the results were not good. Can you give it a try?
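As a hedged sketch of what the server-based route from those docs might look like (the URL, file name, and tool schema below are assumptions, not taken from this thread):

```python
# Assumes a server started with something like:
#   python -m llama_cpp.server --model hammer2.1-3b-Q4.gguf \
#       --chat_format chatml-function-calling
# and queried through the OpenAI-compatible client.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-no-key")

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # the local server serves its loaded model regardless of this name
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool for illustration
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    tool_choice="auto",
)
print(response.choices[0].message.tool_calls)
```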
I had success by specifying the `chatml-function-calling` chat format. My code ends up looking something like the below:

```python
from llama_cpp import Llama

# Instantiate the model.
llm = Llama.from_pretrained(
    repo_id="mradermacher/Hammer2.1-3b-i1-GGUF",
    filename="*i1-Q4_K_M*",  # glob pattern matching the quantized file in the repo
    chat_format="chatml-function-calling",
)

# Define the system and user prompts (see the note on the system prompt below).
SYSTEM_PROMPT = ...
user_prompt = ...

# Specify the messages and list of tools.
messages = [
    {
        "role": "system",
        "content": [
            {
                "type": "text",
                "text": SYSTEM_PROMPT,
            },
        ],
    },
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": user_prompt,
            },
        ],
    },
]
tools = [
    {
        "type": "function",
        "function": {
            "name": "dummy_tool",
            "description": "A dummy tool for this MVP.",
            "parameters": {
                "type": "object",
                "properties": {
                    "input_text": {
                        "type": "string",
                        "description": "Text as input for the function.",
                    },
                },
                "required": ["input_text"],
            },
        },
    },
    ...
]

# Run inference.
response = llm.create_chat_completion(
    messages=messages,
    tools=tools,
)
```
This yields a response along the lines of:

````
>>> ```
Action: {"name": "dummy_tool", "arguments": {"input_text": "Example of input text."}}
```
````

For the system prompt, I'm leveraging smolagents'. I haven't tried this on a larger scale yet, so I'm not fully sure whether it'll work robustly, but I hope that this helps!
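In case it's useful, here is a sketch of reading the result back out of the response dict from the snippet above; it assumes the OpenAI-style layout that `create_chat_completion` returns, with `tool_calls` populated when the handler emits a call:

```python
# Continuation of the snippet above: inspect the returned message.
import json

message = response["choices"][0]["message"]
if message.get("tool_calls"):
    call = message["tool_calls"][0]
    name = call["function"]["name"]
    arguments = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
    print(f"Tool requested: {name} with {arguments}")
else:
    # No tool call; the model answered directly.
    print(message.get("content"))
```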
@nickvdw Thank you! How about asking it general questions, such as its name? For general queries like these (not about the tools), I found the responses to be quite bad. It seems it always calls the tool.
@HuntZhaozq Hi, maybe you can refer to the llama-cpp-python chat-completion documentation to set `tokenizer.chat_template`.
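A sketch of what that could look like, assuming the template ships with the model's Hugging Face tokenizer and using `Jinja2ChatFormatter` from `llama_cpp.llama_chat_format`; the repo id, path, and token handling here are assumptions:

```python
# Wire the model's own chat_template into llama-cpp-python.
from transformers import AutoTokenizer
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Jinja2ChatFormatter

hf_tokenizer = AutoTokenizer.from_pretrained("MadeAgents/Hammer2.1-3b")  # assumed repo id

formatter = Jinja2ChatFormatter(
    template=hf_tokenizer.chat_template,
    eos_token=hf_tokenizer.eos_token,
    bos_token=hf_tokenizer.bos_token or "",
)

llm = Llama(
    model_path="hammer2.1-3b-Q4.gguf",  # placeholder path
    chat_handler=formatter.to_chat_handler(),
)
```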
Without specifying any tools nor any system prompt, here is what I get:

```
// Input prompt:
What is your name?

// With the `chatml` chat format:
>>> Hello! How can I assist you today?

// With the `chatml-function-calling` chat format:
>>> I am an AI assistant.

// Input prompt:
Describe the meaning of life.

// With the `chatml` chat format:
>>> Hello! How can I assist you today?

// With the `chatml-function-calling` chat format:
>>> The meaning of life is a philosophical question that has been debated for centuries. It is a fundamental question that seeks to understand the purpose and significance of existence. Different people have different beliefs and perspectives on the meaning of life, but some common themes include the search for happiness, fulfillment, and a connection to something greater than oneself. Some argue that the meaning of life is to seek knowledge and understanding, while others believe it is to contribute to society and make a positive impact. Ultimately, the meaning of life is a personal and subjective question that can vary greatly from person to person.
```
@nickvdw But when you specify tools in the request and set `tool_choice="auto"`, the output is abnormal for these general queries. I tried this, and it outputs null content while calling no tools.
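One hedged workaround sketch, continuing from the snippet earlier in the thread: detect the empty answer and retry without the tool schema. This is purely illustrative control flow, not a confirmed fix:

```python
# First attempt: let the model decide whether to call a tool.
response = llm.create_chat_completion(messages=messages, tools=tools, tool_choice="auto")
message = response["choices"][0]["message"]

if not message.get("content") and not message.get("tool_calls"):
    # Null content and no tool calls: fall back to a plain chat completion
    # without the tool schema, so general queries still get answered.
    response = llm.create_chat_completion(messages=messages)
    message = response["choices"][0]["message"]

print(message.get("content") or message.get("tool_calls"))
```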