Context window always shows zero tokens when using local LLM #1138
Comments
Tested using LM Studio and got the same result (no context window updating).
Seeing the same issue with Qwen 2.5 32B on LM Studio (configured in Roo as the LM Studio provider). Roo doesn't show tokens going up or down.
FYI, seeing the same when setting up an OpenAI-compatible provider. I was using xAI for Grok 3 mini, with a base URL of https://api.x.ai/v1. Everything seems to work fine, but no tokens are shown going up or down; both are zero. I also set pricing in the model config, but I don't see that either.
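If that endpoint follows the OpenAI response shape, usage numbers should already be in the response body, so this looks like a display issue rather than a provider one. A quick way to check against the same base URL (assuming a valid XAI_API_KEY in the environment; the model id grok-3-mini is also an assumption, not from the report above):
curl -s https://api.x.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -d '{
    "model": "grok-3-mini",
    "messages": [{"role": "user", "content": "Hello"}]
  }' | jq -r '.usage'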
I'm having this same problem with LM Studio and Qwen 2.5.
In case the context window size is not fetchable from the API, we should be able to set the window size explicitly, like in the most recent Cline, no?
Here is the simplest way to extract the current value from the model:
curl -s -X POST http://localhost:11434/api/show -d '{"name": "<SELECTED_MODEL_NAME>"}' | jq -r '.parameters' | grep '^num_ctx' | awk '{print $2}'
Using default or global settings usually indicates that the user does not fully understand what they are doing, which can lead to unsatisfactory results; this should be emphasized in the documentation. After the model is configured, the method described above can be used to set the context window size. Token usage can then be calculated with an approximate formula (e.g., 1 token ≈ 3/4 of a word, or a more advanced algorithm). Combined with future improvements regarding the context window, this will keep Ollama competitive.
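Putting those two pieces together, a minimal sketch (assuming a local Ollama at the default port; the model name and prompt are placeholders) that reads num_ctx and applies the 3/4-of-a-word heuristic from the previous paragraph:
#!/usr/bin/env bash
# Read the configured context window from Ollama's /api/show.
MODEL="qwen2.5:7b"
NUM_CTX=$(curl -s -X POST http://localhost:11434/api/show \
  -d "{\"name\": \"$MODEL\"}" \
  | jq -r '.parameters' | grep '^num_ctx' | awk '{print $2}')

# Estimate tokens from word count: 1 token ≈ 3/4 of a word,
# so tokens ≈ words * 4 / 3.
PROMPT="You are a helpful assistant. Hello"
WORDS=$(echo "$PROMPT" | wc -w)
EST_TOKENS=$(( WORDS * 4 / 3 ))

echo "context window: ${NUM_CTX:-not set}, estimated prompt tokens: $EST_TOKENS"
If num_ctx is not set for the model, the grep returns nothing and the script reports "not set", which is exactly the case where an explicit override in Roo would help.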
OpenAI-compatible API shows token usage:
curl -s http://localhost:11434/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "qwen3:30b-a3b",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Hello"
}
]
}' | jq -r '.usage'
{
"prompt_tokens": 20,
"completion_tokens": 83,
"total_tokens": 103
}
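Since that usage object is returned, a client could combine it with num_ctx to report how full the window is. A rough sketch along those lines (same assumptions as above: local Ollama at the default port, placeholder model name):
#!/usr/bin/env bash
MODEL="qwen3:30b-a3b"

# Total tokens consumed by the request, from the OpenAI-compatible endpoint.
USED=$(curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d "{\"model\": \"$MODEL\", \"messages\": [{\"role\": \"user\", \"content\": \"Hello\"}]}" \
  | jq -r '.usage.total_tokens')

# Configured context window, from /api/show (empty if num_ctx is unset).
NUM_CTX=$(curl -s -X POST http://localhost:11434/api/show \
  -d "{\"name\": \"$MODEL\"}" \
  | jq -r '.parameters' | grep '^num_ctx' | awk '{print $2}')

if [ -n "$NUM_CTX" ]; then
  echo "used $USED of $NUM_CTX tokens ($(( USED * 100 / NUM_CTX ))%)"
else
  echo "used $USED tokens (num_ctx not set for this model)"
fi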
Which version of the app are you using?
latest
Which API Provider are you using?
Ollama
Which Model are you using?
qwen2.5:7b
What happened?
I'm running qwen2.5:7b locally and pointing Roo Code at it. It works fine at an acceptable speed; however, the context window always shows 0 while running.
Steps to reproduce
Relevant API REQUEST output
Additional context
No response