Context window always shows zero token when using local LLM · Issue #1138 · RooVetGit/Roo-Code · GitHub

Open
phken91 opened this issue Feb 24, 2025 · 7 comments
Labels
actionable (Confirmed and ready to work on) · bug (Something isn't working)

Comments

@phken91
phken91 commented Feb 24, 2025

Which version of the app are you using?

latest

Which API Provider are you using?

Ollama

Which Model are you using?

qwen2.5:7b

What happened?

I'm running qwen2.5:7b locally and pointing Roo Code at it. It works fine at an acceptable speed; however, the context window always shows 0 tokens while running.

Steps to reproduce

Relevant API REQUEST output

Additional context

No response

@phken91 phken91 added the bug Something isn't working label Feb 24, 2025
@cte cte changed the title context window always show zero token when using local llm Context window always shows zero token when using local LLM Feb 26, 2025
@hannesrudolph
Collaborator

Tested using LM Studio and got the same result (no context window updating).

@hannesrudolph hannesrudolph added the actionable Confirmed and ready to work on label Mar 3, 2025
@hannesrudolph hannesrudolph moved this to To triage in Roo Code Roadmap Mar 4, 2025
@hannesrudolph hannesrudolph moved this from To triage to Backlog in Roo Code Roadmap Mar 4, 2025
@hannesrudolph hannesrudolph added needs scoping Needs up-front scoping to be actionable and removed actionable Confirmed and ready to work on labels Mar 5, 2025
@hannesrudolph hannesrudolph moved this from Backlog to Needs Scoping in Roo Code Roadmap Mar 5, 2025
@greinacker
greinacker commented Apr 13, 2025

Seeing the same issue with Qwen 2.5 32B on LM Studio (configured in Roo as the LM Studio provider): Roo doesn't show tokens up or down.

@greinacker

FYI, seeing the same when setting up an OpenAI-compatible provider - I was using xAI for Grok 3 mini, with a base URL of https://api.x.ai/v1. Everything seems to work fine, but no tokens are shown going up or down - both are zero. I also set pricing for the model config, but I don't see that either.

Image

@adgower
adgower commented Apr 23, 2025

I'm having this same problem with LM Studio and Qwen 2.5.

@marksverdhei

If the context window size is not fetchable from the API, we should be able to set the window size explicitly, like in the most recent Cline, no?
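A minimal sketch of that fallback in TypeScript; the setting name (localModelContextWindow) and the default value are hypothetical, not an existing Roo Code option:

// Prefer the window size reported by the provider/API; otherwise fall back
// to an explicit user setting, and finally to a conservative default.
// "localModelContextWindow" and the 8192 default are assumptions for illustration.
function resolveContextWindow(
  apiReported: number | undefined,
  settings: { localModelContextWindow?: number }
): number {
  return apiReported ?? settings.localModelContextWindow ?? 8192
}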

@hannesrudolph hannesrudolph moved this from Issue [Needs Scoping] to Issue [Unassigned] in Roo Code Roadmap May 7, 2025
@hannesrudolph hannesrudolph added actionable Confirmed and ready to work on and removed needs scoping Needs up-front scoping to be actionable labels May 7, 2025
@R-omk
R-omk commented May 7, 2025

Here is the simplest way to extract the current value from the model:

curl -s -X POST http://localhost:11434/api/show -d '{"name": "<SELECTED_MODEL_NAME>"}' | jq -r '.parameters' | grep '^num_ctx' | awk '{print $2}'
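The same lookup as a TypeScript sketch (assuming Node 18+ for the global fetch and the default Ollama endpoint); the parsing of the "parameters" string mirrors the grep/awk pipeline above:

// Query Ollama's /api/show for a model and pull num_ctx out of the
// newline-separated "parameters" block (e.g. "num_ctx 32768").
async function getOllamaNumCtx(
  model: string,
  baseUrl = "http://localhost:11434"
): Promise<number | undefined> {
  const res = await fetch(`${baseUrl}/api/show`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ name: model }),
  })
  const data = (await res.json()) as { parameters?: string }
  const line = data.parameters
    ?.split("\n")
    .find((l) => l.trim().startsWith("num_ctx"))
  return line ? Number(line.trim().split(/\s+/)[1]) : undefined
}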

Since the num_ctx size is not explicitly passed through the API during the request, the best approach is to use a pre-configured model.

Using default or global settings usually indicates that the user does not fully understand what they are doing, which can lead to unsatisfactory results. This should be emphasized in the documentation.

After the model is configured, the method described above can be used to set the context window size. Token usage calculations can then be performed using an approximate formula (e.g., 1 token ≈ 3/4 of a word, or a more advanced algorithm).
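A rough sketch of that word-based estimate (1 token ≈ 3/4 of a word, i.e. roughly 4 tokens per 3 words); a real implementation would likely use a proper tokenizer instead:

// Approximate token count from whitespace-separated words.
function estimateTokens(text: string): number {
  const words = text.trim().split(/\s+/).filter(Boolean).length
  return Math.ceil((words * 4) / 3)
}

// e.g. estimateTokens("You are a helpful assistant.") -> 7 (5 words)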

Combined with future improvements regarding the context window, this will ensure Ollama stays competitive.

@R-omk
R-omk commented May 7, 2025

The OpenAI-compatible API shows token usage:

curl -s   http://localhost:11434/v1/chat/completions     -H "Content-Type: application/json"     -d '{
        "model": "qwen3:30b-a3b",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Hello"
            }
        ]
    }' | jq -r '.usage'

{
  "prompt_tokens": 20,
  "completion_tokens": 83,
  "total_tokens": 103
}
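So for requests that go through the OpenAI-compatible endpoint, the counts could simply be read from the response. A minimal TypeScript sketch mirroring the curl example above (field names follow the OpenAI response format; error handling omitted):

interface Usage {
  prompt_tokens: number
  completion_tokens: number
  total_tokens: number
}

// Send a non-streaming chat completion and return the usage block.
async function getUsage(
  baseUrl: string,
  model: string
): Promise<Usage | undefined> {
  const res = await fetch(`${baseUrl}/v1/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model,
      messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "Hello" },
      ],
    }),
  })
  const data = (await res.json()) as { usage?: Usage }
  return data.usage // e.g. { prompt_tokens: 20, completion_tokens: 83, total_tokens: 103 }
}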
