Add tokens consumed · Issue #3 · LiveBench/liveswebench · GitHub

Add tokens consumed #3

Open
LoggeL opened this issue Apr 2, 2025 · 4 comments

Comments

@LoggeL
LoggeL commented Apr 2, 2025

It would be nice to see the number of tokens consumed or the money spent per tool, to better compare their cost-effectiveness.
I know it's not possible for all tools, but it would be nice to have.

@fiery-prometheus
fiery-prometheus commented Apr 2, 2025

Came here to add this as well, along with some thoughts from the community:

https://www.reddit.com/r/LocalLLaMA/comments/1jplg2o/livebench_team_just_dropped_a_leaderboard_for/

  • Tokens generated is a good idea because it lets users measure reasoning and calculate cost at a point in time. A straight monetary cost column wouldn't be good because cost can vary wildly from one point in time to another: it changes with the provider and isn't a stable measurement.

  • At the same time, having a model/agentic AI come to the right conclusion faster is a valid measurement as well, since the interaction with the tool and the time it takes to solve problems are correlated with total tokens generated.

  • A faster "time to solve" will make the tool more practical to use and will give tools a yardstick for using fewer tokens and less reasoning to achieve good results, which will benefit users' time and wallets.
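To illustrate the first point, here is a minimal sketch of how a stable token count could be turned into a cost for any given price snapshot. The class names, field names, and prices are all hypothetical and not taken from any provider or from the benchmark itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenUsage:
    """Token counts for one benchmark run (the stable measurement)."""
    input_tokens: int
    output_tokens: int


@dataclass(frozen=True)
class PriceSnapshot:
    """Hypothetical provider prices at one point in time, in USD per million tokens."""
    usd_per_million_input: float
    usd_per_million_output: float


def cost_usd(usage: TokenUsage, prices: PriceSnapshot) -> float:
    """Derive a cost from token counts and a chosen price snapshot."""
    total = (
        usage.input_tokens * prices.usd_per_million_input
        + usage.output_tokens * prices.usd_per_million_output
    )
    return total / 1_000_000


# The same recorded usage can be re-priced whenever provider prices change.
usage = TokenUsage(input_tokens=1_000_000, output_tokens=500_000)
old_prices = PriceSnapshot(usd_per_million_input=2.0, usd_per_million_output=8.0)
new_prices = PriceSnapshot(usd_per_million_input=1.0, usd_per_million_output=4.0)
print(cost_usd(usage, old_prices), cost_usd(usage, new_prices))
```

Publishing only the token counts would let readers plug in whatever prices apply on the day they look at the leaderboard.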

@gnguralnick
Contributor

Tokens used / number of steps is something we're working on displaying; it's definitely useful information. The major hurdle is that there isn't really a good way to get that information out of the IDE tools (open to suggestions there), but we can at least do it for the command-line agents. Raw time to solve is problematic since it would depend on the internet connection.

@LoggeL
Author
LoggeL commented Apr 4, 2025

The easiest solution would be to use your own API key and track the usage on that key.
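As a rough sketch of what that tracking could look like, assume each tool is given its own dedicated key and the provider's usage log can be exported as one record per request. The record fields (`key_label`, `input_tokens`, `output_tokens`) and the tool names are hypothetical; real usage exports differ by provider.

```python
from collections import defaultdict


def summarize_usage(records):
    """Sum requests and tokens per API key label (one key per tool is assumed)."""
    totals = defaultdict(
        lambda: {"requests": 0, "input_tokens": 0, "output_tokens": 0}
    )
    for record in records:
        entry = totals[record["key_label"]]
        entry["requests"] += 1  # one request is treated as one step
        entry["input_tokens"] += record["input_tokens"]
        entry["output_tokens"] += record["output_tokens"]
    return dict(totals)


# Hypothetical export: two requests made by "tool-a", one by "tool-b".
records = [
    {"key_label": "tool-a", "input_tokens": 1200, "output_tokens": 300},
    {"key_label": "tool-a", "input_tokens": 800, "output_tokens": 200},
    {"key_label": "tool-b", "input_tokens": 500, "output_tokens": 100},
]
summary = summarize_usage(records)
print(summary)
```

This only works for tools that accept a user-supplied key, so it would not cover IDE tools that route requests through their own backend.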

3 participants