Add Cached Tokens to UsageResponse #642
Merged
The OpenAI API has added prompt caching: https://openai.com/index/api-prompt-caching/
This adds a new property to the returned usage response containing the number of tokens that were retrieved from the cache.
Here is the example from the docs:
```
usage: {
  total_tokens: 2306,
  prompt_tokens: 2006,
  completion_tokens: 300,
  prompt_tokens_details: {
    cached_tokens: 1920,
    audio_tokens: 0,
  },
  completion_tokens_details: {
    reasoning_tokens: 0,
    audio_tokens: 0,
  }
}
```
I have added a new class for `prompt_tokens_details` with a `cached_tokens` property, and exposed `prompt_tokens_details` as a property of the usage response.
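
A minimal sketch of the shape this change models, written here as TypeScript interfaces for illustration; the actual class and property names in the library may differ:

```typescript
// Hypothetical types mirroring the API response above -- field names
// follow the OpenAI docs; the library's real class names are assumptions.
interface PromptTokensDetails {
  cached_tokens: number;
  audio_tokens?: number;
}

interface CompletionTokensDetails {
  reasoning_tokens: number;
  audio_tokens?: number;
}

interface UsageResponse {
  total_tokens: number;
  prompt_tokens: number;
  completion_tokens: number;
  // Optional: older responses may not include the details objects.
  prompt_tokens_details?: PromptTokensDetails;
  completion_tokens_details?: CompletionTokensDetails;
}

// Usage example: read the number of cached prompt tokens, defaulting
// to 0 when the details object is absent.
function cachedTokens(usage: UsageResponse): number {
  return usage.prompt_tokens_details?.cached_tokens ?? 0;
}
```

Making `prompt_tokens_details` optional keeps the change backward compatible with responses that predate prompt caching.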