Implement a retry mechanism for Google GenAI calls #15783
What it does
See #15718:
Google imposes rate limits on its LLMs. Especially on lower tiers (including the free tier, see https://ai.google.dev/gemini-api/docs/rate-limits?hl=de ), an agent that makes tool calls (such as the Coder agent) can quickly hit the Requests Per Minute (RPM) rate limit, at which point the agent execution terminates with an error.
In addition, in longer conversations the LLM sometimes does not return proper JSON, which leads to an error in the GenAI API client, and the API occasionally reports 500 Internal Server Errors.
To make the Google LanguageModel implementation in Theia more robust against these errors, a retry mechanism is now implemented that can resend the last request after a configurable delay in case of an error.
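The retry behavior can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `RetrySettings`, `withRetry`, `isRateLimitError`, and the injectable `wait` parameter are hypothetical names, the three settings fields simply mirror the preferences of this PR, and detecting a rate limit error through a `status` of 429 on the thrown error is an assumption.

```typescript
// Hypothetical settings object; field names mirror the preferences of this PR.
interface RetrySettings {
    maxRetriesOnErrors: number;          // values smaller than 1 disable retries
    retryDelayOnRateLimitError: number;  // seconds; negative means "do not retry"
    retryDelayOnOtherErrors: number;     // seconds; negative means "do not retry"
}

const defaultRetrySettings: RetrySettings = {
    maxRetriesOnErrors: 3,
    retryDelayOnRateLimitError: 60,
    retryDelayOnOtherErrors: -1
};

// Assumption: a rate limit error carries an HTTP-like status of 429.
function isRateLimitError(error: unknown): boolean {
    return typeof error === 'object' && error !== null
        && (error as { status?: number }).status === 429;
}

const sleep = (ms: number): Promise<void> =>
    new Promise(resolve => setTimeout(resolve, ms));

// Runs `request` and resends it after the configured delay when it fails.
// `wait` is injectable so the delay can be replaced in tests.
async function withRetry<T>(
    request: () => Promise<T>,
    settings: RetrySettings = defaultRetrySettings,
    wait: (ms: number) => Promise<void> = sleep
): Promise<T> {
    const maxRetries = Math.max(0, settings.maxRetriesOnErrors);
    for (let attempt = 0; ; attempt++) {
        try {
            return await request();
        } catch (error) {
            const delaySeconds = isRateLimitError(error)
                ? settings.retryDelayOnRateLimitError
                : settings.retryDelayOnOtherErrors;
            // Give up when retries are exhausted or disabled for this error kind.
            if (attempt >= maxRetries || delaySeconds < 0) {
                throw error;
            }
            await wait(delaySeconds * 1000);
        }
    }
}
```

With the defaults shown, a request that fails with a rate limit error is resent up to three times with 60 seconds between attempts, while any other error is propagated immediately.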
The retry mechanism can be configured using preferences:
- maxRetriesOnErrors: configures the maximum number of retries per request after which to give up. Defaults to 3. If smaller than 1, then the retry logic is disabled.
- retryDelayOnRateLimitError: configures the delay in seconds to wait in case of a rate limit error. Defaults to 60 (1 minute). If negative, then no retry is attempted and the error is propagated.
- retryDelayOnOtherErrors: configures the delay in seconds to wait in case of any other error. Defaults to -1 (disabled). If negative, then no retry is attempted and the error is propagated.
How to test
Without the change, you should see a 429 Rate Limit Exceeded message after 10 tool calls.
With the change, the conversation will stall for some time in the middle, but eventually continue after 60s of waiting.
Follow-ups
Breaking changes
Attribution
Review checklist
Reminder for reviewers