fix: 429 runaway retry. retry_after is in seconds and so is sleep #114
Summary
The `retry_after` value returned with a 429 response is in seconds according to the documentation, and so is the `sleep` time, so we shouldn't be converting it to milliseconds. This PR mimics what the `handle_preemptive_rl` method does for `X-RateLimit-Reset-After`, which is also in seconds.

I don't know how to reproduce this yet, but it happened twice today and generated hundreds of thousands of retry requests milliseconds apart because the sleep was essentially 0. We had to reboot the server to stop it. Here is what that looked like in our Honeycomb.io monitoring dashboard:
We've added some more instrumentation to capture the RestClient response headers, so if it happens again we may get a better idea of what is going on. Regardless, this seems like a very low-risk change that would prevent others from getting into this out-of-control retry loop.
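To make the unit mismatch concrete, here is a minimal sketch of the arithmetic. It is not the library's actual code: the two helper names and the sample response body are hypothetical, and the only assumption carried over from the summary is that `retry_after` is expressed in seconds.

```ruby
require 'json'

# Hypothetical helper showing the old behaviour: retry_after is treated as
# milliseconds and divided by 1000 before being handed to sleep.
def wait_seconds_treating_as_millis(body)
  JSON.parse(body)['retry_after'].to_f / 1000.0
end

# Hypothetical helper showing the corrected behaviour: retry_after is already
# in seconds, so it is passed through unchanged.
def wait_seconds_treating_as_seconds(body)
  JSON.parse(body)['retry_after'].to_f
end

# Illustrative 429 response body (made up for this sketch).
sample_body = '{"retry_after": 0.35}'

buggy_wait = wait_seconds_treating_as_millis(sample_body)
fixed_wait = wait_seconds_treating_as_seconds(sample_body)

puts "treated as milliseconds: sleep for #{buggy_wait} s"
puts "treated as seconds:      sleep for #{fixed_wait} s"
```

With the extra division, the wait shrinks by a factor of 1000, so a sub-second `retry_after` turns into a sleep of well under a millisecond and the retry fires again almost immediately.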
Fixed
`wait_seconds` when handling `RestClient::TooManyRequests`