fix: 429 runaway retry. retry_after is in seconds and so is sleep #114

Merged

1 commit merged into shardlab:main on Jan 6, 2022

Conversation

@sadleb (Contributor) commented on Jan 6, 2022

Summary

The retry_after value returned with a 429 is in seconds according to the documentation, and so is the sleep time, so we shouldn't be converting it to milliseconds. This PR mimics what the handle_preemptive_rl method does for X-RateLimit-Reset-After, which is also in seconds.
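For illustration, here is a minimal Ruby sketch of the bug and the fix. The handler name and shape are assumptions, not discordrb's actual code; only the unit handling is the point:

```ruby
require 'json'

# Minimal sketch (illustrative names, not discordrb's actual handler).
# Discord's retry_after field is in seconds, and Kernel#sleep also takes
# seconds, so no unit conversion belongs here.
def handle_too_many_requests(response)
  wait_seconds = JSON.parse(response.body)['retry_after'].to_f

  # Buggy behavior: dividing by 1000 as if the value were milliseconds.
  # A retry_after of 5 becomes sleep(0.005), so retries fire milliseconds
  # apart, producing the runaway loop described below.
  # sleep(wait_seconds / 1000.0)

  # Fix: treat the value as seconds, matching how handle_preemptive_rl
  # treats X-RateLimit-Reset-After (also seconds).
  sleep(wait_seconds)
end
```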

I don't know how to reproduce this yet, but it happened twice today and generated hundreds of thousands of retry requests milliseconds apart because the sleep was essentially 0. We had to reboot the server to stop it. Here is what that looked like in our Honeycomb.io monitoring dashboard:
[image: Honeycomb.io dashboard showing the spike of runaway retry requests]

We've added more instrumentation to capture the RestClient response headers, so if it happens again we can hopefully get a better idea of what's going on. Regardless, this seems like a very low-risk change that would prevent others from getting into this out-of-control retry loop.

Fixed

  • Retry wait_seconds is now treated as seconds, not milliseconds, when handling RestClient::TooManyRequests

@Daniel-Worrall (Member) left a comment

Yeah, seems this came in discord/discord-api-docs#2097 and it was missed this whole time. I'm quite worried we hadn't had reports of this before now though

@sadleb changed the title from "fix: 429 retry_after is in seconds, not milliseconds" to "fix: 429 runaway retry. retry_after is in seconds and so is sleep" on Jan 6, 2022
@sadleb (Contributor, Author) commented on Jan 6, 2022

> Yeah, seems this came in discord/discord-api-docs#2097 and it was missed this whole time. I'm quite worried we hadn't had reports of this before now though

I'll let you know if we find out more about why we're getting a 429 in the first place.

@sadleb (Contributor, Author) commented on Jan 6, 2022

Can I expect this to be merged anytime soon? Trying to decide if I need to monkey patch or something in the meantime. Thanks for the quick response by the way!

@swarley (Member) commented on Jan 6, 2022

I should be able to merge this tonight; I just need to check that the information returned in the headers isn't more precise (there should also be a Retry-After header, and I remember there being a discrepancy between the two values).
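For context, a hypothetical way to read both values off a 429 with RestClient (an illustrative helper, not discordrb code; RestClient normalizes header names to symbols like :retry_after):

```ruby
require 'json'

# Illustrative only: pull both retry values from a 429 response. The HTTP
# Retry-After header carries whole seconds, while the JSON body's
# retry_after field can carry sub-second precision.
def retry_values(response)
  {
    header: response.headers[:retry_after].to_f,           # e.g. 65.0
    body:   JSON.parse(response.body)['retry_after'].to_f  # e.g. 64.57
  }
end
```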

@sadleb (Contributor, Author) commented on Jan 6, 2022

> I should be able to merge this tonight; I just need to check that the information returned in the headers isn't more precise (there should also be a Retry-After header, and I remember there being a discrepancy between the two values).

Gotcha. Thanks so much!

@swarley merged commit f9a94c9 into shardlab:main on Jan 6, 2022
@swarley (Member) commented on Jan 6, 2022

It appears that the header value is the least precise. Thanks for the PR! 🚀

Labels: bug (Something isn't working)

3 participants