503: Request slowdown using Python wrapper #222

Open
duckduckgrayduck opened this issue May 13, 2024 · 2 comments

Comments

@duckduckgrayduck
Contributor

I'm receiving the following:
documentcloud.exceptions.APIError: 503 -
SlowDown: Please reduce your request rate. 48HHKKE4ZVM2HX5E3cDjq5OKjt3YZM91t21VRKRbDTr89/lUkzrJMSphkBjge369inHIAVDNiNzuGnEiDtsQGUy+RQw=
https://github.com/MuckRock/documentcloud-regex-addon/actions/runs/9066429901/job/24909314442

when using page_text = document.get_page_text(page_number) in the Regex Extractor Add-On

@duckduckgrayduck duckduckgrayduck added the bug Something isn't working label May 13, 2024
@mitchelljkotler
Member

That is coming from S3 directly. I believe the rate limits for S3 are not fixed, and they throttle you as they see fit. We could add some exponential backoff to the Python library.
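For illustration, here is a minimal sketch of what exponential backoff with jitter could look like. It is not the library's implementation: `with_backoff`, `TransientError`, and every parameter name are hypothetical. In the Add-On, the exception to retry on would presumably be the `documentcloud.exceptions.APIError` from the traceback, restricted to 503 responses, but how to read the status code off that exception is not shown in this thread and is left out here.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a throttling error such as the 503 SlowDown above."""


def with_backoff(func, *args, retries=5, base_delay=0.5, max_delay=30.0,
                 retry_on=(TransientError,), sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying with exponential backoff.

    The wait doubles after each failed attempt (base_delay, 2x, 4x, ...),
    is capped at max_delay, and gets up to 50% random jitter added so that
    parallel workers do not all retry at the same moment.
    """
    for attempt in range(retries + 1):
        try:
            return func(*args, **kwargs)
        except retry_on:
            if attempt == retries:
                # Out of retries: let the caller see the original error.
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
```

A call site might then read `with_backoff(document.get_page_text, page_number, retry_on=(APIError,))`, assuming it is acceptable to retry on any `APIError` rather than only on 503s.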

@eyeseast
Contributor

It's doing this one page at a time instead of getting all the text at once: https://github.com/MuckRock/documentcloud-regex-addon/blob/main/main.py#L34
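As a rough sketch of the "all the text at once" approach, the regex could be run over the whole document after a single fetch. This assumes the document object exposes the complete text as a `full_text` attribute; that is an assumption about the Python wrapper, not something confirmed in this thread. `find_matches` and `FakeDocument` are hypothetical names used only to keep the example self-contained.

```python
import re


def find_matches(document, pattern):
    """Return all regex matches from one fetch of the document's text.

    Assumes `document.full_text` returns every page's text as a single
    string (an assumption about the wrapper's Document object).
    """
    text = document.full_text  # one request instead of one per page
    return re.findall(pattern, text)


class FakeDocument:
    """Hypothetical stand-in so the sketch runs without network access."""
    full_text = "call 555-1234 or 555-9876"


print(find_matches(FakeDocument(), r"\d{3}-\d{4}"))
```

The trade-off is that matches are no longer tied to a page number, so this only fits if the Add-On does not need to report which page each match came from.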

3 participants