Improve convert retry handling #433

Open
2 tasks
stchris opened this issue Feb 10, 2023 · 1 comment


stchris commented Feb 10, 2023

Our current retry logic for converting documents (shelling out to LibreOffice) is based on two constants, the timeout and the number of retry attempts:

TIMEOUT = 3600 # seconds
CONVERT_RETRIES = 5

It would be preferable to fail fast on the first attempt and then increase the timeout on each retry, up to a maximum.

For instance: right now we retry up to 5 times and time out after 3600s (1 hour). We could potentially get much better throughput by having a first timeout after 600s (10 minutes) which gets progressively larger (with a potential max cap). To illustrate:

TIMEOUT_START=600
TIMEOUT_INCREASE=900
TIMEOUT_MAX=3600
CONVERT_RETRIES=5

This would result in up to 5 retries with timeouts of 10, 25, 40, 55 and 60 minutes. Ideally "stuck" convert tasks would time out much sooner and get queued up for a retry faster.
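To make the arithmetic concrete, here is a minimal sketch of how that schedule could be computed and applied. The helper names `timeout_for_attempt`, `timeout_schedule` and `run_with_retries` are hypothetical and not taken from the existing code; the only assumption about the conversion step is that it is a subprocess call that accepts a timeout.

```python
import subprocess

TIMEOUT_START = 600      # seconds, timeout for the first attempt
TIMEOUT_INCREASE = 900   # seconds added for each further attempt
TIMEOUT_MAX = 3600       # seconds, upper cap
CONVERT_RETRIES = 5


def timeout_for_attempt(attempt):
    """Timeout in seconds for a zero-based attempt number (hypothetical helper)."""
    return min(TIMEOUT_START + attempt * TIMEOUT_INCREASE, TIMEOUT_MAX)


def timeout_schedule(retries=CONVERT_RETRIES):
    """List of timeouts, one per attempt."""
    return [timeout_for_attempt(i) for i in range(retries)]


def run_with_retries(cmd, retries=CONVERT_RETRIES):
    """Run cmd, retrying with a longer timeout each time it times out.

    Only timeouts are retried here; a non-zero exit code is returned
    to the caller as part of the CompletedProcess.
    """
    for attempt in range(retries):
        try:
            return subprocess.run(cmd, timeout=timeout_for_attempt(attempt))
        except subprocess.TimeoutExpired:
            continue  # stuck conversion: try again with the next timeout
    raise RuntimeError(f"command timed out on all {retries} attempts")


print([t // 60 for t in timeout_schedule()])  # [10, 25, 40, 55, 60]
```

The uncapped fifth value would be 600 + 4 * 900 = 4200s (70 minutes), which is why the last attempt lands on the 60 minute cap.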

TODO

@ozhyrenkov

Hey, I have a thought regarding the START/INCREASE/MAX variables. I like the way this works in the retry and requests libraries: they take a backoff parameter, a float that controls how quickly the interval between attempts grows.

I am not deeply familiar with how this works in Aleph, but the retry library implements it nicely.
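As a rough illustration of that idea applied to the timeout rather than to the wait between attempts, a multiplicative variant might look like the sketch below. This is not the retry library's API; `TIMEOUT_BACKOFF` and `backoff_schedule` are hypothetical names, and the factor of 2.0 is only an example value.

```python
TIMEOUT_START = 600     # seconds, timeout for the first attempt
TIMEOUT_BACKOFF = 2.0   # hypothetical growth factor applied per attempt
TIMEOUT_MAX = 3600      # seconds, upper cap
CONVERT_RETRIES = 5


def backoff_schedule(start=TIMEOUT_START, backoff=TIMEOUT_BACKOFF,
                     maximum=TIMEOUT_MAX, retries=CONVERT_RETRIES):
    """Timeouts that grow by a constant factor per attempt, capped at maximum."""
    return [min(start * backoff ** attempt, maximum) for attempt in range(retries)]


print(backoff_schedule())
```

With these example values the timeouts would be 10, 20, 40, 60 and 60 minutes, so a single float replaces the separate INCREASE setting while START and MAX stay as they are.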
