Implement the circuit breaker pattern #8735

Open
akkie opened this issue Feb 3, 2022 · 3 comments
Labels
blocker/info Marks an issue as blocked, awaiting more information from the author component/clients component/zeebe Related to the Zeebe component/team kind/feature Categorizes an issue or PR as a feature, i.e. new behavior

Comments


akkie commented Feb 3, 2022

Is your feature request related to a problem? Please describe.
Handling massively parallel processes in today's cloud architectures can trigger infrastructure autoscaling, and autoscaling takes time. The current retry mechanism cannot handle scenarios where errors occur because of resource limits; incidents are created instead.

Describe the solution you'd like
With a circuit breaker implementation, such burst scenarios could be handled without creating incidents. Instead of creating an incident after x retries, the circuit breaker waits some time before it starts a new retry. If the calls to a worker fail permanently (as determined by a configured threshold), an incident can be created.
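
To make the idea concrete, here is a minimal sketch of the pattern in Python. It is not Zeebe code. The class name, the parameters (`failure_threshold`, `reset_timeout`) and the `trips` counter are assumptions made for illustration. The breaker opens after a number of consecutive failures, rejects calls while open, and allows one trial call once the wait time has passed.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


@dataclass
class CircuitBreaker:
    # All names and defaults here are illustrative assumptions.
    failure_threshold: int = 5        # consecutive failures before opening
    reset_timeout: float = 60.0       # seconds to wait before a trial call
    clock: Callable[[], float] = time.monotonic
    failures: int = 0                 # consecutive failures seen so far
    trips: int = 0                    # how often the circuit has opened
    opened_at: Optional[float] = None

    def state(self) -> str:
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half-open"        # wait is over, one trial call allowed
        return "open"

    def call(self, fn, *args, **kwargs):
        if self.state() == "open":
            raise CircuitOpenError("circuit is open, retry later")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call re-opens the circuit immediately.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
                self.trips += 1
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller could compare `trips` against its own limit to decide when the failure is permanent and an incident should be raised. That decision is deliberately left outside the sketch.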

Additional context
https://www.martinfowler.com/bliki/CircuitBreaker.html
https://doc.akka.io/docs/akka/current/common/circuitbreaker.html

@akkie akkie added the kind/feature Categorizes an issue or PR as a feature, i.e. new behavior label Feb 3, 2022
Member

saig0 commented Feb 4, 2022

There is another open request #5629 that should allow configuring a backoff when failing a job. This feature is already implemented in the broker but not on the clients yet.

Do you need anything else? Or, would it fulfill your requirements?
If not then please describe your expectations in detail.

@saig0 saig0 added the blocker/info Marks an issue as blocked, awaiting more information from the author label Feb 8, 2022
Author

akkie commented Feb 8, 2022

I think the backoff time is a great addition. But I think a circuit breaker could be more flexible in its configuration.
The backoff time is more fixed. For example, 5 retries with a backoff time take exactly five minutes. With the circuit breaker, you could define something like this: after 5 retries without a backoff time, wait for one minute; if that fails, wait for another minute plus an exponential backoff factor, and so on. This is repeated until it reaches a max timeout. This solution retries quickly at the beginning and then stretches the time between retries toward the end.
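
The schedule described above could be sketched roughly as follows. This is one possible reading of the description: the function name, the parameter names and the default values (doubling factor, ten-minute cap) are assumptions, and "plus an exponential backoff factor" is interpreted as multiplying the previous wait by `factor`.

```python
from typing import Iterator


def retry_delays(
    fast_retries: int = 5,     # retries without any backoff
    base_delay: float = 60.0,  # first wait after the fast retries, in seconds
    factor: float = 2.0,       # growth factor per further attempt (assumed)
    max_delay: float = 600.0,  # the "max timeout" that ends the schedule
) -> Iterator[float]:
    """Yield the wait time in seconds before each retry."""
    # Phase 1: retry immediately a fixed number of times.
    for _ in range(fast_retries):
        yield 0.0
    # Phase 2: stretch the wait until it would reach the maximum.
    delay = base_delay
    while delay < max_delay:
        yield delay
        delay *= factor
    # Final attempt at the maximum wait, then the schedule is exhausted.
    yield max_delay


if __name__ == "__main__":
    print(list(retry_delays()))
```

With the defaults, the schedule is five immediate retries followed by waits of 60, 120, 240, 480 and 600 seconds. Once the generator is exhausted, the caller would treat the failure as permanent.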

Member

saig0 commented Feb 17, 2022

@akkie the backoff time can be defined by the client that is handling the job. It's up to the job worker/handler to calculate the backoff time. It can be fixed (e.g. retry after 1 minute) but it can also be dynamic (e.g. exponential backoff).

So, the backoff time would enable the dynamic backoff that you're requesting.
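
As a rough illustration of a handler that calculates the backoff itself, consider the sketch below. It does not use the real client API: `Job`, `handle` and the `fail_job` callback are hypothetical stand-ins, and the attempt number is derived from the remaining retries under the assumption that every job starts with `max_retries`.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Job:
    """Hypothetical stand-in for a job; not the client's own job type."""
    key: int
    retries: int  # retries remaining


def fixed_backoff(attempt: int) -> float:
    """Fixed strategy: always retry after 1 minute."""
    return 60.0


def exponential_backoff(attempt: int, base: float = 1.0,
                        factor: float = 2.0, cap: float = 300.0) -> float:
    """Dynamic strategy: base * factor^attempt seconds, capped at `cap`."""
    return min(cap, base * factor ** attempt)


def handle(job: Job,
           work: Callable[[Job], None],
           fail_job: Callable[[int, int, float, str], None],
           max_retries: int = 5,
           backoff: Callable[[int], float] = exponential_backoff) -> None:
    """Run `work`; on error, fail the job with a computed backoff."""
    try:
        work(job)
    except Exception as exc:
        attempt = max_retries - job.retries  # 0 on the first failure
        remaining = max(0, job.retries - 1)
        # `fail_job` is a hypothetical callback that would report the
        # failure, the remaining retries and the backoff in seconds.
        fail_job(job.key, remaining, backoff(attempt), str(exc))
```

Swapping `backoff=fixed_backoff` for `backoff=exponential_backoff` changes the strategy without touching the handler logic.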

I assume that you want to have a more declarative way of defining the backoff. For example, using a configuration setting or a new property of the job worker annotation. This should avoid implementing the pattern for every job worker again.
Is this correct?

It sounds like a good idea for the spring-zeebe project.


6 participants