Support accrual failure detection #346

Open
jhalterman opened this issue Sep 17, 2022 · 4 comments

Comments

@jhalterman
Member

jhalterman commented Sep 17, 2022

Since Failsafe already supports policies that are useful for networked operations, it would make sense to support phi accrual (or another accrual algorithm) failure detection for situations where fixed timeouts don't adequately account for changing load conditions.

This could be implemented as a new policy that measures execution times over a number of executions to determine whether they cross some threshold that represents a failure. Phi accrual could be one strategy supported by the policy, but there could be others. When the threshold is crossed, a fallback-like function could be called, for example to fail over from a node that has failed to another one. In that sense, the policy would be like a time-based (rather than result-based) fallback, except that unlike a fallback it would be stateful.
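
To make the idea a bit more concrete, here is a minimal sketch of how a phi value could be computed from a sliding window of execution times. Everything in it is an assumption for illustration: the class name `PhiAccrualSketch`, its methods, the window size and the minimum standard deviation are hypothetical and not part of Failsafe. It also assumes execution times are roughly normally distributed and uses a logistic approximation of the normal CDF rather than an exact one.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of phi accrual over execution times; not Failsafe API. */
final class PhiAccrualSketch {
    private final int windowSize;
    private final double minStdDevMillis;
    private final Deque<Double> samples = new ArrayDeque<>();
    private double sum;
    private double sumOfSquares;

    PhiAccrualSketch(int windowSize, double minStdDevMillis) {
        if (windowSize < 1) throw new IllegalArgumentException("windowSize must be >= 1");
        if (minStdDevMillis <= 0) throw new IllegalArgumentException("minStdDevMillis must be > 0");
        this.windowSize = windowSize;
        this.minStdDevMillis = minStdDevMillis;
    }

    /** Records the duration of a completed execution, evicting the oldest sample if the window is full. */
    synchronized void record(double durationMillis) {
        samples.addLast(durationMillis);
        sum += durationMillis;
        sumOfSquares += durationMillis * durationMillis;
        if (samples.size() > windowSize) {
            double oldest = samples.removeFirst();
            sum -= oldest;
            sumOfSquares -= oldest * oldest;
        }
    }

    /**
     * Returns phi = -log10(P(duration > elapsed)) under a normal model of the
     * recorded durations. Returns 0 when there is no history yet.
     */
    synchronized double phi(double elapsedMillis) {
        if (samples.isEmpty()) return 0.0;
        double mean = sum / samples.size();
        double variance = Math.max(0.0, sumOfSquares / samples.size() - mean * mean);
        // A floor on the standard deviation avoids division by zero for very regular samples.
        double stdDev = Math.max(Math.sqrt(variance), minStdDevMillis);
        double y = (elapsedMillis - mean) / stdDev;
        // Logistic approximation of the normal CDF tail.
        double e = Math.exp(-y * (1.5976 + 0.070566 * y * y));
        return elapsedMillis > mean
            ? -Math.log10(e / (1.0 + e))
            : -Math.log10(1.0 - 1.0 / (1.0 + e));
    }

    /** True when the suspicion level for the elapsed time reaches the given threshold. */
    boolean isSuspect(double elapsedMillis, double threshold) {
        return phi(elapsedMillis) >= threshold;
    }
}
```

Under this model a phi of 1 means roughly a 10% chance that a healthy execution would still be running, 2 means about 1%, and so on, so the threshold expresses a tolerance for false positives rather than a fixed duration.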

Alternatively, this could be implemented as a Timeout option, where the timeout is stateful and adapts to execution time distributions.
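
As a rough illustration of what a stateful, adaptive timeout might look like, the sketch below derives the timeout from a smoothed mean and a smoothed deviation of recent execution times. It borrows the estimator that TCP uses for retransmission timeouts (RFC 6298), which is a swapped-in technique and not something proposed in this issue. The class `AdaptiveTimeoutSketch`, its methods, and the min/max bounds are hypothetical names chosen for the example, not Failsafe API.

```java
import java.time.Duration;

/** Hypothetical adaptive timeout using RFC 6298 style smoothing; not Failsafe API. */
final class AdaptiveTimeoutSketch {
    private static final double ALPHA = 0.125; // weight of a new sample in the smoothed mean
    private static final double BETA = 0.25;   // weight of a new sample in the smoothed deviation
    private static final double K = 4.0;       // how many deviations of headroom to allow

    private final Duration min;
    private final Duration max;
    private boolean hasSample;
    private double smoothedMillis;
    private double variationMillis;

    AdaptiveTimeoutSketch(Duration min, Duration max) {
        if (min.compareTo(max) > 0) throw new IllegalArgumentException("min must not exceed max");
        this.min = min;
        this.max = max;
    }

    /** Folds the duration of a completed execution into the running estimates. */
    synchronized void record(Duration executionTime) {
        double sample = executionTime.toNanos() / 1_000_000.0;
        if (!hasSample) {
            smoothedMillis = sample;
            variationMillis = sample / 2.0;
            hasSample = true;
        } else {
            // The deviation is updated first, using the previous smoothed mean.
            variationMillis = (1 - BETA) * variationMillis + BETA * Math.abs(smoothedMillis - sample);
            smoothedMillis = (1 - ALPHA) * smoothedMillis + ALPHA * sample;
        }
    }

    /** Returns the current timeout, clamped to [min, max]; falls back to max with no history. */
    synchronized Duration currentTimeout() {
        if (!hasSample) return max;
        long millis = Math.round(smoothedMillis + K * variationMillis);
        long clamped = Math.max(min.toMillis(), Math.min(max.toMillis(), millis));
        return Duration.ofMillis(clamped);
    }
}
```

Because recent samples carry more weight than old ones, the timeout drifts upward under sustained load and tightens again when executions speed up. The clamp keeps a single outlier from pushing the timeout to an unbounded value.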

One open question for this policy: as with a circuit breaker or rate limiter, at what point should it "reset" after triggering a failure, or should it reset at all?

Any ideas for how this should work or what the policy should be named are welcome!

@Tembrel
Contributor

Tembrel commented Sep 17, 2022

accural -> accrual

jhalterman changed the title from "Support phi accural failure detection" to "Support phi accrural failure detection" on Sep 17, 2022
@jhalterman
Member Author

For some reason my fingers always struggle with that one :)

@Tembrel
Contributor

Tembrel commented Sep 17, 2022 via email

jhalterman changed the title from "Support phi accrural failure detection" to "Support phi accrual failure detection" on Sep 17, 2022
@jhalterman
Member Author

jhalterman commented Sep 17, 2022

This is definitely a sign that the new policy should not be named PhiAccrual :) I like the idea of thinking about a new policy more generally, as something that measures a series of execution times, where phi accrual is maybe just one strategy for determining if those times represent a failure.
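
One way to picture that more general shape is a monitor that owns the window of execution times and delegates the "is this a failure?" decision to a pluggable strategy. The sketch below is only an assumption about how it could look: `DetectionStrategy`, `ExecutionTimeMonitor`, `meanMultiple`, and the manual `reset()` are hypothetical names, and the mean-multiple strategy is a deliberately simple placeholder where a phi accrual strategy could be plugged in instead.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical strategy: does this elapsed time look like a failure, given recent durations? */
interface DetectionStrategy {
    boolean isFailure(double[] recentMillis, double elapsedMillis);

    /** Placeholder strategy: fail when the elapsed time exceeds factor times the mean. */
    static DetectionStrategy meanMultiple(double factor) {
        return (recent, elapsed) -> {
            if (recent.length == 0) return false; // no history, so no opinion
            double sum = 0.0;
            for (double r : recent) sum += r;
            return elapsed > factor * (sum / recent.length);
        };
    }
}

/** Hypothetical stateful monitor; names are illustrative and not Failsafe API. */
final class ExecutionTimeMonitor {
    private final int windowSize;
    private final DetectionStrategy strategy;
    private final Runnable onFailure;
    private final Deque<Double> window = new ArrayDeque<>();
    private boolean tripped;

    ExecutionTimeMonitor(int windowSize, DetectionStrategy strategy, Runnable onFailure) {
        if (windowSize < 1) throw new IllegalArgumentException("windowSize must be >= 1");
        this.windowSize = windowSize;
        this.strategy = strategy;
        this.onFailure = onFailure;
    }

    /** Records a completed execution time, keeping only the most recent windowSize samples. */
    synchronized void record(double durationMillis) {
        window.addLast(durationMillis);
        if (window.size() > windowSize) window.removeFirst();
    }

    /** Asks the strategy about an elapsed time; runs the failure handler once until reset. */
    synchronized boolean check(double elapsedMillis) {
        double[] recent = window.stream().mapToDouble(Double::doubleValue).toArray();
        boolean failed = strategy.isFailure(recent, elapsedMillis);
        if (failed && !tripped) {
            tripped = true;
            onFailure.run(); // for example, fail over to another node
        }
        return failed;
    }

    /** Manual reset; when or whether to reset automatically is left open here. */
    synchronized void reset() {
        tripped = false;
    }
}
```

Keeping the strategy behind a single method means the policy name would not need to mention any one algorithm, and the reset behavior can be decided separately from how failures are detected.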

jhalterman changed the title from "Support phi accrual failure detection" to "Support accrual failure detection" on Sep 18, 2022
2 participants