Provide an async do on retry callback #2063

Closed
remk opened this issue Jan 24, 2020 · 5 comments · Fixed by #1979
Labels
type/enhancement A general enhancement
Milestone
3.3.4.RELEASE

Comments

@remk

remk commented Jan 24, 2020

Motivation

Retry.doOnRetry allows registering a synchronous callback. At the moment, if we want to run an asynchronous operation in this callback that must finish before the next retry attempt, we have to block on it. This can cause deadlocks and is not really usable. This was reported in the reactor/reactor Gitter channel by @mbaechler.

Hi there, I have a problem with using retryWhen on my Mono. Basically, what I want to do is:

    put a REST resource somewhere
    if it fails with a "parent resource doesn't exist" error, retry using Retry.onlyIf, with doOnRetry creating the parent resource

However, doOnRetry is called synchronously, so the REST request I perform there requires a call to block(), which in some cases triggers starvation of my threads.

Do you know any way to implement that in a different way?

Desired solution

It would therefore be good to be able to register an asynchronous callback, with an equivalent to Retry.doOnRetry, for example:

 public Retry<T> onRetryWithMono(Function<? super RetryContext<T>, Mono<?>> onRetryMono) 
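For illustration, a rough usage sketch of how the proposed method could combine with the existing reactor-extra builder. The method does not exist yet, and putResource, isParentMissing and createParentResource are hypothetical placeholders for the caller's own publisher, predicate and async fix-up call:

 putResource.retryWhen(
 	Retry.<String>onlyIf(ctx -> isParentMissing(ctx.exception())) // existing reactor-extra builder
 	     .onRetryWithMono(ctx -> createParentResource())          // proposed async callback
 	     .retryMax(3));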
remk referenced this issue in linagora/reactor-addons Jan 24, 2020
Add Retry.onRetryWithMono which is the async equivalent to the
Retry.doOnRetry operation.
This allows performing an async operation on retry without having to block,
which could cause deadlocks.
@rouazana

Here is a similar issue: https://stackoverflow.com/questions/55924849/on-error-do-another-call-and-retry-in-webflux/55937290

This would avoid the ugly workaround shown in the answers.

@simonbasle
Member

At first this felt like bending the intent of the Retry builder, which was initially designed around the combination of counters + delays. doOnRetry could also be seen as a side effect, and side effects are not supposed to introduce blocking behavior / latency.

That said, why not offer this way of further customizing the retry attempt. Two caveats:

  1. we need to come up with a better name for it
  2. the documentation must be very explicit that this changes the delay between two attempts, e.g. if you use it together with linearBackoff(Duration.ofSeconds(1)), the backoff is not guaranteed to be 1s anymore (depending on how long the custom Mono takes to terminate).

In balancing whether or not to add it, let's also consider that:

  1. the "ugly" code in the stackoverflow question stems from a more complex retry case than the gitter one, and can probably be made a bit more palatable by putting the Function in a constant, or as a class method reference
  2. Simpler cases where only an action + single retry is needed, the onErrorResume approach below can be used:
operation.retry(e -> isTransient(e), 10)
    .onErrorResume(PreconditionException.class, e -> prepareOperation().then(operation));

@simonbasle
Member

simonbasle commented Jan 24, 2020

The two alternatives with pure core operators:

//companion factory for retryWhen: on a precondition error run fixPrecondition,
//retry transient errors up to max times, otherwise propagate the error
public final Function<Flux<Throwable>, Publisher<String>> retryOrFixPrecondition(int max, Predicate<Throwable> canRetry,
		Predicate<Throwable> isPreconditionError, Mono<String> fixPrecondition) {
	return ft -> ft
			.index()
			.flatMap(idxError -> {
				if (isPreconditionError.test(idxError.getT2())) {
					return fixPrecondition;
				}
				if (!canRetry.test(idxError.getT2()) || idxError.getT1() > max) {
					throw Exceptions.propagate(idxError.getT2());
				}
				return Mono.just("retry");
			});
}

@Test
public void retrySpecialErrorWithPrecondition() {
	//simulate a source with transient errors + a precondition error that needs
	//specific handling then retry
	AtomicInteger count = new AtomicInteger(0);
	AtomicBoolean prerequisite = new AtomicBoolean();
	Mono<String> operation = Mono.defer(() -> {
		int c = count.getAndIncrement();
		if (c < 2) {
			throw new IllegalStateException("TRANSIENT FAILURE " + c);
		}
		if (prerequisite.get()) {
			return Mono.just("SUCCESS");
		}
		throw new IllegalArgumentException("PREREQUISITE");
	});

	//simulate a long running asynchronous operation that updates state to match
	//the precondition
	Mono<String> prepare = Mono.delay(Duration.ofSeconds(3))
	                           .doOnNext(ignored -> prerequisite.set(true))
	                           .thenReturn("retry");

	//predicate for errors that can be retried
	Predicate<Throwable> shouldRetry = p -> p instanceof IllegalStateException
			&& p.getMessage() != null
			&& p.getMessage().startsWith("TRANSIENT");

	//predicate for the error that reports precondition is not met
	Predicate<Throwable> isMissingPrerequisite = p -> p instanceof IllegalArgumentException
			&& "PREREQUISITE".equals(p.getMessage());

	//APPROACH 1, capable of retrying further in case of transient failures after
	//the prerequisite state is updated
	Mono<String> retryWhen = operation
			.retryWhen(retryOrFixPrecondition(10, shouldRetry, isMissingPrerequisite, prepare));

	//APPROACH 2, if you want to retry other exceptions but update prerequisite state
	// and re-attempt the operation only once
	Mono<String> retryResume = operation
			.retry(10, shouldRetry)
			.onErrorResume(isMissingPrerequisite, e -> prepare.then(operation));


	//testing all this
	retryWhen.as(StepVerifier::create)
	         .expectNext("SUCCESS")
	         .verifyComplete();
	count.set(0); prerequisite.set(false);
	retryResume.as(StepVerifier::create)
	           .expectNext("SUCCESS")
	           .verifyComplete();
}

@remk
Author

remk commented Jan 27, 2020

Thank you for your suggestions, but it means we would have to re-implement the backoff and jitter we currently rely on in Retry.

@simonbasle
Member

@remk I see. We'll discuss this. Also note that we're thinking about bringing the builder pattern into the core library in parallel, so maybe that new feature should be part of this effort core-side rather than addons-side.

@simonbasle simonbasle transferred this issue from reactor/reactor-addons Mar 2, 2020
@simonbasle simonbasle added the type/enhancement A general enhancement label Mar 2, 2020
simonbasle added a commit that referenced this issue Mar 2, 2020
Also use concatMap instead of flatMap to be 100% sure and set a sane
precedent.
simonbasle added a commit that referenced this issue Mar 2, 2020
simonbasle added a commit that referenced this issue Mar 9, 2020
simonbasle added a commit that referenced this issue Mar 10, 2020
simonbasle added a commit that referenced this issue Mar 16, 2020
simonbasle added a commit that referenced this issue Mar 18, 2020
This big commit is a large refactor of the `retryWhen` operator in order
to add several features.

Fixes #1978
Fixes #1905
Fixes #2063
Fixes #2052
Fixes #2064

 * Expose more state to `retryWhen` companion (#1978)

This introduces a retryWhen variant based on a `Retry` functional
interface. This "function" deals not with a Flux of `Throwable` but with a
Flux of `RetrySignal`. This allows the retry function to check whether there
was some success (onNext) since the last retry attempt, in which case the
current attempt can be interpreted as if it were the first ever error.

This is especially useful for cases where exponential backoff delays
should be reset, for long lived sequences that only see intermittent
bursts of errors (transient errors).

We take that opportunity to offer a builder for such a function that
could take transient errors into account.

 * the `Retry` builders

Inspired by the `Retry` builder in addons, we introduce two classes:
`RetrySpec` and `RetryBackoffSpec`. We name them Spec and not Builder
because they don't require calling a `build()` method. Rather, each
configuration step produces A) a new instance (copy on write) that B)
is by itself already a `Retry`.

The `Retry` + `xxxSpec` approach allows us to offer 2 standard
strategies that both support transient error handling, while letting
users write their own strategy (either as a standalone `Retry` concrete
implementation, or as a builder/spec that builds one).

Both specs allow handling of transient errors via `transientErrors(boolean)`,
which when true relies on the extra state exposed by the `RetrySignal`. For
the simple case, this means that the remaining number of retries is reset in
case of an onNext. For the exponential case, this means the retry delay is
reset to its minimum after an onNext (#1978).
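For example, a minimal sketch of such a spec (assuming the new `reactor.util.retry.Retry` entry point; `source` stands for any failing sequence):

Retry strategy = Retry.backoff(5, Duration.ofMillis(100)) //exponential backoff, 100ms minimum
                      .transientErrors(true);             //delay resets to the minimum after each onNext
source.retryWhen(strategy);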

Additionally, the introduction of the specs allows us to add more
features and support some features on more combinations, see below.

 * `filter` exceptions (#1905)

 Previously we could only filter exceptions to be retried on the simple
 long-based `retry` methods. With the specs we can `filter` in both
 immediate and exponential backoff retry strategies.
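A sketch of the same filter applied to both flavors (the exception type is only an example):

Retry simple  = Retry.max(3)
                     .filter(e -> e instanceof TimeoutException);
Retry backoff = Retry.backoff(3, Duration.ofMillis(100))
                     .filter(e -> e instanceof TimeoutException);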

 * Add pre/post attempt hooks (#2063)

The specs let the user configure two types of pre/post hooks.
Note that if the retry attempt is denied (e.g. we've reached the maximum
number of attempts), these hooks are NOT executed.

Synchronous hooks (`doBeforeRetry` and `doAfterRetry`) are side effects
that should not block for too long and are executed right before and
right after the retry trigger is sent by the companion publisher.

Asynchronous hooks (`doBeforeRetryAsync` and `doAfterRetryAsync`) are
composed into the companion publisher which generates the triggers, and
they both delay the emission of said trigger in non-blocking and
asynchronous fashion. Having pre and post hooks allows a user to better
manage the order in which these asynchronous side effects should be
performed.
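As an illustration of this issue's original use case, a sketch of an async pre-retry hook (log and createParentResource() are placeholders for the user's own logger and asynchronous fix-up call):

Retry withHooks = Retry.max(3)
        .doBeforeRetry(rs -> log.info("retrying after " + rs.failure()))  //synchronous side effect
        .doBeforeRetryAsync(rs -> createParentResource().then());         //retry trigger waits for completion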

 * Retry exhausted meaningful exception (#2052)

The `Retry` function implemented by both specs throws a `RuntimeException`
with a meaningful message when the configured maximum number of attempts
is reached. That exception can be pinpointed by calling the utility
`Exceptions.isRetryExhausted` method.

For further customization, users can replace that default with their
own custom exception via `onRetryExhaustedThrow`. The BiFunction lets the
user access the Spec, which has public final fields that can be
used to produce a meaningful message.
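For instance, a sketch replacing the default exhaustion exception (message and exception type are just examples):

Retry strategy = Retry.backoff(3, Duration.ofMillis(100))
        .onRetryExhaustedThrow((spec, rs) ->
                new IllegalStateException("failed after " + rs.totalRetries() + " retries", rs.failure()));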

 * Ensure retry hooks completion is taken into account (#2064)

The old `retryBackoff` would internally use a `flatMap`, which can
cause issues. The Spec functions use `concatMap`.

 /!\ CAVEAT

This commit deprecates all of the retryBackoff methods as well as the
original `retryWhen` (based on Throwable companion publisher) in order
to introduce the new `RetrySignal` based signature.

Using the explicit `Retry` type lifts any ambiguity when using the Spec,
but using a lambda instead will raise some ambiguity at call sites of
`retryWhen`.

We deem that acceptable given that the migration is quite easy
(turn `e -> whatever(e)` to `(Retry) rs -> whatever(rs.failure())`).
Furthermore, `retryWhen` is an advanced operator, and we expect most
uses to be combined with the retry builder in reactor-extra, which lifts
the ambiguity itself.
@simonbasle simonbasle added this to the 3.3.4.RELEASE milestone Mar 18, 2020