Skip to content

retryDelayOptions.customBackoff function called for non-retryable errors #3401

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jgehrcke opened this issue Aug 17, 2020 · 13 comments
Closed
Assignees
Labels
bug This issue is a bug.

Comments

@jgehrcke
Copy link
Contributor

SDK version number
v2.730

I noticed that retryDelayOptions.customBackoff() is being called once for non-retryable (e.g., 404-style) errors. The return value is subsequently ignored (no retry is going to happen).

For non-retryable errors there is one call too much which often times is not a problem but of course it can be for side effects triggered by the customBackoff() implementation.

Docs: https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/Config.html#retryDelayOptions-property

That problem was introduced with #2898:

Before #2898:

if (err && err.retryable && retryCount < maxRetries) {
    retryCount++;
    var delay = util.calculateRetryDelay(retryCount, options.retryDelayOptions);
    setTimeout(sendRequest, delay + (err.retryAfter || 0));
}

Note that here calculateRetryDelay() was only ever called when error information was available (err was set) and when err.retryable was true.

After #2898:

var delay = util.calculateRetryDelay(retryCount, options.retryDelayOptions, err);
if (err && err.retryable && retryCount < maxRetries && delay >= 0) {
    retryCount++;
    setTimeout(sendRequest, delay + (err.retryAfter || 0));
}

Note that util.calculateRetryDelay() is always called, w/o inspecting err.retryable.

Notably, it is even getting called when err is not set (which seems to be the case for 404-style API responses!).

However, the actual retry is then only triggered when err && err.retryable -- just like before the patch. That is, in that case the return value of util.calculateRetryDelay() (calling retryDelayOptions.customBackoff()) is ignored.

Proposal:

  • Call retryDelayOptions.customBackoff() only when the library internals have determined that err && err.retryable.
  • Document that behavior.

Future direction:
If if is up for dispute whether or not a specific error is retryable and custom code should have a word then maybe another hook is required. For example, a hook where calling code can inspect the API response in detail and set err and/or err.retryable. All that happening before retryDelayOptions.customBackoff() gets invoked.

I'd be happy to work on a patch, and in fact might get to doing so still today.

@jgehrcke jgehrcke added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Aug 17, 2020
jgehrcke added a commit to jgehrcke/aws-sdk-js that referenced this issue Aug 17, 2020
See aws#3401.

Call it only upon retryable errors, and only
when maxRetries is not yet reached.
jgehrcke added a commit to jgehrcke/aws-sdk-js that referenced this issue Aug 17, 2020
jgehrcke added a commit to jgehrcke/aws-sdk-js that referenced this issue Aug 17, 2020
See aws#3401.

Call it only upon retryable errors, and only
when maxRetries is not yet reached.
jgehrcke added a commit to jgehrcke/aws-sdk-js that referenced this issue Aug 17, 2020
@ajredniwja ajredniwja removed the needs-triage This issue or PR still needs to be triaged. label Aug 18, 2020
@ajredniwja
Copy link
Contributor

Hey @jgehrcke, thank-you for bringing this up, your initial PR looks good, can you complete all the tests?

@ajredniwja ajredniwja added the response-requested Waiting on additional info and feedback. Will move to \"closing-soon\" in 7 days. label Aug 18, 2020
@jgehrcke
Copy link
Contributor Author

jgehrcke commented Aug 18, 2020 via email

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to \"closing-soon\" in 7 days. label Aug 19, 2020
@ajredniwja
Copy link
Contributor

@jgehrcke I mean all the checklist tests in your PR

@jgehrcke
Copy link
Contributor Author

jgehrcke commented Aug 21, 2020

I mean all the checklist tests in your PR

@ajredniwja I still have a hard time understanding what "the checklist tests" are. Are you asking me to fill out the checklist in the PR template? I did that initially, commented on each checklist item that I did not check.

I have now also run npm run test and checked that checklist item, and added the outcome to the PR.

Maybe this is still a misunderstanding... .:-). Happy to resolve that.

@ajredniwja
Copy link
Contributor

@jgehrcke You're good, I'll have a look at the PR?

@jgehrcke
Copy link
Contributor Author

You're good, I'll have a look at the PR?

Thanks and yes, please :-).

@jgehrcke
Copy link
Contributor Author

jgehrcke commented Sep 3, 2020

@ajredniwja would appreciate if we could move this forward!! Thank you :-) Anything I can still do?

@jgehrcke
Copy link
Contributor Author

@ajredniwja @AllanFly120 would love for this to not stall. Any help is appreciated. Thank you!

@jgehrcke
Copy link
Contributor Author

jgehrcke commented Oct 1, 2020

@trivikr is that something you can push forward? Thank you!

@springmeyer
Copy link

Another aws-sdk user / premium aws customer who is watching this issue and looking forward to the fix landing (since I use the customBackoff and noticed this PR fixing the inefficiency. Thanks for the work here @jgehrcke!

@ajredniwja
Copy link
Contributor

The PR was merged. Apologies for the delay.

@jgehrcke
Copy link
Contributor Author

jgehrcke commented Oct 2, 2020

Thank you @ajredniwja and thanks for chiming in @springmeyer.

@peterlundberg
Copy link

peterlundberg commented Nov 19, 2021

Have noticed that there is still a problem with customBackoff being called also for errors that are not retryable (such as 404 from S3).

In that case event_listeners will trigger Request.RETRY_CHECK -> service.retryDelays -> calculateRetryDelay -> customBackoff.

Thus one needs a if (error.retryable) {} guard to avoid inefficiencies like calculating the delay or logging the retry attempt. This should at least be documented if true?

Would you prefer re-opening this issue or creating a new one?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This issue is a bug.
Projects
None yet
Development

No branches or pull requests

4 participants