Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Streaming HTTP requests retried after sending data #2311

Closed
3 tasks done
grddev opened this issue May 19, 2020 · 5 comments
Closed
3 tasks done

Streaming HTTP requests retried after sending data #2311

grddev opened this issue May 19, 2020 · 5 comments
Assignees
Labels
bug This issue is a bug. help wanted We are asking the community to submit a PR to resolve this issue. investigating Issue is being investigated

Comments

@grddev
Copy link

grddev commented May 19, 2020

Confirm by changing [ ] to [x] below to ensure that it's a bug:

Describe the bug
If a networking error is raised during processing of a streaming response, such as an S3 download, the request is retried. This is problematic, as the block received the same first chunk of the file three times, without any indication that the download was effectively restarted. I fixed this behavior for aws-sdk-v1 back in the day, but it seems the fix wasn't carried over to aws-sdk-v3.

Gem name ('aws-sdk', 'aws-sdk-resources' or service gems like 'aws-sdk-s3') and its version
aws-sdk-core (3.82.0)
aws-sdk-s3 (1.57.0)

Version of Ruby, OS environment
-paste the output of ruby -v
ruby 2.5.5p157 (2019-03-15 revision 67260) [x86_64-darwin18]

To Reproduce (observed behavior)
In order to reproduce the error in IRB, I simulate a socket error by raising it from the processing callback, and in the debug output, you can see that the request was retried three times.

irb(main):001:0> Aws::S3::Client.new(logger: Logger.new(STDERR), log_level: :debug).get_object(bucket: 'bkt', key: 'key') { raise SocketError }
D, [...] DEBUG -- : [Aws::S3::Client 200 4.182303 3 retries] get_object(bucket:"bkt",key:"key") Seahorse::Client::NetworkingError SocketError

Technically, the fact that I can simulate this by raising from within the block is a bug in itself, as the net-http-component shouldn't pick up a socket error from within the callback and translate into retryable networking error, as this is a completely unrelated error.

Expected behavior
The download error should not be retried after the callback block has been invoked, or there needs to be a separate callback for errors, or the callback needs to be supplied another argument that would indicate a restart.

@grddev grddev added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels May 19, 2020
@grddev
Copy link
Author

grddev commented May 20, 2020

It seems as if this is caused by the resetting of the body here:

context.http_response.body = StringIO.new

That is triggered before the retry logic, and thus the retry-logic, which is already actively checking whether the response body is truncatable, doesn't actually see the real body, and is therefore unable to prevent retries.

This overriding of the response body seems to have been introduced in ba932c1, but there doesn't seem to be any motivation for the response body change in that commit, as the issue pertains to handling error responses. In #1877 the behavior was fixed for one corner-case, but this issue indicates it needs to be fixed for other cases as well. Perhaps it should be disabled altogether?

The only other option I can think of would be to introduce separate replacement bodies depending on the type of body issued, or a generic wrapper, but neither option really seems reasonable.

@ajredniwja
Copy link
Member

Hey @grddev thank-you for reaching out to us with your issue, I was able to reproduce it, I am not sure if this will be prioritized until next week to be investigated and hopefully fixed over the other issues. We would appreciate if you'd like to contribute and open a PR for us to review as well.

@ajredniwja ajredniwja added investigating Issue is being investigated and removed needs-triage This issue or PR still needs to be triaged. labels May 21, 2020
@mullermp mullermp added the help wanted We are asking the community to submit a PR to resolve this issue. label May 26, 2020
@alextwoods
Copy link
Contributor

I've been looking into this and reviewing (and updating) an existing PR that we have open: #1617 that may (mostly) address this.

Our documentation for streaming blocks currently mentions: # WARNING: yielding data to a block disables retries of networking errors; however, obviously we are trying in this case and it leads to a confusing and incorrect experience. Because the body is set to the BlockIO only for successful (http status 2XX) responses, retries will function as expected for most cases except body truncation issues (and this is roughly what raising a SocketError in the block is equivalent to I believe).

#1617 attempts to fix this by keeping track of the bytes that have been sent to the body IO already and not re-sending them. However, this does not work for the S3 Encryption::Client (as the number of unencrypted vs encrypted bytes will differ) and the block will still be called multiple times with the same data.

@alextwoods
Copy link
Contributor

Merged #1617 which addresses this.

@grddev
Copy link
Author

grddev commented Jun 12, 2020

Thank you for looking into this @alextwoods. I was surprised that there was a PR given that I searched the issues and didn't find anything, but this isn't the first time I'm fooled by the separation of issues and PR:s. Sorry for not reaching out earlier, but the current solution looks reasonable to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This issue is a bug. help wanted We are asking the community to submit a PR to resolve this issue. investigating Issue is being investigated
Projects
None yet
Development

No branches or pull requests

4 participants