Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fog::AWS::Storage don't retry client errors #691

Merged
merged 2 commits into from Sep 29, 2023

Conversation

rahim
Copy link
Contributor

@rahim rahim commented Sep 25, 2023

Resolves #690

Excon's default configuration retries HTTP 4xx class errors, which we generally can't expect to be resolved by retry. Combined with relatively recent retry config added in #674 we were seeing S3 calls to non-existent objects take multiple seconds when made via fog-aws.

See the issue for detailed explanation of the underlying cause.

Though this resolves the problem, I question whether it's the right approach:

  • it's odd that the default retry behaviour (and connection options) for Fog::AWS::Storage are different to all the other services covered by fog-aws.
  • we create something of a foot gun, because someone who chooses to just configure eg retry_limit (or any other arbitrary connection option) will inadvertently set retry_errors back to Excon's default. (addressed in later commit)
  • possibly this is better addressed in Excon, I've opend Default retry behaviour includes HTTP 4xx client errors excon/excon#833 to ask that question there

@rahim
Copy link
Contributor Author

rahim commented Sep 26, 2023

we create something of a foot gun, because someone who chooses to just configure eg retry_limit (or any other arbitrary connection option) will inadvertently set retry_errors back to Excon's default.

Across fog-aws we have the pattern @connection_options = options[:connection_options] || {}

lib/fog/aws/efs.rb:75:          @connection_options = options[:connection_options] || {}
lib/fog/aws/elb.rb:133:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/beanstalk.rb:68:          @connection_options = options[:connection_options] || {}
lib/fog/aws/cdn.rb:148:          @connection_options = options[:connection_options] || {}
lib/fog/aws/cloud_watch.rb:90:          @connection_options = options[:connection_options] || {}
lib/fog/aws/sts.rb:81:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/redshift.rb:81:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/cloud_formation.rb:66:          @connection_options = options[:connection_options] || {}
lib/fog/aws/support.rb:90:          @connection_options = options[:connection_options] || {}
lib/fog/aws/data_pipeline.rb:107:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/dns.rb:100:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/ses.rb:55:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/emr.rb:63:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/glacier.rb:187:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/lambda.rb:85:          @connection_options = options[:connection_options] || {}
lib/fog/aws/auto_scaling.rb:96:          @connection_options = options[:connection_options] || {}
lib/fog/aws/kinesis.rb:39:          @connection_options = options[:connection_options] || {}
lib/fog/aws/kms.rb:94:          @connection_options = options[:connection_options] || {}
lib/fog/aws/storage.rb:549:          @connection_options     = options[:connection_options] || { retry_limit: 5, retry_interval: 1 }
lib/fog/aws/iam.rb:264:          @connection_options = options[:connection_options] || {}
lib/fog/aws/sns.rb:83:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/sqs.rb:81:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/ecs.rb:62:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/dynamodb.rb:80:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/rds.rb:217:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/federation.rb:39:          @connection_options = options[:connection_options] || {}
lib/fog/aws/simpledb.rb:74:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/compute.rb:554:          @connection_options           = options[:connection_options] || {}

That generally makes sense; if connection options aren't set, ensure we pass an empty hash to Excon.

For the custom retry settings that have been applied only to Storage, we'd be better to merge any specified connection options over the defaults. I'll make that change.

@geemus
Copy link
Member

geemus commented Sep 26, 2023

I hope I covered it relatively clearly in the other issues, but just in case I'll briefly say that the idea (if I recall) was to make it easier to deal with the eventually consistent nature of AWS. ie if you create an object and then want to address it, you'd like that to just work even if it takes a little bit longer. That said, cases like what you point out show how that was inadvertently applied much too broadly and by seeking to make that case easier it introduces a lot of lag for other use cases. I'm not yet sure what the best approach or solution really is, but tried to leave some ideas and ask some questions across the issues, so I hope we'll be able to talk through them and figure something out. That being said, I'd like to chat a bit more before proceeding with this or similar patches to make sure we've considered the options and impacts thoroughly. Thanks.

Copy link
Member

@geemus geemus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After our discussions, I believe this should fix issues that some folks were having and not negatively impact the eventual consistency things it was originally set out to address (as AWS seems to have fixed that on their side of things).

@geemus geemus merged commit cc42f32 into fog:master Sep 29, 2023
11 checks passed
@geemus
Copy link
Member

geemus commented Sep 29, 2023

@rahim did you need a release with this?

@rajyan
Copy link
Contributor

rajyan commented Sep 29, 2023

Thank you very much for looking into this issue and thank you for your swift responses!

I’m not the author of this PR, but I think this issue reported in carrierwave seems related
carrierwaveuploader/carrierwave#2697
so I would be grateful if you could release it.

@geemus
Copy link
Member

geemus commented Sep 29, 2023

@rajyan Thanks, I hadn't seen that issue, though I'm not certain it will totally fix it either (it should hopefully at least help).

Released in v3.21.0.

@rahim rahim deleted the dont-retry-s3-client-errors branch September 29, 2023 16:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fog::AWS::Storage default retry behaviour guarantees 6s delay for 4xx class responses
3 participants