gcp: make per-chunk retry upload timeout configurable #80474
Open to discussing this PR more, since the 60-second value was chosen unscientifically. Our backups target 128MB files, which GCS then splits into chunks of the default 16MB size. This comment, googleapis/google-api-go-client#685 (comment), was interesting, especially in light of the throttling we are adding to external storage write paths. If a write is destined to fail, the longer timeout means the backup will take longer to reach a failed state.
This change adds a cluster setting `cloudstorage.gs.chunking.retry_timeout` that can be used to change the default per-chunk retry timeout that GCS imposes when chunked file upload is enabled. The default value is set to 60 seconds, which is double the default Google SDK value of 30s.

This change was motivated by sporadic occurrences of a 503 service unavailable error during backups. On its own, this change is not expected to solve the resiliency issues of backup when the upload service is unavailable, but it is nice to have a configurable setting nonetheless.

Release note (sql change): `cloudstorage.gs.chunking.retry_timeout` is a cluster setting that can be used to configure the per-chunk retry timeout of file uploads to Google Cloud Storage. The default value is 60 seconds.
c9b7f33 to ac2387e
TFTR! bors r=dt
Build succeeded