Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gensupport: resumable uploads shouldn't use a hardcoded retryDeadline #685

Closed
hphilipps opened this issue Oct 1, 2020 · 4 comments · Fixed by #1414
Closed

gensupport: resumable uploads shouldn't use a hardcoded retryDeadline #685

hphilipps opened this issue Oct 1, 2020 · 4 comments · Fixed by #1414
Assignees
Labels
api: storage Issues related to the Cloud Storage API. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

@hphilipps
Copy link

There is a hardcoded retryDeadline of 32s for each chunk of a resumable upload which seems rather short for reliable uploads: https://github.com/googleapis/google-api-go-client/blob/master/internal/gensupport/resumable.go#L27

When doing many resumable uploads to GCS we often encounter 503 responses for some objects lasting for longer than 32s, which makes the whole upload for that object fail.

It would be great if we could just use the deadline of the context for timing out retries or make the retryDeadline an option.

@codyoss
Copy link
Member

codyoss commented Oct 1, 2020

cc @tritone

@codyoss codyoss added the api: storage Issues related to the Cloud Storage API. label Oct 1, 2020
@yoshi-automation yoshi-automation added the triage me I really want to be triaged. label Oct 2, 2020
agneum added a commit to agneum/google-api-go-client that referenced this issue Oct 2, 2020
@codyoss codyoss added type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. and removed triage me I really want to be triaged. labels Oct 5, 2020
@rogerhub
Copy link

I keep running into HTTP 502 errors on GCS writes, and it took an embarrasingly long time to figure out this was the root cause. The hardcoded 32 second timeout means that @tritone's recent work on improving GCS retries in googleapis/google-cloud-go#5210 is basically useless for me.

On my peasant residential internet connection, the default chunk size (16MiB) takes longer than 32 seconds to upload (especially considering concurrent writes, intentional network throttling, and the unfortunate realities of cable internet), so there's no time left for retries if the first attempt fails. What's worse is that since this value is hardcoded, I had to patch google-api-go-client in order to extend the timeout.

diff --git a/internal/gensupport/retry.go b/internal/gensupport/retry.go
index 4a4861b..8674bbb 100644
--- a/internal/gensupport/retry.go
+++ b/internal/gensupport/retry.go
@@ -21,7 +21,7 @@ type Backoff interface {
 // These are declared as global variables so that tests can overwrite them.
 var (
        // Per-chunk deadline for resumable uploads.
-       retryDeadline = 32 * time.Second
+       retryDeadline = 3600 * time.Second
        // Default backoff timer.
        backoff = func() Backoff {
                return &gax.Backoff{Initial: 100 * time.Millisecond}

Please consider increasing the default or making this configurable (or both). Thanks! :)

@codyoss codyoss assigned tritone and unassigned codyoss Jan 31, 2022
tritone added a commit to tritone/google-api-go-client that referenced this issue Feb 1, 2022
Allow users to configure the per-chunk deadline for retries
that's used during resumable uploads.

Needs to be exposed via the manual layer for storage.

Fixes googleapis#685
@tritone
Copy link
Contributor

tritone commented Feb 1, 2022

Thanks @rogerhub for the report, and apologies for dropping the ball on this issue earlier. I have a PR up at #1414 to resolve; this would have to be exposed on the manual client via a field on the Writer. Feel free to take a look!

tritone added a commit that referenced this issue Feb 1, 2022
Allow users to configure the per-chunk deadline for retries
that's used during resumable uploads.

Needs to be exposed via the manual layer for storage.

Fixes #685
@rogerhub
Copy link

rogerhub commented Feb 2, 2022

Thanks for the quick action on this @tritone!

tritone added a commit to tritone/google-cloud-go that referenced this issue Feb 9, 2022
Expose the ChunkRetryDeadline MediaOption through the manual
client layer. This allows users to set a longer deadline for
chunk retries in resumable uploads if desired.

I also considered this as a RetryOption, but 1. it's only
relevant for uploads and 2. it should be configured in
conjunction with ChunkSize which is a field on Writer.

Updates googleapis/google-api-go-client#685
tritone added a commit to googleapis/google-cloud-go that referenced this issue Feb 10, 2022
Expose the ChunkRetryDeadline MediaOption through the manual
client layer. This allows users to set a longer deadline for
chunk retries in resumable uploads if desired.

I also considered this as a RetryOption, but 1. it's only
relevant for uploads and 2. it should be configured in
conjunction with ChunkSize which is a field on Writer.

Updates googleapis/google-api-go-client#685
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: storage Issues related to the Cloud Storage API. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.
Projects
None yet
5 participants