upload_file return value #82

Open
nathan-muir opened this issue Feb 9, 2017 · 6 comments

nathan-muir commented Feb 9, 2017

Both PutObject and CompleteMultipartUpload respond with data that includes the VersionId and ETag. [1] [2]

It would be really useful if S3Transfer.upload_file could return this response, or some part of the response.
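
For context, here's a minimal sketch of the gap (the bucket and key names are illustrative): the low-level put_object call exposes the ETag and, on a versioned bucket, the VersionId in its response, while the managed upload_file call currently returns None.

import boto3

client = boto3.client("s3")

# Low-level call: the response dict carries ETag and, if the bucket is
# versioned, VersionId.
resp = client.put_object(Bucket="my-bucket", Key="report.csv", Body=b"data")
print(resp["ETag"], resp.get("VersionId"))

# Managed transfer: handles multipart for large files, but the response
# (and with it the new VersionId) is discarded.
result = client.upload_file("report.csv", "my-bucket", "report.csv")
assert result is None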

dstufft commented Feb 14, 2017

Thanks, marking this as a feature enhancement.

eode commented Dec 4, 2018

Looks like this isn't going anywhere. This actually destroys the viability of using the s3transfer manager in any case where more than one version could potentially be uploaded, since one can't guarantee that the data from a subsequent 'head' call refers to the object that was just uploaded; S3 is eventually consistent.

That's a pretty bad breakage, rather than just a feature request.
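
To spell out the race with a minimal sketch (illustrative names again): the HEAD that follows the upload can be served stale data or observe a version written by another client in the meantime, so its VersionId cannot be trusted to describe the upload that just finished.

import boto3

client = boto3.client("s3")

client.upload_file("report.csv", "my-bucket", "report.csv")  # returns None

# If another writer PUTs the same key between these two calls, or the
# read is stale, this VersionId may belong to someone else's upload,
# and there is no way to tell from here.
head = client.head_object(Bucket="my-bucket", Key="report.csv")
maybe_wrong_version_id = head.get("VersionId")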

isobit commented May 28, 2019

Does anyone know of a workaround, or do we have to resort to not using s3transfer? As far as I can tell, without this it is impossible to determine which version was just uploaded, due to race conditions with a subsequent HEAD, as @eode mentioned.

@benmanns

A few options I've thought of to work around this:

  1. Use a unique key that will never be chosen again, e.g. upload to a UUID-named key and then HEAD that object to get the version ID.
  2. Pass a unique Metadata key and value in ExtraArgs, then verify that metadata when working out which version the upload produced (see the sketch after this list).
  3. If all you care about is that every object's version ID gets copied somewhere, you can subscribe to the bucket's S3 events and push object creation/update metadata to an outside data store.
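
A rough sketch of option 2, assuming a versioned bucket (the bucket, key, and metadata names are illustrative, and list_object_versions may have to page through many versions on hot keys):

import uuid
import boto3

client = boto3.client("s3")
upload_token = str(uuid.uuid4())

client.upload_file(
    "report.csv", "my-bucket", "report.csv",
    ExtraArgs={"Metadata": {"upload-token": upload_token}},
)

# Walk the key's versions and pick the one carrying our token.
version_id = None
paginator = client.get_paginator("list_object_versions")
for page in paginator.paginate(Bucket="my-bucket", Prefix="report.csv"):
    for version in page.get("Versions", []):
        if version["Key"] != "report.csv":
            continue
        head = client.head_object(
            Bucket="my-bucket", Key="report.csv", VersionId=version["VersionId"]
        )
        if head["Metadata"].get("upload-token") == upload_token:
            version_id = version["VersionId"]
            break
    if version_id:
        break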

toojays commented May 11, 2021

I have a workaround for s3transfer.manager.TransferManager, which is what boto3 uses. Monkeypatch PutObjectTask and CompleteMultipartUploadTask so they actually return the response from the S3 client call. This fixes boto3's Bucket.upload_fileobj() and friends.

import s3transfer.upload
import s3transfer.tasks


class PutObjectTask(s3transfer.tasks.Task):
    # Copied from s3transfer/upload.py, changed to return the result of client.put_object.
    def _main(self, client, fileobj, bucket, key, extra_args):
        with fileobj as body:
            return client.put_object(Bucket=bucket, Key=key, Body=body, **extra_args)


class CompleteMultipartUploadTask(s3transfer.tasks.Task):
    # Copied from s3transfer/tasks.py, changed to return a result.
    def _main(self, client, bucket, key, upload_id, parts, extra_args):
        print(f"Multipart upload {upload_id} for {key}.")
        return client.complete_multipart_upload(
            Bucket=bucket,
            Key=key,
            UploadId=upload_id,
            MultipartUpload={"Parts": parts},
            **extra_args,
        )


s3transfer.upload.PutObjectTask = PutObjectTask
s3transfer.upload.CompleteMultipartUploadTask = CompleteMultipartUploadTask
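
If I'm reading boto3's helpers right, upload_fileobj returns the transfer future's result(), so with the patch above imported first, something like this sketch (illustrative bucket and key names) should surface the response:

import io
import boto3

s3 = boto3.resource("s3")
bucket = s3.Bucket("my-bucket")

# With the patched tasks, the transfer future's result() is the
# PutObject (or CompleteMultipartUpload) response, and upload_fileobj
# passes it through instead of returning None.
response = bucket.upload_fileobj(io.BytesIO(b"hello world"), "example-key")
print(response.get("ETag"), response.get("VersionId"))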

@mdavis-xyz

What's the status of this? If we put that monkey patch into a pull request, will that fix the problem?

hoshimura added a commit to hoshimura/s3transfer that referenced this issue on Sep 16, 2022:
implements workaround in issue upload_file return value
[boto#82](boto#82)
and delete, copy, download tasks where a client response
is made available to the transfer future result() return

Change-Id: I434a0af1155cf66536d2222a097752a13ebb8d6a