upload_fileobj is always multipart #333

Closed
plotlogic-andrew opened this issue Apr 30, 2024 · 3 comments

@plotlogic-andrew

  • Async AWS SDK for Python version: 12.4.0
  • Python version: 3.12
  • Operating System: Fedora 36

Description

upload_fileobj always does multipart uploads. This results in:

  • small uploaded files get different ETags than they would if the standard boto3.upload_fileobj were used
  • (presumably) 3 x the API calls for small files (create, upload part, complete vs. a single upload)
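
The ETag difference can be reproduced locally. The following is a minimal sketch, assuming the commonly observed S3 behaviour for objects that are not encrypted with SSE-KMS or SSE-C: a plain upload gets the hex MD5 of the body as its ETag, while a multipart upload gets the MD5 of the concatenated per-part MD5 digests followed by `-<part count>`. The multipart format is observed behaviour rather than a documented guarantee, and the helper names `single_part_etag` and `multipart_etag` are hypothetical.

```python
import hashlib


def single_part_etag(data: bytes) -> str:
    """ETag typically seen for a single PutObject: hex MD5 of the body.

    Assumption: the object is not encrypted with SSE-KMS or SSE-C.
    """
    return hashlib.md5(data).hexdigest()


def multipart_etag(data: bytes, chunksize: int) -> str:
    """ETag typically seen for a multipart upload (observed, not guaranteed).

    MD5 over the concatenated binary MD5 digests of each part,
    suffixed with "-<number of parts>".
    """
    # Split the payload into parts of at most `chunksize` bytes.
    parts = [data[i:i + chunksize] for i in range(0, len(data), chunksize)] or [b""]
    digests = b"".join(hashlib.md5(part).digest() for part in parts)
    return f"{hashlib.md5(digests).hexdigest()}-{len(parts)}"


payload = b"hello"
print(single_part_etag(payload))
print(multipart_etag(payload, chunksize=8 * 1024 * 1024))
```

Even for a payload that fits in one part, the two values differ, which is why a small file uploaded as multipart does not match the ETag of the same file uploaded in one request.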

What I Did

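    import asyncio

    import aioboto3
    import boto3

    file = 'test.txt'
    # Create an empty local file to upload.
    with open(file, "wb") as fp:
        pass

    bucket, key = ...

    # Synchronous upload with boto3, for comparison.
    s3_client = boto3.client('s3')
    s3_client.upload_file(file, bucket, key + '/test-sync.txt')

    async def main():
        # Same upload through aioboto3.
        async with aioboto3.Session().client('s3') as s3_client:
            await s3_client.upload_file(file, bucket, key + '/test-async.txt')

    asyncio.run(main())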

The upload_fileobj code makes reference to multipart_chunksize but doesn't use it.

@terricain terricain self-assigned this Apr 30, 2024
@terricain terricain added the bug label Apr 30, 2024
@terricain
Owner

Yeah, I've not looked at the S3 transfer code in years, but I'm sure it should not multipart if the file is small enough. Will look at fixing it at some point.

@plotlogic-andrew
Author

I'd call it a feature request more than a bug :-).

I'm happy to contribute, but after a quick look the code for upload_fileobj coming from botocore seemed opaque at best to me. I'll look again when I have more time.

And thanks for putting this together! I've always been bothered that boto3 defaults to 10 threads to keep an upload pipe full.

@terricain
Owner

s3.upload_fileobj now respects Config.multipart_threshold and will issue a single s3.put_object if the file is below the threshold.
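
To illustrate the effect on request counts, here is a rough model of the decision described above. It is a sketch only, not aioboto3's actual code: `plan_upload` is a hypothetical helper, and the 8 MiB defaults are assumed to mirror boto3's `TransferConfig` defaults for `multipart_threshold` and `multipart_chunksize`.

```python
import math

MIB = 1024 * 1024


def plan_upload(size: int,
                multipart_threshold: int = 8 * MIB,
                multipart_chunksize: int = 8 * MIB) -> list[str]:
    """Return the S3 operations a threshold-aware upload would issue.

    Hypothetical model: below the threshold a single PutObject is used,
    otherwise the upload is split into parts of `multipart_chunksize`.
    """
    if size < multipart_threshold:
        return ["PutObject"]
    # At least one part is uploaded, even for an empty payload.
    parts = max(1, math.ceil(size / multipart_chunksize))
    return (["CreateMultipartUpload"]
            + ["UploadPart"] * parts
            + ["CompleteMultipartUpload"])


print(plan_upload(1024))                         # small file, default threshold
print(plan_upload(1024, multipart_threshold=0))  # models always-multipart uploads
print(plan_upload(20 * MIB))                     # large file, several parts
```

Setting the threshold to 0 in this model reproduces the originally reported behaviour: three requests for a file that a single request could have handled. In real use the threshold would presumably be set by passing a `boto3.s3.transfer.TransferConfig` as the `Config` argument of `upload_fileobj`.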
