Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include Content-MD5 header when body is empty #1985

Merged
merged 2 commits into from Nov 16, 2020

Conversation

wimglenn
Copy link
Contributor

Closes #1979

@codecov-io
Copy link

codecov-io commented Feb 25, 2020

Codecov Report

Merging #1985 into develop will increase coverage by 0.02%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           develop   #1985      +/-   ##
==========================================
+ Coverage    92.98%     93%   +0.02%     
==========================================
  Files           60      60              
  Lines        10784   10784              
==========================================
+ Hits         10027   10030       +3     
+ Misses         757     754       -3
Impacted Files Coverage Δ
botocore/handlers.py 96.51% <100%> (ø) ⬆️
botocore/credentials.py 98.87% <0%> (+0.33%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fb2f126...223f3ea. Read the comment docs.

@swetashre
Copy link
Contributor

@wimglenn - Can you write a test case to make sure the header is present ?

@wimglenn
Copy link
Contributor Author

@swetashre Please check the CI failure. It does not look related to my change at all.

@wimglenn
Copy link
Contributor Author

@swetashre Is there anything more I can do to help get this bug fixed?

@wimglenn
Copy link
Contributor Author

wimglenn commented May 19, 2020

@swetashre Is it possible to get a review or get this patch from February merged? What's the hold-up?

@swetashre swetashre requested a review from joguSD May 21, 2020 21:55
@joguSD
Copy link
Contributor

joguSD commented May 21, 2020

@wimglenn Conceptually, these changes sound alright to me. We need to do some confirmation on our side around all the exact expectations for the content-md5 of empty bodies. We also have some changes in the works around this code path that will end up conflicting. The changes might end up fixing this issue anyways. We'll follow up on this after those changes go out.

@nateprewitt
Copy link
Contributor

nateprewitt commented Jun 19, 2020

Hi @wimglenn, just wanted to follow up from our last comment. The changes referenced are now in place. Please feel free to rebase this on top of the develop branch and we can take another look. Thanks!

@jangrewe
Copy link

@nateprewitt Maybe you can have a look at #2181 - this fix works fine for our issue.

@wimglenn
Copy link
Contributor Author

wimglenn commented Oct 17, 2020

@nateprewitt @joguSD @swetashre Rebased on develop. We're actually still on a forked botocore, and are blocked from updating boto3 because of this issue, so it would be helpful to get the bugfix upstream..

@nateprewitt nateprewitt changed the base branch from develop to empty_md5_fix November 16, 2020 20:22
@nateprewitt
Copy link
Contributor

Thanks again for this PR, @wimglenn! I'm merging this into a feature branch for a couple more tweaks and we're going to try to get this merged into live asap.

@nateprewitt nateprewitt merged commit 7d36d94 into boto:empty_md5_fix Nov 16, 2020
aws-sdk-python-automation added a commit that referenced this pull request Nov 17, 2020
* release-1.19.20:
  Bumping version to 1.19.20
  Update to latest models
  Correct the spelling of endpoints when throwing the Invalid IDMS Endpoint Error
  Add S3 tests for PutObject empty body
  Include Content-MD5 header when body is empty (#1985)
@radekwlsk
Copy link

radekwlsk commented Dec 2, 2020

@wimglenn @nateprewitt FYI: In Django (and especially django-storages with boto3) when using StringIO based ContentFile (file-like object that takes raw content, string or bytes) to create S3 object this change makes it fail on save to S3 as only bytes content works properly.

For me the issue arised when using empty file (ContentFile("")) to create S3 object stub to save it's reference to database before file content is uploaded under the key.

TypeError: Unicode-objects must be encoded before hashing
  ...
  File "django/db/models/fields/files.py", line 87, in save
    self.name = self.storage.save(name, content, max_length=self.field.max_length)
  File "django/core/files/storage.py", line 52, in save
    return self._save(name, content)
  File "storages/backends/s3boto3.py", line 451, in _save
    obj.upload_fileobj(content, ExtraArgs=params)
  File "boto3/s3/inject.py", line 619, in object_upload_fileobj
    return self.meta.client.upload_fileobj(
  File "boto3/s3/inject.py", line 539, in upload_fileobj
    return future.result()
  File "s3transfer/futures.py", line 106, in result
    return self._coordinator.result()
  File "s3transfer/futures.py", line 265, in result
    raise self._exception
  File "s3transfer/tasks.py", line 126, in __call__
    return self._execute_main(kwargs)
  File "s3transfer/tasks.py", line 150, in _execute_main
    return_value = self._main(**kwargs)
  File "s3transfer/upload.py", line 692, in _main
    client.put_object(Bucket=bucket, Key=key, Body=body, **extra_args)
  File "botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "botocore/client.py", line 652, in _make_api_call
    handler, event_response = self.meta.events.emit_until_response(
  File "botocore/hooks.py", line 360, in emit_until_response
    return self._emitter.emit_until_response(aliased_event_name, **kwargs)
  File "botocore/hooks.py", line 243, in emit_until_response
    responses = self._emit(event_name, kwargs, stop_on_response=True)
  File "botocore/hooks.py", line 211, in _emit
    response = handler(**kwargs)
  File "botocore/utils.py", line 2219, in conditionally_calculate_md5
    md5_digest = calculate_md5(body, **kwargs)
  File "botocore/utils.py", line 2196, in calculate_md5
    binary_md5 = _calculate_md5_from_file(body)
  File "botocore/utils.py", line 2209, in _calculate_md5_from_file
    md5.update(chunk)

I suppose making it ContentFile(b"") will resolve that issue, as with string it uses StringIO instead of BytesIO buffer. But maybe text content could be handled here? To encode it first, before calculating md5.

@wimglenn
Copy link
Contributor Author

wimglenn commented Dec 2, 2020

@radekwlsk Yes I would think that's either an error in django-storages, or bug which was lurking in user code using it, that code should use BytesIO in the first place. Better to make sure you're sending bytes and not text, it's outside of the responsibility for botocore to guess any encoding to use.

@nateprewitt
Copy link
Contributor

nateprewitt commented Dec 2, 2020

Hi @radekwlsk, thanks for bringing this to our attention! It looks like django-storages has been violating the API constraint but happened to be getting away with it only for "falsey" files. The boto3 interface they're using states that the fileobj supplied must be in binary mode to work correctly.

@wimglenn is correct that this will likely need to be fixed in that project since guessing around file encodings leads to a lot of unexpected failures.

@radekwlsk
Copy link

Hi @radekwlsk, thanks for bringing this to our attention! It looks like django-storages has been violating the API constraint but happened to be getting away with it only for "falsey" files. The boto3 interface they're using states that the fileobj supplied must be in binary mode to work correctly.

Thanks for quick response! I've already reported that isssue in django-storages. If there is no response I may create a PR myself then.

@wimglenn wimglenn deleted the issue1979 branch December 2, 2020 21:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Can not PUT a 0 bytes object when Object Lock is enabled on s3 bucket
7 participants