Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

S3Boto3Storage raises ValueError: I/O operation on closed file. #382

Closed
tsifrer opened this issue Aug 15, 2017 · 73 comments · Fixed by #754 or #955
Closed

S3Boto3Storage raises ValueError: I/O operation on closed file. #382

tsifrer opened this issue Aug 15, 2017 · 73 comments · Fixed by #754 or #955

Comments

@tsifrer
Copy link

tsifrer commented Aug 15, 2017

When running python manage.py collectstatic we get the following exception:

Traceback (most recent call last):
  File "manage.py", line 10, in <module>
    execute_from_command_line(sys.argv)
  File "~/venvs/app_root/lib/python3.5/site-packages/django/core/management/__init__.py", line 363, in execute_from_command_line
    utility.execute()
  File "items()venvs/app_root/lib/python3.5/site-packages/django/core/management/__init__.py", line 355, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "~/venvs/app_root/lib/python3.5/site-packages/django/core/management/base.py", line 283, in run_from_argv
    self.execute(*args, **cmd_options)
  File "~/venvs/app_root/lib/python3.5/site-packages/django/core/management/base.py", line 330, in execute
    output = self.handle(*args, **options)
  File "~/venvs/app_root/lib/python3.5/site-packages/django/contrib/staticfiles/management/commands/collectstatic.py", line 199, in handle
    collected = self.collect()
  File "~/venvs/app_root/lib/python3.5/site-packages/django/contrib/staticfiles/management/commands/collectstatic.py", line 139, in collect
    for original_path, processed_path, processed in processor:
  File "~/venvs/app_root/lib/python3.5/site-packages/django/contrib/staticfiles/storage.py", line 246, in post_process
    for name, hashed_name, processed, _ in self._post_process(paths, adjustable_paths, hashed_files):
  File "~/venvs/app_root/lib/python3.5/site-packages/django/contrib/staticfiles/storage.py", line 312, in _post_process
    hashed_name = self.hashed_name(name, content_file)
  File "~/venvs/app_root/lib/python3.5/site-packages/django/contrib/staticfiles/storage.py", line 109, in hashed_name
    file_hash = self.file_hash(clean_name, content)
  File "~/venvs/app_root/lib/python3.5/site-packages/django/contrib/staticfiles/storage.py", line 86, in file_hash
    for chunk in content.chunks():
  File "~/venvs/app_root/lib/python3.5/site-packages/django/core/files/base.py", line 76, in chunks
    self.seek(0)
ValueError: I/O operation on closed file.

This only happens when using django-storages 1.6.4 or above.
Versions 1.6.3 and lower work fine.

We're using Django 1.11.4, python 3.5.2, boto3 1.4.6.

@jschneier
Copy link
Owner

@tsifrer thanks for the report. Just to clarify -- this happens on both 1.6.4 and 1.6.5?

@tsifrer
Copy link
Author

tsifrer commented Aug 15, 2017

@jschneier Correct, happens on both, 1.6.4 and 1.6.5.

@Alberick
Copy link

Alberick commented Sep 4, 2017

I'm also having this issue, any update on it ?

@xrmx
Copy link

xrmx commented Sep 7, 2017

It looks like the save of S3Boto3Storage now closes the file. I think it's a side effect of c73680e but haven't tried with the commit reverted.

@abhinavnair
Copy link

I am also facing this issue but not with collectstatic. But rather with uploading a file. I just moved from boto 2.40.0 to boto3 1.4.7 and django-storages from 1.1.8 to 1.6.5. After reading the comments above, I changed django-storages to 1.6.3 but the issue still occurs. Any help?

@abhinavnair
Copy link

abhinavnair commented Sep 22, 2017

I just tried going back versions even further and at 1.5.0, this issue doesn't occur. It occurs on every version later than 1.5.0.
Moreover this issue only arises when I am trying to upload an image file after cropping it. For all other cases, there is no problem at all.

Here is a snippet for uploading a cropped image file:-

def decodeImage(dataurl):
    from django.core.files.temp import NamedTemporaryFile
    from django.core.files import File
    imgstr = re.search(r'base64,(.*)', dataurl).group(1)
    img_temp = NamedTemporaryFile(delete=True)
    img_temp.write(imgstr.decode('base64'))
    img_temp.flush()
    return File(img_temp)

def post(self, request, *args, **kwargs):
    dataurl = request.POST['cropped']
    new_pic = decodeImage(dataurl)
    from django.utils.timezone import now
    self.myobject.profile_picture.save(str(now()), new_pic)

@Flimm
Copy link
Contributor

Flimm commented Sep 28, 2017

I experience this issue with version 1.6.5, but not with 1.6.3.

@ghost
Copy link

ghost commented Oct 23, 2017

Same as @abhinavnair for me. Uploading a file results in a crash, but collectstatic is perfectly fine.

The bug seems to be filetype independent, The same issue occurs on image and text files.

The 1.5 version works well, but the bug shows up on every version greater than it.

@WhyNotHugo
Copy link
Contributor

Has anyone manged to work around this?

@xrmx
Copy link

xrmx commented Nov 17, 2017

@WhyNotHugo yes, you don't need to assume that the saved file handler is open #382 (comment)

@RaidoS
Copy link

RaidoS commented Nov 29, 2017

@abhinavnair
thx, downgrade to 1.5.0 helps

@WhyNotHugo
Copy link
Contributor

@xrmx It's not an assumption I'm making; it's an assumption ManifestFilesMixin makes (and that's part of Django itself). I've pushed #437 to work around this (I'm working on fixing a few tests right now).

Note that, if you're using ManifestFilesMixin, you also need to set AWS_QUERYSTRING_AUTH = False; otherwise there's an infinite loop rewriting file path URLs.

@coderberry
Copy link

coderberry commented Jan 18, 2018

I had to lock the Pipfile to this:

"boto3" = "==1.5.0"
django-storages = "==1.6.3"

@akalinmehmet
Copy link

I had this issue too with django-storages == "1.6.5" but after downgrading to "1.5.0" with suggestion by @abhinavnair problem doesn't occured.

@Flimm
Copy link
Contributor

Flimm commented Mar 28, 2018

I get this issue with django-storages version 1.6.6 too.

@charlesthk
Copy link

Using a custom StorageClass fix the issue :

from storages.backends.s3boto3 import S3Boto3Storage, SpooledTemporaryFile
import os


class CustomS3Boto3Storage(S3Boto3Storage):

    def _save_content(self, obj, content, parameters):
        """
        We create a clone of the content file as when this is passed to boto3 it wrongly closes
        the file upon upload where as the storage backend expects it to still be open
        """
        # Seek our content back to the start
        content.seek(0, os.SEEK_SET)

        # Create a temporary file that will write to disk after a specified size
        content_autoclose = SpooledTemporaryFile()

        # Write our original content into our copy that will be closed by boto3
        content_autoclose.write(content.read())

        # Upload the object which will auto close the content_autoclose instance
        super(CustomS3Boto3Storage, self)._save_content(obj, content_autoclose, parameters)

        # Cleanup if this is fixed upstream our duplicate should always close
        if not content_autoclose.closed:
            content_autoclose.close()

@BojanKogoj
Copy link

I tried suggestion by @coderberry and it didn't work.

I tried @WhyNotHugo pull request (did pip install git+git://github.com/WhyNotHugo/django-storages.git@fix-valueerror-boto3) but it didn't help either.

One thing I noticed was that when I tried to upload an image the first time it crashed (but uploaded), but second attempt worked. Every image worked in second attempt but not in first.

I managed to get it working with example by @charlesthk

@Safrone
Copy link

Safrone commented Apr 10, 2018

the PR from @WhyNotHugo worked with Boto3==1.5.36 and it seems @charlesthk's fix works with the latest version of everything django-storages[boto3]==1.6.6 and boto3==1.7.3

@SlipJ
Copy link

SlipJ commented Apr 23, 2018

@charlesthk can you make a pull request with this to the master?

@nikssardana
Copy link

nikssardana commented May 7, 2018

I tried the solution by @charlesthk but it gives me an error: cant' start new thread.
Can anyone help me with this?

Edit: The error occurs once in a while but not every time. I have been searching for the solution but found nothing.

@sww314 sww314 added s3boto and removed s3boto labels Jul 12, 2018
@martinvol
Copy link

I confirm that collectstatic works downgrading django-storages to 1.6.3. (I'm using the Storage class used in #151)

@martinvol
Copy link

And I also confirm that @charlesthk solution works with django-storages 1.6.6 (currently the lastest)

Thanks @charlesthk! 🎉

@sww314
Copy link
Contributor

sww314 commented Jul 16, 2018

#481 fixes similar issue in Google Cloud Storage. All the storages need to act like the Django file storage, which calls content.seek(0) before doing anything else.

@sww314 sww314 added the bug label Jul 16, 2018
@Flimm
Copy link
Contributor

Flimm commented Sep 12, 2018

I still experience this issue in django-storages 1.7.1 .

@rnegron
Copy link

rnegron commented Oct 3, 2018

Also having this issue. The custom class in @charlesthk's comment solves it.

@probabble
Copy link

What needs to happen to get this fix enabled by default?

@ironworld
Copy link

@pasevin Thank you! It doesn't work because the method doesn't return a result from the parent class.
@voiddragon
Here is the fixed version:

class CustomS3Boto3Storage(S3Boto3Storage, ABC):
    """
    This is our custom version of S3Boto3Storage that fixes a bug in
    boto3 where the passed in file is closed upon upload.
    From:
    https://github.com/matthewwithanm/django-imagekit/issues/391#issuecomment-275367006
    https://github.com/boto/boto3/issues/929
    https://github.com/matthewwithanm/django-imagekit/issues/391
    """

    def _save(self, name, content):
        """
        We create a clone of the content file as when this is passed to
        boto3 it wrongly closes the file upon upload where as the storage
        backend expects it to still be open
        """
        # Seek our content back to the start
        content.seek(0, os.SEEK_SET)

        # Create a temporary file that will write to disk after a specified
        # size. This file will be automatically deleted when closed by
        # boto3 or after exiting the `with` statement if the boto3 is fixed
        with SpooledTemporaryFile() as content_autoclose:

            # Write our original content into our copy that will be closed by boto3
            content_autoclose.write(content.read())

            # Upload the object which will auto close the
            # content_autoclose instance
            return super(CustomS3Boto3Storage, self)._save(name, content_autoclose)

Thanks! This fix works for django-storages version 1.9.1.

nikolas added a commit to ccnmtl/django-cacheds3storage that referenced this issue Jul 16, 2020
Using this class with the new django-storages gives the error:

  ValueError: seek of closed file

This change should fix that -
jschneier/django-storages#382 (comment)
nikolas added a commit to ccnmtl/django-cacheds3storage that referenced this issue Jul 16, 2020
Using this class with the new django-storages gives the error:

  ValueError: seek of closed file

This change should fix that -
jschneier/django-storages#382 (comment)
nikolas added a commit to nikolas/django-storages that referenced this issue Jul 16, 2020
@nikolas
Copy link
Contributor

nikolas commented Jul 16, 2020

Pull request opened for this fix here: #905

@kshpytsya
Copy link

For me, a patch like this (which does not involve copying) fixes the issues.

--- site-packages/storages/backends/s3boto3.py.orig     2020-07-31 11:46:19.309504453 +0300
+++ site-packages/storages/backends/s3boto3.py  2020-07-31 11:58:25.400758620 +0300
@@ -38,10 +38,16 @@


 boto3_version_info = tuple([int(i) for i in boto3_version.split('.')])


+# https://github.com/boto/s3transfer/issues/80#issuecomment-562356142
+class NonCloseableBufferedReader(io.BufferedReader):
+    def close(self):
+        self.flush()
+
+
 @deconstructible
 class S3Boto3StorageFile(File):

     """
     The default file object used by the S3Boto3Storage backend.
@@ -542,11 +548,12 @@
         obj = self.bucket.Object(encoded_name)
         if self.preload_metadata:
             self._entries[encoded_name] = obj

         content.seek(0, os.SEEK_SET)
-        obj.upload_fileobj(content, ExtraArgs=params)
+        with NonCloseableBufferedReader(content) as reader:
+            obj.upload_fileobj(reader, ExtraArgs=params)
         return cleaned_name

     def delete(self, name):
         name = self._normalize_name(self._clean_name(name))
         self.bucket.Object(self._encode_name(name)).delete()

@loicgasser
Copy link

This looks like a better solution, can we possibly have a patch for this and new version? I know it's because of s3transfer now, but this could be nice to have in django-storages.

@robertpro
Copy link

@pasevin Thank you! It doesn't work because the method doesn't return a result from the parent class.
@voiddragon
Here is the fixed version:

class CustomS3Boto3Storage(S3Boto3Storage, ABC):
    """
    This is our custom version of S3Boto3Storage that fixes a bug in
    boto3 where the passed in file is closed upon upload.
    From:
    https://github.com/matthewwithanm/django-imagekit/issues/391#issuecomment-275367006
    https://github.com/boto/boto3/issues/929
    https://github.com/matthewwithanm/django-imagekit/issues/391
    """

    def _save(self, name, content):
        """
        We create a clone of the content file as when this is passed to
        boto3 it wrongly closes the file upon upload where as the storage
        backend expects it to still be open
        """
        # Seek our content back to the start
        content.seek(0, os.SEEK_SET)

        # Create a temporary file that will write to disk after a specified
        # size. This file will be automatically deleted when closed by
        # boto3 or after exiting the `with` statement if the boto3 is fixed
        with SpooledTemporaryFile() as content_autoclose:

            # Write our original content into our copy that will be closed by boto3
            content_autoclose.write(content.read())

            # Upload the object which will auto close the
            # content_autoclose instance
            return super(CustomS3Boto3Storage, self)._save(name, content_autoclose)

Thanks! This fix works for django-storages version 1.9.1.

Can I see a full version of your code? where does ABC and SpooledTemporaryFile comes from?

@pasevin
Copy link

pasevin commented Aug 17, 2020

@robertpro
ABC (Pycharm added this to silence "Class must implement all abstract methods" error): https://docs.python.org/3/library/abc.html
SpooledTemporaryFile: https://docs.python.org/3/library/tempfile.html#tempfile.SpooledTemporaryFile

@robertpro
Copy link

@pasevin Thanks!

These are the imports:

import os
from abc import ABC
from tempfile import SpooledTemporaryFile

from storages.backends.s3boto3 import S3Boto3Storage

regisb added a commit to overhangio/openedx-scorm-xblock that referenced this issue Nov 2, 2020
in django-storages==1.6.5, te S3Boto3Storage backend closes the file
after saving. This effectively prevents from re-opening the same file
later.

Bug: jschneier/django-storages#382
Resolution: jschneier/django-storages#754
@devgreek
Copy link

November 2020 , I downgraded django stotages from 1.9.0 to 1.5.0, it solved the issue for me

@jschneier
Copy link
Owner

For me, a patch like this (which does not involve copying) fixes the issues.

--- site-packages/storages/backends/s3boto3.py.orig     2020-07-31 11:46:19.309504453 +0300
+++ site-packages/storages/backends/s3boto3.py  2020-07-31 11:58:25.400758620 +0300
@@ -38,10 +38,16 @@


 boto3_version_info = tuple([int(i) for i in boto3_version.split('.')])


+# https://github.com/boto/s3transfer/issues/80#issuecomment-562356142
+class NonCloseableBufferedReader(io.BufferedReader):
+    def close(self):
+        self.flush()
+
+
 @deconstructible
 class S3Boto3StorageFile(File):

     """
     The default file object used by the S3Boto3Storage backend.
@@ -542,11 +548,12 @@
         obj = self.bucket.Object(encoded_name)
         if self.preload_metadata:
             self._entries[encoded_name] = obj

         content.seek(0, os.SEEK_SET)
-        obj.upload_fileobj(content, ExtraArgs=params)
+        with NonCloseableBufferedReader(content) as reader:
+            obj.upload_fileobj(reader, ExtraArgs=params)
         return cleaned_name

     def delete(self, name):
         name = self._normalize_name(self._clean_name(name))
         self.bucket.Object(self._encode_name(name)).delete()

I much prefer this non-copying solution given that the reason that upload_fileobj was added was to add support for very large files.

@jschneier
Copy link
Owner

Can someone verify that this branch resolves their issue?

#955

@magraeber
Copy link

Yes, it solves my issue.

@israellias
Copy link

@jschneier I need to wait to see this change on pypi?
It is secure to put master of this repo in my requirements.txt?

@jschneier
Copy link
Owner

@jschneier I need to wait to see this change on pypi?
It is secure to put master of this repo in my requirements.txt?

I'm going to push in a breaking change at some point so I wouldn't recommend. I'll just deploy as-is actually.

@jschneier
Copy link
Owner

@israellias just released.

@jschneier
Copy link
Owner

Unfortunately the fix seems to have broken things more generally (see #967) so I've put together an alternative one. My solution is to separate out the collectstatic handling since that seems to be quite different & not as common. Please take a look at #968.

This way it'll just be a quick update to a setting and any additional fixes are logically separated.

@djackson-exo-inc
Copy link

@jschneier Can you please specify what tag/release this is fixed in?

@singhravi1
Copy link

Hey everyone!
Fantastic library 👍 thanks to all the collaborators 🙏
Can anyone please confirm if there is any need to any kind of patch in the current version of django-storages?

@anderspetersson
Copy link

I'm getting this error occasionally. (ValueError: seek of closed file)

boto3 1.18.28
django 3.2.6
django-storages 1.11.1

DEFAULT_FILE_STORAGE = 'storages.backends.s3boto3.S3Boto3Storage'

Should we use the patch from @robertpro ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment