Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

djanog-pictures not compatible wiht django-storages #26

Closed
vchrisb opened this issue Jun 8, 2022 · 16 comments
Closed

djanog-pictures not compatible wiht django-storages #26

vchrisb opened this issue Jun 8, 2022 · 16 comments
Assignees
Labels
bug Something isn't working

Comments

@vchrisb
Copy link
Contributor

vchrisb commented Jun 8, 2022

I finally did test django-pictures with django-storages (using s3) and they are not compatible.
When creating or updating an image I do get The file cannot be reopened.

I think this is due to https://github.com/codingjoe/django-pictures/blob/main/pictures/tasks.py#L15 trying to open the (closed) file again. From my understanding django-storages is using in-memory for file processing, which does not allow it to be reopened.

I also looked at django-stdimage and this implementation is opening the file from storage:
https://github.com/codingjoe/django-stdimage/blob/4c7194a88fc970cb0d910f444a2de29911d4a630/stdimage/models.py#L72

❓ Maybe adopting this would fix this?
❓ Or not closing it in the first place before processing finished?

@codingjoe
Copy link
Owner

@vchrisb interesting. I have been using the async methods in a life system only. I believe we should still try to open the files, but handle the exception. I use a custom storage form django-s3file that doesn't touch files during the upload, therefore the would be closed.

Do you want to do the honors of adding a try bock around this?

@codingjoe codingjoe added the bug Something isn't working label Jun 8, 2022
@vchrisb
Copy link
Contributor Author

vchrisb commented Jun 8, 2022

From my understanding it is not possible to open the files again when they are in-memory representations.
The files are already closed somewhere else, hence it will fail in anyway.

@vchrisb
Copy link
Contributor Author

vchrisb commented Jun 8, 2022

What about:

def _process_picture(field_file: PictureFieldFile) -> None:
    with field_file.storage.open(field_file.name) as file:
        with Image.open(file) as img:
            for ratio, sources in field_file.aspect_ratios.items():
                for file_type, srcset in sources.items():
                    for width, picture in srcset.items():
                        picture.save(img)

@codingjoe
Copy link
Owner

If they are open or in memory, we can just read from them? Loading it into memory twice might not to be so memory efficient.

@vchrisb
Copy link
Contributor Author

vchrisb commented Jun 8, 2022

So I did dig into it a bit more. Apparently S3boto is calling close on the file after upload: boto/boto3#3150

@vchrisb
Copy link
Contributor Author

vchrisb commented Jun 8, 2022

Another try. 😁

    # TemporaryUploadedFile can't be reopened
    if field_file.closed and getattr(field_file, "_file", None) is not None and isinstance(field_file._file, TemporaryUploadedFile):
        field_file._file = None

    with field_file.open() as file:
        with Image.open(file) as img:
            for ratio, sources in field_file.aspect_ratios.items():
                for file_type, srcset in sources.items():
                    for width, picture in srcset.items():
                        picture.save(img)                        

If _file is None, file will be reload from storage: https://github.com/django/django/blob/0dd29209091280ccf34e07c9468746c396b7778e/django/db/models/fields/files.py#L77

@codingjoe
Copy link
Owner

That looks like a lot of ifs. Maybe we just try to open the file, catch the exception and open it from storage if need be. Maybe that's more EAFP?

@vchrisb
Copy link
Contributor Author

vchrisb commented Jun 9, 2022

I further looked into it:
When uploading a file, it is a subclass of UploadedFile https://github.com/django/django/blob/stable/4.0.x/django/core/files/uploadedfile.py#L21.

boto3 is calling (unwanted) close() on this UploadedFile boto/boto3#3150, which gets deleted (either from memory or tmp disk)

With _process_picture being called in the upload request, _file is still an UploadedFile, but due to boto3 it is closed.
Therefore it can't be reopened.

django-storages actually introduced an adapted storage class (https://github.com/jschneier/django-storages/blob/master/storages/backends/s3boto3.py#L612) when using ManifestFilesMixin, which is copying the temporary file before passing it to s3boto. jschneier/django-storages#382

Long story short, creating a custom S3Boto3Storage class like the following, does handle the issue:

class MediaStorage(S3Boto3Storage):
    def _save(self, name, content):
        content.seek(0)
        with tempfile.SpooledTemporaryFile() as tmp:
            tmp.write(content.read())
            return super()._save(name, tmp)

Besides that, I think _process_picture should be updated to use with field_file.open() as file:, so the file gets closed at the end?

@codingjoe
Copy link
Owner

OK, you seem to have put a lot of research into this. I don't know if a custom storage is feasible for most. I would probably default to open the file from storage (always) and of course close it at the end as you suggested.

@vchrisb
Copy link
Contributor Author

vchrisb commented Jun 9, 2022

I think for django-storages it is not too uncommon to create a storage class: https://django-storages.readthedocs.io/en/latest/backends/amazon-S3.html#overriding-the-default-storage-class
It is e.g. necessary for configuring different buckets for static and media.
With _process_picture being called in the upload request I share your opinion, that loading the file again from storage isn't very efficient and makes the request longer than necessary.

With that said, I can add a section to the README for django-storages and also update _process_picture to use with field_file.open() as file:.

@codingjoe
Copy link
Owner

Hm… yes, but it's still going to raise many questions aka tickets. Besides, we'd leave all files open, even if they are not post processesed, this seems wrong. I believe storages is doing the right thing. Opening a file from storage seems more plausible. It needs to be done in all async cases anyway. So it would be a more consistent behavior across all implementations.

@vchrisb
Copy link
Contributor Author

vchrisb commented Jun 9, 2022

If someone is using django-storages with ManifestFilesMixin, they will also get a similar exception and need to look at the docs.
I also checked that the file gets closed anyway. So the with clause is actually not needed in the sync task.

Anyway, I'm happy with adapting the storage class and you might want to merge the addition to the Readme. ;)

@codingjoe
Copy link
Owner

Hi @vchrisb,

I understand how that works for you. But you also seem like a very talented software engineer. That aside, I don't really know what side effects no closing the files might have on other packages. Soooo, I would be more reluctant to have this package work regardless of the storage implementation and without interfering with other 3rd party code.

I am happy to do the changes myself, but I feel like you deserve the credit here.

Best Joe

@vchrisb
Copy link
Contributor Author

vchrisb commented Jun 21, 2022

I'm just good at googling. You are the engineer who created this great package! :)
I still feel we are not on the same page yet. There are two distinct things:

  1. Closing the file in the sync task is not necessary. From my testing, close will be called somewhere later on the file. I could try to dig out where.

  2. Only for django-storages with s3boto, it is necessary to modify the storage class. From my point of view this acceptable, because this is a known limitation of s3boto in django-storages. Hence there is a specific storage class when using ManifestStaticFilesStorage. I was thinking to do a PR to django-storages to add another storage class similar to storages.backends.s3boto3.S3ManifestStaticStorage.

@codingjoe
Copy link
Owner

@vchrisb I gave some thought and I do feel your first impulse was right. This package should handle this issue out of the box and there is no reasonable disadvantage to open the file from storage.

codingjoe added a commit that referenced this issue Aug 6, 2022
@Ian2012
Copy link

Ian2012 commented Nov 23, 2022

This issue persist, I'm using django-cookiecutter.

I've used the custom storage that @vchrisb kindly provided to fix it. Thank you

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants