Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve handling of file resources #3577

Merged
merged 1 commit into from Nov 1, 2019
Merged

Improve handling of file resources #3577

merged 1 commit into from Nov 1, 2019

Conversation

jdufresne
Copy link
Contributor

@jdufresne jdufresne commented Jan 13, 2019

Better follow Python's file object semantics. User code is responsible for closing resources (usually through a context manager) in a deterministic way.

To achieve this, remove __del__ functions. These functions used to close open file handlers in an attempt to silence Python ResourceWarnings. However, using __del__ has the following drawbacks:

  • __del__ isn't called until the object's reference count reaches 0. Therefore, resource handlers remain open or in use longer than necessary.

  • The __del__ method isn't guaranteed to execute on system exit. See the Python documentation:

    https://docs.python.org/3/reference/datamodel.html#object.__del__

    It is not guaranteed that __del__() methods are called for objects that still exist when the interpreter exits.

  • Exceptions that occur inside __del__ are ignored instead of raised. This has the potential of hiding bugs. This is also in the Python documentation:

    Warning: Due to the precarious circumstances under which __del__() methods are invoked, exceptions that occur during their execution are ignored, and a warning is printed to sys.stderr instead.

Instead, always close resource handlers when they are no longer in use. This will close the file handler at a specified point in the user's code and not wait until the interpreter chooses to. It is always guaranteed to run. And, if an exception occurs while closing the file handler, the bug will not be ignored.

Now, when code receives a ResourceWarning, it will highlight an area that is mishandling resources. It should not simply be silenced, but fixed by closing resources with a context manager.

All warnings that were emitted during tests have been cleaned up. To enable warnings, I passed the -Wa CLI option to Python. This exposed some mishandling of resources in ImageFile.__init__() and
SpiderImagePlugin.loadImageSeries(), they too were fixed.

@radarhere
Copy link
Member

See also #2818

@jdufresne
Copy link
Contributor Author

Would you like me to add a "Fixes: ..." to the commit message?

@radarhere
Copy link
Member

No, that's not necessary.

@jdufresne
Copy link
Contributor Author

Any idea why Travis fails immediately with the "canceled" state?

@hugovk
Copy link
Member

hugovk commented Jan 27, 2019

We have auto-cancellation turned on:

Auto Cancellation allows you to only run builds for the latest commits in the queue. This setting can be applied to builds for Branch builds and Pull Request builds separately. Builds will only be canceled if they are waiting to run, allowing for any running jobs to finish.

https://travis-ci.org/python-pillow/Pillow/settings

I guess the grey ones were cancelled because they hadn't started building yet and a newer build was queued:

image

https://travis-ci.org/python-pillow/Pillow/pull_requests

@jdufresne
Copy link
Contributor Author

Rebased 🙂

@radarhere
Copy link
Member

radarhere commented Mar 18, 2019

#2019 is a memory leak issue that I believe should be currently fixed on master. My concern with this PR is that it would trigger those memory leaks, deliberately causing a reported issue. However, I imagine that you would hold that this is a problem with the end code, rather than with Pillow. Let me know if I've misunderstood.

@jdufresne
Copy link
Contributor Author

jdufresne commented Mar 18, 2019

I'm not sure I fully follow how this would produce a memory leak. Sorry, the thread of that issue is a bit hard for me to know exactly where the issue will be. Are you able to provide sample code that you believe will now result in a memory leak? Or is this already captured in a test?

Is it this?

s = 0
for i in range(len(onlyfiles)):
    fname = onlyfiles[i]
    image = PIL.Image.open(fname)
    # Do something
    s += image.size[0]*image.size[1]

If so, then yeah. I would argue that the user isn't using this interface correctly and should instead do the following to close the resource after finished in a deterministic way.

s = 0
for i in range(len(onlyfiles)):
    fname = onlyfiles[i]
    with PIL.Image.open(fname) as image:
        # Do something
        s += image.size[0]*image.size[1]

If that is still memory leak then I can take a deeper look.

If you think this is too much of a backwards incompatible change, maybe it would make sense to wait until the next major release to include this.

@radarhere
Copy link
Member

Yes, I'm just discussing the first code sample you provided, and agree that the context manager should resolve the situation.

The next release is actually a major release, so if everyone is fine with this PR, then go for it, I just wanted to make sure that people were aware of this consequence.

@hugovk
Copy link
Member

hugovk commented Mar 19, 2019

Let's first confirm that it really is the case that memory leaks.

And if so, we'll need to communicate clearly the recommended way of doing things (we need to update the docs anyway).

Would it help to announce the change ahead of making it? If so, a major release is also planned for 2020-01-01 (dropping Python 2.7).

@jdufresne
Copy link
Contributor Author

Let's first confirm that it really is the case that memory leaks.

After rereading that full linked thread, it looks like it was concluded that the leak occurred in Python and has since been fixed. Am I understanding this right? If I understand, then no new action needs to be taken for this PR w/r/t a memory leak. Do you agree?

And if so, we'll need to communicate clearly the recommended way of doing things (we need to update the docs anyway).

Good call. I have updated the docs in the latest revision. This includes many changes to open_files.rst, a paragraph in the release notes, and some updates in the turorial to promote best practices.

Would it help to announce the change ahead of making it? If so, a major release is also planned for 2020-01-01 (dropping Python 2.7).

I'm open to this, no preference. Please let me know what you'd like and I'll adjust the PR.

@hugovk
Copy link
Member

hugovk commented Jun 25, 2019

Hi, sorry for the delay on this. This obviously didn't make the last Pillow 6.0.0 release.

Would it help to announce the change ahead of making it? If so, a major release is also planned for 2020-01-01 (dropping Python 2.7).

I'm open to this, no preference. Please let me know what you'd like and I'll adjust the PR.

Let's go for this.

Please could you add something to the deprecations and removals and any other relevant docs? Can we add some deprecation warnings in code too?

One thing, that page says:

Deprecated features are only removed in major releases after an appropriate period of deprecation has passed.

If we deprecate now, it's 6 months until Pillow 6.0.0. Is that long enough? I estimate Pillow 7.0.0 will be when Python 3.5 is removed (it's EOL on 2020-09-13) in the 2020-10-01 release.


PS I guess most of the merge conflicts are from formatting with Black, so many can probably be resolved by running Black on this branch.

@jdufresne
Copy link
Contributor Author

Great! I have rebased and fixed all newly discovered warnings. I updated the docs given that 6.0.0 has been released.


Can we add some deprecation warnings in code too?

After thinking about this some more, what do you see a deprecation warning providing that isn't also provided by Python 3's resource warning? I guess this isn't a true backwards incompatible change in the sense that the resultant behavior is the same (CPython closes the file when the reference count hits 0) except it now emits a resource warning (which would otherwise be a deprecation warning). So the end result is the same, just it is a resource warning instead of a deprecation warning. WDYT?

@hugovk
Copy link
Member

hugovk commented Jun 28, 2019

Oops, I made some typos! This is what I meant to write (in bold):

If we deprecate now, it's 6 months until Pillow 6.0.0 7.0.0. Is that long enough? I estimate Pillow 7.0.0 8.0.0 will be when Python 3.5 is removed (it's EOL on 2020-09-13) in the 2020-10-01 release.

Let's make a new PR to put the text in the deprecations section saying this is deprecated in 6.1.0, and will be removed in 7.0.0. We merge that new PR for Monday's release.

And then this PR can be merged during Q4.

Sorry for the mixup!


Let's just go for the deprecations in docs. Thank you!

@hugovk
Copy link
Member

hugovk commented Jul 1, 2019

Let's make a new PR to put the text in the deprecations section saying this is deprecated in 6.1.0, and will be removed in 7.0.0. We merge that new PR for Monday's release.

Please see #3929.

@hugovk hugovk mentioned this pull request Jul 1, 2019
22 tasks
@hugovk hugovk added this to the 7.0.0 milestone Jul 1, 2019
@hugovk

This comment has been minimized.

@aclark4life aclark4life added this to New Issues in Pillow Sep 12, 2019
@radarhere

This comment has been minimized.

Tests/test_file_dcx.py Outdated Show resolved Hide resolved
Follow Python's file object semantics. User code is responsible for
closing resources (usually through a context manager) in a deterministic
way.

To achieve this, remove __del__ functions. These functions used to
closed open file handlers in an attempt to silence Python
ResourceWarnings. However, using __del__ has the following drawbacks:

- __del__ isn't called until the object's reference count reaches 0.
  Therefore, resource handlers remain open or in use longer than
  necessary.

- The __del__ method isn't guaranteed to execute on system exit. See the
  Python documentation:

  https://docs.python.org/3/reference/datamodel.html#object.__del__

  > It is not guaranteed that __del__() methods are called for objects
  > that still exist when the interpreter exits.

- Exceptions that occur inside __del__ are ignored instead of raised.
  This has the potential of hiding bugs. This is also in the Python
  documentation:

  > Warning: Due to the precarious circumstances under which __del__()
  > methods are invoked, exceptions that occur during their execution
  > are ignored, and a warning is printed to sys.stderr instead.

Instead, always close resource handlers when they are no longer in use.
This will close the file handler at a specified point in the user's code
and not wait until the interpreter chooses to. It is always guaranteed
to run. And, if an exception occurs while closing the file handler, the
bug will not be ignored.

Now, when code receives a ResourceWarning, it will highlight an area
that is mishandling resources. It should not simply be silenced, but
fixed by closing resources with a context manager.

All warnings that were emitted during tests have been cleaned up. To
enable warnings, I passed the `-Wa` CLI option to Python. This exposed
some mishandling of resources in ImageFile.__init__() and
SpiderImagePlugin.loadImageSeries(), they too were fixed.
@hugovk hugovk merged commit 2cf510e into python-pillow:master Nov 1, 2019
@hugovk hugovk mentioned this pull request Nov 1, 2019
@jdufresne jdufresne deleted the warn-close branch November 1, 2019 16:33
@radarhere radarhere moved this from New Issues to Done in Pillow Apr 24, 2021
dairiki added a commit to dairiki/lektor that referenced this pull request May 7, 2023
When a PIL image is opened by file path (rather than by passing
`Image.open` an open file object), the `Image` can hold a reference
to its own private opened file. It seems to have been a [design choice]
of Pillow to not close these when the `Image` is garbage
collected. This can result in "unclosed file" `ResourceWarning`s —
I *think* these are harmless, but I'm not sure.

Here we go through some contortions to ensure that `Image`s are
explicitly closed before they are garbage collected.

[design choice]: python-pillow/Pillow#3577
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Pillow
  
Done
Development

Successfully merging this pull request may close these issues.

None yet

4 participants