New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve handling of file resources #3577
Conversation
See also #2818 |
Would you like me to add a "Fixes: ..." to the commit message? |
No, that's not necessary. |
Any idea why Travis fails immediately with the "canceled" state? |
We have auto-cancellation turned on: https://travis-ci.org/python-pillow/Pillow/settings I guess the grey ones were cancelled because they hadn't started building yet and a newer build was queued: |
Rebased 🙂 |
#2019 is a memory leak issue that I believe should be currently fixed on master. My concern with this PR is that it would trigger those memory leaks, deliberately causing a reported issue. However, I imagine that you would hold that this is a problem with the end code, rather than with Pillow. Let me know if I've misunderstood. |
I'm not sure I fully follow how this would produce a memory leak. Sorry, the thread of that issue is a bit hard for me to know exactly where the issue will be. Are you able to provide sample code that you believe will now result in a memory leak? Or is this already captured in a test? Is it this? s = 0
for i in range(len(onlyfiles)):
fname = onlyfiles[i]
image = PIL.Image.open(fname)
# Do something
s += image.size[0]*image.size[1] If so, then yeah. I would argue that the user isn't using this interface correctly and should instead do the following to close the resource after finished in a deterministic way. s = 0
for i in range(len(onlyfiles)):
fname = onlyfiles[i]
with PIL.Image.open(fname) as image:
# Do something
s += image.size[0]*image.size[1] If that is still memory leak then I can take a deeper look. If you think this is too much of a backwards incompatible change, maybe it would make sense to wait until the next major release to include this. |
Yes, I'm just discussing the first code sample you provided, and agree that the context manager should resolve the situation. The next release is actually a major release, so if everyone is fine with this PR, then go for it, I just wanted to make sure that people were aware of this consequence. |
Let's first confirm that it really is the case that memory leaks. And if so, we'll need to communicate clearly the recommended way of doing things (we need to update the docs anyway). Would it help to announce the change ahead of making it? If so, a major release is also planned for 2020-01-01 (dropping Python 2.7). |
After rereading that full linked thread, it looks like it was concluded that the leak occurred in Python and has since been fixed. Am I understanding this right? If I understand, then no new action needs to be taken for this PR w/r/t a memory leak. Do you agree?
Good call. I have updated the docs in the latest revision. This includes many changes to
I'm open to this, no preference. Please let me know what you'd like and I'll adjust the PR. |
Hi, sorry for the delay on this. This obviously didn't make the last Pillow 6.0.0 release.
Let's go for this. Please could you add something to the deprecations and removals and any other relevant docs? Can we add some deprecation warnings in code too? One thing, that page says:
If we deprecate now, it's 6 months until Pillow 6.0.0. Is that long enough? I estimate Pillow 7.0.0 will be when Python 3.5 is removed (it's EOL on 2020-09-13) in the 2020-10-01 release. PS I guess most of the merge conflicts are from formatting with Black, so many can probably be resolved by running Black on this branch. |
Great! I have rebased and fixed all newly discovered warnings. I updated the docs given that 6.0.0 has been released.
After thinking about this some more, what do you see a deprecation warning providing that isn't also provided by Python 3's resource warning? I guess this isn't a true backwards incompatible change in the sense that the resultant behavior is the same (CPython closes the file when the reference count hits 0) except it now emits a resource warning (which would otherwise be a deprecation warning). So the end result is the same, just it is a resource warning instead of a deprecation warning. WDYT? |
Oops, I made some typos! This is what I meant to write (in bold):
Let's make a new PR to put the text in the deprecations section saying this is deprecated in 6.1.0, and will be removed in 7.0.0. We merge that new PR for Monday's release. And then this PR can be merged during Q4. Sorry for the mixup! Let's just go for the deprecations in docs. Thank you! |
Please see #3929. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Follow Python's file object semantics. User code is responsible for closing resources (usually through a context manager) in a deterministic way. To achieve this, remove __del__ functions. These functions used to closed open file handlers in an attempt to silence Python ResourceWarnings. However, using __del__ has the following drawbacks: - __del__ isn't called until the object's reference count reaches 0. Therefore, resource handlers remain open or in use longer than necessary. - The __del__ method isn't guaranteed to execute on system exit. See the Python documentation: https://docs.python.org/3/reference/datamodel.html#object.__del__ > It is not guaranteed that __del__() methods are called for objects > that still exist when the interpreter exits. - Exceptions that occur inside __del__ are ignored instead of raised. This has the potential of hiding bugs. This is also in the Python documentation: > Warning: Due to the precarious circumstances under which __del__() > methods are invoked, exceptions that occur during their execution > are ignored, and a warning is printed to sys.stderr instead. Instead, always close resource handlers when they are no longer in use. This will close the file handler at a specified point in the user's code and not wait until the interpreter chooses to. It is always guaranteed to run. And, if an exception occurs while closing the file handler, the bug will not be ignored. Now, when code receives a ResourceWarning, it will highlight an area that is mishandling resources. It should not simply be silenced, but fixed by closing resources with a context manager. All warnings that were emitted during tests have been cleaned up. To enable warnings, I passed the `-Wa` CLI option to Python. This exposed some mishandling of resources in ImageFile.__init__() and SpiderImagePlugin.loadImageSeries(), they too were fixed.
When a PIL image is opened by file path (rather than by passing `Image.open` an open file object), the `Image` can hold a reference to its own private opened file. It seems to have been a [design choice] of Pillow to not close these when the `Image` is garbage collected. This can result in "unclosed file" `ResourceWarning`s — I *think* these are harmless, but I'm not sure. Here we go through some contortions to ensure that `Image`s are explicitly closed before they are garbage collected. [design choice]: python-pillow/Pillow#3577
Better follow Python's file object semantics. User code is responsible for closing resources (usually through a context manager) in a deterministic way.
To achieve this, remove
__del__
functions. These functions used to close open file handlers in an attempt to silence PythonResourceWarnings
. However, using__del__
has the following drawbacks:__del__
isn't called until the object's reference count reaches 0. Therefore, resource handlers remain open or in use longer than necessary.The
__del__
method isn't guaranteed to execute on system exit. See the Python documentation:https://docs.python.org/3/reference/datamodel.html#object.__del__
Exceptions that occur inside
__del__
are ignored instead of raised. This has the potential of hiding bugs. This is also in the Python documentation:Instead, always close resource handlers when they are no longer in use. This will close the file handler at a specified point in the user's code and not wait until the interpreter chooses to. It is always guaranteed to run. And, if an exception occurs while closing the file handler, the bug will not be ignored.
Now, when code receives a
ResourceWarning
, it will highlight an area that is mishandling resources. It should not simply be silenced, but fixed by closing resources with a context manager.All warnings that were emitted during tests have been cleaned up. To enable warnings, I passed the
-Wa
CLI option to Python. This exposed some mishandling of resources inImageFile.__init__()
andSpiderImagePlugin.loadImageSeries()
, they too were fixed.