Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change Image.tobytes return type to ByteArray or add a Image.tobytearray method #5815

Closed
FirefoxMetzger opened this issue Nov 6, 2021 · 6 comments · Fixed by #6394
Closed
Labels

Comments

@FirefoxMetzger
Copy link
Contributor

Currently the method Image.tobytes returns a byte string. There is nothing inherently bad about this, except that the returned bytes are read only (ofc) since bytes are immutable.

The consequence of this is that Image.__array__ does an extra copy when calling

return np.array(ArrayData(), dtype)

that (I think) can be avoided. The data is copied from the immutable bytes string into a new mutable memory block around which the array is constructed.

This copy can be avoided in a few ways:

  1. Chance the code in Image.__array__ to (roughly) np.frombuffer(self.tobytes(), dtype=dtype).reshape(shape). The downside is that the result is read-only (result.flags['WRITEABLE'] == False) which may be undesirable.
  2. Change the code in Image.tobytes to return a bytearray. This may require work, depending on what encoder returns (didn't dive that far into the code), and it may be undesirable because type(Image.tobytes()) changes.
  3. Introduce a new Image.tobytearray method that essentially mirrors Image.tobytes, except that it creates a mutable buffer. This resolves the problems that come with changing Image.tobytes return type, but may still need more work.

Would this be something interesting and/or useful to do?

@radarhere radarhere changed the title Enhancement: Change Image.tobytes return type to ByteArray or add a Image.tobytearray method. Enhancement: Change Image.tobytes return type to ByteArray or add a Image.tobytearray method Nov 6, 2021
@radarhere radarhere added the NumPy label Nov 9, 2021
@radarhere
Copy link
Member

Just a note - one day, the deprecation in numpy/numpy#19001 will be removed, probably in NumPy 1.23, paving the way for us to revert #5379 and move from __array_interface__ back to __array__- making whatever changes we do here redundant. But that is in the future.

I agree that 2 is undesirable, because we value backwards compatibility.

So 3 is most likely the solution - but I don't think it necessarily needs to be a public method, if there's no obvious use case apart from this.

@FirefoxMetzger
Copy link
Contributor Author

@radarhere Yes, other use-cases could go through numpy to get the writable buffer: np.array(pil_image).data. Then again numpy isn't a dependency of pillow, iirc, so it may not be available. I don't think this would be a common situation though, and if it does arise then Image._tobytearray can always be made public later.

@hugovk
Copy link
Member

hugovk commented Nov 9, 2021

(NumPy minor releases (1.x.0) go out twice a year, so 1.23.0 is due out mid-2022.)

@radarhere
Copy link
Member

numpy/numpy#20835 has been merged into NumPy, so that will be part of NumPy 1.23.

@radarhere
Copy link
Member

NumPy 1.23 has been released, so I've created #6394 to revert #5379, resolving this.

@radarhere radarhere changed the title Enhancement: Change Image.tobytes return type to ByteArray or add a Image.tobytearray method Change Image.tobytes return type to ByteArray or add a Image.tobytearray method Aug 6, 2022
@radarhere
Copy link
Member

radarhere commented Aug 6, 2022

@FirefoxMetzger is this resolved?

While the internal call to np.array is now removed, when a user calls np.array, I'm wondering if the original data copy still occurs. I'm not convinced it can be avoided though, as

l, s, d = e.encode(bufsize)

returns bytes, coming from
buf = PyBytes_FromStringAndSize(NULL, bufsize);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants