Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

python: catch-all hardcoded error can hide real error in cast of input to numpy #541

Closed
cbm755 opened this issue Mar 28, 2023 · 2 comments
Closed

Comments

@cbm755
Copy link
Contributor

cbm755 commented Mar 28, 2023

I think this code tries to ensure the image can be cast to a particular numpy format

        Image image;
        try {
                image = _image.cast<Image>();
        }
        catch(...) {
                throw py::type_error("Unsupported type " + _type + ". Expect a PIL Image or numpy array");
        }

But the actual error message of catch(...) might be useful and is hidden.

For example: suppose I have an image that is broken, maybe b/c some idiot only read half the bytes:

with open("good.png", "rb") as f:
    b = f.read()
half = b[0:n//2]
with open("broken.png", "wb") as f:
    f.write(half)

Now if I read this in, surely Pillow will choke somewhere:

from PIL import Image
from zxingcpp import read_barcodes

im = Image.open("broken.png")   # <-- no error b/c of lazy evaluation

read_barcodes(im)
TypeError: Unsupported type <class 'PIL.PngImagePlugin.PngImageFile'>. Expect a PIL Image or numpy array

But this error is confusing b/c we do have a PIL Image!

Explanation

To understand, what is happening, compare to:

im = Image.open("broken.png")
im.load()    # <-- force the image to load

File /usr/lib64/python3.11/site-packages/PIL/ImageFile.py:245, in ImageFile.load(self)
    243         break
    244     else:
--> 245         raise OSError("image file is truncated") from e
    247 if not s:  # truncated jpeg
    248     if LOAD_TRUNCATED_IMAGES:

OSError: image file is truncated

read_barcodes(im)

TL-DR

The PIL Image is going to give a very reasonable error on lazy load, but ZXingcpp will catch that error and replace it with hardcoded best-guess error.

A possible fix

Echo the actual error message before speculating about possible unsupported types. I don't really speak enough C++ but the pseudocode change I'd like is something like this:

        Image image;
        try {
                image = _image.cast<Image>();
        }
        catch(ERROR) {
                throw py::type_error("Error casting Image to numpy: {ERROR}\nPerhaps you have an unsupported type " + _type + ". Expect a PIL Image or numpy array");
        }

References

A real-life example where this made it harder to debug is https://gitlab.com/plom/plom/-/issues/2597

@axxel
Copy link
Collaborator

axxel commented Mar 29, 2023

Thanks for the very detailed report. I had a look at the pybind11 doc and came up with the following:

	try {
		image = _image.cast<Image>();
#if PYBIND11_VERSION_HEX > 0x02080000 // py::raise_from is available starting from 2.8.0
	} catch (py::error_already_set &e) {
		py::raise_from(e, PyExc_RuntimeError, ("Could not convert " + _type + " to numpy array.").c_str());
		throw py::error_already_set();
#endif
	} catch (...) {
		throw py::type_error("Could not convert " + _type + " to numpy array. Expecting a PIL Image or numpy array.");
	}

I then created a truncated png file that raises your aforementioned error when trying explicitly loading it via img.load() but if I use my new code, I get the following message:

/home/axel/contrib/zxing-cpp.release/wrappers/python/demo_reader.py:6: DeprecationWarning: An exception was ignored while fetching the attribute `__array_interface__` from an object of type 'PngImageFile'.  With the exception of `AttributeError` NumPy will always raise this exception in the future.  Raise this deprecation warning to see the original exception. (Warning added NumPy 1.21)
  results = zxingcpp.read_barcodes(img)
TypeError: int() argument must be a string, a bytes-like object or a real number, not 'PngImageFile'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/axel/contrib/zxing-cpp.release/wrappers/python/demo_reader.py", line 6, in <module>
    results = zxingcpp.read_barcodes(img)
RuntimeError: Could not convert <class 'PIL.PngImagePlugin.PngImageFile'> to numpy array.

That looks only marginally better than what the current HEAD revision is providing. I don't see how I could influence what happens behind the cast<Image>() call, though. And I don't know why the "image file is truncated" message is still missing.

If you have ideas, please let me know. Otherwise I'll push the above change and consider it at least an improvement.

@cbm755
Copy link
Contributor Author

cbm755 commented Apr 1, 2023

I saw various chatter about numpy and that __array_interface__, where numpy was hiding errors. At first I thought that was what we were experiencing. Maybe that's what is happening here? At the time, I collected a few links:

python-pillow/Pillow#6588
python-pillow/Pillow#6394
numpy/numpy#18723
python-pillow/Pillow#4281
python-pillow/Pillow#3863

(but I didn't really dig in: my impression was numpy >= 1.23 was "fine".)


Only think I can think of is a different code path based on isinstance(..., PIL.Image).

To elaborate: I don't know the design/layout well enough but is it possible to have a thin layer of actual python code? I mean read_barcodes which calls a lower lever _read_barcodes. Then (some) of the conversion logic could live in Python code instead. In particular, it could force load:

  if instance(x, PIL.Image):
       # force load now to get an error messages at this point, Issue #541
       x.load()
  pybind11.magic.call._read_barcodes(x, ...)

@axxel axxel closed this as completed in 64c6b98 Apr 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants