Exceptions with specific TIF files #6008

Closed
golden-e opened this issue Feb 2, 2022 · 6 comments

golden-e commented Feb 2, 2022

What did you do?

Used the PIL Image and ImageSequence modules in Python to load and process a .tif file.
Case (1)
This specific file is titled "MultipleFormats.tif" and is available at:
https://www.leadtools.com/support/forum/posts/t10960-

This TIF file uses JPEG 2000 compression on its third page, and that is where the code fails.

Case (2)
Tried other files that were converted using online PDF-to-TIFF converters. Sorry, I do not have these files handy.

What did you expect to happen?

Process all the pages of the given TIFF file

What actually happened?

Case (1)
A KeyError is raised when reading from COMPRESSION_INFO in TiffImagePlugin -> _setup(), line 1259. The JPEG 2000 compression key is 34712, which is not defined in COMPRESSION_INFO, so the lookup fails with KeyError: 34712.

Case (2)
With other test files, I noticed that TiffImagePlugin -> _setup() fails at line 1280 with a type conversion error, because the get() call below returned None:
x_size = int(self.tag_v2.get(IMAGEWIDTH))

If a SyntaxError were raised in this situation instead of optimistically casting to int(), it would not break my code, and I could handle it in the exception handler of the calling code.
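
The suggestion above could look something like the following. This is only a minimal sketch of the idea, not Pillow's code: `required_int_tag` is a hypothetical helper, and a plain dict stands in for `self.tag_v2`. The tag numbers 256 (ImageWidth) and 257 (ImageLength) are the standard TIFF values.

```python
IMAGEWIDTH = 256   # standard TIFF tag number for ImageWidth
IMAGELENGTH = 257  # standard TIFF tag number for ImageLength


def required_int_tag(tags, tag_id, name):
    """Return a required tag as an int, or raise SyntaxError if it is absent.

    Hypothetical helper: `tags` is any mapping with a .get() method,
    standing in for the plugin's tag dictionary.
    """
    value = tags.get(tag_id)
    if value is None:
        # Fail with an explicit message instead of letting int(None)
        # raise a less descriptive TypeError.
        raise SyntaxError(f"missing required TIFF tag {name} ({tag_id})")
    return int(value)


# Example with a tag dictionary that lacks ImageWidth.
tags = {IMAGELENGTH: 480}
try:
    required_int_tag(tags, IMAGEWIDTH, "ImageWidth")
except SyntaxError as exc:
    print(f"Rejected: {exc}")
```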

What are your OS, Python and Pillow versions?

  • OS: Ubuntu
  • Python: 3.8
  • Pillow: 9.0.0
from PIL import Image, ImageSequence

tiff_file_name = "MultiFormats.tif"

# Read the file

with Image.open(tiff_file_name) as tiff_img:
    for i, page in enumerate(ImageSequence.Iterator(tiff_img)):
        print(f"Page {i}: Mode: {page.mode}, Size: {page.size}")
radarhere added the TIFF label Feb 2, 2022
@radarhere
Member

Hi.

For your first case, the problem is that even if I add 34712 to COMPRESSION_INFO, libtiff throws an error: "Compression scheme 34712 strip decoding is not implemented."
So if that is to be fixed, it will need to be fixed in libtiff.

For your second case, the casting was added by #4103. int(None) throws a TypeError, which I expect is then caught and an UnidentifiedImageError is thrown. Is that what you're experiencing?
Since both SyntaxError and TypeError are caught by

except (SyntaxError, IndexError, TypeError, struct.error):
I would think that SyntaxError would also result in UnidentifiedImageError. So I'm not following why a SyntaxError would make any difference to you?

radarhere changed the title from "Exception with specific TIF files" to "Exceptions with specific TIF files" on Feb 2, 2022
@golden-e
Author

golden-e commented Feb 3, 2022

Hi radarhere,

Thank you for the kind response. For case 1 then, it perhaps will remain a limitation and I can live with that.

For case 2, I can confirm that with PIL 9.0.0 on Python 3.8.10, a TypeError is raised and not an UnidentifiedImageError. I do in fact catch UnidentifiedImageError to handle other edge cases, but in this specific case, PIL did not report the problem as such.
One observation though: in the link you shared above, I do not see UnidentifiedImageError being thrown under the "except" handler. Rather, I see a "continue" statement, which I believe would result in None being returned as well?

For now, I will extend my code to support IndexError and struct.error and treat it as an UnidentifiedImageError.

Thanks much for your time and consideration to this problem.
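
The plan above, catching the lower-level exceptions in the calling code and treating them as one "unreadable image" condition, might be sketched as follows. This is an illustration under assumptions, not part of Pillow: `UnreadableImageError` and `iterate_pages_safely` are hypothetical names, and the set of caught exception types is a guess based on the errors mentioned in this thread.

```python
import struct


class UnreadableImageError(Exception):
    """Hypothetical application-level error for pages that cannot be decoded."""


def iterate_pages_safely(pages):
    """Yield (index, page) pairs from any iterable of pages.

    Low-level errors raised while advancing to the next page are
    converted into a single UnreadableImageError.
    """
    iterator = iter(pages)
    index = 0
    while True:
        try:
            page = next(iterator)
        except StopIteration:
            return
        except (KeyError, TypeError, IndexError, SyntaxError, struct.error) as exc:
            raise UnreadableImageError(
                f"page {index} could not be decoded: {exc!r}"
            ) from exc
        yield index, page
        index += 1
```

With Pillow installed, the iterable would be `ImageSequence.Iterator(tiff_img)`, and the calling code could catch `UnreadableImageError` alongside `UnidentifiedImageError`.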

@radarhere
Member

It returns None from _open_core, yes, but if there is no other image format that accepts your file (it is likely there is not), it then continues to the outer function to raise an UnidentifiedImageError.

Ok. So is that all that you were after then?

@golden-e
Author

golden-e commented Feb 3, 2022

That would be correct, assuming the outer function is the one calling _open_core (which in my case is ImageSequence.Iterator)

@radarhere
Member

It's being called here

Pillow/src/PIL/Image.py

Lines 3017 to 3021 in 5a8ad4e

im = _open_core(fp, filename, prefix, formats)
if im is None:
    if init():
        im = _open_core(fp, filename, prefix, formats)

as part of Image.open.

@radarhere
Member

Closing, unless there are further comments or questions.

Without a specific image to discuss for case 2, I don't think there's anything to be done. The width of an image is a crucial piece of information, and without it, it doesn't sound surprising that the image cannot be read.
