I;16 TIFF read with fillorder=2 #5249

Closed
jamesra opened this issue Feb 4, 2021 · 6 comments · Fixed by #6132

jamesra commented Feb 4, 2021

OS: Windows 10
Python 3.7 x64
Pillow 8.1

I have an environment that has been using Pillow for years, at least since 2014. I recently upgraded to the latest version of Pillow and .tif files I use for test cases are no longer readable. I'm not sure if this is a regression in Pillow or a side-effect of SciPy/Numpy retiring some functions and the refactoring switching this path entirely to Pillow.

The problem

I get an image format exception.

Full-resolution bug reproduction image: a 16-bit grayscale TIFF file exported by IMOD, a program we use on transmission electron microscope data.

The repro code is essentially:

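import numpy as np
from PIL import Image

ImageFullPath = "10000.tif"  # path to the downloaded reproduction image

with Image.open(ImageFullPath) as im:
    # Convert the 16-bit grayscale image to a NumPy array
    image = np.array(im, dtype=np.uint16)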

Then adjust the contrast with temp = image / image.max() and display the result in your favorite image viewer.

The image should look something like this:
[Image: PillowGoodLibTiffRead]

The format exception occurs because the TIFF mode is not recognized: the mode table has no 16-bit entry with fillorder = 2. The key the image generates is: (b'II', 1, (1,), 2, (16,), ())

In TiffImagePlugin.py's _setup method (roughly line #1266), there is no entry for this key, which causes an "Unsupported format" error.
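To confirm that a file really declares FillOrder=2 independently of Pillow, the tag can be read straight from the TIFF header. This is a minimal sketch using only the standard library. It assumes a classic (non-BigTIFF) file whose first IFD describes the image, and read_fillorder is a hypothetical helper name, not part of Pillow.

```python
import struct

FILLORDER_TAG = 266  # TIFF tag number for FillOrder


def read_fillorder(data: bytes) -> int:
    """Return FillOrder from the first IFD of a classic TIFF, or 1 if absent."""
    byte_order = data[:2]
    if byte_order == b"II":
        endian = "<"  # little-endian file
    elif byte_order == b"MM":
        endian = ">"  # big-endian file
    else:
        raise ValueError("not a TIFF file")

    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF is not handled here)")

    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    for index in range(entry_count):
        # Each IFD entry is 12 bytes: tag, type, count, value/offset
        entry_offset = ifd_offset + 2 + 12 * index
        tag, _field_type, _count = struct.unpack_from(
            endian + "HHI", data, entry_offset
        )
        if tag == FILLORDER_TAG:
            # FillOrder is a single SHORT stored inline in the value field
            (value,) = struct.unpack_from(endian + "H", data, entry_offset + 8)
            return value
    return 1  # default when the tag is not present


# Synthetic little-endian TIFF: header plus one IFD entry, FillOrder (SHORT) = 2
sample = (
    b"II" + struct.pack("<HI", 42, 8)
    + struct.pack("<H", 1)
    + struct.pack("<HHI", FILLORDER_TAG, 3, 1) + struct.pack("<HH", 2, 0)
    + struct.pack("<I", 0)
)
print(read_fillorder(sample))
```

For a file on disk, the same function can be given the bytes read from the file opened in binary mode.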

Existing OPEN_INFO for I;16 entries:

(II, 1, (1,), 1, (16,), ()): ("I;16", "I;16"),
(MM, 1, (1,), 1, (16,), ()): ("I;16B", "I;16B"),

If I add entries to match fillorder = 2, the _setup method proceeds to load the image; however, the pixels are not read correctly.

Added Entries:

(II, 1, (1,), 2, (16,), ()): ("I;16", "I;16"),
(MM, 1, (1,), 2, (16,), ()): ("I;16B", "I;16B"),

The non-libtiff path will now load the image into a numpy array, but the pixel values are incorrect. I strongly suspect the byte order is reversed:
[Image: PillowBadTifRead]

Workaround

Pillow does read the image correctly when libtiff is used (READ_LIBTIFF = True, TiffImagePlugin.py#58). Interestingly, when TiffImageFile._setup follows the READ_LIBTIFF = True path, it sets fillorder = 1 if fillorder == 2 (TiffImagePlugin.py ~#1315):

            # libtiff handles the fillmode for us, so 1;IR should
            # actually be 1;I. Including the R double reverses the
            # bits, so stripes of the image are reversed.  See
            # https://github.com/python-pillow/Pillow/issues/279
            if fillorder == 2:
                # Replace fillorder with fillorder=1
                key = key[:3] + (1,) + key[4:]
                logger.debug(f"format key: {key}")
                # this should always work, since all the
                # fillorder==2 modes have a corresponding
                # fillorder=1 mode
                self.mode, rawmode = OPEN_INFO[key]
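The key rewrite in that excerpt can be tried in isolation. The sketch below applies the same tuple slicing to the key reported for the problem image. The helper name with_default_fillorder is hypothetical and only illustrates the slicing; it is not a Pillow function.

```python
# Format key reported for the problem image, laid out as
# (byte order, photometric, sample format, fillorder, bits per sample, extra samples)
key = (b"II", 1, (1,), 2, (16,), ())


def with_default_fillorder(format_key):
    """Return a copy of the key with fillorder (index 3) replaced by 1."""
    return format_key[:3] + (1,) + format_key[4:]


normalized = with_default_fillorder(key)
print(normalized)  # (b'II', 1, (1,), 1, (16,), ())
```

The rewritten key matches the existing little-endian I;16 entry, which is why the libtiff path finds a mode once fillorder has been replaced.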

Fix options

1) In my code I could manually set READ_LIBTIFF = True and add the missing entries to the OPEN_INFO dictionary, then require LibTiff in my setup.py for my users.

2) It would be nice to default the READ_LIBTIFF and WRITE_LIBTIFF variables based on whether LibTiff is present, but wiser minds may disagree. I had hoped that simply testing whether import libtiff threw an exception would be enough, but it appeared to be more complicated than that because tiff.dll needs to be on the path. Instead, I added _hasencoder and _hasdecoder to Image.py and used them to set the READ_LIBTIFF and WRITE_LIBTIFF variables in TiffImagePlugin.py.

TiffImagePlugin.py ~#58

# Set these to true to force use of libtiff for reading or writing. 
# Otherwise, default to libtiff if it is available
READ_LIBTIFF = Image._hasdecoder('libtiff')
WRITE_LIBTIFF = Image._hasencoder('libtiff')

We also need to add entries to the OPEN_INFO dictionary... That should probably be conditional on libtiff being present. Feels sketchy.

Image.py ~#415 additions:

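def _hasdecoder(decoder_name):
    """Return True if the decoder is registered in DECODERS or built into core."""
    if decoder_name in DECODERS:
        return True

    if hasattr(core, decoder_name + "_decoder"):
        return True

    return False

def _hasencoder(encoder_name):
    """Return True if the encoder is registered in ENCODERS or built into core."""
    if encoder_name in ENCODERS:
        return True

    if hasattr(core, encoder_name + "_encoder"):
        return True

    return False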

3) Pillow's software decoder could be fixed to handle the byte ordering problem so that the format can be added to OPEN_INFO. I tried setting rawmode = IR;16, but that prevented the image from loading into the numpy array. Without an obvious fix, I decided to stop and file this bug for guidance.
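As a point of comparison for option 2, Pillow's public PIL.features module can already report whether the installed build includes libtiff, without touching private attributes. The sketch below wraps that check. The wrapper name libtiff_available is hypothetical, and whether this check suits the default proposed above is an assumption, not something the maintainers have confirmed here.

```python
def libtiff_available() -> bool:
    """Report whether the installed Pillow build includes libtiff support."""
    try:
        from PIL import features
    except ImportError:
        # Pillow itself is missing, so libtiff support cannot be present
        return False
    return bool(features.check("libtiff"))


print("libtiff support:", libtiff_available())
```

Because the check runs against the compiled core module, it does not depend on a separate libtiff Python package being importable.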

jamesra changed the title from "I:16 Tiff read, possible regression" to "I:16 Tiff read with fillmode=2" on Feb 4, 2021
radarhere changed the title from "I:16 Tiff read with fillmode=2" to "I;16 Tiff read with fillmode=2" on Feb 5, 2021
radarhere added the TIFF label on Feb 5, 2021
@kmilos
Contributor

kmilos commented Feb 5, 2021

I think there is some confusion (generally, not only in this report) on what FillOrder does - it is for machine bit ordering within a byte, not byte ordering (endianness). Default 1 means the ordering is native for the machine (and probably matches byte order), 2 means it is reversed from the expected order. I think the tag definition of LibTIFF is deceptive as this is not an absolute definition (FILLORDER_MSB2LSB/LSB2MSB), but a relative one (native vs reversed).

It is very strange you should ever come across FillOrder=2 when dealing with "whole" byte images like 16-bit ones, as it does not make sense when the bytes are complete (unless the image was produced by some very exotic HW where byte and bit ordering don't match). It is really meant to give clues for bit ordering when you don't have complete bytes, like 1-bit facsimile... From the TIFF spec:

The logical order of bits within a byte.
...
We recommend that FillOrder=2 be used only in special-purpose applications. It
is easy and inexpensive for writers to reverse bit order by using a 256-byte lookup
table. FillOrder = 2 should be used only when BitsPerSample = 1 and the data is
either uncompressed or compressed using CCITT 1D or 2D compression, to
avoid potentially ambigous situations.
Support for FillOrder=2 is not required in a Baseline TIFF compliant reader
Default is FillOrder = 1.
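The 256-byte lookup table mentioned in the spec text can be sketched with the standard library alone. The example assumes that FillOrder=2 data for 16-bit samples is simply each stored byte with its bits reversed and the byte positions left alone; the names REVERSE_BITS and reverse_fillorder are hypothetical and not taken from Pillow or LibTIFF.

```python
import struct

# 256-entry table mapping each byte value to its bit-reversed counterpart
REVERSE_BITS = bytes(int(f"{value:08b}"[::-1], 2) for value in range(256))


def reverse_fillorder(raw: bytes) -> bytes:
    """Reverse the bit order inside every byte; byte positions are unchanged."""
    return raw.translate(REVERSE_BITS)


# A little-endian 16-bit sample with value 0x0102, as stored with FillOrder=1
stored_normally = struct.pack("<H", 0x0102)
# The same sample as a FillOrder=2 writer would store it (under the assumption above)
stored_reversed = reverse_fillorder(stored_normally)

# Reading the FillOrder=2 bytes without undoing the reversal gives a wrong value
(misread,) = struct.unpack("<H", stored_reversed)
# Applying the table a second time restores the original sample
(restored,) = struct.unpack("<H", reverse_fillorder(stored_reversed))
print(hex(misread), hex(restored))  # 0x8040 0x102
```

The misread value is neither the original nor its byte-swapped form (0x0201), which illustrates why swapping endianness alone does not repair such an image.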

That being said, I think LibTIFF should indeed already be handling this improbable corner case behind the scenes (TBC?), so it is up to Pillow to hook this up eventually...

In any case, sounds like you should also be fixing the writing application. ;)

@jamesra
Author

jamesra commented Feb 8, 2021

Reading your answer, it sounds like an improvement to my solution would be for Pillow to pass all unknown TIFF formats to LibTIFF if it is available?

The images were IMOD's conversion of SerialEM .mrc files from roughly 2009... without going into ancient history, I doubt any current IMOD users are getting this data format. These old images are still used in unit tests though. I could certainly convert them into a reasonable TIFF format and move on. In general, I feel that when I get a TIFF file it has usually been generated by older tools, and unexpected behavior is more common than it should be. Pillow defaulting to LibTIFF in those cases seems like a good generalized fix.

I don't mind attempting to put together a pull request if you feel this approach is on the right track... Is my approach to detecting the presence of LibTIFF correct?

@kmilos
Contributor

kmilos commented Feb 9, 2021

The images were IMOD's conversion of SerialEM .mrc files from roughly 2009...

No matter how old or exotic the source file is, there is no good reason for this IMOD software to produce a FillOrder=2 16-bit image (see again the TIFF spec text quoted above). Since it looks like it's still maintained (latest update some 10 days ago), somebody should let them know.

@radarhere
Member

@jamesra http://storage1.connectomes.utah.edu/Volumes/Temp/10000.tif is no longer available. Could you provide the image again?

@jamesra
Author

jamesra commented Feb 6, 2022

Sorry about the late reply. It was a three-day weekend when I got this E-mail and I forgot when I got back to work. Please try this version: http://storage1.connectomes.utah.edu/Temp/PillowIssue5249/10000.tif
(Argh... CTRL+Enter posted instead of giving me a newline. So I'm editing the original reply)

I'm not sure why the image behind the link was removed. Probably an accident. I'm 99% sure the new link is the same file. However it may just be the same image. I'm switching dev machines today and haven't built my Python environment yet to test it against the repro above but don't want to forget this again. Thanks for taking a look.

radarhere changed the title from "I;16 Tiff read with fillmode=2" to "I;16 Tiff read with fillorder=2" on Mar 13, 2022
radarhere changed the title from "I;16 Tiff read with fillorder=2" to "I;16 TIFF read with fillorder=2" on Mar 13, 2022
@radarhere
Member

radarhere commented Mar 13, 2022

I've created PR #6132 to resolve this.

Effectively, I have chosen option 3 from your initial post. I didn't encounter your numpy problem though. The following code works for me using my PR.

import numpy as np
from PIL import Image

with Image.open('10000.tif') as im:
    image = np.array(im, dtype=np.uint16)
    temp = image / image.max()
    Image.fromarray(temp).save("out.tif")
