PIL cannot read BigTIFF #4513

Closed
madelinehayes opened this issue Apr 1, 2020 · 10 comments · Fixed by #6097
Labels: Anaconda (Issues with Anaconda's Pillow), TIFF

Comments

@madelinehayes

madelinehayes commented Apr 1, 2020

What did you do?

I'm trying to open an orthomosaic GeoTIFF in Python and crop it into 1000x1000 tiles. I am able to open some TIFF files in my notebook, but other files raise an UnidentifiedImageError. I set up my notebook using a Dockerfile that creates a conda environment.

What did you expect to happen?

I expected my code to work the same on all of my TIFF files, since they are all 4-band, 8-bit. The code works with a 992 MB file, but it fails with a 691 MB file.

What actually happened?

---------------------------------------------------------------------------
UnidentifiedImageError                    Traceback (most recent call last)
<ipython-input-85-c7f17ab4563e> in <module>
      6 
      7 # break it up into crops
----> 8 for k, piece in enumerate(crop(infile, tile_height, tile_width, stride, img_dict, prj_name), start_num):
      9     img=Image.new('RGB', (tile_height, tile_width), (255, 255, 255))
     10     print(img.size)

<ipython-input-83-c9b468169840> in crop(infile, tile_height, tile_width, stride, img_dict, prj_name)
      5 
      6 def crop(infile, tile_height, tile_width, stride, img_dict, prj_name):
----> 7     im = Image.open(infile)
      8     img_width, img_height = im.size
      9     print(im.size)

/opt/conda/envs/geo_env/lib/python3.8/site-packages/PIL/Image.py in open(fp, mode)
   2859     for message in accept_warnings:
   2860         warnings.warn(message)
-> 2861     raise UnidentifiedImageError(
   2862         "cannot identify image file %r" % (filename if filename else fp)
   2863     )

UnidentifiedImageError: cannot identify image file '../data/mosaics/GrandJason_SWRightThird_Nov2019_transparent_mosaic_group1.tif'

What are your OS, Python and Pillow versions?

  • OS: Ubuntu 18.04
  • Python: 3.8.2
  • Pillow: 7.0.0

from PIL import Image
import os
import argparse
import numpy as np
import json
import csv
import rasterio
import matplotlib
import folium
from pyproj import Proj, transform


%matplotlib inline


Image.MAX_IMAGE_PIXELS = 100000000000
# ingest the image
infile = "../data/mosaics/GrandJason_SWRightThird_Nov2019_transparent_mosaic_group1.tif"

img_dir = '..' + infile.split(".")[2]
prj_name = img_dir.split("/")[-1]
dataset = rasterio.open(infile)

# what is the name of this image
img_name = dataset.name
print('Image filename: {n}\n'.format(n=img_name))

# How many bands does this image have?
num_bands = dataset.count
print('Number of bands in image: {n}\n'.format(n=num_bands))

# How many rows and columns?
rows, cols = dataset.shape
print('Image size is: {r} rows x {c} columns\n'.format(r=rows, c=cols))

# Does the raster have a description or metadata?
desc = dataset.descriptions
metadata = dataset.meta

print('Raster description: {desc}\n'.format(desc=desc))

# What driver was used to open the raster?
driver = dataset.driver
print('Raster driver: {d}\n'.format(d=driver))

# What is the raster's projection?
proj = dataset.crs
print('Image projection:')
print(proj, '\n')

# What is the raster's "geo-transform"
gt = dataset.transform

print('Image geo-transform:\n{gt}\n'.format(gt=gt))

print('All raster metadata:')
print(metadata)
print('\n')

tile_height = tile_width = 1000
overlap = 80
stride = tile_height - overlap
start_num=0

#crop image into tiles
def crop(infile, tile_height, tile_width, stride, img_dict, prj_name):
    im = Image.open(infile) 
    img_width, img_height = im.size
    print(im.size)
    print(img_width * img_height / (tile_height - stride) / (tile_width - stride))
    count = 0
    for r in range(0, img_height-tile_height+1, stride):
        for c in range(0, img_width-tile_width+1, stride):
            #tile = im[r:r+100, c:c+100]
            box = (c, r, c+tile_width, r+tile_height)
            top_pixel = [c,r]
            img_dict[prj_name + "---" + str(count) + ".png"] = top_pixel
            count += 1
            yield im.crop(box)

#split image into heightxwidth patches
img = Image

img_dict = {}

# create the dir if it doesn't already exist
if not os.path.exists(img_dir):
    os.makedirs(img_dir)

# break it up into crops
for k, piece in enumerate(crop(infile, tile_height, tile_width, stride, img_dict, prj_name), start_num):
    img=Image.new('RGB', (tile_height, tile_width), (255, 255, 255))
    print(img.size)
    print(piece.size)
    img.paste(piece)
    image_name = prj_name + "---%s.png" % k
    path=os.path.join(img_dir, image_name)
    img.save(path)
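
The tiling arithmetic in crop() can be checked without opening any image. The following is a minimal sketch, not part of the original report: `tile_origins` is a hypothetical helper that repeats the two `range` loops above, and the dimensions are the 31880 x 18679 raster size reported later in this thread.

```python
def tile_origins(img_width, img_height, tile_width, tile_height, stride):
    """Return the (column, row) top-left corner of every full tile,
    in the same order as the nested loops in crop()."""
    return [
        (c, r)
        for r in range(0, img_height - tile_height + 1, stride)
        for c in range(0, img_width - tile_width + 1, stride)
    ]


# Values from the report: 1000-pixel tiles with an 80-pixel overlap.
tile_size = 1000
stride = tile_size - 80

origins = tile_origins(31880, 18679, tile_size, tile_size, stride)
print(len(origins))   # 680 tiles: 34 columns x 20 rows
print(origins[-1])    # (30360, 17480)
```

Because only full tiles are produced, the last tile ends at column 31360 and row 18480, so the rightmost 520 columns and bottom 199 rows of this raster would not appear in any tile.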
@cgohlke
Contributor

cgohlke commented Apr 2, 2020

Could be a BigTIFF file, which is not supported by Pillow. The first 4 bytes in BigTIFF files are b'II\x2B\x00' or b'MM\x00\x2B'
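
A quick way to run this check is to compare the first four bytes of the file against those signatures. This is a minimal sketch using only the standard library; the function names `tiff_flavor` and `tiff_flavor_of_file` are made up for illustration and are not part of Pillow.

```python
# BigTIFF signatures quoted in the comment above (version 43 = 0x2B).
BIGTIFF_MAGIC = (b"II\x2B\x00", b"MM\x00\x2B")
# Classic TIFF stores version 42 (0x2A) in the same position.
CLASSIC_TIFF_MAGIC = (b"II\x2A\x00", b"MM\x00\x2A")


def tiff_flavor(header: bytes) -> str:
    """Classify a file by its first four bytes."""
    magic = header[:4]
    if magic in BIGTIFF_MAGIC:
        return "bigtiff"
    if magic in CLASSIC_TIFF_MAGIC:
        return "tiff"
    return "unknown"


def tiff_flavor_of_file(path: str) -> str:
    """Read only the first four bytes of the file at `path` and classify them."""
    with open(path, "rb") as f:
        return tiff_flavor(f.read(4))
```

Calling `tiff_flavor_of_file(infile)` on the failing mosaic would show whether it is a BigTIFF, regardless of the file's size on disk.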

@hugovk added the TIFF label Apr 2, 2020
@radarhere changed the title from "PIL Image cannot read some TIF files, but it can read other TIF files" to "PIL Image cannot read some TIF files" Apr 2, 2020
@radarhere changed the title from "PIL Image cannot read some TIF files" to "PIL cannot read some TIF files" Apr 2, 2020
@madelinehayes
Author

> Could be a BigTIFF file, which is not supported by Pillow. The first 4 bytes in BigTIFF files are b'II\x2B\x00' or b'MM\x00\x2B'

None of my TIFF files are BigTIFF, and they are all well under 4 GB in size.

@madelinehayes
Author

madelinehayes commented Apr 2, 2020

Here is the information for my raster, followed by the first bytes of the file:


Image size is: 18679 rows x 31880 columns

Raster description: (None, None, None, None)

Raster driver: GTiff

Image projection:
EPSG:32720 

Image geo-transform:
| 0.02, 0.00, 632879.80|
| 0.00,-0.02, 4341041.62|
| 0.00, 0.00, 1.00|

All raster metadata:
{'driver': 'GTiff', 'dtype': 'uint8', 'nodata': None, 'width': 31880, 'height': 18679, 'count': 4, 'crs': CRS.from_epsg(32720), 'transform': Affine(0.01804, 0.0, 632879.79994,
       0.0, -0.01804, 4341041.621540001)}

bytearray(b'II+\x00\x08')

@cgohlke
Contributor

cgohlke commented Apr 2, 2020

> bytearray(b'II+\x00\x08')

That's BigTIFF.
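
The reason those bytes identify a BigTIFF is that `+` is the ASCII character for 0x2B, which is 43, the BigTIFF version number; a classic TIFF stores 42 there. The sketch below decodes the header fields with `struct`. The helper name `parse_tiff_header` is hypothetical and only handles the leading bytes, not the rest of the file.

```python
import struct


def parse_tiff_header(header: bytes) -> dict:
    """Decode the byte-order mark and version from the start of a TIFF file."""
    order = header[:2]
    if order == b"II":
        endian = "<"   # little-endian ("Intel")
    elif order == b"MM":
        endian = ">"   # big-endian ("Motorola")
    else:
        raise ValueError("not a TIFF byte-order mark: %r" % order)

    # Bytes 2-3 hold the version: 42 for classic TIFF, 43 for BigTIFF.
    (version,) = struct.unpack(endian + "H", header[2:4])
    info = {
        "byte_order": order.decode("ascii"),
        "version": version,
        "bigtiff": version == 43,
    }

    # In BigTIFF, bytes 4-5 hold the size of offsets in bytes (8).
    if version == 43 and len(header) >= 6:
        (info["offset_size"],) = struct.unpack(endian + "H", header[4:6])
    return info


print(parse_tiff_header(b"II+\x00\x08"))
```

For the five bytes posted above, the version decodes to 43. The trailing `\x08` is the first byte of the 8-byte offset size field, which only appears in BigTIFF headers.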

@radarhere changed the title from "PIL cannot read some TIF files" to "PIL cannot read BigTIFF" Apr 3, 2020
@hroskes

hroskes commented Jun 4, 2020

Are there any plans to support BigTIFF? I'd like to add my name to the list of people who would find it very useful.

@juliangilbey

I've just stumbled upon this issue too. At first glance, it seems to me that it should be pretty straightforward to modify src/libImaging/TiffDecode.c to handle the BigTIFF format, which is meant to be very closely compatible with the TIFF format.

@juliangilbey

Ah, after working on this for about a day, and getting most of the way to a working solution, I realised that it is probably futile, as BigTIFF files are to a great extent used for medical images, and they will include multiple images within the file (for example, thumbnails). But PIL can only handle one image in a file, so this won't work very well.

A better solution is probably to identify BigTIFF files (from the b"MM\x00\x2B" or b"II\x2B\x00" header bytes) and then recommend using something like tifffile to handle the image.

If it's of interest, I could upload what I've done so far to GitHub. (It's primarily work on TiffImagePlugin.py and TiffTags.py.)

Best wishes!

@radarhere
Member

> But PIL can only handle one image in a file, so this won't work very well.

Pillow can load multiple images. It opens the file at one image, and then uses seek() to navigate to a different image.
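
As an illustration of that seek() pattern, here is a minimal sketch that walks every frame of a multi-page file and records its size. It assumes Pillow is installed; `frame_sizes` is a hypothetical helper, and the demo builds a small classic (non-BigTIFF) multi-page TIFF in memory rather than using a real medical or geospatial image.

```python
import io

from PIL import Image


def frame_sizes(fp):
    """Return the (width, height) of every frame in an image file."""
    sizes = []
    with Image.open(fp) as im:
        # Single-frame formats may not define n_frames, so default to 1.
        for index in range(getattr(im, "n_frames", 1)):
            im.seek(index)          # move to frame `index`
            sizes.append(im.size)   # size can differ per frame, e.g. thumbnails
    return sizes


# Build a two-page TIFF in memory: a "full" image and a smaller second page.
pages = [Image.new("L", (4, 4)), Image.new("L", (8, 2))]
buffer = io.BytesIO()
pages[0].save(buffer, format="TIFF", save_all=True, append_images=pages[1:])
buffer.seek(0)

print(frame_sizes(buffer))
```

The same loop applies to any multi-frame format Pillow can read; only the file passed to `Image.open` changes.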

If you're willing to share what you've done, and an example image, that would be interesting.

@juliangilbey

OK, shall do when I've had a chance to get at least some of it working!

@radarhere
Member

I've created PR #6097 to resolve this.

@aclark4life added the Anaconda (Issues with Anaconda's Pillow) label May 19, 2023