Added support for embedding indexed images #443

RedShy · 2022-05-19T20:49:20Z

I added support for embedding indexed images in a PDF. The code was already there; I just had to make some minor fixes. I also added a test to cover the new feature.

There is a problem though: previous tests now break.

The tests used to process indexed images by converting them to RGBA and then embedding them in the PDF; now they embed them directly as indexed images.

Not sure how to proceed now.

Checklist:

- The GitHub pipeline is OK (green), meaning that both pylint (static code analyzer) and black (code formatter) are happy with the changes of this PR.
- A unit test is covering the code added / modified by this PR
- This PR is ready to be merged
- In case of a new feature, docstrings have been added, with also some documentation in the docs/ folder
- A mention of the change is present in CHANGELOG.md

Lucas-C

Good job really! Well done, the code is clean and you seemingly reduced the size of the PDFs generated when they embed indexed images.

My idea is to generate the PDF again using generate=True.

Seems like the best approach here.
Just make sure the new reference PDF files are smaller than the previous ones.
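
A rough sketch of how the size check suggested above could be automated. It assumes the previous reference PDFs were copied to a separate directory before regenerating them; the helper name `larger_files` and both directory arguments are hypothetical and not part of fpdf2 or its test suite.

```python
from pathlib import Path


def larger_files(old_dir, new_dir):
    """Return the names of PDFs in new_dir that are bigger than their counterpart in old_dir."""
    grown = []
    for new_pdf in sorted(Path(new_dir).glob("*.pdf")):
        old_pdf = Path(old_dir) / new_pdf.name
        # Only compare files that exist in both directories.
        if old_pdf.is_file() and new_pdf.stat().st_size > old_pdf.stat().st_size:
            grown.append(new_pdf.name)
    return grown
```

An empty list means none of the regenerated reference files grew.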

codecov · 2022-05-20T08:46:37Z

Codecov Report

Merging #443 (838d33c) into master (1477519) will increase coverage by 0.12%.
The diff coverage is 90.00%.

@@            Coverage Diff             @@
##           master     #443      +/-   ##
==========================================
+ Coverage   91.88%   92.00%   +0.12%     
==========================================
  Files          22       22              
  Lines        6408     6431      +23     
  Branches     1290     1298       +8     
==========================================
+ Hits         5888     5917      +29     
+ Misses        298      293       -5     
+ Partials      222      221       -1

Impacted Files	Coverage Δ
fpdf/image_parsing.py	`88.07% <87.50%> (+1.90%)`	⬆️
fpdf/fpdf.py	`89.51% <100.00%> (+0.40%)`	⬆️
fpdf/drawing.py	`93.24% <0.00%> (-0.15%)`	⬇️
fpdf/svg.py	`96.96% <0.00%> (+0.01%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

RedShy · 2022-05-20T16:38:07Z

> Good job really! Well done

Thank you! Much appreciated :)

I observed that if I change the image filter to "JPXDecode" or "DCTDecode", it throws an error when executing img.save(compressed_bytes, format="JPEG2000"), so I forced the conversion to RGBA when image_filter != "FlateDecode" and the image mode is "P" or "PA".
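
The rule described above can be sketched as a small pure function. This is only an illustration of the stated condition: the name `mode_for_filter` is hypothetical and is not fpdf2's actual code.

```python
def mode_for_filter(mode, image_filter):
    """Return the Pillow mode an image should be converted to before encoding.

    Palette modes ("P", "PA") are kept only for FlateDecode; for any other
    filter they are promoted to "RGBA", as described in the comment above.
    """
    if image_filter != "FlateDecode" and mode in ("P", "PA"):
        return "RGBA"
    return mode
```

With a Pillow image at hand, the result would typically be fed to `img.convert(...)` when it differs from `img.mode`.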

It also seems that Pillow doesn't work well with "PA" images.
I used pngquant to create an indexed-alpha PNG, but Pillow still opens it as a "P" image without an alpha channel. I then tried to convert it to "PA", but the resulting embedded image is:

To make sure that the code regarding the alpha channel works correctly, I manually set the alpha channel:

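import numpy as np
from PIL import Image

palette = img.palette
# Copy the pixels into a writable array; a "PA" image has shape (height, width, 2)
a = np.array(img)
# Channel 0 is the palette index, channel 1 the alpha: force alpha to 100 everywhere
a[:, :, 1] = 100
img = Image.fromarray(a, mode="PA")
img.palette = palette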

And the resulting image has no artifacts:

Lucas-C · 2022-05-20T20:32:53Z

Thank you for your work on this!
I think I'll take the time to review this on Monday morning.
Have a good week-end 😊

PS: don't forget to add a mention of this in the CHANGELOG.md

RedShy · 2022-05-21T10:44:33Z

> I think I'll take the time to review this on Monday morning.

Sure! Thank you! Have a good week-end too 😊

Lucas-C · 2022-05-23T20:00:52Z

> I observed that if I change the image filter to JPXDecode or DCTDecode it throws an error when executing img.save(compressed_bytes, format="JPEG2000")

What was the error exactly?

> It also seems that Pillow doesn't work well with "PA" images.
> I used pngquant to create an indexed-alpha PNG, but Pillow still opens it as a "P" image without an alpha channel.

Yeah, I've run some tests and Pillow definitely has trouble with those images...
It may be worth reporting this issue to them, by the way.
Regarding fpdf2, I don't have any advice/guidance regarding this case, I think you did well in this PR!

RedShy · 2022-05-23T20:41:13Z

> It may be worth reporting this issue to them, by the way.

In the next few days I will try to open an issue on Pillow's GitHub page.

> I think you did well in this PR!

Thank you!

> What was the error exactly?

image_filter=JPXDecode: "OSError: encoder error -2 when writing image file", raised when executing img.save(compressed_bytes, format="JPEG2000")

Traceback (most recent call last):
  File "C:\Program Files\Python3.10\lib\site-packages\PIL\ImageFile.py", line 500, in _save
    fh = fp.fileno()
io.UnsupportedOperation: fileno

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\Progetti OpenSource\playground_fpdf2\coverage.py", line 80, in <module>
    test_png_indexed_files(None)
  File "D:\Progetti OpenSource\playground_fpdf2\coverage.py", line 53, in test_png_indexed_files
    pdf.image(HERE / "palette_2.png", x=80, y=10, w=50, h=50)
  File "D:\Progetti OpenSource\forked_fpdf2\fpdf2\fpdf\fpdf.py", line 299, in wrapper
    return fn(self, *args, **kwargs)
  File "D:\Progetti OpenSource\forked_fpdf2\fpdf2\fpdf\fpdf.py", line 3235, in image
    info = get_img_info(img, self.image_filter)
  File "D:\Progetti OpenSource\forked_fpdf2\fpdf2\fpdf\image_parsing.py", line 92, in get_img_info
    info["data"] = _to_data(img, image_filter)
  File "D:\Progetti OpenSource\forked_fpdf2\fpdf2\fpdf\image_parsing.py", line 151, in _to_data
    img.save(compressed_bytes, format="JPEG2000")
  File "C:\Program Files\Python3.10\lib\site-packages\PIL\Image.py", line 2300, in save
    save_handler(self, fp, filename)
  File "C:\Program Files\Python3.10\lib\site-packages\PIL\Jpeg2KImagePlugin.py", line 348, in _save
    ImageFile._save(im, fp, [("jpeg2k", (0, 0) + im.size, 0, kind)])
  File "C:\Program Files\Python3.10\lib\site-packages\PIL\ImageFile.py", line 526, in _save
    raise OSError(f"encoder error {s} when writing image file") from exc
OSError: encoder error -2 when writing image file

image_filter=DCTDecode: "OSError: cannot write mode P as JPEG", raised when executing img.save(compressed_bytes, format="JPEG")

Traceback (most recent call last):
  File "C:\Program Files\Python3.10\lib\site-packages\PIL\JpegImagePlugin.py", line 633, in _save
    rawmode = RAWMODE[im.mode]
KeyError: 'P'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\Progetti OpenSource\playground_fpdf2\coverage.py", line 80, in <module>
    test_png_indexed_files(None)
  File "D:\Progetti OpenSource\playground_fpdf2\coverage.py", line 56, in test_png_indexed_files
    pdf.image(HERE / "palette_3.png", x=150, y=10, w=50, h=50)
  File "D:\Progetti OpenSource\forked_fpdf2\fpdf2\fpdf\fpdf.py", line 299, in wrapper
    return fn(self, *args, **kwargs)
  File "D:\Progetti OpenSource\forked_fpdf2\fpdf2\fpdf\fpdf.py", line 3235, in image
    info = get_img_info(img, self.image_filter)
  File "D:\Progetti OpenSource\forked_fpdf2\fpdf2\fpdf\image_parsing.py", line 92, in get_img_info
    info["data"] = _to_data(img, image_filter)
  File "D:\Progetti OpenSource\forked_fpdf2\fpdf2\fpdf\image_parsing.py", line 146, in _to_data
    img.save(compressed_bytes, format="JPEG")
  File "C:\Program Files\Python3.10\lib\site-packages\PIL\Image.py", line 2300, in save
    save_handler(self, fp, filename)
  File "C:\Program Files\Python3.10\lib\site-packages\PIL\JpegImagePlugin.py", line 635, in _save
    raise OSError(f"cannot write mode {im.mode} as JPEG") from e
OSError: cannot write mode P as JPEG

RedShy · 2022-05-27T11:25:58Z

Over the last few days I dug into Pillow's code and discovered that I had wrongly assumed that "P" images don't support transparency while "PA" images do. In fact, both support transparency:

"P" supports transparency through the attribute info["transparency"] (source), which can be:
- int: the index of the palette color regarded as "transparent"; all others are fully opaque
- bytes: a byte string with one alpha value for each palette entry
"PA" supports transparency through one alpha value for each pixel.

It seems that Pillow has a problem generating the alpha layer when converting from "P" to "PA". This doesn't happen when converting from "P" to "RGBA". I opened an issue and a related PR that should fix it.
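
To make the two forms of info["transparency"] for "P" images concrete, here is a pure-Python model that expands either form into one alpha value per pixel. It does not call Pillow; the function names are hypothetical, and treating palette entries beyond a short byte string as opaque is an assumption that mirrors how the PNG tRNS chunk is defined.

```python
def palette_alpha(transparency, palette_size):
    """Return one alpha value (0-255) per palette entry."""
    if transparency is None:
        # No transparency information: every entry is fully opaque.
        return [255] * palette_size
    if isinstance(transparency, int):
        # A single palette index is fully transparent, all others opaque.
        return [0 if i == transparency else 255 for i in range(palette_size)]
    # bytes: one alpha per entry; missing trailing entries are assumed opaque.
    alphas = list(transparency[:palette_size])
    return alphas + [255] * (palette_size - len(alphas))


def pixel_alpha(indices, transparency, palette_size):
    """Map each pixel's palette index to its alpha value."""
    table = palette_alpha(transparency, palette_size)
    return [table[i] for i in indices]
```

The per-pixel list returned by `pixel_alpha` corresponds to what a "PA" image stores directly in its second channel.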

Regarding fpdf2 and "P" images, the transparency is currently not carried over into the PDF. I see 2 approaches to do this:

- Implement a homebrew solution using info["transparency"]
- Use Pillow to get the alpha channel of "P" images and reuse the existing code. This can be achieved by converting the image to "RGBA" and extracting the alpha channel. This is the solution I have implemented for now.
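
A minimal sketch of the second approach, assuming Pillow is installed. The helper name `alpha_channel_of` is hypothetical and this is not the code from the PR; it only shows the conversion to "RGBA" followed by the extraction of the "A" band.

```python
from PIL import Image


def alpha_channel_of(img):
    """Return the alpha channel of img as an "L" image, via an RGBA conversion."""
    return img.convert("RGBA").getchannel("A")


# Tiny 2x1 palette image built in memory, standing in for a real indexed PNG.
img = Image.new("P", (2, 1))
img.putpalette([255, 0, 0, 0, 255, 0])  # two palette entries: red, green
img.putpixel((1, 0), 1)                 # second pixel uses palette entry 1
alpha = alpha_channel_of(img)
```

The returned image has the same size as the input and one 8-bit alpha value per pixel, which is the shape of data an existing alpha-handling code path would expect.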

Regarding "PA" images, the code in fpdf2 seems to work correctly as long as the alpha layer is correctly set, so I suggest we keep supporting them.
I added tests that exercise "PA" images by inserting the alpha channel extracted from the converted "RGBA" image, and they are displayed correctly in the PDF.

(I also changed the sample images in the test putting some pics of flowers that I took near my house 😊)

Lucas-C · 2022-05-27T13:06:56Z

Wow, you did an excellent job!
You even fixed the bug upstream in Pillow, awesome!

I'm going to fix the minor file conflicts that prevent this PR from being merged,
and see if it can be closed afterwards.

RedShy · 2022-05-27T16:13:00Z

> Wow, you did an excellent job!
> You even fixed the bug upstream in Pillow, awesome!

Thank you, that means a lot to me! It's really rewarding and encourages me to do more! 😊

Lucas-C · 2022-05-27T16:25:18Z

It seems like some tests are failing:

FAILED test/html/test_html.py::test_img_inside_html_table - AssertionError: c...
FAILED test/html/test_html.py::test_img_inside_html_table_without_explicit_dimensions
FAILED test/html/test_html.py::test_img_inside_html_table_centered - Assertio...
FAILED test/html/test_html.py::test_img_inside_html_table_centered_with_align
FAILED test/image/test_image_info.py::test_get_img_info - assert {'bpc': 8,\n...
FAILED test/image/png_images/test_png_file.py::test_insert_png_files - Assert...
================= 6 failed, 914 passed, 18 warnings in 42.17s ==================

Do you have the same errors on your computer?

RedShy · 2022-05-27T16:39:52Z

> Do you have the same errors on your computer?

Yes, I forgot to run all the tests again.

Lucas-C · 2022-05-27T16:54:51Z

Merged! Thank you @RedShy 😊

Lucas-C · 2022-06-17T09:00:34Z

This feature has been included in the new v2.5.5 release!

RedShy added 3 commits May 19, 2022 21:44

added support for palette images

40d7c2d

added test images palette

24769e2

removed generate=true

2d1d2e9

Lucas-C approved these changes May 20, 2022

RedShy added 2 commits May 20, 2022 09:03

removed link=None

f6f2656

fixed previous tests

fb972e9

RedShy added 3 commits May 20, 2022 10:58

removed Trailing whitespace

3b5897e

testing image filters

58b7af0

managed "PA" images

78d626b

RedShy added 3 commits May 20, 2022 18:49

small fix

0487879

generating pdf with pillow 9.1.1

b689b10

unsupported image filter error test

320a47f

RedShy marked this pull request as ready for review May 20, 2022 17:19

allowing embedding of indexed PNG images

dce71c8

Lucas-C reviewed May 23, 2022

formatting

1ceb54f

RedShy added 3 commits May 25, 2022 09:50

added PIL image

6cf14be

added test for p/pa images

bf44515

porting transparency in the pdf of P images

21f9a4f

Merge branch 'master' into palette_images

84c8ccc

Lucas-C reviewed May 27, 2022

RedShy added 3 commits May 27, 2022 17:44

fixed typos

423f194

calling get_img_info

2ab0d83

Merge branch 'palette_images' of https://github.com/RedShy/fpdf2 into…

ff84c8f

… palette_images

removed trailing whitespace

097f264

RedShy added 2 commits May 27, 2022 18:28

fixed image info

45eff10

fixed pdfs

838d33c

Lucas-C merged commit d2fa052 into py-pdf:master May 27, 2022

RedShy deleted the palette_images branch May 27, 2022 17:08
