Add support to extract gray scale images #1460

joeywang4 · 2022-11-30T03:55:57Z

Currently, when gray scale images are extracted, they are incorrectly converted to RGB images. This PR fixes the issue by changing the palette and the mode when images are extracted.
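
As a rough illustration of the idea (a sketch, not the code from this PR): for an indexed image whose base color space is gray, the lookup table holds one byte per palette entry, whereas an RGB-style palette such as the one Pillow's `putpalette` takes by default expects three bytes per entry. The helper names `expand_gray_lookup` and `target_mode`, and the use of the string `"/DeviceGray"`, are assumptions made for this example only.

```python
def expand_gray_lookup(lookup: bytes, hival: int) -> bytes:
    """Expand a one-byte-per-entry gray lookup table into RGB triples.

    Hypothetical helper: `hival` is the highest valid palette index, so a
    gray lookup table is expected to contain exactly `hival + 1` bytes.
    """
    if len(lookup) != hival + 1:
        raise ValueError("lookup table length does not match hival + 1")
    # Repeat each gray level three times so it reads as an (R, G, B) entry.
    return b"".join(bytes([level]) * 3 for level in lookup)


def target_mode(base_color_space: str) -> str:
    """Pick the output image mode: "L" (8-bit gray) for a gray base, else "RGB"."""
    return "L" if base_color_space == "/DeviceGray" else "RGB"


palette = expand_gray_lookup(b"\x00\x80\xff", hival=2)
mode = target_mode("/DeviceGray")
```

With a palette expanded this way, the image can be converted to the gray mode instead of RGB, so the extracted file keeps a single channel.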

codecov · 2022-11-30T04:08:35Z

Codecov Report

Base: 94.31% // Head: 94.01% // Decreases project coverage by -0.30% ⚠️

Coverage data is based on head (6efa8a6) compared to base (940819f).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1460      +/-   ##
==========================================
- Coverage   94.31%   94.01%   -0.31%     
==========================================
  Files          28       30       +2     
  Lines        5171     5443     +272     
  Branches      980     1038      +58     
==========================================
+ Hits         4877     5117     +240     
- Misses        177      197      +20     
- Partials      117      129      +12

Impacted Files	Coverage Δ
PyPDF2/filters.py	`97.31% <100.00%> (+0.01%)`	⬆️
PyPDF2/_merger.py	`91.11% <0.00%> (-6.46%)`	⬇️
PyPDF2/_writer.py	`88.73% <0.00%> (-2.80%)`	⬇️
PyPDF2/constants.py	`100.00% <0.00%> (ø)`
PyPDF2/_utils.py	`99.48% <0.00%> (ø)`
PyPDF2/__init__.py	`100.00% <0.00%> (ø)`
PyPDF2/generic/_data_structures.py	`95.62% <0.00%> (+0.50%)`	⬆️
PyPDF2/_reader.py	`90.30% <0.00%> (+0.68%)`	⬆️


MartinThoma · 2022-12-02T11:42:25Z

Very nice!

Do you happen to have an example pdf where the new code would be applied? Or could you create one and upload it here?

I would add it to https://github.com/py-pdf/PyPDF2 and add a test.

joeywang4 · 2022-12-07T03:43:28Z

Ok, I have uploaded a test file grayscale.pdf and added it to test_get_images.

MartinThoma · 2022-12-08T21:01:40Z

Hi @joeywang4 ,

I've just added grayscale.pdf to the sample-files git submodule. Would you be so kind as to remove the file from your PR and use sample-files/019-grayscale-image/grayscale-image.pdf instead?

I want to keep the PyPDF2 repository from growing ever larger because of example PDF files. That growth could affect people who clone PyPDF2 or install it from the repository, which is why the file is added via the submodule.

MartinThoma

The code looks good - thanks for adding the test!

Please just remove the resources/grayscale.pdf and use SAMPLE_ROOT / "019-grayscale-image/grayscale-image.pdf" instead :-)

joeywang4 · 2022-12-09T05:04:17Z

@MartinThoma I have removed the test file and updated the path. Thanks for the suggestion!
I also changed the path of the extracted image file used when running the test. Otherwise, the extracted image could not be written to a file correctly when I ran the test locally.

MartinThoma · 2022-12-09T19:32:01Z

Very good work! 👍

If you want, I'll add you to the contributors list :-)

New Features (ENH):
- Add support to extract gray scale images (#1460)
- Add 'threads' property to PdfWriter (#1458)
- Add 'open_destination' property to PdfWriter (#1431)
- Make PdfReader.get_object accept integer arguments (#1459)

Bug Fixes (BUG):
- Scale PDF annotations (#1479)

Robustness (ROB):
- Padding issue with AES encryption (#1469)
- Accept empty object as null objects (#1477)

Documentation (DOC):
- Add module documentation the PaperSize class (#1447)

Maintenance (MAINT):
- Use 'page_number' instead of 'pagenum' (#1365)
- Add List of pages to PageRangeSpec (#1456)

Testing (TST):
- Cleanup temporary files (#1454)
- Mark test_tounicode_is_identity as external (#1449)
- Use Ubuntu 20.04 for running CI test suite (#1452)

[Full Changelog](2.11.2...2.12.0)

joeywang4 added 2 commits November 29, 2022 22:48

Add support to extract gray scale images

410ad78

Fix whitespaces for tests

84919b4

Add indexed gray image test file and update lookup len check

62e88e2

MartinThoma self-requested a review December 8, 2022 21:01

MartinThoma requested changes Dec 8, 2022

Remove test file and update test file path

6efa8a6

MartinThoma merged commit 22214e8 into py-pdf:main Dec 9, 2022

joeywang4 deleted the gray-image branch December 9, 2022 21:39
