Image with Filter "[/FlateDecode/JPXDecode]" not extracted #2087
Comments
For the other part of the issue, extraction of the |
It happens with all image sizes, in my example it is 67 x 68 px. Only the filter combination matters. |
Does |
You have a typo in extracting the filter names: |
Please provide a reproducer file. |
My bad :( "Filter" works correctly, so image extraction is the only problem, |
Thanks for the file. |
Thank you! |
Issue 2087: `fitz.i` (`extract_image`): the type of JPX images with more than one `/Filter` is not correctly recognized when inspecting the raw stream. Fixed by extracting the decoded stream: we already know the type from the PDF dict. Issue 2094: Rectangle recognition in `helper-devices.i` (`jm_checkrect()`) was wrong in not confirming that the x-coordinates are also the same in the respective corners. Also simplified rectangle orientation detection.
Fix #2110 (Discussion item #2111): File `__main__.py` - also include the font's xref in the generated file name. Fix #2094: File `helper-device.i` - also ensure equality of x coordinates of relevant corners before assuming a rectangle. Fix #2087: File `fitz.i` - if the JPX image format is already known, make sure to read the decoded image stream rather than the raw stream, which is still read in the other cases.
Fixed in PyMuPDF-1.21.1. |
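The fix description above says the image type was not recognized when inspecting the raw stream of an image with more than one `/Filter`. The following is a minimal sketch of why that can happen, not PyMuPDF's actual detection code: the helper `looks_like_jpx` and the stand-in payload are assumptions made for illustration. With `[/FlateDecode/JPXDecode]`, the stored bytes are zlib-compressed, so a JPEG 2000 signature only shows up after the Flate layer is undone.

```python
import zlib

# JP2 file signature box, and the start of a bare JPEG 2000 codestream
# (SOC marker FF4F followed by SIZ marker FF51).
JP2_SIGNATURE = bytes.fromhex("0000000C6A5020200D0A870A")
J2K_CODESTREAM = bytes.fromhex("FF4FFF51")


def looks_like_jpx(data: bytes) -> bool:
    """Hypothetical check: does the buffer start with a JPEG 2000 signature?"""
    return data.startswith(JP2_SIGNATURE) or data.startswith(J2K_CODESTREAM)


# Stand-in payload: a JP2 signature plus dummy bytes, not a real image.
fake_jpx = JP2_SIGNATURE + b"\x00" * 64

# What a stream with /Filter [/FlateDecode/JPXDecode] stores on disk.
raw_stream = zlib.compress(fake_jpx)

# What remains after undoing only the FlateDecode layer.
decoded_stream = zlib.decompress(raw_stream)

print(looks_like_jpx(raw_stream))      # signature hidden by compression
print(looks_like_jpx(decoded_stream))  # signature visible again
```

Since the PDF dictionary already names `JPXDecode`, reading the decoded stream avoids guessing the type from compressed bytes.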
Describe the bug (mandatory)
I'm trying to iterate over all images in the document, but for one filter combination it fails.
PyMuPDF works correctly for all filter combinations except `Filter [/FlateDecode/JPXDecode]`. PDFs with such image filters are read correctly by PDF readers and other Python PDF libraries, but PyMuPDF fails to extract the image and to get the correct filters via `xref_get_key(xref, "/Filters")`.
These:
<</ColorSpace/DeviceRGB/BitsPerComponent 8/Width 1672/Length 1964389/Height 1124/Name/im1/Subtype/Image/Type/XObject/Filter/JPXDecode>>
and
<</ID 26 0 R/Type/XObject/Length 476/Filter[/FlateDecode/DCTDecode]/Subtype/Image/BitsPerComponent 8/Width 126/Height 81/ColorSpace/DeviceRGB>>
are ok
this
<</ColorSpace/DeviceGray/BitsPerComponent 8/Width 67/Length 1958/Height 68/Name/im2/Subtype/Image/Type/XObject/Filter[/FlateDecode/JPXDecode]>>
fails
However, `document.xref_stream(xref)` correctly decompresses the stream, and the output is a valid JPEG 2000 stream.
To Reproduce (mandatory)
output for images with such filters:
subtype of xref 76 is /Image, but pymupdf can not extract it as image. filters: ('null', 'null')
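For versions before the fix, a possible workaround can be sketched from two points in this thread: the key name is `Filter` (per the comments, `"/Filters"` was a typo, which explains the `('null', 'null')` result), and `xref_stream(xref)` is reported to return a valid JPEG 2000 stream. The helpers `filter_names` and `jpx_bytes` below are hypothetical names, and the parsing of the returned value string is an assumption about its textual form.

```python
import re


def filter_names(doc, xref):
    """Return the filter names of an object, e.g. ['FlateDecode', 'JPXDecode'].

    Relies on Document.xref_get_key(), which returns a (type, value) pair;
    a missing key is reported with type 'null'.
    """
    kind, value = doc.xref_get_key(xref, "Filter")  # note: "Filter", not "/Filters"
    if kind == "null":
        return []
    # Assumption: value looks like '/JPXDecode' or '[/FlateDecode/JPXDecode]'.
    return re.findall(r"/([A-Za-z0-9]+)", value)


def jpx_bytes(doc, xref):
    """Return the decoded stream if the last filter is JPXDecode, else None."""
    if filter_names(doc, xref)[-1:] == ["JPXDecode"]:
        return doc.xref_stream(xref)
    return None


# Usage sketch (file name is hypothetical):
#   import fitz
#   doc = fitz.open("input.pdf")
#   for item in doc.get_page_images(0):
#       data = jpx_bytes(doc, item[0])  # item[0] is the image xref
#       if data is not None:
#           open(f"img-{item[0]}.jp2", "wb").write(data)
```

Both helpers take the document as a parameter, so they can be tried against any object offering `xref_get_key` and `xref_stream`.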
Your configuration (mandatory)
Windows 10 x64, Python 3.10, PyMuPDF 1.21, installed via `pip install pymupdf`