Feature request: implement im.getxmp() to return all embedded XMP meta data as XML #5076

Closed
laynr opened this issue Dec 3, 2020 · 6 comments · Fixed by #5144

@laynr

laynr commented Dec 3, 2020

Implement image.getxmp(), similar to image.getexif(), that returns all XMP metadata embedded in an image as XML.

Something like:

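import xml.etree.ElementTree

def getxmp(self):
    # Look through the application segments for the APP1 segment that carries XMP.
    for segment, content in self.applist:
        if segment == 'APP1':
            # The identifier and the XMP packet are separated by a null byte.
            marker, _, xmp_tags = content.partition(b'\x00')
            if marker == b'http://ns.adobe.com/xap/1.0/':
                return xml.etree.ElementTree.fromstring(xmp_tags)
    # No XMP segment was found.
    return None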

# Based off https://stackoverflow.com/a/32001778

FYI: This didn't work for me:
xmp_tags = self.info.get("XML:com.adobe.xmp")

I am sure this feature request has been asked before... but a search of 'XMP' in issues yielded nothing. Just asking for MVP, not write support, or tag comprehension.

XMP documentation:
Official: https://www.adobe.com/devnet/xmp.html
Helpful: https://exiftool.org/TagNames/XMP.html

Requesting output similar to the output of:
exiftool.exe -xmp:all -X image.jpg
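
A minimal sketch of the requested behaviour, written as a standalone function rather than a Pillow method. It assumes the input has the same shape as the applist attribute of a Pillow JPEG image (a list of (segment name, bytes) pairs); the names xmp_as_xml and sample_applist are hypothetical, and the sample data is synthetic.

```python
import xml.etree.ElementTree as ET

# Identifier that precedes the XMP packet inside a JPEG APP1 segment.
XMP_MARKER = b"http://ns.adobe.com/xap/1.0/"


def xmp_as_xml(applist):
    """Return the first XMP packet found in applist as an XML string, or None."""
    for segment, content in applist:
        if segment != "APP1":
            continue
        # The identifier and the packet are separated by a single null byte.
        marker, _, payload = content.partition(b"\x00")
        if marker == XMP_MARKER:
            root = ET.fromstring(payload)
            return ET.tostring(root, encoding="unicode")
    return None


# Synthetic stand-in for im.applist; a real image would supply this list.
sample_applist = [
    ("APP0", b"JFIF\x00\x01\x01"),
    ("APP1", b"Exif\x00\x00II*\x00"),
    (
        "APP1",
        XMP_MARKER
        + b"\x00"
        + b'<x:xmpmeta xmlns:x="adobe:ns:meta/">'
        + b'<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>'
        + b"</x:xmpmeta>",
    ),
]

xml_text = xmp_as_xml(sample_applist)
print(xml_text)
```

Note that ElementTree re-serializes the tree with generated namespace prefixes unless they are registered first with ET.register_namespace, so the output is equivalent to, but not byte-identical with, the embedded packet. xml.etree.ElementTree is also not hardened against malicious input, so this is only suitable for trusted files.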

@hugovk hugovk changed the title Feature request: implement image.getxmp() to return all imbedded XMP meta data as XML. Feature request: implement image.getxmp() to return all embedded XMP meta data as XML Dec 6, 2020
@UrielMaD
Contributor

Hello @laynr @hugovk, I'd like to take this issue. I'm already working on it.

I've already extracted the XMP tags from the file. I'm just wondering which output structure would be best for getxmp() to return.

I was thinking it could be an object like the one getexif() returns. Since I already have the tag names, it could be keyed by the actual tag name, with its value, instead of by tag number.

@UrielMaD
Contributor

Btw, I could also simply return the XML tree.

@laynr
Author

laynr commented Dec 24, 2020

Great thanks @UrielMaD!

It is probably best to keep it similar to getexif() and return an object. Using tag names instead of tag numbers would be awesome...

That said, there is definite value in just returning the XML tree. For one, returning the XML tree may be the most future-proof option, as you wouldn't need to stay current on new tag names.

I guess if you return an object, one of the object's methods could return the XML tree. Perhaps that would be the best of both worlds (but more work)!

For my personal project I just used the XML tree, as XML parsers are well established.

One thing I am noticing is that there can be multiple XMP sections in one file, and they are not necessarily adjacent.
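
To illustrate the point about multiple, non-adjacent XMP sections, here is a rough sketch that collects every matching segment instead of stopping at the first. It assumes input shaped like the applist of a Pillow JPEG image; iter_xmp_packets and the sample data are hypothetical and only for illustration.

```python
# Identifier that precedes the XMP packet inside a JPEG APP1 segment.
XMP_MARKER = b"http://ns.adobe.com/xap/1.0/"


def iter_xmp_packets(applist):
    """Yield the raw payload of every APP1 segment that starts with the XMP marker."""
    for segment, content in applist:
        if segment != "APP1":
            continue
        marker, _, payload = content.partition(b"\x00")
        if marker == XMP_MARKER:
            yield payload


# Synthetic segments: two XMP packets separated by unrelated segments.
sample_applist = [
    ("APP1", XMP_MARKER + b"\x00<a/>"),
    ("APP1", b"Exif\x00\x00II*\x00"),
    ("APP2", b"ICC_PROFILE\x00"),
    ("APP1", XMP_MARKER + b"\x00<b/>"),
]

packets = list(iter_xmp_packets(sample_applist))
print(len(packets))
```

Yielding raw bytes leaves it to the caller to decide whether to parse each packet separately or merge them.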

Thank you for taking this on. I believe it will be very useful for many people!

@UrielMaD
Contributor

Thank you @laynr. I'll send a PR implementing an XML object; I can also return the whole XML tree as a string.

I take the tag names directly from the XML tree, so I only return the tags present in that file. If more XMP tags are added in the future, they will still be returned, since they simply appear as tag attributes.
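
The idea of reading tag names straight from the tree could look roughly like the sketch below. This is not the code from the PR; xmp_attributes and local_name are hypothetical helpers, the sample packet is made up, and the sketch only reads properties written as attributes of rdf:Description (XMP can also store properties as child elements).

```python
import xml.etree.ElementTree as ET

RDF_DESCRIPTION = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}Description"


def local_name(qualified):
    """Strip the '{namespace}' prefix that ElementTree puts on names."""
    return qualified.rsplit("}", 1)[-1]


def xmp_attributes(root):
    """Collect the attributes of every rdf:Description element into a flat dict."""
    tags = {}
    for description in root.iter(RDF_DESCRIPTION):
        for name, value in description.attrib.items():
            tags[local_name(name)] = value
    return tags


# Made-up packet with two properties stored as attributes.
sample = b"""<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="" xmlns:xmp="http://ns.adobe.com/xap/1.0/"
        xmp:CreatorTool="ExampleTool 1.0" xmp:Rating="3"/>
  </rdf:RDF>
</x:xmpmeta>"""

tags = xmp_attributes(ET.fromstring(sample))
print(tags)
```

Because the names come from the file itself, nothing in the function needs updating when a new XMP property appears. Dropping the namespace does mean two properties with the same local name in different namespaces would collide.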

@radarhere radarhere changed the title Feature request: implement image.getxmp() to return all embedded XMP meta data as XML Feature request: implement im.getxmp() to return all embedded XMP meta data as XML Dec 29, 2020
@UrielMaD
Contributor

UrielMaD commented Dec 30, 2020

@hugovk Changes were merged into my PR and all tests have passed

@radarhere
Member

Hi. Something to be aware of: the xml module is not secure. From https://docs.python.org/3/library/xml.etree.elementtree.html:

Warning: The xml.etree.ElementTree module is not secure against maliciously constructed data. If you need to parse untrusted or unauthenticated data, see XML vulnerabilities.

Because of this, in Pillow 8.3.0 we've added a new requirement: you will have to install defusedxml to get this method to work. See #5565 for more information.
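
As a rough illustration of what an optional defusedxml dependency can look like in user code (a sketch, not Pillow's actual implementation; parse_xmp is a hypothetical helper), the parser is imported if available and the function degrades to a warning otherwise:

```python
import warnings

try:
    # defusedxml provides a hardened drop-in for xml.etree.ElementTree.
    from defusedxml import ElementTree
except ImportError:
    ElementTree = None


def parse_xmp(xmp_bytes):
    """Parse an XMP packet with defusedxml, or return None if it is not installed."""
    if ElementTree is None:
        warnings.warn("XMP data cannot be read without the defusedxml dependency")
        return None
    return ElementTree.fromstring(xmp_bytes)


root = parse_xmp(b'<x:xmpmeta xmlns:x="adobe:ns:meta/"/>')
print(root)
```

defusedxml can be installed with pip install defusedxml.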
