New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance: Use subclass for parsing EXIF data #3663
Conversation
Would you be able to add tests? |
Since no external interfaces were modified, I don't see the point to add any tests. Of course, we could imagine tests for this new class, but they should already be covered by the existing EXIF tests. In fact, existing tests helped me a lot to avoid making useless mistakes. Could you be more precise on which part of the code could be tested? |
I would think that adding a new class constitutes a new interface, and so I would suggest adding independent tests for it. Feel free to disagree, but on the other hand, if you think that this class should not be used by end users, then why have the |
My goal was to return lazy version of the current dict in So normally, the output class should be transparent. However, the I don't know if it's possible (or even desirable) to hide this class declaration. In case you really want tests for that new class, I will need some help, because there is currently no test for |
The old method (using dict() and _fixup_dict) create clones from IFD_v1, but it read all items when iterating, losing all benefits from the lazyness of IFD. The new method use a custom subclass that try to prevent all iterations as much as possible, keeping all tags undecoded. The return value isn't a dict anymore, but a subclass of ImageFileDirectory_v1 that doesn't support tov2(). All decoded tags are formatted as before.
The _fixup does exactly the opposite
Co-Authored-By: Glandos <bugs-github@antipoul.fr>
Co-Authored-By: Glandos <bugs-github@antipoul.fr>
b0355d7
to
d48b98a
Compare
It seems that the merge of #3625 make this request invalid, but not irrelevant. I still have the following performance, when opening 1000 times the same JPEG image with a lot of EXIF tags produced by a camera: It seems that I can adapt this request to the new layout. Do you have any objections on that before I spend some of my spare time on it? |
The image link from your last comment no longer works. I've created PR #4031 as a new version of this PR - see what you think. |
I really dislike Github for their obscure retention policy.
Many, many thanks for taking care of that. It is really different from mine, but the code has changed too, and I think it does the job. |
The old method (using dict() and _fixup_dict) create clones from IFD_v1, but it read all items when iterating, losing all benefits from the lazyness of IFD.
The new method use a custom subclass that try to prevent all iterations as much as possible, keeping all tags undecoded.
The return value isn't a dict anymore, but a subclass of ImageFileDirectory_v1 that doesn't support tov2(). All decoded tags are formatted as before.
With a test image, I'm going from:
to
With a more common JPEG image from my Panasonic TZ10 camera, I'm going from:
to
I hope that even if the return type of
_getexif
is not the same, its compatibility with the previous format will make it merged.