GifImagePlugin seek-related optimizations #6075

Closed

@AnonymouX47 AnonymouX47 commented Feb 20, 2022

Changes proposed in this pull request:

  • Added self.__prev_offset to always hold the offset of the frame just before the current frame.
  • Optimized how GifImagePlugin.seek(), GifImagePlugin.n_frames and GifImagePlugin.is_animated return to the frame that was current before the operation, after an EOFError is raised.
  • Seeking beyond the last frame automatically computes GifImagePlugin.n_frames.
  • GifImagePlugin.n_frames now calls GifImagePlugin._seek() directly instead of GifImagePlugin.seek().
    • Skips unnecessary seek checks as all seeks will be valid, since the current frame is already valid and seeking starts from the next frame.
    • Skips the "current-frame-reset" step of .seek()
  • Replaced all internal calls to GifImagePlugin.tell() with direct references to self.__frame.

Extra Explanation

Concerning .n_frames for example, when a GIF (say 57 frames) is currently on the mid frame (say 28), invoking .n_frames for the first time would've resulted in the following chain of frame changes:

28 -> ... -> 56 -> 57 [(while performing seek checks for every step)... EOFError is caught in seek(), seek() tries to reset to 56]
57 -> 0 [Since 56 < 57]
0 -> ... -> 56 [EOFError is re-raised from seek(), caught in n_frames and tries to reset to 28]
56 -> 0 [Since 28 < 56]
0 -> 28 [n_frames finally sets and returns _n_frames]

Now, it goes thus:

28 -> ... -> 56 -> 57 [EOFError caught in n_frames, tries to reset to 28]
27 -> 28 [Sets __offset so the file pointer jumps directly to the offset of frame 27, which takes practically no time. n_frames sets and returns _n_frames]
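
The two chains above can be compared with a toy cost model that counts single-frame advances. This is only a sketch of the seek behavior as described in this comment (a backward seek rewinds to frame 0 and steps forward again). The function names are hypothetical, the handling of frame 0 in the new scheme is an assumption, and none of this is Pillow's actual code.

```python
def forward_steps(start: int, target: int) -> int:
    """Frame advances needed to reach `target` from `start`.

    Assumes a backward seek rewinds to frame 0 first, as described above.
    """
    if target < start:
        start = 0  # rewind, then walk forward from the first frame
    return target - start


def old_n_frames_cost(current: int, total: int) -> int:
    """Advances for the old chain: walk to EOF, reset to last, reset to current."""
    last = total - 1
    cost = forward_steps(current, last)   # walk to the last frame; the next step hits EOF
    cost += forward_steps(total, last)    # seek() resets from the failed frame to the last frame
    cost += forward_steps(last, current)  # n_frames resets to the original frame
    return cost


def new_n_frames_cost(current: int, total: int) -> int:
    """Advances for the proposed chain: walk to EOF, jump to previous offset, step once."""
    last = total - 1
    cost = forward_steps(current, last)   # walk to the last frame; the next step hits EOF
    if current > 0:
        cost += 1  # jump to the stored previous offset, then one advance
    # Assumption: returning to frame 0 is a plain rewind and costs no advance.
    return cost


print(old_n_frames_cost(28, 57), new_n_frames_cost(28, 57))
```

The model counts steps only. It says nothing about whether the frame that ends up loaded is rendered correctly.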

Tests

The test image has 57 frames.

from PIL import Image


def test_seek_1():
    img = Image.open("images/TIKTOK_LOGO_400.gif")
    try:
        img.seek(57)  # EOFError, then reset to frame 0
    except EOFError:
        pass

def test_seek_2():
    img = Image.open("images/TIKTOK_LOGO_400.gif")
    img.seek(56)  # Last frame
    try:
        img.seek(57)  # EOFError, then reset to frame 56
    except EOFError:
        pass

def test_seek_EOF_nframes():
    img = Image.open("images/TIKTOK_LOGO_400.gif")
    try:
        img.seek(57)  # EOFError
    except EOFError:
        pass
    img.n_frames  # Already computed in new version


def test_nframes_1():
    # 57 frames
    img = Image.open("images/TIKTOK_LOGO_400.gif")
    img.n_frames

def test_nframes_2():
    # 57 frames
    img = Image.open("images/TIKTOK_LOGO_400.gif")
    img.seek(28)  # Mid frame
    img.n_frames  # Seek to end and reset to frame 28

def test_nframes_3():
    # 57 frames
    img = Image.open("images/TIKTOK_LOGO_400.gif")
    img.seek(56)  # Last frame
    img.n_frames  # Seek to end and reset to frame 56


def test_is_animated_1():
    # 57 frames
    img = Image.open("images/TIKTOK_LOGO_400.gif")
    img.is_animated  # Seek to frame 1 and reset to frame 0

def test_is_animated_2():
    # 57 frames
    img = Image.open("images/TIKTOK_LOGO_400.gif")
    img.seek(56)  # Last frame
    img.is_animated  # Seek to frame 1 and reset to frame 56

Results

Formerly:

In [395]: %timeit test_seek_1()
76.1 ms ± 1.43 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [396]: %timeit test_seek_2()
155 ms ± 5.07 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [397]: %timeit test_seek_EOF_nframes()
220 ms ± 2.76 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [399]: %timeit test_nframes_1()
152 ms ± 4.13 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [400]: %timeit test_nframes_2()
195 ms ± 6.24 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [401]: %timeit test_nframes_3()
153 ms ± 4.92 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [402]: %timeit test_is_animated_1()
1.13 ms ± 21.9 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [403]: %timeit test_is_animated_2()
147 ms ± 2 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Now:

In [405]: %timeit test_seek_1()
74.5 ms ± 1.83 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [406]: %timeit test_seek_2()
73.5 ms ± 799 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [407]: %timeit test_seek_EOF_nframes()
73.7 ms ± 778 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [408]: %timeit test_nframes_1()
75.1 ms ± 1.14 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [409]: %timeit test_nframes_2()
79.1 ms ± 1.82 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [410]: %timeit test_nframes_3()
80.1 ms ± 3.98 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [411]: %timeit test_is_animated_1()
1.13 ms ± 44.8 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [412]: %timeit test_is_animated_2()
75.4 ms ± 1.8 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
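
To make the two sets of timings easier to compare, the means can be turned into speedup factors. This is a small sketch that only copies the mean values from the %timeit output above; the dictionary names are made up for illustration.

```python
# Mean timings in milliseconds, copied from the %timeit output above.
before_ms = {
    "test_seek_1": 76.1,
    "test_seek_2": 155.0,
    "test_seek_EOF_nframes": 220.0,
    "test_nframes_1": 152.0,
    "test_nframes_2": 195.0,
    "test_nframes_3": 153.0,
    "test_is_animated_1": 1.13,
    "test_is_animated_2": 147.0,
}
after_ms = {
    "test_seek_1": 74.5,
    "test_seek_2": 73.5,
    "test_seek_EOF_nframes": 73.7,
    "test_nframes_1": 75.1,
    "test_nframes_2": 79.1,
    "test_nframes_3": 80.1,
    "test_is_animated_1": 1.13,
    "test_is_animated_2": 75.4,
}

# Speedup factor = old mean / new mean (values above 1 mean the new code is faster).
speedup = {name: before_ms[name] / after_ms[name] for name in before_ms}

for name, factor in sorted(speedup.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<24}{before_ms[name]:>8.2f} ms -> {after_ms[name]:>7.2f} ms  ({factor:.2f}x)")
```

On these numbers the largest gain is test_seek_EOF_nframes (about 3x), while test_seek_1 and test_is_animated_1 are essentially unchanged.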

- Add: Added `self.__prev_offset` to always hold the offset of the frame just before the current frame.
- Add: Seeking beyond the last frame automatically computes `GifImagePlugin.n_frames`.
- Change: Optimized `GifImagePlugin.seek()`, `GifImagePlugin.n_frames` and `GifImagePlugin.is_animated` when seeking back to the current frame before invoking the operation, after an `EOFError` is raised.
- Change: Replaced all internal calls to `GifImagePlugin.tell()` with direct references to `self.__frame`.
- Add: Included test for `n_frames` optimization.
@radarhere radarhere added the GIF label Feb 20, 2022
@AnonymouX47
Author

For what it's worth, I realized the deficiencies in the course of working with a large (2188 frames, ~8 MiB) GIF image on a project I'm developing.

@radarhere
Member

Could you upload a copy of your test image? Not to be included in PR, but just as a comment here, so that we can run your speed tests in the same way you are.

@AnonymouX47
Author

AnonymouX47 commented Feb 21, 2022

Oh, sorry. Thought it would be unnecessary since it should apply to any GIF.
Here's the image:

TIKTOK_LOGO_400.gif

@AnonymouX47
Author

AnonymouX47 commented Feb 21, 2022

Just added some explanation in the PR message, thought it would be useful.

@radarhere
Member

Hi. So, an animated GIF does not have to be like a... PowerPoint slideshow, where independent images display one after another.
It can be like a series of Photoshop layers. Some layers are mostly transparent, and by applying one layer after another, the image is transformed.

When Pillow seeks back to the first frame each time, it is starting from the bottom of those layers, and applying each new layer in turn. Your solution potentially skips a number of layers, meaning the final image may be incorrect.
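
The layer analogy can be illustrated with a toy compositor. This is a deliberately simplified sketch: frames are plain lists where None stands for a transparent pixel, the `render` helper is hypothetical, and real GIF details such as disposal methods, frame offsets and palettes are ignored.

```python
def render(layers, upto, start=0, canvas_size=4):
    """Apply layers[start..upto] in order onto a blank canvas.

    A pixel value of None is treated as transparent and leaves the canvas untouched.
    """
    canvas = [0] * canvas_size
    for layer in layers[start:upto + 1]:
        for i, pixel in enumerate(layer):
            if pixel is not None:
                canvas[i] = pixel
    return canvas


layers = [
    [1, 1, 1, 1],           # frame 0: full background
    [None, 2, None, None],  # frame 1: changes one pixel
    [None, None, 3, None],  # frame 2: changes one pixel
    [None, None, None, 4],  # frame 3: changes one pixel
]

full = render(layers, upto=3)               # every layer from the bottom up
shortcut = render(layers, upto=3, start=2)  # only the previous frame and the target

print(full)      # [1, 2, 3, 4]
print(shortcut)  # [0, 0, 3, 4]
```

Starting from the previous frame alone drops everything the earlier layers contributed, which is the kind of mismatch described above.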

Consider this code, adapted from one of your tests.

from PIL import Image

img = Image.open("TIKTOK_LOGO_400.gif")
img.seek(28)  # Mid frame
img.save("before_n_frames.png")
img.n_frames  # Seek to end and reset to frame 28
img.save("after_n_frames.png")

before_n_frames.png and after_n_frames.png should be the same image. With your PR, however, here is what I get:

[Images: before_n_frames.png | after_n_frames.png]

With the main branch, they are the same.

So unfortunately, there is a fundamental problem here.

@AnonymouX47
Author

AnonymouX47 commented Feb 21, 2022

Hmm... I see. 🤔 Thanks

But can the part of seek() that resets from 57 to 56 at least be eliminated, i.e. for n_frames?

@AnonymouX47
Author

Indeed, there's a fundamental problem.

Invoking im.is_animated or im.n_frames while on frame 0 causes any subsequent call to im.load() to raise a ValueError: Unrecognized image mode.

I guess I didn't realize all this earlier because the changes seemed to work fine with my use case.

@radarhere
Member

So, my first commit in #6077 helped test_is_animated_2 - if the user has seeked past the first frame for us, then we don't need to seek to the second frame to know that the image is animated.
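
The idea behind that first commit can be sketched with a small stand-in class. `FrameCursor` and its attributes are hypothetical names used only for illustration; this is not the code in #6077.

```python
class FrameCursor:
    """Hypothetical stand-in for an image object that tracks its current frame."""

    def __init__(self, total_frames: int) -> None:
        self._total = total_frames
        self.frame = 0
        self.probes = 0  # how many times we had to look for a second frame

    def _has_second_frame(self) -> bool:
        # Stands in for the expensive "seek to frame 1, then reset" probe.
        self.probes += 1
        return self._total > 1

    @property
    def is_animated(self) -> bool:
        if self.frame > 0:
            # Already past frame 0, so the image must have more than one frame.
            return True
        return self._has_second_frame()


cursor = FrameCursor(total_frames=57)
cursor.frame = 56  # pretend the user already seeked to the last frame
print(cursor.is_animated, cursor.probes)
```

When the current frame is already past the first one, the answer is known without any probing seek.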

I've now added a commit to #6077 that will check if the next byte (or lack thereof) ends the GIF image, and if so, just stay on the current frame. This should eliminate the reset for most cases - I could deliberately create an image that had some data past the last frame and so still needed to be reset, but I would hope that is rare, since I don't know why a GIF writer would do that.
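
A minimal sketch of such an end-of-image check, assuming a seekable binary file object positioned just after a frame's data. The helper name `at_end_of_gif` is hypothetical and this is not the code in #6077; the only format fact relied on is that a GIF data stream is terminated by the trailer byte 0x3B (`;`).

```python
import io

GIF_TRAILER = b";"  # 0x3B, the byte that terminates a GIF data stream


def at_end_of_gif(fp) -> bool:
    """Peek at the next byte and report whether the GIF ends here.

    The file position is restored afterwards, so the caller can carry on reading.
    """
    position = fp.tell()
    try:
        next_byte = fp.read(1)
    finally:
        fp.seek(position)
    # An empty read means the file simply stops; the trailer means a clean end.
    return next_byte in (b"", GIF_TRAILER)


stream = io.BytesIO(b"," + b"\x00" * 9 + b";")
print(at_end_of_gif(stream))  # False: an image descriptor (",") follows
stream.seek(10)
print(at_end_of_gif(stream))  # True: the trailer follows
```

As noted above, a file with extra data after the last frame would not be caught by a check like this and would still need the reset.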

Testing those changes, test_seek_2, test_nframes_1 and test_nframes_3 now have the same speeds as your PR. test_seek_EOF_nframes and test_nframes_2 are not as fast as in your version, but there are improvements.

@radarhere
Member

I've pushed another commit - when performing seeks for n_frames or is_animated, do not actually load the images. We're not interested in them for those moments. This matches this PR's speeds for test_seek_EOF_nframes and test_nframes_2, and is faster than this PR for test_nframes_1.
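
To illustrate why skipping the image data helps, here is a standalone sketch that counts frames by walking the GIF block structure without decoding any pixels. It follows the GIF89a block layout, but `count_gif_frames` is a hypothetical helper written for this comment, not Pillow's implementation, and it does little validation.

```python
import io
import struct


def _skip_sub_blocks(fp) -> None:
    """Skip a chain of data sub-blocks: a length byte plus payload, ended by a zero length."""
    while True:
        size = fp.read(1)
        if not size or size[0] == 0:
            return
        fp.seek(size[0], io.SEEK_CUR)


def count_gif_frames(fp) -> int:
    """Count image descriptors in a GIF stream without decoding the image data."""
    if fp.read(6) not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF file")
    _width, _height, flags, _background, _aspect = struct.unpack("<HHBBB", fp.read(7))
    if flags & 0x80:  # global color table present: 3 bytes per entry
        fp.seek(3 * (2 << (flags & 0x07)), io.SEEK_CUR)

    frames = 0
    while True:
        marker = fp.read(1)
        if not marker or marker == b";":  # end of file or trailer
            return frames
        if marker == b"!":  # extension block
            fp.seek(1, io.SEEK_CUR)  # extension label
            _skip_sub_blocks(fp)
        elif marker == b",":  # image descriptor starts a new frame
            frames += 1
            descriptor = fp.read(9)
            if len(descriptor) < 9:
                raise ValueError("truncated image descriptor")
            if descriptor[8] & 0x80:  # local color table present
                fp.seek(3 * (2 << (descriptor[8] & 0x07)), io.SEEK_CUR)
            fp.seek(1, io.SEEK_CUR)  # LZW minimum code size
            _skip_sub_blocks(fp)  # compressed pixels are skipped, never decoded
        else:
            raise ValueError(f"unexpected block marker {marker!r}")
```

It could be called as `count_gif_frames(open(path, "rb"))` for some GIF path. Because the compressed data is only skipped over, the cost depends on the number of blocks rather than on decoding each frame.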

So somehow, I think that should cover all of the significant differences in speeds from your tests here.

@AnonymouX47
Author

AnonymouX47 commented Feb 21, 2022

Great! Thanks

I'll try it out.

@radarhere
Member

Closing. See #6077 instead.

@radarhere radarhere closed this Feb 21, 2022
@AnonymouX47 AnonymouX47 deleted the gif-seek-optimizations branch February 21, 2022 09:15
@AnonymouX47
Author

AnonymouX47 commented Feb 21, 2022

@radarhere
I want to suggest that the cases where this PR failed should be added to the tests in order to avoid such issues in the future.

@radarhere
Member

To be fair, there were 4 failures in our test suite.
But I will grant you, none of them were in test_file_gif.py. So I have created PR #6080

@AnonymouX47
Author

AnonymouX47 commented Feb 21, 2022

I see... thanks for your rapid responses.
