Performance of ImageDraw::text() and potential use of FTC_Manager() #6618

time4tea · 2022-09-24T21:37:28Z

What did you do?

Used Pillow to render frames outputting to ffmpeg - in project https://github.com/time4tea/gopro-dashboard-overlay
Pillow is great!

I'm trying to render frames as quickly as possible, as there are many frames to render in a 1 or 2 hour video - even at 10 frames/second

I'm using the text facilities of Pillow to render text into an Image. I cache text images where possible - so rendering fixed text strings is very quick - however, with a dynamic text string, such as a datetime or GPS location - caching isn't so effective.

Looking at the call stack of drawing some text.. it seems to look something like:

ImageDraw::draw_text()
  ImageFont::getmask2()
    Font::getsize() - implemented in imagingft.c font_getsize
    Font::render() - implemented in imagingft.c font_render

When you call these functions a lot, as I do, it becomes clear that they probably do a lot of similar work. Here is a Python profile of a run of my software (there are multiple call routes here, so don't worry that the numbers don't all add up!):

draw_text -> 2077 calls 8259ms
  getmask2 -> 2076 calls 7567ms
    Font::getsize -> 2595 calls 4171ms
    Font::render -> 2076 calls 4195ms
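
A profile like the one above can be collected with the standard library's cProfile. The following is a minimal sketch, not the exact procedure used for the numbers in this issue: `draw_frames` and `profile_report` are hypothetical helper names, and the default font is used only so the example runs without a font file. With the default font the call stack may differ from the FreeType one shown above, so load a font with `ImageFont.truetype` to reproduce the `getmask2` path.

```python
import cProfile
import io
import pstats

from PIL import Image, ImageDraw, ImageFont


def draw_frames(count: int) -> None:
    """Draw a different string on each 'frame' so no text cache can help."""
    font = ImageFont.load_default()  # assumption: replace with ImageFont.truetype(...)
    image = Image.new("RGBA", (400, 50))
    draw = ImageDraw.Draw(image)
    for frame in range(count):
        draw.text((0, 0), f"frame {frame}", font=font, fill="white")


def profile_report(count: int = 200, top: int = 10) -> str:
    """Profile draw_frames and return the `top` entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    draw_frames(count)
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return stream.getvalue()


print(profile_report())
```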

Looking at imagingft.c, they both seem to call (in my case) text_layout_raqm, which I'm guessing calls through to FT to get the glyphs for the given string, allowing for ligatures/kerning etc.

I was wondering... FT seems to allow for glyph caching using FTC_Manager - is there any appetite for adding support for this?

I think that, in the case of rendering lots of frames of text, it has the possibility of adding quite a bit of performance. (Which is probably not a major goal for Pillow, totally fair!)

For example, rendering a compass widget using Pillow, with a few open and filled circles, lots of compass lines, and a bilinear resize for AA, takes about 2.6ms, but adding the 4 characters "N", "S", "E", "W" brings it to 13ms. (I could optimise this particular use case, it's just an example of how the text rendering compares to the rest of Pillow.)

Thanks for reading this far!
Thanks for a super library!


nulano · 2022-09-25T00:05:14Z

I have another suggestion (in addition to glyph caching).

The function getmask2 performs the following steps:

1. calls getsize to get the size of the text
2. calls Image._decompression_bomb_check to compare size with MAX_IMAGE_PIXELS
3. calls the fill function passed as argument to create a blank image
4. calls render to draw text into the blank image

After Pillow 10 the deprecated fill parameter will be replaced by a direct call to the internal function. After this, the only Python function to be called between the two C functions is the decompression bomb check. If this was moved into C, the two functions could be combined to remove the duplicate call to text_layout.
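
The duplicated layout work can be modelled with a small pure-Python sketch. Nothing here is Pillow's real code: `FakeFont`, `new_image`, the fixed glyph size, and the pixel limit are all hypothetical stand-ins, chosen only to show that the two-call flow runs layout twice while a combined call runs it once.

```python
GLYPH_W, GLYPH_H = 10, 16   # hypothetical fixed glyph size
MAX_PIXELS = 1_000_000      # hypothetical stand-in for MAX_IMAGE_PIXELS


def new_image(size):
    """Size check plus blank-image allocation (steps 2 and 3)."""
    width, height = size
    if width * height > MAX_PIXELS:
        raise ValueError("image too large")
    return bytearray(width * height)


class FakeFont:
    """Stand-in for the C font object; counts how often layout runs."""

    def __init__(self):
        self.layout_calls = 0

    def _layout(self, text):
        self.layout_calls += 1
        return [index * GLYPH_W for index in range(len(text))]

    def getsize(self, text):
        return (len(self._layout(text)) * GLYPH_W, GLYPH_H)

    def render(self, text, image):
        for x in self._layout(text):  # repeats the work done in getsize
            image[x] = 1              # mark each glyph origin

    def render_checked(self, text, make_image):
        """Combined call: lay out once, then size-check, allocate and draw."""
        positions = self._layout(text)
        image = make_image((len(positions) * GLYPH_W, GLYPH_H))
        for x in positions:
            image[x] = 1
        return image


def getmask2_two_calls(font, text):
    image = new_image(font.getsize(text))  # layout pass 1
    font.render(text, image)               # layout pass 2
    return image


def getmask2_one_call(font, text):
    return font.render_checked(text, new_image)


two, one = FakeFont(), FakeFont()
same = getmask2_two_calls(two, "12:00:01") == getmask2_one_call(one, "12:00:01")
print(two.layout_calls, one.layout_calls, same)  # prints "2 1 True"
```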

time4tea · 2022-09-30T22:33:49Z

So - while it is a million miles from being ready for a library - I have some PoC code here using the FreeType cache from Python, in case it is useful ...
It almost certainly leaks memory, and will SEGV occasionally at the moment.
It renders the font into an ImageDraw in Python, so that bit is quite slow.
Layout is basically non-existent.
It will only build on Linux, and even then only with GCC.

https://github.com/time4tea/gopro-dashboard-overlay/blob/c_extension/gopro_overlay/freetype.py
https://github.com/time4tea/gopro-dashboard-overlay/blob/c_extension/c/freetype.c
https://github.com/time4tea/gopro-dashboard-overlay/blob/c_extension/setup.py

time4tea · 2022-10-08T13:25:07Z

The code above, although still very(!) rough, is showing some interesting results so far. It is definitely not comparing apples to apples, but the performance so far makes me think it might be worth pursuing.
For example, rendering some string into an RGBA image takes about 6ms with current Pillow, but about 40us using the FreeType cache, so it's about 140x faster.
Rendering stroked text into an RGBA image takes about 14ms with current Pillow and 1.2ms using the cache, so it's about 11x faster.
Like I say, it's not a fair comparison, as Pillow is doing a lot more, but given the difference, I think I might plod on a bit.
Here is an example of the output - top is the new code, bottom is Pillow.

time4tea · 2022-10-08T20:12:33Z

It looks like the font rendering has got much faster in recent releases! - I was on 8.4.0 - upgrading to 9.2.0 speeds up my test case for pillow from 14ms to 4ms.

I think fixing #6649 would significantly improve the performance of text rendering.

I'm at a point where it's basically working now - have a look at the above files if you're interested.

This is the current timing for my experiment - time to render the string in the below image.

Cache - Stroked: 1.95 msec
Cache - Plain: 346 usec
Pillow - Stroked: 4.22 msec
Pillow - Plain: 4.13 msec
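
For reference, the speedup implied by the timings above can be worked out directly. The figures are copied from the list, converted to microseconds; the dictionary layout and the `speedup` helper are just for this example.

```python
# Timings from the list above, in microseconds.
timings_us = {
    ("cache", "stroked"): 1950,
    ("cache", "plain"): 346,
    ("pillow", "stroked"): 4220,
    ("pillow", "plain"): 4130,
}


def speedup(style: str) -> float:
    """How many times faster the cached renderer is than Pillow for one style."""
    return timings_us[("pillow", style)] / timings_us[("cache", style)]


for style in ("plain", "stroked"):
    print(f"{style}: {speedup(style):.1f}x")
# plain: 11.9x
# stroked: 2.2x
```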

Here you can see some strings rendered by Pillow and by my new code using the FT cache - it is hard to tell them apart. Plain text is very much faster, stroked text is about twice as fast. I think this could be improved by caching the stroked glyphs, which would probably not be too hard to do, but I'm not intending to do this in a PoC right now.

Hope that's interesting - if you decide you'd like to go further on using the FT cache - please give me a shout.

nulano · 2022-10-09T14:17:11Z

I think the difference between your output and Pillow's might be due to Raqm vs basic layout.
You might want to take a look at #6631 / #6633.

radarhere · 2022-10-10T04:28:36Z

#6649 has now been fixed in main.

time4tea · 2022-10-12T10:25:11Z

@nulano - yes - good observation.
I'll try to make another performance test showing Pillow with Raqm, Pillow without Raqm, and my bodge code (no Raqm, so broadly similar to basic layout). Definitely one issue with the cache approach is that it completely changes how you get glyphs, so it requires considerable changes, and making the cache work with Raqm might not be easily achievable. I didn't look that hard at the Raqm code though, tbh.
Again, just to give some context: why is this important to me? I'm rendering many thousands of frames, each with varying text. Making the text function 12x (or 2x) quicker would make a big difference to me. I already cache images where the text doesn't vary, so looking at the text rendering makes sense. Perhaps, though, I jumped in at the deep end looking at the cache, when I could have tried disabling Raqm! :-)

radarhere · 2023-04-10T09:26:08Z

> After Pillow 10 the deprecated fill parameter will be replaced by a direct call to the internal function.

This has now been done in #7059

radarhere · 2023-06-10T07:38:50Z

> the only Python function to be called between the two C functions is the decompression bomb check. If this was moved into C, the two functions could be combined to remove the duplicate call to text_layout.

I attempted this change, but found a problem - the _imagingft extension is not connected to the C code for creating new images. I couldn't call ImagingNewDirty and ImagingFill.

I worked around this by passing Image.core.fill to font_render - so the deprecation of the fill parameter may not have been blocking this after all.

I've created PR #7206 for the change. From my tests, it makes getmask2 10% faster.

radarhere added the Performance label Sep 25, 2022

radarhere mentioned this issue Jun 10, 2023

Only call text_layout once in getmask2 #7206

Merged
