Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache repeated COLR subtables (2) #3221

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
23 changes: 22 additions & 1 deletion Lib/fontTools/colorLib/table_builder.py
Expand Up @@ -5,6 +5,8 @@

import collections
import enum
import pickle
from hashlib import sha256
from fontTools.ttLib.tables.otBase import (
BaseTable,
FormatSwitchingBaseTable,
Expand Down Expand Up @@ -80,10 +82,15 @@ class TableBuilder:
based on otData info for the target class. See BuildCallbacks.
"""

def __init__(self, callbackTable=None):
def __init__(self, callbackTable=None, cacheMaxSize=128):
if callbackTable is None:
callbackTable = {}
self._callbackTable = callbackTable
# a LRU cache of (cls, sha256(pickle.dumps(source)).digest()) => dest.
self.cache = collections.OrderedDict() if cacheMaxSize > 0 else None
# cacheMaxSize of 0 means no cache; None means no limit.
# Not sure what a good default cache size is, 128 is just a random number...
self.cacheMaxSize = cacheMaxSize

def _convert(self, dest, field, converter, value):
enumClass = getattr(converter, "enumClass", None)
Expand Down Expand Up @@ -124,6 +131,14 @@ def build(self, cls, source):
if isinstance(source, cls):
return source

if self.cache is not None:
cacheKey = (cls, sha256(pickle.dumps(source)).digest())
if cacheKey in self.cache:
# Move to the end to mark it as most recently used
self.cache.move_to_end(cacheKey)
# return unique copies; pickle/unpickle is faster than copy.deepcopy
return pickle.loads(pickle.dumps(self.cache[cacheKey]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ugly, even if faster than deepcopy. :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, you might get away without a copy whatsoever, no?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe as an opt-in.

Copy link
Member Author

@anthrotype anthrotype Aug 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ugly, even if faster than deepcopy. :)

the whole point of this was to speed things up...

you might get away without a copy whatsoever, no?

I don't think so. If you use this to build master COLR tables that are then sent to varLib.merger to build variable COLR, the thing is updated in-place and you don't want shared objects. When decompiling otTables from binary, you do get distinct objects even when they were pointed to by same offset.


callbackKey = (cls,)
fmt = None
if issubclass(cls, FormatSwitchingBaseTable):
Expand Down Expand Up @@ -178,6 +193,12 @@ def build(self, cls, source):
(BuildCallback.AFTER_BUILD,) + callbackKey, lambda d: d
)(dest)

if self.cache is not None:
if self.cacheMaxSize is not None and len(self.cache) >= self.cacheMaxSize:
# Evict the least recently used item at the front of the cache
self.cache.popitem(last=False)
self.cache[cacheKey] = dest

return dest


Expand Down