HashPointPen fails with ufoLib2 #3421

Closed
jenskutilek opened this issue Jan 18, 2024 · 19 comments

@jenskutilek
Collaborator

I'm using the HashPointPen to check glyphs from a UFO (opened with ufoLib2) against glyphs from a TTF.

Both produce different hashes for identical glyphs when the glyph is a composite.

TTF: w2671[...|(+1+0+0+1+0+0)][l1008+3162l459+3736l942+3736l1335+3162|(+1+0+0+1+180+0)]
UFO: w2671[...|(+1.0+0.0+0.0+1.0+0+0)][l1008+3162l459+3736l942+3736l1335+3162|(+1.0+0.0+0.0+1.0+180+0)]

As you can see, the difference is that the transformation (scale) is stored as an int when the pen is fed by the TTF, but as a float when it is fed by the UFO.

def addComponent(self, baseGlyphName, transformation, identifier=None, **kwargs):
    tr = "".join([f"{t:+}" for t in transformation])

I guess the HashPointPen should be changed to even out this difference, but what would be preferred? The scales in the transformation are stored as F2Dot14 values in a TTF, so for maximum precision, we should probably always write out the full F2Dot14?
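
The mismatch can be reproduced without any font files. This is a minimal sketch that assumes only the f-string from the `addComponent` line quoted above; `format_transformation` is a hypothetical helper name used for illustration, not part of fontTools.

```python
def format_transformation(transformation):
    """Join the six values the same way the quoted addComponent line does."""
    return "".join(f"{t:+}" for t in transformation)


# The same transformation, once with int scales (as fed from the TTF)
# and once with float scales (as fed from the UFO).
ttf_tr = (1, 0, 0, 1, 180, 0)
ufo_tr = (1.0, 0.0, 0.0, 1.0, 180, 0)

print(ttf_tr == ufo_tr)               # True: the numbers compare equal
print(format_transformation(ttf_tr))  # +1+0+0+1+180+0
print(format_transformation(ufo_tr))  # +1.0+0.0+0.0+1.0+180+0
```

The values are numerically equal, but the `+` format keeps the `.0` of a float, so the strings that go into the hash differ.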

@jenskutilek
Collaborator Author

Using the format {t:+g} instead of {t:+} would even out the difference for integer scales.

But the F2Dot14 precision comes into play once the components are scaled by a fractional number.

Example: Scaled in the font editor by 91.3 %, represented in the two formats:

UFO: (+0.913+0+0+0.913+516+272)
TTF: (+0.913025+0+0+0.913025+516+272)
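
Both strings can be reproduced with plain arithmetic. This is a sketch that assumes an F2Dot14 value is a fixed-point number with 14 fractional bits; `quantize_f2dot14` is a hypothetical stand-in written for this example, not a fontTools function.

```python
import math


def quantize_f2dot14(value):
    """Snap a float to the nearest multiple of 1/16384 (14 fractional bits)."""
    return math.floor(value * (1 << 14) + 0.5) / (1 << 14)


scale = 0.913                      # value as stored in the UFO
stored = quantize_f2dot14(scale)   # nearest value a TTF can hold: 14959/16384

print(f"{1.0:+g}")     # +1
print(f"{scale:+g}")   # +0.913
print(f"{stored:+g}")  # +0.913025
```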

@justvanrossum
Collaborator

> Example: Scaled in the font editor by 91.3 %, represented in the two formats:

But that's literally a difference between the two components... TTF necessarily has to store an approximation, so that is lossy, and therefore I would expect a different hash.

@jenskutilek
Collaborator Author

Formatting the values with {floatToFixed(t, 14):+g} is not pretty, but solves the problem:

Scale 1.0, 1.0:

UFO: (+16384+0+0+16384-1.16818e+07+0)
TTF: (+16384+0+0+16384-1.16818e+07+0)

Scale 0.913, 0.913:

UFO: (+14959+0+0+14959+8.45414e+06+4.45645e+06)
TTF: (+14959+0+0+14959+8.45414e+06+4.45645e+06)
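
The large offset values in these strings appear because all six values, not only the scales, pass through the conversion. The sketch below reproduces the 0.913 case under the assumption that `floatToFixed(t, 14)` multiplies by 2^14 and rounds to an integer; `to_fixed_14` and `format_fixed` are hypothetical stand-ins so that the example runs without fontTools.

```python
import math


def to_fixed_14(value):
    """Stand-in for floatToFixed(value, 14): scale by 2**14, round to int."""
    return math.floor(value * (1 << 14) + 0.5)


def format_fixed(transformation):
    """Format every value of the transformation as a 14-bit fixed integer."""
    return "".join(f"{to_fixed_14(t):+g}" for t in transformation)


ufo_tr = (0.913, 0, 0, 0.913, 516, 272)
ttf_tr = (14959 / 16384, 0, 0, 14959 / 16384, 516, 272)

print(format_fixed(ufo_tr))  # +14959+0+0+14959+8.45414e+06+4.45645e+06
print(format_fixed(ufo_tr) == format_fixed(ttf_tr))  # True
```

The offset 516 becomes 516 * 16384 = 8454144, and `g` keeps only six significant digits, which is why it is printed as `8.45414e+06`.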

@justvanrossum
Collaborator

> Formatting the values with {floatToFixed(t, 14):+g} is not pretty, but solves the problem:

But UFO does not use F2Dot14, so this allows different values to hash the same. So it does not "solve" the problem, it hides an actual difference.

@anthrotype
Member

> But UFO does not use F2Dot14, so this allows different values to hash the same

but that is exactly what is desired in this case, because the HashPointPen is being used to determine whether the glyphs have changed and the hinting is thus invalidated. A slightly different float in the UFO that is then rounded to the nearest F2Dot14 value when compiled to TTF does not actually matter; what matters is the F2Dot14 value.

@anthrotype
Member

maybe the pen could take a callable that will be used to round these and the caller (in this case ufo2ft I suppose?) will pass the desired floatToFixed

@anthrotype
Member

FYI there's also floatToFixedToFloat

@justvanrossum
Collaborator

> but that is exactly what is desired in this case

Ah, I didn't realize that. But yeah, then a more malleable solution may be warranted, as HashPointPen doesn't advertise itself as "hash glyphs equal when they are equal in TTF form".

@jenskutilek
Collaborator Author

jenskutilek commented Jan 19, 2024

> Example: Scaled in the font editor by 91.3 %, represented in the two formats:

> But that's literally a difference between the two components... TTF necessarily has to store an approximation, so that is lossy, and therefore I would expect a different hash.

I'm using the HashPointPen to check if I can transfer TrueType byte code from a UFO to a TTF. Currently that marks all composites as different, though the difference is negligible*. If the hash used {t:+g}, it would work OK for integer scales, but would disallow me from using fractional scales for components in my sources :-/

*) The hinting editor would have shown me the F2Dot14 approximations anyway, so that is what's actually hinted

@anthrotype
Member

are you using the g formatting to turn x.0 floats into actual integers? remember that g also sometimes uses scientific notation for very small floats, maybe you don't want that. I think you can change the pen to take an optional round function and pass in a partial(floatToFixedToFloat, precisionBits=14) to it
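
One way the optional round function could look, as a sketch only. `format_transformation` and its `round_scale` parameter are hypothetical names, and `fixed_roundtrip` is a stand-in that mimics what `floatToFixedToFloat` is assumed to do (snap a float to the nearest multiple of 1/2^precisionBits). Applying the callable to the four scale values only is also an assumption of this sketch.

```python
import math
from functools import partial


def fixed_roundtrip(value, precisionBits):
    """Stand-in for floatToFixedToFloat: snap to a multiple of 1/2**precisionBits."""
    scale = 1 << precisionBits
    return math.floor(value * scale + 0.5) / scale


def format_transformation(transformation, round_scale=None):
    """Format six transformation values, optionally rounding the four scales."""
    values = list(transformation)
    if round_scale is not None:
        values[:4] = [round_scale(v) for v in values[:4]]
    return "".join(f"{v:+}" for v in values)


to_f2dot14 = partial(fixed_roundtrip, precisionBits=14)

ufo_tr = (0.913, 0.0, 0.0, 0.913, 516, 272)
ttf_tr = (14959 / 16384, 0.0, 0.0, 14959 / 16384, 516, 272)

# Without a round function the strings differ; with it they agree.
print(format_transformation(ufo_tr) == format_transformation(ttf_tr))  # False
print(format_transformation(ufo_tr, to_f2dot14))
# +0.91302490234375+0.0+0.0+0.91302490234375+516+272
print(format_transformation(ufo_tr, to_f2dot14)
      == format_transformation(ttf_tr, to_f2dot14))                    # True
```

Keeping the callable optional leaves the default behaviour untouched for callers that want exact values to hash differently.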

@jenskutilek
Collaborator Author

Thanks for your comments. I've made a PR.

Now I only apply the round function to the scale values of the transformation, as the x and y offsets are stored as integers in TTF. In theory, you could have a UFO with fractional coordinates/offsets, and a TTF with rounded coordinates/offsets. Do we have to consider that case as well?

Even CFF may be affected by the precision issue, in case both the UFO and the CFF table use fractional coordinates.

@jenskutilek
Collaborator Author

BTW ufo2ft is not affected by this, as it only compares the stored glyph hash against the calculated current glyph hash in the UFO.

@jenskutilek jenskutilek self-assigned this Jan 19, 2024
@jenskutilek
Collaborator Author

jenskutilek commented Jan 19, 2024

With the PR, I still get a difference for components with a scale of -1.0:

UFO: (-1.0+0.0+0.0-1.0+2408+2498)
TTF: (-0.99993896484375+0.0+0.0-0.99993896484375+2408+2498)

@behdad
Member

behdad commented Jan 19, 2024

We have float(floatToFixedToStr(v, 14)) that might help you here.

@jenskutilek
Collaborator Author

Thanks, Behdad.

Wow, this is really hard. Opening a UFO with ufoLib2, the type of the transformation's scale values may either be a float or an int:

>>> f = Font.open("transform.ufo")
>>> f["c"].components
[Component(baseGlyph='b', transformation=<Transform [1 0 0 0.6623 70 304]>)]
>>> for c in f["c"].components:
...     for t in c.transformation:
...         print(type(t), t)
...
<class 'float'> 1.0
<class 'int'> 0
<class 'int'> 0
<class 'float'> 0.6623
<class 'int'> 70
<class 'int'> 304

(Sure, that's also what the UFO spec says)

So an editor that quietly changes a scale of 1.0 to 1 or vice versa will invalidate a stored hash if there is no post-processing on the scale values.

@anthrotype
Member

since in python 1.0 == 1, it makes sense that the HashPointPen treats floats-that-are-really-ints as int, but the rounding should be optional
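
A small sketch of that normalization, independent of any rounding. `normalize_number` and `format_transformation` are hypothetical names used for illustration, not existing HashPointPen code.

```python
def normalize_number(value):
    """Return an int when a float holds an integral value, else the value as is."""
    if isinstance(value, float) and value.is_integer():
        return int(value)
    return value


def format_transformation(transformation):
    """Format six transformation values after int/float normalization."""
    return "".join(f"{normalize_number(t):+}" for t in transformation)


as_floats = (1.0, 0.0, 0.0, 0.6623, 70.0, 304.0)
as_mixed = (1, 0, 0, 0.6623, 70, 304)

print(format_transformation(as_floats))  # +1+0+0+0.6623+70+304
print(format_transformation(as_mixed))   # +1+0+0+0.6623+70+304
```

With this in place, an editor that rewrites 1.0 as 1 (or the reverse) no longer changes the formatted string, while genuinely fractional values such as 0.6623 are left alone.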

@jenskutilek
Collaborator Author

> With the PR, I still get a difference for components with a scale of -1.0:
>
> UFO: (-1.0+0.0+0.0-1.0+2408+2498)
> TTF: (-0.99993896484375+0.0+0.0-0.99993896484375+2408+2498)

This seems like a FontLab Studio 5 bug. It exports a scale of -1.0 as -0.99994 into a TTF, i.e. -16383/16384 instead of -16384/16384.
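
The value in the TTF string is exactly one F2Dot14 step short of -1.0, which plain arithmetic confirms (the name `STEP` is introduced here only for the example):

```python
STEP = 1 / (1 << 14)  # smallest F2Dot14 increment, 1/16384

print(-16383 * STEP)           # -0.99993896484375
print(-16384 * STEP)           # -1.0
print(f"{-16383 * STEP:.5f}")  # -0.99994
```

Since -16384/16384 is itself a multiple of the step, -1.0 can be stored exactly, so the off-by-one value is not forced by the format.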

@jenskutilek
Collaborator Author

After some more digging, it seems that the original difference 1 vs 1.0 lies in the way the UFO is created. When the UFO is built in memory, and then the hash is calculated, it contains floats. When the UFO is written to disk, then opened again (also with ufoLib2), and the hash is calculated, it contains ints.

I can work around this in my UFO builder, but maybe the HashPointPen should do some normalization there.

The HashPointPen has deviated from the UFO spec considerably (but for the better; I think the UFO spec should be updated). But maybe it is a good idea to reinstate the rounding as specified? Per the spec, coordinates and component offsets are rounded to max. 3 decimals; the transformation matrix values are rounded to max. 8 decimals:

> The x and y values are then written as decimal strings separated by a comma. The x and y values are rounded to a precision of no more than 3 decimal places.

> If the child is a component element, first the transform values are written, if any. [...] The four scale values will be rounded to a precision of 8 decimal places, and the offset values will be rounded to a precision of at most 3 decimal places.
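
As a rough sketch of the rounding described in the quoted spec text (not the code from any PR), the component transformation could be normalized like this. `round_like_spec` is a hypothetical helper name.

```python
def round_like_spec(transformation):
    """Round the four scale values to 8 decimals and the two offsets to 3."""
    xx, xy, yx, yy, dx, dy = transformation
    scales = tuple(round(v, 8) for v in (xx, xy, yx, yy))
    offsets = (round(dx, 3), round(dy, 3))
    return scales + offsets


tr = (0.91302490234375, 0, 0, 0.91302490234375, 516.00049, 272)
print(round_like_spec(tr))  # (0.9130249, 0, 0, 0.9130249, 516.0, 272)

# round() keeps the input type, so 1 and 1.0 still format differently.
print(type(round(1, 8)).__name__, type(round(1.0, 8)).__name__)  # int float
```

Two caveats under these assumptions: rounding to 8 decimals does not make 0.913 and its F2Dot14 approximation equal, and because `round` preserves int versus float, a separate step would still be needed to treat 1 and 1.0 alike.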

@jenskutilek
Collaborator Author

PR to round as in UFO spec: #3427
