refactor(font): rewrite mapping.pl in Python #3715

Open
wants to merge 7 commits into
base: main

@IvoWingelaar IvoWingelaar commented Sep 17, 2022

Assumes #3714 gets merged, and likewise this PR is ready for merging.

This rewrite is almost trivial, as we can reuse the mapping from the font generation code. Because this deduplicates code, there is no need to keep the mappings in sync. The only real rewrite is the inversion of the mapping from the original Perl code; the rest of the old Perl code was a duplication of the mapping.
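
As a rough illustration of the inversion step, here is a minimal sketch. The shape of the forward mapping, the names `FORWARD_MAPPING` and `invert_mapping`, and the sample entries are assumptions made for this example; they are not taken from the code in this PR.

```python
from typing import Dict, Tuple

# Assumed shape of the forward mapping used by font generation:
# TeX font name -> {TeX code point -> Unicode scalar value}.
# The entries below are illustrative only.
FORWARD_MAPPING: Dict[str, Dict[int, int]] = {
    "cmmi10": {0x7B: 0x131, 0x7C: 0x237},  # dotless i, dotless j
}


def invert_mapping(
    forward: Dict[str, Dict[int, int]],
) -> Dict[int, Tuple[str, int]]:
    """Return Unicode scalar value -> (TeX font name, TeX code point)."""
    inverted: Dict[int, Tuple[str, int]] = {}
    for tex_font, table in forward.items():
        for tex_code, unicode_value in table.items():
            # If two sources share a target, the last one seen wins here.
            inverted[unicode_value] = (tex_font, tex_code)
    return inverted


if __name__ == "__main__":
    inverted = invert_mapping(FORWARD_MAPPING)
    for unicode_value, (font, code) in sorted(inverted.items()):
        print(f"U+{unicode_value:04X} <- {font} {code:#04x}")
```

Because the inverse is derived from the forward table, only one of the two needs to be maintained by hand.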

This commit also contains the newly generated fontMetricsData.js. The changes in this file correspond to adding all of the metrics of two fonts (Caligraphic-Bold and Fraktur-Bold), and a few touch-ups for several glyphs:

  • In Main-Italic we add metrics for two glyphs:

    • 305 / 0x131: dotless i
    • 567 / 0x237: dotless j
  • The rest of the additions are in Typewriter-Regular:

    • 168 / 0xA8: diaeresis
    • 710 / 0x2C6: circumflex
    • 729 / 0x2D9: dot above (but the glyph is a line below?)
    • 732 / 0x2DC: small tilde
    • 733 / 0x2DD: double acute accent

    729 and 733 are a little weird as the Unicode descriptions do not match their glyphs.

  • Three glyphs are removed from Typewriter-Regular: (770, 771, 776) / (0x302, 0x303, 0x308). These are not present in the generated font files, so it's safe to remove them.

  • Only one glyph in Typewriter-Regular has changed metrics: 126 / 0x7E, whose height and depth change.
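
A summary like the one above can be produced by comparing the old and new metric tables for a font. The following is a minimal sketch under stated assumptions: the function name `diff_metrics`, the `{code point: (depth, height)}` layout, and all numeric values are placeholders for illustration and are not the real contents of fontMetricsData.js.

```python
from typing import Dict, List, Tuple

Metrics = Dict[int, Tuple[float, float]]  # code point -> (depth, height), assumed layout


def diff_metrics(old: Metrics, new: Metrics) -> Tuple[List[int], List[int], List[int]]:
    """Return the code points that were added, removed, and changed."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(cp for cp in old.keys() & new.keys() if old[cp] != new[cp])
    return added, removed, changed


# Placeholder numbers only; the code points mirror the Typewriter-Regular list above.
OLD: Metrics = {126: (0.0, 1.0), 770: (0.0, 1.0), 771: (0.0, 1.0), 776: (0.0, 1.0)}
NEW: Metrics = {
    126: (0.5, 0.5),
    168: (0.0, 1.0),
    710: (0.0, 1.0),
    729: (0.0, 1.0),
    732: (0.0, 1.0),
    733: (0.0, 1.0),
}

if __name__ == "__main__":
    added, removed, changed = diff_metrics(OLD, NEW)
    print("added:  ", [hex(cp) for cp in added])
    print("removed:", [hex(cp) for cp in removed])
    print("changed:", [hex(cp) for cp in changed])
```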

Disregarding the added metrics in the generated file, merging this chain of PRs removes a net of 1K LOC.

Closes #3702.

The old version was a big list of commands to `mftrace` in a single
target. Now the code uses a pattern to create the different files in a
more idiomatic way.

It is not necessary to explicitly state the encoding for some files.
The `mftrace` script can always find the correct encoding
automatically.

This list isn't necessary because we generate a bunch of fonts in the
docker container, and copy everything over. This list is implicitly
defined by the output of `.ff` scripts.

Rewriting `makeFF` in Python improves the code in several ways:
- Increases maintainability as Python code is more readable than
  dense Perl code using mostly string substitutions (sometimes with
  regular expressions) to write scripts, and rewrite other files.
- Pair kerning is a feature that requires us to read the `.tfm` files
  that contain those metric tables. The current codebase contains a
  `.tfm` parser written in Python, not in Perl. This commit makes it
  possible to use that to build pair kerning tables in the FontForge
  scripts.

This new code also splits the mapping between the old TeX font
codepoints and the modern Unicode scalar values into a separate module
that can in the future be reused for metrics generation. This decreases
code duplication, and makes changes more robust. It's also easier to
write sanity checks in the new mappings to catch easy-to-miss mistakes
like codepoints being mapped twice.
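
A sanity check of the kind mentioned above could look roughly like this. It is a sketch, not the module from this commit: the name `build_mapping` and the idea of declaring the mapping as a list of pairs are assumptions. The list form matters because a Python dict literal silently keeps only the last of two identical keys, which would hide exactly this mistake.

```python
from typing import Dict, Iterable, Tuple


def build_mapping(pairs: Iterable[Tuple[int, int]]) -> Dict[int, int]:
    """Build TeX code point -> Unicode scalar value, rejecting duplicates."""
    mapping: Dict[int, int] = {}
    source_of: Dict[int, int] = {}  # Unicode scalar value -> TeX code point
    for tex_code, unicode_value in pairs:
        if tex_code in mapping:
            raise ValueError(f"TeX code point {tex_code:#x} is mapped twice")
        if unicode_value in source_of:
            raise ValueError(
                f"U+{unicode_value:04X} is the target of both "
                f"{source_of[unicode_value]:#x} and {tex_code:#x}"
            )
        mapping[tex_code] = unicode_value
        source_of[unicode_value] = tex_code
    return mapping


if __name__ == "__main__":
    # Illustrative entries only.
    print(build_mapping([(0x7B, 0x131), (0x7C, 0x237)]))
    try:
        build_mapping([(0x7B, 0x131), (0x7B, 0x237)])
    except ValueError as error:
        print("rejected:", error)
```
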
Python's module resolution rules require us to have a slightly
different directory structure in order to properly share the mapping
code in a separate file.

This rewrite is almost trivial, as we can reuse the mapping from the
font generation code. Because this deduplicates code, there is no need
to keep the mappings in sync. The only real rewrite is the inversion of
the mapping from the original Perl code; the rest of the code was a
duplication of the mapping.

This commit also contains the newly generated `fontMetricsData.js`.
The changes in this file correspond to adding all of the metrics of two
fonts (`Caligraphic-Bold` and `Fraktur-Bold`), and a few touch-ups for
several glyphs:
- In `Main-Italic` we add metrics for two glyphs:
  - 305 / 0x131: dotless i
  - 567 / 0x237: dotless j
- The rest of the additions are in `Typewriter-Regular`:
  - 168 / 0xA8: diaeresis
  - 710 / 0x2C6: circumflex
  - 729 / 0x2D9: dot above (but the glyph is a line below?)
  - 732 / 0x2DC: small tilde
  - 733 / 0x2DD: double acute accent

  729 and 733 are a little weird as the Unicode descriptions do not
  match their glyphs.
- Three glyphs are removed from `Typewriter-Regular`: (770, 771, 776)
  / (0x302, 0x303, 0x308). These are not present in the generated font
  files, so it's safe to remove them.
- Only one glyph in `Typewriter-Regular` has changed metrics: 126 /
  0x7E, whose height and depth change.
Development

Successfully merging this pull request may close these issues.

Code duplication and maintainability ambiguities in font and metric generation