{nb} breaks if text shaping is turned on with certain fonts #1090

catsclaw opened this issue Jan 12, 2024 · 4 comments

catsclaw commented Jan 12, 2024

The special {nb} code fails with some fonts when text shaping is turned on.

Minimal code
Please include some minimal Python code reproducing your issue:

from fpdf import FPDF

pdf = FPDF(format='letter')
pdf.add_font('gentium', style='', fname='GenBkBasR.ttf')
pdf.add_page()
pdf.set_font('gentium', '', 24)
# With text shaping on, {nb} is not replaced
pdf.set_text_shaping(True)
pdf.write(text='Pages {nb}')
pdf.ln()
# With text shaping off, {nb} is replaced as expected
pdf.set_text_shaping(False)
pdf.write(text='Pages {nb}')
pdf.output('test.pdf')

Result
[image: screenshot of the resulting PDF]
Environment
Please provide the following information:

  • Operating System: Ubuntu
  • Python version: 11
  • fpdf2 version used: 2.7.7

Lucas-C commented Jan 12, 2024

Thank you for the report @catsclaw

You are right, those two features are currently incompatible.

The reason is that with text shaping, each character is rendered individually in the PDF (with a dedicated Tj operator for each letter). But in FPDF._substitute_page_number() we look for the sequence {nb} to be present inside a single "PDF string" (rendered by a single Tj operator).

As a consequence, this is currently a limitation in fpdf2.
We should mention it in our documentation (in docs/PageBreaks.md).
And PRs are welcome to implement this feature also when text shaping is enabled!

Would you be interested in contributing to this @catsclaw? (docs improvement and/or implementation)
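
The mismatch described above can be sketched with a toy content stream. This is a simplified model, not fpdf2's actual output: real streams encode text differently, and `substitute_page_count` is a hypothetical stand-in for the substitution step, assumed here to be a plain substring replacement.

```python
# Simplified model of a PDF page content stream (not fpdf2's real encoding).
ALIAS = "{nb}"

# Without per-glyph positioning, the text is one PDF string shown by one Tj.
single_run = "BT (Pages {nb}) Tj ET"

# With per-glyph positioning, every character gets its own Tj operator.
per_glyph = "BT " + " ".join(f"({ch}) Tj" for ch in "Pages {nb}") + " ET"


def substitute_page_count(stream: str, total_pages: int) -> str:
    """Hypothetical stand-in: replace the alias by plain substring search."""
    return stream.replace(ALIAS, str(total_pages))


replaced_single = substitute_page_count(single_run, 3)  # alias found and replaced
replaced_split = substitute_page_count(per_glyph, 3)    # alias never matches
```

In the second stream the four characters of the alias are separated by operators, so a substring search over the page content has nothing to match.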

andersonhc commented

The characters will be rendered as a single sequence if they only move along the x axis by the character width, but if there is any offset (kerning, etc.) we need to adjust the text matrix and emit individual Tj operators. That's why only some fonts have this problem.
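
A rough sketch of that grouping rule, under stated assumptions: `ShapedGlyph` and `split_into_runs` are hypothetical names invented for illustration, and the logic only mirrors the description above, not fpdf2's actual code.

```python
from dataclasses import dataclass


@dataclass
class ShapedGlyph:
    """Hypothetical shaper output for one glyph (illustration only)."""
    char: str
    width: float        # natural advance width of the glyph
    x_advance: float    # advance reported by the shaper
    x_offset: float = 0.0
    y_offset: float = 0.0


def split_into_runs(glyphs):
    """Group glyphs into strings that could each be shown by a single Tj."""
    runs, current = [], ""
    for g in glyphs:
        if g.x_offset or g.y_offset:
            # A displaced glyph needs its own positioning: close the open run.
            if current:
                runs.append(current)
            current = ""
        current += g.char
        if g.x_advance != g.width:
            # The pen does not land at the default position: end the run here.
            runs.append(current)
            current = ""
    if current:
        runs.append(current)
    return runs


# A font with no adjustments keeps the alias in one run.
plain = [ShapedGlyph(c, 10.0, 10.0) for c in "{nb}"]
# A font that kerns after "n" splits the alias into two runs.
kerned = [
    ShapedGlyph("{", 10.0, 10.0),
    ShapedGlyph("n", 10.0, 9.5),
    ShapedGlyph("b", 10.0, 10.0),
    ShapedGlyph("}", 10.0, 10.0),
]
plain_runs = split_into_runs(plain)
kerned_runs = split_into_runs(kerned)
```

Whether the alias survives intact therefore depends on the font's kerning and positioning data, not on the text itself.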

andersonhc commented Jan 27, 2024

Adding to this issue:

  • When the alias ({nb}) appears in a multi cell with justified alignment, the line width is calculated using the width of the alias instead of the final number, so the text won't be correctly justified.
  • If the multi cell breaks the alias across two lines, it won't be replaced by the number of pages.

Example:

from fpdf import FPDF
text="Lorem ipsum dolor sit amet, {nb} {nb} {nb} {nb} {nb} {nb} consectetur adipiscing elit. {nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}{nb}Mauris sit amet lacus ut ex tincidunt vulputate non nec mauris. Lorem ipsum dolor sit amet, consectetur adipiscing elit."
pdf = FPDF()
pdf.add_page()
pdf.set_font("helvetica", "", 24)
pdf.multi_cell(w=pdf.epw, text=text, align="J", new_x="LEFT")
pdf.output('test_nb.pdf')

Result:
[image: issue_nb]

The problem is that the replacement is done directly in the page content, after all the rendering is done.
I don't see an obvious way to fix it, and it will probably require a lot of rework of how output works.
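
Both effects can be illustrated with a toy layout model. This is a sketch under stated assumptions, not fpdf2's layout code: the fixed-width font, the widths, and the helper names are all made up for the example.

```python
CHAR_WIDTH = 5.0   # hypothetical fixed-width font, arbitrary units
ALIAS = "{nb}"


def text_width(s: str) -> float:
    return len(s) * CHAR_WIDTH


def extra_space_per_gap(line: str, target_width: float) -> float:
    """Space added to each gap so that the line fills target_width."""
    return (target_width - text_width(line)) / line.count(" ")


# 1. Justification is computed while the alias is still in the text.
line = "page {nb} of the report"
target = 150.0
spacing = extra_space_per_gap(line, target)

# The replacement happens later, so the spacing is not recomputed.
final_line = line.replace(ALIAS, "7")
rendered_width = text_width(final_line) + spacing * final_line.count(" ")
shortfall = target - rendered_width   # the line ends short of the margin

# 2. An alias split by a line break is never seen as a whole.
wrapped = ["see page {n", "b} for details"]
replaced = [part.replace(ALIAS, "7") for part in wrapped]
```

The shortfall equals the width difference between the alias and the number that replaces it, which is why the mismatch grows with longer aliases or narrower digits.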

gmischler commented

The underlying problem here is that an otherwise legitimate sequence of text characters is given a special meaning under certain circumstances. This was bound to result in conflicts somewhere down the line.

The clean solution would be to use a reserved Unicode character for this purpose, which can't otherwise appear in renderable text.
A practical approach might be to convert self.str_alias_nb_pages into a special Glyph() subtype (say NbGlyph()) during text parsing. When rendering, NbGlyph() then inserts a sequence of three or four of this reserved Unicode character. And before writing the file, the reserved character sequences get replaced with the right sequence of digit glyphs.

Or am I missing some basic obstacle here?
Yes, various places in the code need to learn about this special case, but that is kind of inevitable if we want to avoid conflicts.
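
A minimal sketch of this idea, with several assumptions: the private-use code point U+E000 is picked only as an example of a character that should not occur in normal text, glyphs are modeled as plain characters rather than a Glyph() subtype, and `parse_text`, `render_per_glyph` and `finalize` are hypothetical names.

```python
ALIAS = "{nb}"
PLACEHOLDER = "\uE000"              # example private-use code point (assumption)
PLACEHOLDER_RUN = PLACEHOLDER * 3   # fixed-length marker inserted at parse time


def parse_text(text: str) -> str:
    """Swap the alias for the placeholder run before any layout happens."""
    return text.replace(ALIAS, PLACEHOLDER_RUN)


def render_per_glyph(text: str) -> list:
    """Model per-glyph rendering: one list item per character."""
    return list(text)


def finalize(glyphs: list, total_pages: int) -> list:
    """Replace each placeholder run with the digits of the page count."""
    marker = list(PLACEHOLDER_RUN)
    out, i = [], 0
    while i < len(glyphs):
        if glyphs[i:i + len(marker)] == marker:
            out.extend(str(total_pages))
            i += len(marker)
        else:
            out.append(glyphs[i])
            i += 1
    return out


glyphs = render_per_glyph(parse_text("Pages {nb}"))
result = "".join(finalize(glyphs, 12))
```

Because the marker is a run of one repeated character, it can be recognized glyph by glyph even when each glyph is emitted separately. The sketch does not address line widths, which would still be measured on the placeholder.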

evilaliv3 added a commit to globaleaks/GlobaLeaks that referenced this issue May 22, 2024