Support for subscript/superscript in cell/multicell using markdown #860

Tolker-KU · 2023-07-21T17:07:13Z

Hi,

Thanks for all the great work going into this project!

I wonder if you have considered supporting subscript/superscript in cell/multicell when styling text with markdown?

Github supports this in their markdown implementation using the HTML tags <sub>/<sup>. I imagine fpdf2 could do something similar.

If you think this is a good idea, I would be happy to take a crack at it. It seems that the machinery for this feature is already in place.

Lucas-C · 2023-07-23T20:29:24Z

Hi @Tolker-KU!

Thank you for your nice words 😊

I think this was implemented by @gmischler in #520:
https://pyfpdf.github.io/fpdf2/TextStyling.html#subscript-superscript-and-fractional-numbers

I think it should work for multi_cell(), but we currently only have unit tests for .write(),
so extra unit tests covering multi_cell() would be welcome!
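
For reference, here is a minimal sketch of the existing write()-based approach described on the documentation page linked above. It relies on the documented char_vpos attribute; the helper name write_fragments, the main() wrapper, the line height and the output file name are illustrative assumptions, not part of fpdf2. The line height and text are passed positionally so the sketch does not depend on the keyword name of the text parameter.

```python
def write_fragments(pdf, fragments, line_height=8):
    """Write (text, vpos) pairs one after another with FPDF.write().

    `fragments` is a list of (text, vpos) tuples, where vpos is one of the
    char_vpos values used here: "LINE", "SUP" or "SUB".
    """
    for text, vpos in fragments:
        pdf.char_vpos = vpos  # vertical position applied to the text written next
        pdf.write(line_height, text)
    pdf.char_vpos = "LINE"  # restore the normal baseline for later output


def main():
    # Imported lazily so that the helper above can be reused without fpdf2.
    from fpdf import FPDF

    pdf = FPDF()
    pdf.add_page()
    pdf.set_font("Helvetica", size=20)
    write_fragments(pdf, [("E = MC", "LINE"), ("2", "SUP")])
    pdf.write(8, " and ")
    write_fragments(pdf, [("H", "LINE"), ("2", "SUB"), ("O", "LINE")])
    pdf.output("superscript_write.pdf")  # hypothetical file name
```

Calling main() should produce a one-page PDF; note that this only covers write(), not cell() or multi_cell().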

gmischler · 2023-07-23T21:33:06Z

I think this was implemented by @gmischler in #520:

#520 implements the general ability to render subscript and superscript text, as well as the <sub> and <sup> tags for write_html().
However, the feature is not currently supported by our version of markdown.

The reason for the latter was that I couldn't find a standard on which characters to use as markup.
The most popular markdown variant, CommonMark, doesn't support them either, for reasons that aren't entirely clear.
But then, since our own markdown variant is rather weird anyway (fundamentally incompatible with any others), we could theoretically choose whatever we want... I've seen ^x^ and ~x~ suggested most often; in our case it would probably make sense to double them, like ^^x^^ and ~~x~~, to match the style of the existing tags.

I'm not very comfortable with borrowing tags from HTML. Why not just use HTML in the first place then?
Github accepting <sub> and <sup> HTML tags has little to do with markdown. It simply passes those through to the browser unchanged, just as it does with <b>, <i>, etc.

And while we're on the topic: Adding a conforming commonmark implementation (possibly in parallel) should probably be the long term goal.

Tolker-KU · 2023-07-24T13:59:35Z

Thanks for getting back so quickly.

I'm looking for a feature to render subscripts and superscript within cells. As far as I can figure out this is not quite achievable with .write_html. Or am I wrong here?

What do you think about adding the ^^ and ~~ tags to the markdown syntax, so one can do .cell(txt="H~~2~~O") -> H₂O or .cell(text="E=MC^^2^^") -> E = MC²?
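
To make the proposal concrete, here is a minimal sketch of how such markers could be split into text fragments tagged with a vertical position. This is purely illustrative: fpdf2 does not parse ^^ or ~~ this way, and the function name split_vpos and the "LINE"/"SUP"/"SUB" tags are assumptions chosen to mirror the char_vpos values.

```python
import re
from typing import List, Tuple

# Hypothetical markers taken from the proposal above: ^^x^^ and ~~x~~.
_MARKUP = re.compile(r"\^\^(.+?)\^\^|~~(.+?)~~")


def split_vpos(text: str) -> List[Tuple[str, str]]:
    """Split text into (fragment, vpos) pairs; vpos is "LINE", "SUP" or "SUB"."""
    fragments = []
    pos = 0
    for match in _MARKUP.finditer(text):
        if match.start() > pos:
            # Plain text between the previous marker and this one.
            fragments.append((text[pos:match.start()], "LINE"))
        if match.group(1) is not None:
            fragments.append((match.group(1), "SUP"))
        else:
            fragments.append((match.group(2), "SUB"))
        pos = match.end()
    if pos < len(text):
        # Trailing plain text after the last marker.
        fragments.append((text[pos:], "LINE"))
    return fragments


print(split_vpos("H~~2~~O"))    # [('H', 'LINE'), ('2', 'SUB'), ('O', 'LINE')]
print(split_vpos("E=MC^^2^^"))  # [('E=MC', 'LINE'), ('2', 'SUP')]
```

A real implementation would also need escaping rules and a decision on how these markers interact with the existing bold, italic and underline markers.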

Lucas-C · 2023-07-25T15:49:20Z

I'm looking for a feature to render subscripts and superscript within cells. As far as I can figure out this is not quite achievable with .write_html. Or am I wrong here?

No, you are right.
fpdf2 currently does not support <sub> & <sup> tags inside <table>:

from fpdf import FPDF

pdf = FPDF()
pdf.set_font("Helvetica")
pdf.add_page()
pdf.write_html(
    """<table border="1"><thead><tr>
        <th width="33%">Name</th>
        <th width="66%">Formula</th>
    </tr></thead><tbody><tr>
        <td>Lucas-C</td><td>E = MC<sup>2</sup></td>
    </tr></tbody></table>""")
pdf.output("issue_860.pdf")

I agree that it would be nice if fpdf2 supported this usage! 😊
I would welcome a PR that implements this in HTML2FPDF: https://github.com/PyFPDF/fpdf2/blob/master/fpdf/html.py#L195

I also fully agree with you @gmischler on this:

And while we're on the topic: Adding a conforming commonmark implementation (possibly in parallel) should probably be the long term goal.

Ideally, we could support combining fpdf2 with https://github.com/executablebooks/markdown-it-py
But then, would the translation chain be Markdown -> HTML, and then use FPDF.write_html()?
This is not ideal, as our HTML2PDF converter is very limited: https://pyfpdf.github.io/fpdf2/HTML.html

So I'm not really sure of the path forward regarding Markdown support...

Tolker-KU · 2023-07-26T09:10:05Z

Ideally, we could support combining fpdf2 with https://github.com/executablebooks/markdown-it-py But then, would the translation chain be Markdown -> HTML, and then use FPDF.write_html()? This is not ideal, as our HTML2PDF converter is very limited: https://pyfpdf.github.io/fpdf2/HTML.html

So I'm not really sure of the path forward regarding Markdown support...

I think markdown-it-py parses markup to tokens before rendering to HTML. Maybe fpdf2 can render the tokens directly to PDF instead of using HTML as an intermediate step.

https://markdown-it-py.readthedocs.io/en/latest/using.html#the-token-stream
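
As a rough illustration of that idea, the sketch below walks the token stream and flattens inline tokens into (text, style) runs that a renderer could consume. It only handles bold and italic; the function name inline_runs and the style strings are assumptions, and nothing here is an existing fpdf2 feature. The token types used (inline, text, strong_open, strong_close, em_open, em_close) are the ones markdown-it-py emits for emphasis.

```python
def inline_runs(tokens):
    """Flatten markdown-it-py tokens into (text, style) pairs.

    style is "", "B", "I" or "BI", mirroring the style strings of FPDF.set_font().
    """
    runs = []
    for token in tokens:
        if token.type != "inline":
            continue  # block-level tokens (paragraph_open, ...) are ignored here
        bold = italic = 0
        for child in token.children or []:
            if child.type == "strong_open":
                bold += 1
            elif child.type == "strong_close":
                bold -= 1
            elif child.type == "em_open":
                italic += 1
            elif child.type == "em_close":
                italic -= 1
            elif child.type == "text" and child.content:
                style = ("B" if bold else "") + ("I" if italic else "")
                runs.append((child.content, style))
    return runs


def main():
    # Imported lazily; requires the markdown-it-py package.
    from markdown_it import MarkdownIt

    tokens = MarkdownIt().parse("Water is **H2O**, *not* H2O2.")
    print(inline_runs(tokens))
```

Each run could then be rendered with set_font() followed by write(), which sidesteps HTML but inherits the limitations of write().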

Lucas-C · 2023-07-26T09:16:46Z

I think markdown-it-py parses markup to tokens before rendering to HTML. Maybe fpdf2 can render the tokens directly to PDF instead of using HTML as an intermediate step.

Sure, we could do that!
But then we will basically have to maintain a new "Markdown2PDF" class 😅

I'm not opposed to this, if someone is willing to contribute / initiate such a converter to this project,
and if it is mostly compatible / does not break too many existing behaviours of fpdf2.

Tolker-KU · 2023-07-26T09:38:55Z

I'm looking for a feature to render subscripts and superscript within cells. As far as I can figure out this is not quite achievable with .write_html. Or am I wrong here?

No, you are right. fpdf2 currently does not support <sub> & <sup> tags inside <table>:
from fpdf import FPDF

pdf = FPDF()
pdf.set_font("Helvetica")
pdf.add_page()
pdf.write_html(
    """<table border="1"><thead><tr>
        <th width="33%">Name</th>
        <th width="66%">Formula</th>
    </tr></thead><tbody><tr>
        <td>Lucas-C</td><td>E = MC<sup>2</sup></td>
    </tr></tbody></table>""")
pdf.output("issue_860.pdf")

I've been looking into how to solve this. It seems that cells in tables rendered from HTML call FPDF.multi_cell().
https://github.com/PyFPDF/fpdf2/blob/54d2eb0266bd3b1ccbf4dc384ea46c9b0d6b718d/fpdf/table.py#L278-L293
As far as I can see, FPDF.multi_cell() has no ability to render text with mixed vpos. One idea is to expose something like _render_styled_text_line() on FPDF that takes a TextLine which supports text fragments with different styling. Could that be a way forward?

gmischler · 2023-07-27T10:51:00Z

As far as I can see, FPDF.multi_cell() has no ability to render text with mixed vpos. One idea is to expose something like _render_styled_text_line() on FPDF that takes a TextLine which supports text fragments with different styling. Could that be a way forward?

As you have correctly recognized, this is a fundamental limitation of multi_cell().
For formatting changes within a paragraph, there is the alternative write(), but that currently has the disadvantage that it can only create left-aligned text.

Fixing this cleanly requires some architectural changes to fpdf2. I have outlined a possible solution in #339, and have been working on-and-off on an actual implementation. I hope I'll find time again soon so I can actually show some more progress here.

Theoretically, write_html() could also get more low-level access to the fpdf.py internals as you suggest, but I think a more general high-level approach to text formatting is better in the long run. Several similar issues have been raised over the last year, which all correctly pointed at the same set of current limitations. I'm sorry to say that the necessary groundwork for a true and general solution will take a bit more time.

Lucas-C · 2023-08-02T11:15:23Z

By the way, I think that this other, older issue is related: #151

Tolker-KU added the enhancement label Jul 21, 2023

Lucas-C added the pending-answer and multi_cell labels Jul 23, 2023

Lucas-C added the research needed (too complicated to implement without careful study of official specifications), up-for-grabs, hacktoberfest and markdown labels and removed the pending-answer label Jul 26, 2023
