New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Implement inline-toc extra for TOC HTML insertion after first heading #280
Conversation
Fixes #279 Instead of adding a new CLI switch and trying to figure out whether CLI vs module usage is distinguishable, I have decided to create a sister extra for toc and make if behave the same way, but insert the TOC HTML to the MarkDown HTML (after the first heading) as well as preserving it as a property. This allows CLI usages the flexibility to just replace `toc` with `inline-toc` and get a table of contents right after the first heading without any worries about backwards compatibility or pollution the CLI switches. I believe this to be the cleanest solution.
lib/markdown2.py
Outdated
if "toc" in self.extras: | ||
if ("toc" in self.extras or "inline-toc" in self.extras): | ||
# Generate TOC HTML as a property to be able to hijack it in "inline-toc" for subsctitution | ||
# TODO (Tomas Hubelbauer): See about using that as a static method without a need for a throwaway instance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is currently needlessly inefficient, here I just take advantage of the fact that the object with toc_html
on it is already created and replace its reference with a new copy. I will investigate static methods in Python to see if I can run inline-toc
before line 393 and avoid having to create two objects.
- Investigate using a static method and avoid creating a throwaway object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not end up using staticmethod
decorator because I wasn't sure if Python 2.2 and 2.3 are expected to be supported and even then it gave me a hard time with me calling it within UnicodeWithAttrs
where it was defined but it reporting UnicodeWithAttrs.toc_html
not existing.
lib/markdown2.py
Outdated
rv._toc = self._toc | ||
if "inline-toc" in self.extras: | ||
if self._toc[0] is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have no headings, there is no need to prepend anything (toc_html
with be either empty or just <ul></ul>
I need to check), so this if
can probably be simplified.
- Check if if
_toc[0]
isNone
it makes sense to insert the TOC HTML or if it is empty and we can just skip
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have changed this to only include the TOC HTML if any heading exist.
lib/markdown2.py
Outdated
# Need to use a regex and rely on the HTML structure, tracking the regex's `end()` across all the HTML transformations would be extremely unreliable | ||
# TODO (Tomas Hubelbauer): Consider looser regex which allows for more attributes in order to to find heading even when more extras add attributes to it (future-proof) | ||
pattern = r"\<h{} id=[\"\']{}[\"\']\>{}<\/h{}\>".format(level, id, re.escape(name), level) | ||
text = re.sub(pattern, "\g<0>" + rv.toc_html, text) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to somehow check for the failure of the substitution and prepend the TOC HTML instead, but not sure what would be the best way to go about that.
Oh and this probably doesn't deal well with multiple headings with the same text, I need to check for that.
- Verify this only replaces once and not multiple headings with the same
level
,id
andtext
if ("inline-toc" in self.extras and self._toc[0] is not None): | ||
(level, id, name) = self._toc[0] | ||
# Use a regex and rely on the HTML structure as opposed to tracking the heading regex's `end()` across all the HTML transformations (unreliable) | ||
# TODO (Tomas Hubelbauer): Consider looser regex which allows for more attributes in order to to find heading even when more extras add attributes to it (future-proof) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am considering not doing that now because it feels premature. @nicholasserra thoughts?
`toc_html = property(calculate_toc_html(self._toc))` doesn't seem to work because `self` is not valid in that context so I had to hack around it with the useless proxy method.
lines[-1] += "</li>" | ||
lines.append("%s</ul>" % indent()) | ||
return '\n'.join(lines) + '\n' | ||
return calculate_toc_html(self._toc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to hack around the fact that self
was not valid when used like this:
toc_html = property(calculate_toc_html(self._toc))
I am not sure if this can be done better.
Nice, thanks. Give me some time to go over this one. I don't really love the idea of the separate extra to accomplish this. I'll probably take a stab at my ideal implementation of this and see where I land on it. |
Curious, why do you want to insert the toc html after the first heading as opposed to just tossing it at the beginning or end of the entire output? |
@nicholasserra I am of the mindset that the first heading is not really a heading, but rather a title and when crafting documents by hand, I add a semi-TOC after my first heading, so the logical structure becomes:
Possibly a personal preference, that's for sure. On the other hand, it's true (and I only now realize this) that for this to make complete sense, the first level of headings should be skipped in the TOC (pretty much the opposite of the recent TOC depth PR) because otherwise one is going to end up with a title first-level heading and underneath a TOC with the top-level item being the heading itself. I am cool with redoing the PR to only prepend, for my use-case I think a little Bash can help me move the heading where I need it to achieve my personal preference without pushing it onto the users of this library. |
@nicholasserra Would you like me to try to rewrite this as a parameter to I gave another thought to prepending and I don't really see anyone using it that way, so I think more control is necessary for this to make sense and I think providing a solution that utilizes substitution would make this functionality flexible enough for most use cases. I still don't know how to tackle the fact that the TOC HTML will contain all the headings, including the ones the precede it in the document if user places it into a nested section (like {title} > Table of Contents). I think this is a hard problem. We can just drop first level headings (or only the first one) but some people start their documents with level-3 headings because of the rendered output heading size (yep...), some people have multiple first-level headings which also doesn't jive well with this… A lot of variation. We could track which headings come before the substitution string (the value of the |
Apologies for delays. Couple thoughts: Regarding toc html placement: I think we should pick a spot and run with it. My general feeling with extras at this point in this package is that I want them to be lean and not too invasive in the codebase. If I were doing this alone i'd probably just toss the toc html at the top to avoid having to do too much ugly logic for placement. The thought of another flag or indicator specifying the placement doesn't sit well with me for the lean reasons. I think toc via the CLI is an edge case itself, so let's keep it clean. I'm fine with your thoughts on placement below the first heading if you think we can do it reliably. If not let's just toss it before any output. Regarding duplicate extra vs detecting CLI: I think it should be possible to pass a cli=True flag to the markdown class inside the I still haven't taken time to actually hack on this, but looks possible. Thanks again for your work and feedback on this. |
@TomasHubelbauer I finally took a stab at some code on this. Check out PR #283 . As I was testing this out I became less concerned about backwards compat. This feels more like a bug during CLI usage to me now. Essentially, it was broken in CLI because there was no meaningful way to use it. I went the route of just prepending the toc_html to the output. I know you have some reservations on that. Maybe we toss in your change for the smart output. I guess I think it's overkill. The CLI flag ended up working nicely. No need for a second extra. Let me know your thoughts. |
Hey @nicholasserra, thanks for getting back to this! I think your PR looks great, prepending is fine. While I do think that we can identify the heading reliably to be able to insert the TOC HTML after it, that's something that can easily live in a script wrapping |
Fixes #279
Instead of adding a new CLI switch and trying to figure out whether CLI vs module usage is distinguishable, I have decided to create a sister extra for toc and make if behave the same way, but insert the TOC HTML to the MarkDown HTML (after the first heading) as well as preserving it as a property. This allows CLI usages the flexibility to just replace
toc
withinline-toc
and get a table of contents right after the first heading without any worries about backwards compatibility or pollution to the CLI switches. I believe this to be the cleanest solution.@nicholasserra I don't know how to write Python and this design also happened to be the simplest one I think so that allowed me to take a stab at it. I am sure there is plenty to fix, but the PR is quite small so it should not take long to review. I will gladly improve it based on the review comments. This is missing tests currently but AFAICT there are no tests for TOC to begin with. I can add some for both, just wanted to get a first wave of the review first and see if you would accept this before I attempt that.