-
Notifications
You must be signed in to change notification settings - Fork 228
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Yet another Math implementation #734
Conversation
8a59500
to
ebbcf99
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fairly solid! I dismissed the curly brace matching feature initially because it seemed too complex to implement compared to its utility. Your approach here makes it look quite straightforward though.
In my implementation (rhysd#1), I opted to emit each dollar as its own item in the first pass, instead of eagerly combining pairs as display tags. This seemed to make parsing simpler. It avoids the "parity avalanche" where a closing inline tag requires all subsequent dollars in the same run to be reconsidered. Our implementations differ in these corner cases:
$x$$$$$
yours: <p><span class="math inline">x</span>$$$$</p>
mine: <p><span class="math inline">x</span><span class="math display"></span></p>
$x$$$$$$$y$$
yours: <p><span class="math inline">x</span>$<span class="math display"></span>
<span class="math inline">y</span>$</p>
mine: <p><span class="math inline">x</span><span class="math display"></span>
<span class="math display">y</span></p>
Although I've just noticed commonmark-hs does not accept empty display math either, so it disagrees with both of us:
$x$$$$$$$y$$
<p><span class="math inline">x</span>$$$<span class="math display">$y</span></p>
- Display math can no longer be empty (aligns with commonmark-hs) - Add test case $x$$$$$$$y$$ from raphlinus#734 - Add description about how content is handled compared to code spans: newlines, surrounding spaces, and pipes inside tables.
- Display math can no longer be empty (aligns with commonmark-hs) - Add test case $x$$$$$$$y$$ from raphlinus#734 - Add description about how content is handled compared to code spans: newlines, surrounding spaces, and pipes inside tables. (Almost) implemented in https://github.com/ollpu/pulldown-cmark/tree/alt-math
Heads up: I changed #733 to also require display math to be non-empty. The behavior I described for escaped pipes in tables should be the same as in your implementation as far as I understand. |
- Display math can no longer be empty (aligns with commonmark-hs) - Add test case $x$$$$$$$y$$ from raphlinus#734 - Add description about how content is handled compared to code spans: newlines, surrounding spaces, and pipes inside tables.
8172f8f
to
2d0091f
Compare
I've rebased this PR onto the current mainline, and started fuzz testing it against commonmark-hs (and found a couple of problems). |
Great! I'll clean up the spec PR and add brace handling to it hopefully next week, so we can move forward with this. |
Hi!, and sorry for the long time without a response about this. I still have to review the code, but I have reviewed the possible specs and #733 and this are the way to go in my opinion. The only thing I would change is the target of the pull request, so for major changes it is better to use the branch https://github.com/pulldown-cmark/pulldown-cmark/tree/branch_0.11 I created last week. I will review the code of both pull requests and this will be merged soon, finally! The spec is mandatory in the crate docs though, I guess that we can convert #733 complete description into RustDoc comments. Thanks to both for this great work! |
The only thing in the spec I would change is using |
Based on pulldown-cmark#622 and copied from https://github.com/ollpu/pulldown-cmark/tree/alt-math. Co-authored-by: rhysd <lin90162@yahoo.co.jp>
This feature is loosely based on what 63a29a1 described, but copies [commonmark-hs] more closely (the balanced braces feature is added). [commonmark-hs]: https://github.com/nschloe/github-math-bugs It largely ignores GitHub, because its math parsing [is very buggy]. [is very buggy]: https://github.com/nschloe/github-math-bugs
This approach, based on @ollpu's suggestion, tracks single `$`s in the inline tree, and merges them later. It avoids having to merge and unmerge them in some corner cases.
The essential problem is: every time you write `$$x$}`, you get another entry added to a hash table. Even if it's not [theoretically] *quadratic*, it's still slow. Hard limiting it to 255 entries makes this not a problem. Interestingly enough, when I tried to write an analogous torture test for code spans, I couldn't find a way to do it because code spans are keyed by their *length* instead of their *position*. In order to get N entries in the hash table, I basically had to write N `` ` `` in a row, forcing me to write quadratic amounts of input text. Comparison: ``` michaelhowell@Michael-Howells-Macbook-Pro pulldown-cmark % python3 -c 'print("$$x$}"*5000)' | time target/release/pulldown-cmark.old -M > /dev/null target/release/pulldown-cmark.old -M > /dev/null 2.63s user 0.02s system 99% cpu 2.673 total michaelhowell@Michael-Howells-Macbook-Pro pulldown-cmark % python3 -c 'print("$$x$}"*5000)' | time target/release/pulldown-cmark.new -M > /dev/null target/release/pulldown-cmark.new -M > /dev/null 0.01s user 0.00s system 6% cpu 0.109 total ``` [theoretically]: http://www.ilikebigbits.com/2014_04_21_myth_of_ram_1.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor comments. Thanks for the great work!
So are you thinking this would be a brief explainer in an Currently the I could work on these – maybe in a new PR after this is merged. |
@ollpu, I think the best option is writing a small summary with a link to the full documentation, and creating a mdBook with all specs and other tips and guides, as the issue #863 I created yesterday. Thanks! |
I think that to merge this pull request and to close the math issue there are only two pending tasks:
|
One remaining divergence compared to #733 is that this PR accepts empty display math items:
I changed this to be disallowed in d583a8d Otherwise I think #733 can be closed in favor of this, and small edits can be done later. No need to merge both. |
I think that the high level description of #733 can be a good start point for the math spec in the documentation, but as it is true it will not be merged, I have closed it. Thanks! |
So let's merge it. Thanks! |
Bug:
I'm working on a small refactor which should fix this as well. |
This feature is loosely based on what 63a29a1 #733 described, but copies commonmark-hs more closely (the balanced braces feature is added).
It largely ignores GitHub, because its math parsing is very buggy.