Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Yet another Math implementation #734

Merged
merged 21 commits into from
Mar 23, 2024
Merged

Conversation

notriddle
Copy link
Collaborator

@notriddle notriddle commented Oct 19, 2023

This feature is loosely based on what 63a29a1 #733 described, but copies commonmark-hs more closely (the balanced braces feature is added).

It largely ignores GitHub, because its math parsing is very buggy.

Copy link
Collaborator

@ollpu ollpu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fairly solid! I dismissed the curly brace matching feature initially because it seemed too complex to implement compared to its utility. Your approach here makes it look quite straightforward though.

In my implementation (rhysd#1), I opted to emit each dollar as its own item in the first pass, instead of eagerly combining pairs as display tags. This seemed to make parsing simpler. It avoids the "parity avalanche" where a closing inline tag requires all subsequent dollars in the same run to be reconsidered. Our implementations differ in these corner cases:

$x$$$$$
yours: <p><span class="math inline">x</span>$$$$</p>
mine:  <p><span class="math inline">x</span><span class="math display"></span></p>

$x$$$$$$$y$$
yours: <p><span class="math inline">x</span>$<span class="math display"></span>
          <span class="math inline">y</span>$</p>
mine:  <p><span class="math inline">x</span><span class="math display"></span>
          <span class="math display">y</span></p>

Although I've just noticed commonmark-hs does not accept empty display math either, so it disagrees with both of us:

$x$$$$$$$y$$
<p><span class="math inline">x</span>$$$<span class="math display">$y</span></p>

src/firstpass.rs Outdated Show resolved Hide resolved
src/parse.rs Outdated Show resolved Hide resolved
src/parse.rs Outdated Show resolved Hide resolved
src/parse.rs Outdated Show resolved Hide resolved
src/parse.rs Outdated Show resolved Hide resolved
ollpu referenced this pull request in ollpu/pulldown-cmark Nov 3, 2023
- Display math can no longer be empty (aligns with commonmark-hs)
- Add test case $x$$$$$$$y$$ from raphlinus#734
- Add description about how content is handled compared to code spans:
  newlines, surrounding spaces, and pipes inside tables.
ollpu referenced this pull request in ollpu/pulldown-cmark Nov 3, 2023
- Display math can no longer be empty (aligns with commonmark-hs)
- Add test case $x$$$$$$$y$$ from raphlinus#734
- Add description about how content is handled compared to code spans:
  newlines, surrounding spaces, and pipes inside tables.

(Almost) implemented in https://github.com/ollpu/pulldown-cmark/tree/alt-math
@ollpu
Copy link
Collaborator

ollpu commented Nov 4, 2023

Heads up: I changed #733 to also require display math to be non-empty. The behavior I described for escaped pipes in tables should be the same as in your implementation as far as I understand.

src/main.rs Outdated Show resolved Hide resolved
ollpu referenced this pull request in ollpu/pulldown-cmark Nov 6, 2023
- Display math can no longer be empty (aligns with commonmark-hs)
- Add test case $x$$$$$$$y$$ from raphlinus#734
- Add description about how content is handled compared to code spans:
  newlines, surrounding spaces, and pipes inside tables.
@notriddle notriddle force-pushed the math branch 3 times, most recently from 8172f8f to 2d0091f Compare March 5, 2024 03:07
@notriddle
Copy link
Collaborator Author

I've rebased this PR onto the current mainline, and started fuzz testing it against commonmark-hs (and found a couple of problems).

@ollpu
Copy link
Collaborator

ollpu commented Mar 7, 2024

I've rebased this PR onto the current mainline, and started fuzz testing it against commonmark-hs (and found a couple of problems).

Great! I'll clean up the spec PR and add brace handling to it hopefully next week, so we can move forward with this.

@Martin1887
Copy link
Collaborator

Martin1887 commented Mar 11, 2024

Hi!, and sorry for the long time without a response about this.

I still have to review the code, but I have reviewed the possible specs and #733 and this are the way to go in my opinion.

The only thing I would change is the target of the pull request, so for major changes it is better to use the branch https://github.com/pulldown-cmark/pulldown-cmark/tree/branch_0.11 I created last week.

I will review the code of both pull requests and this will be merged soon, finally! The spec is mandatory in the crate docs though, I guess that we can convert #733 complete description into RustDoc comments.

Thanks to both for this great work!

@Martin1887
Copy link
Collaborator

The only thing in the spec I would change is using math math-inline and math math-display classes instead math inline and math display, to avoid possible collisions with user-defined or CSS framework classes.

@notriddle notriddle changed the base branch from master to branch_0.11 March 11, 2024 22:41
ollpu and others added 5 commits March 11, 2024 15:43
Based on pulldown-cmark#622 and
copied from https://github.com/ollpu/pulldown-cmark/tree/alt-math.

Co-authored-by: rhysd <lin90162@yahoo.co.jp>
This feature is loosely based on what 63a29a1
described, but copies [commonmark-hs] more closely (the balanced braces
feature is added).

[commonmark-hs]: https://github.com/nschloe/github-math-bugs

It largely ignores GitHub, because its math parsing [is very buggy].

[is very buggy]: https://github.com/nschloe/github-math-bugs
This approach, based on @ollpu's suggestion, tracks single `$`s
in the inline tree, and merges them later. It avoids having
to merge and unmerge them in some corner cases.
The essential problem is: every time you write `$$x$}`, you get another
entry added to a hash table. Even if it's not [theoretically] *quadratic*,
it's still slow. Hard limiting it to 255 entries makes this not a problem.

Interestingly enough, when I tried to write an analogous torture test
for code spans, I couldn't find a way to do it because code spans are
keyed by their *length* instead of their *position*. In order to get
N entries in the hash table, I basically had to write N `` ` `` in a
row, forcing me to write quadratic amounts of input text.

Comparison:

```
michaelhowell@Michael-Howells-Macbook-Pro pulldown-cmark % python3 -c 'print("$$x$}"*5000)' | time target/release/pulldown-cmark.old -M > /dev/null
target/release/pulldown-cmark.old -M > /dev/null  2.63s user 0.02s system 99% cpu 2.673 total
michaelhowell@Michael-Howells-Macbook-Pro pulldown-cmark % python3 -c 'print("$$x$}"*5000)' | time target/release/pulldown-cmark.new -M > /dev/null
target/release/pulldown-cmark.new -M > /dev/null  0.01s user 0.00s system 6% cpu 0.109 total
```

[theoretically]: http://www.ilikebigbits.com/2014_04_21_myth_of_ram_1.html
@rhysd rhysd mentioned this pull request Mar 12, 2024
@Martin1887 Martin1887 mentioned this pull request Mar 12, 2024
Copy link
Collaborator

@Martin1887 Martin1887 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor comments. Thanks for the great work!

pulldown-cmark/src/firstpass.rs Show resolved Hide resolved
pulldown-cmark/src/lib.rs Outdated Show resolved Hide resolved
pulldown-cmark/src/parse.rs Outdated Show resolved Hide resolved
@ollpu
Copy link
Collaborator

ollpu commented Mar 12, 2024

@Martin1887

The spec is mandatory in the crate docs though, I guess that we can convert #733 complete description into RustDoc comments.

So are you thinking this would be a brief explainer in an Options::ENABLE_MATH doc comment or something else / more elaborate?

Currently the math.txt spec is kind of long and it's hard to read as a guide. It could use some rearranging (clearly separate the pathological test cases to the end) and a pass to make sure that the important parts are concise and understandable.

I could work on these – maybe in a new PR after this is merged.

@Martin1887
Copy link
Collaborator

Martin1887 commented Mar 13, 2024

@Martin1887

The spec is mandatory in the crate docs though, I guess that we can convert #733 complete description into RustDoc comments.

So are you thinking this would be a brief explainer in an Options::ENABLE_MATH doc comment or something else / more elaborate?

Currently the math.txt spec is kind of long and it's hard to read as a guide. It could use some rearranging (clearly separate the pathological test cases to the end) and a pass to make sure that the important parts are concise and understandable.

I could work on these – maybe in a new PR after this is merged.

@ollpu, I think the best option is writing a small summary with a link to the full documentation, and creating a mdBook with all specs and other tips and guides, as the issue #863 I created yesterday. Thanks!

@Martin1887
Copy link
Collaborator

I think that to merge this pull request and to close the math issue there are only two pending tasks:

  • Making clear the heading of the spec in this pull request.
  • Updating the spec pull request and search a place for it. Maybe we can create the mdBook now although it is not completed yet, many things can be already included.

@ollpu
Copy link
Collaborator

ollpu commented Mar 23, 2024

One remaining divergence compared to #733 is that this PR accepts empty display math items:

$ cargo run -- -m
$$$$
<p><span class="math math-display"></span></p>

I changed this to be disallowed in d583a8d to align with commonmark-hs (not true actually?).

Otherwise I think #733 can be closed in favor of this, and small edits can be done later. No need to merge both.

@Martin1887
Copy link
Collaborator

I think that the high level description of #733 can be a good start point for the math spec in the documentation, but as it is true it will not be merged, I have closed it. Thanks!

@Martin1887
Copy link
Collaborator

So let's merge it. Thanks!

@Martin1887 Martin1887 merged commit a51a362 into pulldown-cmark:branch_0.11 Mar 23, 2024
1 check passed
@Martin1887 Martin1887 mentioned this pull request Mar 23, 2024
@notriddle notriddle deleted the math branch March 23, 2024 19:59
@ollpu
Copy link
Collaborator

ollpu commented Mar 25, 2024

Bug:

$ cargo run -- -m
$x {$ $ } $x$
<p>$x {$ $ } $x$</p>

I'm working on a small refactor which should fix this as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants