WIP: Upgrade to remark-parse@10 #11867

Draft
wants to merge 7 commits into
base: main

@TrevorBurnham TrevorBurnham commented Nov 25, 2021

Re: #11828

This is a first pass at upgrading to the latest remark-parse version, based on micromark, for Markdown/MDX.

Current status

This is far from complete: 551 snapshots are failing on this branch (though some of those are due to harmless behavior changes; see Changes around escaping below). I'm hoping this will be a useful starting point for other contributors who can get this across the finish line.

Most of the work I've done here is getting Jest to run against all the ESM modules in the new packages, by running them through babel-jest. I've also replaced the custom plugins with remark- packages where possible, though in some cases these don't have all of the same behaviors (e.g. remark-frontmatter doesn't recognize this custom pragma) and may need to be replaced with new custom implementations.
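
A minimal sketch of the kind of Jest setup described above, for anyone picking this up. The package list and the `esmPackages` / `jestConfig` names are illustrative assumptions, not the actual config on this branch; the real list of ESM-only dependencies is longer.

```javascript
// jest.config.js (sketch; the package list below is hypothetical)
// ESM-only dependencies that babel-jest must transform to CommonJS.
const esmPackages = ["remark-parse", "unified", "micromark"];

const jestConfig = {
  // Run .js and .mjs files through babel-jest.
  transform: { "\\.m?js$": "babel-jest" },
  // By default Jest skips everything under node_modules. The negative
  // lookahead exempts the listed packages so they do get transformed.
  transformIgnorePatterns: [
    `/node_modules/(?!(${esmPackages.join("|")})/)`,
  ],
};

module.exports = jestConfig;
```

Note that the lookahead only matches exact package names followed by `/`, so scoped or prefixed packages (for example anything named `micromark-...`) would need their own entries or a broader pattern.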

Next steps

There are two in-house plugins, liquid and loose-items, that don't appear to have any remark- package equivalents, so someone will have to figure out how to reimplement those with micromark.

Once those are implemented, we need to take a careful look at the failing snapshots, decide which ones are blockers, and take steps to resolve them.

Finally, I've been focused on getting the unit tests running, but the build config will also need to be updated in order to get this branch running.

Changes around escaping

Many of the failing snapshots are caused by extra escaping. For example, the input

_ http://www.example.com:80/_a_/**

was previously converted to

_ http://www.example.com:80/_a_/\*\*

but on this branch comes out as

\_ http://www.example.com:80/_a_/\*\*

That is, the result is the same aside from the unmatched _ at the start being escaped.

That change may be a positive one? The current behavior is that the opening _ is escaped in the case where the input contains another _, e.g. _ http://www.example.com:80/_a*/* (the output is \_ http://www.example.com:80/_a*/*). So the new escaping behavior is more consistent, but noisier.

The build config will likely need changes, but `yarn test` is runnable.

563 snapshots failing.
563 snapshots failing (no change).
563 snapshots failing (no change).
563 snapshots failing (no change).
563 snapshots failing (no change).
551 snapshots failing.
@alexander-akait
Member

Maybe off topic, but does remark have raw data (many parsers have raw data for these purposes: linters, formatters, etc.)? It could be useful, so we can avoid extra weird re-escaping.
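
One possible angle on this: unist nodes (which remark produces) can carry a `position` with `start` and `end` points, each with an optional `offset` into the source string. Assuming those offsets are present, the raw text for a node could be recovered by slicing the original input. The `rawSlice` helper and the hand-built node below are hypothetical, just to show the idea:

```javascript
// Return the original source text covered by a node, or undefined when
// the node has no positional offsets (they are optional in unist).
function rawSlice(source, node) {
  const { start, end } = node.position ?? {};
  if (start?.offset == null || end?.offset == null) return undefined;
  return source.slice(start.offset, end.offset);
}

// Hand-built node standing in for parser output (not produced by remark here).
const source = "_ http://www.example.com:80/_a_/**";
const node = {
  type: "text",
  position: {
    start: { line: 1, column: 1, offset: 0 },
    end: { line: 1, column: source.length + 1, offset: source.length },
  },
};

console.log(rawSlice(source, node));
```

Whether printing from such slices is safe for every node type is a separate question; this only shows that the original text is recoverable when offsets exist.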
