Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace Babel w/ estree #1399

Merged
merged 1 commit into from Dec 20, 2020
Merged

Replace Babel w/ estree #1399

merged 1 commit into from Dec 20, 2020

Conversation

wooorm
Copy link
Member

@wooorm wooorm commented Dec 20, 2020

This removes the last three custom Babel plugins we had and replaces them with estree versions. Furthermore, it removes @babel/generator.

For the plugins, we were only looking at ESM import/exports, but right now we’re delegating work to periscopic to look at which things are defined in the top-level scope. It’s a bit more complex, but this matches better with intentions, fixes some bugs, and prepares for a potential future where other ES constructs are allowed, so all in all should be a nice improvement.

For serializing, we’re switching to astring, and handling JSX for now internally (could be externalized later). astring seems fast and is incredibly small, but is not very popular. We might perhaps see bugs is serialization in the future because of that, but all our tests seem fine, so I’m not too worried about that.

Estree remains a somewhat fragmented ecosystem, such as that the tree walkers in periscopic and astring are different, so we might also consider writing our own serializer in the future. Or, when we implement Babel’s React JSX transform ourselves, could switch to another generator, or at least drop the JSX serialization code here.

Because of these changes, we can drop @babel/core and @babel/generator from @mdx-js/mdx, which drops the bundle size of from 349kb to 111kb. That’s 68%. Pretty nice. This should improve downloading and parsing time of bundles significantly. Of course, we currently still have JSX in the output, so folks will have to resort to Babel (or buble-jsx-only) in another step.

For performance, v2 (micromark) was already an improvement over v1. On 1000 simple files totalling about 1mb of MDX:

  • v1: 3739ms
  • v2: 2734ms (26% faster)
  • v2 (w/o babel): 1392ms (63% faster).

Of course, this all really depends on what type of stuff is in your MDX. But it looks pretty sweet!

Related to GH-1046.
Related to GH-1152.
Related to GH-1338.
Closes GH-704.
Closes GH-1384.

This removes the last three custom Babel plugins we had and replaces
them with estree versions.
Furthermore, it removes `@babel/generator`.

For the plugins, we were only looking at ESM import/exports, but right
now we’re delegating work to `periscopic` to look at which things are
defined in the top-level scope.
It’s a bit more complex, but this matches better with intentions,
fixes some bugs, and prepares for a potential future where other ES
constructs are allowed, so all in all should be a nice improvement.

For serializing, we’re switching to `astring`, and handling JSX for now
internally (could be externalized later).
`astring` seems fast and is incredibly small, but is not very popular.
We might perhaps see bugs is serialization in the future because of that,
but all our tests seem fine, so I’m not too worried about that.

Estree remains a somewhat fragmented ecosystem, such as that the tree
walkers in `periscopic` and `astring` are different, so we might also
consider writing our own serializer in the future.
Or, when we implement Babel’s React JSX transform ourselves, could switch
to another generator, or at least drop the JSX serialization code here.

Because of these changes, we can drop `@babel/core` and
`@babel/generator` from `@mdx-js/mdx`, which drops the bundle size of
from 349kb to 111kb.
That’s 68%.
Pretty nice.
This should improve downloading and parsing time of bundles
significantly.
Of course, we currently still have JSX in the output, so folks will
have to resort to Babel (or `buble-jsx-only`) in another step.

For performance, v2 (micromark) was already an improvement over v1.
On 1000 simple files totalling about 1mb of MDX:

* v1: 3739ms
* v2: 2734ms (26% faster)
* v2 (w/o babel): 1392ms (63% faster).

Of course, this all really depends on what type of stuff is in your MDX.
But it looks pretty sweet!

✨

Related to GH-1046.
Related to GH-1152.
Related to GH-1338.
Closes GH-704.
Closes GH-1384.
@wooorm wooorm added 🦋 type/enhancement This is great to have 🙆 yes/confirmed This is confirmed and ready to be worked on 🏡 area/internal This affects the hidden internals 👶 semver/patch This is a backwards-compatible fix labels Dec 20, 2020
@vercel
Copy link

vercel bot commented Dec 20, 2020

This pull request is being automatically deployed with Vercel (learn more).
To see the status of your deployment, click below or on the icon next to each commit.

🔍 Inspect: https://vercel.com/mdx/mdx/23qg5xomb
✅ Preview: Failed

Copy link
Member

@johno johno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

@johno johno merged commit 521f092 into main Dec 20, 2020
@johno johno deleted the estree branch December 20, 2020 14:59
@wooorm wooorm added 💪 phase/solved Post is done and removed 🙆 yes/confirmed This is confirmed and ready to be worked on labels Dec 20, 2020
wooorm added a commit that referenced this pull request Jan 2, 2021
This PR moves most of the runtime to the compile time.

This issue has nothing to do with `@mdx-js/runtime`. It’s about
`@mdx-js/mdx` being compile time, and moving most work there, from the
“runtimes” `@mdx-js/react`, `@mdx-js/preact`, `@mdx-js/vue`.

Most of the runtime is undocumented features that allow amazing things,
but those are in my opinion *too magical*, more powerful than needed,
complex to reason about, and again: undocumented.
These features are added by overwriting an actual renderer (such as
react, preact, or vue). Doing so makes it hard to combine MDX with for
example Emotion or theme-ui, to opt into a new JSX transform when React
introduces one, to support other hyperscripts, or to add features such
as members (`<Foo.Bar />`). Removing these runtime features does what
MDX says in the readme: “**🔥 Blazingly blazing fast: MDX has no
runtime […]**”

This does remove the ability to overwrite *anything* at runtime. This
brings back the project to what is documented: users can still
overwrite markdown things (e.g., blockquotes) to become components and
pass components in at runtime without importing them. And it does still
allow undocumented parent-child combos (`blockquote.p`).

* Remove runtime renderers (`createElement`s hijacking) from
  `@mdx-js/react`, `@mdx-js/preact`, `@mdx-js/vue`
* Add `jsxRuntime` option to switch to the modern automatic JSX runtime
* Add `jsxImportSource` option to switch to a modern non-React JSX
  runtime
* Add `pragma` option to define a classic JSX pragma
* Add `pragmaFrag` option to define a classic JSX fragment
* Add `mdxProviderImportSource` option to load an optional runtime
  provider
* Add tests for automatic React JSX runtime
* Add tests for `@mdx-js/mdx` combined with `emotion`
* Add support and test members as “tag names” of elements
* Add support and test qualified names (namespaces) as “tag names” of
  elements
* Add tests for parent-child combos
* Add tests to assert explicit (inline) components precede over
  provided/given components
* Add tests for `mdxFragment: false` (runtime renderers w/o fragment
  support)
* Fix and test double quotes in attribute values

This PR removes the runtime renderers and related things such as the
`mdxType` and `parentName` props while keeping the `MDXProvider` in
tact.

This improves runtime performance, because all that runs at runtime is
plain vanilla React/preact/vue code.

This reduces the surface of the MDX API while being identical to what
is documented and hence to user expectations (except perhaps to some
power users).

This also makes it easier to support other renderers without having to
maintain projects like `@mdx-js/react`, `@mdx-js/preact`, `@mdx-js/vue`:
anything that can be used as a JSX pragma (including the [automatic
runtime](https://reactjs.org/blog/2020/09/22/introducing-the-new-jsx-transform.html))
is now supported.
A related benefit is that it’s easier to integrate with
[emotion](https://github.com/emotion-js/emotion/blob/master/packages/react/src/jsx.js#L7)
(including through `theme-ui`) and similar projects which also
overwrite the renderer: as it’s not possible to have two runtimes, they
were hard to combine; because with this PR MDX is no longer a renderer,
there’s no conflict anymore.

This is done by the compile time (`@mdx-js/mdx`) knowing about an
(**optional**) runtime for an `MDXProvider` (such as `@mdx-js/react`,
`@mdx-js/preact`). Importantly, it’s not required for other
hyperscript interfaces to have a provider: `MDXContent` exported from
a compiled MDX file *also* accepts components (it already did), and Vue
comes with component passing out of the box.

In short, the runtime looked like this:

```js
function mdx(thing, props, ...children) {
  const overwrites = getOverwritesSomeWay()
  return React.createElement(overwrites[props.mdxType] || thing, props, ...children)
}
```

And we had a compile time, which added that `mdxType` prop. So:

```mdx
<Youtube />
```

Became:

```js
const Youtube = () => throw new Error('Youtube is not loaded!')

<Youtube mdxType="Youtube" />
```

Which in plain JS looks like:

```js
const Youtube = () => throw new Error('Youtube is not loaded!')

React.createElement(Youtube, {mdxType: 'Youtube'})
```

Instead, this now compiles to:

```js
const {Youtube} = Object.assign({Youtube: () => throw new Error('Youtube is not loaded!')}, getOverwritesSomeWay())

React.createElement(Youtube)
```

The previous example shows what is sometimes called a “shortcode”: a
way to inject components as identifiers into the MDX file, which was
introduced in [MDX 1](https://mdxjs.com/blog/shortcodes)

A different use case for the runtime was overwriting “defaults”. This
is documented on the website as the “[Table of
components](https://mdxjs.com/table-of-components)”. This MDX:

```mdx
Hello, *world*!
```

Became:

```js
<p mdxType="p">Hello, <em mdxType="em">world</em>!</p>
```

This now compiles to:

```js
const overwrites = Object.assign({p: 'p', em: 'em'}, getOverwritesSomeWay())

<overwrites.p>Hello, <overwrites.em>world</overwrites.em>!</overwrites.p>
```

This MDX:

```mdx
export const Video = () => <Vimeo />

<Video />
```

Used like so:

```jsx
<MDXProvider components={{Video: () => <Youtube />}}>
  <Content />
</MDXProvider>
```

Would result in a `Youtube` component being rendered. It no longer
does. I see the previous behavior as a bug and hence this as a fix.

A subset of the above point is that:

```mdx
export default props => <main {...props} />

x
```

Used like so:

```jsx
<MDXProvider components={{wrapper: props => <article {...props} />}}>
  <Content />
</MDXProvider>
```

Would result in an `article` instead of the explicit `main`. It no
longer does. I see the previous behavior as a bug and hence this as a
fix.

(#821)

```mdx

<h2>World</h2>
```

Used like so:

```jsx
<MDXProvider components={{h2: () => <SomethingElse />}}>
  <Content />
</MDXProvider>
```

Would result in a `SomethingElse` for both. This PR **does not** change
that. But it could more easily be changed if we want to, because at
compile time we know whether something was a tag or not.

An undocumented feature of the current MDX runtime renderer is that
it’s possible to overwrite anything:

```mdx
<span />
```

Used like so:

```jsx
<MDXProvider components={{span: props => <b>{props.children}</b>}}>
  <Content />
</MDXProvider>
```

Would overwrite to become bold, even though it’s not documented
anywhere. This PR changes that: only allowed markdown “tag names” can
be changed (`p`, `li`, ...). **This list could be expanded.**

Another undocumented feature is that parent–child combos can be
overwritten. A `li` in an `ol` can be treated differently from one in
an `ul` by passing `'ol.li': () => <SomethingElse />`.

This PR no longer lets users “nest” arbitrary parent–child combos
except for `ol.li`, `ul.li`, and `blockquote.p`. **This list could
be expanded.**

It was not possible to use members (`<foo.bar />`, `<Foo.bar.baz />`,
<#953>) and supporting it previously
would be complex. This PR adds support for them.

Previously, `mdxType` and `parentName` attributes were added to all
elements. And a `components` prop was accepted on **all** elements to
change the provider. These are no longer passed and no longer accepted.
Lastly, `components`, `props` were in scope for all JSX tags defined in
the “markdown” section (not the import/exports) of each document.

This adds identifiers to the scope prefixed with double underscores:
`__provideComponents`, `__components`, and `__props`.

A single 1mb MDX file, about 20k lines and 135k words (basically 3
books). Heavy on the “markdown”, few tags, no import/exports.
322kb gzipped.

* v1: 2895.122856
* 2.0.0-next.8: 3187.4684129999996
* main: 4058.917152000001
* this pr: 4066.642403

* v1: raw: 1.5mb, gzip: 348kb
* 2.0.0-next.8: raw: 1.4mb, gzip: 347kb
* main: raw: 1.3mb, gzip: 342kb
* this pr: raw: 1.8mb, gzip: 353kb
* this pr, automatic runtime: raw: 1.7mb, gzip: 355kb

* v1: 321.761208
* 2.0.0-next.8: 321.79749599999997
* main: 162.412757
* this pr: 107.28038599999996
* this pr, automatic runtime: 123.73588899999999

This PR is much faster on giant markdown-esque documents during runtime.
The win over the current `main` branch is 34%, the win over the last
beta and v1 is 66%.

For output size, the raw value increases with this PR, which is because
the output is now `/*#__PURE__*/React.createElement(__components.span…)`
or `/*#__PURE__*/_jsx(__components.span…)`, instead of `mdx("span",
{mdxType: "span"…})`. The change is more repetition, as can be seen by
the roughly same gzip sizes.

That the build time of `main` and this PR is slower than v1 and the
last beta does surprise me a lot. I benchmarked earlier with 1000 small
simple MDX files, totalling 1mb, [where the results were the
inverse](#1399 (comment)). So
it looks like we have a problem with giant files. Still, this PR has no
effect on build time performance, because the results are the same as
currently on `main`.

This PR makes MDX faster, adds support for the modern automatic JSX
runtime, and makes it easier to combine with Emotion and similar
projects.

---

Some of what this PR does has been discussed over the years:

Related-to: GH-166.
Related-to: GH-197.
Related-to: GH-466 (very similar).
Related-to: GH-714.
Related-to: GH-938.
Related-to: GH-1327.

This PR solves some of the items outlined in these issues:

Related-to: GH-1152.
Related-to: #1014 (comment).

This PR solves:

Closes GH-591.
Closes GH-638.
Closes GH-785.
Closes GH-953.
Closes GH-1084.
Closes GH-1385.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏡 area/internal This affects the hidden internals 💪 phase/solved Post is done 👶 semver/patch This is a backwards-compatible fix 🦋 type/enhancement This is great to have
Development

Successfully merging this pull request may close these issues.

Babel or estree? Explore moving to babel standalone or offering a more browser friendly build
2 participants