-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consider changing Parser's switch cases to methods, for extensibility #2033
Comments
Addendum. For dynamic method lookup, we can avoid collisions with function Null() {}
Null.prototype = null
class Parser extends Null {}
console.log(Object.getOwnPropertyNames(Parser.prototype))
// ['constructor'] This gives us a squeaky-clean |
We have a proposal to add tokenizers and renderers to marked here. Maybe that would solve this problem. |
Thanks for pointing it out, I missed it. I still feel like both proposals are somewhat orthogonal, and this one could be done right now, while the other one is incubating. It doesn't seem to have as many design issues or performance issues. Even if it ends up being somewhat redundant, it would be quite handy for use cases like mine until #1872 is done (at some unspecified point). I might be able to PR it; let me know if I should try. |
We are always open to PRs. If it doesn't slow marked down it seems like something that would work. You can test the speed changes by running |
Logic for different token types has been moved from switch cases to methods. The main parse loop now simply looks up the method by token type, falling back on `default()`. Token types are now an open set, rather than a closed set. New token types can be supported by subclassing Parser and adding corresponding methods, or by overriding `default()`. Consolidated the parsing logic for "top" and "inline" tokens. Removed `parser.parseInline`. The top-level function `parseInline` uses `lexer.lexInline` to generate an appropriate AST; additional restrictions at the parser level appear redundant. More flexible approach to "contextual" parsing logic, such as using a different renderer or indicating "loose" mode. Now every parse method takes an additional parameter: a "context", which contains a renderer and possibly other settings. By default, such context objects are allocated lazily and no more than once. The method signatures make it possible to use context inheritance, where child contexts inherit from parent contexts, overriding only some of their properties. This is not used by default, but can be useful for advanced cases such as one described in markedjs#2033. Benchmarks indicate no significant performance regressions or improvements. Differences appear to be within noise levels. Known regressions that must be fixed before merging the MR: * No support for coalescing adjacent `text` nodes. * List "loose" mode, with paragraph wrapping, is now a placeholder with incorrect logic. * Fewer tests pass, presumably as a result of ↑. Addresses [markedjs#2033](markedjs#2033).
Custom tokenizers and renderer are available in v2.1.0 see Custom Extensions section in docs |
What pain point are you perceiving?
My use case seems to require special logic which cannot be done at the
Renderer
level, cannot be done by editing the AST viawalkTokens
(unless #2032 is implemented), but could be done at theParser
level if the API was easier to extend.Describe the solution you'd like
In
Parser
, convert the bodies of switch cases to methods:We could even drop the switches and look up methods dynamically:
This would also make it trivial to implement #2032 in user code, by overriding
default
to support new token types.Actual use case
To avoid an XY problem (proposing a poorly-chosen solution instead of stating the actual goal), here's what I actually need to do.
Depending on the source data:
.tokens
, ignoringhref
.With this proposal, this logic could be implemented by subclassing
Parser
and overriding just the right methods:The text was updated successfully, but these errors were encountered: