[WIP] migrate to typescript #16
Awesome! I see this effort as useful: the current code is young and has only been seen by me, so there’s much that can be better!
Regarding this project as TS: we’ve already talked a lot about our opinions on TS, and in short, I’m not really for it. I feel strongly that the bundle size of mm should be as small as possible. The version of mm that lived here before was written in TS, and while it only supported a couple of markdown constructs, it was already larger than mm is now. This may indicate that a project in TS ends up bigger than one written by hand in JS. (Practically, the scripts would also need to be fixed for ESM.)
I also see that fewer folks understand TS than JS (including myself). That is especially important for folks who want to build extensions and look at this project for examples.
Any reason to not do the JSDoc approach?
hooks: {
  [key: string]: unknown
}
flow: (something: unknown) => unknown
Some more:
Line 98 in 307e0a5
content: create(initializeContent),
These are the available tokenizers:
micromark/lib/util/create-tokenizer.js
Line 13 in 307e0a5
function createTokenizer(parser, initialize, from) {
  [key: string]: unknown
}
flow: (something: unknown) => unknown
defined: unknown[]
This contains the unique normalized identifiers for definitions:
micromark/lib/tokenize/definition.js
Line 28 in 307e0a5
identifier = normalizeIdentifier(context.sliceSerialize(events[index][1]))
It does some rather complex things to how references are parsed 😓
micromark/lib/tokenize/label-end.js
Line 192 in 307e0a5
return self.parser.defined.indexOf(labelIdentifier) < 0
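To make the role of `defined` concrete, here is a minimal sketch, assuming the field can be typed as `string[]`. The `normalizeIdentifier` below is a simplified stand-in written for this example (it only collapses whitespace and folds case), not micromark's implementation, and `Parser`, `addDefinition`, and `isUndefined` are hypothetical names.

```typescript
// Hypothetical, simplified stand-in for micromark's normalizeIdentifier:
// collapse whitespace runs, trim, and fold case.
function normalizeIdentifier(value: string): string {
  return value.replace(/[\t\n\r ]+/g, ' ').trim().toLowerCase()
}

// Assumed shape: only the `defined` field discussed above.
interface Parser {
  defined: string[]
}

// Record a definition once, so the list stays unique.
function addDefinition(parser: Parser, label: string): void {
  const identifier = normalizeIdentifier(label)
  if (parser.defined.indexOf(identifier) < 0) {
    parser.defined.push(identifier)
  }
}

// Same shape as the check quoted from label-end.js:
// true when no definition matches the label.
function isUndefined(parser: Parser, label: string): boolean {
  return parser.defined.indexOf(normalizeIdentifier(label)) < 0
}

const parser: Parser = { defined: [] }
addDefinition(parser, 'Foo  Bar')
addDefinition(parser, 'foo bar') // Same identifier after normalizing.
```

With a concrete element type, the `indexOf` comparison is checked by the compiler instead of passing through `unknown[]`.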
_closeFlow: unknown
furtherBlankLines: unknown
}
}
Also this stuff?
micromark/lib/util/create-tokenizer.js
Line 36 in 307e0a5
// State and tools for resolving, serializing.
initialBlankLine: unknown
size: number
_closeFlow: unknown
furtherBlankLines: unknown
containerState is used in the document (container) tokenizer, because generally we can parse in one go (e.g., an atx heading), but for lists and block quotes we parse a part of it (e.g., >), then we do the rest of the line, and then at the next line look for another marker. As we parse separate “runs” of content, information needs to be stored somewhere, and I came up with this.
All properties are used by lists
_closeFlow is a way to communicate from lists to the tokenizer that the flow is closed, but we do continue the list. That’s useful when finding a new list item, because the previous one needs to be closed, but the list remains open.
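The description above could be typed roughly as follows. This is a sketch under assumptions: only the four field names come from the diff under review, while the field types (`boolean`, `number`) and the helper functions `startNextListItem` and `takeCloseFlow` are hypothetical and only illustrate how a flag can carry information between separate runs.

```typescript
// Field names are from the diff; the types are assumptions.
interface ContainerState {
  initialBlankLine?: boolean
  furtherBlankLines?: boolean
  size?: number
  _closeFlow?: boolean
}

// Hypothetical: a list construct signals that the current flow must be
// closed because a new item starts, while the list itself stays open.
function startNextListItem(state: ContainerState): void {
  state._closeFlow = true
}

// Hypothetical: the document tokenizer reads the flag once per run
// and resets it so the next run starts clean.
function takeCloseFlow(state: ContainerState): boolean {
  const close = state._closeFlow === true
  state._closeFlow = undefined
  return close
}

const state: ContainerState = { size: 2 }
startNextListItem(state)
const closeNow = takeCloseFlow(state) // Flag was set by the list.
const closeLater = takeCloseFlow(state) // Flag was reset after reading.
```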
The difference in size doesn't necessarily have to do with TS.
TypeScript is JavaScript, with annotations.
function tokenizeDefinition(
  effects,
  ok,
  nok
)
becomes
function tokenizeDefinition(
  effects: Effects,
  ok: Okay,
  nok: NotOkay
)
and
function start(code) {
becomes
function start(code: number) {
That's it: small annotations on parameter types.
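For a fuller picture of what such annotations could look like, here is a small sketch. The names `Effects`, `Okay`, and `NotOkay` come from the snippet above, but every shape below (`Code`, `State`, the single `consume` method) and the function `tokenizeBlockQuoteMarker` are assumptions made for illustration, not the types proposed in this PR.

```typescript
// Assumed shapes, for illustration only.
type Code = number | null
type State = (code: Code) => State | undefined
type Okay = State
type NotOkay = State

interface Effects {
  consume: (code: Code) => void
}

// Hypothetical tokenizer: accepts a single `>` (character code 62).
function tokenizeBlockQuoteMarker(
  effects: Effects,
  ok: Okay,
  nok: NotOkay
): State {
  function start(code: Code): State | undefined {
    if (code === 62) {
      effects.consume(code)
      return ok
    }
    return nok(code)
  }
  return start
}

// Usage with stub collaborators that only record what happens.
const consumed: Code[] = []
const effects: Effects = {
  consume: (code) => {
    consumed.push(code)
  }
}
const ok: Okay = () => undefined
const nok: NotOkay = () => undefined
const startState = tokenizeBlockQuoteMarker(effects, ok, nok)
```

Removing the type aliases, the interface, and the `: Type` annotations leaves plain JavaScript with the same runtime behavior.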
It took hours to even partially reconstruct what a tokenizer can accept for this PR.
https://www.typescriptlang.org/docs/handbook/type-checking-javascript-files.html#null-undefined-and-empty-array-initializers-are-of-type-any-or-any
// in JavaScript mode TS can't tell that this is meant to be a constant, it thinks it is meant to be an initialized export representing another currently unknown type.
exports.eof = null
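In a TypeScript file the intent can be stated directly. A minimal sketch, assuming `eof` is only ever `null` and that character codes are otherwise numbers; the names `Code`, `isEof`, and `describeCode` are hypothetical.

```typescript
// The explicit annotation pins the constant to `null`,
// regardless of how the compiler would widen the initializer.
const eof: null = null

// Assumed: a code is a character code or the end-of-file marker.
type Code = number | null

// Type guard: narrows `Code` to `null` when it is the EOF marker.
function isEof(code: Code): code is null {
  return code === eof
}

// Hypothetical helper showing the narrowing at work.
function describeCode(code: Code): string {
  return isEof(code) ? 'eof' : String.fromCharCode(code)
}
```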
The difference in size may not necessarily be due to TS, but I’m assuming that lack of access to the output format will result in more bytes (this is also what I remember from coffeescript).
I’m afraid of it getting way more complex than a couple of types sprinkled on top (Effects, Okay, NotOkay). I foresee that once TS is in, its more powerful and more confusing features will also be used. Regarding the learning curve, I can only speak for myself and for what I’ve seen when others started using TS.
It is complex, there are no docs, and it needs to be better.
Null is only used for the EOF character. Much of the code style is written to work well with minifiers, manglers, and gzip.
Not necessarily, again, it's JavaScript with annotations, remove the annotations and it's back to JavaScript.
Using Babel as the compiler gives pretty fine-grained control over which features are and are not enabled.
Right, but that is just one example, the code is littered with
Micromark has a number of complex/opaque types used in its internals.
This attempts to document and validate these types using TypeScript.
In particular, some complex and currently undocumented types include Events, Token, Effects, ok, and nok.
In addition, micromark's current usage pattern includes directly accessing internal files, meaning that for TypeScript users, most (if not all) files will need types.