Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

types: add type definitions for micromark #17

Closed

Conversation

ChristianMurphy
Copy link
Member

alternative to #16

@ChristianMurphy ChristianMurphy requested review from a team September 10, 2020 23:22
@ChristianMurphy ChristianMurphy added ☂️ area/types This affects typings 🛠 blocked/wip This cannot progress yet, it’s being worked on 🦋 type/enhancement This is great to have labels Sep 10, 2020
@ChristianMurphy
Copy link
Member Author

@ChristianMurphy ChristianMurphy force-pushed the types/add-typings branch 4 times, most recently from 47ce9a1 to 9f0c358 Compare September 10, 2020 23:55
@codecov-commenter

This comment has been minimized.

@ChristianMurphy ChristianMurphy added the help wanted 🙏 This could use your insight or help label Sep 11, 2020
Comment on lines +128 to +131
notSureWhatThisIs:
| Construct
| Construct[]
| {[code: number]: Construct | Construct[]},
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this construct like parameter called?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dunno!

I’ve been calling the third form “hooks”, as it defines which character codes are “hooked”. But that doesn’t say too much.

Maybe something like “usable” in unified? Attempt?

write: (value: number) => Event[]
}

export type Construct = unknown
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a construct?
Is it a function?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A construct is an object with:

  • a tokenize function
  • resolve, resolveTo, and resolveAll functions (which take a certain slice of events and can mutate them)
  • partial (bit ugly, used for things that aren’t really constructs, like whitespace)
  • concrete, used for multiline constructs such as HTML and fenced code that and how they mix with containers (e.g., > <div>\n> > this is html, not a second block quote)
  • Content has a name

/**
*
*/
export type Okay = (code: number) => () => void
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the inner function have any parameters?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the params in check/lazy/interrupt also be called “returnState” and “bogusState”.

The tokenizer function gets effects, ok, nok, which is kind of like the executor function given to Promise: resolve, reject.
On the outside, you can pass returnState, bogusState, which is kind of like the functions given to then: onfulfill, onreject

They all get character codes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these functions also should return another function: the next state-as-a-function to go to.
But there is one case where they return void: for the eof character code (at the end of a value)
The reason being: well, there isn’t any state that makes sense, so void works well. Practically that has also helped: if for some reason it was a mistake, then an exception is throw because there is no next function, meaning it surfaces early.

@meritozh
Copy link

I don't think syncing type definitions manually is a better way. *.d.ts should be generated by tsc.

@ChristianMurphy
Copy link
Member Author

@meritozh originally micromark was going to be in TypeScript (see remarkjs/remark#439 and #4) however the direction changed. Bringing back native typescript was proposed again and declined #16 (comment).

This has also been discussed before in unifiedjs/unified#76, most of the rest of unified also has typings files rather than being native TypeScript (see: https://github.com/search?p=2&q=org%3Asyntax-tree+typescript&type=Issues and https://github.com/search?q=org%3Aunifiedjs+typescript&type=Issues)
There is also a general propals to use JSDoc to run TypeScript unifiedjs/rfcs#5, which is stalled at the moment.

If you want to move native TypeScript or jsdoc based TypeScript forward, starting a thread at https://github.com/micromark/micromark/discussions would be a good place to have the discussion.
Tacking the discussion on to this PR, is not a good place to restart that broad discussion.

/**
*
*/
export type Okay = (code: number) => () => void
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the params in check/lazy/interrupt also be called “returnState” and “bogusState”.

The tokenizer function gets effects, ok, nok, which is kind of like the executor function given to Promise: resolve, reject.
On the outside, you can pass returnState, bogusState, which is kind of like the functions given to then: onfulfill, onreject

They all get character codes

/**
*
*/
export type Okay = (code: number) => () => void
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these functions also should return another function: the next state-as-a-function to go to.
But there is one case where they return void: for the eof character code (at the end of a value)
The reason being: well, there isn’t any state that makes sense, so void works well. Practically that has also helped: if for some reason it was a mistake, then an exception is throw because there is no next function, meaning it surfaces early.

write: (value: number) => Event[]
}

export type Construct = unknown
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A construct is an object with:

  • a tokenize function
  • resolve, resolveTo, and resolveAll functions (which take a certain slice of events and can mutate them)
  • partial (bit ugly, used for things that aren’t really constructs, like whitespace)
  • concrete, used for multiline constructs such as HTML and fenced code that and how they mix with containers (e.g., > <div>\n> > this is html, not a second block quote)
  • Content has a name

/**
* Consume deals with a character, and moves to the next
*/
consume: (code: number) => void
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be a good idea to make a specific type for the character codes?

It can be any result (positive int) from String.charCodeAt (except NaN). And the following special codes are used:

// This module is compiled away!
exports.carriageReturn = -5
exports.lineFeed = -4
exports.carriageReturnLineFeed = -3
exports.horizontalTab = -2
exports.virtualSpace = -1
exports.eof = null

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be a good idea.
Both the known types and character codes could be turned in a union of known types or turned into an opaque type, currently both are transparent primitives.

Copy link
Contributor

@ocavue ocavue Oct 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If micromark is written in native TypeScript, we can use TypeScript's const enum feature for these constant values.

By using const enum, those import statements will be removed while compiling (just like what script/babel-transform-constants.js does). And we also have an easy way to define types for character codes.

However, TypeScript doesn't allow null as a value of const enum, so we have to do something special for eof.

column: number
offset: number
_index?: number
_bufferIndex?: number
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the underscore props, as those are kinda for internal use in this project, should they be left undocumented?

Is there some comment to add here to say to TS that it shouldn’t be exported to library users?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could tag these with tsdoc

/**
 * @internal
 */

I'm cautious about removing them entirely, since extensions/plugins may leverage some of the internal attributes.

@@ -0,0 +1,3 @@
declare function compileHTML(): void
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compileHtml?

@ChristianMurphy
Copy link
Member Author

closing in favor of #25 which builds on this

@wooorm wooorm removed help wanted 🙏 This could use your insight or help 🛠 blocked/wip This cannot progress yet, it’s being worked on 🦋 type/enhancement This is great to have labels Oct 4, 2020
@wooorm wooorm added the 🤷 no/invalid This cannot be acted upon label Oct 4, 2020
@ChristianMurphy ChristianMurphy deleted the types/add-typings branch December 30, 2021 20:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
☂️ area/types This affects typings 🤷 no/invalid This cannot be acted upon
Development

Successfully merging this pull request may close these issues.

None yet

5 participants