Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(markdown codec): return position-to-node mapping for from_str #2175

Closed
nokome opened this issue Mar 27, 2024 · 0 comments · Fixed by #2179
Closed

feat(markdown codec): return position-to-node mapping for from_str #2175

nokome opened this issue Mar 27, 2024 · 0 comments · Fixed by #2179
Assignees
Labels
module: rust Related to Rust

Comments

@nokome
Copy link
Member

nokome commented Mar 27, 2024

Currently the signature for the to_string method of the Codec traits returns a mapping of the each node to the character position in the returned string:

async fn to_string( &self, node: &Node, options: Option<EncodeOptions>) -> Result<(String, Losses, Mapping)>

The Mapping allows us to associate cursor positions in the generated format (e.g. Markdown) with each node in the document. This in turn enables UX such as showing information related to a node or Ctrl+Enter to execute the node that the cursor is currently on.

The problem with this is that the generated Markdown is not necessarily exactly the same as the source Markdown (e.g. multiple blank lines between paragraphs are ignored). This requires us to update the editor with the generated Markdown (and the mapping associated with it), which in turn causes "jumpiness".

A better alternative would be to generate the mapping from the source by changing the signature of from_str to:

async fn from_str(&self, str: &str, options: Option<DecodeOptions>) -> Result<(Node, Losses, Mapping)>

Having the mapping generated from the source Markdown would solve the jumpiness issue in the SourceView editor while still having the benefits of a position-to-node mapping. Perhaps more importantly, it would allow us to have a mapping for Markdown in non-Stencila editors, such as VSCode.

This has not been done yet because the MarkdownCodec is based on pulldown-cmark which does not retain source location of parsed nodes and looks like it is unlikely to in the near future.

Given that we need to make changes to the Markdown decoding anyway, I suggest switching to the markdown crate which has a schema similar to Stencila (in fact, several years ago, the mdast schema was part inspiration for many of the node types in Stencila Schema) and retains a Position for each parsed node, e.g. Paragraph.

Moving from pulldown-cmark to markdown will need a rewrite of the codec_markdown::decode module but the rewritten code is likely to be far simpler given the similarity of mdast schema to Stencila schema. nom-based parsers for extensions should be able to be reused.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module: rust Related to Rust
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant