Optimize Cursor implementation #675

Open
colinodell opened this issue Jun 19, 2021 · 2 comments

@colinodell
Member

It may be possible to optimize the Cursor implementation by relying on byte positions internally rather than character positions. Character positions could be eliminated completely if external code doesn't actually need them, or we could track both for convenience but rely only on byte positions internally.
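To make the idea concrete, here is a minimal sketch of a cursor that advances by byte offset over UTF-8 input and only derives the character position when asked. It is written in Python purely for illustration; `ByteCursor` and its methods are hypothetical names, not the library's actual Cursor API, and it assumes the input is valid UTF-8.

```python
from typing import Optional


class ByteCursor:
    """Hypothetical cursor that tracks only a byte offset into UTF-8 data."""

    def __init__(self, line: str) -> None:
        self.data = line.encode("utf-8")
        self.byte_pos = 0

    @staticmethod
    def _width(lead: int) -> int:
        # The lead byte of a UTF-8 sequence tells us how many bytes it spans.
        if lead < 0x80:
            return 1
        if lead < 0xE0:
            return 2
        if lead < 0xF0:
            return 3
        return 4

    def is_at_end(self) -> bool:
        return self.byte_pos >= len(self.data)

    def peek(self) -> Optional[str]:
        """Return the character at the current position, or None at the end."""
        if self.is_at_end():
            return None
        width = self._width(self.data[self.byte_pos])
        return self.data[self.byte_pos:self.byte_pos + width].decode("utf-8")

    def advance(self) -> None:
        """Move forward by one character, i.e. by that character's byte width."""
        if not self.is_at_end():
            self.byte_pos += self._width(self.data[self.byte_pos])

    def char_pos(self) -> int:
        """Derive the character position on demand instead of tracking it."""
        # Continuation bytes look like 10xxxxxx; every other byte starts a character.
        return sum(1 for b in self.data[:self.byte_pos] if b & 0xC0 != 0x80)
```

With this shape, the hot path (`peek` and `advance`) never counts characters, and `char_pos` pays the conversion cost only when a caller asks for it.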

@colinodell added the performance (Something could be made faster or more efficient) and do not close (Issue which won't close due to inactivity) labels on Jun 19, 2021
@colinodell added this to the v3.0 milestone on Jun 19, 2021
@colinodell self-assigned this on Jun 19, 2021
@live627

live627 commented Mar 22, 2023

This seems to be why my page takes a little longer to load when it contains multi-byte characters. Is there a specific reason to use mbstring when processing Markdown? IIRC the syntax is all ASCII...

@colinodell
Member Author

Although you're correct that the syntax is ASCII, how that syntax is interpreted depends on the context where it is used. The CommonMark specification says that Unicode whitespace and punctuation characters are significant when determining that context. For example:

A single _ character can close emphasis iff it is part of a right-flanking delimiter run and either (a) not part of a left-flanking delimiter run or (b) part of a left-flanking delimiter run followed by a Unicode punctuation character.

(emphasis added)

So we do need the ability to parse individual Unicode codepoints to properly handle the syntax - I guess the question is "how do we best do that?"
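One possible answer, sketched in Python for illustration only: keep the position as a byte offset, and decode a single code point next to the delimiter only when the flanking rule needs it. The function names below are hypothetical and are not part of the library. The punctuation test assumes "ASCII punctuation or any Unicode general category starting with P"; the exact definition differs between versions of the spec, so treat that check as an assumption.

```python
import string
import unicodedata


def char_at(data: bytes, byte_pos: int) -> str:
    """Decode the single code point starting at byte_pos ('' at end of input)."""
    if byte_pos >= len(data):
        return ""
    lead = data[byte_pos]
    # The UTF-8 lead byte determines the width of the sequence.
    width = 1 if lead < 0x80 else 2 if lead < 0xE0 else 3 if lead < 0xF0 else 4
    return data[byte_pos:byte_pos + width].decode("utf-8")


def is_unicode_punctuation(ch: str) -> bool:
    """Assumed definition: ASCII punctuation or a Unicode P* general category."""
    if ch == "":
        return False
    return ch in string.punctuation or unicodedata.category(ch).startswith("P")


def followed_by_punctuation(data: bytes, delim_pos: int) -> bool:
    """Check the character right after a one-byte ASCII delimiter such as '_'."""
    return is_unicode_punctuation(char_at(data, delim_pos + 1))
```

Because the delimiters themselves are ASCII, they can be located by scanning bytes; only the neighbouring character has to be decoded as a full code point.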
