Improving performance #198

Closed
hukkin opened this issue Feb 5, 2022 · 4 comments · Fixed by #270
Labels
enhancement New feature or request

Comments

@hukkin
Member

hukkin commented Feb 5, 2022

Describe the problem/need and solution

Problem / Idea
According to updated benchmark results (#196) mistletoe beat us. This is embarrassing. Haha, no, hats off to mistletoe authors!

Solution
The profiler (#197) reveals that we spend a significant portion of execution time, around one third, converting string characters to ints here.

As we've researched before, this is a performance hack in upstream JavaScript, but for us it hurts performance in a major way. It also makes the code slightly less readable. We've already resorted to caching these int sequences (diverging from JS upstream), which is basically a performance hack on top of a failed performance hack, and performance still suffers as a result. I'd be interested in moving to the str type only.
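To make the two representations concrete, here is a minimal sketch. The helper names and the sample string are hypothetical and not taken from the markdown-it-py codebase; it only shows that the int-based scan needs an extra `ord()` pass over the whole source before any comparison can happen, while the str-based scan works on the input directly. It does not reproduce the profiler numbers.

```python
from typing import Sequence

# Hypothetical sample input, for illustration only.
src = "# Heading\n\nSome *emphasis* here.\n"

# JS-style representation: one ord() call per character, done up front.
src_char_code = tuple(ord(ch) for ch in src)


def count_marker_int(codes: Sequence[int], marker: int = 0x2A) -> int:
    """Count occurrences of a marker by comparing code points (0x2A is '*')."""
    return sum(1 for code in codes if code == marker)


def count_marker_str(text: str, marker: str = "*") -> int:
    """Count occurrences of a marker by comparing one-character strings."""
    return sum(1 for ch in text if ch == marker)
```

Both functions return the same count; the difference is only the conversion pass that the int-based version requires first.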

The naive way to implement this will, I believe, break basically all parser extensions (mdit-py-plugins). What we could do is have a deprecation period for srcCharCode, where:

  • the core library moves to using src for increased performance
  • accessing srcCharCode emits DeprecationWarnings
  • srcCharCode is generated lazily, only when accessed. This means that the performance loss only occurs when using a deprecated extension, not when using core markdown-it
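The deprecation scheme in the list above could be sketched roughly as follows. `StateSketch` and `_src_char_code` are hypothetical names used for illustration and are not the library's actual state class; the only assumption carried over from the proposal is that `src` holds the input string and `srcCharCode` is the legacy attribute.

```python
import warnings
from typing import Optional, Tuple


class StateSketch:
    """Hypothetical stand-in for a parser state object."""

    def __init__(self, src: str) -> None:
        self.src = src
        # Not computed up front: core code paths that only read `src`
        # never pay for the conversion.
        self._src_char_code: Optional[Tuple[int, ...]] = None

    @property
    def srcCharCode(self) -> Tuple[int, ...]:
        # Every access warns, so extensions still relying on it are notified.
        warnings.warn(
            "srcCharCode is deprecated; use src instead",
            DeprecationWarning,
            stacklevel=2,
        )
        # Build the int sequence on first access and reuse it afterwards.
        if self._src_char_code is None:
            self._src_char_code = tuple(ord(ch) for ch in self.src)
        return self._src_char_code
```

With this shape, code that only touches `state.src` does no conversion at all, while an extension that reads `state.srcCharCode` triggers the conversion once and gets a DeprecationWarning on each access.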

Benefit
Increased performance.

Guide for implementation

No response

Tasks and updates

No response

@hukkin hukkin added the enhancement New feature or request label Feb 5, 2022
@hukkin
Member Author

hukkin commented Feb 8, 2022

What do you think @chrisjsewell ? If you're worried about the breaking change, we could make the deprecation period very long or infinite.

@chrisjsewell
Member

I think it sounds reasonable 👍

@hukkin
Member Author

hukkin commented Feb 10, 2022

Great, I can work on this.

Do you agree with this comment that we should just scrap #190 ? Or do you want an out-of-range safe wrapper for src, something like

def srcAt(self, idx: int) -> Optional[str]:
    try:
        return self._src[idx]
    except IndexError:
        return None

Personally I wouldn't do this for reasons explained in the comment I linked.

@hukkin
Member Author

hukkin commented May 8, 2022

Another way to increase performance would be to fill in whatever type annotations are missing (enforced by the disallow_untyped_defs = true mypy setting) and publish binary wheels built with mypyc.

What do you think @chrisjsewell ?
