Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Write up a report / summary with feedback for mypyc #9

Closed
ichard26 opened this issue Aug 5, 2021 · 2 comments
Closed

Write up a report / summary with feedback for mypyc #9

ichard26 opened this issue Aug 5, 2021 · 2 comments
Assignees
Labels
A: tooling & infra priority-2-low Useful, but can be done later.

Comments

@ichard26
Copy link
Owner

ichard26 commented Aug 5, 2021

Mypyc is still alpha software (which is funny given I'm trying to integrate it with "beta" software) so there's a lot of rough edges. Now they aren't that hard to workaround but it would be great if I didn't need to do 'em in the first place. So to be a helpful user of mypyc and help push it forward, let's write some sort document containing feedback, suggestions, other things we noticed.

What follows is a draft ...


First of all, here's the relevant links:

The good

  • The performance wins are excellent: ~2x average across multiple workloads!
  • As long as the hot code wasn't too spread out, optimizing for mypyc wasn't that hard (tightening up types, adding lots of Final, etc.) - additional performance wins of 5-15% were achieved :D
  • There weren't actually that many crashes or other incompatibilities (ignoring dataclasses) and they all had easy enough workaround
  • Wheel building was pretty easy and straightforward (once some changes were made to the setup.py logic) - I did steal lots of ideas from mypyc/mypy_mypyc-wheels :P

The bad

  • Dataclasses were responsible for the majority of crashes and other compatibility issues. The cases that led to issues include:
  • Unanalyzed or unreachable code continues to be problematic - a "fun" case involved this:
    >   if sys.platform == "win32":
    RuntimeError: Reached allegedly unreachable code!
    
    This was caused by mypy.ini hard-coding platform=linux. Other interesting case was one involving a module typed as Any causing an else branch to appears as it's never ever taken. The resulting unanalyzed code crashed when hit. More details in this commit: psf/black@acb77f7

And other various issues already reported on the mypyc issue tracker:

Suggestions

  • Add a note about platform=$OS being problematic
  • Add a note about docstrings being stripped - the click CLI toolkit uses docstrings to get command descriptions ... which ended breaking without a clear reason why (until I stumbled on some issue on mypyc issue tracker that explained why :p)
  • Prioritize Reduce mypy/c package size, using -g0 CFLAG mypyc/mypyc#849 - while I'm OK with wheel sizes going up adding mypyc, the ~25-35x (120KB -> ~3.3MB) I observed wasn't the easiest thing to swallow. Adding CFLAGS=-g0 brought the wheels to a much more reasonable ~1MB each.
  • Prioritize [mypyc] Introduce subcommand to mypyc command line interface python/mypy#10395 - quick iteration and experimentation was slowed down by repeated cleaning up + python -c "import foo"
  • Perhaps add a way to skip invoking the C extension build - this would make analyzing differences at the IR or C level faster ... although I would understand if this is too low-level and shouldn't be made easier

Questions

  • Can anything be done about analysis + Python-to-C transcompilation being done twice during any modern wheel build? Unfortunately any evaluations of setup.py end up triggering the above and it turns out both pip wheel and python -m build call at-least two PEP 517 hooks that eventually end up evaluating setup.py. It's not a big deal, but it slows down wheel builds a little bit. Sounds hard to avoid tho (caching would be complicated).
  • What else can we do to help mypyc mature and overall get better?

What's next?

Currently the stability and performance data for Linux were just collected and look good so a PR was opened against psf/black. There's a lot of TODOs listed in that PR but it is progress.

The main thing is that I need to do is getting the wheels significantly more battle tested so they can eventually find their way to PyPI. First of all I need to run a mypy-primer equivalent on Windows and MacOS. Assuming that doesn't turn up anything scary, the PR will probably get merged once the reviews are over with. Then at that point, it's off to community testing where we'll be asking members of the community to use the experimental wheels and report issues. So expect updates on this issue as more stability data comes in ... but I'm sure this is already enough insights for a post ^^

Finally

As much as this is riddled with negativity and bugs, I appreciate all of the work y'all put into mypyc. It's awesome to see so much work into making Python code faster lately! 🖤

@ichard26 ichard26 added priority-2-low Useful, but can be done later. A: tooling & infra labels Aug 5, 2021
@ichard26 ichard26 self-assigned this Aug 5, 2021
@ichard26
Copy link
Owner Author

Whelp, I just posted the above: mypyc/mypyc#886. I plan to update it as more data comes in from testing, but it's a good start.

@ichard26
Copy link
Owner Author

ichard26 commented Aug 2, 2022

This is done. Further updates will come in blog posts.

@ichard26 ichard26 closed this as completed Aug 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A: tooling & infra priority-2-low Useful, but can be done later.
Projects
None yet
Development

No branches or pull requests

1 participant