Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CLI command to py-compile wheels #3253

Merged
merged 17 commits into from Jan 4, 2023
Merged

Conversation

rth
Copy link
Member

@rth rth commented Nov 14, 2022

Together with #3166 this would address #3050 by py-compiling .py files inside wheels into .pyc

There are two ways to use this feature,

  • as part of the main build system (currently enabled by default)
  • via the CLI,
    pyodide py-compile <wheel_path>
    

Currently, the logic is the input is a wheel (i.e. a zip files) we py-compile individual files, and create the output wheel. Although this means we do wheel repacking twice, the advantage is that it's easier to debug, since in the build/ folder we would have the original wheel in the dist/ we would get this py-compiled wheel. If we do everything in the same extracted folder, it means we have to remove original .py files and if anything goes wrong we no longer have them.

Similarly I haven't managed to use compileall as it puts .pyc files in the __pycache__ folder instead of alongside the .py files that we want. Instead I'm using a lower level py_compile module.

This is an initial implementation, but it still needs

  • adding tests

  • fixing existing tests

  • making sure for input tag py3-none-any in the wheel filename, the output is cp310-none-any and that it's compatible according to packaging.compatible_tags (currently it's not the case): cp310-none-any not in compatible tags pypa/packaging#615

  • Decide how we want to disable/enable this feature (e.g. via some flag in Makefile.env)

  • fixing the file path in exceptions. Currently similarly to Vendor standard libraries in a zip file #3166 cc @ryanking13 it shows the path in the folder where it was py-compiled. This is fixable, but it also means you have to set this path at compilation time rather than at run time which can be rather confusing if the wheel is later installed under a different folder. Maybe for now the easiest would be to display the relative path, then see if there is a way to improve this in the exception handling code.

    File "/tmp/tmp9uxkx5eb/sklearn_linear_model__ridge.py", line 1122, in fit
    File "/tmp/tmp9uxkx5eb/sklearn_base.py", line 596, in _validate_data
    File "/tmp/tmp9uxkx5eb/sklearn_utils_validation.py", line 1074, in check_X_y
    File "/tmp/tmp9uxkx5eb/sklearn_utils_validation.py", line 871, in check_array
    ValueError: Expected 2D array, got scalar array instead:
    

Benchmarks for some packages with Firefox 106 and Chrome 106 (both with console open, which can impact performance),

Package Load + import time main (s) Load + import time PR (s) Speed up (x)
pandas 3.7 ± 0.1 (Firefox ) / 4.5 ± 0.01 (Chrome) 2.4 ± 0.005 (Firefox)/ 2.2 ± 0.005 (Chrome) x1.5 (Firefox) / x2.0 (Chrome)
sympy 2.8 ± 0.02 (Firefox ) / 5.0 ± 0.05 (Chrome) 1.3 ± 0.02 (Firefox ) / 1.7 ± 0.01 (Chrome) x2.1 (Firefox) / x2.9 (Chrome)
scikit-learn, scipy, numpy, joblib 7.87 ± 0.15 (Firefox ) / 6.2 ± 0.1 (Chrome) 6.65 ± 0.2 (Firefox ) / 4.6 ± 0.6 (Chrome) x1.2 (Firefox) / x1.3 (Chrome)

So overall pretty good. It would make sense to include this in the same release as #3166

@ryanking13
Copy link
Member

Similarly I haven't managed to use compileall as it puts .pyc files in the __pycache__ folder instead of alongside the .py files that we want. Instead I'm using a lower level py_compile module.

I failed too :<

Maybe for now the easiest would be to display the relative path, then see if there is a way to improve this in the exception handling code.

Maybe we should open a discussion in discuss.python.org. There might be people who already have been thinking about this before.

@rth
Copy link
Member Author

rth commented Nov 15, 2022

Maybe we should open a discussion in discuss.python.org.

Agreed.

@bollwyvl
Copy link
Contributor

As mentioned on pyodide/micropip#24, it may be worth further exploring some packages that could be trivially mypyc-ified, including some currently pypi-sourced packages (either as-is or vendored).

For example, simply importing micropip, and by import, packaging and some pyparsing compilation, anecdotally costs about 0.15s... in that specific case, discussion is on-going upstream, where the maintainability/vendorability issues are mentioned. Everyone wants to pass the buck further upstream, but in the case of pyodide-provided packages imported in nearly 100% of user sessions, it would seem there would less cause to worry.

@rth
Copy link
Member Author

rth commented Nov 15, 2022

packages that could be trivially mypyc-ified, including some currently pypi-sourced packages (either as-is or vendored).

Yeah, certainly worth exploring but maybe in a separate issue. Also worth keeping in mind that while it has the potential to make computing faster, it means more WASM binary code to compile by the browser which can make loading slower (or at least more CPU intensive).

Anyway for now, I have to fix tests a couple of tests that started to fail for unknown reasons, and syntactically invalid files apparently distributed in one of the wheels.

@ryanking13
Copy link
Member

Maybe we should open a discussion in discuss.python.org.

Agreed.

Opened a discussion: https://discuss.python.org/t/proper-file-paths-to-show-in-exception-for-py-compiled-files/21108. I would appreciate if you can add some extra information if I missed something.

@ryanking13 ryanking13 added this to the 0.23.0 milestone Dec 8, 2022
@rth rth changed the title Draft: py-compile wheels Add CLI command to py-compile wheels Dec 12, 2022
@rth
Copy link
Member Author

rth commented Dec 13, 2022

This should be ready for review. I reverted all patches that affected the build, so now this only adds the pyodide py-compile without affecting build. Added tests and a changelog. The patch to packaging from (v22.0) is important otherwise we cannot load such wheels with micropip.

@rth rth requested a review from ryanking13 January 3, 2023 16:05
Copy link
Member

@ryanking13 ryanking13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rth!

pyodide-build/pyodide_build/_py_compile.py Outdated Show resolved Hide resolved
pyodide-build/pyodide_build/_py_compile.py Outdated Show resolved Hide resolved
url: https://files.pythonhosted.org/packages/df/9e/d1a7217f69310c1db8fdf8ab396229f55a699ce34a203691794c5d1cad0c/packaging-21.3.tar.gz

patches:
- patches/allow-cpp310-none-any-tag.patch
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Packaging 0.22.0 is released now, so probably we can go ahead and update.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, we are still blocked by pypa/packaging#638 due to pyodide/micropip#36 so we would need to wait for 0.22.1.

I'll update the comment in the patch.

@rth rth merged commit e8f8324 into pyodide:main Jan 4, 2023
@rth rth deleted the py-compile-wheels branch January 4, 2023 15:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants