Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: python3Packages.scipy: allow overriding BLAS #230131

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

SomeoneSerge
Copy link
Contributor

Description of changes

An attempt to address #206866

Things done
  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandbox = true set in nix.conf? (See Nix manual)
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 23.05 Release Notes (or backporting 22.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Comment on lines 101 to 102
"-Csetup-args=-Dblas=cblas"
"-Csetup-args=-Dlapack=lapacke"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This uses the .pc file generated by build-support/alternatives/blas/. I am not sure if that's what we want to use, and I'm entirely not sure when build-support/alternatives/blas should be used.

For comparison:

pkg-config mkl-dynamic-ilp64-gomp --libs
-L/nix/store/ryqzxrbdabpjxbjw97w8x6nj4fg0qhcx-mkl-2023.0.0.25398/lib -Wl,--no-as-needed -lmkl_intel_ilp64 -lmkl_gnu_thread -lmkl_core -lgomp -lpthread -lm -ldlpkg-config cblas --libs
-L/nix/store/byi7kk3jd4nblhwbzavf82pc5p4s34b1-blas-3/lib -lcblas

Note that MKL does not seem to distribute a mkl.pc file, instead it ships:

pkg-config --list-all
mkl-dynamic-ilp64-gomp mkl - Intel(R) oneAPI Math Kernel Library
mkl-dynamic-ilp64-tbb  mkl - Intel(R) oneAPI Math Kernel Library
mkl-dynamic-lp64-iomp  mkl - Intel(R) oneAPI Math Kernel Library
mkl-dynamic-lp64-seq   mkl - Intel(R) oneAPI Math Kernel Library
mkl-dynamic-ilp64-iomp mkl - Intel(R) oneAPI Math Kernel Library
mkl-dynamic-ilp64-seq  mkl - Intel(R) oneAPI Math Kernel Library
mkl-dynamic-lp64-gomp  mkl - Intel(R) oneAPI Math Kernel Library
mkl-dynamic-lp64-tbb   mkl - Intel(R) oneAPI Math Kernel Library

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned whether using these cblas.pc and lapacke.pc allows for static linkage

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIU, blas.override { blasProvider = mkl; } is somehow a way to expose mkl_rt the "single dynamic library", and libcblas.so in that derivation is a copy of libblas.so from mkl? What I do not understand if scipy is expected to link any of MKL statically, and whether the blas switching mechanism supports static linkage.

@matthewbauer I see you worked both on blas switching, and on blas/lapack in numpy. Any hints?

@doronbehar
Copy link
Contributor

Somewhat related: I have played with cross compiling scipy by forking it and adding a flake.nix to it. I also discussed my progress with upstream here. I also experienced the need to completely override the buildPhase, here's a link to my fork:

scipy/scipy@main...doronbehar:scipy:nixFlake.deb#diff-0c96cbb3b4

I haven't contributed the changes to Nixpkgs yet, because the stable scipy version is still missing some patches that will make this work, not to mention issues in meson-python. Anyway, it'd be nice if the patch here will also include such a cross-file.txt - that will essentially make sure scipy is "always cross compiled".

Ideally, every package that depends on meson-python, would be built with a cross-file.txt such as this, hence I think this should be the naming of the format attribute you use here... I think currently only scipy depends on meson-python directly.

pkgs/development/python-modules/scipy/default.nix Outdated Show resolved Hide resolved
@@ -19,10 +21,15 @@
, libxcrypt
}:

assert blas.provider == numpy.blas;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no asserts, because they can block overriding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assert here is to ensure that when one attempts an override they override both scipy and numpy. IIRC, a similar assert is used for cudaPackages in several places, so if this change is rejected, we probably should update these as well

Previously the synchronization was achieved by putting numpy.blas in buildInputs. The reason I removed numpy.blas was because numpy.blas is blas.provider rather than blas. But as mentioned earlier, I'm not yet sure how to motivate the choice between blas and blas.provider. I'll have to inspect the history of this derivation, I guess...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assert here is to ensure that when one attempts an override they override both scipy and numpy. IIRC, a similar assert is used for cudaPackages in several places, so if this change is rejected, we probably should update these as well

The issue is explained in this older thread #36229.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't use blas.provider when I implemented multiple blas overrides in octave, because there things are a little bit more complicated - the octave expression has to coordinate between many dependencies that depend and should use the same blas and lapack. In this case, it's much simpler to my understanding, and I don't understand what's wrong with using numpy.blas which points to blas.provider. In the current state of things you can just:

scipy-myblas = super.scipy.override {
  numpy = super.numpy.override {
    blas = self.myblas;
  };
};

Which seems very nice to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current state of things you can just ... @doronbehar

Personally, I find this rather obscure: blas is scipy's direct dependency, but we cannot just override it, we have to override numpy instead. The actual "reason" I'm making blas an explicit argument, however, is that from numpy.blas (evaluates into mkl in case of blas = prev.blas.override { blasProvider = final.mkl; }) we cannot infer the correct pkg-config target name. With the blas-switching derivation the pkg-config target is fixed at cblas, as far as I can tell.

The reason I put the assert is because previously it wasn't possible to have un-synchronized numpy.blas and scipy.blas, but with the new interface it is and even likely to happen by accident (e.g. if one had a local override like your scipy-myblas, then it could still evaluate, but scipy its propagated numpy would be silently using different BLAS implementations)

I see the convenience argument, but global overlays are same cost in LOC (assuming there's a binary cache), are safer, and this PR doesn't break them.

I also haven't looked into why numpy exposes blas.provider instead of just blas.

Further thoughts?

@FRidh I'll switch to meta.broken, and later open a PR to migrate cuda packages as well

Copy link
Member

@FRidh FRidh Jun 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly!

And don't forget the self = pythonWithMyBlas; so withPackages functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only I think blas and lapack come from the outer package set (nixpkgs#blas, not nixpkgs#python3Packages.blas). But the idea is the same, only that it's even coarser granularity: you overlay the entire nixpkgs to achieve mutual compatibility. We can also add meta.broken = blas != numpy.blas. I still don't see any reason we expose blas.provider instead of blas in numpy.passthru, so I think we shouldn't be doing that

RE: python3.override

  pythonPackagesExtensions = prev.pythonPackagesExtensions ++ [
    (python-final: python-prev: { ... })
  ];

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only I think blas and lapack come from the outer package set (nixpkgs#blas, not nixpkgs#python3Packages.blas).

CallPackage should pick it up if I am correct

I still don't see any reason we expose blas.provider instead of blas in numpy.passthru, so I think we shouldn't be doing that

I don't know anymore why that was. Maybe that should be changed throughout?

Only I think blas and lapack come from the outer package set (nixpkgs#blas, not nixpkgs#python3Packages.blas). But the idea is the same, only that it's even coarser granularity: you overlay the entire nixpkgs to achieve mutual compatibility. We can also add meta.broken = blas != numpy.blas. I still don't see any reason we expose blas.provider instead of blas in numpy.passthru, so I think we shouldn't be doing that

RE: python3.override

  pythonPackagesExtensions = prev.pythonPackagesExtensions ++ [
    (python-final: python-prev: { ... })
  ];

That would have an effect on the entire nixpkgs then (all interpreters + every package using any of these).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so we could just add the blas attribute to the python package set via an overlay, is that what you're saying? And in either case it would be picked up by callPackage.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly!

Comment on lines +99 to +105
pypaBuildFlags = [
# Skip sdist
"--wheel"

"-Csetup-args=-Dblas=cblas"
"-Csetup-args=-Dlapack=lapacke"
];
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently, pip now supports this as well: mesonbuild/meson-python#415 (comment)

However, we should probably finish the build version first, and then move back to pip, because if we're to introduce pipWheelFlags we'll have to go through staging

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This rebuild is already very large and needs to go through staging.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FRidh OK, new plan then:

  • drop -m build (because there's no need for it),
  • introduce pipWheelFlags,
  • expose setupHook in python3Packages.meson-python, so that including meson-python in nativeBuildInputs is sufficient to build meson-python-backed packages
  • meson-python.setupHook looks at the same variables as meson, e.g. mesonFlags
  • meson-python.setupHook prepends mesonFlags with -Csetup-args= and extends pipBuildFlags with them

Does that sound OK? I fear this is too much indirection and could be fragile

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think this is the right approach.

Note we'd want this also in the future when we would replace pip with build.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FRidh OK, new plan then:

  • drop -m build (because there's no need for it),
  • introduce pipWheelFlags,

I'm working on pipBuildFlags at #239969 .

  • expose setupHook in python3Packages.meson-python, so that including meson-python in nativeBuildInputs is sufficient to build meson-python-backed packages
  • meson-python.setupHook looks at the same variables as meson, e.g. mesonFlags
  • meson-python.setupHook prepends mesonFlags with -Csetup-args= and extends pipBuildFlags with them

Does that sound OK? I fear this is too much indirection and could be fragile

That sounds great to me too, but what I learned from the experience in #239969 is that passing --cross-file to meson is what's needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, does --cross-file go into mesonFlags (pipBuildFlags) in addition to -Dblas...?

@ofborg ofborg bot requested a review from FRidh May 5, 2023 17:56
@doronbehar
Copy link
Contributor

BTW scipy 1.11.0 is out https://github.com/scipy/scipy/releases/tag/v1.11.0 . I plan to work on it's update, possibly along with support for cross compilation, in the upcoming week.

@wegank wegank added 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md 2.status: merge conflict labels Mar 19, 2024
@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Mar 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants