Configure OpenBLAS to use scikit-learn's OpenMP threadpool #28883

Open
jeremiedbb opened this issue Apr 24, 2024 · 3 comments

Comments

@jeremiedbb
Member

jeremiedbb commented Apr 24, 2024

OpenBLAS v0.3.28 will have a new feature allowing OpenBLAS to use a threadpool chosen by the user (see OpenMathLib/OpenBLAS#4577).

This is very interesting because it would solve a performance issue that occurs when BLAS calls and OpenMP (prange) calls follow each other in quick succession. The issue arises when OpenBLAS and OpenMP don't share the same threadpool, because each threadpool stays in active wait mode while it is idle (see OpenMathLib/OpenBLAS#3187 for details). This is currently the case, since numpy and scipy wheels are built against OpenBLAS with the pthreads threading layer.
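
To check whether a given environment is in the situation described above, one option is to look at what `threadpoolctl.threadpool_info()` reports. The sketch below is only illustrative: `has_separate_threadpools` and `check_current_process` are hypothetical helper names, and the check assumes the usual `user_api`, `internal_api` and `threading_layer` keys of the threadpoolctl output. It cannot tell whether an OpenMP-based OpenBLAS build shares its runtime with scikit-learn's extensions.

```python
def has_separate_threadpools(modules):
    """Return True if an OpenBLAS build that does not use OpenMP is loaded
    next to at least one OpenMP runtime.

    `modules` is a list of dicts shaped like the output of
    threadpoolctl.threadpool_info(). This is a heuristic, not a proof that
    the two pools actually contend with each other.
    """
    openblas_without_openmp = any(
        m.get("internal_api") == "openblas"
        and m.get("threading_layer") != "openmp"
        for m in modules
    )
    openmp_loaded = any(m.get("user_api") == "openmp" for m in modules)
    return openblas_without_openmp and openmp_loaded


def check_current_process():
    """Hypothetical helper: run the check on the libraries loaded right now.

    Requires threadpoolctl. Import numpy, scipy and scikit-learn beforehand so
    that their native libraries are loaded and therefore visible.
    """
    from threadpoolctl import threadpool_info

    return has_separate_threadpools(threadpool_info())
```

With numpy or scipy wheels built against OpenBLAS pthreads and scikit-learn's compiled extensions imported, `check_current_process()` would be expected to return `True`.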

This issue is currently impacting some estimators like KMeans (#20642), NMF (#16439), pairwise_distances (#26097), ...

Being able to configure OpenBLAS to use our OpenMP threadpool would let us get rid of this issue even if numpy and scipy keep building their wheels against OpenBLAS pthreads (which is very likely).

I'm not sure yet if or how OpenMathLib/OpenBLAS#4577 would make this possible, so I'm opening this issue to track progress on this subject.

@ogrisel
Member

ogrisel commented Apr 25, 2024

This is indeed interesting and might also be of interest for other people in the ecosystem, e.g. @rgommers and numpy/scipy developers interested in multithreading.

@thomasjpfan
Member

Wow, this is amazing! From looking through OpenMathLib/OpenBLAS#4577, I think we'll need to write and register a callback that hooks up scikit-learn's vendored OpenMP with OpenBLAS.

If this backend-specific callback code is useful for other projects, is there a way to share it through threadpoolctl? Concretely, something like threadpoolctl.register_openblas_backend("openmp"). Although this does increase the scope of threadpoolctl, the feature feels related.

@ogrisel
Member

ogrisel commented May 7, 2024

I am not sure we can do that as part of threadpoolctl itself. More precisely, if we do it as part of threadpoolctl, then OpenBLAS will use the OpenMP runtime linked against a native extension shipped with threadpoolctl, but we would have no guarantee that this is the same runtime as the one linked to sklearn's Cython native extensions with prange loops.
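
To illustrate the concern, a process can end up with more than one OpenMP runtime loaded. A minimal sketch for listing them, assuming the `user_api` and `filepath` keys reported by `threadpoolctl.threadpool_info()` (`distinct_openmp_runtimes` and `list_current_openmp_runtimes` are hypothetical helper names):

```python
def distinct_openmp_runtimes(modules):
    """Return the sorted, de-duplicated file paths of OpenMP runtimes found in
    `modules`, a list of dicts shaped like threadpoolctl.threadpool_info().
    """
    paths = {
        m["filepath"]
        for m in modules
        if m.get("user_api") == "openmp" and m.get("filepath")
    }
    return sorted(paths)


def list_current_openmp_runtimes():
    """Hypothetical helper: list the OpenMP runtimes loaded in this process.

    Requires threadpoolctl. Import the packages of interest first (for
    example scikit-learn) so that their native libraries are loaded.
    """
    from threadpoolctl import threadpool_info

    return distinct_openmp_runtimes(threadpool_info())
```

If this returns more than one path, a callback registered from one native extension is not necessarily driving the same OpenMP runtime as the prange loops in another.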
