OpenBLAS v0.3.28 will have a new feature allowing OpenBLAS to use a threadpool chosen by the user (see OpenMathLib/OpenBLAS#4577).
This is very interesting because it would solve a performance issue that occurs when BLAS calls and OpenMP (prange) calls follow each other in quick succession. The issue happens when OpenBLAS and OpenMP don't share the same threadpool: both threadpools stay in active wait mode when they're idle (see OpenMathLib/OpenBLAS#3187 for details), so the idle threads of one pool compete for cores with the working threads of the other. This is currently the case, since numpy and scipy wheels are built against OpenBLAS with the pthreads threading layer.
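To check whether a given environment is in this situation, one can look at the native libraries loaded in the process. The sketch below assumes the list-of-dicts layout returned by `threadpoolctl.threadpool_info()` (keys `user_api`, `internal_api`, `threading_layer`). The helper name `has_separate_pools` and the `SAMPLE_INFO` data are hypothetical and only there for illustration.

```python
def has_separate_pools(info):
    """Return True if a pthreads OpenBLAS is loaded next to an OpenMP runtime.

    `info` is assumed to follow the layout of threadpoolctl.threadpool_info().
    """
    openblas_pthreads = any(
        lib.get("internal_api") == "openblas"
        and lib.get("threading_layer") == "pthreads"
        for lib in info
    )
    openmp_loaded = any(lib.get("user_api") == "openmp" for lib in info)
    return openblas_pthreads and openmp_loaded


# Hypothetical sample, shaped like threadpool_info() output (not real measurements).
SAMPLE_INFO = [
    {"user_api": "blas", "internal_api": "openblas",
     "threading_layer": "pthreads", "num_threads": 8},
    {"user_api": "openmp", "internal_api": "openmp", "num_threads": 8},
]

if __name__ == "__main__":
    try:
        # Use the real process state when threadpoolctl is available.
        from threadpoolctl import threadpool_info
        info = threadpool_info()
    except ImportError:
        # Fall back to the illustrative sample otherwise.
        info = SAMPLE_INFO
    print(has_separate_pools(info))
```

Note that `threadpool_info()` only reports libraries already loaded in the process, so numpy and scikit-learn need to be imported first for the check to be meaningful.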
This issue is currently impacting some estimators like KMeans (#20642), NMF (#16439), pairwise_distances (#26097), ...
Being able to configure OpenBLAS to use our OpenMP threadpool would allow us to get rid of this issue even if numpy and scipy keep building their wheels against OpenBLAS pthreads (which is very likely).
I'm not sure yet if or how OpenMathLib/OpenBLAS#4577 would make this possible, so I'm opening this issue to track progress on this subject.
This is indeed interesting and might also be of interest for other people in the ecosystem, e.g. @rgommers and numpy/scipy developers interested in multithreading.
Wow, this is amazing! From looking through OpenMathLib/OpenBLAS#4577, I think we'll need to write and register a callback that hooks up scikit-learn's vendored OpenMP with OpenBLAS.
If this backend-specific callback code is useful for other projects, is there a way to share it through threadpoolctl? Concretely, something like threadpoolctl.register_openblas_backend("openmp"). Although this does increase the scope of threadpoolctl, the feature feels related.
I am not sure we can do that as part of threadpoolctl itself. More precisely, if we do it as part of threadpoolctl, then OpenBLAS will use the OpenMP runtime linked to a native extension shipped with threadpoolctl, but we would have no guarantee that this is the same runtime as the one linked to sklearn's Cython native extensions with prange loops.
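The concern about mismatched runtimes can be checked after the fact by listing the distinct OpenMP runtimes loaded in the process. This is a minimal sketch, again assuming the `threadpoolctl.threadpool_info()` layout (keys `user_api`, `filepath`); the helper name `openmp_runtime_paths` and the file paths in `SAMPLE_INFO` are hypothetical.

```python
def openmp_runtime_paths(info):
    """Return the sorted, de-duplicated file paths of loaded OpenMP runtimes.

    `info` is assumed to follow the layout of threadpoolctl.threadpool_info().
    """
    return sorted(
        {
            lib["filepath"]
            for lib in info
            if lib.get("user_api") == "openmp" and lib.get("filepath")
        }
    )


# Hypothetical sample: two different OpenMP runtimes loaded in one process.
SAMPLE_INFO = [
    {"user_api": "openmp", "internal_api": "openmp",
     "filepath": "/hypothetical/site-packages/scikit_learn.libs/libgomp.so"},
    {"user_api": "openmp", "internal_api": "openmp",
     "filepath": "/hypothetical/site-packages/other_package.libs/libgomp.so"},
    {"user_api": "blas", "internal_api": "openblas",
     "filepath": "/hypothetical/site-packages/numpy.libs/libopenblas.so"},
]

if __name__ == "__main__":
    paths = openmp_runtime_paths(SAMPLE_INFO)
    # More than one path means a callback registered through one runtime
    # would not necessarily share its threadpool with the other.
    print(len(paths) > 1)
```

More than one path in the result means the process holds several OpenMP threadpools, which is exactly the case where OpenBLAS and the prange loops could still end up on different pools.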