NMF #2939

Open
ivirshup opened this issue Mar 21, 2024 · 0 comments

Comments

@ivirshup
Member

What kind of feature would you like to request?

New analysis tool: A simple analysis tool you have been using and are missing in sc.tools?

Please describe your wishes

I've thought for a while that we should have NMF in scanpy (#941).

But it's always been pretty trivial for users to implement themselves, so adding it would not have saved anyone much work. Now that we're increasing the amount of out-of-core support in scanpy, I think we can offer a lot more value here with out-of-core NMF support.

I would suggest we start with a simple sklearn.decomposition.NMF wrapper for in-memory datasets.
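
As a rough illustration of how small such a wrapper could be, here is a minimal sketch. The function name `nmf`, its signature, and the AnnData keys mentioned in the comments are assumptions made for illustration, not an existing or agreed scanpy API; only `sklearn.decomposition.NMF` itself is real.

```python
import numpy as np
from sklearn.decomposition import NMF


def nmf(X, n_components=10, max_iter=500, random_state=0, **kwargs):
    """Hypothetical in-memory NMF helper (not an existing scanpy function).

    X is a non-negative (n_obs, n_vars) dense array or scipy sparse matrix.
    Returns the per-observation factors W, the components H, and the model.
    """
    model = NMF(
        n_components=n_components,
        max_iter=max_iter,
        random_state=random_state,
        **kwargs,
    )
    W = model.fit_transform(X)  # shape (n_obs, n_components)
    H = model.components_       # shape (n_components, n_vars)
    return W, H, model


# Toy non-negative count matrix standing in for adata.X
rng = np.random.default_rng(0)
X = rng.poisson(1.0, size=(200, 50)).astype(np.float64)

W, H, model = nmf(X, n_components=5)
# In an AnnData-based wrapper these might be stored as, for example,
# adata.obsm["X_nmf"] = W and adata.varm["NMF_components"] = H.T
# (both key names are assumptions).
```

A real wrapper would also need to decide which layer to factorize and how to handle inputs with negative values (for example scaled data), which NMF rejects.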

For out of core implementations, it'll be a bit more work. Some thoughts:

  • sklearn offers MiniBatchNMF, which allows updating by batch. While this is out of core, it's effectively serial and may not scale well with increasing compute.
  • But there are many distributed NMF implementations out there (including GPU-specific ones, which is relevant for rapids-singlecell).
  • It would be nice to upstream whatever we do to dask-ml (Add NMF dask/dask-ml#96), and maybe cuml.