hvg flavors seurat and cellranger with batch: bug in subset #3042

Draft: eroell wants to merge 6 commits into main

eroell
Contributor

@eroell eroell commented May 2, 2024

  • Release notes not necessary because:

This PR fixes the bug reported in the linked issue.

A new test that catches the erroneous computation has been added.
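
The kind of check such a test can make is sketched below, under the assumption that it compares the `subset=True` result against the `subset=False` flags; the PR text does not spell out the test's contents. `toy_hvg` is a hypothetical stand-in written for this sketch, not scanpy's implementation: it ranks genes by variance within each batch and flags any gene that reaches the top `n_top` in at least one batch.

```python
from statistics import pvariance


def toy_hvg(expr, batches, n_top, subset=False):
    """Hypothetical stand-in for a batch-aware HVG call (not scanpy code).

    expr maps gene name -> list of per-cell values; batches lists each cell's batch.
    Returns {gene: flag} when subset=False, or the list of kept genes when subset=True.
    """
    flagged = set()
    for batch in sorted(set(batches)):
        cells = [i for i, b in enumerate(batches) if b == batch]
        # Per-batch variance of every gene, restricted to this batch's cells.
        var = {g: pvariance([vals[i] for i in cells]) for g, vals in expr.items()}
        # Genes ranked in the top n_top within this batch get flagged.
        flagged.update(sorted(var, key=var.get, reverse=True)[:n_top])
    if subset:
        # Keep the original gene order so the subset lines up with the flags.
        return [g for g in expr if g in flagged]
    return {g: g in flagged for g in expr}


expr = {
    "g0": [0, 0, 0, 0],
    "g1": [0, 9, 1, 1],
    "g2": [1, 1, 0, 8],
    "g3": [2, 3, 2, 3],
}
batches = ["a", "a", "b", "b"]

flags = toy_hvg(expr, batches, n_top=1, subset=False)
kept = toy_hvg(expr, batches, n_top=1, subset=True)

# The invariant: subsetting keeps exactly the genes that were flagged.
assert kept == [g for g, is_hvg in flags.items() if is_hvg]
```

A real test would replace `toy_hvg` with the library call on a shared fixture and compare the resulting gene names in the same way.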

I would like to use this opportunity to refactor _highly_variable_genes.py rather than ship only the two-line fix suggested in the first commit.
I think the multi-batch HVG flagging being done differently for seurat_v3 than for seurat/cell_ranger is what made this bug hard to spot in the first place.

codecov bot commented May 2, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 75.80%. Comparing base (23c20bc) to head (c0b77e1).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3042   +/-   ##
=======================================
  Coverage   75.80%   75.80%           
=======================================
  Files         110      110           
  Lines       12502    12504    +2     
=======================================
+ Hits         9477     9479    +2     
  Misses       3025     3025           
Files Coverage Δ
scanpy/preprocessing/_highly_variable_genes.py 95.63% <100.00%> (+0.03%) ⬆️

@eroell eroell requested a review from flying-sheep May 2, 2024 14:49
@flying-sheep
Member

Looks simple enough! Please deduplicate the tests though, they have too many identical lines.

@eroell
Contributor Author

eroell commented May 21, 2024

> Please deduplicate the tests though, they have too many identical lines.

To do so, I added the across-settings checks as a for loop on top of the former test.

Would you prefer a separate test that contains that for loop for the across-settings check?
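
The separate-test option raised above could look roughly like the following sketch. `run_hvg` is a hypothetical stand-in with made-up scores, not scanpy code; the flavor names `seurat` and `cell_ranger` come from the PR title, while the `n_top` values are assumptions chosen for illustration.

```python
from itertools import product


def run_hvg(flavor, n_top, subset):
    """Hypothetical stand-in for the HVG call under test (not scanpy code)."""
    genes = ["g0", "g1", "g2", "g3", "g4"]
    # Pretend scores; a real test would call the library on a shared fixture.
    scores = {"seurat": [5, 1, 4, 2, 3], "cell_ranger": [1, 5, 2, 4, 3]}[flavor]
    by_gene = dict(zip(genes, scores))
    top = set(sorted(genes, key=by_gene.get, reverse=True)[:n_top])
    if subset:
        return [g for g in genes if g in top]
    return {g: g in top for g in genes}


def test_subset_matches_flags_across_settings():
    failures = []
    for flavor, n_top in product(["seurat", "cell_ranger"], [1, 2, 3]):
        flags = run_hvg(flavor, n_top, subset=False)
        kept = run_hvg(flavor, n_top, subset=True)
        expected = [g for g, is_hvg in flags.items() if is_hvg]
        if kept != expected:
            failures.append((flavor, n_top))
    # Report every failing setting at once instead of stopping at the first.
    assert not failures, f"subset and flags disagree for: {failures}"


test_subset_matches_flags_across_settings()
```

Collecting failures rather than asserting inside the loop keeps one failing setting from hiding the others; `pytest.mark.parametrize` would give the same coverage with one reported test case per setting.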

Successfully merging this pull request may close these issues.

Unexplainable sc.pp.highly_variable_genes(subset = True) behavior
2 participants