(feat): raising errors where `backed` is not supported #3048

ilan-gold · 2024-05-08T12:21:56Z

The idea here is to raise errors wherever I have checked that things currently don't work, regardless of the reason, without attempting to fix the underlying problem. Once scverse/anndata#1469 is merged, we can make concrete recommendations for how to handle out-of-core data.

I think a decorator could work, but we would have to check the type inside the decorator (instead of relying on the current checks, as in filter_genes), like this:

if isinstance(arg1, AnnData) and arg1.isbacked:
    raise NotImplementedError(...)
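A minimal sketch of the decorator idea described above. It is not the code merged in this PR: `FakeAnnData`, `error_if_backed`, and `filter_genes_sketch` are hypothetical names, and the stand-in class models only the `isbacked` attribute so that the example runs without anndata installed.

```python
from __future__ import annotations

import functools
from dataclasses import dataclass
from typing import Callable, TypeVar

R = TypeVar("R")


@dataclass
class FakeAnnData:
    """Hypothetical stand-in for anndata.AnnData; only `isbacked` is modelled."""

    isbacked: bool = False


def error_if_backed(func: Callable[..., R]) -> Callable[..., R]:
    """Raise NotImplementedError when the first positional argument is backed."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs) -> R:
        first = args[0] if args else None
        # Assumption: a real check would test isinstance(first, AnnData).
        if isinstance(first, FakeAnnData) and first.isbacked:
            msg = f"{func.__name__} is not implemented for backed AnnData objects."
            raise NotImplementedError(msg)
        return func(*args, **kwargs)

    return wrapper


@error_if_backed
def filter_genes_sketch(adata: FakeAnnData) -> str:
    # Placeholder body standing in for the real filtering logic.
    return "filtered"
```

Calling `filter_genes_sketch(FakeAnnData(isbacked=True))` raises `NotImplementedError` with the function name in the message, while an in-memory object passes through to the wrapped function unchanged.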

But then there is something like log1p, where we quasi-support backed via the chunked kwarg, which would not really fit the above paradigm.

Nonetheless, I think I need to go one-by-one through the functions to check what we support and don't.

Separately, we may want to drop support where it already exists (which, from my searching, is only obs_df, var_df, and subsample_counts).

Closes "AxisError when calculating QC metrics on backed data" (#3004) and closes "Out of Memory filtering returns TypeError" (#2894).
Tests included or not required because:

Release notes not necessary because:

codecov · 2024-05-08T12:27:30Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 75.87%. Comparing base (23c20bc) to head (b0ca228).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3048      +/-   ##
==========================================
+ Coverage   75.80%   75.87%   +0.06%     
==========================================
  Files         110      110              
  Lines       12502    12533      +31     
==========================================
+ Hits         9477     9509      +32     
+ Misses       3025     3024       -1

Files	Coverage Δ
scanpy/_utils/__init__.py	`75.05% <100.00%> (+0.48%)`	⬆️
scanpy/preprocessing/_pca.py	`92.93% <100.00%> (+0.15%)`	⬆️
scanpy/preprocessing/_scale.py	`91.96% <100.00%> (+0.07%)`	⬆️
scanpy/preprocessing/_simple.py	`87.95% <100.00%> (+0.38%)`	⬆️
scanpy/tools/_dendrogram.py	`86.95% <100.00%> (+0.28%)`	⬆️
scanpy/tools/_ingest.py	`77.33% <100.00%> (+0.10%)`	⬆️
scanpy/tools/_rank_genes_groups.py	`94.33% <100.00%> (+0.01%)`	⬆️
scanpy/tools/_score_genes.py	`85.54% <100.00%> (+0.35%)`	⬆️
scanpy/tools/_tsne.py	`93.18% <100.00%> (+0.15%)`	⬆️

... and 1 file with indirect coverage changes

ilan-gold · 2024-05-10T09:03:31Z

@flying-sheep What do you think here? If the plan looks good for this subset of functions, I'd expand it.

flying-sheep · 2024-05-13T08:13:46Z

I like the idea! Better error messages, and getting our modalities a bit under control is a great goal as well!

ilan-gold · 2024-05-13T13:29:02Z

Looking at this again, now that I have gone through everything, I think we actually need to check types directly and shouldn't rely on isbacked, because it is possible to do something like adata.layers['foo'] = sparse_dataset(g_layer), and this should also error out with a helpful message.
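To illustrate the difference between checking a global flag and checking types directly, here is a rough sketch. `OnDiskSparse`, `FakeAnnData`, `UNSUPPORTED_TYPES`, and `raise_if_unsupported` are hypothetical stand-ins invented for this example, not names from scanpy or anndata.

```python
from __future__ import annotations

from dataclasses import dataclass, field


class OnDiskSparse:
    """Hypothetical stand-in for an on-disk sparse matrix type."""


@dataclass
class FakeAnnData:
    """Hypothetical stand-in for AnnData with only X and layers."""

    X: object = None
    layers: dict[str, object] = field(default_factory=dict)


# Assumption: the real tuple would list the actual on-disk array types.
UNSUPPORTED_TYPES = (OnDiskSparse,)


def raise_if_unsupported(
    adata: FakeAnnData, func_name: str, layer: str | None = None
) -> None:
    """Inspect the matrix that will actually be used, not an object-wide flag."""
    data = adata.X if layer is None else adata.layers[layer]
    if isinstance(data, UNSUPPORTED_TYPES):
        where = "X" if layer is None else f"layers[{layer!r}]"
        msg = f"{func_name} is not implemented for {type(data).__name__} in {where}."
        raise NotImplementedError(msg)
```

With this shape, an object whose X is in memory but whose layer "foo" holds an on-disk matrix is rejected only when that layer is requested, which a single isbacked flag could not express.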

flying-sheep · 2024-05-14T08:38:48Z

If that’s actually supported, we need to rethink isbacked anyway.

ilan-gold · 2024-05-14T08:56:31Z

> If that’s actually supported

If what is actually supported? sparse_dataset is exported from experimental and in any case, does being more exhaustive hurt? I think we have been telling people to use sparse_dataset if it suits them.

scanpy/tests/test_highly_variable_genes.py

ilan-gold · 2024-05-15T11:20:41Z

How do we xfail stuff from dev? https://dev.azure.com/scverse/scanpy/_build/results?buildId=6692&view=logs&j=cb4d9293-b492-5d67-02b0-e6a595893958&t=99aeec2e-a40e-57fc-1ab3-27c1a626c3e0&l=108 It looks like the UMAP package, via pynndescent, is using something (np.infty) that has been removed in an upcoming release of numpy.
Codecov, I think, is outright wrong, although that might have to do with the failing dev test.

ilan-gold · 2024-05-15T11:50:15Z

So it looks like we definitely started downloading the rc for numpy recently: https://dev.azure.com/scverse/scanpy/_build/results?buildId=6661&view=logs&j=cb4d9293-b492-5d67-02b0-e6a595893958&t=22c10d56-3e3b-5f98-5bc6-b33384a21306 (from last week or so, downloading 1.26.4) vs https://dev.azure.com/scverse/scanpy/_build/results?buildId=6692&view=logs&j=cb4d9293-b492-5d67-02b0-e6a595893958&t=efb91c47-e839-5730-ecc5-cc752bc791b5 (downloading the 2.0 rc)

flying-sheep · 2024-05-16T09:30:40Z

> How do we xfail stuff from dev?

pytest.mark.xfail takes a condition:

import os

import pytest

# Expected to fail only when CI installs pre-release dependencies.
xfail_if_dev_tests = pytest.mark.xfail(
    os.environ.get("DEPENDENCIES_VERSION", "latest") == "pre-release",
    reason="...",
)

@xfail_if_dev_tests
def test_xzy(): ...

You probably need to change the tests so it makes the CI variable visible as an env variable, I’m not an Azure expert so I don’t know if it already is.
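One way to keep the condition testable independently of CI is to isolate it in a small helper. This is only a sketch: `is_prerelease_run` is a hypothetical name, and whether `DEPENDENCIES_VERSION` is actually exported to the test process on Azure is the open question raised above.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def is_prerelease_run(environ: Mapping[str, str] = os.environ) -> bool:
    """Return True when the CI job installs pre-release dependencies.

    Assumption: the CI exposes DEPENDENCIES_VERSION as an environment
    variable; a missing variable is treated as "latest".
    """
    return environ.get("DEPENDENCIES_VERSION", "latest") == "pre-release"
```

The result could then be passed as the condition to pytest.mark.xfail, and the helper can be checked locally by handing it a plain dict instead of the real environment.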

> Codecov, I think, is outright wrong, although that might have to do with the failing dev test

Yeah, maybe, let’s see once everything passes.

I’m also OK with lowering the percentage, I just set it to 75% to have some indication if codecov is broken or working. (Before it would report 20% for a PR and there would be no visual indication that that’s a problem)

flying-sheep

See above

scanpy/tools/_score_genes.py

ilan-gold · 2024-05-17T13:54:24Z

Flaky tests, it seems, scanpy/tests/test_scrublet.py::test_scrublet_data under dev and scanpy/tests/test_utils.py::test_is_constant_dask under min version: https://dev.azure.com/scverse/scanpy/_build/results?buildId=6733&view=logs&j=cb4d9293-b492-5d67-02b0-e6a595893958&t=99aeec2e-a40e-57fc-1ab3-27c1a626c3e0&l=2230 and https://dev.azure.com/scverse/scanpy/_build/results?buildId=6733&view=logs&j=cb4d9293-b492-5d67-02b0-e6a595893958&t=99aeec2e-a40e-57fc-1ab3-27c1a626c3e0

Will try re-running

(feat): first step raising errors where backed is not supported

87450d0

ilan-gold added this to the 1.10.2 milestone May 8, 2024

ilan-gold added the "Area - Documentation 📒", "Area – API" (API design), and "Area - Out of core 💾" (Working with on disk data) labels May 8, 2024

ilan-gold added 2 commits May 8, 2024 14:54

(fix): fix chunked with log1p

b6e13eb

(fix): complications in log1p with chunked

75a703b

ilan-gold self-assigned this May 8, 2024

Merge branch 'main' into ig/backed_not_implemented

4a654d1

flying-sheep and others added 9 commits May 13, 2024 10:14

remove duplicated import

a80dece

(feat): pca check

4c22705

Merge branch 'ig/backed_not_implemented' of github.com:scverse/scanpy…

b3c4bb7

… into ig/backed_not_implemented

(feat): ingest check added

cb6116b

(feat): add dendrogram test

5b7a3e9

(feat): add tsne check

53e68c1

(feat): rank_genes_groups

cfa8d57

(feat): score_genes check

138c9be

(chore): add plotting backed tests

d68ee47

ilan-gold added 6 commits May 14, 2024 10:56

Merge branch 'main' into ig/backed_not_implemented

6632910

(chore): check type instead of isbacked

788a4cb

Merge branch 'ig/backed_not_implemented' of github.com:scverse/scanpy…

06ab4f1

… into ig/backed_not_implemented

(chore): release note

dafda50

(fix): string formatting error

786f096

(fix): worker_id default

4eb3487

(chore): remove other tempfile import

27c3d08

ilan-gold force-pushed the ig/backed_not_implemented branch from f25bddc to 27c3d08 Compare May 14, 2024 15:41

ilan-gold added 3 commits May 15, 2024 11:36

Merge branch 'main' into ig/backed_not_implemented

218067a

(fix): try moving scanpy install

01a0bcf

(fix): np.Inf -> np.inf

7e161bf

ilan-gold commented May 15, 2024

scanpy/tests/test_highly_variable_genes.py (outdated, resolved)

flying-sheep self-requested a review May 16, 2024 08:44

flying-sheep requested changes May 16, 2024

flying-sheep reviewed May 16, 2024

scanpy/tools/_score_genes.py (outdated, resolved)

ilan-gold and others added 7 commits May 16, 2024 14:43

(fix): fail if pynn is newest

88d51ab

(fix): name

e4ebaf7

(fix): score_genes name

037b77f

Merge branch 'main' into ig/backed_not_implemented

0d96e89

revert 88d51ab

e2e9252

session-scoped backed_adata

54ee86f

Merge branch 'main' into ig/backed_not_implemented

2e8ba99

ilan-gold requested a review from flying-sheep May 17, 2024 11:52

(fix): move umap import back to top

6b03ac8

(fix): try block shape

352166b

flying-sheep approved these changes May 17, 2024

(fix): revert chunks fix

b0ca228

ilan-gold merged commit 3ba3f46 into main May 21, 2024
14 checks passed

ilan-gold deleted the ig/backed_not_implemented branch May 21, 2024 08:19

meeseeksmachine pushed a commit to meeseeksmachine/scanpy that referenced this pull request May 21, 2024

Backport PR scverse#3048: (feat): raising errors where backed is no…

e8ca9a8

…t supported

meeseeksmachine mentioned this pull request May 21, 2024

Backport PR #3048: (feat): raising errors where backed is not supported #3072

Merged

flying-sheep added a commit that referenced this pull request Jun 3, 2024

Backport PR #3048: (feat): raising errors where backed is not suppo…

0f9ff18

…rted (#3072) Co-authored-by: Ilan Gold <ilanbassgold@gmail.com> Co-authored-by: Philipp A <flying-sheep@web.de>
