AxisError when calculating QC metrics on backed data #3004

Closed
2 of 3 tasks
dn-ra opened this issue Apr 12, 2024 · 1 comment · Fixed by #3048
dn-ra commented Apr 12, 2024

Please make sure these conditions are met

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of scanpy.
  • (optional) I have confirmed this bug exists on the main branch of scanpy.

What happened?

When I load data in backed mode, I get an AxisError when trying to calculate QC metrics. The problem has occurred on three different datasets, but it does not occur when I read the data into memory.

Minimal code sample

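import scanpy as sc

# Download pbmc3k; with default settings scanpy caches it as data/pbmc3k_raw.h5ad
sc.datasets.pbmc3k()
# Reopen the cached file in backed mode instead of loading it into memory
pbmc = sc.read_h5ad('data/pbmc3k_raw.h5ad', backed='r+')
pbmc.var['mt'] = pbmc.var_names.str.startswith('MT-')
pbmc.var['ribo'] = pbmc.var_names.str.startswith(("RPS", "RPL"))
sc.pp.calculate_qc_metrics(pbmc, qc_vars=['mt', 'ribo'], percent_top=None, log1p=False, inplace=True)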

Error output

---------------------------------------------------------------------------
AxisError                                 Traceback (most recent call last)
Cell In[8], line 3
      1 pbmc.var['mt'] = pbmc.var_names.str.startswith('MT-')
      2 pbmc.var['ribo'] = pbmc.var_names.str.startswith(("RPS", "RPL"))
----> 3 sc.pp.calculate_qc_metrics(pbmc, qc_vars=['mt', 'ribo'], percent_top=None, log1p=False, inplace=True)

File ~/miniconda3/envs/parse_sepsis/lib/python3.12/site-packages/scanpy/preprocessing/_qc.py:315, in calculate_qc_metrics(adata, expr_type, var_type, qc_vars, percent_top, layer, use_raw, inplace, log1p, parallel)
    312 if isinstance(qc_vars, str):
    313     qc_vars = [qc_vars]
--> 315 obs_metrics = describe_obs(
    316     adata,
    317     expr_type=expr_type,
    318     var_type=var_type,
    319     qc_vars=qc_vars,
    320     percent_top=percent_top,
    321     inplace=inplace,
    322     X=X,
    323     log1p=log1p,
    324 )
    325 var_metrics = describe_var(
    326     adata,
    327     expr_type=expr_type,
   (...)
    331     log1p=log1p,
    332 )
    334 if not inplace:

File ~/miniconda3/envs/parse_sepsis/lib/python3.12/site-packages/scanpy/preprocessing/_qc.py:109, in describe_obs(adata, expr_type, var_type, qc_vars, percent_top, layer, use_raw, log1p, inplace, X, parallel)
    107     obs_metrics[f"n_{var_type}_by_{expr_type}"] = X.getnnz(axis=1)
    108 else:
--> 109     obs_metrics[f"n_{var_type}_by_{expr_type}"] = np.count_nonzero(X, axis=1)
    110 if log1p:
    111     obs_metrics[f"log1p_n_{var_type}_by_{expr_type}"] = np.log1p(
    112         obs_metrics[f"n_{var_type}_by_{expr_type}"]
    113     )

File ~/miniconda3/envs/parse_sepsis/lib/python3.12/site-packages/numpy/core/numeric.py:486, in count_nonzero(a, axis, keepdims)
    483 else:
    484     a_bool = a.astype(np.bool_, copy=False)
--> 486 return a_bool.sum(axis=axis, dtype=np.intp, keepdims=keepdims)

File ~/miniconda3/envs/parse_sepsis/lib/python3.12/site-packages/numpy/core/_methods.py:49, in _sum(a, axis, dtype, out, keepdims, initial, where)
     47 def _sum(a, axis=None, dtype=None, out=None, keepdims=False,
     48          initial=_NoValue, where=True):
---> 49     return umr_sum(a, axis, dtype, out, keepdims, initial, where)

AxisError: axis 1 is out of bounds for array of dimension 0
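
The "array of dimension 0" in the message suggests that NumPy did not recognise the backed `X` as a matrix at all. The sketch below reproduces the same failure with NumPy alone. `FakeBackedMatrix` is a hypothetical stand-in for an on-disk matrix wrapper, not an anndata class, and the assumption that the backed `X` behaves like it is inferred from the traceback rather than confirmed in this thread.

```python
import numpy as np


class FakeBackedMatrix:
    """Hypothetical stand-in for an on-disk matrix wrapper (not an anndata class)."""

    shape = (3, 4)  # looks 2-D to a human, but NumPy does not consult this attribute


X = FakeBackedMatrix()

# NumPy cannot interpret the object as an array, so it wraps it in a 0-d object array.
wrapped = np.asanyarray(X)
print(wrapped.ndim, wrapped.dtype)  # 0 object

try:
    np.count_nonzero(X, axis=1)
except ValueError as err:  # AxisError subclasses both ValueError and IndexError
    message = str(err)
    print(type(err).__name__, message)
```

Because the wrapped array has no axes, any reduction along `axis=1` fails, which matches the last two frames of the traceback above.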

Versions

-----
anndata     0.10.7
scanpy      1.10.1
-----
PIL                         10.3.0
anyio                       NA
arrow                       1.3.0
asttokens                   NA
attr                        23.2.0
attrs                       23.2.0
babel                       2.14.0
certifi                     2024.02.02
cffi                        1.16.0
charset_normalizer          3.3.2
colorama                    0.4.6
comm                        0.2.2
cycler                      0.12.1
cython_runtime              NA
dateutil                    2.9.0
debugpy                     1.8.1
decorator                   5.1.1
defusedxml                  0.7.1
executing                   2.0.1
fastjsonschema              NA
fqdn                        NA
h5py                        3.11.0
idna                        3.7
igraph                      0.11.4
ipykernel                   6.29.4
isoduration                 NA
jedi                        0.19.1
jinja2                      3.1.3
joblib                      1.4.0
json5                       0.9.24
jsonpointer                 2.4
jsonschema                  4.21.1
jsonschema_specifications   NA
jupyter_events              0.10.0
jupyter_server              2.14.0
jupyterlab_server           2.26.0
kiwisolver                  1.4.5
legacy_api_wrap             NA
leidenalg                   0.10.2
llvmlite                    0.42.0
markupsafe                  2.1.5
matplotlib                  3.8.4
matplotlib_inline           0.1.6
mpl_toolkits                NA
natsort                     8.4.0
nbformat                    5.10.4
numba                       0.59.1
numpy                       1.26.4
overrides                   NA
packaging                   24.0
pandas                      2.2.2
parso                       0.8.4
patsy                       0.5.6
platformdirs                4.2.0
prometheus_client           NA
prompt_toolkit              3.0.43
psutil                      5.9.8
pure_eval                   0.2.2
pydev_ipython               NA
pydevconsole                NA
pydevd                      2.9.5
pydevd_file_utils           NA
pydevd_plugins              NA
pydevd_tracing              NA
pygments                    2.17.2
pyparsing                   3.1.2
pythonjsonlogger            NA
pytz                        2024.1
referencing                 NA
requests                    2.31.0
rfc3339_validator           0.1.4
rfc3986_validator           0.1.1
rpds                        NA
scipy                       1.13.0
seaborn                     0.13.2
send2trash                  NA
session_info                1.0.0
six                         1.16.0
sklearn                     1.4.1.post1
sniffio                     1.3.1
stack_data                  0.6.3
statsmodels                 0.14.1
texttable                   1.7.0
threadpoolctl               3.4.0
tornado                     6.4
traitlets                   5.14.2
uri_template                NA
urllib3                     2.2.1
wcwidth                     0.2.13
webcolors                   1.13
websocket                   1.7.0
yaml                        6.0.1
zmq                         25.1.2
-----
IPython             8.23.0
jupyter_client      8.6.1
jupyter_core        5.7.2
jupyterlab          4.1.6
-----
Python 3.12.2 | packaged by conda-forge | (main, Feb 16 2024, 20:50:58) [GCC 12.3.0]
Linux-5.14.0-362.8.1.el9_3.x86_64-x86_64-with-glibc2.34
-----
Session information updated at 2024-04-12 13:17
dn-ra added the Bug 🐛 label on Apr 12, 2024
ilan-gold self-assigned this on Apr 18, 2024

ilan-gold (Contributor) commented

We will start returning helpful errors for things we don't support, and let anything that currently works continue to work.
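
As an illustration of what a "helpful error" could look like, here is a minimal sketch of a type guard placed in front of the nonzero count. The helper name `count_nonzero_per_row` and the wording of the message are assumptions made for this example; this is not the change that was merged in #3048.

```python
import numpy as np


def count_nonzero_per_row(X):
    """Count nonzero entries per row, failing loudly on unsupported inputs.

    Hypothetical helper for illustration only; not part of scanpy.
    """
    if isinstance(X, np.ndarray):
        if X.ndim != 2:
            raise ValueError(f"expected a 2-D matrix, got {X.ndim} dimension(s)")
        return np.count_nonzero(X, axis=1)
    if hasattr(X, "getnnz"):  # scipy.sparse matrices expose getnnz(axis=...)
        return np.asarray(X.getnnz(axis=1))
    # Anything else (for example an on-disk wrapper) gets an explicit message
    # instead of an AxisError from deep inside NumPy.
    raise NotImplementedError(
        f"count_nonzero_per_row does not support {type(X).__name__}; "
        "load the data into memory first."
    )


dense = np.array([[0, 1, 2], [0, 0, 0]])
print(count_nonzero_per_row(dense))  # [2 0]
```

Inputs that already work (dense arrays and sparse matrices) follow the same code paths as before, while unsupported types fail early with a message that names the type and suggests a next step.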
