Failed violins #3005

Open
2 of 3 tasks
apodtele opened this issue Apr 12, 2024 · 3 comments
Assignees
Labels
Bug 🐛 Needs info❔ More information needed

Comments

@apodtele

Please make sure these conditions are met

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of scanpy.
  • (optional) I have confirmed this bug exists on the main branch of scanpy.

What happened?

[Screenshot: Untitled-1]

This was supposed to be a violin plot of total_counts. Notice that some cell categories have no data. This is by design: some categories are defined but not assigned to any samples here; they are assigned and used elsewhere. This completely breaks the violin plots, which work only if every category has at least some data. I like that the empty categories are still shown, but I would like to see the non-empty violins.

Minimal code sample

The following code adds extra categories that are not assigned to any cell:

import pandas as pd

# Full category list, including labels that no cell in this dataset carries
cell_type_order = ['B', 'B_mz', 'B_gro', 'B_pls', 'B_mem',
                   'Th', 'Th_reg', 'Th_mem', 'Tc', 'Tc_act', 'Tc_mem',
                   'NKT', 'NK_0', 'NK_1', 'NK_2',
                   'ncMo', 'cMo', 'DC_1', 'DC_2', 'MΦ_1', 'MΦ_2',
                   'Ne', 'RBC', 'PLT', 'HSC', 'Whatever', 'Whatnot', 'Unassigned', 'Huh?', 'What?']

# Re-encode the column with the full, ordered category list
adata.obs['cell_type'] = pd.Categorical(values=adata.obs.cell_type, categories=cell_type_order, ordered=True)
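
To make the situation concrete without a full dataset, here is a minimal pandas-only sketch of how declared-but-empty categories arise and how they can be listed. The `obs` dataframe and its labels are made-up stand-ins for `adata.obs`; nothing here comes from the reporter's actual data.

```python
import pandas as pd

# Hypothetical stand-in for adata.obs; the labels are invented for illustration.
obs = pd.DataFrame({"cell_type": ["B", "Th", "B", "NK_0", "Th"]})

# Category list that also declares labels no cell currently carries.
declared = ["B", "Th", "NK_0", "Whatever", "Unassigned"]
obs["cell_type"] = pd.Categorical(
    values=obs["cell_type"], categories=declared, ordered=True
)

# value_counts on a categorical column reports unused categories with a count of 0.
counts = obs["cell_type"].value_counts()
empty_categories = [c for c in declared if counts[c] == 0]
print(empty_categories)
```

Categories listed in `empty_categories` are the ones that would have no data to draw in a per-group plot.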


Error output

No response

Versions

scanpy==1.10.1 anndata==0.10.7 umap==0.5.5 numpy==1.26.4 scipy==1.13.0 pandas==2.2.2 scikit-learn==1.4.2 statsmodels==0.14.1 igraph==0.10.3 pynndescent==0.5.12
@eroell eroell self-assigned this Apr 18, 2024
@eroell
Contributor

eroell commented Apr 19, 2024

Hey, thanks for the request.

To be able to reproduce and help, it is a big aid for us if you can supply a code sample that we can run: that is, with some dummy data (the datasets scanpy readily supplies are great for that), and the error/unexpected behaviour you get.

I think in your case this would be e.g.

import scanpy as sc
adata = sc.datasets.pbmc68k_reduced()
adata.obs["louvain"] = adata.obs["louvain"].cat.set_categories(new_categories=["0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11"])
sc.pl.violin(adata, keys='n_counts', groupby='louvain')

Yielding

ValueError: The palette dictionary is missing keys: {'11'}

Is that the issue you are facing?
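
If the problem is indeed the unused categories, one possible workaround (not proposed anywhere in this thread, and an assumption on my part that it is sufficient for `sc.pl.violin`) is to drop the empty categories just before plotting and restore them afterwards. Note that this hides the empty categories on the plot, so it is only a partial answer to the request. The sketch below uses plain pandas and a made-up `groups` column rather than a real `adata.obs["louvain"]`.

```python
import pandas as pd

# Hypothetical grouping column with two declared-but-empty categories ("10", "11").
groups = pd.Series(
    pd.Categorical(["0", "1", "0", "2"], categories=["0", "1", "2", "10", "11"])
)

# Remember the full category list so it can be restored after plotting.
all_categories = groups.cat.categories

# Drop categories that have no observations; the values themselves are unchanged.
plot_groups = groups.cat.remove_unused_categories()
print(list(plot_groups.cat.categories))

# ... plotting with the reduced categorical would happen here ...

# Restore the full category list once plotting is done.
restored = plot_groups.cat.set_categories(all_categories)
```

With an AnnData object, the same two calls would be applied to the `groupby` column of `adata.obs`.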

@eroell eroell added the Needs info❔ More information needed label Apr 19, 2024
@apodtele
Author

I do not know why set_categories fails to add the new ones for you. Perhaps you need to add ordered=True. Notice that in my example I use a different method of adding additional categories, which works:

cell_type_order = ['1', '2', '3', 'Whatever', 'Whatnot', 'Huh?', 'What?']

adata.obs['cell_type'] = pd.Categorical(values=adata.obs.cell_type, categories=cell_type_order, ordered=True)

Then try to plot any violin plot.
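
For comparison, the two ways of adding categories can be checked side by side in plain pandas. The sketch below uses a made-up `labels` series (an assumption, not data from this thread) and suggests that the `pd.Categorical(...)` constructor and `.cat.set_categories(...)` end up with the same categories and values when given the same list and `ordered=True`.

```python
import pandas as pd

# Hypothetical label column; values and category names are illustrative only.
labels = pd.Series(["1", "2", "3", "1"], dtype="category")
new_categories = ["1", "2", "3", "Whatever", "Whatnot"]

# Route 1: rebuild the categorical through the constructor.
via_constructor = pd.Series(
    pd.Categorical(values=labels, categories=new_categories, ordered=True)
)

# Route 2: extend the categories on the existing categorical.
via_set_categories = labels.cat.set_categories(new_categories, ordered=True)

same_categories = list(via_constructor.cat.categories) == list(
    via_set_categories.cat.categories
)
same_values = list(via_constructor) == list(via_set_categories)
print(same_categories, same_values)
```

If both flags come out true on real data as well, the choice of method is unlikely to be what distinguishes the two reports.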

@eroell
Contributor

eroell commented Apr 19, 2024

To be able to reproduce and help, it is a big aid for us if you can supply a code sample that we can run: that is, with some dummy data (the datasets scanpy readily supplies are great for that), and the error/unexpected behaviour you get.

Can you show such an example, with data? It is not immediately clear to me what specifically you are trying to add or construct. I'm not sure whether the dataframe gets corrupted by the operation you intend to perform, or whether it is the violin plot that fails (if the dataframe is corrupted, that is what would need to be fixed).
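
One way to tell the two cases apart is to check whether re-encoding the column silently turned any labels into missing values: `pd.Categorical` maps every value that is absent from the `categories` list to NaN. The following is a minimal sketch with invented labels, not the reporter's data.

```python
import pandas as pd

# Hypothetical original labels; "RBC" is deliberately missing from the new list.
original = pd.Series(["B", "Th", "RBC", "B"])
new_categories = ["B", "Th", "Whatever"]

converted = pd.Series(
    pd.Categorical(values=original, categories=new_categories, ordered=True)
)

# Labels that existed before the conversion but are NaN afterwards were lost.
lost = original[converted.isna() & original.notna()]
print(lost.tolist())
```

An empty `lost` would point towards the plotting step, while a non-empty one would mean the column itself was damaged by the conversion.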
