More complete dataset documentation #3051

Closed
flying-sheep opened this issue May 13, 2024 · 2 comments · Fixed by #3060

Comments

@flying-sheep
Member

flying-sheep commented May 13, 2024

Each dataset’s documentation should contain:

  1. what it contains (listing obs, …)
  2. what steps have been run on it
  3. better links (e.g. is pbmc68k_reduced this one? The docstring isn’t clear. It was added by @fidelram in “new ranked genes plotting functions” #228 …)

It is especially important to state whether its .X is logarithmized, normalized, and/or filtered.
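
A rough way to check the state of .X before writing it into a docstring is sketched below. This is a heuristic under assumptions, not scanpy functionality: `guess_x_state` is a hypothetical helper, it expects a dense cells × genes array (a sparse .X would need converting first), and its flags only hint at the processing history rather than prove it.

```python
import numpy as np


def guess_x_state(x) -> dict:
    """Return heuristic flags describing a dense cells x genes matrix.

    Hypothetical helper, not part of scanpy or anndata.
    """
    x = np.asarray(x, dtype=float)
    row_sums = x.sum(axis=1)
    return {
        # Raw counts are non-negative integers.
        "integer_valued": bool(np.all(x == np.round(x))),
        # Negative values suggest centering, e.g. after sc.pp.scale.
        "has_negative": bool((x < 0).any()),
        # Identical per-cell totals suggest total-count normalization
        # with no later log transform or scaling.
        "row_sums_equal": bool(np.allclose(row_sums, row_sums[0])),
        # A small maximum on non-integer data is consistent with log1p.
        "max_value": float(x.max()),
    }


# Made-up toy matrix, only to show the call.
counts = np.array([[1, 0, 3], [2, 2, 1]])
print(guess_x_state(counts))
```

For a real dataset the call would be something like `guess_x_state(adata.X)`, assuming .X is dense. None of the flags is conclusive alone, so the documented answer should still come from whoever created the dataset.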

See also: https://github.com/orgs/scverse/projects/18/views/1?pane=issue&itemId=62702062

cc @ilan-gold

@flying-sheep
Member Author

flying-sheep commented May 13, 2024

idea for semi-automating 1.: we could have a representation (ideally the new fancy HTML one) created and attached by CI
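
As a starting point for that, a plain-text summary could be generated from the object's attribute names. The sketch below is an assumption about how such a CI step might look, not an existing scanpy feature: `describe_dataset` is a hypothetical helper, and a `SimpleNamespace` with made-up values stands in for the AnnData object so the example runs without downloading a dataset. It does not use the HTML representation.

```python
from types import SimpleNamespace

# Attributes whose keys (or column names) are listed in the summary.
FIELDS = ("obs", "var", "uns", "obsm", "varm", "layers")


def describe_dataset(adata) -> str:
    """Build a short text summary of an AnnData-like object.

    Hypothetical helper. It relies only on iteration over each field
    yielding key or column names, as dicts and DataFrames do.
    """
    lines = [f"n_obs x n_vars = {adata.n_obs} x {adata.n_vars}"]
    for field in FIELDS:
        keys = [str(k) for k in getattr(adata, field, ())]
        if keys:  # skip empty or missing fields
            lines.append(f"{field}: {', '.join(keys)}")
    return "\n".join(lines)


# Stand-in object with made-up contents, not a real dataset.
demo = SimpleNamespace(
    n_obs=3,
    n_vars=2,
    obs={"cell_type": None, "batch": None},
    var={},
    uns={},
    obsm={"X_pca": None},
    varm={},
    layers={},
)
print(describe_dataset(demo))
```

A CI job could call such a function on each dataset loader's return value and write the result into the generated docs, so the listing cannot drift from the data.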

@flying-sheep
Member Author

flying-sheep commented May 13, 2024

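I think pbmc68k_reduced was processed something like this:

import scanpy as sc

sc.pp.normalize_total(adata, target_sum=1e6)  # scale each cell to 1e6 total counts
sc.pp.log1p(adata)  # natural log of (1 + x)
sc.pp.scale(adata)  # zero mean and unit variance per gene

Still no idea what’s in “raw”, as it’s clearly not counts …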
