🚄 vitessce-data

Utils to pre-process data for Vitessce.

Sample datasets come from:

Codeluppi et al.: Spatial organization of the somatosensory cortex revealed by cyclic smFISH
Dries et al.: Giotto, a pipeline for integrative analysis and visualization of single-cell spatial transcriptomic data
Wang et al.: Multiplexed imaging of high density libraries of RNAs with MERFISH and expansion microscopy
Cao et al.: The single-cell transcriptional landscape of mammalian organogenesis

JSON is our current target format because JavaScript reads it natively, and it is not so inefficient as to cause problems with storage or processing. For example, the mRNA HDF5 file is 30M, and the same data as JSON is still only 37M.

Install

Set up the vitessce-data environment using conda:

conda env create -f environment.yml

Users may also install the dependencies with pip:

pip install -r requirements.txt

Develop and run

conda activate vitessce-data

# To update with new packages:
conda env update --file environment.yml --prune

test.sh exercises all the scripts, using the fixtures in fake-files/, and errors if the output is not what is expected.
process.sh downloads full data from the internet, caches these input files in big-files/input, processes them, caches the output in big-files/output, and pushes to S3.

process.sh performs only the work that is necessary. To regenerate just a portion of the data, delete the files in big-files/output that need to be replaced.
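
The skip-if-cached behaviour described above can be sketched as follows. This is an illustrative pattern only, not the contents of process.sh: the function name process_if_missing, the OUTPUT_DIR variable, and the example.json file are hypothetical, and a temporary directory stands in for big-files/output.

```shell
#!/usr/bin/env bash
set -o errexit -o nounset -o pipefail

# Hypothetical stand-in for big-files/output; defaults to a scratch directory.
OUTPUT_DIR="${OUTPUT_DIR:-$(mktemp -d)}"

# Run a command to produce a file only if that file is not already cached.
# Usage: process_if_missing <output-file> <command> [args...]
process_if_missing() {
  local out="$1"
  shift
  if [ -e "$out" ]; then
    echo "Skipping $out: already exists"
  else
    echo "Generating $out"
    # Write to a temporary name first, so an interrupted run does not
    # leave a partial file that would later be mistaken for a cached result.
    "$@" > "$out.tmp"
    mv "$out.tmp" "$out"
  fi
}

# The first call generates the file; the second is skipped because it exists.
process_if_missing "$OUTPUT_DIR/example.json" echo '{"cells": []}'
process_if_missing "$OUTPUT_DIR/example.json" echo '{"cells": ["changed"]}'
```

Deleting the cached file and rerunning regenerates it, which mirrors the advice to delete only the outputs that need to be replaced.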

Configure AWS and Google Cloud CLIs

Install the aws CLI and add it to your PATH (reference).

Install gcloud and gsutil and add them to your PATH (reference).

Configure the AWS CLI by setting AWS environment variables (reference) or running aws configure (reference).

Configure the Google Cloud CLI by running gcloud auth login (reference).
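
Before running process.sh, it may help to confirm that the setup above took effect. The following is a minimal sketch, not part of this repository: the helper name missing_tools is hypothetical. It checks only that the three CLIs are on the PATH and whether the standard AWS credential variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are present in the environment; it does not verify that the credentials are valid.

```shell
#!/usr/bin/env bash

# Print the name of every listed tool that is not on the PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
    fi
  done
}

missing="$(missing_tools aws gcloud gsutil)"
if [ -n "$missing" ]; then
  # Unquoted on purpose so the names print on one line.
  echo "Not found on PATH:" $missing
else
  echo "aws, gcloud and gsutil are all on the PATH"
fi

# The AWS CLI picks up credentials from these variables when they are set;
# the files written by `aws configure` are one of the other sources it checks.
if [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
  echo "AWS credentials are set in the environment"
else
  echo "AWS credentials are not set in the environment; export them or run 'aws configure'"
fi
```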

Creating a new release

Update the contents of cloud_target.txt to bump the version number. Then update the version where it is referenced in test fixtures in the fake-files/ directory.
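
The two steps above can be sketched as a single find-and-replace. This is an assumption-laden illustration rather than the project's release procedure: the contents of cloud_target.txt, the fixture file, and the version numbers below are all hypothetical, and the sketch runs in a scratch directory so that it does not touch a real checkout.

```shell
#!/usr/bin/env bash
set -o errexit -o nounset -o pipefail

# Work in a scratch directory with made-up files.
cd "$(mktemp -d)"
mkdir fake-files
# Hypothetical contents: the real format of cloud_target.txt may differ.
echo "example-target/0.0.1" > cloud_target.txt
echo '{"url": "https://example.com/example-target/0.0.1/cells.json"}' > fake-files/fixture.json

OLD="0.0.1"
NEW="0.0.2"

# Escape the dots so that sed matches them literally.
OLD_PATTERN="$(printf '%s' "$OLD" | sed 's/\./\\./g')"

# List every file that mentions the old version before changing anything.
# -r recurses, -l prints file names only, -F treats the version as a fixed string.
files="$(grep -rlF "$OLD" cloud_target.txt fake-files || true)"

# Rewrite each file through a temporary copy (avoids non-portable `sed -i`).
while IFS= read -r file; do
  [ -n "$file" ] || continue
  sed "s/$OLD_PATTERN/$NEW/g" "$file" > "$file.tmp"
  mv "$file.tmp" "$file"
done <<< "$files"

# Any remaining mention of the old version is a reference that was missed.
if grep -rF "$OLD" cloud_target.txt fake-files; then
  echo "Old version still referenced" >&2
  exit 1
fi
```

Running test.sh afterwards is a reasonable way to confirm that the fixtures and the new version agree.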

