Aquascope

This project is a successor to the C-WAP pipeline and is intended to process SARS-CoV-2 wastewater samples to determine relative variant abundance.

Introduction

CDCgov/aquascope is a bioinformatics best-practice pipeline for early detection of SARS-CoV-2 variants of concern in wastewater samples sequenced through shotgun metagenomic sequencing.

The pipeline is built using Nextflow, a workflow tool that runs tasks across multiple compute infrastructures in a portable manner. It uses Docker/Singularity containers, which makes installation straightforward and results highly reproducible.

Pipeline summary

  1. Read QC: FastQC
  2. Trimming reads: Fastp
  3. Aligning short reads: Minimap2
  4. Trimming aligned reads: iVar trim
  5. Variant classification: Freyja
  6. Present QC for raw reads: MultiQC

Quick Start

  1. Install Nextflow (>=21.04.0)

  2. Install any of Docker, Singularity, Podman, Shifter or Charliecloud for full pipeline reproducibility (please only use Conda as a last resort; see docs)

  3. Prepare the assets/samplesheet.csv. Refer to prepare-files (https://cdcgov.github.io/aquascope/).

  4. Prepare the configuration files:

    A. nextflow.config is prepared with default parameters; update as needed.
    B. test.config is prepared with default parameters; update as needed.
    C. cdc-dev.config is prepared for CDC users and contains the Rosalind cluster configuration.

  5. Run the pipeline with the profile that matches your environment:

    nextflow run main.nf -profile <docker/singularity/podman/shifter/charliecloud/conda/institute>
    

    A. The -profile test option runs the pipeline with the test parameters and samples; it covers Illumina test data only.

    • Please check nf-core/configs to see if a custom config file to run nf-core pipelines already exists for your Institute. If so, you can simply use -profile <institute> in your command. This will enable either docker or singularity and set the appropriate execution settings for your local compute environment. NOTE: CDC users can only use singularity on SciComp resources.
    • If you are using singularity then the pipeline will auto-detect this and attempt to download the Singularity images directly. If you are persistently observing issues downloading Singularity images directly due to timeout or network issues then please use the --singularity_pull_docker_container parameter to pull and convert the Docker image instead.
    • If you are using conda, it is highly recommended to use the NXF_CONDA_CACHEDIR or conda.cacheDir settings to store the environments in a central location for future pipeline runs.
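As a minimal sketch of the conda recommendation above, the cache location can be set through the NXF_CONDA_CACHEDIR environment variable before launching the pipeline. The directory used here ($HOME/nxf-conda-cache) is a hypothetical path chosen for illustration, not a location the pipeline requires; substitute any directory you can write to.

```shell
# Hypothetical central cache location (an assumption for this example);
# replace it with a shared directory that suits your system.
export NXF_CONDA_CACHEDIR="${HOME}/nxf-conda-cache"

# Create the directory up front so a missing path surfaces here rather than mid-run.
mkdir -p "${NXF_CONDA_CACHEDIR}"
```

With the variable exported in the same shell session, later runs using -profile conda should reuse the environments stored there instead of rebuilding them. Setting conda.cacheDir in a Nextflow config file is the equivalent configuration-based option.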

Documentation

For more detailed documentation, please visit our user-guides.

Contributions and Support

Aquascope was largely developed by OAMD's SciComp Team, with input from NWSS and the DCIPHER Team at Palantir. Detailed contributions can be found in our user-guides.

If you would like to contribute to this pipeline, please see the contributing guidelines.

For further information or help, don't hesitate to get in touch on the Slack #aquascope channel (you can join with this invite).

Citations

An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.