gatk variantrecalibrator wrapper does not specify properly the resources path #280

Open
alejocrojo09 opened this issue Jan 12, 2021 · 1 comment
Labels
bug Something isn't working

alejocrojo09 commented Jan 12, 2021

Snakemake version
Snakemake v5.31.1
Wrapper 0.68.0/bio/gatk/variantrecalibrator

Describe the bug
For VariantRecalibrator, I supply my own paths to the resource VCF files. When I run the rule, GATK fails with an error saying that the resource does not exist. Looking at the wrapper's code, the problem seems to be in how it formats the resource paths: the "--resource" value is written as "name,tags:path", but as of GATK v4.1.1 the ":" between the resource tags and the file path is no longer used, according to:
https://gatk.broadinstitute.org/hc/en-us/community/posts/360072126112-Variant-Recalibrator-Couldn-t-read-file

Logs
Snakemake
RuleException:
CalledProcessError in line 52 of /home/VITO/correara/genomics/rules/filtering.smk:
Command 'source /home/VITO/correara/miniconda3/bin/activate '/home/VITO/correara/genomics/.snakemake/conda/d509627e'; set -euo pipefail; python /home/VITO/correara/genomics/.snakemake/scripts/tmpz9cha92t.wrapper.py' returned non-zero exit status 1.
File "/home/VITO/correara/miniconda3/envs/BioSnake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 2317, in run_wrapper
File "/home/VITO/correara/genomics/rules/filtering.smk", line 52, in __rule_recalibrate_calls
File "/home/VITO/correara/miniconda3/envs/BioSnake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 566, in _callback
File "/home/VITO/correara/miniconda3/envs/BioSnake/lib/python3.9/concurrent/futures/thread.py", line 52, in run
File "/home/VITO/correara/miniconda3/envs/BioSnake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 552, in cached_or_run
File "/home/VITO/correara/miniconda3/envs/BioSnake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 2348, in run_wrapper

GATK
Using GATK jar /home/VITO/correara/genomics/.snakemake/conda/d509627e/share/gatk4-4.1.4.1-1/gatk-package-4.1.4.1-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/VITO/correara/genomics/.snakemake/conda/d509627e/share/gatk4-4.1.4.1-1/gatk-package-4.1.4.1-local.jar VariantRecalibrator --max-gaussians 4 --resource hapmap,known=false,training=true,truth=true,prior=15.0:/home/VITO/correara/genomics/hapmap/hapmap_3.3.hg38.vcf.gz --resource omni,known=false,training=true,truth=false,prior=12.0:/home/VITO/correara/genomics/omni/1000G_omni2.5.hg38.vcf.gz --resource g1k,known=false,training=true,truth=false,prior=10.0:/home/VITO/correara/genomics/g1k/1000G_phase1.snps.high_confidence.hg38.vcf.gz --resource dbsnp,known=true,training=false,truth=false,prior=2.0:/home/VITO/correara/genomics/dbsnp/hg38_dbsnp138.vcf.gz -R resources/hg38/hg38.fa -V filtered/ERR032031.indels.vcf.gz -mode INDEL --output filtered/ERR032031.indels.recalibrated.vcf.gz --tranches-file filtered/ERR032031.indels.tranches -an QD -an FS -an ReadPosRankSum -an MQRankSum -an SOR -an DP
13:58:07.143 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/VITO/correara/genomics/.snakemake/conda/d509627e/share/gatk4-4.1.4.1-1/gatk-package-4.1.4.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
Jan 09, 2021 1:58:07 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
13:58:07.421 INFO VariantRecalibrator - ------------------------------------------------------------
13:58:07.421 INFO VariantRecalibrator - The Genome Analysis Toolkit (GATK) v4.1.4.1
13:58:07.422 INFO VariantRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/
13:58:07.422 INFO VariantRecalibrator - Executing as correara@dev01 on Linux v4.4.0-198-generic amd64
13:58:07.422 INFO VariantRecalibrator - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_265-b11
13:58:07.422 INFO VariantRecalibrator - Start Date/Time: January 9, 2021 1:58:07 PM CET
13:58:07.422 INFO VariantRecalibrator - ------------------------------------------------------------
13:58:07.422 INFO VariantRecalibrator - ------------------------------------------------------------
13:58:07.422 INFO VariantRecalibrator - HTSJDK Version: 2.21.0
13:58:07.422 INFO VariantRecalibrator - Picard Version: 2.21.2
13:58:07.422 INFO VariantRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
13:58:07.422 INFO VariantRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
13:58:07.422 INFO VariantRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
13:58:07.423 INFO VariantRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
13:58:07.423 INFO VariantRecalibrator - Deflater: IntelDeflater
13:58:07.423 INFO VariantRecalibrator - Inflater: IntelInflater
13:58:07.423 INFO VariantRecalibrator - GCS max retries/reopens: 20
13:58:07.423 INFO VariantRecalibrator - Requester pays: disabled
13:58:07.423 INFO VariantRecalibrator - Initializing engine
13:58:08.124 INFO VariantRecalibrator - Shutting down engine
[January 9, 2021 1:58:08 PM CET] org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator done. Elapsed time: 0.02 minutes.
Runtime.totalMemory()=627572736


A USER ERROR has occurred: Couldn't read file file:///home/VITO/correara/genomics/hapmap,known=false,training=true,truth=true,prior=15.0:/home/VITO/correara/genomics/hapmap/hapmap_3.3.hg38.vcf.gz. Error was: It doesn't exist.


Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
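
The error message suggests that GATK 4.1.4.1 treated the entire value after "--resource" (tags, colon, and path) as a single file name resolved against the working directory. As a rough illustration, the sketch below converts one old-style value into the tagged form `--resource:<name>,<tags> <path>`. It assumes that tagged form is what this GATK version expects; the function name `to_tagged_resource` is hypothetical and is not part of the wrapper.

```python
def to_tagged_resource(old_value: str) -> list:
    """Split an old-style 'name,key=val,...:path' value into the
    assumed newer form: ['--resource:name,key=val,...', 'path']."""
    # Split on the first ':' only, so any later colon stays in the path.
    tags, sep, path = old_value.partition(":")
    if not sep or not tags or not path:
        raise ValueError(f"expected 'tags:path', got {old_value!r}")
    return [f"--resource:{tags}", path]


# Value taken from the command line in the log above.
old = (
    "hapmap,known=false,training=true,truth=true,prior=15.0:"
    "/home/VITO/correara/genomics/hapmap/hapmap_3.3.hg38.vcf.gz"
)
print(" ".join(to_tagged_resource(old)))
```

The tags move onto the argument name and the path becomes a separate token, so GATK no longer sees the tags as part of the file name.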

Minimal example

rule recalibrate_calls:
    input:
        vcf = "filtered/{sample}.{vartype}.vcf.gz",
        ref = "resources/hg38/hg38.fa",
        hapmap = "/home/VITO/correara/genomics/hapmap/hapmap_3.3.hg38.vcf.gz",
        omni = "/home/VITO/correara/genomics/omni/1000G_omni2.5.hg38.vcf.gz",
        g1k = "/home/VITO/correara/genomics/g1k/1000G_phase1.snps.high_confidence.hg38.vcf.gz",
        dbsnp = "/home/VITO/correara/genomics/dbsnp/hg38_dbsnp138.vcf.gz",
    output:
        vcf = temp("filtered/{sample}.{vartype}.recalibrated.vcf.gz"),
        tranches = "filtered/{sample}.{vartype}.tranches",
        rscript = "filtered/{sample}.{vartype}.recal.plots.R"
    params:
        mode = get_mode,
        resources = {
            "hapmap": {"known": False, "training": True, "truth": True, "prior": 15.0},
            "omni": {"known": False, "training": True, "truth": False, "prior": 12.0},
            "g1k": {"known": False, "training": True, "truth": False, "prior": 10.0},
            "dbsnp": {"known": True, "training": False, "truth": False, "prior": 2.0},
        },
        annotation = ["QD", "FS", "ReadPosRankSum", "MQRankSum", "SOR", "DP"],
        extra = get_gaussians
    log:
        "logs/gatk.{sample}.{vartype}.variantrecalibrator.log"
    wrapper:
        "0.68.0/bio/gatk/variantrecalibrator"

Could the wrapper be updated to use the current GATK syntax?
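
For illustration only, here is a minimal sketch of how a wrapper might turn a `resources` dictionary like the one in the rule above, plus a mapping of resource names to file paths, into arguments in the tagged form `--resource:<name>,<tags> <path>`. The function `format_resources` and its parameters are hypothetical, the tagged form is an assumption about what GATK 4.1.x accepts, and this is not the wrapper's actual code.

```python
def format_resources(resources: dict, paths: dict) -> list:
    """Build GATK-style resource arguments from per-resource tag
    dictionaries and a name -> file path mapping (both hypothetical)."""
    args = []
    for name, tags in resources.items():
        # Python booleans become lowercase 'true'/'false'; floats keep
        # their usual string form (15.0 -> '15.0').
        tag_str = ",".join(
            f"{key}={str(value).lower()}" for key, value in tags.items()
        )
        # Tags go on the argument name; the path is a separate token.
        args += [f"--resource:{name},{tag_str}", paths[name]]
    return args


resources = {
    "hapmap": {"known": False, "training": True, "truth": True, "prior": 15.0},
    "dbsnp": {"known": True, "training": False, "truth": False, "prior": 2.0},
}
paths = {
    "hapmap": "/home/VITO/correara/genomics/hapmap/hapmap_3.3.hg38.vcf.gz",
    "dbsnp": "/home/VITO/correara/genomics/dbsnp/hg38_dbsnp138.vcf.gz",
}
print(" ".join(format_resources(resources, paths)))
```

Returning a list of separate tokens rather than one pre-joined string keeps the tags and the path from being fused together again when the command line is assembled.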

alejocrojo09 added the bug label on Jan 12, 2021
fgvieira (Collaborator) commented Mar 2, 2022

GATK wrappers were all recently updated. Can you check if the error still persists?
