feat: add Orthanq candidates hla and virus wrappers #2640

Open
wants to merge 21 commits into
base: master

huzuner commented Feb 4, 2024

QC

  • I confirm that:

For all wrappers added by this PR,

  • there is a test case which covers any introduced changes,
  • input: and output: file paths in the resulting rule can be changed arbitrarily,
  • either the wrapper can only use a single core, or the example rule contains a threads: x statement with x being a reasonable default,
  • rule names in the test case are in snake_case and either describe what the rule does or match the tool's purpose or name (e.g., map_reads for a step that maps reads),
  • all environment.yaml specifications follow the respective best practices,
  • the environment.yaml pinning has been updated by running snakedeploy pin-conda-envs environment.yaml on a linux machine,
  • wherever possible, command line arguments are inferred and set automatically (e.g. based on file extensions in input: or output:),
  • all fields of the example rules in the Snakefiles and their entries are explained via comments (input:/output:/params: etc.),
  • stderr and/or stdout are logged correctly (log:), depending on the wrapped tool,
  • temporary files are either written to a unique hidden folder in the working directory, or (better) stored where the Python function tempfile.gettempdir() points to (see here; this also means that using any Python tempfile default behavior works),
  • the meta.yaml contains a link to the documentation of the respective tool or command,
  • Snakefiles pass the linting (snakemake --lint),
  • Snakefiles are formatted with snakefmt,
  • Python wrapper scripts are formatted with black.
  • Conda environments use a minimal number of channels, in the recommended order. E.g. for bioconda, use conda-forge, bioconda, nodefaults (conda-forge should have the highest priority, and the defaults channel is usually not needed because most packages are in conda-forge nowadays).

huzuner changed the title from "Add Orthanq wrapper" to "feat: add Orthanq wrapper" Feb 4, 2024
huzuner changed the title from "feat: add Orthanq wrapper" to "feat: add Orthanq candidates hla wrapper" Feb 4, 2024
huzuner changed the title from "feat: add Orthanq candidates hla wrapper" to "feat: add Orthanq candidates hla and virus wrappers" Feb 5, 2024
@@ -0,0 +1,7 @@
rule orthanq_candidates_virus:
    output:
        directory("viral_candidates"),

Collaborator

What files are created in this directory? Would it be possible to name them?

Contributor Author

Many files are created in this folder, and it's not possible to name them all here. That was one reason I implemented the tool's output as a folder. If the tool's output improves in this respect, I'll update it here as well :)
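
One way to answer the reviewer's question is to list what the tool actually writes into the directory after a test run. A minimal sketch, assuming the example rule has already produced `viral_candidates`; the helper name `list_outputs` is hypothetical and not part of the wrapper:

```python
from pathlib import Path


def list_outputs(directory):
    """Return the sorted relative paths of all regular files under directory."""
    root = Path(directory)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


if __name__ == "__main__":
    # "viral_candidates" is the directory() output of the example rule above.
    for name in list_outputs("viral_candidates"):
        print(name)
```

If some of the listed files turn out to have stable names, they could be declared as named outputs instead of (or in addition to) the directory.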

@fgvieira
Collaborator

I don't have any experience with orthanq but, from the command line, it seems that it could make more sense to have candidates/hla and candidates/virus, no?

It seems candidates/hla uses a temp folder; can you use the tmpdir resource?

@huzuner
Contributor Author

huzuner commented Mar 21, 2024

> I don't have any experience with orthanq but, from the command line, it seems that it could make more sense to have candidates/hla and candidates/virus, no?
>
> It seems candidates/hla uses a temp folder; can you use the tmpdir resource?

Yes, that could make sense as well. But I was thinking of putting everything related to each application in one folder, to make usage easier to navigate. The commands are therefore as follows:

orthanq candidates hla
orthanq preprocess hla
orthanq call hla

and

orthanq candidates virus
orthanq preprocess virus
orthanq call virus

Also, could you elaborate on the temp folder usage?

Thank you for all the reviewing :)

@fgvieira
Collaborator

fgvieira commented May 10, 2024

From what I can see in the docs, it seems that the tool has three subcommands and hla/virus is an option. If so, I'd organize the wrappers the same way as, e.g., samtools (orthanq/candidates, orthanq/preprocess, and orthanq/call), with hla/virus as a parameter. However, since orthanq is developed by @johanneskoester's lab, I think it would be nice to hear from him.
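
The layout suggested here, with one wrapper per subcommand and hla/virus as a parameter, could be sketched as follows. This is only an illustration: the subcommand and application names are the ones listed earlier in this thread, while the function name `build_orthanq_command` and the `extra` argument are hypothetical and no real Orthanq flags are assumed:

```python
# Subcommands and applications as listed in this conversation.
SUBCOMMANDS = {"candidates", "preprocess", "call"}
APPLICATIONS = {"hla", "virus"}


def build_orthanq_command(subcommand, application, extra=""):
    """Assemble 'orthanq <subcommand> <application> [extra]' after validation."""
    if subcommand not in SUBCOMMANDS:
        raise ValueError(f"unknown subcommand: {subcommand!r}")
    if application not in APPLICATIONS:
        raise ValueError(f"unknown application: {application!r}")
    parts = ["orthanq", subcommand, application]
    if extra:
        # extra stands in for whatever additional arguments the user passes.
        parts.append(extra)
    return " ".join(parts)


if __name__ == "__main__":
    print(build_orthanq_command("candidates", "virus"))
```

In a wrapper, the application would presumably come from a rule parameter, so a single orthanq/candidates wrapper could serve both the hla and the virus case.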

If the program supports a temp folder, then it would be nice if the wrappers would take care of it. For example, as in:

with tempfile.TemporaryDirectory() as tmpdir:
    tmp_prefix = Path(tmpdir) / "samtools_sort"
    shell(
        "samtools sort {samtools_opts} -m {mem_per_thread_mb}M {extra} -T {tmp_prefix} {snakemake.input[0]} {log}"
    )
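
The same pattern could carry over to an Orthanq wrapper. A minimal, tool-agnostic sketch, assuming the command accepts a temporary directory somewhere on its command line; the `{tmpdir}` placeholder and the helper `run_in_tmpdir` are hypothetical, and the actual Orthanq option for this is not shown in this thread:

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_tmpdir(command_template):
    """Run a command with "{tmpdir}" replaced by a fresh temporary directory.

    The directory is created under tempfile.gettempdir() and is removed when
    the with-block exits, whether or not the command succeeds.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        command = [arg.replace("{tmpdir}", tmpdir) for arg in command_template]
        subprocess.run(command, check=True)
        return Path(tmpdir)


if __name__ == "__main__":
    # Stand-in command: the Python interpreter checks that the directory exists.
    used = run_in_tmpdir(
        [sys.executable, "-c", "import os, sys; assert os.path.isdir(sys.argv[1])", "{tmpdir}"]
    )
    print(used.exists())
```

Because the directory is created with Python's tempfile defaults, it ends up wherever tempfile.gettempdir() points, which is what the checklist in the PR description asks for.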

2 participants