web-based nextclade issue when using another reference #1364

Open
yl315504 opened this issue Dec 20, 2023 · 8 comments
Labels: t:ask (Type: question, request of information)

Comments

@yl315504

Hello,

I was trying to use a mouse covid reference, called MA10.
https://www.ncbi.nlm.nih.gov/nuccore/1898953378

I uploaded the FASTA file as the new reference. Is a FASTA file the right format for the reference?

[screenshot]

I got the following error. Could you please help?

Thanks,

Error message: Error: When initializing Nextclade runner: When parsing reference tree Auspice JSON v2: When parsing Auspice Tree JSON contents: When parsing JSON: expected value at line 1 column 1

Nextclade version 2.14.1 (commit: 85e00e8, branch: release)

Memory available: 3586 MBytes

User agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36

Browser details: {"browser":{"name":"Chrome","version":"119.0.0.0"},"os":{"name":"Windows","version":"NT 10.0","versionName":"10"},"platform":{"type":"desktop"},"engine":{"name":"Blink"}}

Call stack:

Error: When initializing Nextclade runner: When parsing reference tree Auspice JSON v2: When parsing Auspice Tree JSON contents: When parsing JSON: expected value at line 1 column 1
at M (https://clades.nextstrain.org/_next/static/chunks/68.daad1778960b5734.js:1:15774)
at https://clades.nextstrain.org/_next/static/wasm/cf835b4f9fbcf56c.wasm:wasm-function[1547]:0x1e3d5c
at new a (https://clades.nextstrain.org/_next/static/chunks/68.daad1778960b5734.js:1:9424)
at https://clades.nextstrain.org/_next/static/chunks/68.daad1778960b5734.js:1:24056
at u (https://clades.nextstrain.org/_next/static/chunks/444.170c49d0571b2ba1.js:1:28086)
at Generator._invoke (https://clades.nextstrain.org/_next/static/chunks/444.170c49d0571b2ba1.js:1:29376)
at a. [as next] (https://clades.nextstrain.org/_next/static/chunks/444.170c49d0571b2ba1.js:1:28489)
at p (https://clades.nextstrain.org/_next/static/chunks/68.daad1778960b5734.js:1:21841)
at d (https://clades.nextstrain.org/_next/static/chunks/68.daad1778960b5734.js:1:22038)
at https://clades.nextstrain.org/_next/static/chunks/68.daad1778960b5734.js:1:22097

yl315504 added the labels good first issue, help wanted, needs triage, and t:bug on Dec 20, 2023
@corneliusroemer
Member

Hi @yl315504! Unfortunately, you can't just swap out the reference when the dataset contains a tree that is based on a particular reference. A different genemap might also cause issues, so this isn't as simple as plugging in your own reference.

What's the reason you'd like to use this particular reference? There might be another way to achieve your goal.

If you just want to align against that reference, you can use nextalign CLI.

corneliusroemer added the label t:ask and removed the labels t:bug, good first issue, help wanted, and needs triage on Dec 20, 2023
@ivan-aksamentov
Member

ivan-aksamentov commented Dec 20, 2023

Dear @yl315504,

In the screenshot, I see a red circle with "1" inside it, near "Customize dataset files". This means you've provided some custom dataset files under that section (the section can be opened and closed).

From the error, saying that Nextclade failed to parse Auspice JSON, I hypothesize that you've provided a reference tree file there which is not a correctly formatted Auspice JSON tree file. This means that you either need to provide a correctly formatted reference tree file or to remove it (in which case the default one will be used, from the SC2 dataset).
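
A quick local check can tell a FASTA file apart from an Auspice JSON tree before anything is uploaded. The sketch below is only illustrative: the function name `classify_dataset_file` is made up for this example, and it assumes that an Auspice v2 tree is a JSON object with top-level `meta` and `tree` keys. It does not validate the full schema. A FASTA file fed to a JSON parser fails on the very first character, which matches the "expected value at line 1 column 1" message above.

```python
import json
import sys
from pathlib import Path


def classify_dataset_file(path):
    """Guess whether a file looks like FASTA, an Auspice v2 tree JSON, or neither.

    Assumption: an Auspice v2 tree is a JSON object with "meta" and "tree" keys.
    """
    text = Path(path).read_text(encoding="utf-8")

    # FASTA records start with a ">" header line.
    if text.lstrip().startswith(">"):
        return "fasta"

    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        # Report where parsing failed, similar to the error in this issue.
        return f"not JSON ({err.msg} at line {err.lineno} column {err.colno})"

    if isinstance(data, dict) and "meta" in data and "tree" in data:
        return "auspice-json"
    return "json, but missing 'meta' or 'tree'"


if __name__ == "__main__":
    # Usage: python check_files.py reference.fasta tree.json
    for file_path in sys.argv[1:]:
        print(file_path, "->", classify_dataset_file(file_path))
```

If a file destined for the reference tree slot comes back as `fasta`, it probably belongs in the reference sequence slot instead.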

The part you've encircled with a red line is the list of files which Nextclade will be analyzing (we call them "query sequences").

Note that each dataset is tailored to a particular pathogen strain. In general, you cannot just swap one component and expect everything to work as usual. For example, as a general rule, the reference sequence must be the root of the reference tree (there are workarounds, but they are quite advanced). So only slight customizations are possible. I don't believe analyzing viruses from different host organisms is possible with our human SC2 dataset, unless the strains are very close to the Wuhan strain or other pandemic strains. You can always create your own whole dataset, though.

So all this looks very confusing to me. I invite you to read the Nextclade documentation (the "Docs" link on the top panel). It's important to understand how Nextclade works and how to configure it before using it, and especially before using its advanced features. It's not the best documentation in the world, but it might be worth giving it a shot. If you still have questions after reading the documentation, we'll try to answer.

Please explain better what you are trying to achieve and we'll try to help.

@yl315504
Author

yl315504 commented Dec 20, 2023 via email

@corneliusroemer
Member

Thanks for explaining your goal! It's possible right now, but not easy. We might be able to implement something to make this easier; I'll break it out into a feature request.

Right now, I don't think we have the option to start with a "blank" dataset and add your own files onto it.

Currently, it's a bit tricky to do this. One way to achieve this would be to create your own dataset: https://docs.nextstrain.org/projects/nextclade/en/stable/user/datasets.html

You might be best off running Nextclade v3 CLI like this:

nextclade3 run --input-ref mouse-seq.fasta --output-tsv results.tsv input.fasta

This is much easier than creating a dataset. You could look at the output tsv in Excel and see whether there are any mutations.
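
As an alternative to Excel, the TSV can be read with a few lines of Python. This is a minimal sketch, not an official recipe: the helper name `summarize_mutations` is hypothetical, and it assumes the output has a `seqName` column and a comma-separated `substitutions` column. Check the header line of your own `results.tsv` before relying on those names.

```python
import csv
import sys


def summarize_mutations(tsv_path):
    """Return a list of (sequence name, list of nucleotide substitutions).

    Assumption: the TSV has "seqName" and "substitutions" columns, and
    substitutions are comma-separated (for example "C241T,A23403G").
    """
    summary = []
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            raw = row.get("substitutions") or ""
            substitutions = [item for item in raw.split(",") if item]
            summary.append((row.get("seqName", ""), substitutions))
    return summary


if __name__ == "__main__":
    # Usage: python summarize.py results.tsv
    for tsv_file in sys.argv[1:]:
        for name, substitutions in summarize_mutations(tsv_file):
            print(f"{name}: {len(substitutions)} substitution(s)", *substitutions)
```

A sequence identical to the reference shows up with an empty list, which makes it easy to spot samples that carry no mutations relative to the custom reference.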

I'll think about how to make things easier for this use case. Thanks for reaching out in any case, it's always useful to know what end users struggle with and what their use cases are!

@yl315504
Author

yl315504 commented Dec 20, 2023 via email

@corneliusroemer
Member

@yl315504 Nextclade v3 is not yet conda-installable, so you need to download the binary from here: https://github.com/nextstrain/nextclade/releases/tag/3.0.0-alpha.1

However, as an early Christmas present for you (I hope), I just implemented a basic dataset that you can drop your own reference into (the thing I suggested earlier we could possibly do).

This is very experimental and might not work for very long, but you can give it a try here:
https://nextclade-git-cds-error-nextstrain.vercel.app/?dataset-server=gh:@scratch@&dataset-name=nextstrain/scratch/reference-only
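
To see which parts of that link select the dataset, the query string can be split with Python's standard library. This is only a sketch for reading the URL above; the helper name `dataset_params` is made up, and nothing here is specific to Nextclade beyond the two parameter names that appear in the link.

```python
from urllib.parse import parse_qs, urlsplit


def dataset_params(url):
    """Return the query parameters of a URL as a plain dict (first value only)."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items()}


if __name__ == "__main__":
    link = (
        "https://nextclade-git-cds-error-nextstrain.vercel.app/"
        "?dataset-server=gh:@scratch@"
        "&dataset-name=nextstrain/scratch/reference-only"
    )
    for key, value in dataset_params(link).items():
        print(f"{key} = {value}")
```

The link carries two parameters, `dataset-server` and `dataset-name`, so the rest of the address is just the preview deployment of the web app.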

You need to "customize" the dataset as shown in the video, but otherwise it should work now (in contrast to your earlier attempts!):
[video: 2023-12-21 00 03 16]

Would love to know if this works and if it does what you need it to do.

@yl315504
Author

yl315504 commented Dec 21, 2023 via email

@corneliusroemer
Member

Excellent! Right now the URL you need to use is quite ugly; we might make this a normal dataset so it's easier to select in the future.

If you have other ideas/use cases/requests, let us know and we can see whether it's possible.
