Nextclade_Batch

Quick Facts

| Workflow Type | Applicable Kingdom | Last Known Changes | Command-line Compatibility | Workflow Level | Dockstore |
|---|---|---|---|---|---|
| Phylogenetic Placement | Monkeypox virus, SARS-CoV-2, Viral | vX.X.X | Yes | Set-level | Nextclade_Batch_PHB |

Nextclade_Batch_PHB

Nextclade Batch uses Nextclade to rapidly call mutations, place samples on a reference phylogenetic tree, and genotype batches of samples. Phylogenetic placement is done by comparing the mutations of the query sequence (relative to the reference) with the mutations of every node and tip in the reference tree, and finding the node which has the most similar set of mutations. This operation is repeated for each query sequence until all of them are placed onto the tree. The workflow can run on Nextstrain-maintained Nextclade datasets that are downloaded by name (e.g. SARS-CoV-2, mpox, influenza A and B, HIV, and RSV-A and RSV-B) or on dataset files that are supplied manually.

Contact us if you need help generating your own mutation-annotated tree, or follow the instructions available on the Augur wiki.

Placement not construction

This workflow is not for building a tree from scratch, but rather for genotyping and placement of new sequences onto an existing high-quality input reference tree with representative samples on it. In effect, query samples are only compared to reference samples and never to the other query samples.
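
The placement rule described above (choose the reference node whose mutation set is most similar to the query's) can be illustrated with a minimal Python sketch. This is not Nextclade's actual implementation: the similarity measure (size of the symmetric difference between mutation sets), the function names, and the toy tree and mutation data are all assumptions made for illustration.

```python
# Illustrative sketch only; not how Nextclade is implemented internally.

def mutation_distance(query_muts, node_muts):
    """Count mutations found in one set but not in the other."""
    return len(query_muts ^ node_muts)


def place_query(query_muts, reference_nodes):
    """Return the name of the reference node most similar to the query."""
    return min(
        reference_nodes,
        key=lambda name: mutation_distance(query_muts, reference_nodes[name]),
    )


def place_batch(queries, reference_nodes):
    """Place every query independently of the other queries."""
    return {
        sample: place_query(muts, reference_nodes)
        for sample, muts in queries.items()
    }


# Toy reference tree: node name -> mutations relative to the reference.
reference_nodes = {
    "root": frozenset(),
    "internal_1": frozenset({"C241T"}),
    "tip_A": frozenset({"C241T", "A23403G"}),
    "tip_B": frozenset({"C241T", "G28881A"}),
}

# Toy query samples (hypothetical names and mutations).
queries = {
    "sample_1": {"C241T", "A23403G", "T100C"},
    "sample_2": {"G28881A", "C241T"},
}

placements = place_batch(queries, reference_nodes)
print(placements)
```

Because `place_batch` looks only at `reference_nodes`, adding or removing a query never changes where another query is placed, which is the "placement not construction" behaviour described above.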

Inputs

| Terra Task Name | Variable | Type | Description | Default Value | Terra Status |
|---|---|---|---|---|---|
| nextclade_batch | assembly_fastas | Array[File] | The assembly files for your samples in FASTA format | | Required |
| nextclade_batch | dataset_name | String | What nextclade dataset name to run nextclade on; some options are: "sars-cov-2", "flu_h1n1pdm_ha", "flu_h1n1pdm_na", "flu_h3n2_ha", "flu_h3n2_na", "flu_vic_ha", "flu_vic_na", "flu_yam_ha", "hMPXV", "hMPXV_B1", "MPXV", "rsv_a" and "rsv_b" | | Required |
| nextclade_batch | dataset_tag | String | nextclade dataset tag | Uses the dataset tag associated with the nextclade docker image version | Optional |
| nextclade_batch | gene_annotations_gff | File | A genome annotations file for codon-aware alignment, gene translation and calling of amino acid mutations | Uses the genome annotation associated with the nextclade dataset name | Optional |
| nextclade_batch | input_ref | File | An optional FASTA file containing reference sequence. This file should contain exactly 1 sequence | Uses the reference fasta associated with the specified nextclade dataset name | Optional |
| nextclade_batch | pathogen_json | File | An optional pathogen JSON file containing configuration and data specific to a pathogen | Uses the reference pathogen JSON file associated with the specified nextclade dataset name | Optional |
| nextclade_batch | reference_tree_json | File | An optional phylogenetic reference tree file which serves as a target for phylogenetic placement | Uses the reference tree associated with the specified nextclade dataset name | Optional |
| nextclade_v3_set | cpu | Int | Number of CPUs to allocate to the task | 2 | Optional |
| nextclade_v3_set | disk_size | Int | Amount of storage (in GB) to allocate to the task | 100 | Optional |
| nextclade_v3_set | docker | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/nextstrain/nextclade:3.14.5 | Optional |
| nextclade_v3_set | memory | Int | Amount of memory/RAM (in GB) to allocate to the task | 4 | Optional |
| nextclade_v3_set | verbosity | String | Set the nextclade output verbosity level. Options: off, error, warn, info, debug, trace | warn | Optional |
| version_capture | docker | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/theiagen/alpine-plus-bash:3.20.0 | Optional |
| version_capture | timezone | String | Set the time zone to get an accurate date of analysis (uses UTC by default) | | Optional |
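
For command-line runs, the inputs above are typically passed as a JSON file. The sketch below builds such a file in Python. It assumes the workflow is named `nextclade_batch` and that keys follow the usual WDL `<workflow>.<variable>` and `<workflow>.<call>.<variable>` convention, with the call named as in the Terra Task Name column; the helper `build_inputs` and the FASTA paths are hypothetical, so check the keys against the workflow source before use.

```python
import json

WORKFLOW = "nextclade_batch"  # assumed workflow name


def build_inputs(assembly_fastas, dataset_name, optional=None):
    """Build a WDL-style inputs dictionary for the two required inputs
    plus any optional overrides (keys given relative to the workflow)."""
    if not assembly_fastas:
        raise ValueError("assembly_fastas must contain at least one FASTA path")
    inputs = {
        f"{WORKFLOW}.assembly_fastas": list(assembly_fastas),
        f"{WORKFLOW}.dataset_name": dataset_name,
    }
    for name, value in (optional or {}).items():
        inputs[f"{WORKFLOW}.{name}"] = value
    return inputs


inputs = build_inputs(
    ["sample_1.fasta", "sample_2.fasta"],  # hypothetical file paths
    "sars-cov-2",
    # Assumed key layout for a task-level override (default memory is 4 GB).
    optional={"nextclade_v3_set.memory": 8},
)

print(json.dumps(inputs, indent=2))
```

Only `assembly_fastas` and `dataset_name` are required; every other input falls back to the default listed in the table when omitted.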

Outputs

| Variable | Type | Description |
|---|---|---|
| nextclade_batch_analysis_date | String | Date of analysis |
| nextclade_batch_auspice_json | File | Phylogenetic tree with user placed samples |
| nextclade_batch_nextclade_docker | String | Nextclade docker image used |
| nextclade_batch_nextclade_json | File | JSON file with the results of the Nextclade analysis |
| nextclade_batch_nextclade_tsv | File | Tab-delimited file with Nextclade results |
| nextclade_batch_nextclade_version | String | Nextclade version used |
| nextclade_batch_version | String | Version of the Public Health Bioinformatics (PHB) repository used |
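
As one way to work with the tab-delimited output (`nextclade_batch_nextclade_tsv`), the sketch below tallies how many samples were assigned to each clade. It assumes the file has a header row with `seqName` and `clade` columns; the exact columns depend on the dataset and Nextclade version, and the inline example rows and clade labels are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_clades(tsv_text, clade_column="clade"):
    """Count samples per clade in tab-delimited Nextclade-style results."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return Counter(row[clade_column] for row in reader)


# Made-up example rows standing in for the real TSV output.
example_tsv = (
    "seqName\tclade\n"
    "sample_1\tclade_X\n"
    "sample_2\tclade_Y\n"
    "sample_3\tclade_X\n"
)

clade_counts = count_clades(example_tsv)
for clade, n in clade_counts.most_common():
    print(f"{clade}\t{n}")
```

To run this on a real result, read the downloaded TSV into a string (for example with `open(path, newline="").read()`) and pass it to `count_clades`.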