VADR_Update
Quick Facts
Workflow Type | Applicable Kingdom | Last Known Changes | Command-line Compatibility | Workflow Level | Dockstore |
---|---|---|---|---|---|
Genomic Characterization | HAV, Influenza, Monkeypox virus, RSV-A, RSV-B, SARS-CoV-2, Viral, WNV | vX.X.X | Yes | Sample-level | VADR_Update_PHB |
VADR_Update_PHB
VADR_Update_PHB is a standalone workflow dedicated to running VADR. By default, the workflow uses a slimmed-down docker image running VADR (v1.6.4), which requires models to be provided separately. The table below outlines the recommended models and VADR parameters for use in the workflow.
Organism | vadr_model_file | vadr_opts | max_length |
---|---|---|---|
sars-cov-2 | "gs://theiagen-public-resources-rp/reference_data/databases/vadr_models/vadr-models-sarscov2-1.3-2.tar.gz" | "--mkey sarscov2 --glsearch -s -r --nomisc --lowsim5seq 6 --lowsim3seq 6 --alt_fail lowscore,insertnn,deletinn --noseqnamemax --out_allfasta" | 30000 |
MPXV | "gs://theiagen-public-resources-rp/reference_data/databases/vadr_models/vadr-models-mpxv-1.4.2-1.tar.gz" | "--mkey mpxv --glsearch --minimap2 -s -r --nomisc --r_lowsimok --r_lowsimxd 100 --r_lowsimxl 2000 --alt_pass discontn,dupregin --s_overhang 150 --out_allfasta" | 210000 |
WNV | "gs://theiagen-public-resources-rp/reference_data/databases/vadr_models/vadr-models-flavi-1.2-1.tar.gz" | "--mkey flavi --nomisc --noprotid --out_allfasta" | 11000 |
flu | "gs://theiagen-public-resources-rp/reference_data/databases/vadr_models/vadr-models-flu-1.6.3-2.tar.gz" | "--mkey flu --atgonly --xnocomp --nomisc --alt_fail extrant5,extrant3" | 13500 |
rsv_a | "gs://theiagen-public-resources-rp/reference_data/databases/vadr_models/vadr-models-rsv-1.5-2.tar.gz" | "--mkey rsv --xnocomp -r" | 15500 |
rsv_b | "gs://theiagen-public-resources-rp/reference_data/databases/vadr_models/vadr-models-rsv-1.5-2.tar.gz" | "--mkey rsv --xnocomp -r" | 15500 |
measles | "gs://theiagen-public-resources-rp/reference_data/databases/vadr_models/vadr-models-mev-1.02.tar.gz" | "--mkey mev -r --indefclass 0.01" | 18000 |
mumps | "gs://theiagen-public-resources-rp/reference_data/databases/vadr_models/vadr-models-muv-1.01.tar.gz" | "--mkey muv -r --indefclass 0.025" | 18000 |
rubella | "gs://theiagen-public-resources-rp/reference_data/databases/vadr_models/vadr-models-ruv-1.01.tar.gz" | "--mkey ruv -r" | 10000 |
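
The recommended settings in the table above can be turned into a workflow inputs file programmatically. The following is a minimal sketch, not an official helper: it assumes the WDL workflow is named `vadr_update` (as suggested by the Terra task name column in the Inputs table), that the `max_length` column maps to the `vadr_max_length` input, and it uses a hypothetical `gs://my-bucket/sample1.fasta` path for the assembly. Only three organisms are copied from the table to keep the example short.

```python
import json

# Bucket prefix shared by all model files in the table above.
MODEL_DIR = "gs://theiagen-public-resources-rp/reference_data/databases/vadr_models"

# Recommended settings copied from the table above (subset shown).
RECOMMENDED = {
    "WNV": {
        "vadr_model_file": f"{MODEL_DIR}/vadr-models-flavi-1.2-1.tar.gz",
        "vadr_opts": "--mkey flavi --nomisc --noprotid --out_allfasta",
        "vadr_max_length": 11000,
    },
    "rsv_a": {
        "vadr_model_file": f"{MODEL_DIR}/vadr-models-rsv-1.5-2.tar.gz",
        "vadr_opts": "--mkey rsv --xnocomp -r",
        "vadr_max_length": 15500,
    },
    "rubella": {
        "vadr_model_file": f"{MODEL_DIR}/vadr-models-ruv-1.01.tar.gz",
        "vadr_opts": "--mkey ruv -r",
        "vadr_max_length": 10000,
    },
}


def build_inputs(organism, genome_fasta, workflow_name="vadr_update"):
    """Return a dict of '<workflow>.<input>' keys for one sample.

    The workflow_name default is an assumption; adjust it to match the
    workflow name in your WDL or Terra configuration.
    """
    params = RECOMMENDED[organism]
    inputs = {"genome_fasta": genome_fasta, "organism": organism, **params}
    return {f"{workflow_name}.{key}": value for key, value in inputs.items()}


if __name__ == "__main__":
    # Hypothetical sample path, for illustration only.
    print(json.dumps(build_inputs("WNV", "gs://my-bucket/sample1.fasta"), indent=2))
```

Keeping the three values together per organism avoids pairing a model file with options written for a different model key.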
Inputs
Please note the default values are for SARS-CoV-2.
This workflow runs on the sample level.
Terra Task Name | Variable | Type | Description | Default Value | Terra Status |
---|---|---|---|---|---|
vadr_update | genome_fasta | File | Consensus genome assembly | | Required |
consensus_qc | cpu | Int | Number of CPUs to allocate to the task | 1 | Optional |
consensus_qc | disk_size | Int | Amount of storage (in GB) to allocate to the task | 100 | Optional |
consensus_qc | docker | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/theiagen/utility:1.1 | Optional |
consensus_qc | genome_length | Int | Internal component, do not modify | | Optional |
consensus_qc | memory | Int | Amount of memory/RAM (in GB) to allocate to the task | 2 | Optional |
consensus_qc | reference_genome | File | Internal component, do not modify | | Optional |
organism_parameters | auspice_config | File | Internal component, do not modify | | Optional |
organism_parameters | clades_tsv | File | Internal component, do not modify | | Optional |
organism_parameters | flu_genoflu_genotype | String | Internal component, do not modify | N/A | Optional |
organism_parameters | flu_segment | String | Internal component, do not modify | N/A | Optional |
organism_parameters | flu_subtype | String | Internal component, do not modify | N/A | Optional |
organism_parameters | gene_locations_bed_file | File | Internal component, do not modify | | Optional |
organism_parameters | genome_length_input | Int | Internal component, do not modify | | Optional |
organism_parameters | hiv_primer_version | String | Internal component, do not modify | v1 | Optional |
organism_parameters | kraken_target_organism_input | String | Internal component, do not modify | | Optional |
organism_parameters | lat_longs_tsv | File | Internal component, do not modify | | Optional |
organism_parameters | min_date | Float | Internal component, do not modify | | Optional |
organism_parameters | min_num_unambig | Int | Internal component, do not modify | | Optional |
organism_parameters | narrow_bandwidth | Float | Internal component, do not modify | | Optional |
organism_parameters | nextclade_dataset_name_input | String | Internal component, do not modify | | Optional |
organism_parameters | nextclade_dataset_tag_input | String | Internal component, do not modify | | Optional |
organism_parameters | pangolin_docker_image | String | Internal component, do not modify | | Optional |
organism_parameters | pivot_interval | Int | Internal component, do not modify | | Optional |
organism_parameters | primer_bed_file | File | Internal component, do not modify | | Optional |
organism_parameters | proportion_wide | Float | Internal component, do not modify | | Optional |
organism_parameters | reference_genbank | File | Internal component, do not modify | | Optional |
organism_parameters | reference_genome | File | Internal component, do not modify | | Optional |
organism_parameters | reference_gff_file | File | Internal component, do not modify | | Optional |
vadr | cpu | Int | Number of CPUs to allocate to the task | 4 | Optional |
vadr | disk_size | Int | Amount of storage (in GB) to allocate to the task | 100 | Optional |
vadr | docker | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/staphb/vadr:1.6.4 | Optional |
vadr | min_length | Int | Minimum length subsequence to possibly replace Ns for the fasta-trim-terminal-ambigs.pl VADR script | 50 | Optional |
vadr_update | organism | String | Target organism for VADR | sars-cov-2 | Optional |
vadr_update | vadr_max_length | Int | Maximum length for the fasta-trim-terminal-ambigs.pl VADR script | 30000 | Optional |
vadr_update | vadr_memory | Int | Amount of memory/RAM (in GB) to allocate to the task | 16 | Optional |
vadr_update | vadr_model_file | File | Path to a tar + gzipped VADR model file | gs://theiagen-public-resources-rp/reference_data/databases/vadr_models/vadr-models-sarscov2-1.3-2.tar.gz | Optional |
vadr_update | vadr_opts | String | Options for the v-annotate.pl VADR script | "--noseqnamemax --glsearch -s -r --nomisc --mkey sarscov2 --lowsim5seq 6 --lowsim3seq 6 --alt_fail lowscore,insertnn,deletinn --out_allfasta" | Optional |
vadr_update | vadr_skip_length | Int | Minimum assembly length (unambiguous) to run VADR | 10000 | Optional |
version_capture | docker | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/theiagen/alpine-plus-bash:3.20.0 | Optional |
version_capture | timezone | String | Set the time zone to get an accurate date of analysis (uses UTC by default) | | Optional |
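
The `vadr_skip_length` input gates VADR on the number of unambiguous bases in the assembly. The sketch below illustrates that kind of check; it is not the workflow's own code. It assumes that "unambiguous" means A, C, G, or T (case-insensitive) and that an assembly exactly at the threshold is still run. Both function names are hypothetical.

```python
def count_unambiguous(fasta_text):
    """Count A/C/G/T bases across all sequences in FASTA-formatted text."""
    count = 0
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            continue  # skip header lines
        count += sum(1 for base in line.upper() if base in "ACGT")
    return count


def should_run_vadr(fasta_text, vadr_skip_length=10000):
    """Return True when the assembly has enough unambiguous bases.

    The default mirrors the vadr_skip_length default in the table above;
    treating the threshold as inclusive is an assumption.
    """
    return count_unambiguous(fasta_text) >= vadr_skip_length


if __name__ == "__main__":
    example = ">sample1\nACGTNNNNACGT\nacgtRYacgt\n"
    print(count_unambiguous(example), should_run_vadr(example, vadr_skip_length=16))
```

For organisms with short genomes, such as rubella with a recommended `max_length` of 10000, the default threshold of 10000 may be worth lowering so that complete assemblies are not skipped.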
Outputs
Variable | Type | Description |
---|---|---|
vadr_alerts_list | File | A file containing all of the fatal alerts as determined by VADR |
vadr_all_outputs_tar_gz | File | A .tar.gz file (gzip-compressed tar archive) containing all outputs from the VADR command v-annotate.pl. This file must be uncompressed and extracted to see the many files within. See https://github.com/ncbi/vadr/blob/master/documentation/formats.md#format-of-v-annotatepl-output-files for a more complete description of all files present within the archive. Useful when deeply investigating a sample's genome and annotations. |
vadr_classification_summary_file | File | Per-sequence tabular classification file. See https://github.com/ncbi/vadr/blob/master/documentation/formats.md#explanation-of-sqc-suffixed-output-files for a more complete description. |
vadr_docker | String | Docker image used to run VADR |
vadr_fastas_zip_archive | File | Zip archive containing all fasta files created during VADR analysis |
vadr_feature_tbl_fail | File | 5-column feature table output for failing sequences. See https://github.com/ncbi/vadr/blob/master/documentation/formats.md#format-of-v-annotatepl-output-files for a more complete description. |
vadr_feature_tbl_pass | File | 5-column feature table output for passing sequences. See https://github.com/ncbi/vadr/blob/master/documentation/formats.md#format-of-v-annotatepl-output-files for a more complete description. |
vadr_num_alerts | String | Number of fatal alerts as determined by VADR |
vadr_update_analysis_date | String | Date of analysis |
vadr_update_version | String | Version of the Public Health Bioinformatics (PHB) repository used |
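
Because `vadr_all_outputs_tar_gz` must be unpacked before its contents can be inspected, a short sketch using Python's standard `tarfile` module may help. It lists the files in the archive and reads one member as text without extracting anything to disk. The function names and the archive and member paths in the usage example are hypothetical; actual file names depend on the sample and on the VADR run.

```python
import tarfile


def list_vadr_outputs(archive_path):
    """Return the sorted names of regular files inside the .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return sorted(m.name for m in archive.getmembers() if m.isfile())


def read_member(archive_path, member_name):
    """Read one archive member as UTF-8 text without extracting to disk."""
    with tarfile.open(archive_path, "r:gz") as archive:
        handle = archive.extractfile(member_name)
        if handle is None:
            raise ValueError(f"{member_name} is not a regular file")
        return handle.read().decode("utf-8")
```

For example, `list_vadr_outputs("sample1_vadr_outputs.tar.gz")` (a hypothetical file name) returns the member paths, any of which can then be passed to `read_member` to view a single table.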