Pangolin_Update
Quick Facts
Workflow Type | Applicable Kingdom | Last Known Changes | Command-line Compatibility | Workflow Level | Dockstore |
---|---|---|---|---|---|
Genomic Characterization | SARS-CoV-2, Viral | v3.0.1 | Yes | Sample-level | Pangolin_Update_PHB |
Pangolin_Update_PHB
The Pangolin_Update workflow re-runs Pangolin to update a sample's prior lineage call, made with one docker image, to the lineage call produced by an alternative docker image. The most common use case is bringing lineage calls up to date with the latest Pangolin nomenclature by using the latest available Pangolin docker image (found here).
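The comparison at the heart of this workflow can be illustrated with a short Python sketch. It is not the workflow's actual implementation: the function name `describe_update` and the message format are assumptions made for illustration, and the real `pangolin_updates` string may be formatted differently.

```python
from datetime import date


def describe_update(old_lineage: str, new_lineage: str, analysis_date: date) -> str:
    """Summarize whether re-running Pangolin changed a sample's lineage call.

    The message layout here is hypothetical; it only mirrors the idea of
    reporting "changed versus unchanged" together with the date of analysis.
    """
    if old_lineage == new_lineage:
        status = f"{new_lineage} unchanged"
    else:
        status = f"{old_lineage} to {new_lineage}"
    return f"{status} ({analysis_date.isoformat()})"


if __name__ == "__main__":
    # Illustrative values only, not real sample data.
    print(describe_update("B.1.1.529", "BA.1", date(2024, 1, 2)))
    print(describe_update("BA.1", "BA.1", date(2024, 1, 2)))
```

Keeping the old and new calls side by side in one string makes it easy to filter a data table for samples whose lineage actually changed.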
Inputs
This workflow runs on the sample level.
Terra Task Name | Variable | Type | Description | Default Value | Terra Status |
---|---|---|---|---|---|
pangolin_update | assembly_fasta | File | The assembly file for your sample in FASTA format | | Required |
pangolin_update | old_lineage | String | The Pangolin lineage previously assigned to the sample | | Required |
pangolin_update | old_pangolin_assignment_version | String | Version of the Pangolin software previously used for lineage assignment. | | Required |
pangolin_update | old_pangolin_docker | String | The Pangolin docker image previously used for lineage assignment. | | Required |
pangolin_update | old_pangolin_versions | String | All pangolin software and database versions previously used for lineage assignment. | | Required |
pangolin_update | samplename | String | The name of the sample being analyzed | | Required |
organism_parameters | auspice_config | File | Auspice config file for customizing visualizations in the Augur_PHB workflow; takes priority over the other customization values available for augur_export. Defaults are set for various organisms & flu segments. A minimal auspice config file is used when no organism is specified and the user does not provide an optional input config file. | | Optional |
organism_parameters | clades_tsv | File | Internal component, do not modify | | Optional |
organism_parameters | flu_genoflu_genotype | String | Internal component, do not modify | N/A | Optional |
organism_parameters | lat_longs_tsv | File | Internal component, do not modify | | Optional |
organism_parameters | min_date | Float | Internal component, do not modify | | Optional |
organism_parameters | min_num_unambig | Int | Minimum number of called bases in genome to pass prefilter | Defaults are organism-specific. Please find default values for all organisms (and for Flu - their respective genome segments and subtypes) here: https://github.com/theiagen/public_health_bioinformatics/blob/main/workflows/utilities/wf_organism_parameters.wdl. For an organism without set defaults, the default value is 0 | Optional |
organism_parameters | narrow_bandwidth | Float | Internal component, do not modify | | Optional |
organism_parameters | pivot_interval | Int | Internal component, do not modify | | Optional |
organism_parameters | proportion_wide | Float | Internal component, do not modify | | Optional |
organism_parameters | reference_genbank | File | Internal component, do not modify | | Optional |
organism_parameters | vadr_model | File | Path to a tar + gzipped VADR model file | gs://theiagen-public-resources-rp/reference_data/databases/vadr_models/vadr-models-sarscov2-1.3-2.tar.gz | Optional |
pangolin4 | analysis_mode | String | Used to switch between usher and pangolearn analysis modes. Use usher only; pangolearn is no longer supported in Pangolin v4.3 and later. | | Optional |
pangolin4 | cpu | Int | Number of CPUs to allocate to the task | 4 | Optional |
pangolin4 | disk_size | Int | Amount of storage (in GB) to allocate to the task | 100 | Optional |
pangolin4 | expanded_lineage | Boolean | True/False that determines if a lineage should be expanded without aliases (e.g., BA.1 → B.1.1.529.1) | TRUE | Optional |
pangolin4 | max_ambig | Float | The maximum proportion of Ns allowed for pangolin to attempt an assignment | 0.5 | Optional |
pangolin4 | memory | Int | Amount of memory/RAM (in GB) to allocate to the task | 8 | Optional |
pangolin4 | min_length | Int | Minimum query length allowed for pangolin to attempt an assignment | 10000 | Optional |
pangolin4 | pangolin_arguments | String | Optional arguments for pangolin, e.g. "--skip-scorpio" | | Optional |
pangolin4 | skip_designation_cache | Boolean | A True/False option that determines if the designation cache should be used | FALSE | Optional |
pangolin4 | skip_scorpio | Boolean | A True/False option that determines if scorpio should be skipped. | FALSE | Optional |
pangolin_update | lineage_log | File | TSV file detailing previous lineage assignments and software versions for this sample. | | Optional |
pangolin_update | new_pangolin_docker | String | The Pangolin docker image used to update the Pangolin lineage assignments. | | Optional |
pangolin_update | organism | String | The organism to be analyzed | sars-cov-2 | Optional |
pangolin_update_log | cpu | Int | Number of CPUs to allocate to the task | 4 | Optional |
pangolin_update_log | disk_size | Int | Amount of storage (in GB) to allocate to the task | 100 | Optional |
pangolin_update_log | docker | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/theiagen/utility:1.1 | Optional |
pangolin_update_log | memory | Int | Amount of memory/RAM (in GB) to allocate to the task | 8 | Optional |
pangolin_update_log | timezone | String | Set the time zone to get an accurate date of update (uses UTC by default) | | Optional |
version_capture | docker | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/theiagen/alpine-plus-bash:3.20.0 | Optional |
version_capture | timezone | String | Set the time zone to get an accurate date of analysis (uses UTC by default) | | Optional |
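
For command-line runs, the required inputs in the table above are typically supplied as a JSON file. The Python sketch below assembles such a file under some assumptions: the `pangolin_update.` key prefix is inferred from the Terra Task Name column, the helper name `build_pangolin_update_inputs` is hypothetical, and every value shown is a placeholder rather than a real path, version, or image.

```python
import json


def build_pangolin_update_inputs(
    samplename,
    assembly_fasta,
    old_lineage,
    old_pangolin_docker,
    old_pangolin_assignment_version,
    old_pangolin_versions,
    new_pangolin_docker=None,
):
    """Return a dict of workflow inputs keyed as '<workflow>.<variable>'."""
    prefix = "pangolin_update"  # assumed workflow name, taken from the inputs table
    inputs = {
        f"{prefix}.samplename": samplename,
        f"{prefix}.assembly_fasta": assembly_fasta,
        f"{prefix}.old_lineage": old_lineage,
        f"{prefix}.old_pangolin_docker": old_pangolin_docker,
        f"{prefix}.old_pangolin_assignment_version": old_pangolin_assignment_version,
        f"{prefix}.old_pangolin_versions": old_pangolin_versions,
    }
    # Optional input: only set it when the caller wants a specific image.
    if new_pangolin_docker is not None:
        inputs[f"{prefix}.new_pangolin_docker"] = new_pangolin_docker
    return inputs


if __name__ == "__main__":
    example = build_pangolin_update_inputs(
        samplename="sample01",
        assembly_fasta="sample01.fasta",
        old_lineage="B.1.1.529",
        old_pangolin_docker="<previous pangolin docker image>",
        old_pangolin_assignment_version="<previous assignment version>",
        old_pangolin_versions="<previous pangolin versions string>",
    )
    print(json.dumps(example, indent=2))
```

Leaving `new_pangolin_docker` unset lets the workflow fall back to its own default image, which matches the optional status of that input.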
Outputs
Variable | Type | Description |
---|---|---|
pango_lineage | String | Pango lineage as determined by Pangolin |
pango_lineage_expanded | String | Pango lineage without use of aliases; e.g., "BA.1" → "B.1.1.529.1" |
pango_lineage_log | File | TSV file listing Pangolin lineage assignments and software versions for this sample |
pango_lineage_report | File | Full Pango lineage report generated by Pangolin |
pangolin_assignment_version | String | The version of the pangolin software (e.g. PANGO or PUSHER) used for lineage assignment |
pangolin_conflicts | String | Number of lineage conflicts as determined by Pangolin |
pangolin_docker | String | Docker image used to run Pangolin |
pangolin_notes | String | Lineage notes as determined by Pangolin |
pangolin_update_analysis_date | String | Date of analysis |
pangolin_update_version | String | Version of the Public Health Bioinformatics (PHB) repository used |
pangolin_updates | String | Result of Pangolin Update (lineage changed versus unchanged) with lineage assignment and date of analysis |
pangolin_versions | String | All Pangolin software and database versions |
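
The relationship between `pango_lineage` and `pango_lineage_expanded` can be sketched in Python as a prefix substitution. This is only an illustration: Pangolin itself relies on the full Pango alias key, whereas the `ALIASES` table here is a two-entry excerpt and the function name `expand_lineage` is hypothetical.

```python
# Tiny excerpt of Pango aliases for illustration; the real alias key is much larger.
ALIASES = {
    "BA": "B.1.1.529",
    "AY": "B.1.617.2",
}


def expand_lineage(lineage: str, aliases: dict = ALIASES) -> str:
    """Replace a leading alias (e.g. 'BA') with the full lineage it stands for."""
    prefix, _, rest = lineage.partition(".")
    base = aliases.get(prefix)
    if base is None:
        # No known alias: the lineage is returned unchanged.
        return lineage
    return f"{base}.{rest}" if rest else base


if __name__ == "__main__":
    print(expand_lineage("BA.1"))     # B.1.1.529.1
    print(expand_lineage("B.1.1.7"))  # B.1.1.7
```

Lineages whose prefix is not in the lookup table pass through untouched, so the sketch stays safe on unaliased names.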
References
Pangolin: Rambaut A, Holmes EC, O'Toole Á, Hill V, McCrone JT, Ruis C, du Plessis L, Pybus OG. A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology. Nat Microbiol. 2020 Nov;5(11):1403-1407. doi: 10.1038/s41564-020-0770-5. Epub 2020 Jul 15. PMID: 32669681; PMCID: PMC7610519.