TBProfiler_tNGS
Quick Facts
| Workflow Type | Applicable Kingdom | Last Known Changes | Command-line Compatibility | Workflow Level | Dockstore |
|---|---|---|---|---|---|
| Standalone | Bacteria, TB | vX.X.X | Yes | Sample-level | TBProfiler_tNGS_PHB |
TBProfiler_tNGS_PHB
This workflow is still in experimental research stages. Documentation is minimal as changes may occur in the code; it will be fleshed out when a stable state has been achieved.
Inputs
| Terra Task Name | Variable | Type | Description | Default Value | Terra Status |
|---|---|---|---|---|---|
| tbprofiler_tngs | read1 | File | Illumina forward read file in FASTQ format (compression optional) | | Required |
| tbprofiler_tngs | read2 | File | Illumina reverse read file in FASTQ format (compression optional) | | Required |
| tbprofiler_tngs | samplename | String | The name of the sample being analyzed | | Required |
| clockwork_decon_reads | cpu | Int | Number of CPUs to allocate to the task | 16 | Optional |
| clockwork_decon_reads | disk_size | Int | Amount of storage (in GB) to allocate to the task | 200 | Optional |
| clockwork_decon_reads | docker | String | Docker image to use for the task | us-docker.pkg.dev/general-theiagen/cdcgov/varpipe_wgs_with_refs:2bc7234074bd53d9e92a1048b0485763cd9bbf6f4d12d5a1cc82bfec8ca7d75e | Optional |
| clockwork_decon_reads | memory | Int | Amount of memory (in GB) to allocate to the task | 64 | Optional |
| fastq_scan_clean | cpu | Int | Number of CPUs to allocate to the task | 1 | Optional |
| fastq_scan_clean | disk_size | Int | Amount of storage (in GB) to allocate to the task | 50 | Optional |
| fastq_scan_clean | docker | String | Docker image to use for the task | us-docker.pkg.dev/general-theiagen/biocontainers/fastq-scan:1.0.1--h4ac6f70_3 | Optional |
| fastq_scan_clean | memory | Int | Amount of memory (in GB) to allocate to the task | 4 | Optional |
| fastq_scan_raw | cpu | Int | Number of CPUs to allocate to the task | 1 | Optional |
| fastq_scan_raw | disk_size | Int | Amount of storage (in GB) to allocate to the task | 50 | Optional |
| fastq_scan_raw | docker | String | Docker image to use for the task | us-docker.pkg.dev/general-theiagen/biocontainers/fastq-scan:1.0.1--h4ac6f70_3 | Optional |
| fastq_scan_raw | memory | Int | Amount of memory (in GB) to allocate to the task | 4 | Optional |
| tbp_parser | config | File | The configuration file to use, in YAML format (overrides all other arguments except other file-type arguments) | | Optional |
| tbp_parser | coverage_bed | File | The BED file containing the genes of interest, their locus tags, and their regions for QC/coverage calculations; should be formatted like the TBDB.bed file in TBProfiler | | Optional |
| tbp_parser | cpu | Int | Number of CPUs to allocate to the task | 1 | Optional |
| tbp_parser | disk_size | Int | Amount of storage (in GB) to allocate to the task | 100 | Optional |
| tbp_parser | docker | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/theiagen/tbp-parser:v3.0.3 | Optional |
| tbp_parser | err_coverage_bed | File | The BED file containing the "essential for resistance regions." This file indicates to tbp-parser that these regions should also have breadth of coverage and average depth calculations performed; this file should be formatted like the genes.bed file in TBProfiler and the coverage BED described above | | Optional |
| tbp_parser | find_and_replace | Map[String,String] | A JSON string that can be used to specify any text in the output files that should be find-and-replaced with other text. The keys will be the text to find, and the values will be the text to replace it with. This is useful for labs that want to customize the text in their reports (e.g. renaming drugs or genes or output columns). For example, '{"rifampicin": "rifampin", "fbiD": "Rv2983", "mmpR5": "Rv0678", "p.0?": ""}' | | Optional |
| tbp_parser | gene_database_yml | File | An optional YAML file that specifies the gene database information for the genes of interest; if not provided, a default format will be used | | Optional |
| tbp_parser | lims_report_format_yml | File | An optional YAML file that specifies the format of the LIMS report; if not provided, a default format will be used | | Optional |
| tbp_parser | memory | Int | Amount of memory/RAM (in GB) to allocate to the task | 8 | Optional |
| tbp_parser | min_depth | Int | The minimum depth of coverage required for a site to pass QC | 10 | Optional |
| tbp_parser | min_frequency | Float | The minimum frequency for a mutation to pass QC (0.1 -> 10%) | | Optional |
| tbp_parser | min_percent_coverage | Float | The minimum percentage of a region that has depth above the threshold set by min_depth (used for a gene/locus to pass QC; 1.0 -> 100%) | | Optional |
| tbp_parser | min_percent_loci_covered | Float | The minimum percentage of loci/genes in the LIMS report that must pass coverage QC for the sample to be identified as MTBC (0.7 -> 70%) | | Optional |
| tbp_parser | min_read_support | Int | The minimum read support for a mutation to pass QC | | Optional |
| tbp_parser | operator | String | | | Optional |
| tbp_parser | resolve_overlapping_regions | Boolean | Resolve overlapping BED regions to avoid double-counting reads across overlapping targets. Recommended for tNGS data with overlapping amplicon regions. | False | Optional |
| tbp_parser | sequencing_method | String | The sequencing method used to generate the data; used in the LIMS & Looker reports. Enclose in quotes if including a space | | Optional |
| tbp_parser | tbp_parser_debug | Boolean | Activate the debug mode on tbp_parser; increases logging outputs | True | Optional |
| tbp_parser | tngs_frequency_boundaries | String | The frequency boundaries (comma-delimited; lower_f,upper_f) for tNGS QC reporting, used in conjunction with --tngs_read_support_boundaries | | Optional |
| tbp_parser | tngs_read_support_boundaries | String | The read support boundaries (comma-delimited; lower_rs,upper_rs) for tNGS QC reporting, used in conjunction with --tngs_frequency_boundaries | | Optional |
| tbp_parser | use_err_for_qc | Boolean | If an ERR BED file is provided, use the ERR coverage regions in place of the typical coverage regions for all QC determinations. Note: This will influence how variants are interpreted and how deletions are reported because the QC thresholds for breadth of coverage and average depth will be based on the coverage found within the ERR regions. | False | Optional |
| tbprofiler | additional_parameters | String | Additional parameters for TBProfiler | | Optional |
| tbprofiler | cpu | Int | Number of CPUs to allocate to the task | 8 | Optional |
| tbprofiler | disk_size | Int | Amount of storage (in GB) to allocate to the task | 100 | Optional |
| tbprofiler | docker | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/staphb/tbprofiler:6.6.3 | Optional |
| tbprofiler | mapper | String | The mapping tool used in TBProfiler to align the reads to the reference genome; see TBProfiler's original documentation for available options. | bwa | Optional |
| tbprofiler | memory | Int | Amount of memory/RAM (in GB) to allocate to the task | 16 | Optional |
| tbprofiler | min_af | Float | The minimum allele frequency to call a variant | 0.1 | Optional |
| tbprofiler | min_depth | Int | The minimum depth for a variant to be called. | 10 | Optional |
| tbprofiler | ont_data | Boolean | Set to True to apply Nanopore-specific TBProfiler parameters | False | Optional |
| tbprofiler | tbdb_branch_commit_hash | String | The commit hash for the TBDB branch that TBProfiler should use; this allows pinning the mutation library to a specific commit | | Optional |
| tbprofiler | tbprofiler_custom_db | File | TBProfiler uses the TBDB database by default; to use a custom database instead, provide it in this field | | Optional |
| tbprofiler | variant_caller | String | Select a different variant caller for TBProfiler to use by writing it in this block; see TBProfiler's original documentation for available options. | gatk | Optional |
| tbprofiler | variant_calling_params | String | Enter additional variant calling parameters in this free text input to customize how the variant caller works in TBProfiler | | Optional |
| tbprofiler_tngs | run_clockwork | Boolean | Set to True to run Clockwork read decontamination | False | Optional |
| tbprofiler_tngs | run_trimmomatic | Boolean | Set to False to skip trimmomatic read trimming | True | Optional |
| tbprofiler_tngs | tbdb_branch | String | TBProfiler uses the TBDB database by default (a combination of the original library and the WHO v2 catalogue). To use a different mutation library, specify the name of the TBDB branch to use in TBProfiler; see the TBProfiler documentation for more information regarding available databases | | Optional |
| trimmomatic | cpu | Int | Number of CPUs to allocate to the task | 4 | Optional |
| trimmomatic | disk_size | Int | Amount of storage (in GB) to allocate to the task | 100 | Optional |
| trimmomatic | docker | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/staphb/trimmomatic:0.40 | Optional |
| trimmomatic | memory | Int | Amount of memory/RAM (in GB) to allocate to the task | 8 | Optional |
| trimmomatic | trimmomatic_adapter_fasta | File | A FASTA file containing adapter sequences | TruSeq3-PE-2.fa | Optional |
| trimmomatic | trimmomatic_adapter_trim_args | String | Colon-delimited adapter trimming parameters representing [seed mismatches: palindrome clip threshold: simple clip threshold] | 2:30:10 | Optional |
| trimmomatic | trimmomatic_min_length | Int | Specifies minimum length of each read after trimming to be kept | 75 | Optional |
| trimmomatic | trimmomatic_override_args | String | Additional arguments to pass to trimmomatic. Can be used to override all trimming parameters | | Optional |
| trimmomatic | trimmomatic_trim_adapters | Boolean | A True/False option that determines if adapters should be trimmed | False | Optional |
| trimmomatic | trimmomatic_window_quality | Int | The trimming window quality score | 30 | Optional |
| trimmomatic | trimmomatic_window_size | Int | The window size for trimming | 4 | Optional |
| version_capture | docker | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/theiagen/alpine-plus-bash:3.20.0 | Optional |
| version_capture | timezone | String | Set the time zone to get an accurate date of analysis (uses UTC by default) | | Optional |
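
As an illustration of how the inputs above might be supplied outside Terra, the following is a minimal sketch that assembles an inputs JSON in Python. It assumes the common Cromwell/miniwdl key convention (`<workflow>.<variable>` for workflow-level inputs and `<workflow>.<task>.<variable>` for task-level inputs) and that the workflow is named `tbprofiler_tngs`, as the Terra Task Name column suggests; neither is confirmed by this page, so check the keys against the WDL. The file paths, sample name, and the helper functions `build_inputs` and `check_fraction` are hypothetical.

```python
import json

# Assumption: the workflow is named "tbprofiler_tngs" and inputs follow the
# "<workflow>.<variable>" / "<workflow>.<task>.<variable>" key convention.
WORKFLOW = "tbprofiler_tngs"


def check_fraction(name, value):
    """Fractional thresholds in the table are expressed on a 0-1 scale (0.1 -> 10%)."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return value


def build_inputs(read1, read2, samplename, overrides=None):
    """Return a dict of inputs keyed the way the table's first two columns suggest."""
    inputs = {
        f"{WORKFLOW}.read1": read1,
        f"{WORKFLOW}.read2": read2,
        f"{WORKFLOW}.samplename": samplename,
    }
    for (task, variable), value in (overrides or {}).items():
        if task == WORKFLOW:
            # Workflow-level optional input, e.g. run_clockwork
            inputs[f"{WORKFLOW}.{variable}"] = value
        else:
            # Task-level optional input, e.g. tbp_parser.min_depth
            inputs[f"{WORKFLOW}.{task}.{variable}"] = value
    return inputs


# Hypothetical overrides; every value not listed keeps its table default.
overrides = {
    ("tbprofiler_tngs", "run_clockwork"): False,
    ("tbp_parser", "min_depth"): 10,
    ("tbp_parser", "min_frequency"): check_fraction("min_frequency", 0.1),
    # A Map[String,String] input is written as a JSON object in an inputs file.
    ("tbp_parser", "find_and_replace"): {"rifampicin": "rifampin", "mmpR5": "Rv0678"},
    ("tbp_parser", "resolve_overlapping_regions"): True,
    ("trimmomatic", "trimmomatic_min_length"): 75,
}

# Hypothetical file paths and sample name.
inputs = build_inputs("sample_R1.fastq.gz", "sample_R2.fastq.gz", "sample01", overrides)
print(json.dumps(inputs, indent=2))
```

Keying the overrides by `(task, variable)` mirrors the first two columns of the table, so a row can be copied into the dictionary without reformatting.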
Terra Outputs
| Variable | Type | Description |
|---|---|---|
| clockwork_decontaminated_read1 | File | Decontaminated forward reads by Clockwork |
| clockwork_decontaminated_read2 | File | Decontaminated reverse reads by Clockwork |
| clockwork_version | String | The version of Clockwork used |
| fastq_scan_clean1_json | File | The JSON file output from fastq-scan containing summary stats about clean forward read quality and length |
| fastq_scan_clean2_json | File | The JSON file output from fastq-scan containing summary stats about clean reverse read quality and length |
| fastq_scan_clean_pairs | String | Number of read pairs after cleaning |
| fastq_scan_docker | String | The Docker image of fastq_scan |
| fastq_scan_num_reads_clean1 | Int | The number of forward reads after cleaning as calculated by fastq_scan |
| fastq_scan_num_reads_clean2 | Int | The number of reverse reads after cleaning as calculated by fastq_scan |
| fastq_scan_num_reads_raw1 | Int | The number of input forward reads as calculated by fastq_scan |
| fastq_scan_num_reads_raw2 | Int | The number of input reverse reads as calculated by fastq_scan |
| fastq_scan_raw1_json | File | The JSON file output from fastq-scan containing summary stats about raw forward read quality and length |
| fastq_scan_raw2_json | File | The JSON file output from fastq-scan containing summary stats about raw reverse read quality and length |
| fastq_scan_raw_pairs | String | Number of raw read pairs |
| fastq_scan_version | String | The version of fastq_scan |
| tbp_parser_average_genome_depth | Float | The mean depth of coverage across all target regions included in the analysis |
| tbp_parser_docker | String | The docker image and version tag for the tbp_parser tool |
| tbp_parser_genome_percent_coverage | Float | The percent breadth of coverage across the entire genome |
| tbp_parser_laboratorian_report_csv | File | An output file containing information regarding each mutation and its associated drug resistance profile in a CSV file. This file also contains two interpretation fields -- "Looker" and "MDL" which are generated using the CDC's expert rules for interpreting the severity of potential drug resistance mutations. |
| tbp_parser_lims_report_csv | File | A file summarizing the highest severity mutations for each antimicrobial and lists the relevant mutations for each gene. |
| tbp_parser_lims_report_transposed_csv | File | A transposed version of the LIMS report CSV produced by tbp_parser; the rows and columns from the tbp_parser_lims_report_csv are swapped so that the report is more human-readable |
| tbp_parser_locus_coverage_report_csv | File | A file containing the breadth of coverage across each locus |
| tbp_parser_log | File | A log file capturing the stdout/stderr from the tbp_parser run |
| tbp_parser_looker_report_csv | File | An output file that contains condensed information suitable for generating a dashboard in Google's Looker Studio. |
| tbp_parser_target_coverage_report_csv | File | A file containing the breadth of coverage across each target; identical to the locus report if each line in the coverage BED file corresponds to a single locus |
| tbp_parser_version | String | Version of tbp-parser used |
| tbprofiler_dr_type | String | The drug resistance category as determined by TBProfiler (sensitive, Pre-MDR, MDR, Pre-XDR, XDR) |
| tbprofiler_main_lineage | String | The Mycobacterium tuberculosis lineage assignment as made by TBProfiler |
| tbprofiler_median_depth | Float | The median depth of coverage of the sample across the H37Rv TB reference genome |
| tbprofiler_num_dr_variants | String | The total number of drug resistance conferring variants detected by TBProfiler |
| tbprofiler_num_other_variants | String | The total number of non-drug resistance conferring variants detected by TBProfiler |
| tbprofiler_output_bai | File | Index file for the BAM alignment that TBProfiler generates by mapping sequencing reads to the reference genome |
| tbprofiler_output_bam | File | BAM alignment file produced by TBProfiler |
| tbprofiler_output_file | File | CSV report from TBProfiler |
| tbprofiler_output_json | File | JSON output file from TBProfiler |
| tbprofiler_output_tsv | File | TSV report output file from TBProfiler |
| tbprofiler_pct_reads_mapped | Float | The percentage of reads that successfully mapped to the H37Rv genome |
| tbprofiler_resistance_genes | String | The genes in which a mutation was detected that may be resistance conferring, in the format of <gene target> <variant detected> (<estimated fraction of reads that support the variant>) |
| tbprofiler_sub_lineage | String | The Mycobacterium tuberculosis sub-lineage assignment as made by TBProfiler |
| tbprofiler_tngs_wf_analysis_date | String | The date on which the workflow was run |
| tbprofiler_tngs_wf_version | String | The version of the tbprofiler_tngs workflow used for this analysis |
| tbprofiler_version | String | The version of TBProfiler used for this analysis |
| trimmomatic_docker | String | The docker image used for the trimmomatic module in this workflow |
| trimmomatic_read1_trimmed | File | The read1 file post trimming |
| trimmomatic_read2_trimmed | File | The read2 file post trimming |
| trimmomatic_stats | File | The read trimming statistics |
| trimmomatic_version | String | The version of Trimmomatic used |
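
The raw and clean read counts reported by fastq_scan can be combined to see how many reads survive cleaning. The sketch below is only an illustration: the `percent_retained` helper and the numbers in `outputs` are hypothetical and are not produced by the workflow itself.

```python
def percent_retained(num_raw, num_clean):
    """Return the percentage of reads kept after cleaning, or None if there were no raw reads."""
    if num_raw <= 0:
        return None
    return 100.0 * num_clean / num_raw


# Hypothetical values standing in for one sample's workflow outputs.
outputs = {
    "fastq_scan_num_reads_raw1": 200000,
    "fastq_scan_num_reads_raw2": 200000,
    "fastq_scan_num_reads_clean1": 180000,
    "fastq_scan_num_reads_clean2": 180000,
}

retained = {
    "read1": percent_retained(
        outputs["fastq_scan_num_reads_raw1"], outputs["fastq_scan_num_reads_clean1"]
    ),
    "read2": percent_retained(
        outputs["fastq_scan_num_reads_raw2"], outputs["fastq_scan_num_reads_clean2"]
    ),
}

for read, pct in retained.items():
    print(f"{read}: {pct:.1f}% of reads retained after cleaning")
```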