TBProfiler_tNGS

Quick Facts

Workflow Type: Standalone
Applicable Kingdom: Bacteria, TB
Last Known Changes: vX.X.X
Command-line Compatibility: Yes
Workflow Level: Sample-level
Dockstore: TBProfiler_tNGS_PHB

TBProfiler_tNGS_PHB

This workflow is still in experimental research stages. Documentation is minimal as changes may occur in the code; it will be fleshed out when a stable state has been achieved.

Inputs

Terra Task Name Variable Type Description Default Value Terra Status
tbprofiler_tngs read1 File Illumina forward read file in FASTQ file format (compression optional) Required
tbprofiler_tngs read2 File Illumina reverse read file in FASTQ file format (compression optional) Required
tbprofiler_tngs samplename String The name of the sample being analyzed Required
clockwork_decon_reads cpu Int Number of CPUs to allocate to the task 16 Optional
clockwork_decon_reads disk_size Int Amount of storage (in GB) to allocate to the task 200 Optional
clockwork_decon_reads docker String Docker image to use for the task us-docker.pkg.dev/general-theiagen/cdcgov/varpipe_wgs_with_refs:2bc7234074bd53d9e92a1048b0485763cd9bbf6f4d12d5a1cc82bfec8ca7d75e Optional
clockwork_decon_reads memory Int Amount of memory (in GB) to allocate to the task 64 Optional
fastq_scan_clean cpu Int Number of CPUs to allocate to the task 1 Optional
fastq_scan_clean disk_size Int Amount of storage (in GB) to allocate to the task 50 Optional
fastq_scan_clean docker String Docker image to use for the task us-docker.pkg.dev/general-theiagen/biocontainers/fastq-scan:1.0.1--h4ac6f70_3 Optional
fastq_scan_clean memory Int Amount of memory (in GB) to allocate to the task 4 Optional
fastq_scan_raw cpu Int Number of CPUs to allocate to the task 1 Optional
fastq_scan_raw disk_size Int Amount of storage (in GB) to allocate to the task 50 Optional
fastq_scan_raw docker String Docker image to use for the task us-docker.pkg.dev/general-theiagen/biocontainers/fastq-scan:1.0.1--h4ac6f70_3 Optional
fastq_scan_raw memory Int Amount of memory (in GB) to allocate to the task 4 Optional
tbp_parser config File The configuration file to use, in YAML format (overrides all other arguments except other file-type arguments) Optional
tbp_parser coverage_bed File The BED file containing the genes of interest, their locus tags, and their regions for QC/coverage calculations; it should be formatted like the TBDB.bed file in TBProfiler Optional
tbp_parser cpu Int Number of CPUs to allocate to the task 1 Optional
tbp_parser disk_size Int Amount of storage (in GB) to allocate to the task 100 Optional
tbp_parser docker String The Docker container to use for the task us-docker.pkg.dev/general-theiagen/theiagen/tbp-parser:v3.0.3 Optional
tbp_parser err_coverage_bed File The BED file containing the "essential for resistance regions". This file tells tbp-parser to also calculate breadth of coverage and average depth for these regions; it should be formatted like the genes.bed file in TBProfiler and the coverage BED described above Optional
tbp_parser find_and_replace Map[String,String] A JSON string that can be used to specify any text in the output files that should be find-and-replaced with other text. The keys will be the text to find, and the values will be the text to replace it with. This is useful for labs that want to customize the text in their reports (e.g. renaming drugs or genes or output columns). For example, '{"rifampicin": "rifampin", "fbiD": "Rv2983", "mmpR5": "Rv0678", "p.0?": ""}' Optional
tbp_parser gene_database_yml File An optional YAML file that specifies the gene database information for the genes of interest; if not provided, a default format will be used Optional
tbp_parser lims_report_format_yml File An optional YAML file that specifies the format of the LIMS report; if not provided, a default format will be used Optional
tbp_parser memory Int Amount of memory/RAM (in GB) to allocate to the task 8 Optional
tbp_parser min_depth Int The minimum depth of coverage required for a site to pass QC 10 Optional
tbp_parser min_frequency Float The minimum frequency for a mutation to pass QC (0.1 -> 10%) Optional
tbp_parser min_percent_coverage Float The minimum percentage of a region that has depth above the threshold set by min_depth (used for a gene/locus to pass QC; 1.0 -> 100%) Optional
tbp_parser min_percent_loci_covered Float The minimum percentage of loci/genes in the LIMS report that must pass coverage QC for the sample to be identified as MTBC (0.7 -> 70%) Optional
tbp_parser min_read_support Int The minimum read support for a mutation to pass QC Optional
tbp_parser operator String Optional
tbp_parser resolve_overlapping_regions Boolean Resolve overlapping BED regions to avoid double-counting reads across overlapping targets. Recommended for tNGS data with overlapping amplicon regions. False Optional
tbp_parser sequencing_method String The sequencing method used to generate the data; used in the LIMS & Looker reports. Enclose in quotes if including a space Optional
tbp_parser tbp_parser_debug Boolean Activate the debug mode on tbp_parser; increases logging outputs True Optional
tbp_parser tngs_frequency_boundaries String The frequency boundaries (comma-delimited; lower_f,upper_f) for tNGS QC reporting, used in conjunction with --tngs_read_support_boundaries Optional
tbp_parser tngs_read_support_boundaries String The read support boundaries (comma-delimited; lower_rs,upper_rs) for tNGS QC reporting, used in conjunction with --tngs_frequency_boundaries Optional
tbp_parser use_err_for_qc Boolean If an ERR BED file is provided, use the ERR coverage regions in place of the typical coverage regions for all QC determinations. Note: This will influence how variants are interpreted and how deletions are reported, because the QC thresholds for breadth of coverage and average depth will be based on the coverage found within the ERR regions. False Optional
tbprofiler additional_parameters String Additional parameters for TBProfiler Optional
tbprofiler cpu Int Number of CPUs to allocate to the task 8 Optional
tbprofiler disk_size Int Amount of storage (in GB) to allocate to the task 100 Optional
tbprofiler docker String The Docker container to use for the task us-docker.pkg.dev/general-theiagen/staphb/tbprofiler:6.6.3 Optional
tbprofiler mapper String The mapping tool used in TBProfiler to align the reads to the reference genome; see TBProfiler's original documentation for available options. bwa Optional
tbprofiler memory Int Amount of memory/RAM (in GB) to allocate to the task 16 Optional
tbprofiler min_af Float The minimum allele frequency to call a variant 0.1 Optional
tbprofiler min_depth Int The minimum depth for a variant to be called. 10 Optional
tbprofiler ont_data Boolean Set to True to use Nanopore-specific TBProfiler parameters False Optional
tbprofiler tbdb_branch_commit_hash String The commit hash for the TBDB branch that TBProfiler should use; this allows pinning the mutation library to a specific commit Optional
tbprofiler tbprofiler_custom_db File TBProfiler uses the TBDB database by default; to use a custom database instead, provide it in this field Optional
tbprofiler variant_caller String Select a different variant caller for TBProfiler to use by writing it in this block; see TBProfiler's original documentation for available options. gatk Optional
tbprofiler variant_calling_params String Enter additional variant calling parameters in this free text input to customize how the variant caller works in TBProfiler Optional
tbprofiler_tngs run_clockwork Boolean Set to True to run Clockwork read decontamination False Optional
tbprofiler_tngs run_trimmomatic Boolean Set to False to skip trimmomatic read trimming True Optional
tbprofiler_tngs tbdb_branch String TBProfiler uses the TBDB database by default (a combination of the original library and the WHO v2 catalogue). To use a different mutation library, specify the name of the TBDB branch here; see the TBProfiler documentation for more information regarding available databases Optional
trimmomatic cpu Int Number of CPUs to allocate to the task 4 Optional
trimmomatic disk_size Int Amount of storage (in GB) to allocate to the task 100 Optional
trimmomatic docker String The Docker container to use for the task us-docker.pkg.dev/general-theiagen/staphb/trimmomatic:0.40 Optional
trimmomatic memory Int Amount of memory/RAM (in GB) to allocate to the task 8 Optional
trimmomatic trimmomatic_adapter_fasta File A FASTA file containing adapter sequences TruSeq3-PE-2.fa Optional
trimmomatic trimmomatic_adapter_trim_args String Colon-delimited adapter trimming parameters representing [seed mismatches: palindrome clip threshold: simple clip threshold] 2:30:10 Optional
trimmomatic trimmomatic_min_length Int Specifies minimum length of each read after trimming to be kept 75 Optional
trimmomatic trimmomatic_override_args String Additional arguments to pass to trimmomatic. Can be used to override all trimming parameters Optional
trimmomatic trimmomatic_trim_adapters Boolean A True/False option that determines if adapters should be trimmed False Optional
trimmomatic trimmomatic_window_quality Int The trimming window quality score 30 Optional
trimmomatic trimmomatic_window_size Int The window size for trimming 4 Optional
version_capture docker String The Docker container to use for the task us-docker.pkg.dev/general-theiagen/theiagen/alpine-plus-bash:3.20.0 Optional
version_capture timezone String Set the time zone to get an accurate date of analysis (uses UTC by default) Optional
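
For command-line runs, the inputs above are typically supplied as a JSON file. The sketch below builds one in Python. It assumes the usual WDL convention of naming keys `workflow.variable` for workflow-level inputs and `workflow.task.variable` for task-level inputs; whether a given engine accepts task-level overrides this way is not confirmed by this page. The sample name, file paths, and the `build_inputs` helper are hypothetical.

```python
import json

# Workflow name as it appears in the "Terra Task Name" column above.
WORKFLOW = "tbprofiler_tngs"


def build_inputs(samplename, read1, read2, overrides=None):
    """Return a WDL-style inputs dict for a single sample.

    `overrides` maps (task_name, variable) pairs to values. Keys follow the
    assumed `workflow.task.variable` naming convention.
    """
    inputs = {
        f"{WORKFLOW}.samplename": samplename,
        f"{WORKFLOW}.read1": read1,
        f"{WORKFLOW}.read2": read2,
    }
    for (task, variable), value in (overrides or {}).items():
        if task == WORKFLOW:
            key = f"{WORKFLOW}.{variable}"          # workflow-level input
        else:
            key = f"{WORKFLOW}.{task}.{variable}"   # task-level input
        inputs[key] = value
    return inputs


# Hypothetical sample and file names; option values are illustrative only.
overrides = {
    ("tbprofiler_tngs", "run_clockwork"): False,
    ("tbp_parser", "min_depth"): 10,
    ("tbp_parser", "min_frequency"): 0.1,  # 0.1 -> 10%
    ("tbp_parser", "resolve_overlapping_regions"): True,
    ("tbp_parser", "find_and_replace"): {"rifampicin": "rifampin"},
}
inputs = build_inputs(
    "sample01", "sample01_R1.fastq.gz", "sample01_R2.fastq.gz", overrides
)
print(json.dumps(inputs, indent=2))
```

Only the three required inputs must be present; every optional input left out of the JSON falls back to the default listed in the table.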

Terra Outputs

Variable Type Description
clockwork_decontaminated_read1 File Forward reads decontaminated by Clockwork
clockwork_decontaminated_read2 File Reverse reads decontaminated by Clockwork
clockwork_version String The version of Clockwork used
fastq_scan_clean1_json File The JSON file output from fastq-scan containing summary stats about clean forward read quality and length
fastq_scan_clean2_json File The JSON file output from fastq-scan containing summary stats about clean reverse read quality and length
fastq_scan_clean_pairs String Number of read pairs after cleaning
fastq_scan_docker String The Docker image of fastq_scan
fastq_scan_num_reads_clean1 Int The number of forward reads after cleaning as calculated by fastq_scan
fastq_scan_num_reads_clean2 Int The number of reverse reads after cleaning as calculated by fastq_scan
fastq_scan_num_reads_raw1 Int The number of input forward reads as calculated by fastq_scan
fastq_scan_num_reads_raw2 Int The number of input reverse reads as calculated by fastq_scan
fastq_scan_raw1_json File The JSON file output from fastq-scan containing summary stats about raw forward read quality and length
fastq_scan_raw2_json File The JSON file output from fastq-scan containing summary stats about raw reverse read quality and length
fastq_scan_raw_pairs String Number of raw read pairs
fastq_scan_version String The version of fastq_scan
tbp_parser_average_genome_depth Float The mean depth of coverage across all target regions included in the analysis
tbp_parser_docker String The docker image and version tag for the tbp_parser tool
tbp_parser_genome_percent_coverage Float The percent breadth of coverage across the entire genome
tbp_parser_laboratorian_report_csv File A CSV file containing information regarding each mutation and its associated drug resistance profile. This file also contains two interpretation fields, "Looker" and "MDL", which are generated using the CDC's expert rules for interpreting the severity of potential drug resistance mutations.
tbp_parser_lims_report_csv File A file that summarizes the highest severity mutations for each antimicrobial and lists the relevant mutations for each gene.
tbp_parser_lims_report_transposed_csv File A transposed version of the LIMS report CSV produced by tbp_parser; the rows and columns from the tbp_parser_lims_report_csv are swapped so that the report is more human-readable
tbp_parser_locus_coverage_report_csv File A file containing the breadth of coverage across each locus
tbp_parser_log File A log file capturing the stdout/stderr from the tbp_parser run
tbp_parser_looker_report_csv File An output file that contains condensed information suitable for generating a dashboard in Google's Looker Studio.
tbp_parser_target_coverage_report_csv File A file containing the breadth of coverage across each target; identical to the locus report if each line in the coverage BED file corresponds to a single locus
tbp_parser_version String Version of tbp-parser used
tbprofiler_dr_type String The drug resistance category as determined by TBProfiler (sensitive, Pre-MDR, MDR, Pre-XDR, XDR)
tbprofiler_main_lineage String The Mycobacterium tuberculosis lineage assignment as made by TBProfiler
tbprofiler_median_depth Float The median depth of the H37Rv TB reference genome covered by the sample
tbprofiler_num_dr_variants String The total number of drug resistance conferring variants detected by TBProfiler
tbprofiler_num_other_variants String The total number of non-drug resistance conferring variants detected by TBProfiler
tbprofiler_output_bai File Index file (BAI) for the BAM alignment file produced by TBProfiler
tbprofiler_output_bam File BAM alignment file produced by TBProfiler
tbprofiler_output_file File CSV report from TBProfiler
tbprofiler_output_json File JSON output file from TBProfiler
tbprofiler_output_tsv File TSV report output file from TBProfiler
tbprofiler_pct_reads_mapped Float The percentage of reads that successfully mapped to the H37Rv genome
tbprofiler_resistance_genes String The genes in which a mutation was detected that may be resistance conferring, in the format of <gene target> <variant detected> (<estimated fraction of reads that support the variant>)
tbprofiler_sub_lineage String The Mycobacterium tuberculosis sub-lineage assignment as made by TBProfiler
tbprofiler_tngs_wf_analysis_date String The date on which the workflow was run
tbprofiler_tngs_wf_version String The version of the tbprofiler_tngs workflow used for this analysis
tbprofiler_version String The version of TBProfiler used for this analysis
trimmomatic_docker String The docker image used for the trimmomatic module in this workflow
trimmomatic_read1_trimmed File The read1 file post trimming
trimmomatic_read2_trimmed File The read2 file post trimming
trimmomatic_stats File The read trimming statistics
trimmomatic_version String The version of Trimmomatic used
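
The coverage reports and the `min_depth`, `min_percent_coverage`, and `min_percent_loci_covered` inputs all rest on breadth of coverage and average depth. The sketch below illustrates those definitions in Python on made-up per-base depths. It is not tbp-parser's implementation: the function names are hypothetical, and treating a site as covered when its depth is greater than or equal to `min_depth` is an assumption.

```python
def breadth_of_coverage(depths, min_depth):
    """Fraction of positions (0.0 to 1.0) with depth >= min_depth.

    Assumption: a site counts as covered when depth >= min_depth.
    """
    if not depths:
        return 0.0
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths)


def average_depth(depths):
    """Mean depth across all positions in a region."""
    return sum(depths) / len(depths) if depths else 0.0


def locus_passes_qc(depths, min_depth, min_percent_coverage):
    """True if the locus meets the breadth threshold (1.0 -> 100%)."""
    return breadth_of_coverage(depths, min_depth) >= min_percent_coverage


def fraction_of_loci_passing(loci, min_depth, min_percent_coverage):
    """Fraction of loci passing coverage QC, for comparison against
    min_percent_loci_covered (0.7 -> 70%)."""
    if not loci:
        return 0.0
    passing = sum(
        1 for depths in loci.values()
        if locus_passes_qc(depths, min_depth, min_percent_coverage)
    )
    return passing / len(loci)


# Made-up per-base depths for two loci (illustrative values only).
loci = {
    "locus_a": [12, 15, 9, 30],   # 3 of 4 positions at depth >= 10
    "locus_b": [20, 25, 22, 18],  # all positions at depth >= 10
}
print(breadth_of_coverage(loci["locus_a"], min_depth=10))
print(average_depth(loci["locus_a"]))
print(fraction_of_loci_passing(loci, min_depth=10, min_percent_coverage=1.0))
```

With a breadth threshold of 1.0, a single under-covered position is enough for a locus to fail, which is why `locus_a` fails here while `locus_b` passes.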