TBProfiler_tNGS

Quick Facts

Workflow Type	Applicable Kingdom	Last Known Changes	Command-line Compatibility	Workflow Level	Dockstore
Standalone	Bacteria, TB	v4.2.0	Yes	Sample-level	TBProfiler_tNGS_PHB

TBProfiler_tNGS_PHB

This workflow is still in experimental research stages. Documentation is minimal as changes may occur in the code; it will be fleshed out when a stable state has been achieved.

Inputs

Terra Task Name	Variable	Type	Description	Default Value	Terra Status
tbprofiler_tngs	read1	File	Illumina forward read file in FASTQ file format (compression optional)		Required
tbprofiler_tngs	read2	File	Illumina reverse read file in FASTQ file format (compression optional)		Required
tbprofiler_tngs	samplename	String	The name of the sample being analyzed		Required
clockwork_decon_reads	cpu	Int	Number of CPUs to allocate to the task	16	Optional
clockwork_decon_reads	disk_size	Int	Amount of storage (in GB) to allocate to the task	200	Optional
clockwork_decon_reads	docker	String	Docker image to use for the task	us-docker.pkg.dev/general-theiagen/cdcgov/varpipe_wgs_with_refs:2bc7234074bd53d9e92a1048b0485763cd9bbf6f4d12d5a1cc82bfec8ca7d75e	Optional
clockwork_decon_reads	memory	Int	Amount of memory (in GB) to allocate to the task	64	Optional
fastq_scan_clean	cpu	Int	Number of CPUs to allocate to the task	1	Optional
fastq_scan_clean	disk_size	Int	Amount of storage (in GB) to allocate to the task	50	Optional
fastq_scan_clean	docker	String	Docker image to use for the task	us-docker.pkg.dev/general-theiagen/biocontainers/fastq-scan:1.0.1--h4ac6f70_3	Optional
fastq_scan_clean	memory	Int	Amount of memory (in GB) to allocate to the task	4	Optional
fastq_scan_raw	cpu	Int	Number of CPUs to allocate to the task	1	Optional
fastq_scan_raw	disk_size	Int	Amount of storage (in GB) to allocate to the task	50	Optional
fastq_scan_raw	docker	String	Docker image to use for the task	us-docker.pkg.dev/general-theiagen/biocontainers/fastq-scan:1.0.1--h4ac6f70_3	Optional
fastq_scan_raw	memory	Int	Amount of memory (in GB) to allocate to the task	4	Optional
tbp_parser	config	File	The configuration file to use, in YAML format (overrides all other arguments except other file-type arguments)		Optional
tbp_parser	coverage_bed	File	The BED file containing the genes of interest, their locus tags, and their regions for QC/coverage calculations; should be formatted like the TBDB.bed file in TBProfiler		Optional
tbp_parser	cpu	Int	Number of CPUs to allocate to the task	1	Optional
tbp_parser	disk_size	Int	Amount of storage (in GB) to allocate to the task	100	Optional
tbp_parser	docker	String	The Docker container to use for the task	us-docker.pkg.dev/general-theiagen/theiagen/tbp-parser:v3.0.3	Optional
tbp_parser	err_coverage_bed	File	The BED file containing the "essential for resistance" (ERR) regions. This file tells tbp-parser to also calculate breadth of coverage and average depth for these regions; it should be formatted like the genes.bed file in TBProfiler and the coverage BED described above		Optional
tbp_parser	find_and_replace	Map[String,String]	A JSON string that can be used to specify any text in the output files that should be find-and-replaced with other text. The keys will be the text to find, and the values will be the text to replace it with. This is useful for labs that want to customize the text in their reports (e.g. renaming drugs or genes or output columns). For example, '{"rifampicin": "rifampin", "fbiD": "Rv2983", "mmpR5": "Rv0678", "p.0?": ""}'		Optional
tbp_parser	gene_database_yml	File	An optional YAML file that specifies the gene database information for the genes of interest; if not provided, a default format will be used		Optional
tbp_parser	lims_report_format_yml	File	An optional YAML file that specifies the format of the LIMS report; if not provided, a default format will be used		Optional
tbp_parser	memory	Int	Amount of memory/RAM (in GB) to allocate to the task	8	Optional
tbp_parser	min_depth	Int	The minimum depth of coverage required for a site to pass QC	10	Optional
tbp_parser	min_frequency	Float	The minimum frequency for a mutation to pass QC (0.1 -> 10%)		Optional
tbp_parser	min_percent_coverage	Float	The minimum percentage of a region that has depth above the threshold set by min_depth (used for a gene/locus to pass QC; 1.0 -> 100%)		Optional
tbp_parser	min_percent_loci_covered	Float	The minimum percentage of loci/genes in the LIMS report that must pass coverage QC for the sample to be identified as MTBC (0.7 -> 70%)		Optional
tbp_parser	min_read_support	Int	The minimum read support for a mutation to pass QC		Optional
tbp_parser	operator	String			Optional
tbp_parser	resolve_overlapping_regions	Boolean	Resolve overlapping BED regions to avoid double-counting reads across overlapping targets. Recommended for tNGS data with overlapping amplicon regions.	False	Optional
tbp_parser	sequencing_method	String	The sequencing method used to generate the data; used in the LIMS & Looker reports. Enclose in quotes if including a space		Optional
tbp_parser	tbp_parser_debug	Boolean	Activate the debug mode on tbp_parser; increases logging outputs	True	Optional
tbp_parser	tngs_frequency_boundaries	String	The frequency boundaries (comma-delimited; lower_f,upper_f) for tNGS QC reporting, used in conjunction with --tngs_read_support_boundaries		Optional
tbp_parser	tngs_read_support_boundaries	String	The read support boundaries (comma-delimited; lower_rs,upper_rs) for tNGS QC reporting, used in conjunction with --tngs_frequency_boundaries		Optional
tbp_parser	use_err_for_qc	Boolean	If an ERR BED file is provided, use the ERR coverage regions in place of the typical coverage regions for all QC determinations. Note: This will influence how variants are interpreted and how deletions are reported because the QC thresholds for breadth of coverage and average depth will be based on the coverage found within the ERR regions.	False	Optional
tbprofiler	additional_parameters	String	Additional parameters for TBProfiler		Optional
tbprofiler	cpu	Int	Number of CPUs to allocate to the task	8	Optional
tbprofiler	disk_size	Int	Amount of storage (in GB) to allocate to the task	100	Optional
tbprofiler	docker	String	The Docker container to use for the task	us-docker.pkg.dev/general-theiagen/staphb/tbprofiler:6.6.3	Optional
tbprofiler	mapper	String	The mapping tool used in TBProfiler to align the reads to the reference genome; see TBProfiler's original documentation for available options.	bwa	Optional
tbprofiler	memory	Int	Amount of memory/RAM (in GB) to allocate to the task	16	Optional
tbprofiler	min_af	Float	The minimum allele frequency to call a variant	0.1	Optional
tbprofiler	min_depth	Int	The minimum depth for a variant to be called.	10	Optional
tbprofiler	ont_data	Boolean	Specifies nanopore specific tbprofiler parameters	False	Optional
tbprofiler	tbdb_branch_commit_hash	String	The commit hash for the TBDB branch that TBProfiler should use; this allows pinning the mutation library to a specific commit		Optional
tbprofiler	tbprofiler_custom_db	File	By default, TBProfiler uses the TBDB database; to use a custom database instead, provide it in this field		Optional
tbprofiler	variant_caller	String	Select a different variant caller for TBProfiler to use by writing it in this block; see TBProfiler's original documentation for available options.	gatk	Optional
tbprofiler	variant_calling_params	String	Enter additional variant calling parameters in this free text input to customize how the variant caller works in TBProfiler		Optional
tbprofiler_tngs	run_clockwork	Boolean	Set to True to run Clockwork read decontamination	False	Optional
tbprofiler_tngs	run_trimmomatic	Boolean	Set to False to skip trimmomatic read trimming	True	Optional
tbprofiler_tngs	tbdb_branch	String	By default, TBProfiler uses the TBDB database (a combination of the original library and the WHO v2 catalogue). To use a different mutation library, specify the name of the TBDB branch in this field; see the TBProfiler documentation for more information regarding available databases		Optional
trimmomatic	cpu	Int	Number of CPUs to allocate to the task	4	Optional
trimmomatic	disk_size	Int	Amount of storage (in GB) to allocate to the task	100	Optional
trimmomatic	docker	String	The Docker container to use for the task	us-docker.pkg.dev/general-theiagen/staphb/trimmomatic:0.40	Optional
trimmomatic	memory	Int	Amount of memory/RAM (in GB) to allocate to the task	8	Optional
trimmomatic	trimmomatic_adapter_fasta	File	A FASTA file containing adapter sequences	TruSeq3-PE-2.fa	Optional
trimmomatic	trimmomatic_adapter_trim_args	String	Colon-delimited adapter trimming parameters representing [seed mismatches: palindrome clip threshold: simple clip threshold]	2:30:10	Optional
trimmomatic	trimmomatic_min_length	Int	Specifies minimum length of each read after trimming to be kept	75	Optional
trimmomatic	trimmomatic_override_args	String	Additional arguments to pass to trimmomatic. Can be used to override all trimming parameters		Optional
trimmomatic	trimmomatic_trim_adapters	Boolean	A True/False option that determines if adapters should be trimmed	False	Optional
trimmomatic	trimmomatic_window_quality	Int	The trimming window quality score	30	Optional
trimmomatic	trimmomatic_window_size	Int	The window size for trimming	4	Optional
version_capture	docker	String	The Docker container to use for the task	us-docker.pkg.dev/general-theiagen/theiagen/alpine-plus-bash:3.20.0	Optional
version_capture	timezone	String	Set the time zone to get an accurate date of analysis (uses UTC by default)		Optional
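
The inputs above can be collected into a WDL inputs JSON file for command-line runs. The following is a minimal sketch, not an official template. It assumes the workflow is named `tbprofiler_tngs` in the WDL and that keys follow the usual Cromwell/miniwdl convention (`<workflow>.<input>` for workflow-level inputs, `<workflow>.<task>.<input>` for task-level inputs). The helper `build_inputs`, the file names, and the chosen override values are hypothetical. `find_and_replace` is passed as a JSON object because the table lists its type as `Map[String,String]`; that too is an assumption.

```python
import json

# Assumed workflow name; check the WDL source before relying on it.
WORKFLOW = "tbprofiler_tngs"


def build_inputs(read1, read2, samplename, overrides=None):
    """Build a WDL inputs mapping (hypothetical helper).

    `overrides` maps (task_name, variable) pairs from the Inputs table
    to values. Entries whose task name equals the workflow name are
    treated as workflow-level inputs.
    """
    inputs = {
        f"{WORKFLOW}.read1": read1,
        f"{WORKFLOW}.read2": read2,
        f"{WORKFLOW}.samplename": samplename,
    }
    for (task, variable), value in (overrides or {}).items():
        if task == WORKFLOW:
            inputs[f"{WORKFLOW}.{variable}"] = value
        else:
            inputs[f"{WORKFLOW}.{task}.{variable}"] = value
    return inputs


inputs = build_inputs(
    "sample01_R1.fastq.gz",  # hypothetical file names
    "sample01_R2.fastq.gz",
    "sample01",
    overrides={
        ("tbprofiler_tngs", "run_clockwork"): True,
        ("tbp_parser", "min_depth"): 10,
        # Frequencies are fractions: 0.1 means 10%.
        ("tbp_parser", "min_frequency"): 0.1,
        ("tbp_parser", "resolve_overlapping_regions"): True,
        ("tbp_parser", "find_and_replace"): {"rifampicin": "rifampin"},
    },
)

print(json.dumps(inputs, indent=2))
```

Only the three required inputs must be present; every other key can be omitted to fall back on the default listed in the table.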

Terra Outputs

Variable	Type	Description
clockwork_decontaminated_read1	File	Decontaminated forward reads by Clockwork
clockwork_decontaminated_read2	File	Decontaminated reverse reads by Clockwork
clockwork_version	String	The version of Clockwork used
fastq_scan_clean1_json	File	The JSON file output from `fastq-scan` containing summary stats about clean forward read quality and length
fastq_scan_clean2_json	File	The JSON file output from `fastq-scan` containing summary stats about clean reverse read quality and length
fastq_scan_clean_pairs	String	Number of read pairs after cleaning
fastq_scan_docker	String	The Docker image of fastq_scan
fastq_scan_num_reads_clean1	Int	The number of forward reads after cleaning as calculated by fastq_scan
fastq_scan_num_reads_clean2	Int	The number of reverse reads after cleaning as calculated by fastq_scan
fastq_scan_num_reads_raw1	Int	The number of input forward reads as calculated by fastq_scan
fastq_scan_num_reads_raw2	Int	The number of input reverse reads as calculated by fastq_scan
fastq_scan_raw1_json	File	The JSON file output from `fastq-scan` containing summary stats about raw forward read quality and length
fastq_scan_raw2_json	File	The JSON file output from `fastq-scan` containing summary stats about raw reverse read quality and length
fastq_scan_raw_pairs	String	Number of raw read pairs
fastq_scan_version	String	The version of fastq_scan
tbp_parser_average_genome_depth	Float	The mean depth of coverage across all target regions included in the analysis
tbp_parser_docker	String	The docker image and version tag for the tbp_parser tool
tbp_parser_genome_percent_coverage	Float	The percent breadth of coverage across the entire genome
tbp_parser_laboratorian_report_csv	File	An output file containing information regarding each mutation and its associated drug resistance profile in a CSV file. This file also contains two interpretation fields -- "Looker" and "MDL" which are generated using the CDC's expert rules for interpreting the severity of potential drug resistance mutations.
tbp_parser_lims_report_csv	File	A file that summarizes the highest-severity mutations for each antimicrobial and lists the relevant mutations for each gene.
tbp_parser_lims_report_transposed_csv	File	A transposed version of the LIMS report CSV produced by tbp_parser; the rows and columns from the `tbp_parser_lims_report_csv` are swapped so that the report is more human-readable
tbp_parser_locus_coverage_report_csv	File	A file containing the breadth of coverage across each locus
tbp_parser_log	File	A log file capturing the stdout/stderr from the tbp_parser run
tbp_parser_looker_report_csv	File	An output file that contains condensed information suitable for generating a dashboard in Google's Looker Studio.
tbp_parser_target_coverage_report_csv	File	A file containing the breadth of coverage across each target; identical to the locus report if each line in the coverage BED file corresponds to a single locus
tbp_parser_version	String	Version of tbp-parser used
tbprofiler_dr_type	String	The drug resistance category as determined by TBProfiler (sensitive, Pre-MDR, MDR, Pre-XDR, XDR)
tbprofiler_main_lineage	String	The Mycobacterium tuberculosis lineage assignment as made by TBProfiler
tbprofiler_median_depth	Float	The median depth of the H37Rv TB reference genome covered by the sample
tbprofiler_num_dr_variants	String	The total number of drug resistance conferring variants detected by TBProfiler
tbprofiler_num_other_variants	String	The total number of non-drug resistance conferring variants detected by TBProfiler
tbprofiler_output_bai	File	Index file (BAI) for the BAM alignment produced by TBProfiler when mapping sequencing reads to the reference genome
tbprofiler_output_bam	File	BAM alignment file produced by TBProfiler
tbprofiler_output_file	File	CSV report from TBProfiler
tbprofiler_output_json	File	JSON output file from TBProfiler
tbprofiler_output_tsv	File	TSV report output file from TBProfiler
tbprofiler_pct_reads_mapped	Float	The percentage of reads that successfully mapped to the H37Rv genome
tbprofiler_resistance_genes	String	The genes in which a mutation was detected that may be resistance conferring, in the format of `<gene target> <variant detected> (<estimated fraction of reads that support the variant>)`
tbprofiler_sub_lineage	String	The Mycobacterium tuberculosis sub-lineage assignment as made by TBProfiler
tbprofiler_tngs_wf_analysis_date	String	The date on which the workflow was run
tbprofiler_tngs_wf_version	String	The version of the tbprofiler_tngs workflow used for this analysis
tbprofiler_version	String	The version of TBProfiler used for this analysis
trimmomatic_docker	String	The docker image used for the trimmomatic module in this workflow
trimmomatic_read1_trimmed	File	The read1 file post trimming
trimmomatic_read2_trimmed	File	The read2 file post trimming
trimmomatic_stats	File	The read trimming statistics
trimmomatic_version	String	The version of Trimmomatic used
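
Because `fastq_scan_raw_pairs` and `fastq_scan_clean_pairs` are reported as `String` values, downstream scripts need to convert them before doing arithmetic. The sketch below shows one way to compute the fraction of read pairs retained after cleaning. The function name `fraction_pairs_retained` and the example values are hypothetical, and the fallback to `None` for non-numeric strings is an assumption about how a caller might want to handle unexpected values.

```python
def fraction_pairs_retained(raw_pairs, clean_pairs):
    """Return clean/raw read-pair fraction, or None if it cannot be computed."""
    try:
        raw = int(raw_pairs)
        clean = int(clean_pairs)
    except (TypeError, ValueError):
        # The outputs are strings, so guard against non-numeric content.
        return None
    if raw <= 0:
        return None
    return clean / raw


# Hypothetical values, as they might appear in a Terra data table row.
row = {"fastq_scan_raw_pairs": "250000", "fastq_scan_clean_pairs": "237500"}

retained = fraction_pairs_retained(
    row["fastq_scan_raw_pairs"], row["fastq_scan_clean_pairs"]
)
print(f"{retained:.1%}")  # prints 95.0%
```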