Laboratorian report
The laboratorian report is the main report produced by tbp-parser and is used to generate all of the other reports. What follows is an explanation of all the columns in the report.
Any fields attributed to TBProfiler come from the input JSON file produced by TBProfiler.
| Column name | Explanation | Column source |
|---|---|---|
| sample_id | The name of the sample | TBProfiler "id" field |
| tbprofiler_gene_name | The name of the gene where the mutation has been identified | TBProfiler "gene_name" field |
| tbprofiler_locus_tag | The locus tag for the mutation that has been identified | TBProfiler "locus_tag" field or, if that field is blank, the gene database file specified with the --gene_database_yml input parameter |
| tbprofiler_variant_substitution_type | The type of mutation identified (for example, frameshift, missense, or synonymous) | TBProfiler "type" field |
| tbprofiler_variant_substitution_nt | The mutation in nucleotide format | TBProfiler "nucleotide_change" field |
| tbprofiler_variant_substitution_aa | The mutation in amino acid format, if possible | TBProfiler "protein_change" field |
| genomic_start_pos | The genomic start position of the mutation | TBProfiler "pos" field |
| genomic_end_pos | The genomic end position of the mutation | Calculated from the TBProfiler "pos" field |
| confidence | Contains one of the following: the WHO annotation; an indication that there is no WHO annotation; or NA when there is no mutation | Edited by tbp-parser; originates from the TBProfiler "confidence" field |
| antimicrobial | The antimicrobial drug that may be affected by this mutation | TBProfiler "annotation.drug" field, split into multiple rows if multiple annotation items are present. May also originate from the "gene_associated_drugs" field if not all associated drugs are included in the annotation |
| looker_interpretation | The drug resistance interpretation intended for the Looker report | Determined by tbp-parser |
| mdl_interpretation | The drug resistance interpretation intended for the LIMS report | Determined by tbp-parser |
| depth | The depth of coverage at the mutation | TBProfiler "depth" field |
| frequency | The frequency of the mutation in the reads | TBProfiler "freq" field |
| read_support | How many reads support the mutation (depth * frequency) | Calculated by tbp-parser |
| rationale | Contains an indication of what was used (the WHO annotation, the specific expert rule used, or neither) to create the two interpretations | Determined by tbp-parser |
| warning | Any potential quality warnings that may indicate lower reliability | Determined by tbp-parser |
| gene_tier | The gene tier of the mutation’s gene (Tier 1, Tier 2, or NA) | Determined by the gene database file indicated with the --gene_database_yml input parameter |
| source | The source of the mutation information (WHO v2 catalogue, tbdb, etc.) | TBProfiler "annotation.source" field |
| tbdb_comment | Any comments from TBProfiler about the mutation | TBProfiler "annotation.comment" field |
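The read_support column is described in the table as depth multiplied by frequency. A minimal Python sketch of that arithmetic follows. The function name `read_support` and the example values are illustrative assumptions, and whether tbp-parser rounds the result is not stated in this section.

```python
def read_support(depth: int, frequency: float) -> float:
    """Estimate the number of reads supporting a mutation.

    Follows the table's definition (depth * frequency). The function name is
    hypothetical, and no rounding is applied because the rounding behavior of
    tbp-parser is not described here.
    """
    return depth * frequency


# Illustrative values: 100 reads cover the position and half carry the mutation.
example = read_support(depth=100, frequency=0.5)
print(example)
```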
Because a single mutation may contribute to resistance to several drugs at once, each mutation may be listed multiple times, once for each antimicrobial drug that could be impacted.
Any genes that do not have any mutations are also included in the laboratorian report with NA or WT in the appropriate field.
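The one-row-per-drug layout can be sketched in Python as follows. This is only an illustration of the resulting shape, not tbp-parser's actual code: the helper name `expand_by_drug` is hypothetical, and the gene, variant, and drug values are example data chosen for the sketch.

```python
from typing import Dict, List


def expand_by_drug(mutation: Dict[str, str], drugs: List[str]) -> List[Dict[str, str]]:
    """Return one report row per antimicrobial drug for a single mutation.

    Hypothetical helper: every row repeats the mutation's fields and differs
    only in the "antimicrobial" column.
    """
    return [{**mutation, "antimicrobial": drug} for drug in drugs]


# Example data (assumed for illustration): one mutation linked to two drugs.
mutation = {
    "sample_id": "sample01",
    "tbprofiler_gene_name": "fabG1",
    "tbprofiler_variant_substitution_nt": "c.-15C>T",
}
rows = expand_by_drug(mutation, ["isoniazid", "ethionamide"])

for row in rows:
    print(row["tbprofiler_gene_name"], row["antimicrobial"])
```

Here the same mutation yields two rows, which is why a mutation can appear more than once in the report.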
Customizing column names
To overwrite any of the output column names or text in the laboratorian report, please use the following format in a configuration file or use the command-line parameter --find_and_replace:
FIND_AND_REPLACE:
  "sample_id": "My_Sample_ID_Column"
  "tbprofiler_gene_name": "My_Gene_Name_Column"
  ...
Please note that this will rename every instance of that text in all output reports (every instance of "sample_id" will be renamed to "My_Sample_ID_Column" in all output files, etc.).
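The global effect of such a mapping can be illustrated with a short Python sketch. It assumes plain, case-sensitive substring replacement, which is a guess at the behavior rather than a description of how tbp-parser implements it. The function name `apply_find_and_replace` is hypothetical.

```python
from typing import Dict


def apply_find_and_replace(text: str, replacements: Dict[str, str]) -> str:
    """Replace every occurrence of each key in `text` with its mapped value.

    Hypothetical helper using plain substring replacement. Replacements run
    in order, so a new value that contains a later key would be changed again.
    """
    for old, new in replacements.items():
        text = text.replace(old, new)
    return text


# The mapping mirrors the FIND_AND_REPLACE example above.
mapping = {
    "sample_id": "My_Sample_ID_Column",
    "tbprofiler_gene_name": "My_Gene_Name_Column",
}
header = "sample_id,tbprofiler_gene_name,antimicrobial"
renamed = apply_find_and_replace(header, mapping)
print(renamed)
```

Because the replacement is applied to all matching text, choosing a search string that also appears inside other values would rename those as well.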