PhyloCompare

Quick Facts

| Workflow Type | Applicable Kingdom | Last Known Changes | Command-line Compatibility | Workflow Level | Dockstore |
|---|---|---|---|---|---|
| Standalone | Any taxa | v4.0.0 | Yes | Sample-level | PhyloCompare_PHB |

PhyloCompare_PHB

PhyloCompare generates a cophylogeny plot that visualizes the differences between two trees' tip arrangements. PhyloCompare can also quantitatively compare two phylogenies by calculating the distance between them as a measure of the difference in their topologies (tip and branch arrangement). This quantitative comparison, referred to as validation, is triggered by setting the validate boolean to "true".

Rooting the phylogenies is recommended. PhyloCompare can root on an outgroup tip or at the midpoint.

Tree rooting

If no rooting options are supplied, PhyloCompare will determine whether the trees are rooted or unrooted.

outgroup and midpoint are incompatible options. If both are supplied, the outgroup input takes precedence.
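
The precedence described above can be sketched as a small decision function. This is only an illustration of the documented behavior, not the workflow's code: the function name choose_rooting and the returned labels are hypothetical.

```python
def choose_rooting(outgroup=None, midpoint=False):
    """Illustrative only: mirrors the documented precedence of rooting inputs.

    The function name and the returned labels are hypothetical.
    """
    if outgroup:
        # An outgroup tip wins even if midpoint is also requested.
        return "outgroup"
    if midpoint:
        return "midpoint"
    # No rooting option supplied: the trees are inspected as provided
    # to determine whether they are rooted or unrooted.
    return "detect"


print(choose_rooting(outgroup="sample_A", midpoint=True))
print(choose_rooting(midpoint=True))
print(choose_rooting())
```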

phylovalidate_flag errors

The phylovalidate_flag output reports information that may confound distance calculation. For example, the "polytomy" flag is raised because a polytomy with non-0 length branches descending from it may lead to erroneous distances if its tips are reported in a different order. In other words, phylogenies with the same topology may be reported with a non-0 distance if the tips within a polytomy are rearranged within the tree file.

If flags are accompanied by a ">0" phylovalidate_distance, then this indicates no distance was calculated; e.g. the "edge_count_mismatch" flag is raised when the number of edges differs between trees and a distance could not be calculated.

Inputs

| Terra Task Name | Variable | Type | Description | Default Value | Terra Status |
|---|---|---|---|---|---|
| phylocompare | tree1 | File | Path to a newick-formatted phylogenetic tree in an accessible bucket | | Required |
| phylocompare | tree2 | File | Path to a newick-formatted phylogenetic tree in an accessible bucket | | Required |
| phylovalidate_task | memory | Int | Amount of memory/RAM (in GB) to allocate to the task | 4 | Optional |
| root_tree1_task | memory | Int | Amount of memory/RAM (in GB) to allocate to the task | 4 | Optional |
| root_tree2_task | memory | Int | Amount of memory/RAM (in GB) to allocate to the task | 4 | Optional |
| cophylo_task | cpu | Int | Number of CPUs to allocate to the task | 1 | Optional |
| cophylo_task | disk_size | Int | Amount of storage (in GB) to allocate to the task | 10 | Optional |
| cophylo_task | docker | String | Docker image to use for the task | us-docker.pkg.dev/general-theiagen/theiagen/theiaphylo:0.2.0 | Optional |
| cophylo_task | memory | Int | Amount of memory (in GB) to allocate to the task | 4 | Optional |
| phylocompare | midpoint | Boolean | Root phylogenies at their midpoint | False | Optional |
| phylocompare | outgroup | String | Root phylogenies with an outgroup tip | | Optional |
| phylocompare | validate | Boolean | Run phylogenetic validation by calculating the distance between two phylogenies' tips and branching order | False | Optional |
| phylovalidate_task | cpu | Int | Number of CPUs to allocate to the task | 1 | Optional |
| phylovalidate_task | disk_size | Int | Amount of storage (in GB) to allocate to the task | 10 | Optional |
| phylovalidate_task | docker | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/theiagen/theiaphylo:0.2.0 | Optional |
| phylovalidate_task | max_distance | Float | Maximum tolerable distance during validation | | Optional |
| phylovalidate_task | resolve_tip_discrepancies | Boolean | Remove tips that are discrepant between trees instead of failing | True | Optional |
| root_tree1_task | cpu | Int | Number of CPUs to allocate to the task | 1 | Optional |
| root_tree1_task | disk_size | Int | Amount of storage (in GB) to allocate to the task | 10 | Optional |
| root_tree1_task | docker | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/theiagen/theiaphylo:0.1.8 | Optional |
| root_tree2_task | cpu | Int | Number of CPUs to allocate to the task | 1 | Optional |
| root_tree2_task | disk_size | Int | Amount of storage (in GB) to allocate to the task | 10 | Optional |
| root_tree2_task | docker | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/theiagen/theiaphylo:0.1.8 | Optional |
| version_capture | docker | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/theiagen/alpine-plus-bash:3.20.0 | Optional |
| version_capture | timezone | String | Set the time zone to get an accurate date of analysis (uses UTC by default) | | Optional |
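
For command-line use, the inputs above are typically supplied as a JSON file. The sketch below builds one with Python's standard library. The key prefixes assume the workflow namespace is phylocompare and that task-level inputs are addressed as phylocompare.<task>.<variable>, which should be checked against the WDL; the bucket paths are placeholders.

```python
import json

# Hypothetical inputs: the "phylocompare." namespace and the gs:// paths
# are assumptions for illustration, not values taken from the workflow.
inputs = {
    "phylocompare.tree1": "gs://example-bucket/trees/tree1.nwk",
    "phylocompare.tree2": "gs://example-bucket/trees/tree2.nwk",
    "phylocompare.midpoint": True,    # root both trees at their midpoint
    "phylocompare.validate": True,    # also calculate the tree distance
    "phylocompare.phylovalidate_task.max_distance": 0.0,
}

inputs_json = json.dumps(inputs, indent=2)
print(inputs_json)
```

Only tree1 and tree2 are required; every other key can be omitted to accept the defaults listed in the table.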

Workflow Tasks

root_phylo

root_phylo returns a rooted phylogeny, rooted either on a supplied outgroup or at the midpoint. Outgroups must be tip names (case-sensitive) that exist within the tree.

Root_Phylo Technical Details

Links
Task task_root_phylo.wdl
Software Source Code https://github.com/theiagen/theiaphylo
Software Documentation TheiaPhylo

cophylogeny

The cophylogeny task generates cophylogeny plots of two inputted phylogenies. A cophylogeny plot draws lines between two trees' tips as a method for visualizing their topological (tip/branch arrangement) differences.

Two plots are generated: one with branch lengths (cophylogeny_plot_with_branch_lengths) and one without branch lengths (cophylogeny_plot). The plot without branch lengths is better for depicting branching order differences, though it is important to note that the branch lengths within this plot are arbitrary and do not convey evolutionary distance. Users will most likely need to visualize the phylogenies independently to interpret evolutionary distance because it is difficult to automatically graph two phylogenies with scaled and viewable branch lengths.
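
To make the idea of linked tips concrete, the sketch below lists pairs of tips whose connecting lines would cross when the two trees' tips are drawn in the given top-to-bottom order. It illustrates what the plot conveys and is not the plotting code used by the task; the function name crossing_links is hypothetical.

```python
from itertools import combinations


def crossing_links(order_left, order_right):
    """Return pairs of shared tips whose links cross between two tip orders.

    Illustrative only: a pair crosses when the two tips appear in opposite
    relative order in the left and right trees.
    """
    position = {tip: i for i, tip in enumerate(order_right)}
    shared = [tip for tip in order_left if tip in position]
    return [
        (a, b)
        for a, b in combinations(shared, 2)
        if position[a] > position[b]
    ]


left = ["A", "B", "C", "D"]
right = ["A", "C", "B", "D"]
print(crossing_links(left, right))  # [('B', 'C')]
```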

Cophylogeny Technical Details

Links
Task task_cophylogeny.wdl
Software Source Code https://github.com/theiagen/theiaphylo
Software Documentation TheiaPhylo

phylovalidate

phylovalidate cleans two phylogenies and validates whether the distance between their topologies does not exceed an inputted max_distance float (0 by default). Phylogenies are cleaned by collapsing nodes with 0-length branches into polytomies, and any detected polytomies are reported as a flag. Polytomies may arbitrarily yield a non-0 distance, though a 0 distance reported alongside a polytomy indicates that the polytomy did not confound distance calculation. Trees can only be compared if the number of nodes is the same in both trees. Additionally, the tips must be the same between trees, though the resolve_tip_discrepancies boolean is set to "true" by default to remove discrepant tips.
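
The tip-matching requirement can be sketched as follows. This is a minimal illustration of the behavior described above, not the task's implementation; the function name reconcile_tips is hypothetical and tips are represented as plain lists of names.

```python
def reconcile_tips(tips1, tips2, resolve_tip_discrepancies=True):
    """Return (shared, discrepant) tip names for two trees.

    Illustrative only: discrepant tips are those present in exactly one
    tree. They are dropped when resolve_tip_discrepancies is True;
    otherwise the comparison fails.
    """
    set1, set2 = set(tips1), set(tips2)
    shared = set1 & set2
    discrepant = set1 ^ set2
    if discrepant and not resolve_tip_discrepancies:
        raise ValueError(f"Tips differ between trees: {sorted(discrepant)}")
    return sorted(shared), sorted(discrepant)


shared, removed = reconcile_tips(["A", "B", "C"], ["B", "C", "D"])
print(shared)   # ['B', 'C']
print(removed)  # ['A', 'D']
```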

It is difficult to conceptualize what a non-0 distance indicates, so please see the cited references for interpretation. For unrooted phylogenies, phylovalidate calculates the Lin-Rajan-Moret distance, and for rooted phylogenies, phylovalidate calculates the matching cluster distance. The Robinson-Foulds distance is also calculated, though it is disregarded in validation (see citations for criticism).
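
As a point of reference for what a topological distance measures, the sketch below computes the Robinson-Foulds distance for two rooted trees as the number of clusters (sets of tips descending from an internal node) found in one tree but not the other. It is a simplified illustration rather than the TheiaPhylo implementation: trees are written as nested tuples instead of newick, and the function names are hypothetical. The Lin-Rajan-Moret and matching cluster distances used for validation are more involved and are not shown.

```python
def clusters(tree):
    """Collect the non-trivial clusters of a rooted tree.

    A tree is a nested tuple whose leaves are tip-name strings, e.g.
    ((("A", "B"), "C"), "D"). Single tips and the root cluster are
    excluded because every tree on the same tips shares them.
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        tips = frozenset().union(*(walk(child) for child in node))
        found.add(tips)
        return tips

    found.discard(walk(tree))
    return found


def robinson_foulds(tree_a, tree_b):
    """Count clusters present in exactly one of the two rooted trees."""
    return len(clusters(tree_a) ^ clusters(tree_b))


tree1 = ((("A", "B"), "C"), "D")
tree2 = ((("A", "C"), "B"), "D")
print(robinson_foulds(tree1, tree2))  # 2
print(robinson_foulds(tree1, ("D", ("C", ("B", "A")))))  # 0
```

The second call returns 0 because reordering children does not change the topology, which is the situation the cophylogeny plot alone cannot distinguish from a true rearrangement.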

PhyloValidate Technical Details

Links
Task task_phylovalidate.wdl
Software Source Code https://github.com/theiagen/theiaphylo
Software Documentation TheiaPhylo

Outputs

| Variable | Type | Description |
|---|---|---|
| cophylogeny_plot | File | A cophylogeny plot depicting branching order differences between two phylogenies without branch lengths |
| cophylogeny_plot_with_branch_lengths | File | A cophylogeny plot depicting branching order differences between two phylogenies with branch lengths |
| cophylogeny_version | String | Version of the TheiaPhylo repository used for analysis |
| phylocompare_phb_version | String | The version of the Public Health Bioinformatics (PHB) repository used |
| phylovalidate_distance | String | The quantitative distance between two phylogenies' tip/branch arrangements |
| phylovalidate_flag | String | Flag reporting potential confounding factors encountered during validation |
| phylovalidate_report | File | Report file summarizing the validation results |
| phylovalidate_tree1_clean | File | Cleaned version of the first phylogenetic tree |
| phylovalidate_tree2_clean | File | Cleaned version of the second phylogenetic tree |
| phylovalidate_version | String | Version of phylovalidate used |

References