The ETE toolkit provides tools for the automated manipulation, analysis and visualization of any type of hierarchical tree. It is free software that was developed for academic purposes.
ETE is a Python programming toolkit that offers a large number of tree-handling options, in particular methods for analyzing phylogenetic and clustering trees and for handling and visualizing tree topologies. Other ETE features include node annotation, automatic orthology and paralogy detection, independent editing of tree partitions, tree reconciliation, and a highly customizable tree drawing engine.
The ETE toolkit was developed at several scientific institutions: the bioinformatics department of CIPF worked on it first, the comparative genomics unit of CRG then enhanced it, and Jaime Huerta-Cepas now maintains the project at the Structural and Computational Biology unit of EMBL in Germany.
ETE is mainly used in bioinformatics analyses, since the results of fields such as gene clustering and phylogenetics can be represented as hierarchical trees. Other applications can also perform tree visualization, several types of analysis, or both, but most of them lack scalability. The ETE toolkit can handle tree structures at large scale and with a high level of automation.
The ETE toolkit functionality
- Trees are Python objects, so you can create, load, prune, concatenate, modify and search hierarchical tree structures with the ETE Python API.
- Tree visualization offers full control over tree images, including browsing and rendering.
- The node annotation feature allows custom node attributes to be rendered as graphical elements.
- The Jupyter notebook framework is supported for method prototyping and inline visualization of trees.
- ETE can run CodeML and SLR to optimize and test models of molecular evolution.
- Phylogenetic trees can be built using reproducible workflows, which is useful for the automated reconstruction of gene and species trees.
- ETE supports efficient queries of the NCBI taxonomy database.
- The toolkit assists with summarizing phylogenetic signal from multiple gene trees into a single species tree.
- Tree images can be visualized or rendered directly from the command line. Visualizations can include multiple sequence alignments in detailed, block-based, domain-based or condensed formats.
- ETE provides tree topology comparison. You can compare trees of different sizes and estimate distances between gene and species trees.
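To make the idea of comparing tree topologies more concrete, here is a minimal sketch in plain Python. It is not the ETE API and not ETE's implementation: the nested-tuple tree representation and the function names `leaves`, `clades` and `clade_distance` are assumptions made for this illustration. It counts the clades (sets of leaves under an internal node) that appear in only one of two rooted trees, which is a Robinson-Foulds-style distance, and it assumes both trees share the same leaf names.

```python
# Illustration only: a plain-Python stand-in, not the ETE API.
# A tree is a nested tuple; a leaf is a string with the leaf name.

def leaves(tree):
    """Return the set of leaf names found under a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree):
    """Return the leaf sets of all internal nodes below the root."""
    found = set()

    def visit(node, is_root):
        if isinstance(node, str):
            return  # leaves define no clade of their own here
        if not is_root:
            found.add(leaves(node))
        for child in node:
            visit(child, False)

    visit(tree, True)
    return found


def clade_distance(tree_a, tree_b):
    """Count clades present in exactly one of the two rooted trees."""
    return len(clades(tree_a) ^ clades(tree_b))


# Hypothetical example trees over the same five leaves.
gene_tree = ((("A", "B"), "C"), ("D", "E"))
species_tree = ((("A", "C"), "B"), ("D", "E"))
same_tree = (("E", "D"), ("C", ("B", "A")))  # gene_tree with children reordered

print(clade_distance(gene_tree, species_tree))  # 2: {A,B} versus {A,C}
print(clade_distance(gene_tree, same_tree))     # 0: child order does not matter
```

Because clades are stored as sets of leaf names, the order in which children are written has no effect on the result; only the grouping of leaves does. Comparing trees of different sizes, as mentioned in the list above, would require additional handling that this sketch leaves out.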
Various tools benefit from ETE, for instance Polyphony (analysis and comparison of the 3D structures of multiple protein molecules), ReproPhylo (reproducible phylogenomics pipeline), Avocado (Linux automated testing suite), ITEP (exploration of microbial pan-genomes) and TreeKO (tree topology comparison).
The ETE toolkit was developed for phylogenetic analysis; it supports large tree data structures and an interactive tree visualization system. Its main advantage is the capability to handle large-scale projects. ETE is free software, and you can learn more and download it from the ETE website.