mebioda

Large phylogenies in biodiversity

Why care about large phylogenies outside systematics?

Large, ad hoc phylogenies: Mammals

ORP Bininda-Emonds, et al. 2007. The delayed rise of present-day mammals. Nature 446: 507–512 doi:10.1038/nature05634

Large, ad hoc phylogenies: Birds

W Jetz, et al. 2012. The global diversity of birds in space and time. Nature 491: 444–448 doi:10.1038/nature11631

Large, ad hoc phylogenies: Plants

AE Zanne, et al. 2014. Three keys to the radiation of angiosperms into freezing environments. Nature 506: 89–92 doi:10.1038/nature12872

Large, ongoing, phylogenetic projects

Apart from these ad hoc projects, in which a tree was published once, there are ongoing initiatives that periodically release phylogeny estimates for a given taxonomic group and/or marker.

Examples from molecular biodiversity:

Examples of species tree initiatives:

Tools to operate on large phylogenies

Given the increasing availability of large phylogenies for different taxonomic groups, there is probably a tree out there that comes fairly close to the set of taxa you’re interested in. Re-using such a tree usually still takes some additional steps:

  1. Taxonomic name resolution - for example, to reconcile the taxon names in the tree with those used in other data sets you already have (such as occurrence data, trait data, etc.).
  2. Pruning - to reduce the large input tree to the set of taxa of interest, the many taxa you do not need are removed from the tree.
  3. Grafting - if a small number of taxa of interest is missing from the tree, there are algorithms that can place those taxa in the tree (at least, “close enough”) on the basis of the location of related species.
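
The pruning and grafting steps in the list above can be sketched on a toy tree. This is only an illustration under simplifying assumptions: the tree is a nested tuple of tip labels without branch lengths, the function names `prune`, `graft`, and `to_newick` are hypothetical helpers invented for this example, and the grafting rule (attach the new taxon as sister to a named relative) is a stand-in for the more careful placement that real tools perform. In practice you would more likely rely on an existing library, for example `drop.tip` in the R package ape.

```python
def prune(tree, keep):
    """Restrict `tree` to the tip labels in `keep`.

    A tree is either a tip label (str) or a tuple of subtrees.
    Returns None when no tip below this node is kept.
    """
    if isinstance(tree, str):
        return tree if tree in keep else None
    children = [p for p in (prune(c, keep) for c in tree) if p is not None]
    if not children:
        return None
    if len(children) == 1:
        # A node left with a single child is collapsed; with branch
        # lengths, the two edges would be summed here.
        return children[0]
    return tuple(children)


def graft(tree, new_tip, sister):
    """Attach `new_tip` as sister to the existing tip `sister`."""
    if isinstance(tree, str):
        return (tree, new_tip) if tree == sister else tree
    return tuple(graft(c, new_tip, sister) for c in tree)


def to_newick(tree):
    """Serialize the nested-tuple tree as a Newick string (topology only)."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(to_newick(c) for c in tree) + ")"


# Toy input tree and taxon set, made up for illustration.
toy_tree = (
    (("Zea_mays", "Oryza_sativa"), "Musa_acuminata"),
    (("Solanum_tuberosum", "Solanum_lycopersicum"), "Quercus_robur"),
)
crops = {"Zea_mays", "Oryza_sativa", "Solanum_tuberosum"}

pruned = prune(toy_tree, crops)
print(to_newick(pruned) + ";")
# ((Zea_mays,Oryza_sativa),Solanum_tuberosum);

# Place a taxon missing from the tree next to a congener.
grafted = graft(pruned, "Solanum_melongena", "Solanum_tuberosum")
print(to_newick(grafted) + ";")
# ((Zea_mays,Oryza_sativa),(Solanum_tuberosum,Solanum_melongena));
```

Note that pruning collapses internal nodes that are left with a single descendant, which is why the pruned tree stays strictly bifurcating.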

Taxonomic name resolution

When integrating data sets, you will often need to reconcile taxonomic names from different data sources. To support this, numerous data resources have APIs that allow for lookups, sometimes fuzzy ones, of names, synonyms, and alternative spellings.
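
To make the idea of a fuzzy name lookup concrete, here is a minimal local sketch using only the Python standard library. It does not call any of the online services and does not reflect how they work internally: the reference list, the synonym table, the function name `resolve`, and the similarity cutoff of 0.85 are all assumptions chosen for illustration.

```python
import difflib

# Hypothetical reference data; a real workflow would query a taxonomic database.
reference_names = [
    "Solanum lycopersicum",
    "Solanum tuberosum",
    "Zea mays",
    "Oryza sativa",
]
synonyms = {"Lycopersicon esculentum": "Solanum lycopersicum"}


def resolve(name, cutoff=0.85):
    """Map a raw name to an accepted name, reporting how it was matched."""
    # Normalize whitespace and capitalization ("  zea   MAYS " -> "Zea mays").
    cleaned = " ".join(name.split()).capitalize()
    if cleaned in reference_names:
        return cleaned, "exact"
    if cleaned in synonyms:
        return synonyms[cleaned], "synonym"
    # Fuzzy step: closest candidate above the similarity cutoff, if any.
    candidates = reference_names + list(synonyms)
    matches = difflib.get_close_matches(cleaned, candidates, n=1, cutoff=cutoff)
    if matches:
        best = matches[0]
        return synonyms.get(best, best), "fuzzy"
    return None, "unresolved"


for raw in ["  zea   MAYS ", "Solanum tuberosm", "Lycopersicon esculentum", "Panthera leo"]:
    print(repr(raw), "->", resolve(raw))
```

Reporting the match type alongside the resolved name is a deliberate choice here: fuzzy matches are worth checking by hand before they are used to join data sets.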

The taxize package allows you to scan all these different databases for name variants, common names, and higher classifications.

In this exercise, compare the outputs of NCBI and ITIS.

Pruning and grafting

In this exercise, prune the PhytoPhylo tree down to just our set of crop species, and inspect the subtree. What are some of the higher groups you recognize?