mebioda

Metabarcoding

General workflow of metabarcoding assays

Species identification of gut contents of permafrost grazers

Analysis workflow:

  1. demultiplex on Ion Torrent adaptors; filter on Phred quality (Q20) and read length (>=100 bp)
  2. cluster reads with CD-HIT
  3. BLAST against NCBI nr
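The quality/length filter in step 1 can be sketched as follows. This is a minimal illustration, not the actual pipeline code: the read representation (a sequence string plus a list of per-base Phred scores) is an assumption, and a real workflow would parse FASTQ records instead.

```python
# Hypothetical read filter: keep reads whose mean Phred score is >= 20
# and whose length is >= 100 bp. Read representation is illustrative only.

def passes_filters(sequence, phred_scores, min_q=20, min_len=100):
    """Return True if the read meets the quality and length thresholds."""
    if len(sequence) < min_len:
        return False
    mean_q = sum(phred_scores) / len(phred_scores)
    return mean_q >= min_q

# Toy example: a 100 bp read with uniform quality 30 passes,
# the same read with uniform quality 10 does not.
read = 'ACGT' * 25
print(passes_filters(read, [30] * 100))  # True
print(passes_filters(read, [10] * 100))  # False
```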

Mid-Holocene horse

B Gravendeel, A Protopopov, I Bull, E Duijm, F Gill, A Nieman, N Rudaya, A N Tikhonov, S Trofimova, GBA van Reenen, R Vos, S Zhilich & B van Geel. 2014. Multiproxy study of the last meal of a mid-Holocene Oyogos Yar horse, Sakha Republic, Russia. The Holocene 24(10): 1288-1296 doi:10.1177/0959683614540953

Early Holocene Yakutian bison

B van Geel, A Protopopov, I Bull, E Duijm, F Gill, Y Lammers, A Nieman, N Rudaya, S Trofimova, A N Tikhonov, R Vos, S Zhilich, B Gravendeel. 2014. Multiproxy diet analysis of the last meal of an early Holocene Yakutian bison. Journal of Quaternary Science 29(3): 261-268 doi:10.1002/jqs.2698

Joining CITES listing with species detection in organic mixtures

Y Lammers, T Peelen, R A Vos & B Gravendeel. 2014. The HTS barcode checker pipeline, a tool for automated detection of illegally traded species from high-throughput sequencing data. BMC Bioinformatics 15:44 doi:10.1186/1471-2105-15-44

First, species from the CITES appendices are joined with the species in the NCBI taxonomy using the GlobalNames taxonomic name resolution service. A simple request to this service would be:

curl -o globalnames.json 'http://resolver.globalnames.org/name_resolvers.json?names=Homo+sapiens'

This returns a large JSON document. In a script, you might process such data as follows:

# perform a single request to the GlobalNames resolver service
# usage: tnrs.py 'Homo sapiens'
import requests, sys

url = 'http://resolver.globalnames.org/name_resolvers.json'
try:
    response = requests.get(url, params={'names': sys.argv[1]}, allow_redirects=True)
    response.raise_for_status()
    json_data = response.json()
except (requests.RequestException, ValueError):
    # network failure or unparseable response: fall back to an empty result
    json_data = {}

# print the NCBI taxon ID of every match returned for every queried name
for data_dict in json_data.get('data', []):
    for results_dict in data_dict.get('results', []):
        if results_dict.get('data_source_title') == 'NCBI':
            print(results_dict['taxon_id'])

Using this logic, a file-based database is populated:
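A minimal sketch of populating such a file-based database, assuming resolved CITES species are written to a tab-separated file of species name, appendix, and NCBI taxon ID. The file layout and the `cites_to_ncbi` mapping are illustrative assumptions, not the pipeline's actual format.

```python
import csv, os, tempfile

# Hypothetical mapping produced by the name-resolution step:
# CITES-listed name -> (appendix, NCBI taxon ID). Values are illustrative.
cites_to_ncbi = {
    'Panthera tigris': ('I', '9694'),
    'Panax quinquefolius': ('II', '4054'),
}

# Write the resolved records to a tab-separated file-based database
db_path = os.path.join(tempfile.gettempdir(), 'cites_db.tsv')
with open(db_path, 'w', newline='') as handle:
    writer = csv.writer(handle, delimiter='\t')
    writer.writerow(['species', 'appendix', 'ncbi_taxon_id'])
    for species, (appendix, taxon_id) in sorted(cites_to_ncbi.items()):
        writer.writerow([species, appendix, taxon_id])
```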

Querying the CITES-annotated reference database

Comparing treatments: metabarcoding the Deepwater Horizon oil spill

HM Bik, KM Halanych, J Sharma & WK Thomas. 2012. Dramatic Shifts in Benthic Microbial Eukaryote Communities following the Deepwater Horizon Oil Spill. PLoS ONE 7(6): e38550 doi:10.1371/journal.pone.0038550

Deepwater Horizon sampling design

This study used 454 data processed with the QIIME pipeline. The reads were assumed to be structured according to the following primer and amplicon construct:

In this case, the data were generated under the following experimental design:

Oil spill impact: dramatic shifts in benthic microbial eukaryote communities

Accordingly, the reads were demultiplexed following this complex mapping. The reads were then clustered with UCLUST and denoised. Finally, taxonomic identification of each cluster was performed using MegaBLAST, resulting in a sample by taxon table alternatively visualized as follows:
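The sample-by-taxon table itself can be sketched as a simple tally over per-read (sample, taxon) assignments. The sample and taxon names below are made up for illustration; they are not from the study.

```python
from collections import Counter

# Hypothetical per-read assignments after demultiplexing, clustering,
# and MegaBLAST identification: (sample, taxon) pairs. Illustrative only.
assignments = [
    ('pre_spill_1', 'Nematoda'), ('pre_spill_1', 'Nematoda'),
    ('pre_spill_1', 'Fungi'),    ('post_spill_1', 'Fungi'),
    ('post_spill_1', 'Fungi'),
]

counts = Counter(assignments)
samples = sorted({s for s, _ in assignments})
taxa = sorted({t for _, t in assignments})

# Print the table: rows are samples, columns are taxa, cells are read counts
print('sample\t' + '\t'.join(taxa))
for sample in samples:
    row = [str(counts[(sample, taxon)]) for taxon in taxa]
    print(sample + '\t' + '\t'.join(row))
```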

Phylogenetic diversity

To quantify the turnover between sites and treatments, it is useful to compute metrics of phylogenetic β diversity, such as UniFrac.

Squares, triangles, and circles denote sequences derived from different communities. Branches attached to nodes are colored black if they are unique to a particular environment and gray if they are shared.
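To make the idea concrete, here is a minimal sketch of unweighted UniFrac on a hardcoded toy tree: the metric is the fraction of total branch length that leads only to leaves from a single community. The branches, lengths, and community labels are made up for illustration.

```python
# Unweighted UniFrac on a toy tree, hardcoded as a list of branches.
# Each branch carries a length and the set of leaves below it; each leaf
# is tagged with the community (A or B) it was sampled from. Illustrative.
branches = [
    (1.0, {'a1'}), (1.0, {'a2'}), (1.0, {'b1'}), (1.0, {'b2'}),
    (0.5, {'a1', 'a2'}),  # internal branch above a1 and a2
    (0.5, {'b1', 'b2'}),  # internal branch above b1 and b2
]
community = {'a1': 'A', 'a2': 'A', 'b1': 'B', 'b2': 'B'}

# A branch is "unique" if all leaves below it come from one community
unique = sum(length for length, leaves in branches
             if len({community[leaf] for leaf in leaves}) == 1)
total = sum(length for length, _ in branches)
print(unique / total)  # 1.0: the two communities share no branches
```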

Constructing a large metabarcoding phylogeny

Principal Coordinate Analysis

The authors claim that the pre-spill sites were more diverse. This is not very obvious from the 2D plot, but perhaps it is clearer in an interactive 3D view.
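Classical PCoA can be sketched in a few lines: square the distance matrix, apply Gower's double centering, and take the leading eigenvectors scaled by the square roots of their eigenvalues. The 3x3 distance matrix below is made up for illustration, not taken from the paper.

```python
import numpy as np

# Toy distance matrix: three points on a line (illustrative data)
D = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 0.0]])

n = D.shape[0]
J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
B = -0.5 * J @ (D ** 2) @ J           # Gower's double centering
eigvals, eigvecs = np.linalg.eigh(B)

# Sort eigenpairs in descending order and keep positive eigenvalues;
# coordinates are eigenvectors scaled by sqrt(eigenvalue)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
keep = eigvals > 1e-9
coords = eigvecs[:, keep] * np.sqrt(eigvals[keep])
print(coords.shape)  # one informative axis for collinear points
```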

The supplementary data accompanying the paper include PCoA results as *.kin files, alongside folders containing the KiNG viewer (king.jar). Use the viewer to open some of the *.kin files. Were the pre-spill sites more diverse?