Microbiome Taxonomic Profile (MTP) Documentation & Glossary

Below is a reference to the underlying algorithms used throughout the EzBioCloud and TrueBac ID family of tools and applications. If you cannot find the algorithm or documentation you are looking for, please reach out to us and we will get back to you as soon as we can.

ACE

ACE (abundance-based coverage estimator) is an indicator of species richness (total number of species in a sample) that is sensitive to rare OTUs (low-abundance OTUs such as singletons and doubletons). Higher values indicate higher diversity.
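As an illustration, the sketch below computes ACE in the form commonly derived from Chao and Lee (1992). It is a minimal example and not necessarily the exact implementation used by EzBioCloud: the function name `ace`, the input format (a list of read counts per OTU), and the conventional rare-OTU threshold of 10 reads are all assumptions made for this example.

```python
def ace(otu_counts, rare_threshold=10):
    """Abundance-based coverage estimator of richness (illustrative sketch).

    otu_counts: read counts per OTU (hypothetical input format).
    rare_threshold: OTUs with this many reads or fewer are treated as rare
    (10 is the conventional choice; an assumption here).
    """
    counts = [c for c in otu_counts if c > 0]
    rare = [c for c in counts if c <= rare_threshold]
    s_abund = len(counts) - len(rare)      # number of abundant OTUs
    s_rare = len(rare)                     # number of rare OTUs
    if s_rare == 0:
        return float(s_abund)              # nothing rare: estimate equals observed
    n_rare = sum(rare)                     # reads belonging to rare OTUs
    f1 = sum(1 for c in rare if c == 1)    # singletons
    if f1 == n_rare:
        # Sample coverage of the rare group is zero, so ACE is undefined.
        raise ValueError("ACE is undefined when every rare OTU is a singleton")
    c_ace = 1 - f1 / n_rare                # coverage of the rare group
    # sum over rare OTUs of c*(c-1) equals sum_i i*(i-1)*F_i
    pair_sum = sum(c * (c - 1) for c in rare)
    gamma_sq = max(s_rare / c_ace * pair_sum / (n_rare * (n_rare - 1)) - 1, 0.0)
    return s_abund + s_rare / c_ace + (f1 / c_ace) * gamma_sq


if __name__ == "__main__":
    print(ace([20, 15, 4, 3, 2, 2, 1, 1]))
```

The estimate is never lower than the observed OTU count, and the gap grows as singletons make up a larger share of the rare reads.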

Reference

Chao, A., and Lee, S.-M. “Estimating the number of classes via sample coverage.” Journal of the American Statistical Association 87.417 (1992): 210-217.

Chao1

Chao1 is an indicator of species richness (total number of species in a sample) that is sensitive to rare OTUs (singletons and doubletons). Higher values indicate higher diversity.
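As a rough illustration, Chao1 can be computed from the observed OTU count plus a correction built from singletons (F1) and doubletons (F2). The sketch below uses the classic form S_obs + F1^2 / (2 * F2) and falls back to the bias-corrected form when there are no doubletons. The function name `chao1` and the input format (a list of read counts per OTU) are assumptions for this example, not the EzBioCloud implementation.

```python
def chao1(otu_counts):
    """Chao1 richness estimate from per-OTU read counts (illustrative sketch)."""
    counts = [c for c in otu_counts if c > 0]
    s_obs = len(counts)                        # observed OTUs
    f1 = sum(1 for c in counts if c == 1)      # singletons
    f2 = sum(1 for c in counts if c == 2)      # doubletons
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)      # classic form
    # Bias-corrected form, used here to avoid dividing by zero doubletons.
    return s_obs + f1 * (f1 - 1) / 2


if __name__ == "__main__":
    print(chao1([10, 5, 2, 2, 1, 1, 1, 1]))  # 8 observed OTUs, F1 = 4, F2 = 2
```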

Reference

Chao, A. “Estimating the population size for capture-recapture data with unequal catchability.” Biometrics (1987): 783-791.

Clone

A clone is an individual sequence that was not included in contigs.

Contig

A contig is a set of identical or overlapping sequences that together represent a consensus region of DNA.

Diversity indices

Diversity indices are measures of species diversity, based on the number and pattern of OTUs observed in the sample. The indices include statistical estimates of species richness (ACE, Chao1, Jackknife) and estimates of species evenness (Shannon, Simpson, NPShannon).

Good's coverage of library (%)

This is an index of the extent to which the number of sequencing reads used for analysis represents the actual species population of the sample. The value can range from 0 to 100%, with 100% indicating a complete sampling of species, meaning that additional sequencing is unlikely to find any more new species.
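For illustration, Good's coverage is usually computed as 1 - F1 / N, where F1 is the number of singleton OTUs and N is the total number of reads, expressed here as a percentage. The function name `goods_coverage` and the input format (a list of read counts per OTU) are assumptions for this sketch.

```python
def goods_coverage(otu_counts):
    """Good's coverage in percent from per-OTU read counts (illustrative sketch)."""
    total_reads = sum(otu_counts)
    if total_reads == 0:
        raise ValueError("coverage is undefined for an empty sample")
    singletons = sum(1 for c in otu_counts if c == 1)
    return 100.0 * (1 - singletons / total_reads)


if __name__ == "__main__":
    # 23 reads, 4 of them in singleton OTUs
    print(round(goods_coverage([10, 5, 2, 2, 1, 1, 1, 1]), 2))
```

A sample with no singletons gives 100%, while a sample made up entirely of singletons gives 0%.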

Reference

Good, I. J. “The population frequencies of species and the estimation of population parameters.” Biometrika (1953): 237-264.

Jackknife

Jackknife is an indicator of species richness (total number of species in a sample) that is sensitive to rare OTUs (singletons and doubletons) as well as to abundant OTUs (tripletons and more). Higher values indicate higher diversity.

Reference

Burnham, K. P. & Overton, W. S. (1979) Robust estimation of population size when capture probabilities vary among animals. Ecology, 60, 927-936.

No. of OTUs found in the sample

An Operational Taxonomic Unit (OTU) is a group of sequences clustered by sequence similarity. Because many bacterial species share more than 97% sequence similarity with other species, the OTU count does not necessarily equal the actual number of different species. This value is the number of OTUs observed in the experiment and may differ from the total number of OTUs (species richness) in the sample.

NPShannon

NPShannon is an indicator of species evenness (proportional distribution of the number of each species in a sample) that estimates diversity when there are unseen species and unknown abundance. Values are greater than 0, and higher values indicate higher diversity.
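One widely used non-parametric Shannon estimator that accounts for unseen species is the coverage-adjusted estimator of Chao and Shen (2003). Whether NPShannon in this tool follows exactly that formula is an assumption. The sketch below, with the hypothetical function name `np_shannon` and a list of per-OTU read counts as input, shows the idea: each relative abundance is scaled by the estimated sample coverage, and each term is weighted by the probability that the OTU was detected at all.

```python
import math


def np_shannon(otu_counts):
    """Coverage-adjusted Shannon estimate (Chao-Shen style, illustrative sketch)."""
    counts = [c for c in otu_counts if c > 0]
    n = sum(counts)
    if n == 0:
        raise ValueError("index is undefined for an empty sample")
    f1 = sum(1 for c in counts if c == 1)
    if f1 == n:
        f1 = n - 1                       # avoid zero coverage (a common convention)
    coverage = 1 - f1 / n                # estimated sample coverage
    h = 0.0
    for c in counts:
        p = coverage * c / n             # coverage-adjusted relative abundance
        detect = 1 - (1 - p) ** n        # chance this OTU is seen in n reads
        h -= p * math.log(p) / detect
    return h


if __name__ == "__main__":
    print(np_shannon([10, 5, 2, 2, 1, 1, 1, 1]))
```

When there are no singletons and every OTU is well sampled, the result is close to the ordinary Shannon index.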

Reference

Magurran, A. E. (2013). Measuring biological diversity. John Wiley & Sons.

OTU-Cutoff

This is the sequence similarity value used for OTU calculation, species-level identification against the reference database, and de novo clustering. 97% is commonly used for Bacteria.

OTU-picking Method

This field indicates which clustering method was used to form OTUs from sequenced reads. With CL_OPEN_REF_UCLUST_MC2, each read is first identified at the species level against the reference database using a given similarity cutoff. Reads that fall below this cutoff are pooled, and UCLUST performs de novo clustering on them to generate additional OTUs. This strategy is called open-reference OTU picking. Finally, OTUs containing a single read (singletons) are omitted from further analysis.
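The three stages described above (reference matching, de novo clustering of the leftovers, singleton removal) can be sketched with a toy example. Everything here is hypothetical and simplified: `identity` is a position-by-position match on equal-length strings standing in for a real alignment, the de novo step is a plain greedy centroid loop rather than UCLUST itself, and the names `open_reference_otus`, `cutoff`, and `min_count` are invented for illustration.

```python
def identity(a, b):
    """Toy similarity: fraction of matching positions (stand-in for alignment)."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def open_reference_otus(reads, reference, cutoff=0.97, min_count=2):
    """Open-reference OTU picking on toy data (illustrative sketch)."""
    otus = {}          # OTU label -> member reads
    leftovers = []     # reads with no reference hit at the cutoff

    # Stage 1: assign each read to its best reference hit, if good enough.
    for read in reads:
        best_id, best_score = None, 0.0
        for ref_id, ref_seq in reference.items():
            score = identity(read, ref_seq)
            if score > best_score:
                best_id, best_score = ref_id, score
        if best_id is not None and best_score >= cutoff:
            otus.setdefault(best_id, []).append(read)
        else:
            leftovers.append(read)

    # Stage 2: greedy de novo clustering of the leftovers around centroids.
    centroids = []     # (label, seed sequence)
    for read in leftovers:
        for label, seed in centroids:
            if identity(read, seed) >= cutoff:
                otus[label].append(read)
                break
        else:
            label = f"denovo{len(centroids)}"
            centroids.append((label, read))
            otus[label] = [read]

    # Stage 3: drop OTUs with fewer than min_count reads (singletons for 2).
    return {label: members for label, members in otus.items()
            if len(members) >= min_count}


if __name__ == "__main__":
    reference = {"sp_A": "AAAAAAAAAA"}
    reads = ["AAAAAAAAAA", "AAAAAAAAAT", "CCCCCCCCCC", "CCCCCCCCCG", "GGGGGGGGGG"]
    for label, members in open_reference_otus(reads, reference, cutoff=0.9).items():
        print(label, len(members))
```

In the toy run, two reads match the reference, two form a de novo OTU, and the lone remaining read is discarded as a singleton.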

Reference

* uclust : http://drive5.com/usearch/manual/uclust_algo.html
* cdhit : http://www.bioinformatics.org/cd-hit/

Rank abundance curve

The rank abundance graph can be used to observe species evenness. The x-axis represents the rank of OTUs, ordered from most to least abundant, and the y-axis represents the relative abundance of the OTU at each rank. The curve declines toward 0 as rank increases, and the steeper the slope of the curve, the lower the species diversity.
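The points of such a curve can be derived directly from OTU read counts, as in the minimal sketch below. The function name `rank_abundance` and the input format (a list of read counts per OTU) are assumptions for illustration.

```python
def rank_abundance(otu_counts):
    """Return (rank, relative abundance) pairs, most abundant OTU first."""
    counts = sorted((c for c in otu_counts if c > 0), reverse=True)
    total = sum(counts)
    return [(rank, c / total) for rank, c in enumerate(counts, start=1)]


if __name__ == "__main__":
    for rank, abundance in rank_abundance([1, 3, 6]):
        print(rank, abundance)
```

Plotting rank on the x-axis against relative abundance on the y-axis (often on a log scale) gives the curve described above.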

Rarefaction curve

The rarefaction curve is a graph that expresses species diversity by plotting the number of OTUs observed against the size of the sample data.

The x-axis represents the number of sampled reads, and the y-axis represents the number of OTUs discovered. In general, as the number of reads increases, the number of OTUs levels off toward a maximum value.

The steeper the slope of the curve, the higher the species diversity.
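A rarefaction curve can be computed analytically rather than by repeated random subsampling: the expected number of OTUs in a subsample of n reads is the sum, over OTUs, of the probability that at least one read of that OTU is drawn. The sketch below follows that hypergeometric formulation. The function name `rarefaction_curve` and its inputs are assumptions for illustration, and exact binomial coefficients are used for clarity, which would be slow for very deep samples.

```python
from math import comb


def rarefaction_curve(otu_counts, depths):
    """Expected OTU count at each subsampling depth (illustrative sketch).

    otu_counts: read counts per OTU; depths: subsample sizes to evaluate.
    """
    counts = [c for c in otu_counts if c > 0]
    total = sum(counts)
    curve = []
    for n in depths:
        if not 0 <= n <= total:
            raise ValueError("depth must be between 0 and the total read count")
        all_draws = comb(total, n)
        # comb(total - c, n) counts the draws that miss an OTU with c reads.
        expected = sum(1 - comb(total - c, n) / all_draws for c in counts)
        curve.append((n, expected))
    return curve


if __name__ == "__main__":
    for depth, expected in rarefaction_curve([3, 2, 1], [0, 1, 3, 6]):
        print(depth, round(expected, 3))
```

At depth 0 the expected count is 0, and at the full read depth it equals the number of OTUs observed.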

Reference

Heck, K. L., van Belle, G., & Simberloff, D. (1975). Explicit calculation of the rarefaction diversity measurement and the determination of sufficient sample size. Ecology, 56(6), 1459-1461.

Shannon

Shannon is an indicator of species evenness (proportional distribution of the number of each species in a sample) that exhibits values greater than 0. Higher values indicate higher diversity, and the maximum value is achieved when all species are present in equal numbers.
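For illustration, the Shannon index is H = -sum(p_i * ln(p_i)), where p_i is the relative abundance of OTU i. The sketch below assumes the natural logarithm and a list of per-OTU read counts as input. The function name `shannon` is hypothetical.

```python
import math


def shannon(otu_counts):
    """Shannon index (natural log) from per-OTU read counts (illustrative sketch)."""
    counts = [c for c in otu_counts if c > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("index is undefined for an empty sample")
    return -sum((c / total) * math.log(c / total) for c in counts)


if __name__ == "__main__":
    print(shannon([5, 5, 5, 5]))   # perfectly even: equals ln(4)
    print(shannon([17, 1, 1, 1]))  # same OTU count, but uneven: lower value
```

With S equally abundant OTUs the index reaches its maximum, ln(S), which matches the statement above about equal numbers.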

Reference

Magurran, A. E. (2013). Measuring biological diversity. John Wiley & Sons.

Simpson

Simpson is an indicator of species evenness (proportional distribution of the number of each species in a sample) that displays the probability that two randomly selected sequences are of the same species. Values range from 0 to 1, and lower values indicate higher diversity.
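As a minimal sketch, the Simpson index can be computed as D = sum(p_i^2), the probability that two reads drawn with replacement belong to the same OTU. Drawing with replacement is an assumption here (a without-replacement variant uses n_i(n_i - 1) / (N(N - 1)) instead), as are the function name `simpson` and the list-of-counts input.

```python
def simpson(otu_counts):
    """Simpson index as sum of squared relative abundances (illustrative sketch)."""
    counts = [c for c in otu_counts if c > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("index is undefined for an empty sample")
    return sum((c / total) ** 2 for c in counts)


if __name__ == "__main__":
    print(simpson([5, 5, 5, 5]))  # even community: low value
    print(simpson([20]))          # single OTU: maximum value
```

A community dominated by one OTU gives a value near 1, and an even community with many OTUs gives a value near 0.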

Reference

Magurran, A. E. (2013). Measuring biological diversity. John Wiley & Sons.
