Tetra-Nucleotide Analysis (TNA)

A tetra-nucleotide is a DNA sequence fragment of 4 bases (e.g. AGTC or TTGG). Pride et al. (2003) showed that the frequencies of tetra-nucleotides in bacterial genomes contain a useful, albeit weak, phylogenetic signal. Although tetra-nucleotide analysis (TNA) uses information from the whole genome, it cannot replace alignment-based phylogenetic methods such as OrthoANI or 16S rRNA phylogeny. However, TNA can be useful for phylogenetic characterization when the whole genome or 16S rRNA gene sequence is not available. For example, a partial genomic fragment obtained from a metagenome can be identified by TNA (Teeling et al., 2004). TNA is also fast enough to be used as a search engine against a large genome database.

Algorithm

The information contained in a genome sequence can be transformed into an array of tetra-nucleotide frequencies (see the figure below).

Each genome sequence is now represented by the counts of the 256 possible tetra-nucleotides (4 bases at each of 4 positions). The more similar two genome sequences are, the more strongly their tetra-nucleotide patterns correlate. A statistical measure of the correlation between the tetra-nucleotide frequencies of two genomes can therefore be used as a rough estimate of their relatedness.
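The counting step can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by any particular tool: the names `TETRAMERS`, `INDEX` and `tetra_counts` are hypothetical, only one strand is counted, windows overlap, and any window containing a character other than A, C, G or T is skipped. All of these are assumptions that the text above does not specify.

```python
from itertools import product

BASES = "ACGT"
# All 4**4 = 256 tetra-nucleotides in a fixed lexicographic order.
TETRAMERS = ["".join(p) for p in product(BASES, repeat=4)]
# Map each tetra-nucleotide to its position in the count array.
INDEX = {t: i for i, t in enumerate(TETRAMERS)}


def tetra_counts(seq):
    """Return a list of 256 counts of overlapping tetra-nucleotides.

    Assumption: only the given strand is counted, and windows that
    contain ambiguous bases (e.g. N) are ignored.
    """
    counts = [0] * len(TETRAMERS)
    seq = seq.upper()
    for i in range(len(seq) - 3):
        idx = INDEX.get(seq[i:i + 4])
        if idx is not None:
            counts[idx] += 1
    return counts


if __name__ == "__main__":
    counts = tetra_counts("AGTCAGTC")
    # Print only the tetra-nucleotides that actually occur.
    for word, n in zip(TETRAMERS, counts):
        if n:
            print(word, n)
```

A sequence of length n yields at most n - 3 windows, so a short fragment fills only a small part of the 256-element array.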

The tetra-nucleotide correlation coefficient ranges from 0 to 1, and two identical genomes produce a value of 1.0.
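As a rough illustration of the comparison step, the sketch below computes a plain Pearson correlation between the tetra-nucleotide frequency vectors of two sequences. The text does not say which statistic is actually used, so Pearson on raw frequencies is an assumption here, and the helper names `tetra_frequencies` and `pearson` are hypothetical. Pearson's r can in principle fall below 0 for very dissimilar inputs, so this sketch may not reproduce the 0 to 1 scale described above exactly.

```python
import math
from itertools import product

# All 256 tetra-nucleotides in a fixed order shared by every vector.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_frequencies(seq):
    """Return relative frequencies of the 256 tetra-nucleotides in seq."""
    seq = seq.upper()
    counts = dict.fromkeys(TETRAMERS, 0)
    for i in range(len(seq) - 3):
        word = seq[i:i + 4]
        if word in counts:  # skips windows with ambiguous bases
            counts[word] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no valid tetra-nucleotide")
    return [counts[t] / total for t in TETRAMERS]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    a = tetra_frequencies("AGTCAGTCAGGTTCA")
    b = tetra_frequencies("TTTTGGGGCCCCAAAA")
    print(pearson(a, a))  # identical inputs correlate perfectly
    print(pearson(a, b))
```

Using relative frequencies rather than raw counts does not change Pearson's r, but it keeps vectors from sequences of different lengths on a comparable scale if another measure is swapped in.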

References

  1. Pride, D. T., Meinersmann, R. J., Wassenaar, T. M. & Blaser, M. J. Evolutionary implications of microbial genome tetranucleotide frequency biases. Genome Res 13, 145-158 (2003).
  2. Teeling, H., Meyerdierks, A., Bauer, M., Amann, R. & Glöckner, F. O. Application of tetranucleotide frequencies for the assignment of genomic fragments. Environ Microbiol 6, 938-947 (2004).

Last updated on April 28th, 2016 (EK)
