How to find public microbiome data based on 16S rRNA gene

The majority of microbiome taxonomic profiling data are based on the 16S rRNA gene, targeting bacteria. In this document, I will explain how to search for and locate the data you are interested in.

Researchers worldwide usually deposit their microbiome data in the NCBI SRA (Sequence Read Archive, formerly the Short Read Archive) database at https://www.ncbi.nlm.nih.gov/sra. Therefore, we will download data from this site. There are two ways to find the data you need.

1. Search the publications first and locate the related data in NCBI SRA.

Let’s assume that you are interested in the ‘snake’ microbiome.

1) Go to the NCBI PubMed site (https://www.ncbi.nlm.nih.gov/pubmed/) to search for publications related to the ‘snake’ microbiome.

2) Type “16S microbiome snake” into the search box and hit [Search].

3) In this case, we found a paper that may contain the data that we are looking for.

4) Click the link to the publication to go to the publication website.

5) In the publication, there is usually a section where specific SRA accession numbers are given. In this case, we were able to find the SRA Run accession IDs for snakes. We will use these Run IDs later.
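The steps above can also be sketched in code. The following is a minimal sketch, not the method used in this tutorial: it assumes you want to run the same PubMed keyword search through NCBI's E-utilities ESearch endpoint instead of the web page, and that you have copied the data availability text of a paper into a string. The function names `pubmed_search_url` and `extract_run_ids` are hypothetical, and the accessions in the sample sentence are made up for illustration.

```python
import re
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (programmatic counterpart of the search box).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Run accessions: SRR (NCBI), ERR (ENA) or DRR (DDBJ) followed by digits.
RUN_ID_PATTERN = re.compile(r"\b[SED]RR\d{6,}\b")


def pubmed_search_url(keywords, max_results=20):
    """Build an ESearch URL that queries PubMed for the given keywords."""
    params = {
        "db": "pubmed",
        "term": keywords,
        "retmax": max_results,
        "retmode": "json",
    }
    return ESEARCH_URL + "?" + urlencode(params)


def extract_run_ids(text):
    """Return the unique run accessions found in text, in order of appearance."""
    seen = []
    for match in RUN_ID_PATTERN.findall(text):
        if match not in seen:
            seen.append(match)
    return seen


if __name__ == "__main__":
    # Same keywords as typed into the PubMed search box in step 2.
    print(pubmed_search_url("16S microbiome snake"))

    # Hypothetical data availability sentence; these accessions are made up.
    sample = "Reads were deposited in the SRA under runs SRR0000001 and SRR0000002."
    print(extract_run_ids(sample))
```

Opening the printed URL in a browser returns the matching PubMed IDs as JSON. The regular expression only recognizes Run accessions; study or experiment accessions (for example those starting with SRP or SRX) would need their own patterns.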

2. Search the SRA database using keywords

1) Go to the NCBI SRA advanced search page (https://www.ncbi.nlm.nih.gov/sra/advanced).

2) Type ‘amplicon’ into the ‘Strategy’ box, ‘paired’ into the ‘Layout’ box, and ‘16S microbiome cheese’ into ‘All Fields’, then click [Search].

3) In this case, I found Minas Frescal cheese sample data! Click the link to go to the page with more detailed information.

4) You can gather the necessary information, called metadata, from this page. The SRA Run accession IDs, which start with SRR, are what we need to download the (raw) NGS data.
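The advanced search above can be approximated with a single fielded query string. The sketch below is an illustration under assumptions, not the exact query the web form generates: it assumes the `[Strategy]`, `[Layout]` and `[All Fields]` field tags correspond to the three boxes filled in during step 2, and it tags each keyword separately to keep the query unambiguous. The helper names `sra_query` and `sra_search_url` are hypothetical.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint, here pointed at the SRA database.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def sra_query(keywords, strategy="amplicon", layout="paired"):
    """Combine the Strategy, Layout and All Fields boxes into one query string."""
    terms = [f"{strategy}[Strategy]", f"{layout}[Layout]"]
    # Tag every keyword on its own so each one must match somewhere in the record.
    terms += [f"{word}[All Fields]" for word in keywords.split()]
    return " AND ".join(terms)


def sra_search_url(query, max_results=20):
    """Build an ESearch URL that runs the query against the SRA database."""
    params = {
        "db": "sra",
        "term": query,
        "retmax": max_results,
        "retmode": "json",
    }
    return ESEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    query = sra_query("16S microbiome cheese")
    print(query)
    print(sra_search_url(query))
```

The printed query can also be pasted directly into the SRA search box, which is a quick way to check that it returns the same kind of records as the advanced search form.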


Next step: How to download the public microbiome data using NCBI SRA Run IDs is provided here.


Last updated Sept. 7, 2019