Genome annotation

The analysis of any bacterial genome starts with genome annotation. This process can be divided into two steps: gene finding and functional annotation.

The gene-finding step scans the genome sequence for the patterns that mark where each gene starts and ends. The functional annotation step then assigns a function to each gene through sequence search. Results can vary slightly between researchers because of differences in software, databases, and parameters, but our pipeline follows the one most commonly used in academia, so its results should not differ substantially from those of other databases. EzBioCloud uses the software and databases listed below to annotate every genome and to perform comparative genomics. As of Sept 2017, more than 90,000 genomes had been annotated with this method and are available through www.ezbiocloud.net.
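To make "patterns of start and end location" concrete, here is a deliberately simplified sketch of a forward-strand open reading frame scan. It is a toy illustration, not the algorithm used by the pipeline's gene finder, which relies on a much richer scoring model. The function name `find_orfs`, the `min_codons` parameter, and the choice of three common bacterial start codons are assumptions made for this example.

```python
# Toy illustration only: real gene finders score many more signals
# (ribosome binding sites, codon usage, both strands, and so on).
START_CODONS = {"ATG", "GTG", "TTG"}  # common bacterial start codons (assumed subset)
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return sorted (start, end) pairs, 1-based inclusive, forward strand only.

    In each of the three reading frames, an ORF opens at the first start
    codon seen and closes at the next in-frame stop codon (stop included).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        open_pos = None  # 0-based index of the currently open start codon
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if open_pos is None and codon in START_CODONS:
                open_pos = i
            elif open_pos is not None and codon in STOP_CODONS:
                # Keep the ORF only if it spans enough codons.
                if (i + 3 - open_pos) // 3 >= min_codons:
                    orfs.append((open_pos + 1, i + 3))
                open_pos = None
    return sorted(orfs)


if __name__ == "__main__":
    print(find_orfs("CCATGAAATTTTAGCC"))
```

In the short example sequence, the only start codon (ATG) sits in the third reading frame and is followed by an in-frame TAG, so the scan reports a single gene-like interval.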

More detailed information:

Pipeline Steps	Run Description
Finding tRNA genes	Program: tRNAscan-SE version 1.3.1; Run Parameter: tRNAscan-SE --bact [Fasta File]
Finding rRNA genes	Program: INFERNAL version 1.0.2 (cmsearch); Database: Rfam 12.0; Run Parameter: -E 1.0E-5 -Z 700 --noali rfam12.0/rRNA_bact.cm [Fasta File]
Finding CRISPR	Program: PilerCR version 1.06; Run Parameter: pilercr -in [Fasta File] -out [Output File]; Program: CRT version 1.2; Run Parameter: java -cp CRT1.2-CLI.jar crt [Input Fasta File]
Finding ncRNA	Program: INFERNAL version 1.0.2 (cmsearch); Database: Rfam 12.0; Run Parameter: cmsearch -E 1.0E-5 -Z 700 --noali rfam12.0/RNase_bact.cm [Fasta File]; Run Parameter: -E 1.0E-5 -Z 700 --noali rfam12.0/Gene_bact.cm [Fasta File]
Finding CDS	Program: PRODIGAL version 2.6.2; Run Parameter: -i [Input Fasta File] -o [Output GFF File] -f gff -m -c -g 11 -a [Output Protein Fasta File]
Functional annotation	Program: usearch 64bit version 8.0.1517; Database: KEGG (Date: 2015.12.10), eggNOG version 4.1, Swiss-Prot (Date: 2015.12.10), SEED subsystems (Date: 2015.12.10); Run Parameter: -ublast [Input Fasta File] -db [DB File] -maxaccepts 1 -evalue 1.0E-5 -accel 1.0 -ka_dbsize 700000000 -alnout [Output File]
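As a rough sketch of how the last two rows of the table could be chained, the snippet below assembles the CDS-finding and functional-annotation command lines as argument lists without running them. The flags are copied from the table. The executable names (`prodigal`, `usearch`), the output file names, the placeholder database file `kegg.udb`, and the assumption that the protein FASTA written by the CDS step is the input to the ublast search are all illustrative and are not stated on this page.

```python
import shlex


def prodigal_cmd(fasta, gff_out, protein_out, exe="prodigal"):
    """CDS finding: flags as listed in the table; exe name is an assumption."""
    return [exe, "-i", fasta, "-o", gff_out, "-f", "gff",
            "-m", "-c", "-g", "11", "-a", protein_out]


def ublast_cmd(query_fasta, db_file, aln_out, exe="usearch"):
    """Functional annotation: flags as listed in the table; exe name is an assumption."""
    return [exe, "-ublast", query_fasta, "-db", db_file,
            "-maxaccepts", "1", "-evalue", "1.0E-5", "-accel", "1.0",
            "-ka_dbsize", "700000000", "-alnout", aln_out]


def annotation_commands(genome_fasta, db_file="kegg.udb", prefix="sample"):
    """Build both commands; all file names here are hypothetical placeholders."""
    gff = prefix + ".gff"
    proteins = prefix + ".faa"        # assumed to feed the ublast search
    hits = prefix + ".ublast.txt"
    return [
        prodigal_cmd(genome_fasta, gff, proteins),
        ublast_cmd(proteins, db_file, hits),
    ]


if __name__ == "__main__":
    # Print shell-quoted command lines for inspection; nothing is executed.
    for cmd in annotation_commands("genome.fasta"):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Keeping each command as a list of arguments rather than a single string avoids quoting problems if the lists are later passed to a process launcher, and it makes the fixed parameters (translation table 11, E-value cutoff 1.0E-5) easy to check against the table.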

Last updated on Sept 10, 2017 (JC)

Powered by Precision,
Driven by Quality
