In this class, we will try to identify genomic islands (GIs; DNA segments containing multiple genes that are often transferred between bacterial strains) and track down when and how they were transferred.
Before working on the project, students should take a tutorial class (available here) to familiarize themselves with the bioinformatics of comparative genomics and with the EzBioCloud cloud platform.
We will use the same data set as the “Vibrio cholerae tutorial set”. Please provide the answers to the following questions:
We will assume that the dendrogram based on ANI (average nucleotide identity) reflects the real phylogeny.
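To make the ANI assumption concrete, here is a minimal sketch of how a dendrogram can be derived from pairwise ANI values. It assumes average-linkage (UPGMA) clustering on the distance 100 - ANI, which may differ from the method the platform actually uses. The genome names, the ANI values, and the function name upgma are hypothetical and serve only as an illustration.

```python
from itertools import combinations

# Hypothetical pairwise ANI values (%) for four made-up genomes; not real data.
ani = {
    ("A", "B"): 99.0,
    ("A", "C"): 96.0,
    ("B", "C"): 96.4,
    ("A", "D"): 90.0,
    ("B", "D"): 90.2,
    ("C", "D"): 90.4,
}


def upgma(ani_values):
    """Cluster genomes by average linkage on the distance 100 - ANI.

    Returns nested tuples describing the dendrogram topology.
    """
    names = sorted({genome for pair in ani_values for genome in pair})
    # Each cluster label maps to the list of genomes it contains.
    clusters = {name: [name] for name in names}

    def dist(x, y):
        # Mean distance over all genome pairs drawn from the two clusters.
        total = 0.0
        for a in clusters[x]:
            for b in clusters[y]:
                value = ani_values.get((a, b), ani_values.get((b, a)))
                total += 100.0 - value
        return total / (len(clusters[x]) * len(clusters[y]))

    # Repeatedly merge the two closest clusters until one remains.
    while len(clusters) > 1:
        x, y = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        clusters[(x, y)] = clusters.pop(x) + clusters.pop(y)
    return next(iter(clusters))


tree = upgma(ani)
print(tree)
```

With these toy values, A and B (the pair with the highest ANI) are joined first, C joins that pair next, and D branches off last. Under the assumption stated above, a gene found only in A and B would most simply be explained by a single gain on their shared branch.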
Gene frequency plot in pan-genome
After the pan-genome calculation, all potentially orthologous protein-coding genes (= CDSs) are clustered into non-redundant gene sets called “Pan-genome Orthologous Groups (POGs)”. Obviously, a core part