TY - JOUR
T1 - Data on genome annotation and analysis of earthworm Eisenia fetida
AU - Paul, Sayan
AU - Arumugaperumal, Arun
AU - Rathy, Rashmi
AU - Ponesakki, Vasanthakumar
AU - Arunachalam, Palavesam
AU - Sivasubramaniam, Sudhakar
N1 - Publisher Copyright: © 2018 The Authors
PY - 2018/10
Y1 - 2018/10
AB - The present article reports a complete annotation of the draft genome of the earthworm Eisenia fetida, with the genome obtained from the manuscript entitled “Timing and Scope of Genomic Expansion within Annelida: Evidence from Homeoboxes in the Genome of the Earthworm E. fetida” (Zwarycz et al., 2015), and provides data on the repetitive elements, protein-coding genes and noncoding RNAs present in the genome dataset of the species. The E. fetida protein-coding genes were predicted with AUGUSTUS and subsequently annotated based on their sequence similarity, Gene Ontology (GO) functional terms, InterPro domains, Clusters of Orthologous Groups (COGs) and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway information. A genome-wide comparison of orthologous clusters and a phylogenomic analysis of the core genes were performed to understand the events of genome evolution and the genomic diversity between E. fetida and its related metazoans. In addition, the genome dataset was screened to identify crucial stem cell markers, regeneration-specific genes and immune-related genes, and their functionally enriched GO terms were identified through Fisher's enrichment analysis. The E. fetida genome annotation data, comprising the GFF (general feature format) annotation file, predicted coding gene sequences and translated protein sequences, were deposited in the figshare repository under the DOI: https://doi.org/10.6084/m9.figshare.6142322.v1.
KW - Eisenia fetida
KW - Genome annotation
KW - Orthologous groups
KW - Regeneration
UR - http://www.scopus.com/inward/record.url?scp=85052908058&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85052908058&partnerID=8YFLogxK
U2 - 10.1016/j.dib.2018.08.067
DO - 10.1016/j.dib.2018.08.067
M3 - Article
C2 - 30191166
AN - SCOPUS:85052908058
SN - 2352-3409
VL - 20
SP - 525
EP - 534
JO - Data in Brief
JF - Data in Brief
ER -