Comparative omics-driven genome annotation refinement: Application across yersiniae

Alexandra C. Schrimpe-Rutledge, Marcus B. Jones, Sadhana Chauhan, Samuel O. Purvine, James A. Sanford, Matthew E. Monroe, Heather M. Brewer, Samuel H. Payne, Charles Ansong, Bryan C. Frank, Richard D. Smith, Scott N. Peterson, Vladimir Motin, Joshua N. Adkins

Research output: Contribution to journal › Article

29 Citations (Scopus)

Abstract

Genome sequencing continues to be a rapidly evolving technology, yet most downstream aspects of genome annotation pipelines remain relatively stable or are even being abandoned. The annotation process is now performed almost exclusively in an automated fashion to keep pace with the large number of sequences generated. One possible way of reducing errors inherent to automated computational annotations is to apply data from omics measurements (i.e., transcriptional and proteomic) to the un-annotated genome with a proteogenomic-based approach. Here, the concept of annotation refinement has been extended to include a comparative assessment of genomes across closely related species. Transcriptomic and proteomic data derived from highly similar pathogenic yersiniae (Y. pestis CO92, Y. pestis Pestoides F, and Y. pseudotuberculosis PB1/+) were used to demonstrate a comprehensive comparative omics-based annotation methodology. Peptide and oligo measurements experimentally validated the expression of nearly 40% of each strain's predicted proteome and identified 28 novel and 68 incorrect (i.e., observed frameshifts, extended start sites, and translated pseudogenes) protein-coding sequences within the three current genome annotations. Gene loss is presumed to play a major role in Y. pestis acquiring its niche as a virulent pathogen; thus, the discovery of many translated pseudogenes, including the insertion-ablated argD, underscores a need for functional analyses to investigate hypotheses related to divergence. Refinements included the discovery of a seemingly essential ribosomal protein, several virulence-associated factors, a transcriptional regulator, and many hypothetical proteins that were missed during annotation.
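
The proteogenomic idea described in the abstract, matching observed peptides against the un-annotated genome rather than against predicted proteins, can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a toy DNA string, exact substring matching in place of a real peptide search engine, and the standard genetic code; all function names (`six_frame_translations`, `map_peptides`, and so on) are hypothetical and introduced only for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (assumption:
# the standard table is used here purely for illustration).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(genome: str) -> dict:
    """Translate both strands in all three reading frames."""
    genome = genome.upper()
    frames = {}
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for offset in range(3):
            frames[(strand, offset)] = translate(seq[offset:])
    return frames


def map_peptides(genome: str, peptides: list) -> dict:
    """For each peptide, list the (strand, frame offset) pairs that encode it.

    A peptide contains no '*', so a match can never span a stop codon.
    """
    frames = six_frame_translations(genome)
    return {
        pep: [frame for frame, protein in frames.items() if pep in protein]
        for pep in peptides
    }


if __name__ == "__main__":
    toy_genome = "ATGGCCAAATAA"  # hypothetical sequence, not from the study
    print(map_peptides(toy_genome, ["MAK", "LFGH", "WWW"]))
```

A peptide that maps to a frame with no annotated gene would be a candidate novel coding sequence, while one that maps upstream of an annotated start or in a different frame of an annotated gene would suggest an extended start site or a frameshift; deciding between these requires the annotation coordinates, which this sketch deliberately leaves out.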

Original language: English (US)
Article number: e33903
Journal: PLoS One
Volume: 7
Issue number: 3
DOI: https://doi.org/10.1371/journal.pone.0033903
State: Published - Mar 27 2012

Fingerprint

Yersinia
Genes
Genome
Pseudogenes
Proteomics
Yersinia pestis
Ribosomal Proteins
Virulence Factors
Proteome
transcriptomics
Pathogens
Proteins
niches
virulence
transcription factors

ASJC Scopus subject areas

  • Agricultural and Biological Sciences(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Medicine(all)

Cite this

Schrimpe-Rutledge, A. C., Jones, M. B., Chauhan, S., Purvine, S. O., Sanford, J. A., Monroe, M. E., ... Adkins, J. N. (2012). Comparative omics-driven genome annotation refinement: Application across yersiniae. PLoS One, 7(3), [e33903]. https://doi.org/10.1371/journal.pone.0033903

@article{13d14456c5734f87bb3c8ab47ae3b2c4,
title = "Comparative omics-driven genome annotation refinement: Application across yersiniae",
author = "Schrimpe-Rutledge, {Alexandra C.} and Jones, {Marcus B.} and Sadhana Chauhan and Purvine, {Samuel O.} and Sanford, {James A.} and Monroe, {Matthew E.} and Brewer, {Heather M.} and Payne, {Samuel H.} and Charles Ansong and Frank, {Bryan C.} and Smith, {Richard D.} and Peterson, {Scott N.} and Vladimir Motin and Adkins, {Joshua N.}",
year = "2012",
month = "3",
day = "27",
doi = "10.1371/journal.pone.0033903",
language = "English (US)",
volume = "7",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "3",
}

TY  - JOUR
T1  - Comparative omics-driven genome annotation refinement
T2  - Application across yersiniae
AU  - Schrimpe-Rutledge, Alexandra C.
AU  - Jones, Marcus B.
AU  - Chauhan, Sadhana
AU  - Purvine, Samuel O.
AU  - Sanford, James A.
AU  - Monroe, Matthew E.
AU  - Brewer, Heather M.
AU  - Payne, Samuel H.
AU  - Ansong, Charles
AU  - Frank, Bryan C.
AU  - Smith, Richard D.
AU  - Peterson, Scott N.
AU  - Motin, Vladimir
AU  - Adkins, Joshua N.
PY  - 2012/3/27
Y1  - 2012/3/27
UR  - http://www.scopus.com/inward/record.url?scp=84858988418&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84858988418&partnerID=8YFLogxK
U2  - 10.1371/journal.pone.0033903
DO  - 10.1371/journal.pone.0033903
M3  - Article
C2  - 22479471
AN  - SCOPUS:84858988418
VL  - 7
JO  - PLoS One
JF  - PLoS One
SN  - 1932-6203
IS  - 3
M1  - e33903
ER  - 