A new probabilistic database search algorithm for ETD spectra

Rovshan Sadygov, David M. Good, Danielle L. Swaney, Joshua J. Coon

Research output: Contribution to journal (Article)

26 Citations (Scopus)

Abstract

Peptide characterization using electron transfer dissociation (ETD) is an important analytical tool for protein identification. The fragmentation observed in ETD spectra is complementary to that seen when using the traditional dissociation method, collision activated dissociation (CAD). Applications of ETD enhance the scope and complexity of the peptides that can be studied by mass spectrometry-based methods. For example, ETD is shown to be particularly useful for the study of post-translationally modified peptides. To take advantage of the power provided by ETD, it is important to have an ETD-specific database search engine, an integral tool of mass spectrometry-based analytical proteomics. In this paper, we report on our development of a database search engine using ETD spectra and protein sequence databases to identify peptides. The search engine is based on the probabilistic modeling of shared peaks count and shared peaks intensity between the spectra and the peptide sequences. The shared peaks count accounts for the cumulative variations from amino acid sequences, while shared peaks intensity models the variations between the candidate sequence and product ion intensities. To demonstrate the utility of this algorithm for searching real-world data, we present the results of applications of this model to two high-throughput data sets. Both data sets were obtained from yeast whole cell lysates. The first data set was obtained from a sample digested by Lys-C, and the second data set was obtained by a digestion using trypsin. We searched the data sets against a combined forward and reversed yeast protein database to estimate false discovery rates. We compare the search results from the new methods with the results from a search engine often employed for ETD spectra, OMSSA. Our findings show that overall the new model performs comparably to OMSSA for low false discovery rates. At the same time, we demonstrate that there are substantial differences with OMSSA for results on subsets of data. Therefore, we conclude the new model can be considered as being complementary to previously developed models.
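
The abstract mentions searching a combined forward and reversed database to estimate false discovery rates. The following is a minimal Python sketch of that general target-decoy idea, not the authors' implementation. The `PSM` record, the `REV_` decoy prefix, the "higher score is better" convention, and the estimator (decoy hits divided by target hits above a cutoff) are all assumptions made for illustration; the paper may use a different estimator.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PSM:
    """One peptide-spectrum match (hypothetical record for this sketch)."""
    spectrum_id: str
    score: float      # assumed convention: higher means a better match
    is_decoy: bool    # True if the best match came from a reversed sequence


def build_combined_database(proteins: Dict[str, str]) -> Dict[str, str]:
    """Return the forward entries plus one reversed decoy per protein."""
    combined = dict(proteins)
    for name, sequence in proteins.items():
        combined["REV_" + name] = sequence[::-1]  # 'REV_' prefix is an assumption
    return combined


def estimate_fdr(psms: List[PSM], cutoff: float) -> float:
    """Estimate FDR at a score cutoff as decoy hits / target hits (assumed estimator)."""
    targets = sum(1 for p in psms if p.score >= cutoff and not p.is_decoy)
    decoys = sum(1 for p in psms if p.score >= cutoff and p.is_decoy)
    if targets == 0:
        return 0.0
    return decoys / targets


def lowest_cutoff_for_fdr(psms: List[PSM], max_fdr: float) -> Optional[float]:
    """Return the lowest observed score whose estimated FDR is <= max_fdr."""
    best = None
    for cutoff in sorted({p.score for p in psms}, reverse=True):
        if estimate_fdr(psms, cutoff) <= max_fdr:
            best = cutoff
    return best


# Toy data, invented for illustration only.
example_psms = [
    PSM("s1", 10.0, False),
    PSM("s2", 9.0, False),
    PSM("s3", 8.0, False),
    PSM("s4", 7.0, True),
    PSM("s5", 5.0, False),
    PSM("s6", 4.0, True),
]
```

With this toy data, `lowest_cutoff_for_fdr(example_psms, 0.01)` selects the cutoff just above the first decoy hit. Comparing two search engines at a fixed estimated FDR then amounts to counting the target matches that each retains above its own cutoff.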

Original language: English (US)
Pages (from-to): 3198-3205
Number of pages: 8
Journal: Journal of Proteome Research
Volume: 8
Issue number: 6
DOI: 10.1021/pr900153b
State: Published - Jun 5 2009

Fingerprint

Databases
Electrons
Search Engine
Search engines
Peptides
Protein Databases
Mass Spectrometry
Mass spectrometry
Fungal Proteins
Proteomics
Trypsin
Digestion
Amino Acid Sequence
Yeasts
Yeast
Datasets
Ions
Proteins
Throughput
Amino Acids

Keywords

  • Compound distribution
  • Electron transfer dissociation
  • Probabilistic model for peptide identification
  • Tandem mass spectrometry

ASJC Scopus subject areas

  • Biochemistry
  • Chemistry (all)

Cite this

A new probabilistic database search algorithm for ETD spectra. / Sadygov, Rovshan; Good, David M.; Swaney, Danielle L.; Coon, Joshua J.

In: Journal of Proteome Research, Vol. 8, No. 6, 05.06.2009, p. 3198-3205.

@article{d19d11a23b51440dba30ec02eaea7c49,
title = "A new probabilistic database search algorithm for ETD spectra",
keywords = "Compound distribution, Electron transfer dissociation, Probabilistic model for peptide identification, Tandem mass spectrometry",
author = "Sadygov, Rovshan and Good, {David M.} and Swaney, {Danielle L.} and Coon, {Joshua J.}",
year = "2009",
month = "6",
day = "5",
doi = "10.1021/pr900153b",
language = "English (US)",
volume = "8",
pages = "3198--3205",
journal = "Journal of Proteome Research",
issn = "1535-3893",
publisher = "American Chemical Society",
number = "6",

}

TY - JOUR

T1 - A new probabilistic database search algorithm for ETD spectra

AU - Sadygov, Rovshan

AU - Good, David M.

AU - Swaney, Danielle L.

AU - Coon, Joshua J.

PY - 2009/6/5

Y1 - 2009/6/5

KW - Compound distribution

KW - Electron transfer dissociation

KW - Probabilistic model for peptide identification

KW - Tandem mass spectrometry

UR - http://www.scopus.com/inward/record.url?scp=67149093415&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=67149093415&partnerID=8YFLogxK

U2 - 10.1021/pr900153b

DO - 10.1021/pr900153b

M3 - Article

C2 - 19354237

AN - SCOPUS:67149093415

VL - 8

SP - 3198

EP - 3205

JO - Journal of Proteome Research

JF - Journal of Proteome Research

SN - 1535-3893

IS - 6

ER -