Partially sequenced organisms, decoy searches and false discovery rates

Bjorn Victor, Sarah Gabriël, Kirezi Kanobana, Ekaterina Mostovenko, Katja Polman, Pierre Dorny, André M. Deelder, Magnus Palmblad

Research output: Contribution to journal; Article; peer-review

5 Scopus citations

Abstract

Tandem mass spectrometry is commonly used to identify peptides, typically by comparing their product ion spectra with those predicted from a protein sequence database and scoring the matches. The most commonly reported quality metric for a set of peptide identifications is the false discovery rate (FDR), the expected fraction of false identifications in the set. This metric has so far been used only for completely sequenced organisms or known protein mixtures. We investigated whether FDR estimation is also applicable to partially sequenced organisms, where many high-quality spectra fail to identify the correct peptides because those peptides are absent from the searched sequence database. Using real data from human plasma and simulated partial sequence databases derived from two complete human sequence databases with different levels of redundancy, we demonstrate that the mixture model approach in PeptideProphet is robust for partial databases, particularly when used in combination with decoy sequences. We therefore recommend this method for estimating the FDR and reporting peptide identifications from incompletely sequenced organisms.
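As a rough illustration of how decoy sequences feed into an FDR estimate, the sketch below counts decoy and target matches above a score threshold. It is a simple target-decoy counting estimate, not the PeptideProphet mixture model evaluated in the article, and it assumes a search against a concatenated target and decoy database of equal size. The function name `decoy_fdr`, the `(score, is_decoy)` tuple layout, and the scores themselves are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical peptide-spectrum matches: (search score, matched a decoy sequence?).
# The scores are invented for illustration and are not data from the article.
psms: List[Tuple[float, bool]] = [
    (9.1, False), (8.7, False), (8.2, False), (7.9, True),
    (7.5, False), (6.8, True), (6.1, False), (5.5, True),
]


def decoy_fdr(matches: List[Tuple[float, bool]], threshold: float) -> float:
    """Estimate the FDR among target matches scoring at or above `threshold`.

    Assumes decoy hits occur about as often as false target hits, so the
    number of accepted decoys approximates the number of accepted false targets.
    """
    targets = sum(1 for score, is_decoy in matches
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0  # nothing accepted, so no false discoveries to report
    return min(1.0, decoys / targets)


if __name__ == "__main__":
    for cutoff in (8.0, 7.0, 5.0):
        print(f"threshold {cutoff}: estimated FDR {decoy_fdr(psms, cutoff):.2f}")
```

With these made-up scores, a threshold of 7.0 accepts four target matches and one decoy match, giving an estimated FDR of 0.25. Lowering the threshold admits more decoys and raises the estimate.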

Original language: English (US)
Pages (from-to): 1991-1995
Number of pages: 5
Journal: Journal of Proteome Research
Volume: 11
Issue number: 3
State: Published - Mar 2 2012
Externally published: Yes

Keywords

  • PeptideProphet
  • false discovery rates
  • mixture models
  • partially sequenced organism

ASJC Scopus subject areas

  • Biochemistry
  • General Chemistry
