Using SEQUEST with Theoretically Complete Sequence Databases

    Research output: Contribution to journal › Article

    3 Citations (Scopus)

    Abstract

    SEQUEST has long been used to identify peptides and proteins by matching tandem mass spectra against protein sequence databases. The algorithm has proven highly sensitive and specific in identifying peptides and proteins whose sequences are present in those databases. In this work, we attempt a new use for the algorithm: searching a complete list of theoretically possible peptides, a de novo-like form of sequencing. We used freely available mass spectral data and determined a number of unique peptides identified by SEQUEST. Using the masses of these peptides and a mass accuracy of 0.001 Da, we created a database of all theoretically possible peptide sequences corresponding to the precursor masses. We used our recently developed algorithm to determine all amino acid compositions corresponding to a mass interval, and used lexicographic ordering to generate theoretical sequences from the compositions. The newly generated theoretical database was many-fold more complex than the original protein sequence database. We then used SEQUEST to search all theoretically possible peptide sequences and identify the best matches to the spectra. We found that the SEQUEST cross-correlation score ranked the correct peptide match among the top sequence matches. The results testify to the high specificity of SEQUEST when combined with high mass accuracy for intact peptides.
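The database construction described in the abstract (amino acid compositions for a mass interval, then sequences in lexicographic order) can be illustrated with a minimal sketch. This is not the author's algorithm, which the abstract does not detail: it is a brute-force depth-first enumeration, and the names `compositions`, `sequences`, `RESIDUE_MASS`, and `WATER` are hypothetical. It assumes unmodified residues, standard monoisotopic residue masses, and neutral peptide masses.

```python
from typing import Dict, Iterator

WATER = 18.010565  # monoisotopic mass of H2O, added once per peptide (assumed neutral mass)

# Standard monoisotopic residue masses in Da (no modifications assumed).
RESIDUE_MASS: Dict[str, float] = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def compositions(lo: float, hi: float,
                 masses: Dict[str, float] = RESIDUE_MASS) -> Iterator[Dict[str, int]]:
    """Yield every residue-count dict whose peptide mass lies in [lo, hi]."""
    items = sorted(masses.items())  # alphabetical by residue letter

    def extend(start: int, total: float, counts: Dict[str, int]):
        if counts and lo <= total + WATER <= hi:
            yield dict(counts)  # copy, because counts is mutated in place
        # Only add residues at index >= start so each multiset appears once.
        for i in range(start, len(items)):
            aa, m = items[i]
            if total + m + WATER > hi:
                continue  # too heavy; lighter residues may still fit
            counts[aa] = counts.get(aa, 0) + 1
            yield from extend(i, total + m, counts)
            counts[aa] -= 1
            if counts[aa] == 0:
                del counts[aa]

    yield from extend(0, 0.0, {})


def sequences(composition: Dict[str, int]) -> Iterator[str]:
    """Yield the distinct sequences of a composition in lexicographic order."""
    counts = dict(composition)
    letters = sorted(counts)
    length = sum(counts.values())
    prefix = []

    def place():
        if len(prefix) == length:
            yield "".join(prefix)
            return
        for aa in letters:  # trying letters in sorted order gives lexicographic output
            if counts[aa] > 0:
                counts[aa] -= 1
                prefix.append(aa)
                yield from place()
                prefix.pop()
                counts[aa] += 1

    yield from place()


if __name__ == "__main__":
    # A 0.001 Da window around 146.069 Da, mirroring the mass accuracy in the abstract.
    for comp in compositions(146.068, 146.070):
        print(comp, list(sequences(comp)))
```

For this window the sketch finds two compositions, one Ala plus one Gly (sequences `AG` and `GA`) and a single Gln (`Q`), which shows how even a narrow mass window admits several candidate sequences. At realistic peptide masses the number of candidates grows very quickly, consistent with the abstract's remark that the theoretical database was many-fold more complex than the protein sequence database.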

    Original language: English (US)
    Pages (from-to): 1858-1864
    Number of pages: 7
    Journal: Journal of the American Society for Mass Spectrometry
    Volume: 26
    Issue number: 11
    DOI: 10.1007/s13361-015-1228-5
    State: Published - Nov 1 2015

    Fingerprint

    Databases
    Peptides
    Protein Databases
    Proteins
    Chemical analysis
    Amino Acids
    Sensitivity and Specificity

    Keywords

    • All theoretically possible peptides
    • De novo Peptide sequencing
    • Mass distribution of peptides
    • SEQUEST

    ASJC Scopus subject areas

    • Structural Biology
    • Spectroscopy

    Cite this

    Using SEQUEST with Theoretically Complete Sequence Databases. / Sadygov, Rovshan.

    In: Journal of the American Society for Mass Spectrometry, Vol. 26, No. 11, 01.11.2015, p. 1858-1864.

    @article{ae3668bf90d840ab83f6b81fa24147df,
    title = "Using SEQUEST with Theoretically Complete Sequence Databases",
    keywords = "All theoretically possible peptides, De novo Peptide sequencing, Mass distribution of peptides, SEQUEST",
    author = "Rovshan Sadygov",
    year = "2015",
    month = "11",
    day = "1",
    doi = "10.1007/s13361-015-1228-5",
    language = "English (US)",
    volume = "26",
    pages = "1858--1864",
    journal = "Journal of the American Society for Mass Spectrometry",
    issn = "1044-0305",
    publisher = "Springer New York",
    number = "11",

    }

    TY - JOUR

    T1 - Using SEQUEST with Theoretically Complete Sequence Databases

    AU - Sadygov, Rovshan

    PY - 2015/11/1

    Y1 - 2015/11/1

    KW - All theoretically possible peptides

    KW - De novo Peptide sequencing

    KW - Mass distribution of peptides

    KW - SEQUEST

    UR - http://www.scopus.com/inward/record.url?scp=84958176451&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=84958176451&partnerID=8YFLogxK

    U2 - 10.1007/s13361-015-1228-5

    DO - 10.1007/s13361-015-1228-5

    M3 - Article

    VL - 26

    SP - 1858

    EP - 1864

    JO - Journal of the American Society for Mass Spectrometry

    JF - Journal of the American Society for Mass Spectrometry

    SN - 1044-0305

    IS - 11

    ER -