TY - JOUR
T1 - Data mining of sequences and 3D structures of allergenic proteins
AU - Ivanciuc, Ovidiu
AU - Schein, Catherine H.
AU - Braun, Werner
N1 - Funding Information: This work was supported by a Research Development Grant (#2535-01) from the John Sealy Memorial Endowment Fund for Biomedical Research.
PY - 2002/10/1
Y1 - 2002/10/1
N2 - Motivation: Many sequences, and in some cases structures, of proteins that induce an allergic response in atopic individuals have been determined in recent years. This data indicates that allergens, regardless of source, fall into discrete protein families. Similarities in the sequence may explain clinically observed cross-reactivities between different biological triggers. However, previously available allergy databases group allergens according to their biological sources, or observed clinical cross-reactivities, without providing data about the proteins. A computer-aided data mining system is needed to compare the sequential and structural details of known allergens. This information will aid in predicting allergenic cross-responses and eventually in determining possible common characteristics of IgE recognition. Results: The new web-based Structural Database of Allergenic Proteins (SDAP) permits the user to quickly compare the sequence and structure of allergenic proteins. Data from literature sources and previously existing lists of allergens are combined in a MySQL interactive database with a wide selection of bioinformatics applications. SDAP can be used to rapidly determine the relationship between allergens and to screen novel proteins for the presence of IgE or T-cell epitopes they may share with known allergens. Further, our novel similarity search method, based on five-dimensional descriptors of amino acid properties, can be used to scan the SDAP entries with a peptide sequence. For example, when a known IgE-binding epitope from shrimp tropomyosin was used as a query, the method rapidly identified a similar sequence in known shellfish and insect allergens. This prediction of cross-reactivity between allergens is consistent with clinical observations.
AB - Motivation: Many sequences, and in some cases structures, of proteins that induce an allergic response in atopic individuals have been determined in recent years. This data indicates that allergens, regardless of source, fall into discrete protein families. Similarities in the sequence may explain clinically observed cross-reactivities between different biological triggers. However, previously available allergy databases group allergens according to their biological sources, or observed clinical cross-reactivities, without providing data about the proteins. A computer-aided data mining system is needed to compare the sequential and structural details of known allergens. This information will aid in predicting allergenic cross-responses and eventually in determining possible common characteristics of IgE recognition. Results: The new web-based Structural Database of Allergenic Proteins (SDAP) permits the user to quickly compare the sequence and structure of allergenic proteins. Data from literature sources and previously existing lists of allergens are combined in a MySQL interactive database with a wide selection of bioinformatics applications. SDAP can be used to rapidly determine the relationship between allergens and to screen novel proteins for the presence of IgE or T-cell epitopes they may share with known allergens. Further, our novel similarity search method, based on five-dimensional descriptors of amino acid properties, can be used to scan the SDAP entries with a peptide sequence. For example, when a known IgE-binding epitope from shrimp tropomyosin was used as a query, the method rapidly identified a similar sequence in known shellfish and insect allergens. This prediction of cross-reactivity between allergens is consistent with clinical observations.
UR - http://www.scopus.com/inward/record.url?scp=0036772523&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0036772523&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/18.10.1358
DO - 10.1093/bioinformatics/18.10.1358
M3 - Article
C2 - 12376380
AN - SCOPUS:0036772523
SN - 1367-4803
VL - 18
SP - 1358
EP - 1364
JO - Bioinformatics
JF - Bioinformatics
IS - 10
ER -