Abstract
Background: Innovations in protein engineering offer promising solutions for redesigning allergenic proteins to minimize adverse reactions in sensitive individuals. Earlier models for predicting allergenicity have relied on physicochemical properties and sequence homology to assess potential risk. However, a better understanding of the sequence features of allergenic proteins requires a sequence-based deep learning model for predicting allergenicity.

Results: We present AllergenAI, a novel AI-based tool that quantifies the allergenic potential of a protein from its sequence alone, without using any other known features. We trained a convolutional neural network (CNN) on allergenic protein sequence data archived in three well-established databases, SDAP 2.0, COMPARE, and AlgPred 2, and assessed its prediction performance by cross-validation. We then used AllergenAI to identify novel cupin-family proteins in date palm, spinach, maize, and red clover that have high allergenicity scores and might therefore cause adverse allergic reactions in sensitive individuals. By analyzing the feature importance scores (FIS) of vicilins, we identified a proline-alanine-rich (P-A) motif in the top 50% of FIS regions that overlapped with known IgE epitope regions of vicilin allergens. Finally, in a pilot study, we used the approximately 1600 allergen structures in our SDAP database to show the potential of incorporating 3D information into a CNN model; prediction quality increased slightly.

Conclusion: The development of AllergenAI provides a foundation for identifying the critical features that distinguish allergenic proteins.
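To make the idea of a sequence-only CNN input concrete, the following is a minimal sketch of how a protein sequence might be one-hot encoded and scanned by a single convolutional filter with ReLU and global max pooling. It is an illustration under stated assumptions, not the AllergenAI architecture: the function names `one_hot` and `conv1d_max`, the 20-letter alphabet ordering, and the hand-set "PA" filter are all hypothetical, whereas a trained model would learn many filters from data.

```python
# Illustrative only: not the AllergenAI architecture or its trained weights.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(sequence):
    """Encode a protein sequence as a list of 20-dimensional one-hot rows.

    Residues outside the 20 standard letters (e.g. 'X') become all-zero rows.
    """
    rows = []
    for residue in sequence.upper():
        row = [0.0] * len(AMINO_ACIDS)
        if residue in INDEX:
            row[INDEX[residue]] = 1.0
        rows.append(row)
    return rows


def conv1d_max(encoded, kernel):
    """Slide one filter along the sequence, apply ReLU, and global-max-pool.

    `kernel` is a list of width-many rows, each holding 20 weights.
    Returns (best_activation, start_position_of_best_window).
    """
    width = len(kernel)
    if len(encoded) < width:
        raise ValueError("sequence is shorter than the kernel width")
    best, best_pos = 0.0, 0
    for start in range(len(encoded) - width + 1):
        total = sum(
            w * x
            for k_row, e_row in zip(kernel, encoded[start:start + width])
            for w, x in zip(k_row, e_row)
        )
        activation = max(0.0, total)  # ReLU
        if activation > best:  # strict '>' keeps the first best window
            best, best_pos = activation, start
    return best, best_pos


# Hypothetical hand-set filter that responds to the dipeptide "PA".
pa_kernel = [[0.0] * len(AMINO_ACIDS) for _ in range(2)]
pa_kernel[0][INDEX["P"]] = 1.0
pa_kernel[1][INDEX["A"]] = 1.0

score, position = conv1d_max(one_hot("MKTPAPAG"), pa_kernel)
```

The window position returned alongside the pooled activation hints at how per-region importance can be read back from a convolutional model, although the feature importance scores reported in the abstract may be computed quite differently.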
| Original language | English (US) |
|---|---|
| Article number | 279 |
| Journal | BMC Bioinformatics |
| Volume | 26 |
| Issue number | 1 |
| DOIs | |
| State | Published - Dec 2025 |
Keywords
- 3D structure
- Allergenic proteins
- CNN
- Deep learning
- Novel vicilin allergen analogs
ASJC Scopus subject areas
- Structural Biology
- Biochemistry
- Molecular Biology
- Computer Science Applications
- Applied Mathematics
Fingerprint
Dive into the research topics of 'AllergenAI: a deep learning model predicting allergenicity based on protein sequence'. Together they form a unique fingerprint.