Abstract
Intrinsically disordered regions (IDRs) of proteins are often characterized by a high fraction of charged residues, but they differ in their overall net charge and in the organization of the charged residues. The function-encoding information stored in IDR charge composition and organization remains elusive. Here, we aim to decipher the sequence–function relationship in IDRs by presenting a comprehensive bioinformatic analysis of the charge properties of IDRs in the human, mouse, and yeast proteomes. About 50% of the proteins contain at least one IDR, which is either positively or negatively charged. Highly negatively charged IDRs are longer and possess greater net charge per residue than highly positively charged IDRs. A striking difference between positively and negatively charged IDRs lies in the characteristics of their repeated units, specifically runs of consecutive Lys or Arg residues (K/R repeats) and of consecutive Asp or Glu residues (D/E repeats). D/E repeats are found to be about five times longer than K/R repeats, with the longest containing 49 residues. Long stretches of consecutive D and E residues are more prevalent in nucleic acid-related proteins. They are less common in prokaryotes, and in eukaryotes their abundance increases with genome size. The functional role of D/E repeats and the profound differences between them and K/R repeats are discussed.
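The two quantities the abstract relies on, net charge per residue and the length of consecutive K/R or D/E runs, can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions used in the study, so the following are assumptions: net charge per residue is taken as (K + R − D − E) divided by sequence length, histidine is ignored, and a repeat is a maximal run of consecutive residues from the given set. The function names and the example sequence are hypothetical.

```python
import re

# Assumption: only K/R count as positive and D/E as negative; His is ignored.
POSITIVE = "KR"
NEGATIVE = "DE"


def net_charge_per_residue(seq: str) -> float:
    """Return (n_positive - n_negative) / length for a one-letter sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    n_pos = sum(seq.count(aa) for aa in POSITIVE)
    n_neg = sum(seq.count(aa) for aa in NEGATIVE)
    return (n_pos - n_neg) / len(seq)


def longest_repeat(seq: str, residues: str) -> int:
    """Length of the longest run made only of the given residues (0 if none)."""
    runs = re.findall(f"[{residues}]+", seq.upper())
    return max((len(run) for run in runs), default=0)


# Hypothetical 15-residue fragment, not taken from the study.
example = "MSDEEDDEEGKKRAS"
ncpr = net_charge_per_residue(example)
de_run = longest_repeat(example, NEGATIVE)
kr_run = longest_repeat(example, POSITIVE)
print(f"NCPR={ncpr:.3f}, longest D/E run={de_run}, longest K/R run={kr_run}")
```

For this fragment, the seven acidic and three basic residues give a negative net charge per residue, and the D/E run is longer than the K/R run. Applying the same functions to predicted IDR segments would require a disorder predictor, which is outside the scope of this sketch.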
Original language | English (US) |
---|---|
Article number | 167660 |
Journal | Journal of Molecular Biology |
Volume | 434 |
Issue number | 14 |
State | Published - Jul 30 2022 |
Externally published | Yes |
Keywords
- D/E repeat
- disordered regions
- electrostatics
- polyampholytes
- repeat sequences
ASJC Scopus subject areas
- Biophysics
- Structural Biology
- Molecular Biology