Thermodynamic propensities of amino acids in the native state ensemble

Implications for fold recognition

J. O. Wrabl, S. A. Larson, V. J. Hilser

Research output: Contribution to journal › Article

21 Citations (Scopus)

Abstract

An amino acid sequence, in the context of the solvent environment, contains all of the thermodynamic information necessary to encode a three-dimensional protein structure. To investigate the relationship between an amino acid sequence and its corresponding protein fold, a database of thermodynamic stability information was assembled that spanned 2951 residues from 44 nonhomologous proteins. This information was obtained using the COREX algorithm, which computes an ensemble-based description of the native state of a protein. It was observed that amino acid types partitioned unequally into high, medium, and low thermodynamic stability environments. Furthermore, these distributions were reproducible and were significantly different than those expected from random partitioning. To assess the structural importance of the distributions, simple fold-recognition experiments were performed based on a 3D-1D scoring matrix containing only COREX residue stability information. This procedure was able to recover amino acid sequences corresponding to correct target structures more effectively than scoring matrices derived from randomized data. High-scoring sequences were often aligned correctly with their corresponding target profiles, suggesting that calculated thermodynamic stability profiles have the potential to encode sequence information. As a control, identical fold-recognition experiments were performed on the same database of proteins using DSSP secondary structure information in the scoring matrix, instead of COREX residue stability information. The comparable performance of both approaches suggested that COREX residue stability information and secondary structure information could be of equivalent utility in more sophisticated fold-recognition techniques. The results of this work are a consequence of the idea that amino acid sequences fold not into single, rigidly stable structures but rather into thermodynamic ensembles best represented by a time-averaged structure.
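The 3D-1D scoring idea described in the abstract can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not the procedure used in the paper: the log-odds construction, the pseudocount smoothing, the ungapped scoring, the toy observations, and the names `log_odds_matrix` and `score_alignment` are all hypothetical. The sketch only shows how unequal partitioning of amino acid types into high, medium, and low stability classes could be turned into a scoring matrix, and how a sequence could then be scored against a per-residue stability profile.

```python
import math
from collections import Counter

# Stability classes named in the abstract; the thresholds that define them are not modeled here.
STABILITY_CLASSES = ("high", "medium", "low")


def log_odds_matrix(observations, pseudocount=1.0):
    """Build a log-odds score for each (amino acid, stability class) pair.

    observations: iterable of (amino_acid, stability_class) tuples.
    The score is log(P(aa, cls) / (P(aa) * P(cls))), computed from
    pseudocount-smoothed counts so that no probability is zero.
    """
    obs = list(observations)
    pair_counts = Counter(obs)
    amino_acids = sorted({aa for aa, _ in obs})
    n_cells = len(amino_acids) * len(STABILITY_CLASSES)
    denom = len(obs) + pseudocount * n_cells

    # Smoothed joint distribution over (amino acid, class) cells.
    joint = {
        (aa, cls): (pair_counts[(aa, cls)] + pseudocount) / denom
        for aa in amino_acids
        for cls in STABILITY_CLASSES
    }
    # Marginals derived from the smoothed joint, so they stay consistent with it.
    p_aa = {aa: sum(joint[(aa, c)] for c in STABILITY_CLASSES) for aa in amino_acids}
    p_cls = {c: sum(joint[(aa, c)] for aa in amino_acids) for c in STABILITY_CLASSES}

    return {
        (aa, cls): math.log(joint[(aa, cls)] / (p_aa[aa] * p_cls[cls]))
        for (aa, cls) in joint
    }


def score_alignment(sequence, profile, matrix):
    """Sum the matrix scores for an ungapped, equal-length alignment.

    Pairs missing from the matrix contribute 0.0 (no information).
    """
    if len(sequence) != len(profile):
        raise ValueError("sequence and profile must have the same length")
    return sum(matrix.get((aa, cls), 0.0) for aa, cls in zip(sequence, profile))


# Toy data (invented): leucine mostly in high-stability positions, glycine mostly in low.
toy_observations = (
    [("L", "high")] * 8 + [("L", "low")] * 2
    + [("G", "high")] * 2 + [("G", "low")] * 8
)
toy_matrix = log_odds_matrix(toy_observations)
toy_profile = ["high", "high", "low"]
matching_score = score_alignment("LLG", toy_profile, toy_matrix)
mismatched_score = score_alignment("GGL", toy_profile, toy_matrix)
```

With this toy data, a sequence whose residues sit in the stability classes they favor (`"LLG"`) scores higher against the profile than one that does not (`"GGL"`). A real fold-recognition experiment would also need gapped alignment and a comparison against matrices built from randomized data, neither of which is shown here.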

Original language: English (US)
Pages (from-to): 1032-1045
Number of pages: 14
Journal: Protein Science
Volume: 10
Issue number: 5
DOI: 10.1110/ps.01601
State: Published - 2001

Fingerprint

Thermodynamics
Amino Acid Sequence
Amino Acids
Thermodynamic stability
Proteins
Protein Databases
Experiments
Databases

Keywords

  • Native state ensemble
  • Protein stability
  • Protein structure prediction
  • Residue thermodynamics
  • Threading and fold-recognition

ASJC Scopus subject areas

  • Biochemistry

Cite this

Thermodynamic propensities of amino acids in the native state ensemble: Implications for fold recognition. / Wrabl, J. O.; Larson, S. A.; Hilser, V. J.

In: Protein Science, Vol. 10, No. 5, 2001, p. 1032-1045.

@article{fda301be3a8c49648e1c4f238a888686,
title = "Thermodynamic propensities of amino acids in the native state ensemble: Implications for fold recognition",
abstract = "An amino acid sequence, in the context of the solvent environment, contains all of the thermodynamic information necessary to encode a three-dimensional protein structure. To investigate the relationship between an amino acid sequence and its corresponding protein fold, a database of thermodynamic stability information was assembled that spanned 2951 residues from 44 nonhomologous proteins. This information was obtained using the COREX algorithm, which computes an ensemble-based description of the native state of a protein. It was observed that amino acid types partitioned unequally into high, medium, and low thermodynamic stability environments. Furthermore, these distributions were reproducible and were significantly different than those expected from random partitioning. To assess the structural importance of the distributions, simple fold-recognition experiments were performed based on a 3D-1D scoring matrix containing only COREX residue stability information. This procedure was able to recover amino acid sequences corresponding to correct target structures more effectively than scoring matrices derived from randomized data. High-scoring sequences were often aligned correctly with their corresponding target profiles, suggesting that calculated thermodynamic stability profiles have the potential to encode sequence information. As a control, identical fold-recognition experiments were performed on the same database of proteins using DSSP secondary structure information in the scoring matrix, instead of COREX residue stability information. The comparable performance of both approaches suggested that COREX residue stability information and secondary structure information could be of equivalent utility in more sophisticated fold-recognition techniques. 
The results of this work are a consequence of the idea that amino acid sequences fold not into single, rigidly stable structures but rather into thermodynamic ensembles best represented by a time-averaged structure.",
keywords = "Native state ensemble, Protein stability, Protein structure prediction, Residue thermodynamics, Threading and fold-recognition",
author = "Wrabl, {J. O.} and Larson, {S. A.} and Hilser, {V. J.}",
year = "2001",
doi = "10.1110/ps.01601",
language = "English (US)",
volume = "10",
pages = "1032--1045",
journal = "Protein Science",
issn = "0961-8368",
publisher = "Cold Spring Harbor Laboratory Press",
number = "5",

}

TY - JOUR
T1 - Thermodynamic propensities of amino acids in the native state ensemble
T2 - Implications for fold recognition
AU - Wrabl, J. O.
AU - Larson, S. A.
AU - Hilser, V. J.
PY - 2001
Y1 - 2001

AB - An amino acid sequence, in the context of the solvent environment, contains all of the thermodynamic information necessary to encode a three-dimensional protein structure. To investigate the relationship between an amino acid sequence and its corresponding protein fold, a database of thermodynamic stability information was assembled that spanned 2951 residues from 44 nonhomologous proteins. This information was obtained using the COREX algorithm, which computes an ensemble-based description of the native state of a protein. It was observed that amino acid types partitioned unequally into high, medium, and low thermodynamic stability environments. Furthermore, these distributions were reproducible and were significantly different than those expected from random partitioning. To assess the structural importance of the distributions, simple fold-recognition experiments were performed based on a 3D-1D scoring matrix containing only COREX residue stability information. This procedure was able to recover amino acid sequences corresponding to correct target structures more effectively than scoring matrices derived from randomized data. High-scoring sequences were often aligned correctly with their corresponding target profiles, suggesting that calculated thermodynamic stability profiles have the potential to encode sequence information. As a control, identical fold-recognition experiments were performed on the same database of proteins using DSSP secondary structure information in the scoring matrix, instead of COREX residue stability information. The comparable performance of both approaches suggested that COREX residue stability information and secondary structure information could be of equivalent utility in more sophisticated fold-recognition techniques. 
The results of this work are a consequence of the idea that amino acid sequences fold not into single, rigidly stable structures but rather into thermodynamic ensembles best represented by a time-averaged structure.

KW - Native state ensemble
KW - Protein stability
KW - Protein structure prediction
KW - Residue thermodynamics
KW - Threading and fold-recognition
UR - http://www.scopus.com/inward/record.url?scp=0035061715&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0035061715&partnerID=8YFLogxK
U2 - 10.1110/ps.01601
DO - 10.1110/ps.01601
M3 - Article
VL - 10
SP - 1032
EP - 1045
JO - Protein Science
JF - Protein Science
SN - 0961-8368
IS - 5
ER -