Thermodynamic environments in proteins: Fundamental determinants of fold specificity

James O. Wrabl, Scott A. Larson, Vincent J. Hilser

Research output: Contribution to journal › Article

14 Citations (Scopus)

Abstract

To investigate the relationship between an amino acid sequence and its corresponding protein fold, a database of thermodynamic stability information was assembled as a function of residue type from 81 nonhomologous proteins. This information was obtained using the COREX algorithm, which computes an ensemble-based description of the native state of proteins. Dissection of the COREX stability constant into its fundamental energetic components resulted in 12 thermodynamic environments describing the tertiary architecture of protein folds. Because of the observation that residue types partitioned unequally between these environments, it was hypothesized that thermodynamic environments contained energetic information that connected sequence to fold. To test the significance of this hypothesis, the thermodynamic stability information was incorporated into a three-dimensional-to-one-dimensional scoring matrix, and simple fold recognition experiments were performed in a manner such that information about the fold target was never included in the scoring. For 60 out of 81 fold targets, the correct sequence for the target scored in the top 5% of 3858 decoy sequences, with Z-scores ranging from 1.76 to 12.23. Furthermore, a scoring matrix assembled from the residues of 40 nonhomologous all-β proteins was used to thread sequences against 12 nonhomologous all-β protein targets. In 10 of 12 cases, sequences known to adopt the native all-β structure scored in the top 5% of 3858 decoy sequences, with Z-scores ranging from 1.99 to 7.94. These results indicate that energetic information encoded by thermodynamic environments represents a fundamental property of proteins that underlies classifications based on secondary structure.
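The fold recognition test described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the three environment labels, the toy scoring matrix, the gapless position-by-position scoring, the use of shuffled sequences as decoys, the population standard deviation, and the convention that a higher score is better are all assumptions made for illustration. Only the general idea comes from the abstract: score a sequence against a structure described as a string of thermodynamic environments, then compare the native sequence's score with the decoy distribution.

```python
import random
import statistics


def thread_score(sequence, environments, matrix):
    """Gapless 3D-to-1D score: sum matrix[environment][residue] over positions."""
    if len(sequence) != len(environments):
        raise ValueError("sequence and environment string must have equal length")
    return sum(matrix[env][res] for res, env in zip(sequence, environments))


def z_score(native, decoy_scores):
    """Native score in units of decoy standard deviations above the decoy mean."""
    mean = statistics.fmean(decoy_scores)
    sd = statistics.pstdev(decoy_scores)  # population SD (an assumption)
    if sd == 0:
        raise ValueError("decoy scores have zero variance")
    return (native - mean) / sd


def top_fraction(native, decoy_scores):
    """Fraction of decoys scoring at least as well as the native sequence."""
    return sum(1 for d in decoy_scores if d >= native) / len(decoy_scores)


# Hypothetical toy data: three environments and three residue types.
MATRIX = {
    "E1": {"A": 1.0, "L": -0.5, "K": -0.5},
    "E2": {"A": -0.5, "L": 1.0, "K": -0.5},
    "E3": {"A": -0.5, "L": -0.5, "K": 1.0},
}
TARGET = ["E1", "E2", "E3", "E2", "E1", "E3", "E2", "E1"]
NATIVE = "ALKLAKLA"

rng = random.Random(0)
decoys = []
for _ in range(200):
    residues = list(NATIVE)
    rng.shuffle(residues)  # shuffled decoys stand in for a real decoy library
    decoys.append(thread_score("".join(residues), TARGET, MATRIX))

native_score = thread_score(NATIVE, TARGET, MATRIX)
print("native score:", native_score)
print("Z-score:", round(z_score(native_score, decoys), 2))
print("in top 5%:", top_fraction(native_score, decoys) <= 0.05)
```

For scale, the top 5% of 3858 decoy sequences corresponds to roughly the 193 best-scoring decoys, and the reported success rates of 60 out of 81 and 10 out of 12 are about 74% and 83%, respectively.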

Original language: English (US)
Pages (from-to): 1945-1957
Number of pages: 13
Journal: Protein Science
Volume: 11
Issue number: 8
DOI: 10.1110/ps.0203202
State: Published - 2002

Keywords

  • Native state ensemble
  • Protein stability
  • Protein structure prediction
  • Residue thermodynamics
  • Secondary structure
  • Threading and fold recognition

ASJC Scopus subject areas

  • Biochemistry

Cite this

Thermodynamic environments in proteins: Fundamental determinants of fold specificity. / Wrabl, James O.; Larson, Scott A.; Hilser, Vincent J.

In: Protein Science, Vol. 11, No. 8, 2002, p. 1945-1957.

@article{91a5fb15023744c981ad959f645de783,
title = "Thermodynamic environments in proteins: Fundamental determinants of fold specificity",
abstract = "To investigate the relationship between an amino acid sequence and its corresponding protein fold, a database of thermodynamic stability information was assembled as a function of residue type from 81 nonhomologous proteins. This information was obtained using the COREX algorithm, which computes an ensemble-based description of the native state of proteins. Dissection of the COREX stability constant into its fundamental energetic components resulted in 12 thermodynamic environments describing the tertiary architecture of protein folds. Because of the observation that residue types partitioned unequally between these environments, it was hypothesized that thermodynamic environments contained energetic information that connected sequence to fold. To test the significance of this hypothesis, the thermodynamic stability information was incorporated into a three-dimensional-to-one-dimensional scoring matrix, and simple fold recognition experiments were performed in a manner such that information about the fold target was never included in the scoring. For 60 out of 81 fold targets, the correct sequence for the target scored in the top 5{\%} of 3858 decoy sequences, with Z-scores ranging from 1.76 to 12.23. Furthermore, a scoring matrix assembled from the residues of 40 nonhomologous all-β proteins was used to thread sequences against 12 nonhomologous all-β protein targets. In 10 of 12 cases, sequences known to adopt the native all-β structure scored in the top 5{\%} of 3858 decoy sequences, with Z-scores ranging from 1.99 to 7.94. These results indicate that energetic information encoded by thermodynamic environments represents a fundamental property of proteins that underlies classifications based on secondary structure.",
keywords = "Native state ensemble, Protein stability, Protein structure prediction, Residue thermodynamics, Secondary structure, Threading and fold recognition",
author = "Wrabl, {James O.} and Larson, {Scott A.} and Hilser, {Vincent J.}",
year = "2002",
doi = "10.1110/ps.0203202",
language = "English (US)",
volume = "11",
pages = "1945--1957",
journal = "Protein Science",
issn = "0961-8368",
publisher = "Cold Spring Harbor Laboratory Press",
number = "8",

}

TY - JOUR
T1 - Thermodynamic environments in proteins
T2 - Fundamental determinants of fold specificity
AU - Wrabl, James O.
AU - Larson, Scott A.
AU - Hilser, Vincent J.
PY - 2002
Y1 - 2002
AB - To investigate the relationship between an amino acid sequence and its corresponding protein fold, a database of thermodynamic stability information was assembled as a function of residue type from 81 nonhomologous proteins. This information was obtained using the COREX algorithm, which computes an ensemble-based description of the native state of proteins. Dissection of the COREX stability constant into its fundamental energetic components resulted in 12 thermodynamic environments describing the tertiary architecture of protein folds. Because of the observation that residue types partitioned unequally between these environments, it was hypothesized that thermodynamic environments contained energetic information that connected sequence to fold. To test the significance of this hypothesis, the thermodynamic stability information was incorporated into a three-dimensional-to-one-dimensional scoring matrix, and simple fold recognition experiments were performed in a manner such that information about the fold target was never included in the scoring. For 60 out of 81 fold targets, the correct sequence for the target scored in the top 5% of 3858 decoy sequences, with Z-scores ranging from 1.76 to 12.23. Furthermore, a scoring matrix assembled from the residues of 40 nonhomologous all-β proteins was used to thread sequences against 12 nonhomologous all-β protein targets. In 10 of 12 cases, sequences known to adopt the native all-β structure scored in the top 5% of 3858 decoy sequences, with Z-scores ranging from 1.99 to 7.94. These results indicate that energetic information encoded by thermodynamic environments represents a fundamental property of proteins that underlies classifications based on secondary structure.
KW - Native state ensemble
KW - Protein stability
KW - Protein structure prediction
KW - Residue thermodynamics
KW - Secondary structure
KW - Threading and fold recognition
UR - http://www.scopus.com/inward/record.url?scp=0036076716&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0036076716&partnerID=8YFLogxK
U2 - 10.1110/ps.0203202
DO - 10.1110/ps.0203202
M3 - Article
C2 - 12142449
AN - SCOPUS:0036076716
VL - 11
SP - 1945
EP - 1957
JO - Protein Science
JF - Protein Science
SN - 0961-8368
IS - 8
ER -