Protein half-life is an important feature of protein homeostasis (proteostasis). The increasing number of in vivo and in vitro studies using high-throughput proteomics provides estimates of protein half-lives in tissues and cells. However, protein half-lives in cells and tissues differ. Because tissue studies are resource-intensive, more data are available from cellular studies than from tissue studies. We have designed a multivariate linear model for predicting the half-life of a protein in tissue from its cellular properties. Inputs to the model are cellular half-life, abundance, intrinsically disordered sequences, and transcriptional and translational rates. Before modeling, we determined substructures in the data using the relative distance from the regression line of the protein half-lives in tissues and cells, identifying three separate clusters. The model was trained on and applied to predict protein half-lives from murine liver, brain and heart tissues. In each tissue type we observed similar prediction patterns of protein half-lives. We found that the model provides the best results when there is a strong correlation between tissue and cell culture protein half-lives. Additionally, we clustered the protein half-lives to determine variations in correlation coefficients between the protein half-lives in tissue and those in cell culture. The clusters identify strongly and weakly correlated protein half-lives, further improve the overall prediction, and identify subgroupings that exhibit specific characteristics. The model described herein is generalizable to other data sets and has been implemented in freely available R code.
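The two steps described above (grouping proteins by their distance from the tissue-versus-cell regression line, then fitting a multivariate linear model) can be illustrated with a minimal sketch. This is not the authors' R implementation: it is written in Python with NumPy, the data are synthetic, the variable names and effect sizes are invented for illustration, and binning residuals into three quantile groups is an assumed stand-in for the clustering rule, which the abstract does not specify.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns coefficients, intercept first."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply fitted coefficients (intercept first) to a feature matrix."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def residual_groups(cell_hl, tissue_hl, n_groups=3):
    """Label each protein by the quantile bin of its residual from the
    tissue ~ cell regression line (assumed grouping rule, not the paper's)."""
    x = cell_hl.reshape(-1, 1)
    resid = tissue_hl - predict(fit_linear(x, tissue_hl), x)
    edges = np.quantile(resid, np.linspace(0, 1, n_groups + 1)[1:-1])
    return np.digitize(resid, edges)

# Synthetic stand-ins for the five model inputs (all values are hypothetical).
rng = np.random.default_rng(0)
n = 300
cell_hl = rng.normal(3.0, 0.5, n)        # log cellular half-life
abundance = rng.normal(10.0, 2.0, n)     # log abundance
disorder = rng.uniform(0.0, 1.0, n)      # fraction of disordered sequence
transcription = rng.normal(0.0, 1.0, n)  # transcriptional rate
translation = rng.normal(0.0, 1.0, n)    # translational rate

# Synthetic tissue half-life generated from an invented linear relationship.
tissue_hl = (1.0 + 0.8 * cell_hl + 0.05 * abundance - 0.3 * disorder
             + rng.normal(0.0, 0.2, n))

X = np.column_stack([cell_hl, abundance, disorder, transcription, translation])
coef = fit_linear(X, tissue_hl)                 # multivariate linear model
groups = residual_groups(cell_hl, tissue_hl)    # three residual-based groups
r = np.corrcoef(predict(coef, X), tissue_hl)[0, 1]
```

Fitting `fit_linear` separately within each value of `groups` would be one way to explore whether group-specific models improve prediction, as the clustering result suggests.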