Motivation: Inferring the direct relationships between biomolecules from omics datasets is essential for the understanding of biological and disease mechanisms. Gaussian Graphical Model (GGM) provides a fairly simple and accurate representation of these interactions. However, estimation of the associated interaction matrix using data is challenging due to a high number of measured molecules and a low number of samples. Results: In this article, we use the thermodynamic entropy of the non-equilibrium system of molecules and the data-driven constraints among their expressions to derive an analytic formula for the interaction matrix of Gaussian models. Through a data simulation, we show that our method returns an improved estimation of the interaction matrix. Also, using the developed method, we estimate the interaction matrix associated with plasma proteome and construct the corresponding GGM and show that known NAFLD-related proteins like ADIPOQ, APOC, APOE, DPP4, CAT, GC, HP, CETP, SERPINA1, COLA1, PIGR, IGHD, SAA1 and FCGBP are among the top 15% most interacting proteins of the dataset.
ASJC Scopus subject areas
- Statistics and Probability
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics