TY - JOUR
T1 - Assessment of health conditions from patient electronic health record portals vs self-reported questionnaires
T2 - An analysis of the INSPIRE study
AU - Khera, Rohan
AU - Sawano, Mitsuaki
AU - Warner, Frederick
AU - Coppi, Andreas
AU - Pedroso, Aline F.
AU - Spatz, Erica S.
AU - Yu, Huihui
AU - Gottlieb, Michael
AU - Saydah, Sharon
AU - Stephens, Kari A.
AU - Rising, Kristin L.
AU - Elmore, Joann G.
AU - Hill, Mandy J.
AU - Idris, Ahamed H.
AU - Montoy, Juan Carlos C.
AU - O'Laughlin, Kelli N.
AU - Weinstein, Robert A.
AU - Venkatesh, Arjun
N1 - Publisher Copyright:
© 2025 The Author(s). All rights reserved.
PY - 2025/5/1
Y1 - 2025/5/1
N2 - Objectives: Direct electronic access to multiple electronic health record (EHR) systems through patient portals offers a novel avenue for decentralized research. Given the critical value of patient characterization, we sought to compare computable evaluation of health conditions from patient-portal EHR against traditional self-report. Materials and Methods: In the nationwide Innovative Support for Patients with SARS-CoV-2 Infections Registry (INSPIRE) study, which linked self-reported questionnaires with multiplatform patient-portal EHR data, we compared self-reported health conditions across different clinical domains against computable definitions based on diagnosis codes, medications, vital signs, and laboratory testing. We assessed their concordance using Cohen's Kappa and the prognostic significance of differentially captured features as predictors of 1-year all-cause hospitalization risk. Results: Among 1683 participants (mean age 41 ± 15 years, 67% female, 63% non-Hispanic White), the prevalence of conditions varied substantially between EHR and self-report (-13.2% to +11.6% across definitions). Compared with comprehensive EHR phenotypes, self-report under-captured all conditions, including hypertension (EHR 27.9% vs self-report 16.2%), diabetes (10.1% vs 6.2%), and heart disease (8.5% vs 4.3%). However, diagnosis codes alone were insufficient. The risk for 1-year hospitalization was better defined by the same features from patient-portal EHR (area under the receiver operating characteristic curve [AUROC] 0.79) than from self-report (AUROC 0.68). Discussion: EHR-derived computable phenotypes identified a higher prevalence of comorbidities than self-report, and the additionally identified features carried prognostic value. However, definitions based solely on diagnosis codes often under-captured self-reported conditions, suggesting a role for broader EHR elements. 
Conclusion: In this nationwide study, patient-portal-derived EHR data enabled extensive capture of patient characteristics across multiple EHR platforms, allowing better disease phenotyping compared with self-report.
KW - decentralized
KW - multicenter
KW - patient portal
KW - pragmatic studies
UR - https://www.scopus.com/pages/publications/105003761442
UR - https://www.scopus.com/inward/citedby.url?scp=105003761442&partnerID=8YFLogxK
U2 - 10.1093/jamia/ocaf027
DO - 10.1093/jamia/ocaf027
M3 - Article
C2 - 40036551
AN - SCOPUS:105003761442
SN - 1067-5027
VL - 32
SP - 784
EP - 794
JO - Journal of the American Medical Informatics Association
JF - Journal of the American Medical Informatics Association
IS - 5
ER -