Complex epilepsy phenotype extraction from narrative clinical discharge summaries

Licong Cui, Satya S. Sahoo, Samden D. Lhatoo, Gaurav Garg, Prashant Rai, Alireza Bozorgi, Guo Qiang Zhang

Research output: Contribution to journal (Article)

9 Citations (Scopus)

Abstract

Epilepsy is a common serious neurological disorder with a complex set of possible phenotypes ranging from pathologic abnormalities to variations in electroencephalogram. This paper presents a system called Phenotype Extraction in Epilepsy (PEEP) for extracting complex epilepsy phenotypes and their correlated anatomical locations from clinical discharge summaries, a primary data source for this purpose. PEEP generates candidate phenotype and anatomical location pairs by embedding a named entity recognition method, based on the Epilepsy and Seizure Ontology, into the National Library of Medicine's MetaMap program. Such candidate pairs are further processed using a correlation algorithm. The derived phenotypes and correlated locations have been used for cohort identification with an integrated ontology-driven visual query interface. To evaluate the performance of PEEP, 400 de-identified discharge summaries were used for development and an additional 262 were used as test data. PEEP achieved a micro-averaged precision of 0.924, recall of 0.931, and F1-measure of 0.927 for extracting epilepsy phenotypes. The performance on the extraction of correlated phenotypes and anatomical locations shows a micro-averaged F1-measure of 0.856 (Precision: 0.852, Recall: 0.859). The evaluation demonstrates that PEEP is an effective approach to extracting complex epilepsy phenotypes for cohort identification.
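For readers unfamiliar with the metric, the following is a minimal sketch of how micro-averaged precision, recall, and F1-measure are conventionally computed: true positives, false positives, and false negatives are pooled across all categories before the ratios are taken. It is not the authors' evaluation code. The function names and the per-category counts are hypothetical and chosen only for illustration; the abstract reports the final scores, not the underlying counts.

```python
from typing import Dict, Tuple


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def micro_prf(counts: Dict[str, Tuple[int, int, int]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1.

    `counts` maps a category name to (true positives, false positives,
    false negatives). Counts are summed over all categories first, so
    frequent categories weigh more than rare ones.
    """
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_score(precision, recall)


# Hypothetical counts for two made-up categories (not data from the paper).
example_counts = {
    "category_a": (8, 2, 0),
    "category_b": (2, 0, 2),
}
p, r, f = micro_prf(example_counts)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")

# Consistency check on the reported phenotype-extraction scores:
# the harmonic mean of 0.924 and 0.931 rounds to the reported 0.927.
print(round(f1_score(0.924, 0.931), 3))
```

Because the published precision and recall are themselves rounded to three decimals, recomputing F1 from them can differ from a reported F1 in the last digit.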

Original language: English (US)
Pages (from-to): 272-279
Number of pages: 8
Journal: Journal of Biomedical Informatics
Volume: 51
DOI: 10.1016/j.jbi.2014.06.006
State: Published - Oct 1 2014
Externally published: Yes

Fingerprint

Epilepsy
Phenotype
Ontology
Electroencephalography
Interfaces (computer)
Medicine
National Library of Medicine (U.S.)
Information Storage and Retrieval
Nervous System Diseases
Seizures

Keywords

  • Cohort identification
  • Epilepsy
  • Information extraction

ASJC Scopus subject areas

  • Computer Science Applications
  • Health Informatics

Cite this

Complex epilepsy phenotype extraction from narrative clinical discharge summaries. / Cui, Licong; Sahoo, Satya S.; Lhatoo, Samden D.; Garg, Gaurav; Rai, Prashant; Bozorgi, Alireza; Zhang, Guo Qiang.

In: Journal of Biomedical Informatics, Vol. 51, 01.10.2014, p. 272-279.

Cui, Licong ; Sahoo, Satya S. ; Lhatoo, Samden D. ; Garg, Gaurav ; Rai, Prashant ; Bozorgi, Alireza ; Zhang, Guo Qiang. / Complex epilepsy phenotype extraction from narrative clinical discharge summaries. In: Journal of Biomedical Informatics. 2014 ; Vol. 51. pp. 272-279.
@article{b65d7cdb18ed45f681fe5ed8905ec0b8,
title = "Complex epilepsy phenotype extraction from narrative clinical discharge summaries",
abstract = "Epilepsy is a common serious neurological disorder with a complex set of possible phenotypes ranging from pathologic abnormalities to variations in electroencephalogram. This paper presents a system called Phenotype Extraction in Epilepsy (PEEP) for extracting complex epilepsy phenotypes and their correlated anatomical locations from clinical discharge summaries, a primary data source for this purpose. PEEP generates candidate phenotype and anatomical location pairs by embedding a named entity recognition method, based on the Epilepsy and Seizure Ontology, into the National Library of Medicine's MetaMap program. Such candidate pairs are further processed using a correlation algorithm. The derived phenotypes and correlated locations have been used for cohort identification with an integrated ontology-driven visual query interface. To evaluate the performance of PEEP, 400 de-identified discharge summaries were used for development and an additional 262 were used as test data. PEEP achieved a micro-averaged precision of 0.924, recall of 0.931, and F1-measure of 0.927 for extracting epilepsy phenotypes. The performance on the extraction of correlated phenotypes and anatomical locations shows a micro-averaged F1-measure of 0.856 (Precision: 0.852, Recall: 0.859). The evaluation demonstrates that PEEP is an effective approach to extracting complex epilepsy phenotypes for cohort identification.",
keywords = "Cohort identification, Epilepsy, Information extraction",
author = "Cui, Licong and Sahoo, Satya S. and Lhatoo, Samden D. and Garg, Gaurav and Rai, Prashant and Bozorgi, Alireza and Zhang, Guo Qiang",
year = "2014",
month = "10",
day = "1",
doi = "10.1016/j.jbi.2014.06.006",
language = "English (US)",
volume = "51",
pages = "272--279",
journal = "Journal of Biomedical Informatics",
issn = "1532-0464",
publisher = "Academic Press Inc.",

}

TY - JOUR

T1 - Complex epilepsy phenotype extraction from narrative clinical discharge summaries

AU - Cui, Licong

AU - Sahoo, Satya S.

AU - Lhatoo, Samden D.

AU - Garg, Gaurav

AU - Rai, Prashant

AU - Bozorgi, Alireza

AU - Zhang, Guo Qiang

PY - 2014/10/1

Y1 - 2014/10/1

N2 - Epilepsy is a common serious neurological disorder with a complex set of possible phenotypes ranging from pathologic abnormalities to variations in electroencephalogram. This paper presents a system called Phenotype Extraction in Epilepsy (PEEP) for extracting complex epilepsy phenotypes and their correlated anatomical locations from clinical discharge summaries, a primary data source for this purpose. PEEP generates candidate phenotype and anatomical location pairs by embedding a named entity recognition method, based on the Epilepsy and Seizure Ontology, into the National Library of Medicine's MetaMap program. Such candidate pairs are further processed using a correlation algorithm. The derived phenotypes and correlated locations have been used for cohort identification with an integrated ontology-driven visual query interface. To evaluate the performance of PEEP, 400 de-identified discharge summaries were used for development and an additional 262 were used as test data. PEEP achieved a micro-averaged precision of 0.924, recall of 0.931, and F1-measure of 0.927 for extracting epilepsy phenotypes. The performance on the extraction of correlated phenotypes and anatomical locations shows a micro-averaged F1-measure of 0.856 (Precision: 0.852, Recall: 0.859). The evaluation demonstrates that PEEP is an effective approach to extracting complex epilepsy phenotypes for cohort identification.

AB - Epilepsy is a common serious neurological disorder with a complex set of possible phenotypes ranging from pathologic abnormalities to variations in electroencephalogram. This paper presents a system called Phenotype Extraction in Epilepsy (PEEP) for extracting complex epilepsy phenotypes and their correlated anatomical locations from clinical discharge summaries, a primary data source for this purpose. PEEP generates candidate phenotype and anatomical location pairs by embedding a named entity recognition method, based on the Epilepsy and Seizure Ontology, into the National Library of Medicine's MetaMap program. Such candidate pairs are further processed using a correlation algorithm. The derived phenotypes and correlated locations have been used for cohort identification with an integrated ontology-driven visual query interface. To evaluate the performance of PEEP, 400 de-identified discharge summaries were used for development and an additional 262 were used as test data. PEEP achieved a micro-averaged precision of 0.924, recall of 0.931, and F1-measure of 0.927 for extracting epilepsy phenotypes. The performance on the extraction of correlated phenotypes and anatomical locations shows a micro-averaged F1-measure of 0.856 (Precision: 0.852, Recall: 0.859). The evaluation demonstrates that PEEP is an effective approach to extracting complex epilepsy phenotypes for cohort identification.

KW - Cohort identification

KW - Epilepsy

KW - Information extraction

UR - http://www.scopus.com/inward/record.url?scp=84908022842&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84908022842&partnerID=8YFLogxK

U2 - 10.1016/j.jbi.2014.06.006

DO - 10.1016/j.jbi.2014.06.006

M3 - Article

VL - 51

SP - 272

EP - 279

JO - Journal of Biomedical Informatics

JF - Journal of Biomedical Informatics

SN - 1532-0464

ER -