Lightweight predicate extraction for patient-level cancer information and ontology development

Muhammad Amith, Hsing Yi Song, Yaoyun Zhang, Hua Xu, Cui Tao

Research output: Contribution to journal › Article › peer-review

7 Scopus citations


Background: Knowledge engineering for ontological knowledgebases is resource- and time-intensive. To alleviate these issues, especially for novices, automated tools from the natural language processing domain can assist in the ontology development process. We focus on developing ontologies for the public health domain and use patient-centric sources from MedlinePlus related to cancers caused by HPV.

Methods: This paper demonstrates the use of a lightweight open information extraction (OIE) tool to derive accurate knowledge triples that can seed an ontological knowledgebase. We developed a custom application, which interfaced with an information extraction software library, to facilitate producing knowledge triples from textual sources.

Results: Our efforts generated extractions with precision ranging from 80% to 89%. These triples can later be transformed to an OWL/RDF representation for our planned ontological knowledgebase.

Conclusions: OIE delivers an effective and accessible method for the development of ontologies.
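As a rough illustration of the pipeline the abstract describes (extracted triples, a precision figure, and a later move to an RDF representation), the following minimal Python sketch serializes subject-predicate-object phrases as N-Triples and computes precision from reviewer judgments. It is not the authors' application: the namespace `http://example.org/hpv-onto#`, the function names, and the sample triples are all hypothetical and chosen only for illustration.

```python
from urllib.parse import quote

# Hypothetical placeholder namespace; the paper does not specify one.
BASE = "http://example.org/hpv-onto#"


def to_iri(label: str) -> str:
    """Turn a free-text phrase into an IRI under the placeholder namespace."""
    slug = "_".join(label.strip().lower().split())
    return f"<{BASE}{quote(slug)}>"


def to_ntriples(triples) -> str:
    """Serialize (subject, predicate, object) phrases as N-Triples lines."""
    return "\n".join(
        f"{to_iri(s)} {to_iri(p)} {to_iri(o)} ." for s, p, o in triples
    )


def precision(judgments) -> float:
    """Fraction of extractions that a reviewer marked as correct."""
    if not judgments:
        raise ValueError("no judgments supplied")
    return sum(judgments) / len(judgments)


# Made-up example extractions, not taken from the paper's data.
extracted = [
    ("HPV", "can cause", "cervical cancer"),
    ("Pap test", "detects", "abnormal cells"),
]

if __name__ == "__main__":
    print(to_ntriples(extracted))
    # Made-up reviewer judgments: True means the extraction was judged correct.
    print(precision([True, True, False, True, True]))
```

In this sketch every phrase becomes an IRI for simplicity; an actual ontology would more likely map predicates to object or data properties and subjects and objects to classes or individuals, which requires the manual curation the abstract alludes to.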

Original language: English (US)
Article number: 73
Journal: BMC Medical Informatics and Decision Making
State: Published - Jul 5 2017
Externally published: Yes


Keywords

  • Natural language processing
  • Ontology learning
  • Open information extraction
  • Public health
  • Semi-automated ontology development

ASJC Scopus subject areas

  • Health Policy
  • Health Informatics
  • Computer Science Applications

