Objectives: The College of American Pathologists (CAP) Category 1 quality measures, tumor stage, Gleason score, and surgical margin status, are used by physicians and cancer registrars to categorize patients into groups for clinical trials and treatment planning. This study was conducted to evaluate the effectiveness of an application designed to automatically extract these quality measures from the postoperative pathology reports of patients having undergone prostatectomies for treatment of prostate cancer.
Design: An application was developed with the Clinical Outcomes Assessment Toolkit that uses an information pipeline of regular expressions and support vector machines to extract CAP Category 1 quality measures. System performance was evaluated against a gold standard of 676 pathology reports from the University of California at Los Angeles Medical Center and Brigham and Women's Hospital. To evaluate the feasibility of clinical implementation, all pathology reports were gathered using administrative codes with no manual preprocessing of the data performed.
Measurements: The sensitivity, specificity, and overall accuracy of system performance were measured for all three quality measures. Performance at both hospitals was compared, and a detailed failure analysis was conducted to identify errors caused by poor data quality versus system shortcomings.
Results: Accuracies for Gleason score were 99.7%, tumor stage 99.1%, and margin status 97.2%, for an overall accuracy of 98.67%. System performance on data from both hospitals was comparable. Poor clinical data quality led to a decrease in overall accuracy of only 0.3% but accounted for 25.9% of the total errors.
Conclusion: Despite differences in document format and pathologists' reporting styles, strong system performance indicates the potential of using a combination of regular expressions and support vector machines to automatically extract CAP Category 1 quality measures from postoperative prostate cancer pathology reports.
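As an illustration of the regular-expression stage and the reported measurements, the following is a minimal Python sketch, not the study's application. The patterns `GLEASON_RE` and `MARGIN_RE`, the function names, and the sample report phrasings are all hypothetical, and the support vector machine stage is omitted entirely. The last function computes sensitivity, specificity, and accuracy for a binary measure such as margin status from predicted and gold-standard labels.

```python
import re

# Hypothetical pattern (an assumption, not the study's): matches phrasings
# such as "Gleason score 3+4=7" or "GLEASON GRADE: 4 + 3".
GLEASON_RE = re.compile(
    r"gleason\s+(?:score|grade|sum)?\s*:?\s*(\d)\s*\+\s*(\d)",
    re.IGNORECASE,
)

# Hypothetical pattern for surgical margin status.
MARGIN_RE = re.compile(
    r"margins?\s*:?\s*(?:(?:is|are)\s+)?"
    r"(positive|negative|involved|uninvolved|free)",
    re.IGNORECASE,
)

# Terms treated as a positive margin in this sketch (an assumption).
POSITIVE_TERMS = {"positive", "involved"}


def extract_gleason(report):
    """Return (primary, secondary, sum) or None if no pattern matches."""
    match = GLEASON_RE.search(report)
    if match is None:
        return None
    primary, secondary = int(match.group(1)), int(match.group(2))
    return primary, secondary, primary + secondary


def extract_margin_positive(report):
    """Return True for a positive margin, False for negative, None if unknown."""
    match = MARGIN_RE.search(report)
    if match is None:
        return None
    return match.group(1).lower() in POSITIVE_TERMS


def binary_metrics(predicted, gold):
    """Compute sensitivity, specificity, and accuracy for boolean labels."""
    pairs = list(zip(predicted, gold))
    tp = sum(1 for p, g in pairs if p and g)
    tn = sum(1 for p, g in pairs if not p and not g)
    fp = sum(1 for p, g in pairs if p and not g)
    fn = sum(1 for p, g in pairs if not p and g)
    return {
        # A ratio is None when its denominator is zero.
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        "specificity": tn / (tn + fp) if (tn + fp) else None,
        "accuracy": (tp + tn) / len(pairs) if pairs else None,
    }


# The three per-measure accuracies reported in the abstract.
reported = [99.7, 99.1, 97.2]
mean_accuracy = round(sum(reported) / len(reported), 2)
```

The reported overall accuracy of 98.67% is consistent with the unweighted mean of the three per-measure accuracies, which is what `mean_accuracy` computes. Real pathology reports vary far more than these patterns allow, which is presumably why the described system pairs regular expressions with a learned classifier.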
Original language: English (US)
Number of pages: 8
Journal: Journal of the American Medical Informatics Association
State: Published - May 1 2008
ASJC Scopus subject areas
- Health Informatics