Assessing rater performance without a 'gold standard' using consensus theory

Susan C. Weller, N. Clay Mann

Research output: Contribution to journal, Article, peer-review

98 Scopus citations

Abstract

This study illustrates the use of consensus theory to assess the diagnostic performances of raters and to estimate case diagnoses in the absence of a criterion or 'gold' standard. A description is provided of how consensus theory 'pools' information provided by raters, estimating rater competencies and differentially weighting their responses. Although the model assumes that raters respond without bias (i.e., sensitivity = specificity), a Monte Carlo simulation with 1,200 data sets shows that model estimates appear to be robust even with bias. The model is illustrated on a set of elbow radiographs, and consensus-model estimates are compared with those obtained from follow-up data. Results indicate that with high rater competencies, the model retrieves accurate estimates of competency and case diagnoses even when raters' responses are biased.
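
The pooling idea described in the abstract can be sketched in Python. This is a minimal illustration and is not taken from the paper. It assumes dichotomous (0/1) ratings, no response bias, and equal prior odds for the two diagnoses. The function names are hypothetical, and the ratio-of-sums estimator in `estimate_competence` is a simple stand-in for the factor-analytic fit usually used with consensus models. Under the no-bias assumption, the chance-corrected agreement between raters i and j is approximately the product of their competencies, which is what the estimator exploits.

```python
import numpy as np

def match_matrix(responses):
    """Chance-corrected agreement between raters (rows) over cases (columns).

    With two response options and no bias, 2 * (proportion matched) - 1
    approximates the product of the two raters' competencies.
    """
    r = np.asarray(responses)
    agree = (r[:, None, :] == r[None, :, :]).mean(axis=2)
    return 2.0 * agree - 1.0

def estimate_competence(m_star):
    """Estimate each rater's competence from the off-diagonal agreements.

    Assumption: a one-factor structure M*_ij = d_i * d_j for i != j, so
    d_i^2 = sum_{j != k} M*_ij M*_ik / sum_{j != k} M*_jk (j, k != i).
    Needs at least three raters; negative competence is not modeled.
    """
    off = np.array(m_star, dtype=float)
    np.fill_diagonal(off, 0.0)
    n = off.shape[0]
    d = np.empty(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        row = off[i, others]
        sub = off[np.ix_(others, others)]
        num = row.sum() ** 2 - (row ** 2).sum()  # sum over j != k of M_ij * M_ik
        den = sub.sum()                          # sum over j != k of M_jk
        ratio = num / den if den > 0 else 0.0
        d[i] = np.sqrt(np.clip(ratio, 0.0, 1.0))
    return d

def consensus_diagnosis(responses, competence):
    """Competence-weighted vote on each case.

    A rater with competence d is correct with probability (1 + d) / 2,
    so each response contributes +/- log((1 + d) / (1 - d)) to the
    log odds of a positive diagnosis (equal priors assumed).
    """
    r = np.asarray(responses)
    d = np.clip(competence, 1e-6, 1 - 1e-6)
    weights = np.log((1 + d) / (1 - d))
    score = weights @ (2 * r - 1)  # responses recoded to +1 / -1
    return (score > 0).astype(int), score
```

In use, `estimate_competence(match_matrix(ratings))` would be passed to `consensus_diagnosis(ratings, ...)`, where `ratings` is a raters-by-cases array. Because the weights grow steeply with competence, one highly competent rater can outvote several weak ones.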

Original language: English (US)
Pages (from-to): 71-79
Number of pages: 9
Journal: Medical Decision Making
Volume: 17
Issue number: 1
State: Published - Jan 1997

Keywords

  • clinical competence
  • consensus theory
  • diagnostic evaluation
  • interobserver variation
  • models-mathematical

ASJC Scopus subject areas

  • Health Policy
