Interobserver Reproducibility of the PI-RADS Version 2 Lexicon: A Multicenter Study of Six Experienced Prostate Radiologists

Radiology. 2016 Sep;280(3):793-804. doi: 10.1148/radiol.2016152542. Epub 2016 Apr 1.

Abstract

Purpose: To determine the interobserver reproducibility of the Prostate Imaging Reporting and Data System (PI-RADS) version 2 lexicon.

Materials and Methods: This retrospective HIPAA-compliant study was institutional review board-approved. Six radiologists from six separate institutions, all experienced in prostate magnetic resonance (MR) imaging, assessed prostate MR imaging examinations performed at a single center by using the PI-RADS lexicon. Readers were provided screen captures that denoted the location of one specific lesion per case. Analysis entailed two sessions (40 and 80 examinations per session) and an intersession training period for individualized feedback and group discussion. Percent agreement (fraction of pairwise reader combinations with concordant readings) was compared between sessions. κ coefficients were computed.

Results: No substantial difference in interobserver agreement was observed between sessions, and the sessions were subsequently pooled. Agreement for PI-RADS score of 4 or greater was 0.593 in peripheral zone (PZ) and 0.509 in transition zone (TZ). In PZ, reproducibility was moderate to substantial for features related to diffusion-weighted imaging (κ = 0.535-0.619); fair to moderate for features related to dynamic contrast material-enhanced (DCE) imaging (κ = 0.266-0.439); and fair for definite extraprostatic extension on T2-weighted images (κ = 0.289). In TZ, reproducibility for features related to lesion texture and margins on T2-weighted images ranged from 0.136 (moderately hypointense) to 0.529 (encapsulation). Among 63 lesions that underwent targeted biopsy, classification as PI-RADS score of 4 or greater by a majority of readers yielded tumor with a Gleason score of 3+4 or greater in 45.9% (17 of 37), without missing any tumor with a Gleason score of 3+4 or greater.

Conclusion: Experienced radiologists achieved moderate reproducibility for PI-RADS version 2, and neither required nor benefitted from a training session. Agreement tended to be better in PZ than TZ, although it was weak for DCE in PZ. The findings may help guide future PI-RADS lexicon updates.

© RSNA, 2016. Online supplemental material is available for this article.
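
The percent agreement measure described in the abstract (the fraction of pairwise reader combinations with concordant readings) can be illustrated with a minimal sketch. The function name `pairwise_percent_agreement` and the toy ratings are hypothetical and are not taken from the study. The sketch assumes binary calls per reader (for example, PI-RADS score of 4 or greater versus less than 4) and pools reader pairs across lesions, which may differ from how the authors aggregated their data. The last line simply checks the reported biopsy yield of 17 of 37.

```python
from itertools import combinations


def pairwise_percent_agreement(ratings):
    """Fraction of reader pairs, pooled over all lesions, with the same call.

    ratings: list of per-lesion lists, one call per reader
             (here 1 = PI-RADS >= 4, 0 = PI-RADS < 4; an assumed encoding).
    """
    concordant = 0
    total = 0
    for lesion in ratings:
        # Six readers give C(6, 2) = 15 reader pairs per lesion.
        for a, b in combinations(lesion, 2):
            concordant += int(a == b)
            total += 1
    return concordant / total


# Hypothetical calls from six readers on three lesions (not study data).
toy_ratings = [
    [1, 1, 1, 1, 1, 1],  # all agree: 15 of 15 pairs concordant
    [1, 1, 1, 0, 0, 0],  # 3 vs 3 split: 3 + 3 = 6 of 15 pairs concordant
    [0, 0, 0, 0, 0, 1],  # one dissenter: 10 of 15 pairs concordant
]

agreement = pairwise_percent_agreement(toy_ratings)
print(f"pairwise agreement: {agreement:.3f}")  # 31 / 45

# Arithmetic check of the reported yield: 17 of 37 lesions.
yield_pct = round(100 * 17 / 37, 1)
print(f"yield: {yield_pct}%")
```

Percent agreement of this kind does not correct for agreement expected by chance, which is why the abstract also reports κ coefficients; the sketch does not attempt to reproduce those.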

Publication types

  • Multicenter Study

MeSH terms

  • Adult
  • Aged
  • Biopsy
  • Contrast Media
  • Humans
  • Magnetic Resonance Imaging / methods*
  • Male
  • Middle Aged
  • Neoplasm Grading
  • Organometallic Compounds
  • Practice Patterns, Physicians' / statistics & numerical data*
  • Prostatic Diseases / diagnostic imaging*
  • Prostatic Diseases / pathology
  • Reproducibility of Results
  • Retrospective Studies

Substances

  • Contrast Media
  • Organometallic Compounds
  • gadobutrol