Interrater and intrarater agreements of magnetic resonance imaging findings in the lumbar spine: significant variability across degenerative conditions

Spine J. 2014 Oct 1;14(10):2442-8. doi: 10.1016/j.spinee.2014.03.010. Epub 2014 Mar 15.

Abstract

Background context: Magnetic resonance imaging (MRI) is frequently used in the evaluation of degenerative conditions in the lumbar spine. The relative interrater and intrarater agreements of MRI findings across different pathologic conditions remain underexplored because most studies focus on specific findings.

Purpose: The purpose of this study was to characterize the interrater and intrarater agreements of MRI findings used to assess the degenerative lumbar spine.

Study design: A retrospective diagnostic study at a large academic medical center was undertaken with a panel of orthopedic surgeons and musculoskeletal radiologists to assess lumbar MRIs using standardized criteria.

Patient sample: Seventy-five subjects who underwent routine lumbar spine MRI at our institution were included.

Outcome measures: Each MRI study was assessed for 10 lumbar degenerative findings using standardized criteria. Lumbar vertebral levels were assessed independently, where applicable, for a total of 52 data points collected per study.

Methods: T2-weighted axial and sagittal MRI sequences were presented in random order to four reviewers (two orthopedic spine surgeons and two musculoskeletal radiologists), who assessed them independently, to determine interrater agreement. The first 10 studies were reevaluated at the end to determine intrarater agreement. Images were graded using standardized, pilot-tested criteria for disc degeneration, stenosis, and other degenerative changes. Interrater and intrarater absolute percent agreements were calculated. To highlight the most clinically important MRI disagreements, a modified agreement analysis was also performed, in which disagreements between the lowest two severity grades for applicable conditions were ignored. Fleiss kappa coefficients for interrater agreement were determined.
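
The three agreement measures named in the Methods can be illustrated with a minimal Python sketch. The abstract does not state how percent agreement was aggregated across the four reviewers, so the mean pairwise agreement used here is an assumption. The integer grade coding (0 and 1 as the two lowest severity grades), the function names, and the example ratings are likewise hypothetical and are not study data. The Fleiss kappa function follows the standard textbook formula.

```python
from collections import Counter
from itertools import combinations


def pairwise_percent_agreement(ratings):
    """Mean pairwise agreement (%) over all rater pairs and items.

    ratings: list of items, each a list with one grade per rater.
    ASSUMPTION: the abstract does not define the aggregation; mean
    pairwise agreement is one common choice.
    """
    agree = total = 0
    for item in ratings:
        for a, b in combinations(item, 2):
            agree += int(a == b)
            total += 1
    return 100.0 * agree / total


def collapse_lowest_two(ratings, lowest=0, second=1):
    """Merge the two lowest severity grades so that disagreements
    between them are ignored (the 'modified' analysis).
    ASSUMPTION: grades are coded as integers, with 0 and 1 lowest.
    """
    return [[lowest if g == second else g for g in item] for item in ratings]


def fleiss_kappa(ratings):
    """Fleiss' kappa for a fixed number of raters per item.

    Undefined (division by zero) if every rating is the same category.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({g for item in ratings for g in item})
    counts = [Counter(item) for item in ratings]

    # Observed agreement: mean of per-item agreement P_i.
    p_bar = sum(
        (sum(c[k] ** 2 for k in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ) / n_items

    # Chance agreement from the overall category proportions.
    p_e = sum(
        (sum(c[k] for c in counts) / (n_items * n_raters)) ** 2
        for k in categories
    )
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical grades from four raters on five items (not study data).
example = [
    [0, 0, 0, 0],
    [0, 1, 0, 1],
    [2, 2, 2, 1],
    [1, 1, 1, 1],
    [2, 2, 0, 2],
]

absolute = pairwise_percent_agreement(example)
modified = pairwise_percent_agreement(collapse_lowest_two(example))
kappa = fleiss_kappa(example)
print(f"absolute={absolute:.1f}% modified={modified:.1f}% kappa={kappa:.3f}")
```

In this toy example the modified agreement exceeds the absolute agreement because the split between grades 0 and 1 on the second item no longer counts as a disagreement, which is the same direction of effect reported in the Results.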

Results: The overall absolute and modified interrater agreements were 76.9% and 93.5%, respectively. The absolute and modified intrarater agreements were 81.3% and 92.7%, respectively. The average Fleiss kappa coefficient was 0.431, suggesting moderate overall agreement. However, when stratified by condition, absolute interrater agreement ranged from 65.1% to 92.0%. Disc hydration, disc space height, and bone marrow changes exhibited the lowest absolute interrater agreements. The absolute intrarater agreement had a narrower range, from 74.5% to 91.5%. Fleiss kappa coefficients ranged from 0.282 to 0.618, indicating fair to substantial agreement.
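
The verbal labels attached to the kappa values above ("fair", "moderate", "substantial") match the widely used Landis and Koch benchmarks. The abstract does not name the scale it applied, so the mapping below is an assumption, and the function name is hypothetical. A minimal sketch:

```python
def landis_koch_label(kappa):
    """Map a kappa coefficient to a Landis and Koch descriptor.

    ASSUMPTION: the abstract does not say which scale was used; these
    are the conventional Landis and Koch cut points.
    """
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Kappa values reported in the Results.
for value in (0.282, 0.431, 0.618):
    print(value, landis_koch_label(value))
```

Under these cut points the reported average (0.431) falls in the moderate band, and the reported range (0.282 to 0.618) spans fair to substantial, consistent with the wording of the Results.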

Conclusions: Even in a study using standardized evaluation criteria, there was significant variability in the interrater and intrarater agreements of MRI in assessing different degenerative conditions of the lumbar spine. Clinicians should be aware of the condition-specific diagnostic limitations of MRI interpretation.

Keywords: Interrater agreement; Intrarater agreement; Lumbar degeneration; Lumbar imaging; Lumbar spine; Magnetic resonance imaging; Spondylosis.

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Aged, 80 and over
  • Female
  • Humans
  • Intervertebral Disc Degeneration / pathology*
  • Lumbar Vertebrae / pathology*
  • Lumbosacral Region / pathology*
  • Magnetic Resonance Imaging / methods*
  • Male
  • Middle Aged
  • Observer Variation
  • Reproducibility of Results
  • Retrospective Studies
  • Spinal Stenosis / pathology*
  • Young Adult