Identifying depression with the PHQ-2: A diagnostic meta-analysis

J Affect Disord. 2016 Oct:203:382-395. doi: 10.1016/j.jad.2016.06.003. Epub 2016 Jun 6.

Abstract

Background: There is interest in the use of very brief instruments to identify depression because of the advantages they offer in busy clinical settings. The PHQ-2, consisting of two questions relating to core symptoms of depression (low mood and loss of interest or pleasure), is one such instrument.

Method: A systematic review was conducted to identify studies that had assessed the diagnostic performance of the PHQ-2 to detect major depression. Embase, MEDLINE, PsycINFO and grey literature databases were searched. Reference lists of included studies and previous relevant reviews were also examined. Studies were included if they used the standard scoring system of the PHQ-2, assessed its performance against a gold-standard diagnostic interview, and reported data on its performance at the recommended cut-off point (≥3) or an alternative cut-off point (≥2). After heterogeneity was assessed, data from studies were combined, where appropriate, using bivariate diagnostic meta-analysis to derive sensitivity, specificity, likelihood ratios and diagnostic odds ratios.

Results: 21 studies met the inclusion criteria, totalling N=11,175 people, of whom 1529 had major depressive disorder according to a gold standard. 19 of the 21 included studies reported data for a cut-off point of ≥3. Pooled sensitivity was 0.76 (95% CI=0.68-0.82) and pooled specificity was 0.87 (95% CI=0.82-0.90). However, there was substantial heterogeneity at this cut-off (I²=81.8%). 17 studies reported data on the performance of the measure at a cut-off point of ≥2. Heterogeneity at this cut-off was I²=43.2%; pooled sensitivity was 0.91 (95% CI=0.85-0.94) and pooled specificity was 0.70 (95% CI=0.64-0.76).
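As an illustration of how the pooled estimates relate to likelihood ratios and diagnostic odds ratios, the following minimal Python sketch applies the standard textbook formulas to the pooled sensitivity and specificity reported above. The function name is hypothetical, and the values it prints are point approximations derived from the rounded pooled figures; they are not the likelihood ratios or diagnostic odds ratios estimated by the bivariate model in the paper, which this abstract does not report.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-, DOR) from a single sensitivity/specificity pair.

    Standard definitions:
      LR+ = sensitivity / (1 - specificity)
      LR- = (1 - sensitivity) / specificity
      DOR = LR+ / LR-
    """
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    dor = lr_pos / lr_neg
    return lr_pos, lr_neg, dor


# Pooled (sensitivity, specificity) as reported in the Results, by cut-off.
POOLED = {
    ">=3": (0.76, 0.87),
    ">=2": (0.91, 0.70),
}

for cutoff, (sens, spec) in POOLED.items():
    lr_pos, lr_neg, dor = likelihood_ratios(sens, spec)
    print(f"cut-off {cutoff}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, DOR = {dor:.1f}")
```

Under these formulas, the lower cut-off trades a smaller positive likelihood ratio for a smaller negative likelihood ratio, which is the pattern expected when sensitivity rises and specificity falls.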

Conclusion: The generally lower sensitivity of the PHQ-2 at cut-off ≥3 than in the original validation study (0.83) suggests that a cut-off of ≥2 may be preferable if clinicians want to ensure that few cases of depression are missed. However, in situations in which the prevalence of depression is low, this may result in an unacceptably high false-positive rate because of the associated modest specificity. These results, however, need to be interpreted with caution given the possibility of selectively reported cut-offs.
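The dependence on prevalence described in the conclusion can be made concrete with Bayes' rule. The sketch below is illustrative only: it treats the pooled sensitivity and specificity from the Results as fixed, and the prevalence values of 0.05, 0.10 and 0.20 are hypothetical scenarios chosen for the example, not figures from the paper. The function name is also an assumption. The proportion of depressed participants in the pooled sample (1529 of 11,175) is included for comparison.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Pooled (sensitivity, specificity) from the Results, by cut-off.
POOLED = {
    ">=3": (0.76, 0.87),
    ">=2": (0.91, 0.70),
}

# Hypothetical prevalence scenarios (assumptions for illustration),
# plus the proportion with major depression in the pooled sample.
PREVALENCES = [0.05, 0.10, 0.20, 1529 / 11175]

for cutoff, (sens, spec) in POOLED.items():
    for prev in PREVALENCES:
        ppv, npv = predictive_values(sens, spec, prev)
        print(f"cut-off {cutoff}, prevalence {prev:.3f}: "
              f"PPV = {ppv:.2f}, NPV = {npv:.3f}")
```

In the hypothetical 5% prevalence scenario, most positive screens at a cut-off of ≥2 are false positives under these assumptions, while the negative predictive value stays high; this is the trade-off the conclusion describes.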

Keywords: Diagnostic accuracy; Diagnostic meta-analysis; Major depression; PHQ-2; Screening; Ultra-brief screening instruments.

Publication types

  • Meta-Analysis
  • Systematic Review

MeSH terms

  • Depression / diagnosis*
  • Depressive Disorder, Major / diagnosis*
  • Humans
  • Psychiatric Status Rating Scales*
  • Sensitivity and Specificity