NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.
Pignone M, Gaynes BN, Rushton JL, et al. Screening for Depression [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2002 Apr. (Systematic Evidence Reviews, No. 6.)
This publication is provided for historical reference only and the information may be out of date.
We recorded detailed information, including demographic characteristics of the study population, descriptors of study design and setting, diagnoses and conditions of interest, criterion standard used for measurement (for screening topics), numerous outcome measures, and indicators of quality in the Evidence Tables in Appendix D. The tables cover, respectively, screening accuracy (41 entries in Evidence Table 1);17,34-73 pharmacologic treatment (7 entries covering 9 publications in Evidence Table 2);74-82 psychotherapeutic treatment (13 entries covering 15 publications in Evidence Table 3);74,77-90 and screening outcomes (13 entries in Evidence Table 4).91-103 Some articles appear in more than one Evidence Table. (See the main glossary in Appendix B and the specialized glossary in Appendix D for abbreviations.)
Key Question 1: Accuracy of Screening Tests for Depression
For screening to be effective, reliable, accurate, feasible, and acceptable screening methods must be available. On the advice of the USPSTF liaisons, we focused the review on diagnostic accuracy, that is, the ability of the instruments to classify patients correctly as depressed or well. We comment briefly on feasibility and acceptability but have not systematically reviewed the literature in those areas.
Screening Accuracy in Adults
Multiple reliable depression screening instruments are available for adults.3,104 Numerous studies have examined the diagnostic accuracy of screening tests for depression in adults. We identified 33 articles that had been published from January 1994 to August 1999 and 8 older articles published from 1966 to December 1993 that examined the sensitivity and specificity of 13 different screening instruments against a criterion standard for the diagnosis of depression. The following sections examine several aspects of the diagnostic performance of the screening tests in different populations, including community, general practice, or primary care patients, the elderly, children and adolescents, and special populations. This information is then used to estimate the diagnostic consequences of screening for depression in these different populations.
As with any screening procedure, a positive screen for depression does not establish a diagnosis of a depressive illness. Unlike many other disorders, depression has no universally accepted criterion standard. Several diagnostic instruments have been used to define the presence or absence of depression (Table 6). The most feasible standard in primary care is most likely a comparison of the patient's symptoms with the criteria for depressive illnesses listed in the Diagnostic and Statistical Manual (DSM), particularly DSM-IV. 13 A specific DSM-IV Primary Care Version has been tailored to be a useful aid in diagnosing mental disorders in primary care. 105
After confirming that a patient who screens positive meets the diagnostic criterion for a specific depressive illness, the clinician must consider other potential causes of depression (such as hypothyroidism, depression due to medication or substance use, vitamin deficiencies, or electrolyte imbalances). Additionally, the clinician must take into account other psychiatric illnesses that can present with depressive symptoms (Table 7). Such considerations would require additional history collection and possibly laboratory tests. Should 1 of the additional causes of depressive illness be identified, first steps at treatment may be directed at this underlying etiology. Otherwise, treatment for the depressive illness (whether in the primary care setting or by referral to a mental health professional) can be initiated.
The 41 studies in Evidence Table 1 (Appendix D; listed in alphabetical order by author) include 24 studies of adults in community or primary care settings, 12 articles that address screening in older adults, and 5 studies performed in special populations. The primary screening instruments used in these studies are the Center for Epidemiologic Studies Depression Scale (CES-D), used as the main instrument in 13 studies; the Geriatric Depression Scale (GDS), used in 6 studies; the General Health Questionnaire (GHQ), used in 4 studies; the Beck Depression Inventory (BDI), used in 3 studies; the Zung Self-Rating Depression Scale (SDS), used in 3 studies; the Symptom Driven Diagnostic System - Primary Care (SDDS-PC), used in 2 studies; the Self Care-D, used in 2 studies; the depression screening module of the Medical Outcomes Study (MOS), used in 2 studies; and 6 instruments that were used in 1 study each. Table 8 describes the basic characteristics of these instruments.
The majority of the identified studies (23/34) examined sensitivity and specificity for major depressive disorder, defined by a variety of criterion standards, many of which are based on DSM-III or DSM-IIIR criteria. Eight studies examined screening accuracy for depression without specifying a particular disorder. One study each specifically examined screening accuracy for minor depression, subsyndromal depressive disorders, "depression NOS," or a "significantly depressed state." Three studies could not be characterized. Some studies used more than 1 disease definition.
Older Studies
Mulrow et al 104 systematically reviewed studies of the performance of screening tests for depression conducted between 1966 and February 1994. They identified 15 published and 4 unpublished articles that met their inclusion criteria, which required that the outcome status of at least 50% of the subjects be verified by an acceptable criterion standard examination. Eleven of these articles met our inclusion criteria as well and appear in Evidence Table 1.39,40,42,43,47,60,63,69,73,97,102
To summarize performance, Mulrow et al 104 calculated the average sensitivity and specificity for the included articles (based on the usual cut-points for each instrument) and constructed a summary receiver operating characteristics (ROC) curve. The overall sensitivity was 84% (95% confidence interval [CI], 79% to 89%), and overall specificity was 72% (95% CI, 67% to 77%). These values translate to a positive likelihood ratio (LR) of about 3 and a negative LR of 0.2. Results did not differ substantially based on the degree of verification bias. The included instruments were easy to administer and complete, and they had been written at either easy (third to fifth grade) or average (sixth to ninth grade) literacy levels.
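The translation from sensitivity and specificity to likelihood ratios can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name is ours, not part of the Mulrow et al analysis, and it simply applies the standard definitions (positive LR = sensitivity / (1 - specificity); negative LR = (1 - sensitivity) / specificity) to the pooled point estimates quoted above.

```python
from typing import Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (positive LR, negative LR) for a dichotomized screening test.

    Both inputs are proportions between 0 and 1 (exclusive), not percentages.
    """
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Pooled point estimates reported above: sensitivity 84%, specificity 72%.
lr_pos, lr_neg = likelihood_ratios(0.84, 0.72)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
```

With these inputs the positive LR works out to about 3 and the negative LR to about 0.2, in line with the figures quoted in the text. The calculation uses point estimates only and ignores the confidence intervals.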
General Primary Care Populations
We identified 23 newer articles that Mulrow et al had not included. Six of the 23 newer studies were conducted in primary care settings in nonelderly or mixed populations.36,51,61,62,65,71 Klinkman et al 51 found that the CES-D had a sensitivity of 81% and specificity of 72% for scores above 15, compared with a gold standard diagnosis based on a Structured Clinical (Diagnostic) Interview (SCID) for DSM-IIIR or -IV. Parkerson and Broadhead 61 found a similar level of performance for the Duke AD screener: 81% sensitivity and 64% specificity for scores greater than 30. Salokangas et al 62 found that The Depression Scale (DEPS) performed reasonably well (sensitivity, 74%; specificity, 85% for scores greater than 8). Bashir et al 36 tested the GHQ in a random sample of British general practice attenders and found a sensitivity of 76% and a specificity of 74%. Steer et al 65 reported that the BDI performed extremely well (sensitivity, 97%; specificity, 99%) against a less rigorous criterion standard, the mood module of the Primary Care Evaluation of Mental Disorders (PRIME-MD).
The study by Whooley et al 71 deserves special comment. They examined the performance of multiple screening tests, including the CES-D, BDI, and MOS, as well as a new two-item screener that included only questions about depressed mood and anhedonia, in a population of veterans (97% men) from an urgent care setting. The two-item screener (sensitivity, 96%; specificity, 57%; area under the ROC curve, 0.82) performed nearly as well as the CES-D and MOS (area under the ROC curve, 0.89 for each). Shorter versions of the CES-D and BDI also performed well.
Overall, these newer studies had sensitivity and specificity results similar to those found by Mulrow et al. 104 Sensitivity with some of the newer short screeners was slightly improved, with specificity similar to that of older instruments.
Elderly Populations
Twelve newer studies (Evidence Table 1) specifically examined the performance of depression screening instruments in older adults, including 6 using the GDS, 3 using the Self Care-D, and 3 using the CES-D (Table 9). The age limits used to define "elderly" varied; 1 study included adults older than 50 years of age, another enrolled only those older than 75 years, and others fell in between. The settings included community-based recruitment, primary care clinics, geriatric assessment clinics, patients' homes, and a nursing home.
Each of these screening instruments demonstrated relatively good test performance characteristics (Table 9), with sensitivities generally 80% to 95% and specificities of 70% to 85%. Each instrument showed modest variation between studies. In general, confidence intervals were not calculated for the sensitivity and specificity estimates, and few studies calculated area under the ROC curves. Two studies, Gerety et al 45 and Lyness et al, 56 compared the GDS and CES-D instruments; both found that the GDS performed better. In Gerety et al, 45 the area under the ROC curve was 0.91 for the GDS and 0.85 for the CES-D. According to Lyness et al, 56 each instrument had similar performance for major depression, but the GDS performed better for "minor" depression. None of the studies compared the Self Care-D with either the CES-D or the GDS.
Special Populations
We identified 5 studies of depression screening in special populations that met our inclusion criteria (Evidence Table 1). Geisser et al 44 tested the CES-D in a pain clinic. The criterion standard was a clinical interview with a psychologist using DSM-IV criteria. They found a 33% prevalence, a sensitivity of 82%, and a specificity of 73% using a score of 27 or greater to define a positive screen.
Holcomb et al 48 examined the performance of the BDI in an obstetrics and gynecology setting. They used the Diagnostic Interview Schedule (DIS) as a gold standard and found an 11% prevalence of current depression. A BDI score of 16 or greater had 83% sensitivity and 89% specificity for depression.
Irwin et al 50 used the CES-D in a community-based sample of adults with known physical illness. They compared their screening results against the SCID as a criterion standard. Scores of greater than or equal to 4 had 99% sensitivity and 84% specificity for depression.
Leung et al (1998) studied the performance of the Zung SDS in Chinese family practice patients in Taiwan. 53 This team reported that SDS scores of greater than or equal to 55 had 67% sensitivity and 90% specificity for depression when compared against a diagnosis by a physician using DSM-IV criteria.
Lustman et al (1997) examined the BDI in patients with diabetes, using the DIS as a criterion standard. 55 The prevalence for major depression was 37%, and a BDI score greater than or equal to 13 had 85% sensitivity and 88% specificity.
Summary of Screening Accuracy in Adults
Several depression screening instruments appear to detect depression effectively. Recent research has shown that shorter screening tests, including simply asking 2 questions about depressed mood and anhedonia, appear to detect a large majority of depressed patients; in some cases, they perform better than the original instruments from which they had been derived.
In general, sensitivity results were good to excellent and specificity results were moderate to good; with commonly used cut-points, typical values were 80% to 90% for sensitivity and 70% to 85% for specificity. If the prevalence of major depression is estimated to be between 5% and 15% in primary care settings, the positive predictive value (probability of depression after a positive test) would be 25% to 50% (Table 10). Thus, more than half of patients who screen positive will be false positives for major depression. Some of these "false positives" may be patients with minor depression or dysthymia. People with positive screens require further diagnostic questioning before clinicians apply a diagnostic label and suggest a treatment plan.
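The dependence of predictive value on prevalence follows directly from Bayes' theorem. The following is a minimal sketch, not a reproduction of Table 10: the function name is hypothetical, and the sensitivity and specificity of 85% are assumed illustrative values chosen from within the typical ranges given above.

```python
from typing import Tuple


def predictive_values(prevalence: float, sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (positive predictive value, negative predictive value).

    All arguments are proportions between 0 and 1, not percentages.
    """
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    false_neg = prevalence * (1.0 - sensitivity)
    true_neg = (1.0 - prevalence) * specificity
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumed illustrative test characteristics (not taken from any single study).
ASSUMED_SENSITIVITY = 0.85
ASSUMED_SPECIFICITY = 0.85

for prevalence in (0.05, 0.10, 0.15):
    ppv, npv = predictive_values(prevalence, ASSUMED_SENSITIVITY, ASSUMED_SPECIFICITY)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.0%}, NPV {npv:.0%}")
```

Under these assumptions the positive predictive value stays at or below about one half across the 5% to 15% prevalence range, which is why a positive screen calls for confirmatory diagnostic questioning. Lower specificity values within the typical range would give lower predictive values still.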
One problem with depression screening instruments is that continuous data (ie, scores on the instruments) are dichotomized into positive and negative results at an arbitrary cut-off value and then used to calculate sensitivity and specificity (as well as positive and negative likelihood ratios) for that cut-off. With this approach, valuable information is lost because all scores above the threshold are counted equally (similarly, all below the threshold are also treated the same).
Some studies in this report partially overcome this problem by providing information on area under the ROC curve, which quantitates overall performance by producing a score between 0.50 (no information) and 1.0 (perfect information). An even more useful technique is to calculate stratum-specific likelihood ratios (SSLRs) for ranges of scores on an instrument. The SSLR for the result of the screen is multiplied by the pre-test odds to give the post-test odds. Furukawa et al 106 calculated SSLRs for the CES-D using data from Japanese psychiatric hospitals and clinics. Scores of 0 to 29 were associated with an SSLR of 0.35; scores of 30 to 49 were associated with an SSLR of 2.3; and scores over 50 were associated with an SSLR of 11.7 (Table 11).
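The odds calculation described above can be sketched as follows. The SSLR values are those reported for the CES-D by Furukawa et al; the 10% pre-test probability is an assumption chosen for illustration (the SSLRs were derived in psychiatric settings, so applying them to a primary care prevalence is itself an assumption), and the function name is ours.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre_test_probability must lie strictly between 0 and 1")
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Stratum-specific likelihood ratios for CES-D score ranges, as quoted in the text.
CESD_SSLR = {
    "0 to 29": 0.35,
    "30 to 49": 2.3,
    "over 50": 11.7,
}

ASSUMED_PRE_TEST_PROBABILITY = 0.10  # illustrative assumption, not from the source

for score_range, sslr in CESD_SSLR.items():
    probability = post_test_probability(ASSUMED_PRE_TEST_PROBABILITY, sslr)
    print(f"CES-D {score_range}: SSLR {sslr} -> post-test probability {probability:.1%}")
```

Unlike a single cut-point, this approach lets a very high score move the probability of depression much further than a score just above the threshold.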
Another difficulty in measuring the accuracy of screening instruments comes when trying to interpret specificity. Instruments used in some studies to detect major depression may count subjects with subsyndromal depressive illnesses as false positives. A true measure of specificity would count as false positives only those patients who are free from any significant depressive illness but who screened positive, because patients with subsyndromal illnesses may also benefit from treatment or more careful observation. Patients with other important and treatable disorders such as substance abuse, anxiety disorders, complicated grief reactions, or bipolar disorders may also be counted as false positives in some studies, but they might well be identified by the more careful and in-depth assessment that would presumably follow a positive screen. If, however, treatment for depression is initiated on only the basis of screening positive, then patients with other related illnesses may receive suboptimal care.
Using Risk Factors to Identify Patients with Depression
Because the prevalence of depression is only 5% to 10% in primary care settings, some experts have suggested that the presence of known risk factors for depression be used to determine who should or should not be screened -- a strategy of selective screening. Although this strategy is intuitively appealing, most common risk factors for depression perform relatively poorly in discriminating patients who are depressed from those who are not. Conde et al 107 demonstrated that most common risk factors have positive likelihood ratios (LR) between 1 and 2 and negative likelihood ratios between 0.5 and 1, suggesting low predictive ability (Table 12).
Other factors, such as a previous history of depression or concurrent diagnosis of panic disorder or generalized anxiety disorder, have positive LRs greater than 10; their presence warrants further investigation for depression, perhaps including a diagnostic interview. Their absence, however, does not significantly change the likelihood of depression.
Depression screening tools have a positive LR of approximately 3 and a negative LR of 0.2, demonstrating that they perform better than most of the common demographic risk factors. Based on these data, a strategy of selective screening does not appear to be superior to simply performing (or asking the patient to perform) one of the brief screening tools. In patients with previous depression or a current anxiety or panic disorder, directly proceeding to a full diagnostic interview may be warranted instead of initial screening.
Screening Accuracy in Children and Adolescents
The identification of depression in children and adolescents has not been as well studied as in adults. Increasing recognition of the important burden of depressive illness and its sequelae in children and adolescents has led to greater attention to means to identify, prevent, and treat mood disorders in this vulnerable population.
Depressive illnesses may have different clinical characteristics and presentations in children and adolescents than in adults. Child and adolescent psychiatrists have developed several structured diagnostic interviews that have been used to characterize and diagnose depression in youth, but they are too long and complex for routine use by primary care providers. Apart from the DSM, these include versions of the Child Assessment Schedule (CAS), Diagnostic Interview for Children and Adolescents (DICA), Diagnostic Interview Schedule for Children (DISC), and Schedule for Affective Disorders and Schizophrenia for School-age Children (K-SADS). 108 These instruments are often used as criterion standards for the diagnosis of depression.
The use of different criterion standards is critical to the appraisal of screening test performance as these standards have their own limitations with regard to sensitivity and specificity that affect the evaluation of screening tools.
Only a small number of studies have addressed screening test performance in ambulatory, nonpsychiatric pediatric populations that are generalizable to primary care. The screening tools that have been evaluated most commonly are reviewed below and summarized in Table 13.
Beck Depression Inventory (BDI)
Two studies looked at performance of the BDI in outpatient samples referred for psychiatric care;109,110 most subjects were adolescents. Sensitivity was 48%, 86%, and 89% with corresponding specificities of 87%, 82%, and 88%. Positive predictive values were high (63%, 83%, and 93%) because of the high prevalence of depression in these referred patients.
Three studies used the BDI in general school samples of adolescents. The largest study included 1,704 Oregon high school students and used a BDI of ≥11 for females and ≥15 for males to assign a diagnosis of current depression (according to DSM-III criteria). 111 Sensitivity was 84%; specificity, 81%. Positive predictive value was 10% and negative predictive value 99.5%. A small sample of 49 adolescents from a school population was a part of a study using the BDI to identify DSM-III major depression. 112 Using a cut-off of 16, the investigators reported 100% sensitivity and 93% specificity for the BDI. Prevalence of depression was 10% (5/49 adolescents); positive predictive value was 61%. The third study of adolescent students used a BDI of ≥16 to assess lifetime history of DSM-III major depression and dysthymia. 113 For depression, sensitivity and specificity were 77% and 65%, respectively. Prevalence of depression was 4%; positive predictive value was 8%. For dysthymia, sensitivity and specificity were 71% and 64%, respectively. Prevalence of dysthymia was 5%; positive predictive value was 10%.
Finally, the only study conducted in a general primary care setting used a version of the BDI, the Beck Depression Inventory for Primary Care (BDI-PC) to assess major depression during 100 adolescent health maintenance examinations. 114 A BDI-PC cut-off of 4 yielded a sensitivity of 91%, a specificity of 91%, and a positive predictive value of 56% for the population with a high prevalence of 11%.
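As a consistency check, the reported positive predictive value can be recovered from the reported sensitivity (Se), specificity (Sp), and prevalence (p) using Bayes' theorem. The symbols are our notation, not the study's; the worked calculation below uses only the figures quoted above.

```latex
\[
\mathrm{PPV}
  = \frac{\mathrm{Se}\cdot p}{\mathrm{Se}\cdot p + (1-\mathrm{Sp})(1-p)}
  = \frac{0.91 \times 0.11}{0.91 \times 0.11 + 0.09 \times 0.89}
  = \frac{0.1001}{0.1802}
  \approx 0.56
\]
```

The result agrees with the 56% reported for the BDI-PC, and shows that even with sensitivity and specificity both at 91%, nearly half of positive screens at this prevalence are false positives.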
Center for Epidemiological Studies - Depression Scale (CES-D)
The CES-D is a 20-item scale developed for adults. The CES-D in children did not correlate well with the Children's Depression Inventory (CDI) and did not discriminate depressed and nondepressed patients adequately for use in children. 115
Two studies have described CES-D screening accuracy for depression in large school-based samples of adolescents. Roberts et al 111 looked at CES-D scores in the Oregon sample that also used the BDI. Investigators applied a cut-off of 22 for males and 24 for females to identify current depression (DSM-III criteria) in 1,704 adolescents. Sensitivity was 84%; specificity was 75%; and positive predictive value was 8%.
Garrison et al 116 used a subsample of 332 students identified in a larger survey of adolescents in the Southeastern United States. Using various cut-off points, the researchers found that optimal screening characteristics for depression occurred at a cut-off point of 12 for males and 22 for females. For males, sensitivity was 85%, specificity was 49%, and positive predictive value was 13%. For females, sensitivity was 83%, specificity was 77%, and positive predictive value was 25%. Screening performance of the CES-D was also assessed for dysthymia using a cut-off of 16 for males and 20 for females. For males, sensitivity was 75%, specificity was 67%, and positive predictive value was 14%. For females, sensitivity was 100%, specificity was 67%, and positive predictive value was 8%.
The CES-D also has a version for children, the CES-DC. In 1 study of the CES-DC using a cut-off of 15, Fendrich et al 117 found the CES-DC to have a sensitivity of 71% and a specificity of 57%.
Other Screening Instruments
Angold et al 118 tested the Short Mood and Feelings Questionnaire in a mixed sample of 173 children and adolescents referred for psychiatric care. They used the DISC as a criterion standard and found sensitivity of 70% and specificity of 85%.
Several other screening instruments have been used in children and adolescents, but most have not been used to screen a primary care sample of pediatric patients. These other tests include the Children's Depression Inventory (CDI), Child Depression Scale (CDS), Children's Self-report Rating Scale (CSRS), Depression Self-Rating Scale (DSRS), and Reynolds Adolescent Depression Scale (RADS). Studies of these scales have reported validation in psychiatric inpatient and referred samples, and so these instruments may be useful in some settings. However, the studies either do not report data in primary care populations or do not describe test performance results to address use as general screening tools. 119
The Pediatric Symptom Checklist (PSC) and Child Behavior Checklist (CBCL) have been shown to be feasible to implement in primary care practice and have relatively good sensitivity and specificity as general screens of mental health needs. These tests may increase awareness of unrecognized psychosocial problems; however, they do not appear to perform well in identifying specific individual diagnoses such as depression.120,121
Special Populations
Children with comorbid psychopathology or chronic medical illness and other pediatric subpopulations have been reported to have a higher prevalence of depressive disorders than the general population. Special populations may be candidates for targeted screening, but few studies report screening accuracy results. Sensitivity and specificity in psychiatric inpatient or outpatient groups are generally similar to the results presented above, although predictive values will be higher because of the higher prevalence of depression. 122
One study in a population of chronically ill pediatric patients (an important subset of pediatric patients with higher prevalence of depressive disorders) evaluated test performance of the CDI, the PSC, and the CBCL. 123 The authors found a high prevalence of depression and mental disorders and relatively good specificity of the measures at detecting depression, anxiety, or both (78% to 96%). They concluded, however, that low sensitivity of the tests (26% to 55%) limited their clinical usefulness for this patient population. Other, better performing depression scales have not been tested in children with chronic illnesses.
Summary of Screening Accuracy in Pediatric Populations
The existing literature suggests that screening instruments for depression in adolescents that have been tested in community or primary care settings perform reasonably well. They produce sensitivity values ranging from 75% to 100% and specificity values from 70% to 90%, values similar to those found in adults, although there are fewer studies and fewer total subjects. Fewer data are available for children. The prevalence of disease and the positive predictive value in children are quite low, but the values rise in adolescents. As with adults, children and adolescents who screen positive should undergo a more rigorous diagnostic interview before being labeled as depressed.
Key Question 2: Outcomes of Treatment for Depression in Primary Care Settings
Treatment of depression in primary care patients can involve antidepressant medication, psychotherapy, or a combination of the 2. Additionally, educational and quality improvement interventions directed at the patient, clinician, or care system have been applied to improve the effectiveness of treatment for depression. As part of examining whether screening for depression is beneficial in the primary care setting, we sought to determine whether treatment for depressive disorders in primary care patients can improve outcomes, including depressive severity, functional status, and health care utilization. We first address the evidence for treatment of adults, including the elderly and special populations, and then examine the evidence for children and adolescents.
Treatment of Depression in Adults
The Depression Guideline Panel of the Agency for Health Care Policy and Research (AHCPR) systematically reviewed literature published through December 1990 and performed a meta-analysis on 7 of the 24 extant randomized controlled trials (RCTs) conducted with primary care patients, all of which were pharmacologic interventions. 3 Only 1 of the 7 studies involved a selective serotonin reuptake inhibitor (SSRI); the preponderance of medication trials involved tricyclic antidepressants (TCAs) (4 studies) or heterocyclic agents (5 studies). The overall drug efficacy was 57.8%; the placebo response rate (included in 3 of the studies) was 35.6%.
We updated the AHCPR review using 3 more recent systematic reviews and a search of articles published from 1966 through December 1999. We included articles that provided clinical outcome measures and had been performed in a primary care setting. The systematic reviews included a review of treatment in primary care, which examined 28 articles; 124 a review of the treatment of dysthymia with 15 articles; 125 and a review of treating depression in patients with physical illness that identified 18 articles. 126 One study had been included in a review by Mulrow et al 124 and in a separate Cochrane review by Lima and Moncrieff; 125 another study 127 was included in both of the reviews by Mulrow and her colleagues124,128 and the Gill and Hatcher 126 Cochrane review. The first 2 reviews involved antidepressant trials and did not include studies of psychotherapy. Gill and Hatcher assessed trials involving antidepressant drugs, 3 of which included concomitant psychotherapy. Because their analysis was limited to the effects of antidepressants, we report its results only in regard to antidepressant outcomes. In addition to the articles from these reviews, our literature searches identified 19 other trials of treatment for depression. Data from these trials are included in the Evidence Tables in Appendix D.
Across all these sources, we identified a total of 78 studies for review in this SER (the 59 articles from the 3 previous systematic reviews, plus the 19 newly identified articles). Of these 78 studies, 73 directly tested an antidepressant or psychotherapeutic treatment (or both): 60 tested an antidepressant alone, 5 involved both an antidepressant and a psychotherapeutic intervention (3 of which looked at the effects of a combined intervention), and 8 tested a psychotherapy intervention alone.
The remaining 5 studies involved educational or quality improvement interventions. Four involved multidisciplinary collaboration and education directed at the patient, clinician, and system of care.129-132 One assessed the effect of drug counseling and information leaflets for patients on medication adherence and depressive severity. 133
In the following sections, we examine the outcome of various forms of interventions for depression, including antidepressant medications, psychotherapy, and educational or quality improvement interventions.
Pharmacologic Interventions
Details about pharmacological treatment studies that met our inclusion criteria can be found in Evidence Table 2 (Appendix D). The discussion below is presented first for large-scale reviews (for which data have not been provided in Evidence Tables) and then for other studies; for the latter, the 6 entries in the Evidence Table (which cover 8 publications) are presented in reverse chronological order.
Results from Large-Scale Reviews
Mulrow and colleagues 124 completed a systematic review from 1980 through January 1998 that evaluated RCTs involving depressed primary care patients that compared the efficacy of "newer" antidepressants to that of other pharmacologic or psychosocial interventions or to placebo. They identified 28 trials involving 5,940 primary care patients; these covered major depression (14 studies), dysthymia (2 studies), or another form of depressive illness ("depression requiring treatment," "depressive illness," "endogenous depression," or mixed anxiety-depression, 12 studies). Average response rates were 63% for newer agents and 35% for placebo (rate ratio, 1.6; 95% CI, 1.2 to 2.1). This magnitude is equivalent to that noted in the AHCPR Depression Guideline Panel review. The response rate was similar for newer and older agents (rate ratio, 1.0; 95% CI, 0.9 to 1.1). The drop-out rate because of adverse effects was significantly lower for newer agents than for the TCAs (8% vs 13%; absolute risk reduction [ARR], 4%; 95% CI, 0% to 7%) although the overall drop-out rate did not differ.
Although response rates appeared similar across different depressive disorders in the Mulrow et al review, there were too few studies in each group to exclude a modest difference. The most frequent diagnosis of interest, as noted above, was major depression; the remaining other forms of depressive illness may include dysthymia, minor depression, or some additional subthreshold depressive illness. Only 2 trials clearly addressed dysthymia, making conclusions about its pharmacologic treatment in primary care settings less clear.
A recent systematic review by Lima and Moncrieff 125 of all RCTs comparing drugs and placebo for dysthymia from 1966 through January 1997 may provide additional important information for that condition. The review identified 15 studies involving 1,964 patients, with trial duration ranging from 4 to 12 weeks. One study had been conducted in a general practice; 134 the remainder had been performed in a mixture of community, inpatient, and outpatient mental health care settings. The analysis made no distinction among the different settings. Antidepressants were 56% more likely to reduce dysthymic symptoms than placebo (risk ratio [RR], 1.56; 95% CI, 1.43 to 1.67). Treatment response did not differ by class of antidepressant. Patients treated with TCAs were more likely to report adverse events than those on placebo, but they were not significantly more likely to drop out.
Gill and Hatcher 126 recently reviewed all RCTs published through June 1998 that had examined antidepressant interventions in depressed patients who also had a physical illness. Settings were not limited to primary care. The 18 studies in this review involved a total of 838 patients. Study subjects had a wide range of medical illnesses (5 studies examined patients with human immunodeficiency virus [HIV] infection, 3 with stroke, 2 with cancer, 2 with mixed medical diagnoses, and 1 each with diabetes, head injury, heart disease, lung disease, multiple sclerosis, and renal disease). Patients could be diagnosed as depressed by any criterion. Those treated with antidepressants were significantly more likely to improve (52%) than those given placebo (30%) (odds ratio [OR], 0.37; 95% CI, 0.27 to 0.51). Six of the 18 trials involved a diagnosis of major depression by structured clinical interview; for this subgroup, the effect was similar (OR, 0.35; 95% CI, 0.22 to 0.55).
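The relationship between the improvement percentages and the odds ratio reported for the Gill and Hatcher review can be sketched with a short calculation. This is a minimal illustration, assuming the crude percentages (52% and 30%) stand in for the pooled 2x2 data; the review's OR was pooled across trials, so exact agreement is not expected, and the function names are hypothetical.

```python
def odds(proportion):
    """Convert a proportion (0-1) to odds."""
    return proportion / (1.0 - proportion)


def odds_ratio(p_treated, p_control):
    """Odds ratio for an outcome, treated group relative to control."""
    return odds(p_treated) / odds(p_control)


# Crude improvement proportions quoted in the text
improved_antidepressant = 0.52
improved_placebo = 0.30

# OR expressed for improvement, and for failure to improve
or_improved = odds_ratio(improved_antidepressant, improved_placebo)
or_not_improved = odds_ratio(1 - improved_antidepressant, 1 - improved_placebo)

print(f"OR for improvement:        {or_improved:.2f}")
print(f"OR for failure to improve: {or_not_improved:.2f}")
```

The crude OR for failure to improve comes out near 0.40, close to the pooled 0.37. This suggests, although the text does not state it, that the reported OR below 1 may be expressed in terms of not improving rather than improving.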
Results from Additional Trials
We identified 6 additional RCTs for depressive illness in primary care involving the use of antidepressants (Evidence Table 2). Five of these studies reported benefit for antidepressant intervention compared to either placebo74,76,77 or usual care;78,81 1 study compared a cognitive-behavioral therapy (CBT) with antidepressant treatment and with a combination of the 2 interventions. 75 Five studies involved patients with diagnoses of major depression.74,75,77,78,81,74,78,135 Strict intention-to-treat analyses were conducted in 4 of the trials;74,76,78,75 Mynors-Wallis et al 77 and Scott and Freeman 81 analyzed only those subjects who received at least some treatment.
Appleby and colleagues 74 compared fluoxetine (an SSRI) to placebo (both with either 1 or 6 CBT sessions) for women screened originally on obstetrics wards who had postpartum major or minor depression 6 to 8 weeks after delivery. Patients with major depression were in the majority in each group (60.5% for fluoxetine and 56.8% for placebo). No distinction was made in the analysis between those with major and minor depression. Of note is that a substantial proportion of women who fulfilled study criteria did not enter the trials; of 188 with confirmed diagnoses of depression, only 87 agreed to enter the trial. The fluoxetine group averaged a 66.9% decrease in Hamilton Depression Rating Scale (HAM-D) scores at 12 weeks compared to a 54.0% decrease for the placebo group. The statistical significance of this difference was not reported. Among subjects completing treatment (70% of the randomized sample), treatment appeared to lead to significant improvement, with the fluoxetine group having a 78% decrease in HAM-D scores compared to a 61% decrease in the placebo group (P = "significant"). The fluoxetine and CBT treatments did not appear to interact significantly, and no advantage was found for those receiving both interventions.
Schulberg et al 78 compared primary care patients receiving the TCA nortriptyline alone to those receiving only interpersonal psychotherapy (IPT) and to those receiving usual care. All subjects had major depressive disorder that was rigorously diagnosed using a three-stage assessment. Of 7,652 waiting-room patients completing the CES-D screen, 1,492 scored above a cut-off of 22 and were not currently being treated for a mood disorder. These patients were eligible for the next phase, consisting of diagnostic confirmation using the DIS Depression section. 136 Of the 1,059 patients completing this section, 678 (64%) met the criterion for major depression. Of these 678 patients, 403 (59%) completed the third stage, in which a consultation-liaison psychiatrist confirmed the diagnosis of major depression and protocol eligibility. Psychiatrists judged 283 (70%) of those they evaluated as protocol eligible; 276 of these agreed to a randomized treatment assignment.
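The stage-by-stage percentages in the Schulberg et al recruitment process can be reproduced from the reported counts. The sketch below is only an arithmetic illustration; the stage labels and the helper name are hypothetical, and the counts are those quoted in the text.

```python
# Counts reported for the successive recruitment stages
stages = [
    ("completed CES-D screen", 7652),
    ("scored above cut-off, not in treatment", 1492),
    ("completed DIS Depression section", 1059),
    ("met criterion for major depression", 678),
    ("completed psychiatrist assessment", 403),
    ("judged protocol eligible", 283),
    ("agreed to randomization", 276),
]


def stage_retention(stages):
    """Percent of each stage's count relative to the preceding stage."""
    return {
        label: 100.0 * count / previous
        for (_, previous), (label, count) in zip(stages, stages[1:])
    }


retention = stage_retention(stages)
for label, percent in retention.items():
    print(f"{label}: {percent:.0f}%")
```

Laying the counts out this way makes it easy to see how small a fraction of screened patients ultimately entered the trial.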
Patients in the nortriptyline group had weekly or biweekly visits until the acute phase of treatment had ended and monthly visits thereafter. Of those treated with nortriptyline, 48% had recovered at 8 months, as had 18% of those treated with usual care. There was no significant difference in outcome between the medication and the psychotherapy intervention (48% with nortriptyline, 46% with IPT).
Mynors-Wallis et al 77 compared amitriptyline (also a TCA) or psychotherapy (problem-solving therapy) to placebo in patients with major depression. As with the other treatment arms, the amitriptyline group was offered 6 treatment sessions over 3 months, and treatments were usually given at the patient's home or local health center. All 3 groups had 3.5 hours of contact time (about 35 minutes per session). An intention-to-treat analysis was not performed, as outcomes were measured only for those attending 4 or more sessions. Recovery by 12 weeks was seen for 52% of the patients receiving amitriptyline, 60% of those receiving problem-solving therapy, and 27% of those receiving placebo.
Scott and Freeman 81 compared amitriptyline prescribed by a psychiatrist, cognitive therapy provided by a psychologist, or counseling given by a social worker to usual care for patients with major depression. The amitriptyline group averaged approximately 240 minutes (4 hours) of total contact time over the 16-week course of treatment; the usual care group (treated by general practitioners) averaged 50 minutes. An intention-to-treat analysis was not performed; of those randomized to antidepressant treatment, 5 (16%) never began the intervention and were not included in the results. Each of the 4 groups had marked improvement of their symptoms over the four-month study period: 58% of the amitriptyline group had recovered at 16 weeks, compared to 48% of the usual care group.
Malt et al 76 compared sertraline (an SSRI) and mianserin (a newer heterocyclic agent) to placebo for patients with 2 weeks of depressive symptoms that were judged to be "severe enough to require treatment." Patients were seen weekly for the first month and then with a gradually lengthening follow-up interval for a total of 10 visits over a 24-week period. This study employed an effectiveness design that attempted to reproduce more accurately the clinical situation in primary care by not excluding patients with concomitant medical illness and not excluding those experiencing a placebo response. Clinically significant responses occurred in 61% of those receiving sertraline, 54% of the mianserin group, and 47% of the placebo group. The number needed to treat (NNT) for sertraline was 7. Of note, 86% to 89% of all subjects met criteria for a major depressive episode, although only 18% of all subjects were considered profoundly depressed on the Clinical Global Impression (CGI) scale.
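The NNT quoted for sertraline follows from the response rates: it is the reciprocal of the absolute difference in response between the active and placebo arms. A minimal sketch, using the rates reported above; the function name is hypothetical, and the mianserin value is computed here only for illustration, since the text does not report it.

```python
def number_needed_to_treat(response_treated, response_control):
    """Return 1 / absolute difference in response rates (unrounded)."""
    absolute_benefit = response_treated - response_control
    if absolute_benefit <= 0:
        raise ValueError("treatment response must exceed control response")
    return 1.0 / absolute_benefit


# Response rates reported for Malt et al
placebo = 0.47
nnt_sertraline = number_needed_to_treat(0.61, placebo)
nnt_mianserin = number_needed_to_treat(0.54, placebo)  # not reported in the text

print(f"sertraline NNT: {nnt_sertraline:.1f}")
print(f"mianserin NNT:  {nnt_mianserin:.1f}")
```

Rounding the sertraline result to the nearest whole patient gives the NNT of 7 cited above. The calculation says nothing about statistical significance, which depends on the trial's sample size.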
Mynors-Wallis and colleagues 75 compared 6 sessions of medication-only treatment (provided by research general practitioners [GPs], 137 not the patients' usual GPs) to 6 sessions of problem-solving (PS) psychotherapy (by a trained research GP or a trained research nurse) and to a combination of medication and psychotherapy treatment. A usual care or placebo group was not included. GPs referred subjects with a depressive illness requiring treatment; those included had had at least 4 weeks of probable or definite major depression as confirmed by Research Diagnostic Criteria. 138 The number of actual contact hours for the medication-only group was not given. Each of the 4 groups showed substantial improvement. In an intention-to-treat analysis, 67% of the medication-only group had recovered (HAM-D <7) at the end of the 12-week treatment course; 56% remained recovered at the 1-year mark. The medication group did not differ significantly from either of the problem-solving groups or from the combination treatment group. The medication-only and combination treatment groups each lost 17% of their patients to follow-up, compared to 36% of the PS-GP group and 22% of the PS-nurse group, although these differences were not statistically significant.
Of note, all but 1 of the trials included in this review were efficacy trials, conducted under ideal conditions with much closer and more frequent follow-up than is routine in primary care. Such results may not generalize to normal primary care practice. Simon et al 139 initially randomized patients to SSRI (fluoxetine) or tricyclic (desipramine or imipramine) antidepressant treatment and then allowed subsequent antidepressant management to be undertaken by the primary care physician. In this effectiveness trial, the proportion of patients continuing the original medication was significantly higher for the fluoxetine group (80% over the 6-month period) than for either the desipramine group (52% overall) or the imipramine group (57% overall) (P< 0.001 for each comparison at one-month, three-month, and six-month follow-up), although the proportion in each group continuing any antidepressant was similar at each assessment. These findings suggest that patients are more likely to switch treatment from tricyclic agents than from SSRIs.
Psychotherapy Interventions
Evidence Table 3 (Appendix D) presents information on 13 studies of psychotherapy (covering 15 publications); the entries appear in alphabetical order. We present the discussion below in terms of studies on major depression, minor depression, dysthymia, and/or other depressive conditions.
Major Depression
Eleven of the 13 studies of psychotherapy involved patients with major depressive disorders (Evidence Table 3). The 5 studies that also included medication trials are described with respect to medication efficacy in the previous section; the outcomes of psychotherapy are described below. As shown in Table 14, the more effective interventions tended to have a more highly structured intervention than is typically the case; that is, the more effective approaches were well formulated, limited in time, and standardized in application, and they tended to have clearly defined goals and stages. Only 6 studies used intention-to-treat approaches.74,75,78,85,88,89 The studies are reviewed below in the order of decreasing magnitude of effect and decreasing stringency of outcome measures (eg, recovery is more stringent than reduction in depressive severity).
Mynors-Wallis and colleagues 77 compared 6 treatment sessions of well-structured PS therapy, guided by a treatment manual and provided by either an experienced psychiatrist or trained GPs, against placebo. As noted earlier, all groups (including the pharmacologic arm) had 3.5 hours of contact time. No intention-to-treat analysis was done. At 12 weeks, 60% of those in the PS group had recovered compared to 27% of those in the placebo arm.
Holden et al 84 compared counseling by health visitors to usual care in a trial involving women with postpartum major or minor depression. The health visitors had limited training and provided 8 weekly sessions of an unstructured, supportive intervention of at least 30 minutes duration. The therapy was not administered according to any standardized manual or approach. Approximately two-thirds of the patients in each group had major depression at the start of the trial. No intention-to-treat analysis was performed: 55 women were randomized; of these, 50 completed the trial and were included in the results. At 13 weeks, 69% of the health visitor group and 38% of the usual care group had neither major nor minor depression as assessed by Research Diagnostic Criteria. The results did not distinguish between major and minor depression.
Scott et al 89 compared six 30-minute cognitive therapy sessions to usual care. No manual was used, but the treatment was relatively well structured and a random sample of psychotherapy tapes was reviewed to ensure quality. In an intention-to-treat analysis, at 7 weeks 62% of the group randomized to cognitive therapy had recovered, as had 33.3% of those with usual care. Follow-up was also assessed at 58 weeks in a treatment-completer analysis, and the psychotherapy arm had significantly lower depressive severity (HAM-D=6.1) than the usual care arm (HAM-D=10.7). Of note was the large attrition rate at 1-year follow-up (16/28 in the cognitive therapy group, 8/24 in the usual care group).
Katon et al 85 evaluated a brief CBT intervention as part of a multi-faceted primary care intervention for major or minor depression that included on-site education and consultation for physicians about antidepressant and behavioral treatment of depression. Analysis was intention-to-treat. The psychotherapy intervention was geared toward improving medication adherence and consisted of 4 to 6 meetings with a psychologist for a total of 2.5 to 3.5 hours plus 4 telephone contacts. Outcomes for patients involved in this program were compared to outcomes for patients receiving usual care for the same conditions. For major depression, 70.4% of those receiving the multi-faceted intervention involving CBT had a greater than 50% decrease in depressive severity at 4 months compared to 42.3% of those in the usual care group. The effect size was smaller for minor depression (66.7% improved with therapy, 52.8% with usual care) and did not reach statistical significance.
Mynors-Wallis and colleagues 75 compared 6 sessions of well-structured PS therapy by a trained GP to 6 sessions by a trained nurse, to antidepressant medication alone, and to a combination of the medication and PS therapy. Therapy was provided in either the patient's home or the local health center. The first PS session lasted 1 hour; subsequent sessions lasted 30 minutes. Patients receiving PS therapy alone had a mean number of 4.6 treatment sessions (2.8 hours total contact time); those receiving combination treatment had a mean number of 5.2 PS treatment sessions (3.1 hours contact in addition to medication management time).
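The contact-time figures follow from the session lengths: a 1-hour first session plus 30 minutes for each later session. The sketch below is a simple consistency check, assuming every patient attended the first session and that mean session counts can be applied linearly; the constant and function names are hypothetical.

```python
FIRST_SESSION_HOURS = 1.0   # first PS session
LATER_SESSION_HOURS = 0.5   # each subsequent PS session


def contact_hours(mean_sessions):
    """Total PS contact time implied by a mean number of sessions."""
    return FIRST_SESSION_HOURS + (mean_sessions - 1) * LATER_SESSION_HOURS


ps_alone = contact_hours(4.6)      # PS therapy alone
combination = contact_hours(5.2)   # PS therapy within combination treatment

print(f"PS alone:    {ps_alone:.1f} hours")
print(f"combination: {combination:.1f} hours")
```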
After 3 months of treatment, 51% of the PS-GP group and 54% of the PS-nurse group had recovered (HAM-D <7), compared to 67% of the medication alone group and 60% of the combination group. At 1-year follow-up, 62% of the PS-GP group had recovered, as had 56% of the PS-nurse group, 56% of the medication alone group, and 66% of the combination group. As described before, the 4 groups did not differ significantly in terms of rate of recovery, suggesting that combination treatment for routine depressive illness in primary care is no more effective than a single intervention, and that outcomes will not differ between PS therapy delivered by a trained GP and that delivered by a trained nurse. These findings are in contrast to recent research in specialty settings suggesting benefit for combination in certain situations, such as preventing recurrence of depression in a geriatric psychiatry setting 140 and in treating chronic depression in an outpatient psychiatry setting. 141
Schulberg et al 78 compared 16 weeks of IPT delivered by doctoral-level, experienced therapists using a well-structured, standardized protocol to usual care. In an intention-to-treat analysis, 46% of those randomized to the IPT group recovered as did 18% of the usual care group.
Ross and Scott 88 compared individual cognitive therapy (consisting of 12 sessions lasting 45 minutes over 3 months) or group cognitive therapy (12 sessions lasting 90 minutes over 3 months) with usual care. All treatment was delivered by the same experienced social worker; it is unclear if the treatment was structured. All groups appeared to improve. Following the 3-month intervention period, those receiving cognitive therapy appeared to have significantly greater reductions in depressive severity than those receiving usual care (32% reduction on HAM-D vs 17%, P <0.01; intention-to-treat analysis). The individual and group forms of treatment did not differ significantly. For the subset of patients who had been assessed 12 months after completing treatment, benefits appeared to be maintained, although no usual care group was available for comparison.
The Appleby et al 74 study did not distinguish between patients who developed major or minor depression postpartum. Those receiving 6 sessions of minimally structured CBT totaling 3.5 hours by a nonspecialist with minimal training experienced a 64% decrease in the HAM-D score at 12 weeks compared to a 57.7% decrease for those receiving a single, 1-hour CBT session from a nonspecialist. These results were slightly less robust than the pharmacologic intervention. Again, significance was not reported for the intention-to-treat analysis. For the patients completing treatment (30% attrition), HAM-D scores decreased by 76% for those with 6 sessions, a significantly greater decrease than the 66% drop for the 1-session group.
Scott and Freeman 81 compared cognitive therapy delivered by a psychologist or supportive counseling delivered by a social worker to usual care. Neither the cognitive treatment nor the counseling was provided according to a formal manual or otherwise clearly structured. Analysis was not intention to treat. Over the 16-week course, the cognitive intervention averaged nearly 7.75 hours and the social work counseling more than 12 hours, compared to less than 1 hour by the general practitioners. At 16 weeks there was no difference in percentage recovered between the cognitive therapy group and usual care (41% vs 48%), but the social work group (72%) produced substantially higher rates of recovery.
Teasdale et al 90 compared up to 20 one-hour sessions of cognitive therapy (mean 15.2 hours) delivered by doctoral-trained, experienced psychologists to usual care for patients with major depression in a primary care setting. The investigators ensured adequacy of treatment by tape review and did not employ a structured manual. Analysis was not done on the basis of intention to treat. Immediately post-treatment, patients in the therapy group averaged a greater change in depressive severity on the BDI than did the usual care group (22 point decrease vs 11.5 point decrease, P <0.01). This benefit was not apparent at follow up three months after completing treatment. Of note, contact time for the therapy group was substantially greater than for usual care.
Blackburn and colleagues 83 compared the outcomes for patients receiving either a pharmacologic intervention (the TCA amitriptyline) or only cognitive psychotherapy to outcomes for patients receiving combined cognitive psychotherapy and amitriptyline; all patients had a diagnosis of major depression. Psychologists performed 12 to 20 sessions of therapy; no manual was used and the degree to which the treatment was structured is unclear. An intention-to-treat analysis was not done, and allocation to therapists was not randomized. Subjects in all 3 groups showed benefit, but patients in the cognitive therapy group and the combined treatment group tended to have a greater decrease in depressive severity than the amitriptyline-only group. Specifically, using a more than 50% decrease in depressive severity immediately post-treatment as the outcome of interest, 81.8% of the combined group, 72.7% of the cognitive therapy group, and 55% of the medication group achieved that outcome (overall chi-square test was not significant). The cognitive therapy and combined groups appeared to have substantially more visits than the medication-only group. Attrition during the trial was 27%.
Minor Depression
Two studies assessed the benefits of counseling for patients with minor depression. Miranda and Munoz 87 compared a CBT approach consisting of 8 weekly 2-hour sessions by doctoral-level psychologists following a specific protocol (according to a formal manual) to usual care in primary care medical patients. Over the subsequent year, the cognitive therapy group had a greater reduction in depressive severity and missed fewer medical appointments. The sample (n=150) consisted of patients with minor depression (33%, n=49) and patients with other subthreshold depressive symptoms. The attrition rate for the full sample was large; 20% of those randomized attended none of the 8 sessions, and 37% of the sample attended fewer than half of the sessions.
Lynch et al 86 compared telephone counseling (consisting of 6 weekly 20-minute phone sessions of PS therapy conducted by student therapists with minimal experience) to usual care for the treatment of minor depression. The therapy was relatively structured and was based on an existing PS therapy model. The sample size was small (n=29). The telephone counseling group had more drop-outs than usual care (4/15 vs 1/14) and an intention-to-treat analysis was not done. The counseling group had a significant 4.7-point drop (from a baseline of 15.6) in its HAM-D score immediately following the intervention, whereas the usual care group had no significant change in depressive severity (from 12.4 to 13.3).
We found no RCTs of psychotherapy for dysthymia in either primary care or psychiatric settings.
Educational and Quality Improvement Interventions
Five studies examined health care delivery strategies that did not directly involve traditional medication or psychotherapeutic interventions.129-133 Katon and colleagues 129 tested a "Collaborative Care" model that included patient education, on-site consultation for patients, active collaboration with primary care physicians, and increased frequency and intensity of primary care visits. At 4 months, significantly more patients with major depression who received care through the collaborative care model had a greater than 50% decrease in depressive severity than did patients on usual care (74.4% vs 43.8%). The authors reported no significant difference for patients with minor depression (60% vs 67.9%).
Llewellyn-Jones and colleagues 132 tested a "Shared Care" model for "depressed" patients involving caregiver education, health education and promotion for patients, and improved communication between general practitioners and staff at a single elderly residential care facility. Their design examined control and intervention groups in a serial fashion. The intervention was "population-based" in that it was targeted to the entire living facility. Participation was variable: only 62% of either study group had general practitioners who attended the provider education program. The intervention itself was relatively inexpensive. Compared with patients in the control group, patients in the multi-faceted intervention group had a significantly greater reduction in their GDS scores (by 1.87 points) and were more likely to move to a "less depressed" state (45% vs 31%).
Peveler et al 133 tested the benefits of 2 sessions of counseling about antidepressant medication adherence, or the provision of an information leaflet about adherence, versus usual care in a population with "depressive illness." No difference in depressive symptoms as measured by the Hospital Anxiety and Depression Scale 142 was found between treatment groups overall. However, among patients with major depression who received higher doses of medication, those in the counseled group had significantly lower final depression scores than those with usual care (4.0 vs 5.9).
Katzelnick et al 131 evaluated the benefits of a systematic, primary care-based depression treatment program for depressed "high utilizers" not in active treatment. This depression management program (DMP) consisted of patient education materials, physician education programs, telephone-based treatment coordination, and antidepressant medication treatment initiated and managed by the patients' primary care physician. Those receiving the DMP were compared to a usual care arm in an intention-to-treat analysis. The DMP group was significantly more likely to fill 3 or more antidepressant prescriptions in the first 6 months (69.3% vs 18.5%, P< 0.001) and had significantly greater improvement in HAM-D depressive severity scores at 1 year (−9.2 vs −5.6, P< 0.001), with this benefit beginning by 6 weeks into the study. Additionally, at 1 year, intervention patients were more improved on mental health, social functioning, and general health self-report measures (P< 0.05 for each domain). Of note, mean visit counts in the DMP group increased by 1.6 visits, whereas mean visit counts decreased in the usual care group by 2.0 visits (P=0.02).
Simon et al 130 compared a feedback-only program and a feedback plus care management program to usual care in primary care patients with recently diagnosed depressive illness. The feedback-only intervention consisted of feedback and algorithmic recommendations to doctors at 8 and 16 weeks based on data from computerized records of pharmacy and visits. The feedback plus care management intervention additionally provided patients with 2 telephone monitoring contacts (at 8 and 16 weeks), which were followed by more sophisticated feedback to the doctor based on information received during the phone call.
In an intention-to-treat analysis compared to usual care, the care management group had a higher probability of receiving at least moderate doses of antidepressants (OR, 1.99; 95% CI, 1.23 to 3.22). The care management group also had a significantly higher probability of showing a 50% decrease in depression severity (OR, 2.22; 95% CI, 1.31 to 3.75) and a significantly lower probability of persistent major depression (OR, 0.45; 95% CI, 0.24 to 0.86) at 6 months. Meanwhile, relative to usual care, the feedback-only group showed no difference in receiving at least moderate doses of antidepressants (data not provided), the probability of a 50% decrease in severity (OR, 1.12; 95% CI, 0.73 to 1.73), or the probability of major depression at follow-up (OR, 0.89; 95% CI, 0.55 to 1.46).
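Because confidence intervals for odds ratios are usually constructed on the log scale, a reported OR should lie close to the geometric mean of its interval bounds. The sketch below applies that check to the severity and persistence outcomes quoted above. It assumes log-scale symmetric intervals, which the text does not confirm, and the function names and 5% tolerance are hypothetical choices.

```python
import math


def log_midpoint(lower, upper):
    """Geometric mean of CI bounds: the estimate implied by log-scale symmetry."""
    return math.sqrt(lower * upper)


def is_consistent(estimate, lower, upper, tolerance=0.05):
    """True if the estimate lies inside the CI and near its log-scale midpoint."""
    implied = log_midpoint(lower, upper)
    inside = lower <= estimate <= upper
    return inside and abs(estimate - implied) / implied <= tolerance


# (label, OR, lower bound, upper bound) as quoted in the text
reported = [
    ("care management: 50% decrease in severity", 2.22, 1.31, 3.75),
    ("care management: persistent major depression", 0.45, 0.24, 0.86),
    ("feedback only: 50% decrease in severity", 1.12, 0.73, 1.73),
    ("feedback only: major depression at follow-up", 0.89, 0.55, 1.46),
]

checks = {label: is_consistent(est, lo, hi) for label, est, lo, hi in reported}
for label, ok in checks.items():
    print(f"{label}: {'consistent' if ok else 'check source'}")
```

An interval that excludes 1 corresponds to a statistically significant difference at the 5% level, which matches the pattern described: the care management intervals exclude 1, and the feedback-only intervals include it.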
Conclusions about Therapies for Adults
Effective treatments for depressive illness in primary care are available. Antidepressant medications for major depression are clearly effective compared with placebo. Most of these results have come from structured efficacy trials with selected populations, although more recent studies using usual-care comparison groups and real-world settings have produced similar effects.76-78
Antidepressant interventions for dysthymia are probably effective in primary care patients; although only 2 studies have been performed in primary care settings, evidence from multiple sites (inpatient psychiatric hospitals, outpatient psychiatric clinics, primary care practices, and the community) shows a similar magnitude of effect. The evidence regarding the benefit of antidepressant medication for minor depression is limited. The 1 trial addressing this question (the Collaborative Care model, 129 in which improved medication prescription and adherence were part of the intervention) did not find a statistically significant benefit with antidepressants, but it may have been underpowered to detect a modest but clinically important effect (10% to 15%).
Tricyclic agents and newer agents (including SSRIs) have similar efficacy. The newer agents, however, have fewer side effects and are less likely to have side effects that lead to drop-out. Total drop-out rates, however, did not differ. Of note, the 1 effectiveness study (which most closely represented actual practice in primary care by allowing naturalistic follow-up and management by primary care physicians) 139 found that the drop-out rates for the tricyclic-treated patients were much higher than those for the SSRI-treated patients. For patients with major depression, greater side effects lead to significantly higher drop-out rates from treatment, although similar drop-out rates were not noted for patients with dysthymia.
Psychotherapeutic interventions appear as effective as antidepressant interventions for major depression, with a similar magnitude of effect. In general, the more effective psychotherapeutic interventions had greater structure to their treatments. Relative to pharmacologic interventions, psychotherapeutic interventions were clearly more time intensive. Four studies used between 4 and 6 sessions totaling 3 to 4 hours for their interventions.74,85,89,135
Evidence on the effectiveness of psychotherapeutic intervention for patients with minor depression is limited, although the results of 2 studies using well-structured interventions suggest potential benefit.86,87 No evidence exists concerning the use of psychotherapy alone for dysthymia.
Few studies have examined the effect of combining medications and psychotherapy. Two studies involving combined treatments did not find a significant incremental benefit when compared to a single active intervention.74,83 However, 2 recent trials in psychiatry clinic settings suggest that combination therapy may improve long-term outcomes.140,141
Treatment of Depression in Children and Adolescents
Treatment of depression has been less studied in adolescents and children than in adults. Nevertheless, recent trials and systematic reviews have increased the knowledge of the efficacy of different forms of treatment for depression. In this section we review the evidence for treatment of depression in adolescents and children with psychotherapy and pharmacotherapy. We reverse the order of discussion (relative to that for adults) because psychotherapy has been a comparatively more important intervention for children in the past. Before considering treatment options, we discuss options for preventing depression in this age group.
Preventing Depression
One method of reducing the impact of depression is to treat risk factors and symptoms before they lead to a full episode of major depression. Some studies, described below, provide limited evidence on this approach for children and adolescents (eg, intervening with children with subclinical depression or providing assistance with coping skills for children at risk of depression).
Jaycox et al 143 reported reduction in depressive symptoms in the Penn Prevention Program, a prospective cohort study of 142 children ages 10 to 13 years. They used CBT to teach coping strategies to 69 "at-risk" children in a treatment group. At-risk children were selected based on depressive symptoms and reports of parental conflicts. The treatment group was compared to 73 control children who did not receive any intervention. Children were not randomized to intervention. Outcomes were assessed after 6 months using the Children's Depression Inventory (CDI). In the treatment group, the percentage of children who were moderately depressed (CDI >15) decreased significantly from 24% to 15% (P<0.05); in the control group the change in percentage of depressed subjects was not significant (24% to 23%, P=0.36). Based on self-reported depressive symptoms in the 6 months following the intervention, 23% of the children in the treatment group and 44% of the control group reported moderate depressive symptoms (P<0.05).
Clarke et al 144 were able to demonstrate positive results in adolescents with depressive symptoms at risk for developing a DSM-IIIR-defined episode of depression. The intervention consisted of assessment of symptomatology by the CES-D with subsequent K-SADS diagnosis of depression or dysthymia. The investigators randomly assigned 172 adolescents with subclinical depression to a usual-care control group or an after-school cognitive psychotherapy group. Total incidence of major depression or dysthymia during follow up was 18 of 70 children (25.7%) in the control group and 8 of 55 (14.5%) for the intervention group.
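The incidence figures from Clarke et al can be converted into a risk difference and a relative risk directly from the counts. This is an illustrative calculation only: neither derived measure is reported in the text, no confidence interval is computed, and the function name is hypothetical.

```python
def risk(events, total):
    """Proportion of subjects with the event."""
    return events / total


# Counts quoted in the text: cases of major depression or dysthymia at follow-up
control_risk = risk(18, 70)
intervention_risk = risk(8, 55)

risk_difference = control_risk - intervention_risk   # absolute reduction
relative_risk = intervention_risk / control_risk     # intervention vs control

print(f"control risk:      {100 * control_risk:.1f}%")
print(f"intervention risk: {100 * intervention_risk:.1f}%")
print(f"risk difference:   {100 * risk_difference:.1f} percentage points")
print(f"relative risk:     {relative_risk:.2f}")
```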
Lamb et al 145 conducted a school-based program designed to promote coping among rural adolescents with depressive symptoms. The study surveyed 222 students ages 14 to 19 years and identified a subgroup of subjects with moderate to high Reynolds Adolescent Depression Scale (RADS) scores who could be randomly assigned to treatment or control groups. The treatment consisted of 8 weeks of group sessions using coping techniques and role-playing tasks. Four students dropped out of the treatment group; 1 left the control group. The investigators found that 87% of the intervention group and 61% of the control group improved on RADS scores. These results were significant for females (P=0.032) but not for males.
Psychotherapy
Psychotherapy has been the mainstay of treatment for children diagnosed with depression. Various forms of psychotherapy and counseling have been used. CBT is the method that has been studied most rigorously and been shown to be effective.
Reinecke et al 146 recently reviewed evidence on CBT in a systematic review and meta-analysis. The authors identified 6 controlled clinical trials with 14 post-treatment control comparisons and 10 follow-up control comparisons covering 217 subjects.147-151 All studies were conducted in adolescents ages 10 to 19 years. All but 1 study recruited subjects in schools and used group therapy sessions. The interventions lasted 5 to 8 weeks and included 6 to 14 sessions with follow-up periods of 1 to 3 months. Outcomes were based on different depression scales.
The overall pooled effect size (a measure of change in standard deviations) at post-treatment was −1.02 (95% CI, −1.23 to −0.81) and for follow-up data −0.61 (95% CI, −0.88 to −0.35). Negative effect size scores indicated a decrease in combined depression measures and improvement of symptoms in terms of standard deviations. Thus, CBT appears to be effective in reducing depressive symptoms among adolescents. Treatment gains seem to be maintained after completion of therapy. The results of this meta-analysis were consistent with other meta-analyses of psychotherapy for depression in children and adolescents.152,153
Pharmacotherapy
Tricyclic Antidepressants
Two recent systematic reviews have examined the use of TCAs in children and adolescents. Hazell et al 166 published a meta-analysis of 12 RCTs comparing the efficacy of TCAs with placebo in depressed children ages 6 to 18 years.154-166 All studies but 1 suggested greater improvement in the TCA group than in the placebo group, but the difference was statistically significant in only 1 study. Six studies presented results as a change in scales of depressive symptoms using the CDI, Children's Depression Rating Schedule-Revised (CDRS-R), K-SADS, or Depressive Adjective Checklist (DACL). Effect size in the 6 studies ranged from −0.29 to 1.57 with a pooled effect size of 0.35 standard deviations (95% CI, −0.16 to 0.86). The authors concluded that the trend toward improvement in depression on TCAs versus placebos was not statistically significant and likely not clinically significant. They did note the important placebo effect (in some trials more than 50% of subjects improved).
Geller et al 172 conducted a systematic review of TCA use in children and adolescents for various indications including depression.160,162,165,167-172 They reviewed double-blind, placebo-controlled trials and reported no significant improvement with treatment of depression using TCAs compared with placebos in 6 studies. One of the studies in the review produced mixed results based on different outcome rating scales. 168 Another study demonstrated improved outcomes on intravenous clomipramine versus placebo; 169 however, a study focusing on intravenous medication is not applicable to ambulatory care treatment. We found no additional RCTs using TCAs for treatment of depression in children or adolescents in our updated literature search. Thus, it appears that TCAs are ineffective for treating depression in children and adolescents.
In addition to considering efficacy, the important side effects of TCAs, including sudden death and fatal overdose potential, must be considered in any discussion of management of patients in this age group.
Selective Serotonin Reuptake Inhibitors
SSRIs are a relatively new therapy for the treatment of children and adolescents with depression. Favorable anecdotal clinical experiences and open trials have reported improvement in depression for pediatric patients on fluoxetine,173-177 sertraline,178-180 and paroxetine.181-183 Recent clinical trials (discussed below) have added to the evidence.184,185 To date, however, no studies in children or adolescents have been conducted in the primary care setting. Efficacy studies, clinical experience, and case reports suggest that overdose potential and side effects are lower in pediatric subjects than in adults; however, more subtle effects on neurobiology and behavior are unknown at this time.
Simeon et al 184 published a placebo-controlled, double-blind study of fluoxetine. The study included 40 inpatients and outpatients ages 13 to 18 years with unipolar depression defined by HAM-D scores of >20. The intervention consisted of a 1-week placebo period for all subjects followed by 8 weeks of either fluoxetine titrated to 20-60 mg daily dose or placebo. Thirty-two patients were followed for a mean of 24 months with the HAM-D and other behavioral symptom scales and clinical measures. It is not clear if the 8 drop-outs were included in the final results. Results were not reported in sufficient detail to calculate effect size. In general, most adolescents on fluoxetine or on placebo improved. Fluoxetine treatment was superior to placebo in many clinical measures, but the differences were not statistically significant.
Emslie et al 185 conducted the first double-blind, randomized, placebo-controlled clinical trial of fluoxetine in children and adolescents. The study included 96 children ages 7 to 17 years with nonpsychotic major depression diagnosed by DSM-IIIR criteria from a structured clinical interview, depression scales, and consensus team diagnosis. All subjects participated in a 1-week placebo run-in period. Patients were randomized to placebo or 20 mg of fluoxetine every morning for 8 weeks. Thirty-six patients did not complete the full 8-week trial following randomization: 5 because of side effects (4 in the treatment group, 1 in the placebo group); 5 because of protocol violation (3 in the treatment group, 2 in the placebo group), and 26 because of a lack of efficacy (7 in the treatment group, 19 in the placebo group). Of the 60 patients who completed the 8-week trial, 25 of 34 (74%) responded to treatment and 15 of 26 (58%) responded to placebo. Differences in raw scores of the Clinical Global Impressions (CGI) and the CDRS-R were also significant among patients who completed 5 or more weeks of the trial. Although many of the subjects improved, only 31% of the original 48 treatment patients and 23% of the 48 placebo patients had a remission of depression to minimal symptomatology (CDRS-R <28). The NNT based on this result is 13 depressed children treated with fluoxetine to achieve clinical remission in 1 patient. 185
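The NNT quoted above can be reproduced from the remission rates reported for this trial. The following is a minimal sketch of that arithmetic, assuming the conventional definition of NNT as the reciprocal of the absolute difference in event rates, rounded up to a whole patient. The function name is illustrative and does not come from the source.

```python
import math

def number_needed_to_treat(rate_treated, rate_control):
    """Return 1 / (absolute difference in event rates), rounded up to a whole patient.

    Rates are proportions between 0 and 1 for a beneficial event (here, remission).
    """
    difference = rate_treated - rate_control
    if difference <= 0:
        raise ValueError("treated rate must exceed control rate for a finite NNT")
    return math.ceil(1 / difference)

# Remission (CDRS-R <28) as reported in the text: 31% on fluoxetine, 23% on placebo
nnt_remission = number_needed_to_treat(0.31, 0.23)
print(nnt_remission)
```

The absolute difference is 8 percentage points, so 1/0.08 = 12.5, which rounds up to the 13 patients cited in the text.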
Several studies that are under way or planned to evaluate SSRIs in depressed children and adolescents should add to the growing body of evidence on treatment. In addition, the Texas Medication Algorithm Project (TMAP)186,187 and other groups such as the American Academy of Child and Adolescent Psychiatry (AACAP) 188 have proposed treatment guidelines that feature SSRIs as first-line therapy for pediatric patients with depression. At present, most of the recommendations have focused on psychiatric care and do not describe the role of primary care providers in pharmacotherapy.
Combination Therapy
Clinical experience and expert opinion suggest that combination therapy may improve long-term outcomes especially for complex patients with comorbid disorders. No randomized trials in children or adolescents are available to describe the efficacy of combination therapy with multiple medications or pharmacotherapy and psychotherapy versus monotherapy with medication or counseling alone.
Additional Considerations
This review has attempted to describe generally the identification and management of children who present in primary care, but special patient populations should be considered. Gender, age, and ethnicity are important variables in existing studies that may limit generalizability of results. Many of the above studies did not have large numbers of minorities or patients of lower socioeconomic status. Most of the positive studies and results are based on adolescents and older children. Aside from case reports and series, very few data are available about interventions in young children. Individual characteristics should be considered before the results on any larger population are generalized or applied to a specific patient.
Finally, children with poor health and chronic illnesses have been reported to have higher rates of depression and mental health problems. It is very important to consider depression and comorbid effects on chronic medical conditions in terms of adherence to medical treatment, functionality, and outcomes. However, in pediatric patients with chronic illness, screening tools for depression appear to lack sensitivity and predictive value and thus cannot be recommended for routine use. 123 In addition, studies are not yet sufficient to document treatment effectiveness in these patients.
Conclusions for Children and Adolescents
Data on prevention of depressive disorders in school and community settings provide support for intervention on selected youths with depressive symptoms, although no studies have described this type of intervention in primary care settings. The approach most relevant to primary care involves early recognition of depressed patients, proper identification and diagnosis, and facilitation of effective treatment.
Treatment of depression in adolescents with CBT or SSRIs appears to be effective. Whether these results can be generalized to primary care settings or to children is unclear. TCAs are not effective for treatment of depression in children and adolescents. The comparative efficacy of psychotherapy alone, medications alone, or combined treatments in children or adolescents is unknown.
Key Question 3: Screening Outcomes
The effect of giving health care providers the results of a screening test for depression has been compared with usual care in 14 randomized or quasi-experimental trials in primary care settings. Detailed study characteristics and results for these 14 trials can be found in Evidence Table 4 in Appendix D. In this section of the SER, we describe and compare the main findings from these trials and attempt to understand the effect of screening (compared with usual care) on the diagnosis, treatment, and outcomes of depression in primary care settings.
Overview of Screening Outcome Studies
Several different screening instruments have been tested as a means of providing feedback to providers. Four studies used the Zung SDS;95,97,98,103 3 papers from 2 studies used the CES-D;91,92,102 3 studies used the GHQ (which contains items about depression as well as other psychiatric conditions);94,96,99 1 study each used the BDI, 93 the SDDS-PC, 100 the GDS, 82 and a 2-item screener. 101 The results of these studies are summarized in Tables 15a-15d.
Eight papers from 7 studies82,91,92,93,95,97,98,102 examined the effect of feedback of screening results on the rate of diagnosis and recognition of depressive disorders; another group of 8 studies82,91,92,93,95,97,101,102 examined the effect on prescription of treatment for depressed mood. A different set of 9 trials directly examined the effect of screening on patient health outcomes, including changes in depression severity, duration, number of depressive symptoms, or health care utilization.91,92,94,96,100-103 In the next sections, we examine in depth the effect of screening feedback on diagnosis, treatment, and outcomes of depression.
Results of Screening Outcomes Studies
Effect of Screening on Recognition and Diagnosis of Depression
Seven studies examined the effect of screening, compared to usual care, for the diagnosis of depression (Tables 15a and 15b). Moore et al, 98 Linn and Yager, 95 and Magruder-Habib et al 97 all used the Zung SDS screener. Callahan et al91,92 and Williams et al 102 used the CES-D. Dowrick 93 used the BDI. Whooley et al 82 used the GDS. In each study, the detection of depression was assessed by chart audit.
Moore et al 98 screened consecutive patients, 20 to 60 years of age, at a university-based family medicine residency program. All patients were asked to self-administer the Zung SDS. The intervention patients' providers received feedback about SDS results greater than 50; providers of patients who scored below 50 and of all control patients simply received notice that their patients had been screened. No attempt was made to confirm the diagnosis using a criterion standard. Of 212 subjects in the trial, 96 scored above 50 (45%). Recognition of depression, defined by any notation in the chart, was 56% for cases in the intervention group (28/50) and 22% in the control group (10/46). The difference between intervention and control groups was similar for "severe depression," but rates of detection in both groups were higher (73% vs 37%). Effect on treatment rate and outcomes was not described.
Linn and Yager, 95 in testing the self-administered Zung SDS, randomized 150 consecutive new patients from a primary care clinic to feedback or no feedback. They found that patients assigned to the feedback group were more likely to have depression diagnosed (29% vs 8%) than the no-feedback patients, but they did not employ any criterion standard.
Magruder-Habib et al 97 screened 800 Veterans Administration primary care clinic patients for depression. Research assistants administered the Zung SDS and used the DIS to confirm diagnosis with DSM-III criteria. Patients with SDS scores greater than 75 were excluded from randomization. The 100 patients who screened positive and met DSM-III criteria for major depression were then randomized to feedback or usual care. Those patients whose physicians received feedback were 3 times as likely to be accurately identified as depressed at the outset than were those whose clinicians had not received such feedback (25% vs 8%). At 1-year follow-up, 42% of the intervention patients, but only 21% of the controls, had been recognized as depressed.
Callahan et al91,92 conducted an RCT of feedback from screening plus targeted educational information and treatment recommendations for patients over age 60 years in an academic primary care setting that served a low-income population. Potential subjects were screened by research assistants using the CES-D and HAM-D depression scales. Those patients scoring above the threshold for diagnosis were eligible to be randomized. Randomization was by physician, with certain clinic sessions randomly assigned to the intervention and others to control. All physicians received an educational talk at baseline.
Two articles appear to report results from this study. The first article, based on a 175-patient sample, found that patients in the intervention group were more likely than the control group to have a new notation of depression in their charts (32% vs 12%). 91 In the second paper, additional analyses on a larger sample size (n=222) found higher rates for documentation of depression (87% vs 40%). 92
Williams et al 102 used the CES-D or a single question about depressed mood to examine the effect of feedback to providers for adult primary care patients. Most patients were able to complete the single question (90%) and the CES-D (54%) without assistance. The presence or absence of depression was later confirmed using the DIS and DSM-IIIR criteria. 102 Current depression was defined as either meeting the DSM-IIIR criteria for major depression or dysthymia or having minor depression (depressed mood or anhedonia plus 1 to 3 additional DSM-IIIR symptoms). Based on chart reviews, current depression was recognized in 39% of patients whose providers received feedback from screening and in 29% of controls. This difference of 10% in the rate of recognition did not reach statistical significance.
Dowrick 93 randomized 116 patients who were initially rated "not depressed" by their usual general practitioners but had BDI scores greater than 14. Feedback was provided 1 week after the visit in which screening took place and was noted in the chart for subsequent visits. The study was powered to detect a 30% difference in the level of diagnosis after feedback. There was a higher level of depression diagnosis at 1 year in the feedback group (35% vs 21%; OR for detection, 2.10; 95% CI, 0.84 to 5.28), but the difference did not reach statistical significance.
Whooley et al 82 randomized primary care clinics to screening with physician feedback versus no screening or feedback for patients over age 65 years screened with the GDS. No criterion standard was applied. They found no difference in the rate of diagnosis of depression at 2 years.
In conclusion, feedback of screening results to providers increases the recognition of depression, especially major depression, by a factor of 2 to 3 in all cases except for the trial by Whooley et al. 82 The absolute increases in the diagnosis of depression range from 10% to 47%, with larger differences for major depression. Recognition and diagnosis of minor depression, when assessed, were generally low in both intervention and control groups.
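The range of absolute increases cited above can be checked against the recognition rates reported for the individual studies. The sketch below simply tabulates those percentages as they appear in the preceding paragraphs and computes the difference and the crude ratio for each comparison. Whooley et al is omitted because no rates are given in the text, and the variable and function names are illustrative only.

```python
# (comparison, feedback group %, control group %) as reported in the text above
recognition = [
    ("Moore et al", 56, 22),
    ("Linn and Yager", 29, 8),
    ("Magruder-Habib et al, outset", 25, 8),
    ("Magruder-Habib et al, 1 year", 42, 21),
    ("Callahan et al, first report", 32, 12),
    ("Callahan et al, second report", 87, 40),
    ("Williams et al", 39, 29),
    ("Dowrick", 35, 21),
]

def summarize(rows):
    """Return (comparison, absolute difference in points, feedback/control ratio)."""
    return [(name, fb - ctl, round(fb / ctl, 1)) for name, fb, ctl in rows]

summary = summarize(recognition)
for name, diff, ratio in summary:
    print(f"{name:30s} +{diff:2d} points   ratio {ratio}")

abs_diffs = [diff for _, diff, _ in summary]
print("range of absolute increases:", min(abs_diffs), "to", max(abs_diffs))
```

The ratios are simple quotients of the reported percentages and are not adjusted for study design, sample size, or the definition of recognition used in each trial.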
Effect of Screening on Treatment of Depressed Patients
The effect of feedback of screening results on the proportion of depressed patients who receive treatment was examined in 7 studies (Tables 15a and 15c). Treatment generally included prescription of pharmacologic antidepressant therapy or referral to mental health services. Most studies evaluated treatment by chart audit; some used pharmacy databases. Actual patient adherence was not directly measured.
In contrast to recognition and diagnosis, the effect on rates of treatment was mixed. In 3 studies (Linn and Yager; 95 Dowrick; 93 Williams et al 102 ), the documented rates of treatment were nearly equal in the intervention and control groups (Table 15c). Other studies, however, found improvements in the rate of treatment, with increases in the prescription of antidepressant medication more common than changes in mental health referrals. Callahan et al,91,92 using a stepped program of treatment recommendations in addition to the feedback, found a difference of 17% to 18% in the initiation of a treatment plan and an increase of 12% in the rate of antidepressant prescription (P=0.01). Magruder-Habib et al 97 found an initial difference of 24% in the rate of treatment, although at 1 year it declined to a difference of 14% (56% vs 42%). The Williams et al 102 study also did not find an overall difference in treatment.
Wells et al 101 studied the effect of combining screening and a quality improvement program for depression treatment in 46 primary care clinics and measured its impact on treatment and outcomes of depression. Patients were enrolled if they screened positive on a 2-question screener. Patients received the Composite International Diagnostic Interview (CIDI) criterion standard examination, but participation was not based on its results. Randomization was at the level of the practice, and the intervention included feedback on the results of the 2-item screener. Intervention practices also received educational materials and assistance with quality improvement in treatment initiation and maintenance plus access to nurse-led medication follow-up or to cognitive-behavioral therapy. The investigators screened 27,000 patients, identified 3,918 as potentially eligible, and randomized 1,356 patients. Subjects were followed for 12 months. The proportion of patients receiving appropriate treatment was increased in the intervention group at 6 months (50.9% vs 39.7%) and at 12 months (59% vs 50%, P=0.006).
Effect of Screening on Depression Outcomes
The effect of screening and feedback on depression outcomes was measured in 8 studies (Tables 15a and 15d).
Johnstone and Goldberg 94 applied the GHQ to 1,093 primary care patients and identified 119 cases of depression. These 119 subjects were randomly assigned to feedback of the results to the physician or to usual care. The investigators found no difference in mean GHQ scores at 12-month follow-up, but they did see a larger improvement with feedback among the subset of subjects with severe depression. For all patients, the mean duration of the first episode of depression and the total amount of time depressed were decreased by approximately 2 months (P<0.01).
Zung and King 103 screened 499 patients at a single private physician's practice. Of the 60 who screened positive, 49 were confirmed to have major depression using DSM-III criteria and were randomized to feedback and treatment with the benzodiazepine alprazolam (n=23) or to usual care (n=26). Four weeks later, outcome data were available for 20 patients in each group. The feedback and treatment group was more likely than controls to improve by at least 12 points on retesting with the Zung scale (66% vs 35%, P <0.05).
In Callahan et al91,92 no improvements in HAM-D score emerged among those who received feedback of screening results.
Reifler et al 100 used the SDDS-PC, followed by a depression-specific diagnostic module, in 358 primary care patients. The 186 intervention patients had a lower mean number of visits than the 172 controls (3.7 vs 5.3, P=0.06), but other outcomes including SF-36 or SDS scores did not differ.
Lewis et al 96 used the GHQ and a computer-based diagnostic tool (PROQSY) to examine the effect of feedback of positive scores on outcomes in low-income primary care patients in London. Compared with GHQ scores for controls at 6 weeks, GHQ scores were lower for patients whose providers received feedback on the PROQSY results but not for those who received only GHQ results. The differences were attenuated and nonsignificant at 6-month follow-up.
Williams et al 102 found a statistically nonsignificant difference of 9% in the proportion of subjects still depressed at 3 months. The rate of recovery (patients with 1 or no DSM-IIIR criteria), however, was higher in the intervention than control groups (48% and 27%, respectively; P <0.05).
Whooley et al 82 found little difference in the proportion of patients depressed on the GDS after 24 months of follow-up: 42% for intervention patients and 50% for controls (P=0.3).
Wells et al 101 found statistically significant increases in the proportion of intervention patients (intervention practices received feedback of screening results and a quality improvement intervention) who were not depressed at 6 and 12 months and in the rate of job retention. Based on CES-D scores, intervention subjects were less likely to be depressed at 6 months than controls (55% vs 64%, P=0.001) and at 1 year (55% vs 61%, P=0.04). Among patients initially employed, 90% of intervention patients were still working, as compared with 85% of controls. 101
Based on the results of Wells et al, 101 approximately 10 to 12 patients identified as being depressed by screening would need to be treated to produce 1 additional remission. Twenty patients would need to be treated to preserve 1 patient's job. If depression is present in 5% to 10% of primary care patients, 100 to 200 patients would need to be screened to produce 1 additional remission at 6 months.
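These figures can be traced with a short calculation. The sketch below uses the 6-month CES-D and job-retention percentages reported for Wells et al and makes two simplifying assumptions that are not stated in the source: that the number needed to screen is the number needed to treat divided by prevalence, and that screening identifies every depressed patient. Function names are illustrative.

```python
def nnt_from_percent(control_pct, intervention_pct):
    """NNT from rates of an adverse outcome, given in percentage points."""
    reduction = control_pct - intervention_pct
    if reduction <= 0:
        raise ValueError("intervention must lower the adverse outcome rate")
    return 100 / reduction

def number_needed_to_screen(nnt, prevalence_pct):
    """NNS under the assumption that screening finds every depressed patient."""
    return nnt * 100 / prevalence_pct

# Still depressed at 6 months: 64% of controls vs 55% of intervention patients
nnt_remission_6mo = nnt_from_percent(64, 55)

# No longer working: 15% of controls vs 10% of intervention patients (85% vs 90% retained)
nnt_job = nnt_from_percent(15, 10)

# Using an NNT of 10, the lower end of the range given in the text
nns_at_10_pct = number_needed_to_screen(10, 10)
nns_at_5_pct = number_needed_to_screen(10, 5)

print(round(nnt_remission_6mo, 1), nnt_job, nns_at_10_pct, nns_at_5_pct)
```

The 9-point difference at 6 months gives an NNT of about 11, within the 10 to 12 range cited, and the 5-point difference in job retention gives 20. With an NNT of 10, prevalences of 10% and 5% give 100 and 200 patients screened, respectively.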
Conclusions about Screening and Feedback
In summary, multiple studies have examined the effect of providing feedback of depression screening results to providers in primary care. The rate of detection and diagnosis of depression, based mainly on chart reviews or the completion of a study-specific form, increased by 10% to 47% in the 6 studies reporting this outcome. The effect on treatment was more variable. Four of the 8 studies reporting this outcome found small, nonsignificant increases in the proportion of patients treated for depression.93,95,102 Magruder-Habib et al 97 found a much larger increase (24%), and Callahan et al 91 noted increases in antidepressant prescribing but not referral for counseling or psychiatric care. Wells et al 101 also noted a 10% increase in appropriate treatment, which was statistically significant.
The effects of depression screening on clinical outcome of depression were also mixed. Two small, older trials found large improvements in major depression.94,103 Two larger, well-designed trials found moderate improvements (9%) in remission from depression in a population with a mixed set of diagnoses.101,102 Four other studies found small or no improvements in outcomes.82,91,92,100
Thus, although the effect of screening on diagnosis appears robust, improvements in more distal variables such as treatment and outcomes are not as consistent or as large. Translating the increased rates of detection with screening into improved outcomes may require that particular attention be paid to initiation and maintenance of effective therapy, perhaps in the form of a quality improvement effort or other programs systematically designed to provide appropriate care.
Demonstrating improvements in clinical outcomes (as measured by the proportion still depressed, for example) requires large samples. Studies with smaller sample sizes may be unable to demonstrate statistically significant results despite finding clinically significant differences in recovery.
Major depression appears more responsive to intervention with screening and feedback than minor depression, although the Wells et al 101 study suggests that outcomes can be improved for all subjects with sufficient attention to treatment. The appropriate outcome measure for minor depression differs from major depression, so failure to demonstrate changes in the proportion of patients depressed may not be a fair test for patients with subsyndromal illnesses.
Screening Outcomes for Children and Adolescents
No studies have examined the overarching question of treatment outcomes for children or adolescents identified by primary care providers using targeted screening or clinical suspicion. A large part of the literature focuses on development of screening measures and reliability testing; it does not provide information to assess screening accuracy or sample a general ambulatory population that generalizes to primary care settings. No randomized trials in children or adolescents evaluate the effects of screening for depression on outcomes of recognition, diagnosis, or treatment. No studies in pediatric patients have linked an initial screening assessment for depression with subsequent treatment and demonstrated improved patient outcomes as a direct result of screening. Some studies have shown that screening instruments, especially the relatively brief general measures such as the CBCL and PSC, may increase recognition of mental disorders and referrals; however, there is no evidence that these general screens of psychopathology can improve outcome of depressed children or adolescents.
Brief screens for depression, such as versions of the BDI and CES-D, have been used in children and adolescents. However, their predictive value in general populations with relatively low prevalence of depression may limit their effectiveness and usefulness as a screen for all pediatric primary care patients. Targeted screening or use of measurement instruments on patients with suspected psychiatric disorder can improve diagnostic accuracy, but whether selective screening produces improved outcomes compared to usual care remains untested.
In addition to specific measures of depression, 2 general instruments that seek to identify psychosocial issues have been extensively researched and implemented in primary care.