Ervin AM, Boland MV, Myrowitz EH, et al. Screening for Glaucoma: Comparative Effectiveness [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2012 Apr. (Comparative Effectiveness Reviews, No. 59.)

  • This publication is provided for historical reference only and the information may be out of date.

Discussion

The purpose of this Comparative Effectiveness Review was to summarize the evidence linking screening for glaucoma to intermediate and functional health outcomes of treatment. We did not identify evidence to address five of the six key questions of interest as there were no population-based studies that screened and followed treated or untreated asymptomatic persons with disease that also included a suitable comparison group of early glaucoma patients identified via case finding, referral or a different screening-based program (Figure 1).

The investigators of the evidence report Primary Care Screening for Ocular Hypertension and Primary Open-Angle Glaucoma: Evidence Synthesis,100 commissioned by the Agency for Healthcare Research and Quality in 2005, found no evidence assessing screening and subsequent treatment of glaucoma in a population setting. They concluded that, although there was good evidence to suggest that treating early primary open-angle glaucoma is beneficial, the lack of evidence regarding screening means that more research is needed to address whether screening is “effective in improving vision-specific functional outcomes and health-related quality of life.”6 As our updated search of the literature was unable to identify any evidence linking screening to the pre-specified intermediate and functional outcomes, we also conclude that more research is needed to address this question. A randomized controlled trial of glaucoma screening would be the optimal study design, as it would allow investigators to enroll participants with similar risk profiles and minimize the risk of lead time bias. The feasibility of a randomized controlled trial would be contingent, however, on both the identification of sufficiently sensitive and specific tests for screening and diagnosing persons with glaucoma and the establishment of a standard definition for open-angle glaucoma.

A sixth Key Question (KQ3) addressed the accuracy of candidate screening/diagnostic tests for glaucoma. In 2005, the investigators of the Primary Care Screening for Ocular Hypertension and Primary Open-Angle Glaucoma: Evidence Synthesis100 reported the sensitivity and specificity of direct ophthalmoscopy, tonometry, the Henson visual field analyzer, and frequency doubling technology, but concluded that there were no appropriate tests that would support population-based screening to identify asymptomatic persons with early disease.

After completing a systematic review of 40 included studies and 48,000 participants, Burr (2007) concluded that optic disc photography, HRT II, FDT, SAP and Goldmann applanation tonometry were potential candidates for a screening-based program, but acknowledged that given the “imprecision in estimates from the pooled meta-analysis models for the diagnostic performance of each test it was not possible to identify a single test (or even a group of tests) as the most accurate.”7 With respect to the limitations of Burr 2007, the authors note that only a small number of studies were identified for the candidate tests included in the review, which limited their ability to conduct sensitivity analyses of the effect of pooling estimates from population-based studies with estimates from studies of persons suspected of having glaucoma at the time of screening. The lack of an agreed-upon reference standard for the diagnosis of glaucoma and the limited number of studies addressing test performance among those at high risk for glaucoma were additional limitations of that review.7

Building on the comprehensive evaluation by Burr (2007),7 we identified 83 additional studies evaluating the diagnostic accuracy of candidate tests published as of 6 October 2011. While there is now more evidence regarding Optical Coherence Tomography (OCT), the Heidelberg retina tomograph III (HRT III), and the GDx scanning laser polarimeters, the ability of these devices to identify glaucoma in a screening setting is not well understood for the same reasons as noted by Burr 2007: the lack of a single diagnostic standard for glaucoma and the high degree of variability in the design and conduct of largely cross-sectional studies of diagnostic accuracy.

The lack of diagnostic standards continues to complicate all studies of glaucoma, including those of screening. Without standard definitions, studies attempt to address the same questions using different definitions and produce varied estimates of test accuracy that cannot be appropriately compared across studies. The authors of Burr 2007 also noted this as a limitation of their work, and further noted that the optimal reference standard, confirmation of glaucoma at a followup visit, was used by only seven of the 40 included studies. A second reference standard of diagnosis by an ophthalmologist at the time of screening was used more frequently. We adapted the reference standards of Burr 2007 and likewise identified significant variability in the reference standard: some investigators relied on clinical examination, disc photographs, or optic nerve assessments only, while other investigators defined standards that combined clinical examination with both structural and functional measurements. The use of a standard, such as that proposed by Foster (2002)3 in studies of glaucoma screening or of devices potentially used in screening, would help overcome this problem. Foster proposed that glaucoma should be classified by three levels of evidence. A Category 1 diagnosis, which is considered the highest level of evidence, includes both optic disc and visual field defects consistent with glaucoma. A Category 2 diagnosis of optic nerve defects only (defined as a vertical cup to disc ratio above the 99.5th percentile of the healthy population) would be considered when the assessment of the visual field was not possible or not performed satisfactorily. Finally, a Category 3 diagnosis of glaucoma would be defined as an intraocular pressure above the 99.5th percentile of the healthy population with visual acuity less than 20/400 or evidence of prior glaucoma filtering surgery and visual acuity less than 20/400. A Category 3 diagnosis would be deemed sufficient if the optic disc was not visible and thus no visual field assessment was possible.
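
The three levels of evidence described above can be read as a simple decision rule. The following Python sketch is illustrative only: it encodes the summary given in this section, not the full published criteria of Foster (2002), and the class name `Exam`, its field names, and the function `foster_category` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Exam:
    """Hypothetical per-person findings; field names are illustrative only."""
    disc_visible: bool = True             # optic disc could be examined
    field_test_ok: bool = True            # visual field performed satisfactorily
    disc_defect: bool = False             # optic disc damage consistent with glaucoma
    field_defect: bool = False            # visual field defect consistent with glaucoma
    cdr_above_p995: bool = False          # vertical cup-to-disc ratio > 99.5th percentile
    iop_above_p995: bool = False          # intraocular pressure > 99.5th percentile
    acuity_below_20_400: bool = False     # visual acuity less than 20/400
    prior_filtering_surgery: bool = False  # evidence of prior glaucoma filtering surgery


def foster_category(exam: Exam) -> Optional[int]:
    """Return 1, 2, or 3 for the level of evidence, or None if no level is met.

    Assumption: the categories are checked in order of what could be
    examined, following the summary in the text above.
    """
    if not exam.disc_visible:
        # Category 3: disc not visible, so no structural or field evidence.
        pressure_or_surgery = exam.iop_above_p995 or exam.prior_filtering_surgery
        return 3 if (exam.acuity_below_20_400 and pressure_or_surgery) else None
    if not exam.field_test_ok:
        # Category 2: structural evidence only, field test unavailable.
        return 2 if exam.cdr_above_p995 else None
    # Category 1: both structural and functional evidence.
    return 1 if (exam.disc_defect and exam.field_defect) else None
```

For instance, `foster_category(Exam(field_test_ok=False, cdr_above_p995=True))` yields Category 2 under these assumptions. A shared rule of this kind is what would let accuracy estimates from different studies be compared against the same reference standard.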

More uniform reporting of participant characteristics would also enhance diagnostic studies. Since inclusion criteria are highly variable and the important characteristics of the resulting populations are not uniformly described, synthesis across studies is difficult. Better characterization of participants would also help address the question of whom to screen for glaucoma. It is clear that discriminating healthy participants from those with early glaucoma is more difficult than discriminating healthy participants from those with moderate or advanced glaucoma. If participants were described in enough detail to distinguish those with mild, moderate, or severe disease, it would facilitate secondary questions regarding which groups should undergo screening and which stages of disease should be of primary interest. It may be the case, for instance, that identifying people with severe disease is a reasonable goal of a screening program since such people are likely at the highest risk of visual impairment.

The risk of bias of diagnostic study designs is an additional concern. Many of the glaucoma diagnostic studies included in this review are at high risk of spectrum bias because the investigators compared healthy volunteers with people with known glaucoma at the time of screening. Spectrum bias was a concern in 68 percent of the primary studies included in this review. Enrolling participants who are not representative of those one reasonably expects to encounter in a screening setting results in biased and inflated estimates of diagnostic performance and limits the generalizability of findings. Incorporation bias is another concern as the reference standard should not include one or more tests that comprise the candidate tests under investigation. Incorporation bias was encountered in only 2 percent of the primary studies included in this review. But as noted in Burr (2007), incorporation bias is a very complex issue when considering the diagnosis of glaucoma. The tests used to diagnose glaucoma are categorized broadly into tests of optic nerve structure or function, so to lessen the risk of incorporation bias, one would have to employ, for example, a test of structure as the reference standard if the candidate test was one of function, but that assumes that “structural (e.g. optic disc) and functional (e.g. visual field) damage occur simultaneously in glaucoma pathogenesis, whereas there is evidence that disc damage precedes manifest visual field loss.”7 Under these circumstances, avoiding use of the same test in the reference standard would be the best alternative to reduce the risk of incorporation bias.

Masking of investigators from the results of the reference standard when interpreting the candidate test results, and from the results of the candidate tests when interpreting the reference standard, should be incorporated into the design of diagnostic studies and reported consistently. Candidate tests were interpreted without knowledge of the results of the reference test in 28 percent of the included studies, but there was insufficient information to make a judgment for 60 percent of the studies.

The World Glaucoma Association’s (WGA) 2008 consensus statement is consistent with the conclusions of Burr 2007 as well as our review of the literature.101 The panel noted that there was no best single or group of tests that may be used for glaucoma screening. The WGA also noted that “optimal screening test criteria are not yet known” as there is a lack of population-based diagnostic studies.

The American Academy of Ophthalmology’s (AAO) Preferred Practice Pattern (PPP) for primary open-angle glaucoma (October 2008) includes discussion of population screening for glaucoma.102 The PPP states that screening may be valuable for high-risk populations and expanded to the larger population once sufficient tests are identified. The panel further noted that intraocular pressure measurements are not effective for screening, that structural assessments of the optic nerve and retinal nerve fiber layer are not appropriate for screening as they require expertise and have been noted as having low reliability, and that the diagnostic accuracy of visual field assessments, which have been used in population screenings, is largely unknown. Our review of the current evidence base further highlights the significant barriers that remain in identifying and characterizing potential glaucoma screening tests.

The AAO PPP panel highlighted FDT as a potential tool for the identification of moderate glaucomatous defects. We found that a large percentage of these studies were at high risk of spectrum bias and thus may present biased estimates of accuracy. There was appreciable heterogeneity in sensitivity estimates, as the patient populations and the criteria used to define glaucoma varied. Investigators of the FDT C-20 concluded that FDT may not be ideal for identifying patients with early disease,18 while investigators of the FDT N-30 concluded that FDT may perform well for identifying early functional defects in at-risk eyes without structural changes.20 When compared with noncontact tonometry and a questionnaire, FDT was determined to be the best among the candidate single and combination tests in the study, despite fair sensitivity for detecting OAG.91 The LALES study investigators compared FDT C-20, Humphrey Visual Field testing, Goldmann applanation tonometry, central corneal thickness and cup to disc ratio measurements.81 The results of the analyses for overall and high risk subgroups were similar and thus the investigators concluded that high-risk group screening, using LALES criteria, may not improve the estimates of test accuracy over population screening of those older than 40 years of age.

The results of this review should be interpreted in light of potential limitations. We did not include studies that were published in languages other than English, as we were unable to identify appropriate translation services for all non-English abstracts and/or the full text of potentially eligible articles prior to the start of full text screening. This represents a limited number of citations (129 of 3,877; 3%) that were retrieved by the electronic searches at the title and abstract stage. Given that the same indexing criteria were applied to all studies identified by the electronic searches and given that approximately 87 percent of the citations were excluded at the title and abstract stage and 87 percent were excluded at the full text stage, applying these same rates of exclusion, we may have missed a maximum of two potentially eligible foreign language studies. Additionally, as the majority of the studies included participants with known or suspected disease, the evidence is not applicable to routine screening and primary care settings and the estimates of sensitivity and specificity may be overestimates of the true effect.
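
The estimate of at most two missed studies follows from applying the two exclusion rates in sequence to the 129 non-English citations. The short Python sketch below reproduces that arithmetic using only the figures quoted above; the function name `expected_missed` is a hypothetical name introduced for this example, and the calculation assumes that non-English citations would have been excluded at the same rates as all other citations.

```python
def expected_missed(citations: int, excl_abstract: float, excl_full_text: float) -> float:
    """Expected number of citations surviving both screening stages."""
    after_abstract = citations * (1 - excl_abstract)   # survive title/abstract screening
    return after_abstract * (1 - excl_full_text)       # survive full-text screening


NON_ENGLISH = 129      # non-English citations retrieved by the searches
ALL_CITATIONS = 3877   # all citations at the title and abstract stage

share = NON_ENGLISH / ALL_CITATIONS
missed = expected_missed(NON_ENGLISH, excl_abstract=0.87, excl_full_text=0.87)

print(f"Non-English share of citations: {share:.1%}")
print(f"Expected eligible non-English studies: {missed:.2f}")
```

Under these assumptions, 129 × 0.13 × 0.13 is roughly 2.2, which is consistent with the stated maximum of two potentially eligible studies.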

Screening for glaucoma is a difficult problem because the disease is asymptomatic, has low prevalence, is typically only slowly progressive, and has no agreed-upon standard for diagnosis. These issues, while challenging, might be overcome with a combination of creative thinking about the populations amenable to screening and hard work on the necessary studies and diagnostic standards.
