NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Chao YS, Clark M, Carson E, et al. HPV Testing for Primary Cervical Cancer Screening: A Health Technology Assessment [Internet]. Ottawa (ON): Canadian Agency for Drugs and Technologies in Health; 2019 Mar. (CADTH Optimal Use Report, No. 7.1b.)

Clinical Review

Methods

Study Design

To address the clinical research questions, existing relevant and high-quality systematic reviews were integrated into an overarching review and supplemented with subsequently published primary studies. Based on published guidance by the US Agency for Healthcare Research and Quality (AHRQ),28–30 and as documented in an a priori protocol and protocol amendment31 (PROSPERO number CRD42017058463), eligible systematic reviews were identified and integrated through a five-stage process: locating existing systematic reviews, assessing their relevance, assessing their quality, determining the appropriate use and methods to incorporate them, and reporting their methods and results. Using this approach, the following two clinical research questions were addressed:

  1. What is the diagnostic efficacy of primary high-risk HPV testing, with or without cytology triage, compared with primary cytology-based testing for asymptomatic cervical cancer screening?
  2. What are the diagnostic efficacies of primary high-risk HPV testing strategies compared with each other for asymptomatic cervical cancer screening?

Literature Search Strategy

The literature search was performed by an information specialist using a search strategy peer-reviewed according to the PRESS (Peer Review of Electronic Search Strategies) checklist.32 The complete search strategy is presented in Appendix 1.

For the clinical search, published literature was identified by searching the following databases: MEDLINE (1946–) with in-process records and daily updates via Ovid; Embase (1974–) via Ovid; the Cochrane Database of Systematic Reviews via Ovid; the Cochrane Central Register of Controlled Trials via Ovid; the Database of Abstracts of Reviews of Effects (DARE) via Ovid; and PubMed. The search strategy comprised both controlled vocabulary, such as the National Library of Medicine’s MeSH (Medical Subject Headings), and keywords. The main search concepts were HPV testing, cervical cancer, diagnostic test accuracy (DTA), and screening.

No filters were applied to limit retrieval by study type. This search updates a previous literature search initially conducted in 2002 for a CADTH Technology Report entitled Liquid-Based Cytology and Human Papillomavirus Testing in Cervical Cancer Screening.33 Retrieval for the current search was limited to documents published since January 1, 2002, for systematic reviews. For supplemental primary studies, retrieval was limited to the earliest literature search cut-off date for each outcome assessed within the relevant systematic review. The search was also limited to English-language and French-language publications. Conference abstracts were excluded from the search results.

The initial searches were completed by March 2017. Regular alerts were established to update the searches until the final report was published. Regular search updates were performed on databases that do not provide alert services. Studies identified in the alerts and meeting the selection criteria of the review were incorporated into the analysis if identified prior to the completion of the stakeholder feedback period of the final report. Any studies that were identified after the stakeholder feedback period are described in the discussion, with a focus on comparing the results of these new studies with the results of the analysis conducted for this report.

Grey literature (literature that is not commercially published) was identified by searching the CADTH Grey Matters checklist (https://www.cadth.ca/resources/finding-evidence/grey-matters), which includes the websites of HTA agencies, clinical trial registries, clinical guideline repositories, systematic review repositories, patient-related groups, and professional associations. Google and other Internet search engines were used to search for additional Web-based materials, and a Google alert was created for the topic of HPV screening.

Selection Criteria

The selection criteria for clinical research questions 1 and 2 can be found in Table 1.

Table 1. Selection Criteria for Research Questions 1 and 2 — Clinical Review.

Exclusion Criteria

Studies were excluded if they did not meet the selection criteria outlined in Table 1, if they were duplicate publications, or were primary studies published before the literature search cut-off date of a related included systematic review for a particular outcome. Further, if eligible primary studies were identified but had already been included within an included systematic review, those primary studies were considered redundant and examined as a part of the included systematic review.

Studies that selected patient samples for inclusion on the basis of cervical cytology results (e.g., known atypical squamous cells of undetermined significance [ASCUS] or known low-grade squamous intraepithelial lesion [LSIL] cytology results) were excluded. Studies were also excluded if they focused exclusively on HPV types not listed in Table 1 or exclusively evaluated screening interventions with a focus on in situ hybridization, p16 immunostaining, or HPV viral load. Evaluations of earlier versions of commercial tests that have been replaced (e.g., Hybrid Capture [HC] 1) were also excluded. Finally, studies comparing high-risk HPV testing with visual inspection with acetic acid or visual inspection with Lugol’s iodine were excluded, as these screening methods are more common in low-resource settings and are not representative of current cervical cancer screening practices in Canada. Studies that examined HPV testing as part of co-testing with cytology as a primary screening strategy were excluded from the analysis of clinical utility outcomes, as primary co-testing is not expected to be a cost-effective strategy in Canada. For the analysis of DTA outcomes, however, primary co-testing studies were included if results were reported as if HPV testing had been performed alone as a primary test. For research question 2, which compared triage strategies, a study that used co-testing as a comparator triage strategy was eligible for inclusion because this strategy has the potential to be relevant in a Canadian setting.

Screening and Selecting Studies for Inclusion

Systematic Reviews

Given the existence of several published and related systematic reviews, and in an effort to integrate and build on them, systematic reviews were the preferred study design for inclusion. Search results were therefore first screened to identify potentially relevant systematic reviews. Selection criteria are outlined in Table 1.

To identify potentially eligible systematic reviews (SRs) from the broad literature search results, a string of keywords, including (“systematic review” OR “systematic reviews” OR “meta-analysis” OR “meta-analyses” OR “meta analysis” OR “meta analyses” OR “metaanalysis” OR “metaanalyses”), was created and applied in DistillerSR35 (Evidence Partners, Ottawa, Canada) to all citations retrieved through electronic database searches, including monthly literature search update alerts. Two reviewers independently screened the titles and abstracts of the resulting citations in duplicate. The full text of potentially eligible citations was retrieved and then screened in duplicate in accordance with the eligibility criteria in Table 1. Discrepancies between reviewers were resolved through discussion.
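The keyword pre-filter described above can be sketched as follows. This is a minimal illustration, not the DistillerSR configuration used in the review: the keyword list is the one quoted in the text, while case-insensitive matching, the search over title plus abstract, the function name `is_candidate_sr`, and the sample citations are all assumptions made for the example.

```python
import re

# Keyword string quoted in the text. Case-insensitive matching is an assumption.
SR_KEYWORDS = [
    "systematic review", "systematic reviews",
    "meta-analysis", "meta-analyses",
    "meta analysis", "meta analyses",
    "metaanalysis", "metaanalyses",
]
SR_PATTERN = re.compile("|".join(re.escape(k) for k in SR_KEYWORDS), re.IGNORECASE)


def is_candidate_sr(title: str, abstract: str = "") -> bool:
    """Return True if any SR keyword appears in the title or abstract."""
    return SR_PATTERN.search(f"{title} {abstract}") is not None


# Hypothetical citations, for illustration only.
citations = [
    {"title": "HPV testing versus cytology: a systematic review", "abstract": ""},
    {"title": "A randomized trial of HPV self-sampling", "abstract": ""},
    {"title": "Cervical screening accuracy", "abstract": "We performed a meta-analysis of cohorts."},
]
candidates = [c["title"] for c in citations if is_candidate_sr(c["title"], c["abstract"])]
```

A filter of this kind only narrows the pool of citations; the dual independent screening by two reviewers still decides eligibility.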

To inform the inclusion decisions, important systematic review (SR) characteristics (e.g., objectives, PICO criteria, and study design elements [types of studies included, literature search time frames, and quality appraisal tools used]) were extracted from the full text of the publications into standardized tables by one reviewer. A second reviewer verified the extractions. SRs were considered for inclusion if their inclusion criteria exactly matched, were broader than, or were encompassed by the PICO criteria summarized in Table 1. SRs that had a different population, intervention, comparators, outcomes, or country settings were excluded.

Primary Studies

Once relevant SRs had been identified as eligible for inclusion for the various outcomes, and per the AHRQ guidance,30 where an SR's search had last been updated more than one year earlier, citations arising through the full CADTH literature search were screened to identify primary studies published since the earliest literature search cut-off date for each outcome. Two reviewers independently screened the titles and abstracts for primary studies in duplicate. The full text of potentially eligible citations was retrieved and then screened in duplicate in accordance with the eligibility criteria in Table 1. Discrepancies between reviewers were resolved through discussion. As with SRs, DistillerSR35 (Evidence Partners, Ottawa, Canada) was used to manage the screening process and to facilitate screening and selection of primary studies.

Methodological Quality Assessments

Systematic Reviews

A review of the methodologic quality of each potentially eligible SR identified through the screening and selection process was conducted independently by two reviewers using the AMSTAR (A MeaSurement Tool to Assess Systematic Reviews) 2 checklist as a guide.36 AMSTAR 2 is a broad critical appraisal instrument designed primarily to guide appraisals of SRs of studies of health care interventions.36 It is not intended for the assessment of SRs of DTA studies; however, in the absence of a validated appraisal tool for SRs of DTA studies, the criteria in the AMSTAR 2 checklist were used as a guide. While the authors of the AMSTAR 2 checklist have defined seven critical and nine non-critical domains, these classifications were not applied in this review given different implications for SRs of DTA studies.36 Appraisals were conducted independently and in duplicate, and discrepancies were resolved through discussion. Quality scores and overall confidence ratings were not derived. Instead, a summary table outlining the quality assessment of the included SRs was prepared and used to guide decisions about the appropriate use and methods to incorporate existing SRs into this overarching review. In addition, the appraisal results were used to inform subsequent discussion on the possible sources of heterogeneity in SRs.

Primary Studies

Primary studies that investigated the DTA of HPV tests or testing strategies were evaluated using the QUADAS-2 instrument.37 For the other outcomes of interest, including test acceptance and clinical utility, the quality of randomized controlled trials (RCTs) was assessed using the Cochrane risk of bias tool,38 and the quality of observational studies, including cohort and cross-sectional studies, was assessed using the Newcastle-Ottawa Scale.39 All quality appraisals were conducted independently and in duplicate by two reviewers. Disagreements were resolved through discussion. Results of the quality appraisal process were used to inform comparisons between the results of primary studies (i.e., explore any potential discordance in results) and their related SR, in addition to informing interpretation of overall results.

Data Extraction

Relevant data included both descriptive data and results reported in all included studies. Separate standardized forms were used to extract relevant information from SRs and from primary studies. From SRs, descriptive data included information about included primary studies, search strategies, participants, interventions, comparators, and outcome measures used. In addition, information about the conduct and results of risk of bias assessments of the primary studies was extracted. For primary studies, data were extracted on study characteristics, study design, population characteristics, intervention, comparators, outcomes, and conclusions. Two reviewers piloted the extraction forms in duplicate on a number of the included primary studies and SRs. When complete, the reviewers compared the results and repeated the process until their extraction results were consistent with each other. The forms were updated during the pilot phase to reflect additional details reported by the included studies that were relevant to the outcomes of interest. Once consistency was reached, data from each included study were extracted by one reviewer and checked for accuracy by a second reviewer. Disagreements were resolved through discussion until consensus was reached.

Data Analysis Methods

All outcome data from both SRs and primary studies were tabulated, summarized narratively, and presented, by outcome, as they relate to each research question. Results from SRs are presented first, followed by the results for primary studies. For each outcome, a table was prepared to report results and was accompanied by a narrative summary describing results within and across studies. Within the summary, attention was paid to describing the direction and size of observed effects and the consistency of effects across studies. When differences were observed, an attempt was made to explain those differences by study and patient characteristics. For each outcome of interest, narrative synthesis was conducted for the overall study population and for the subgroups of interest, where possible.

Systematic Reviews

For outcomes where meta-analysis results were available, the range of individual study estimates, pooled estimates, and confidence intervals (CIs) were reported. For SR results where meta-analysis was not possible, the range of individual study estimates was reported, if provided.

When more than one SR addressed an outcome of interest, a matrix of primary studies included across multiple SRs was constructed to illustrate any overlap between SRs, both generally and by outcome.
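An overlap matrix of this kind can be sketched in a few lines. The example below is illustrative only and is not the procedure used to build Tables 37 and 38: the function names, the SR labels, and the study identifiers are hypothetical.

```python
from typing import Dict, List, Set


def overlap_matrix(included: Dict[str, Set[str]]) -> Dict[str, Dict[str, bool]]:
    """Map each primary study to a row flagging which SRs include it."""
    studies = sorted(set().union(*included.values()))
    return {s: {sr: s in ids for sr, ids in included.items()} for s in studies}


def shared_studies(included: Dict[str, Set[str]]) -> List[str]:
    """Return the primary studies that every SR includes."""
    return sorted(set.intersection(*included.values()))


# Hypothetical SRs and study identifiers, for illustration only.
included = {
    "SR_A": {"study_1", "study_2", "study_3"},
    "SR_B": {"study_2", "study_3", "study_4"},
}
matrix = overlap_matrix(included)
both = shared_studies(included)
```

Each row of `matrix` corresponds to one primary study and each column to one SR, so studies counted by more than one review are visible at a glance.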

Heterogeneity was explored within and between SRs. Within each SR, where possible, the research team reported and discussed any issues of heterogeneity in the primary studies as reported by the SR authors. Between SRs, if more than one SR was identified for an outcome, the concordance or discordance of SR results likewise would have been examined. If results had been found to be discordant, SR characteristics, for example, eligibility criteria or SR quality, would have been explored in an attempt to explain the discordance.

Primary Studies

For any primary studies included after the search date of any included SR, the individual estimates for each outcome were reported alongside SR results with CIs, where available. All results were summarized narratively, with no attempt to quantitatively synthesize results from included SRs and primary studies, as the goal of including primary studies was to assess concordance or discordance with the results of the SRs.

Once all outcome data were extracted and reported, for each outcome, the results of included primary studies were compared with those of the SRs. Reasons for concordance or discordance of the results between the SRs and the primary studies were assessed based on the clinical and methodological characteristics of the studies, for example, HPV test or testing strategy used, participant characteristics, and study quality.

Results

Quantity of Research Available

Systematic Reviews

A flow diagram illustrating the literature selection process for SRs is provided in Appendix 2.

Of the 7,128 citations identified through the full literature search strategy, 170 were identified as potentially eligible SRs and these citations were combined with three relevant SRs identified from media screening. Altogether, these 173 titles and abstracts were further assessed for relevance to this review. After title and abstract screening, 19 SR publications were ordered for full-text review and subsequently assessed against the PICO criteria.

Fifteen SRs were excluded after full-text review for the reasons listed in Table 40, while four SRs5,20,40,41 met the inclusion criteria and were included. The relevant SRs were produced by Melnikow et al. for the AHRQ;41 the Health Information and Quality Authority (HIQA);5 Koliopoulos et al. for the Cochrane Gynaecological, Neuro-oncology and Orphan Cancer Group;40 and Verdoodt et al.20

Final inclusion decisions at the SR level were made by individual outcome as no single existing SR was able to address all of the outcomes relevant to the research questions. The four included SRs reported outcomes relevant to the diagnostic efficacy of primary HPV testing, with or without cytology triage, compared with primary cytology-based testing for cervical cancer screening of asymptomatic women. The SR produced by HIQA5 reported outcomes relevant to research question two, which addresses the diagnostic efficacies of primary high-risk HPV testing strategies compared with each other for asymptomatic cervical cancer screening. All were included in this review because collectively they assess different aspects of diagnostic efficacy, in line with the outcomes of interest listed in Table 1.

The comparison of characteristics of the relevant SRs is presented in Table 36 and Table 39.

Primary Studies

The authors screened 2,723 citations identified through the literature search for eligible primary studies published after the literature search cut-off date for each included SR (i.e., a different literature search cut-off date was applied for each SR addressing different outcomes of interest). Of these, 2,655 citations were excluded and 68 articles were ordered for full-text review. Forty-eight articles were then excluded and 20 primary studies were included. All included primary studies were used to address outcomes of relevance to research question one. No relevant primary study was identified to address outcomes of relevance to research question two. No studies that met the inclusion criteria were identified after the stakeholder feedback process. The flow diagram is provided in Appendix 3. A list of excluded studies is available in Table 35.
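The screening counts reported in this section can be cross-checked with simple arithmetic. The sketch below is illustrative only: the helper name `remaining` is hypothetical, and the numbers are those quoted in the text for the SR and primary study selection flows.

```python
def remaining(start: int, excluded: int) -> int:
    """Return how many records remain after a screening stage."""
    if excluded > start:
        raise ValueError("cannot exclude more records than were screened")
    return start - excluded


# Systematic reviews: 170 keyword hits plus 3 from media screening,
# 19 assessed in full text, 15 excluded.
sr_screened = 170 + 3
sr_included = remaining(19, 15)

# Primary studies: 2,723 screened, 2,655 excluded at title and abstract,
# 48 excluded at full text.
ps_full_text = remaining(2723, 2655)
ps_included = remaining(ps_full_text, 48)
```

The results agree with the counts stated in the text: 173 SR records screened, four SRs included, 68 primary study articles reviewed in full text, and 20 primary studies included.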

Summary

Four SRs,5,20,40,41 nine RCTs,42–49,50 10 prospective cohort studies,51–60 and one retrospective cohort study19 were included in this review. Twenty-four publications (four SRs, nine RCTs, 10 prospective cohort studies, and one retrospective cohort study) were used to address research question one.5,19,20,40–58,59,60 One SR5 was used to address research question two. No primary studies were eligible to address research question two.

Study Characteristics

General Information About Included Systematic Reviews

The characteristics of the four included SRs are summarized in Table 36. Generally, the aim of the SRs was to assess the use of high-risk HPV testing as part of cervical cancer screening strategies. Each SR approached the topic slightly differently. The Cochrane SR40 and the HIQA SR5 assessed the DTA of HPV tests when used for cervical cancer screening. Melnikow et al. reviewed the benefits and harms of using HPV testing for cervical cancer screening.41 Verdoodt et al.20 aimed to evaluate the impact of different recruitment strategies on adherence to screening.

The authors of three of the included SRs searched for and included RCTs.5,20,41 Given the focus on DTA outcomes, Koliopoulos et al., in their Cochrane SR,40 limited their literature search to cross-sectional and cohort studies and did not include RCTs in their analyses.

Melnikow et al. searched for articles published between 2011 and 2018 in six electronic databases.41 There were eight RCTs, five cohort studies, and one individual-patient-data meta-analysis included in Melnikow et al.41 The HIQA SR included a search of MEDLINE and Embase for articles published between 2015 and April 2016 to supplement their previously published SR.5 In the Cochrane SR40 and Verdoodt et al.,20 two and three databases were searched, respectively, with a cut-off date of 2015.20,40 There were 23, 40, and 16 primary studies included, respectively, by the authors of the HIQA SR, the Cochrane SR, and Verdoodt et al.5,20,40 Meta-analysis was done in these three SRs;5,20,40 however, the authors of the HIQA SR5 chose not to meta-analyze results for their research question comparing various screening strategies with each other and, instead, narratively summarized the results of these studies. Similarly, due to the authors’ concern regarding heterogeneity between the included studies, Melnikow et al. conducted qualitative synthesis only.41 Further, studies that were determined to be of low quality were excluded from analysis.41

There was overlap in the primary studies included in the meta-analyses of Cochrane40 and HIQA.5 Due to the limited number of studies identified in both SRs that compared other HPV tests with cytology, the DTA meta-analyses of HPV tests were limited to the comparison of the HC2 test versus cytology. The two SRs included a combined total of 36 primary studies for this comparison.5,40 The overlap of the primary studies is illustrated in Table 37 and Table 38. Eleven primary studies were included in the meta-analyses of both SRs.5,40 There were 25 studies included in the HIQA analysis5 that were not included in the SR by Cochrane40 and 12 studies included by the Cochrane SR40 that were not included in the HIQA SR.5 The analysis of screening strategies compared with each other was only done in the HIQA SR;5 however, three of the studies included in the Cochrane SR40 were also used in this analysis by HIQA.

General Information About Included Primary Studies

The study characteristics of the included primary studies are summarized in Table 42.

Twenty primary studies were identified for inclusion in this review. Nine RCTs,42–49,50 10 prospective cohort studies,51–60 and one retrospective cohort study19 were identified.

All 20 studies were used to address research question one. No primary studies were identified to address research question two.

Country of Conduct

Systematic Reviews

Based on the location of the corresponding authors, Melnikow et al. were based in the US,41 the HIQA SR was conducted in Ireland,5 the Cochrane SR was done in Greece,61 and the SR by Verdoodt et al. was conducted in Belgium.20

As outlined in Table 1, the inclusion of publications in this review was limited to those that most closely align with the Canadian health care context. These criteria were applied to the primary studies included in the SRs that are included in this review. Melnikow et al. included studies that were published only in countries rated “very high” on the 2014 Human Development Index, as defined by the United Nations Development Program.41 A specific list of those countries was not provided in the publication. The authors of the HIQA SR5 limited inclusion of primary studies to those conducted in industrialized countries including: Canada, the US, the UK, Germany, France, Western and Eastern Europe, Italy, Norway, Switzerland, Taiwan, Chile, Japan, and Russia. Verdoodt et al. did not limit the countries that were considered for their SR; however, their analysis included primary studies conducted in the Netherlands, Sweden, France, the UK, Italy, Argentina, Mexico, and Finland.20 While some of the countries represented in the primary studies included in these SRs do not meet the selection criteria outlined in Table 1, the SRs remained eligible because a minimum of 80% of the participants were from countries that met the pre-specified criteria, in accordance with our a priori defined decision rule.
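The 80% decision rule can be illustrated with a short calculation. This is a sketch under stated assumptions: the function name, the set of eligible countries, and the participant counts are hypothetical, since the eligible countries in Table 1 and the per-study sample sizes are not reproduced here.

```python
from typing import Iterable, Set, Tuple


def meets_participant_threshold(
    studies: Iterable[Tuple[str, int]],
    eligible_countries: Set[str],
    threshold: float = 0.80,
) -> bool:
    """Check whether the share of participants from eligible countries reaches the threshold."""
    studies = list(studies)
    total = sum(n for _, n in studies)
    if total == 0:
        raise ValueError("no participants to evaluate")
    eligible = sum(n for country, n in studies if country in eligible_countries)
    return eligible / total >= threshold


# Hypothetical (country, participants) pairs and eligibility set, for illustration only.
eligible_countries = {"Netherlands", "Sweden"}
mostly_eligible = [("Netherlands", 5000), ("Sweden", 3000), ("Mexico", 1000)]
half_eligible = [("Netherlands", 1000), ("Mexico", 1000)]

keeps_sr = meets_participant_threshold(mostly_eligible, eligible_countries)
drops_sr = meets_participant_threshold(half_eligible, eligible_countries)
```

In the first hypothetical case, 8,000 of 9,000 participants (about 89%) come from eligible countries, so the SR would remain eligible; in the second, the share is 50% and it would not.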

For DTA outcomes specifically, the authors of the Cochrane SR40 did not place any geographical restriction on the studies included. Twenty-one of the 40 included studies were conducted in countries that did not meet the CADTH inclusion criteria: China (7 studies), India (3), Mexico (2), Congo (2), Chile (1), former Soviet Union (1), Latin America (1), Russia (1), Switzerland (1), Vanuatu (1), and Zimbabwe (1).40 However, the authors of the SR conducted a sensitivity analysis that indicated the observed DTA of HPV tests was similar between high-income and middle- and low-income countries.40 These analyses were therefore included in this review.

Primary Studies

The nine included RCTs were conducted in Canada,43,45,48 Australia,46 Italy,50 Norway,44 Sweden,42 the UK,49 and the US.47 Eight of the nine included cohort studies were conducted in Germany,58 Greece,56 Hungary,53 Italy,54,57,59 Spain,52 and the US,55 while one prospective cohort study was conducted in both Germany and Greece.51 Two co-testing studies were conducted in the US.19,62

Patient Population

Systematic Reviews

The CADTH inclusion criteria are outlined in Table 1. For the comparison of primary screening with HC2 versus cytology testing, the authors of the HIQA SR5 aimed to identify studies examining people aged 18 to 70 years participating in a cervical cancer screening program who were not being followed for previous cervical abnormalities. Twenty-one studies included routine screening populations and two studies included populations at potentially higher risk of cervical cancer (those who had a previous abnormal cytology result and those presenting to routine gynecological clinics).5 Sample size ranged from 231 to 25,577. The age of screening participants in the individual studies was not reported.

For the comparison of HPV-based triage strategies versus each other, the authors of the HIQA SR5 aimed to identify studies examining participants of a cervical screening program who had a positive primary HPV screening test result and were going to undergo triage testing. Fifteen primary studies were included. Sample sizes ranged from 364 to 40,901.5 All of the included studies recruited individuals attending routine cervical cancer screening. The median age of patients of one study (Verhoef et al.) was 42 years and was higher than the other included studies. Participants recruited in one study (Wright et al.) were younger than those in the other studies, with a quarter of participants ranging in age from 25 to 29 years.5

The SR by Melnikow et al. included studies involving participants aged 21 years or older who were using HPV testing for cervical cancer screening, with or without cytology triage.41 Where possible, the authors grouped the results into two age groups: younger than 35 years of age and older than 35 years of age.41 This grouping reflects the approved age ranges for HPV testing in the US.41

The Cochrane SR40 included studies where all participants were presenting for routine cervical cancer screening and had received both HPV testing and cervical cytology followed by verification of the disease status with colposcopy.40 Forty primary studies including more than 140,000 participants aged 20 to 70 years were included in the SR.

The SR by Verdoodt et al.20 included irregularly or never-screened participants, or those who did not respond to one or more invitations for conventional screening for cervical cancer. Inclusion in the SR was not limited by age; however, the participants in the included studies ranged in age from 25 to 29 years. The number of participants in the self-sampling arms ranged from 800 to 26,886.20

Primary Studies

All of the included primary studies recruited persons eligible for routine screening programs.19,42–51,52,53–60 Sample sizes ranged from 120 participants47 to 16,320 participants46 in the included RCTs and from 180 participants55 to 99,549 participants21 in the included non-randomized studies. In the included RCTs, the minimum participant age ranged from 21 years47 to 56 years42 and the maximum age ranged from 60 years42 to 70 years.45 In the included non-randomized studies, the minimum participant age ranged from 18 years53 to 30 years51,55,63 and the maximum age ranged from 55 years56 to 65 years.52,53,55 Population characteristics are summarized in Table 2.

Table 2. Sample Sizes and Ages of Participants in Primary Studies.

Interventions and Comparators

Systematic Reviews

The index and comparator tests of the included SRs are summarized in Table 36.

The Cochrane SR40 and one question of the HIQA SR5 originally aimed to include studies assessing any type of HPV test compared with cytology (LBC or conventional). The HIQA SR5 eventually limited their analysis to include only the HC2 test after the authors discovered an insufficient number of studies evaluating the other types of HPV tests. Another question of the HIQA SR included a primary HPV test (HC2, Amplicor, Linear Array, Cobas, or general primer [GP]5+/6+ polymerase chain reaction [PCR]) combined with a reflex test that could be cytology, another HPV test, HPV genotyping, or infection marker testing. The triage strategies were compared with each other. The results for question two were not meta-analyzed.5 The Cochrane SR40 included a variety of HPV tests in their report (HC2, Aptima, Care HPV test, and nucleic acid sequence-based amplification); however, most of their analyses focused on the comparison of HC2 with cytology (LBC or conventional). Melnikow et al. included HPV tests that detect high-risk strains of HPV (HC2 and PCR/GP5+/6+).41 These were compared with cytology.41

The intervention and comparator used in Verdoodt et al.20 were self-collected HPV sampling versus clinician-collected HPV sampling. The aim of the SR was to determine if there was an increase in screening adherence associated with different methods of screening recruitment and self-sampling. There were three self-sampling scenarios identified: mail-to-all, opt-in, and door-to-door.20 If mailed to all, self-samplers were mailed to the participants’ homes.20 The opt-in option waited for participants to request self-samplers after an invitation was sent to their homes.20 The door-to-door approach involved study staff visiting participants at their home addresses.20

The types of HPV strains that could be detected by the HPV tests are compared in Table 3. The HPV types were grouped according to the International Agency for Research on Cancer classification.

Table 3. HPV Types Detected by HPV Tests.

Primary Studies

The index and comparator tests used in the included primary studies are summarized in Table 4. All included primary studies compared one type of self- or clinician-sampled HPV test with clinician-sampled tests (that could be cytology or another HPV test).42–59 Among the nine RCTs, two compared clinician-sampled HPV tests with cytology,42,43 and seven compared self-sampled HPV tests with cytology.44–50 Among the 11 non-randomized studies, six compared clinician-sampled HPV tests with cytology,51,53,54,56–58 one compared self-sampled HPV tests with cytology,55 and two compared HPV and cytology co-testing with cytology.52,59 Kocsis et al. also compared two HPV tests, Confidence versus Cobas.53 Cook et al. reported the predictive values of Aptima and HC2 HPV tests based on a subset of the intervention group in the HPV FOCAL trial.43

Table 4. Index and Comparator Tests in Primary Studies.

Outcomes

For research question one, comparing HPV testing with cytology, there were four main groups of outcomes of interest that could be addressed using the results of the included SRs and primary studies: DTA, referral to colposcopy, acceptance of screening, and clinical utility and harms. For question two, comparing HPV testing strategies with each other, there were three main outcomes: baseline DTA, longitudinal DTA, and referral to colposcopy. Baseline DTA was the accuracy of detecting cervical intraepithelial neoplasia grade 2 or more advanced pathological findings (CIN2+) or grade 3 or more advanced pathological findings (CIN3+) at the time of examination. Longitudinal DTA was the accuracy of predicting CIN2+ or CIN3+ over the longer term. The coverage of these outcomes is summarized in Table 5 and the overlap of studies included in the SRs pertaining to DTA is presented in Table 37 and Table 38.
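For readers less familiar with DTA measures, the sketch below computes the standard accuracy statistics from a two-by-two table of test results against a reference standard (for example, CIN2+ on histology). It uses the textbook definitions; it is not taken from the included SRs, and the class name and all counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of test results cross-classified against the reference standard."""
    tp: int  # test positive, disease present
    fp: int  # test positive, disease absent
    fn: int  # test negative, disease present
    tn: int  # test negative, disease absent

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


# Hypothetical counts for a screening test against a CIN2+ reference standard.
table = TwoByTwo(tp=90, fp=50, fn=10, tn=850)
summary = {
    "sensitivity": table.sensitivity(),
    "specificity": table.specificity(),
    "ppv": table.ppv(),
    "npv": table.npv(),
}
```

With these made-up counts, sensitivity is 0.90 and specificity is about 0.94. Predictive values depend on disease prevalence in the screened population, so they are not directly comparable across studies with different prevalence.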

Table 5. Outcomes Reported by Research Question.

Critical Appraisal

Systematic Reviews

Quality of Systematic Reviews

The four included SRs were of high relevance and were deemed to be of sufficient quality to include in this overarching review. Each included SR transparently provided a rationale for inclusion of specific study designs, conducted comprehensive literature searches, extracted data from the primary studies in duplicate, described the included studies with sufficient detail, appropriately assessed the risk of bias in the primary studies, and reported any potential conflicts of interest.5,20,41,40 For those reviews that included a meta-analysis, appropriate statistical methods were also used. A few limitations, however, were noted within the included reviews.6,22,40 The primary limitation identified in Melnikow et al. was that the authors did not report sources of funding in primary studies.41 This same limitation was also identified in each of the other included SRs. The HIQA SR authors additionally did not report the publication of an a priori protocol, investigate heterogeneity based on the risk of bias in primary studies, or investigate publication bias.5 The authors of the Cochrane SR did not comprehensively account for risk of bias in primary studies when discussing results, nor did they investigate publication bias.40 The Verdoodt et al. study was found to have four primary limitations: the authors did not report publication of an a priori protocol, provide a list of excluded studies with rationale, comprehensively account for risk of bias while discussing the results, or investigate publication bias.20

Quality of Primary Studies Included in Systematic Reviews

The authors of the SRs used a variety of tools to critically appraise the included primary studies. In the HIQA SR,5 the 15 primary studies were appraised with the Quality Assessment of Diagnostic Accuracy Studies (QUADAS)-2 checklist. Three were rated at low risk of bias in four domains.5 The overall quality of the studies was rated fair to good.5 The Cochrane SR40 used QUADAS to assess the risk of bias of the included studies. The authors found that, overall, the quality of the evidence for the sensitivity of the tests was moderate and the quality of the evidence for specificity was high.40

All primary studies in Melnikow et al. were assessed with the U.S. Preventive Services Task Force criteria, while observational studies were also appraised with the Newcastle-Ottawa Scale.41 Two Italian and one Dutch trial, New Technologies for Cervical Cancer Screening (NTCC) phase I and phase II, and POBASCAM, were rated as good quality.41 The Canadian FOCAL trial was also rated as good quality.41 The other trials included in the SR were considered to be fair quality.41 In the SR by Verdoodt et al.,20 the methodological quality of the included studies was assessed as moderate to high according to the criteria in the Cochrane risk of bias tool.20

Primary Studies

Diagnostic Test Accuracy Studies

Seven primary studies19,43,51,53,56,58,60 were assessed for DTA outcomes using the QUADAS-2 checklist,37 and the results of the quality assessment are provided in Table 43. The first item of the QUADAS-2 checklist explores whether the selection of patients could introduce bias into the study.37 Five studies adequately described how patients were selected and were determined to be at low risk of bias.43,51,53,56,58 The risk of selection bias was determined to be unclear for Wright et al.60 and Jin et al.19 because the publications did not provide enough information on patient selection to adequately assess how it might lead to bias.19,60 The second item was whether the conduct or interpretation of the index test introduced bias.37 The results of the index tests and reference standards were available at the same time for Jin et al. and it was unclear whether the reference standards were known to the authors.19 This study was considered at unclear risk of selecting patients based on the outcome.19 The third item was whether the conduct or interpretation of the reference standard introduced bias.37 Cook et al. blinded the results of the index test (Aptima HPV tests) and was considered at low risk.43 The other six studies were at high risk for the lack of blinding.19,51,53,56,58,60 The fourth item was whether the patient flow introduced bias.37 Two studies were considered at high risk for the lack of adjustment for verification bias.19,43 Cook et al. did not investigate the disease status of test-negative patients,43 while the other study did investigate test-negative patients.

Jin et al. did not describe the conduct or the interpretation of the supplemental tests (cytology in this case) and the risk of bias was unclear.19 Other studies described the diagnostic thresholds and were considered at low risk.43,51,53,56,58,60 Wright et al. and Cook et al. had additional index tests, hybrid HPV tests and HC2, respectively, and described the diagnostic thresholds and the methods to determine the DTA.43,60 They were considered at low risk of introducing bias due to the conduct or interpretation of them.43,60

Non-Randomized Studies

Six prospective cohort studies51,52,54,55,57,59 used to address research question one were critically appraised with the Newcastle-Ottawa Scale with results presented in Table 44. All of them had somewhat or truly representative samples, non-exposed cohorts drawn from similar communities, exposure ascertained with secure records, comparable cohorts according to the study design, outcome assessment with record linkage, and adequate cohort follow-up for selected outcomes.51,52,54,55,57,59 Ilangovan et al.55 and Chatzistamatiou et al.51 were rated as high quality with no limitations identified in eight criteria.55 Four studies were also rated as high quality based on one limitation of not demonstrating that outcome of interest was absent at the beginning of study.52,54,57,59

Randomized Controlled Trials

Nine RCTs were assessed with the Cochrane risk of bias tool and are presented in Table 45.42,4350 Risk of bias in sequence generation was unclear in three studies.42,43,47 Risk of bias in allocation concealment or selection bias was high in six studies4244,46,47,49,50 and low in three.45,48,49 Risk of bias in blinding of participants and personnel or performance bias was high in seven studies,42,44,4650 low in one study,45 and unclear in the other study.43 Risk of bias in blinding of outcome assessors or detection bias was high in seven studies42,44,4650 and unclear in two studies.43,45 Risk of bias from missing outcome data or attrition bias was low for all studies.42,4350 Risk of bias from selective outcome reporting or reporting bias was low for all studies.42,4350 Risk of bias from other biases was low in eight studies.42,4350 The risk was unclear in Cook et al. due to insufficient information on the adjustment for verification bias.43

Summary of Results

Research Question 1

What is the diagnostic efficacy of primary high-risk HPV testing, with or without cytology triage, compared with primary cytology-based testing for cervical cancer screening of asymptomatic women?

The outcomes and the relevant SRs are listed in Table 46 to Table 55.

Diagnostic Test Accuracy

Systematic Reviews

The Cochrane SR directly compared three types of HPV tests (HC2, PCR [13 or more virus strains], and Aptima) to cytology (LBC or conventional).40 HC2 was the only HPV test that applied more than one diagnostic threshold. The authors adopted 1 pg/mL and 2 pg/mL or relative light units (RLU) as the thresholds of HPV positivity for HC2 in their direct comparisons. Meta-analysis was done for the 1 pg/mL cut-off value only, as there were not sufficient primary studies to undertake a meta-analysis of HC2 at the threshold of 2 pg/mL or RLU.40 The Cochrane SR distinguished between the two types of cytology tests, conventional and LBC, and two cytology thresholds, ASCUS and LSIL.

The HIQA SR compared HC2 with cytology (LBC or conventional).5 The authors considered 1 pg/mL or RLU as the positivity threshold for HC2.5 The authors analyzed conventional cytology and LBC at the threshold of ASCUS.5

Overall, both SRs found that HC2 at the threshold of 1pg/mL or 1 RLU was more sensitive and less specific than liquid-based or conventional cytology at the threshold of ASCUS for the detection of CIN2+ or CIN3+.5,40 The overall trends in the results are summarized in Table 6 and Table 7.

Table 6. Results of the Diagnostic Test Accuracy Comparison Between HPV Tests and Cytology for the Detection of Cervical Intraepithelial Neoplasia 2+.

Table 7. Results of the Diagnostic Test Accuracy Comparison Between HPV Tests and Cytology for the Detection of Cervical Intraepithelial Neoplasia 3+.

Both SRs found that the pooled values for HC2, at the threshold of 1pg/mL or 1 RLU, were significantly more sensitive and less specific than liquid-based or conventional cytology at the threshold of ASCUS for the detection of CIN2+ or CIN3+ (Table 8 and Table 9).5,40

Table 8. Comparative Sensitivity and Specificity — HPV Tests Versus Cytology for the Detection of Cervical Intraepithelial Neoplasia 2+.

Table 9. Comparative Sensitivity and Specificity — HPV Tests Versus Cytology for the Detection of Cervical Intraepithelial Neoplasia 3+.

Other HPV tests or HC2 thresholds (i.e., 2 pg/mL) were not included in the meta-analysis in the HIQA SR in Table 8 and Table 9.5 There were not sufficient numbers of primary studies in the Cochrane SR for the comparisons between HC2 at the threshold of 2 pg/mL or 2 RLU and cytology for the detection of CIN2+ or CIN3+.40 In the Cochrane review, for HC2 at the threshold of 1 pg/mL or 1 RLU, the sensitivity was significantly higher and the specificity was significantly lower than LBC or conventional cytology at the threshold of LSIL for the detection of CIN2+.40 However, there were no significant differences found for the DTA between HC2 (1 pg/mL or 1 RLU) and LBC (LSIL+) for the detection of CIN3+.40 There were not sufficient data for a meta-analysis of the comparison between HC2 (1 pg/mL or 1 RLU) and conventional cytology (LSIL+) for the detection of CIN3+.40

As described in Table 6 and Table 7, PCR-based HPV tests that could detect more than 12 high-risk HPV strains were significantly less specific than LBC at the threshold of ASCUS for the detection of CIN2+ in the Cochrane SR.40 There were no significant differences in the sensitivities between these two types of tests.40 The other comparisons between PCR-based HPV tests and LBC or conventional cytology at the threshold of ASCUS did not indicate significant differences in DTA.40

The Aptima HPV test was also compared with LBC at the threshold of ASCUS for the detection of CIN3+ and there were no significant differences in DTA identified in the Cochrane SR (Table 9).40

The ranges and the pooled estimates of the DTA reported in the HIQA SR and the Cochrane SR are provided in Table 46 to Table 51. Corrected estimates were also provided to account for a data extraction error in the Cochrane SR.

In the HIQA SR, the positive and negative predictive values for the prediction of CIN2+ and CIN3+ were compared (Table 52).5 The pooled values for the negative predictive values of both HC2 and cytology were greater than 99% (99.91% and 99.57%, respectively), while the positive predictive values were below 20% (11.8% and 19.9%). These values were calculated assuming a prevalence of 1.6% for CIN2+ and 1.0% for CIN3+ for Irish women aged 25 to 60 years.5

The authors of the Cochrane SR40 considered several factors important to the observed variations in DTA across trials in their SR. These factors included the difference in sensitivity and specificity of tests in those aged 30 years and over, verification bias, variation in prevalence in different geographic areas (high- versus low-income countries), and the numbers of high-risk HPV types detected.40

An analysis was conducted that examined the difference in sensitivity and specificity of HC2 (1 pg/mL) at the threshold of CIN2+ when used only for those participants older than 30 years of age. The pooled sensitivity (93.9% [95% CI, 89.3 to 96.6]) and specificity (91.3% [95% CI, 88.9 to 93.2]) among these participants were higher than those observed when analyses included participants of all ages.40 These results were anticipated, as the specificity of HPV tests is expected to increase in older participants being screened because the prevalence of high-grade lesions is higher in the older age group. No data were available for the CIN3+ threshold for those older than 30 years of age.

The DTA values that were adjusted for verification bias (i.e., part or all of the test-negative patients underwent colposcopy to verify outcome status) are presented in Table 49. The studies that did not verify outcome status among individuals that obtained negative results only in multiple primary screening tests were not included in this analysis. The sensitivity for the detection of CIN3+ was higher in the studies at high risk of verification bias than those at low risk,40 indicating that sensitivity estimates as reported may be overestimated.
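To illustrate why incomplete verification can inflate sensitivity, the sketch below assumes that every test-positive participant is verified by colposcopy but only a fraction of test-negative participants is, and that no statistical correction is applied. The function name, the cohort, and the verification fraction are hypothetical and do not come from the Cochrane SR.

```python
def apparent_sensitivity(true_pos, false_neg, fraction_negatives_verified):
    """Sensitivity as observed when only some test-negatives are verified.

    False negatives can be counted only among verified test-negatives,
    so unverified false negatives silently drop out of the denominator.
    """
    observed_false_neg = false_neg * fraction_negatives_verified
    return true_pos / (true_pos + observed_false_neg)


# Hypothetical cohort: 100 diseased participants, 80 of whom test positive.
complete = apparent_sensitivity(80, 20, 1.0)  # all test-negatives verified
partial = apparent_sensitivity(80, 20, 0.1)   # 10% of test-negatives verified
print(f"complete verification: {complete:.1%}")
print(f"partial verification:  {partial:.1%}")
```

Under these assumptions the apparent sensitivity rises as fewer test-negative participants are verified, which is the direction of bias described above.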

The authors of the Cochrane SR compared the accuracy estimates of the tests based on the geographical region where the primary studies were conducted.40 Countries were classified as high-income or middle- or low-income. Though the results and methods of this analysis were not presented in the publication, the authors indicated that they did not identify any significant effects on accuracy measures based on geography.40

There were not sufficient numbers of primary studies to investigate the variation in DTA due to the types of high-risk HPV detected by the tests.40 Further, there was no meta-analysis conducted to investigate the DTA of self-sampling HPV tests compared with cytology.40 In the four primary studies included in the Cochrane SR, the sensitivity of self-collected HPV testing ranged from 41% to 97% and the specificity ranged from 77% to 98%.40 Three of the four studies used HC2, two used Care HPV, and one used both.40 The impact of self-sampling on the DTA of HC2 and Care HPV tests remained to be investigated.

Positive predictive values (PPVs) and negative predictive values were determined by the disease prevalence and DTA.5 The PPVs for the detection of CIN3+ were lower than those for the detection of CIN2+ for HC2 and cytology in Table 52.5 The PPVs of cytology for the detection of CIN2+ or CIN3+ were higher than those of HC2. However, there was no statistical test to determine the significance of the differences. The negative predictive values of cytology and HC2 remained above 99% for the detection of CIN2+ or CIN3+.
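The dependence of predictive values on prevalence can be sketched with Bayes' theorem. In the snippet below, the prevalences of 1.6% and 1.0% are the values assumed in the HIQA SR for CIN2+ and CIN3+; the sensitivity and specificity of 90% are hypothetical round numbers chosen for illustration, not pooled estimates from either SR, and the function name is likewise an assumption.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical accuracy (90% / 90%) at the two prevalences assumed in the HIQA SR.
ppv_cin2, npv_cin2 = predictive_values(0.90, 0.90, 0.016)
ppv_cin3, npv_cin3 = predictive_values(0.90, 0.90, 0.010)
print(f"CIN2+: PPV = {ppv_cin2:.1%}, NPV = {npv_cin2:.2%}")
print(f"CIN3+: PPV = {ppv_cin3:.1%}, NPV = {npv_cin3:.2%}")
```

With a fixed sensitivity and specificity, the lower prevalence yields the lower PPV, while the negative predictive value stays above 99% in both cases. This matches the pattern reported for CIN3+ versus CIN2+.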

Primary Studies

There were seven primary studies that evaluated the sensitivity and specificity of primary HPV tests included in this review (one RCT,43 two co-testing studies,19,60 and four prospective cohort studies51,53,56,58). Six of the primary studies43,51,53,56,58,60 supported the conclusion that HPV tests had higher sensitivity and lower specificity than cytology. The detailed results of these studies are presented in Table 46 to Table 50. The HPV tests evaluated in these studies included:

The study by Jin et al.19 observed a sensitivity of 94.1% (95% CI, 90.3 to 96.5) and a specificity of 98.1% (95% CI, 98.1 to 98.2) for HC2 at a threshold of CIN3+ when used in a co-testing scenario for participants 30 years of age or older. In this study, the authors found that HC2 as the primary HPV test was more sensitive to CIN3+ cases than primary cytology (90.7% [95% CI, 86.4 to 93.8]) and slightly more specific than cytology (97.6% [95% CI, 97.5 to 97.7]).19 The authors acknowledged that the results of their study did not align with other similar studies. They proposed that differences in rates of abnormal cytology could lead to differences in sensitivity and specificity. Sample collection or pathological interpretation may have been factors contributing to these differences but it was not possible to definitively determine this to be the cause. The population included in this study also had lower incidences of LSIL and HSIL as compared with the US national averages. The reason for the superior specificity of HC2 compared with cytology in this study was not clear.19

Screening Participation

Systematic Reviews

Verdoodt et al.20 included 16 studies and evaluated screening participation among those who were considered underscreened — those not participating in regular cervical cancer screening programs — following an invitation for self-collected HPV testing compared with an invitation for clinician-collected HPV or cytology testing for cervical cancer screening. These results are summarized in Table 10 and full detail is available in Table 53.20 Control groups in 14 of the studies involved clinician-collected cytology testing; however, two studies used clinician-collected high-risk HPV testing as the control group. Participation in each study group varied significantly between studies so the authors grouped the analysis according to the invitation scenario.20 There were three self-sampling strategies identified: mail-to-all, opt-in, and door-to-door.20 The mail-to-all approach was to directly send the self-sampling devices to the eligible participants.20 The opt-in approach was to invite the participants and wait for them to opt in to self-sampled tests.20 The door-to-door method was to have staff workers visit eligible participants and deliver self-sampling devices.20 The participation rates were significantly different when comparing mail-to-all self-collected HPV tests and control. Both the per-protocol and intent-to-treat analyses showed that the mail-to-all option was more acceptable and achieved higher participation rates than the control, according to the pooled estimates.20 In both analyses, the acceptance of the opt-in option was not significantly different from that of the control group.20 The door-to-door option was not associated with significantly different participation rates compared with clinician-collected cytology, according to both analyses.20

Table 10. Participation Rates Reported in Verdoodt Et Al.

Primary Studies

Self-Sampling HPV Tests Versus Cytology: Six RCTs44,4650 compared the absolute participation in populations that were considered underscreened when offered either self-collected HPV testing or cytology for cervical cancer screening. Two studies, one RCT45 and one observational study,55 listed the participation rates in different groups and did not test the statistical significance of the differences. These results are summarized in Table 11. With the exception of Zehbe et al., which studied the participation rates in First Nations communities in Ontario,48 the other seven primary studies recruited or invited those who did not attend regular screening programs for at least one year.4447,49,50,55

Table 11. Absolute Participation Rates in Self-Sampling Versus Cytology as Reported in Primary Studies.

Among the six studies that tested the statistical significance in the difference between groups,44,4650 five reported higher participation rates in the self-sampling group.44,46,47,49,50 Zehbe et al. studied the participation rates in First Nations communities and did not find differences.48 Rossi et al. compared four strategies: a self-sampler delivered to home for self-testing, a self-sample kit obtained in a pharmacy, cytology at a clinic, and HPV tests at a clinic. Higher participation rates were reported for self-sampling at home compared with the testing at a clinic.50 They did not find the rates significantly different between those taking self-samplers at a pharmacy and those undergoing the test at a clinic.50

Physician-Collected HPV Tests Versus Cytology: One RCT42 and two observational studies54,59 compared the absolute participation when participants were offered clinician-collected HPV testing or cytology for routine cervical cancer screening.42,54,59 These results are summarized in Table 12 and details are described in Table 53. In two studies,42,54 the absolute participation rates were similar between the groups that were offered clinician-collected HPV testing versus those who were offered routine cytology testing, although statistical significance was not tested. Pasquale et al. found that the relative frequencies of participation rates of physician-collected HPV tests were higher than cytology.59

Table 12. Absolute Participation Rates in Physician-Collected Testing Versus Cytology as Reported in Primary Studies.

Referral to Colposcopy

Systematic Reviews

In Melnikow et al., four primary RCTs (NTCC phase II, HPV FOCAL, Compass, and FINNISH) and one cohort study (Zorzi et al.) examining the differences in colposcopy referral rates between primary HPV testing and cytology were narratively summarized.41 The referral rate was presented as a percentage of the total number of participants who were triaged to colposcopy after their initial screening tests. One round of results was reported for the RCTs.41 The complete results are presented in Table 54.

When comparing all participants included in the RCTs, colposcopy referral was highest for high-risk HPV testing alone (7.9%). This was followed by:

  • high-risk HPV testing with LBC triage (3.8% and 5.7%, Compass and HPV FOCAL trials, respectively)
  • LBC alone (2.7% and 3.1%, Compass and HPV FOCAL trials, respectively)
  • LBC with high-risk HPV testing triage (3.1%, HPV FOCAL trial)
  • conventional cytology alone (1.1% and 2.8%, FINNISH and NTCC phase II trials, respectively)
  • high-risk HPV testing with conventional cytology triage (1.2%, FINNISH).41

In addition, for the second round of screening (occurring approximately four years after the first round) for those who tested negative in the first round of screening, the referral rates were reported in Ogilvie et al. and were reviewed in the SR by Melnikow et al.:

  • HPV test (Aptima or HC2) with LBC triage (4.9%)41
  • LBC at a threshold of ASCUS+ (7.0%).41

When the results were subdivided by the age of the participants, the referral rates of high-risk HPV testing with LBC triage were higher. For participants aged 35 years and older, the results were generally the same, with high-risk HPV testing alone having the highest referral rate (5.8%) and high-risk HPV testing with conventional cytology triage having the lowest referral rate (0.9%).41,66 For participants younger than 35 years of age, high-risk HPV testing with LBC triage had referral rates of 19.9% (25 to 29 years of age, HPV FOCAL trial) and 10.8% (30 to 34 years of age, HPV FOCAL trial). The lowest referral rates for this age group were for high-risk HPV testing with conventional cytology triage (2.3%) and conventional cytology alone (1.9% and 3.6%, FINNISH and NTCC phase II trials, respectively).41

The one-arm observational study by Zorzi et al. examined the effectiveness of only primary HPV tests and reported results from two rounds of screening between 2007 and 2009.41 The colposcopy referral rates were higher at the first round (4.4%) as compared with the second round (2.2%), and the overall combined referral rate was 5.4%.41

The screening intervals of the primary studies included in Melnikow et al. ranged from three to five years.41 The authors indicated that none of these studies were designed or powered to test for differences in colposcopy rates or false-negatives with shorter and longer intervals within a trial.41

Primary Studies

The colposcopy referral rates were examined in two RCTs42,43 and in three prospective cohort studies.51,52,57 Colposcopy referral was reported in two different ways: relative to total participants screened or relative to the number of participants who were triaged or randomized. A full summary of results is presented in Table 54.
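The choice of denominator changes the reported rate even when the number of referrals is identical. The sketch below uses hypothetical counts (not data from any included study) and a hypothetical helper name to show the two ways of expressing the same referrals.

```python
def referral_rate(referred, denominator):
    """Colposcopy referral rate as a percentage of the chosen denominator."""
    return 100.0 * referred / denominator


# Hypothetical trial arm: 10,000 randomized, 4,000 actually screened, 30 referred.
referred, screened, randomized = 30, 4_000, 10_000
rate_per_screened = referral_rate(referred, screened)
rate_per_randomized = referral_rate(referred, randomized)
print(f"per participants screened:   {rate_per_screened:.2f}%")
print(f"per participants randomized: {rate_per_randomized:.2f}%")
```

Because the two denominators differ, rates reported on different bases should not be compared directly.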

The colposcopy referral rates reported as a percentage of the total number of participants triaged were available in one RCT.42 For the Cobas HPV test, the referral rate was 0.3%.42 Screening with LBC at a threshold of ASCUS+ resulted in a referral rate of 0.2%.42

The colposcopy referral rates reported as a percentage of the total number of participants screened were available in two RCTs42,43 and three prospective cohort studies.51,52,57 The referral rates for the RCTs at round one were:

  • HC2 (3.1%)43
  • Cobas HPV test (0.8%)42
  • LBC at a threshold of ASCUS+ (0.7%).42

Although the RCTs recruited individuals eligible for routine screening programs, the populations in the two RCTs were different. The HPV FOCAL trial was conducted in British Columbia, Canada.43 In Lamin et al., Swedish participants aged 56 to 60 years received Cobas HPV tests for cervical cancer screening.42 We suspect the differences in population characteristics might contribute to some of the variations in referral rates.

Referral rates reported in the prospective cohort studies were:

  • HC2 (1.1%)57
  • Aptima (3.5%)52
  • Multiplex Genotyping HPV test (16.3%)51
  • LBC at a threshold of ASCUS+ (2.7% to 6.4%).51,52,57

Because colposcopy referral rates were not usually the primary outcome of interest, the differences between different groups were not tested for statistical significance.

Harms and Clinical Utility Outcomes Other Than Colposcopy

Systematic Reviews

Melnikow et al. addressed harms and clinical utility and qualitatively summarized data.41,65 The clinical utility findings of Melnikow et al. comparing HPV, HPV with conventional cytology triage, or HPV with LBC triage against conventional cytology in first-round screening are summarized in Table 13. There were results available from second-round screening. However, second-round screening involved comparing conventional cytology with conventional cytology or switched the testing method to co-testing rather than primary HPV testing (FOCAL trial) after first-round screening, which was out of scope for the CADTH review. There were also results from co-testing studies and those using primary HPV tests in Melnikow et al.41,65 The co-testing studies did not meet the inclusion criteria of this review and are not described below.

Table 13. Clinical Utility.

Across the four trials with variable protocols and high-risk HPV test types included by Melnikow et al., the evidence was consistent in demonstrating that primary high-risk HPV screening led to a statistically significantly increased detection of CIN 3+ in the initial round of screening.41 The relative risk for CIN 3+ detection between screening groups was similar to the overall findings in both the younger (younger than 35 years) and older (35 years and older) age groups.41

The trials that reported the outcome showed low rates of invasive cervical cancer.65 Overall, primary high-risk HPV testing was associated with higher colposcopy rates.41 In all trials, where the results reported were statistically significant, participants younger than 35 years who were screened using primary high-risk HPV testing had higher referral rates for colposcopy than participants who were screened with cytology.41

The results of the Canadian FOCAL trial were reported as a part of the SR by Melnikow et al.41,65 The results of this study appeared to be comparable with the results of the primary studies that were conducted in other countries.

Though Melnikow et al. aimed to assess the harms and adverse events (AEs) associated with cervical cancer screening, no results were identified among the included primary studies with regard to cervical cancer mortality, rates of cervical cancer treatment, or harms occurring from the screening test, diagnostic testing, or treatments.41,65 No data were provided regarding the impact of HPV testing on the detection of adenocarcinoma. The authors commented that the studies included in their SR were not adequately powered to detect the relatively uncommon AEs that can occur following the biopsy or treatment of cervical lesions.41,65 The authors also attempted to address the differences in adverse effects based on different screening intervals. None of the studies identified were designed to specifically compare these outcomes between screening intervals and, due to heterogeneity between the studies, the authors were not able to determine how the screening interval or screening strategies might have related to the potential harms of overdiagnosis and detection or missed cervical cancers.41,65 No studies reported on the psychological effects of primary HPV testing or addressed quality of life.41,65

Primary Studies

There were no primary studies identified addressing harms or clinical utility outcomes other than referral to colposcopy.

Research Question 2

What are the diagnostic efficacies of primary high-risk HPV testing strategies compared with each other for asymptomatic cervical cancer screening?

Evidence was identified related to DTA and colposcopy referral rates, but not for acceptance of screening, harms, or clinical utility.

Diagnostic Test Accuracy — First Round of Screening

Systematic Reviews

The authors of the HIQA SR5 aimed to compare the DTA of different HPV testing and triage strategies. They included 15 studies of participants in cervical cancer screening programs who had a positive result on their preliminary HPV test and then underwent some form of triage testing before proceeding to sample confirmation with colposcopy.5 Four of the five triage strategies they identified were of relevance to this review. The baseline DTA of the four triage strategies identified in the HIQA SR are listed in Table 14. These strategies included:

  1. primary HPV testing with cytology triage
  2. primary HPV testing followed by triage with partial genotyping for HPV 16/18
  3. primary HPV testing followed by triage with sequential partial genotyping for HPV 16/18 followed by cytology to further triage those positive for HPV 16/18
  4. primary HPV testing followed by co-testing triage (partial genotyping for HPV 16/18 and cytology triage).

Table 14. Baseline Sensitivity and Specificity of Triage Strategies.

These strategies follow the pathway outlined in Figure 6. No one study compared all of the triage strategies with each other. The baseline DTA results of the four triage strategies were discussed separately in the HIQA SR.5

For the first triage strategy, primary HPV testing with cytology triage, the authors of the HIQA SR included six RCTs. Two of the RCTs used colposcopy confirmation only for HPV-positive results. Four of the RCTs used colposcopy confirmation for all participants who had the primary HPV and triage test, regardless of the outcome.5 The results were not pooled. Two of the four RCTs, Castle et al. (2011) and Wright et al. (2016), were publications from the US-based ATHENA trial and reported sensitivities and specificities for both CIN2+ (sensitivity = 52.6% [95% CI, 47.6% to 57.6%] and 46.5% [95% CI, 41.7% to 51.3%]) (specificity = 90.1% [95% CI, 89.4% to 90.7%] and 89.9% [95% CI, 89.1% to 90.6%]) and CIN3+ (sensitivity = 89.9% [95% CI, 89.1% to 90.6%] and 48.3% [42.3% to 54.3%]) (specificity = 89.3% [95% CI, 88.6% to 90.0%] and 89.2% [88.5% to 89.9%]) that were much lower than those reported in the other included studies.

For the second triage strategy, primary HPV testing followed by genotyping for HPV 16 and 18, three studies were included. Two of the studies reported DTA values for the entire screening strategy (HPV test and triage test), while one study reported conditional outcomes, that is, the outcomes of the triage test among participants who screened positive on the primary HPV test. The two studies that provided DTA estimates for the strategy as a whole suggested that this strategy was less sensitive but more specific than primary HPV testing followed by cytology triage.5

For the third triage strategy, primary HPV testing followed by sequential HPV genotyping for HPV 16 and 18 and cytology, three studies were included. Two of the studies reported DTA values for the entire screening strategy (HPV test and triage test), while one study reported conditional outcomes, that is, the outcomes of the triage test among participants who screened positive on the primary HPV test. The authors of the SR concluded that the results of the two studies examining the strategy as a whole suggest that this strategy was less sensitive, but more specific, than primary HPV testing followed by cytology.

For the fourth strategy, primary HPV testing followed by co-testing with genotyping for HPV 16 and 18 and cytology, three studies were identified. Two of the studies reported DTA values for the entire screening strategy (HPV test and triage test), while one study reported conditional outcomes, that is, the outcomes of the triage test among participants who screened positive on the primary HPV test. The two included studies examining the strategy as a whole reported DTA estimates suggesting that this strategy was similarly sensitive but less specific than primary HPV testing followed by cytology.5

Two studies also compared strategies two, three, and four and found the highest sensitivity was reported with primary HPV testing with co-testing triage.5 The highest specificity was reported for primary HPV testing followed by sequential genotyping for HPV 16 and 18 and cytology.5

There appeared to be a trade-off between sensitivities and specificities. The studies reporting higher sensitivities in each triage strategy tended to report lower specificities and vice versa. Due to study heterogeneity and insufficient numbers of primary studies in the triage strategies, there were no meta-analyses conducted for the triage strategies.5
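The direction of this trade-off can be sketched with the textbook rules for combining two tests. The example below is illustrative only: it assumes the two triage tests are conditionally independent given disease status, and the accuracy values are hypothetical rather than drawn from the HIQA SR. Referring only when both tests are positive loosely corresponds to sequential triage; referring when either is positive loosely corresponds to co-testing triage.

```python
def both_positive(sens_a, spec_a, sens_b, spec_b):
    """Refer only if BOTH triage tests are positive (sequential-style rule)."""
    sens = sens_a * sens_b                    # a case must be caught by both tests
    spec = 1 - (1 - spec_a) * (1 - spec_b)    # a false referral needs two false positives
    return sens, spec


def either_positive(sens_a, spec_a, sens_b, spec_b):
    """Refer if EITHER triage test is positive (co-testing-style rule)."""
    sens = 1 - (1 - sens_a) * (1 - sens_b)    # a case is missed only if both tests miss it
    spec = spec_a * spec_b                    # a non-case must test negative on both
    return sens, spec


# Hypothetical triage-step accuracies among HPV-positive participants.
genotyping = (0.60, 0.80)  # (sensitivity, specificity), assumed values
cytology = (0.70, 0.90)    # (sensitivity, specificity), assumed values

seq_sens, seq_spec = both_positive(*genotyping, *cytology)
co_sens, co_spec = either_positive(*genotyping, *cytology)
print(f"both positive:   sensitivity {seq_sens:.2f}, specificity {seq_spec:.2f}")
print(f"either positive: sensitivity {co_sens:.2f}, specificity {co_spec:.2f}")
```

Under these assumptions, the either-positive rule always has sensitivity at least as high, and specificity at least as low, as the both-positive rule, which is the same direction as the pattern the HIQA SR reported for co-testing versus sequential triage.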

Diagnostic Test Accuracy — Subsequent Rounds of Screening

Systematic Reviews

There were five primary studies included in the HIQA SR for longitudinal DTA.5 The results are presented in Table 15. Longitudinal DTA was discussed based on the four triage strategies identified previously.5 There was no meta-analysis conducted for any of these four strategies.5 Not all included studies directly compared cross-sectional and longitudinal DTA.5 The findings in the HIQA SR were summarized as follows:

  • For the strategy with primary HPV testing followed by cytology, five of the six included primary studies reported longitudinal DTA after following the participants for one to four years.5 High longitudinal sensitivities and specificities were maintained for the detection of CIN2+ and CIN3+.5
  • One included study, authored by the VUSA-screen researchers, reported longitudinal DTA for primary HPV testing followed by genotyping for HPV 16 and 18.5 Compared with the baseline DTA reported by the NTCC trial, the longitudinal sensitivity was significantly lower and the specificity was significantly higher.5
  • For primary HPV testing followed by sequential genotyping for HPV 16 and 18 and cytology, the three-year sensitivities reported in the ATHENA trial were significantly higher than those reported at baseline.5 In contrast, the three-year sensitivities and specificities were slightly lower in the VUSA-screen trial, as compared with the baseline DTA reported in the Public Health Trial Finland (referred to as FINNISH in the AHRQ SR).5
  • For primary HPV testing followed by co-testing with genotyping for HPV 16 and 18 and cytology, the longitudinal sensitivity was lower and the specificity was higher in the POBASCAM trial than the baseline values reported in the Public Health Trial Finland (or FINNISH in the AHRQ review).

Table 15. Longitudinal Sensitivity and Specificity of Triage Strategies.

Primary Studies

There were no primary studies identified for the comparison between these four HPV triage strategies.

Referral to Colposcopy of Triage Strategies

Systematic Reviews

The colposcopy referral rates of the previously mentioned triage strategies (Figure 6) were summarized by the authors of the HIQA SR (n = 6).5 The results are presented in Table 16. The colposcopy referral rates were not compared with each other or meta-analyzed.5 Among the four screening strategies that are relevant to the Canadian setting, primary HPV testing followed by co-testing (genotyping for HPV 16 and 18 and cytology) seemed to have higher referral rates than the other three strategies.5 The differences in the referral rates of total screened between the other three strategies were not clear.5

Table 16. Colposcopy Referral Rates of Triage Strategies.

Primary Studies

There were no primary studies identified that reported colposcopy referral rates based on these four triage strategies.

Harms and Clinical Utility Outcomes Other Than Referral to Colposcopy

Systematic Reviews

There were no SRs identified that reported harms or clinical utility outcomes other than colposcopy referral rates based on these four triage strategies.

Primary Studies

There were no primary studies identified that reported harms or clinical utility outcomes other than colposcopy referral rates based on these four triage strategies.

Summary of Results

Summary of Results for Question 1

Four SRs were included for the comparison between HPV tests and cytology.5,20,40 Twenty-two relevant publications of 21 primary studies were identified that were published after the literature search cut-offs of the included SRs.19,42–60

DTA outcomes were addressed using the results of the Cochrane review40 and the review by the HIQA.5 Authors of the Cochrane review40 directly compared three types of HPV tests (HC2, Aptima, and PCR [13 or more virus strains]) with cytology. The HIQA review compared HC2 with cytology.5 Both reviews5,40 concluded that, at the HPV threshold of 1 pg/mL or 1 RLU:

  • HC2 was more sensitive than cytology at the threshold of ASCUS for the detection of CIN2+ and CIN3+ (Table 47 [HC2] and Table 46 [cytology])
  • HC2 was less specific than cytology at the threshold of ASCUS for the detection of CIN2+ and CIN3+ (Table 47 [HC2] and Table 46 [cytology]).

The Cochrane review40 reported that, for the prediction of CIN3+, the sensitivity of HPV testing was higher in the studies at high risk of verification bias than in those at low risk (Table 49),40 suggesting that sensitivity is overestimated in the former. For CIN3+, the sensitivities of HPV testing reported in the studies that recruited participants older than 30 years of age were higher than the sensitivities reported in studies where all eligible screening ages were included (93.9% [95% CI, 89.3% to 96.6%] versus 92.6% [95% CI, 89.6% to 95.3%]),40 which is as expected due to a higher prevalence of high-grade lesions in this group of participants older than 30 years of age.

The results of seven of the eight primary studies identified since the publication of the SRs supported the conclusion that HPV tests, including HC2, Multiplex Genotyping, Aptima, Cobas, and Confidence, demonstrate higher sensitivity and lower specificity than either LBC or conventional cytology.43,51,53,56,58,60 One retrospective study by Jin et al.19 found that HC2 was both more sensitive and more specific than cytology. There was no definitive explanation as to why the results of this study were discordant; however, the authors did not specify the diagnostic threshold used for HC2 testing nor did they adjust the results for verification bias.
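To show why unadjusted results can mislead when only some screen-negative participants receive colposcopy, the sketch below contrasts a naive calculation with a Begg and Greenes style correction. It is an illustration under stated assumptions, not the method used by any included study: all counts are hypothetical, and the correction assumes that the decision to verify depends only on the screening test result.

```python
def naive_accuracy(v_pos, d_pos, v_neg, d_neg):
    """Sensitivity and specificity computed from verified participants only."""
    sens = d_pos / (d_pos + d_neg)
    healthy_pos = v_pos - d_pos
    healthy_neg = v_neg - d_neg
    spec = healthy_neg / (healthy_neg + healthy_pos)
    return sens, spec


def corrected_accuracy(n_pos, v_pos, d_pos, n_neg, v_neg, d_neg):
    """Scale disease rates seen in verified participants up to everyone screened.

    n_* = all screened, v_* = verified by the reference standard,
    d_* = verified and found diseased; *_pos / *_neg = screening test result.
    """
    dis_pos = n_pos * d_pos / v_pos  # estimated diseased among all screen-positives
    dis_neg = n_neg * d_neg / v_neg  # estimated diseased among all screen-negatives
    sens = dis_pos / (dis_pos + dis_neg)
    spec = (n_neg - dis_neg) / ((n_neg - dis_neg) + (n_pos - dis_pos))
    return sens, spec


# Hypothetical study: every screen-positive is verified, but only 10% of screen-negatives.
naive_sens, naive_spec = naive_accuracy(v_pos=100, d_pos=40, v_neg=90, d_neg=2)
corr_sens, corr_spec = corrected_accuracy(
    n_pos=100, v_pos=100, d_pos=40, n_neg=900, v_neg=90, d_neg=2
)
print(f"naive:     sensitivity {naive_sens:.3f}, specificity {naive_spec:.3f}")
print(f"corrected: sensitivity {corr_sens:.3f}, specificity {corr_spec:.3f}")
```

In this hypothetical example the naive sensitivity is inflated and the naive specificity is deflated, because missed cases among unverified screen-negative participants are not counted.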

Acceptance of screening invitations was evaluated in one SR. Based on the summary results of the review by Verdoodt et al.,20 the pooled estimates in both the per-protocol and intent-to-treat analyses showed that the option of mailing a self-collected HPV test to all eligible participants who were overdue for screening was more accepted than undergoing standard cervical cancer screening. In both analyses, the acceptance of the opt-in self-sampled HPV testing option was not significantly different from that of standard cervical cancer screening.20 The option of going door-to-door and offering self-collected HPV testing kits to participants overdue for screening was not associated with significantly different acceptance rates when compared with conventional screening.20

Based on results of the five primary studies published after Verdoodt et al., there was evidence to show higher participation rates for self-collected HPV testing than for conventional cytology testing among women who were considered as non-attenders for cervical cancer screening.44,46,47,49,50 Among the five studies that tested the statistical significance of the difference between groups,46–50 three reported higher participation rates in the self-sampling group.7,46,49 However, Zehbe et al.48 conducted a cluster randomized study of First Nations communities in Ontario and found similar participation rates between self-sampling and control groups.48 Rossi et al. compared four strategies (a self-sampler delivered to the home for self-testing, a self-sampler obtained in a pharmacy, cytology at a clinic, and HPV tests at a clinic) and did not report significant differences in participation rates.50 The relative frequency of participating in cervical cancer screening via clinician-sampled HPV tests was higher than that for cytology in the study by Pasquale et al.59

Colposcopy referral rates and the detection of CIN3+ were reported in the SR by Melnikow et al.41 Among participants who were triaged, higher colposcopy referral rates and detection of CIN3+ were reported in the primary HPV testing groups compared with cytology in round 1.41 Higher rates of colposcopy referral were observed among participants younger than 35 years versus those aged 35 years and older.41 There was heterogeneity in screening strategies, settings, and populations observed in the studies included in the SR by Melnikow et al.41 The one-arm cohort study (Zorzi et al.) reported higher colposcopy referral rates at the first round of screening (4.4%) as compared with the second round (2.2%) two years later.41

Four primary studies published after the draft SR by Melnikow et al.41 evaluated colposcopy referral in two different ways: relative to total participants screened or relative to the number of participants who were triaged or randomized.42,43,51,52,57 Referral rates relative to the total number of participants screened ranged from 0% for LBC to 16.3% for Multiplex Genotyping.42,43,51,52,57 In the studies looking at referral relative to the number of participants who were triaged or randomized, the referral rates for Cobas, HC2 or Cobas with LBC triage, and LBC were 0.4% or less.42

There was limited evidence available to address harms and clinical utility. Overall, the evidence was consistent in demonstrating that primary high-risk HPV screening led to a statistically significantly increased detection of CIN3+ in the initial round of screening versus cytology and that the relative risk for CIN3+ detection between screening groups was similar to the overall findings in both the younger (younger than 35 years) and older (older than 35 years) age groups.41 The results of the Canadian FOCAL trial were reported as a part of the SR by Melnikow et al.41,65 The results of this study appeared to be comparable with the results of the primary studies that were conducted in other countries.

Though Melnikow et al. aimed to assess the harms and AEs associated with cervical cancer screening, no results were identified among the included primary studies with regard to cervical cancer mortality, rates of cervical cancer treatment, or harms occurring from the screening test, diagnostic testing, or treatments.41,65 No data were provided regarding the impact of HPV testing on the detection of adenocarcinoma. The authors commented that the studies included in their SR were not adequately powered to detect the relatively uncommon AEs that can occur following the biopsy or treatment of cervical lesions.41,65 Melnikow et al. were able to comment on the incidence of invasive cervical cancer detected in participants with negative screening tests in three studies evaluating screening using primary high-risk HPV testing compared with cytology based on the meta-analysis by Ronco et al. (Table 55).41,65 The pooled incidence rates were 0.05 and 0.08 in the HPV testing and cytology groups, respectively.41,65 In the NTCC phase II study, no cases of invasive cervical cancer or CIN3+ were identified among those who were screen negative and were followed up to three and a half years after one round of screening in both the control and intervention groups.41,65 After one round of screening and five years of follow-up, the FINNISH trial reported invasive cervical cancer in 0.01% (5 of 57,135) of participants with an initial negative screening result in the high-risk HPV testing group and in 0.005% (3 of 61,241) of participants in the cytology group.41,65 Data on invasive cervical cancer were not reported in the HPV FOCAL trial, as it used rates of CIN2+ and CIN3+.41,65
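The FINNISH proportions quoted above can be checked directly from the reported counts. The short sketch below only reproduces that arithmetic; the helper name is an illustrative choice, and no figures beyond those stated in the text are used.

```python
def percent(cases, participants):
    """Cases as a percentage of screen-negative participants."""
    return 100 * cases / participants


# Counts as reported for the FINNISH trial after five years of follow-up.
hpv_group = percent(5, 57_135)       # high-risk HPV testing group
cytology_group = percent(3, 61_241)  # cytology group

print(f"HPV testing group: {hpv_group:.4f}%")
print(f"Cytology group:    {cytology_group:.4f}%")
```

Rounded, these come to roughly 0.009% and 0.005%, which is consistent with the 0.01% and 0.005% reported. With only five and three cases, the difference between the groups should be read with caution.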

Summary of Results for Question 2

Baseline and longitudinal DTA of four different HPV testing and triage strategies were compared in the HIQA SR.5 There were four primary studies included and, due to heterogeneity, there was no meta-analysis conducted.5 Based on the results presented in the HIQA SR, there seemed to be a trade-off between the sensitivities and specificities of the four strategies.5 Primary HPV testing, followed by triage with sequential genotyping and cytology, was less sensitive and more specific than primary HPV testing followed by cytology triage in three included primary studies.5 Primary HPV testing followed by co-testing with genotyping and cytology was similarly sensitive but less specific than primary HPV testing followed by cytology in two primary studies.5 Among the four HPV triage strategies, primary HPV testing followed by co-testing with genotyping and cytology seemed to have the highest sensitivity.5 Primary HPV testing followed by sequential genotyping and cytology seemed to have the highest specificity. There were no additional primary studies identified for these outcomes in the CADTH search.

Longitudinal DTA was summarized based on the same triage strategies.5 There was no meta-analysis conducted for the three primary studies.5 The sensitivity and specificity of the primary HPV testing followed by cytology remained high after one to four years of follow-up.5 The longitudinal DTAs of the other three triage strategies of interest were compared with baseline DTA.5 Longitudinal sensitivities were lower than baseline for primary HPV testing followed by either cytology alone, sequential genotyping and cytology, or co-testing (with HPV genotyping and cytology).5 The longitudinal specificities were higher than baseline for primary HPV testing followed by cytology alone and by co-testing (with HPV genotyping and cytology), while they were lower than baseline for primary HPV testing followed by sequential genotyping and cytology.5 There were no additional primary studies identified for these outcomes in the CADTH search.

The colposcopy referral rates based on the four triage strategies were reported in the same four studies that reported DTA outcomes.5 The results were not meta-analyzed. Primary HPV testing followed by co-testing with genotyping and cytology seemed to have higher referral rates of total screened compared with primary HPV testing followed by either cytology alone, genotyping alone, or sequential genotyping and cytology.5

Companion Reports

To identify additional information regarding the comparability and agreement of DTA between self-sampled HPV tests and clinician-sampled HPV tests or cytology, we undertook a rapid review of the literature, which has been published separately.67 The review aimed to address the following questions:

  • What is the diagnostic test accuracy of self-sampled HPV tests compared with clinician-sampled HPV tests or cytology for asymptomatic cervical cancer screening?
  • What is the clinical evidence regarding the agreement or concordance of self-sampled HPV tests and clinician-sampled HPV tests or cytology for asymptomatic cervical cancer screening?

Based on a review and critical appraisal of one SR, four RCTs, six prospective cohort studies, and two cross-sectional studies, there is evidence that self-sampled HPV tests can achieve DTA similar to that of clinician-sampled HPV tests with certain combinations of HPV tests and sampling devices for the detection of CIN2 or more severe diagnoses. For example, GP5+/6+ PCR HPV tests based on cervix specimens self-sampled with brushes or lavage have sensitivities and specificities similar to those of clinician-sampled HPV tests. Signal-based HPV tests, including HC2, one of the most widely tested HPV tests, are less sensitive and less specific with self-sampled specimens. There are individual studies showing high concordance or fair to high agreement between self- and clinician-sampled HPV tests. However, self-sampled HPV tests are less sensitive and specific than cytology at the threshold of ASCUS or more severe dysplasia.

The advantages of self-sampled HPV tests included better acceptance by those eligible for routine screening programs. Self-sampled HPV tests detected more cases with findings of CIN2+ than cytology or co-testing with clinician-sampled HPV tests and cytology.

The limitations of this review include considerable heterogeneity between studies, relatively few studies on the agreement between self- and clinician-sampled HPV tests, and the applicability of the existing evidence to vaccinated populations.

Further details regarding the methods and results of the rapid review are available on the CADTH website.67

Copyright © 2019 Canadian Agency for Drugs and Technologies in Health.

The copyright and other intellectual property rights in this document are owned by CADTH and its licensors. These rights are protected by the Canadian Copyright Act and other national and international laws and agreements. Users are permitted to make copies of this document for non-commercial purposes only, provided it is not modified when reproduced and appropriate credit is given to CADTH and its licensors.

Except where otherwise noted, this work is distributed under the terms of a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International licence (CC BY-NC-ND), a copy of which is available at http://creativecommons.org/licenses/by-nc-nd/4.0/

Bookshelf ID: NBK543096
