NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

WHO consolidated guidelines on tuberculosis: Module 3: Diagnosis – Rapid diagnostics for tuberculosis detection [Internet]. Geneva: World Health Organization; 2024.

2. Recommendations

2.1. Initial diagnostic tests for diagnosis of TB with drug-resistance detection

Xpert MTB/RIF and Xpert MTB/RIF Ultra assays

The development of the Xpert MTB/RIF assay (Cepheid, Sunnyvale, United States of America [USA]) was a significant step forward for improving the diagnosis of TB and the detection of rifampicin resistance globally. However, Xpert MTB/RIF sensitivity is suboptimal, particularly in smear-negative and HIV-associated TB patients. The Xpert MTB/RIF Ultra (Cepheid, Sunnyvale, USA), hereafter referred to as Xpert Ultra, was developed by Cepheid as the next-generation assay to overcome these limitations. It uses the same GeneXpert® platform as the Xpert MTB/RIF.

Recommendations

This section contains five sets of recommendations, with each set being specific for a particular type of testing (initial or repeated) and type of TB (pulmonary or extrapulmonary).

Recommendations on Xpert MTB/RIF and Xpert Ultra as initial tests in adults and children with signs and symptoms of pulmonary TB

  1. In adults with signs and symptoms of pulmonary TB, Xpert MTB/RIF should be used as an initial diagnostic test for TB and rifampicin-resistance detection in sputum rather than smear microscopy/culture and phenotypic DST.
    (Strong recommendation, high certainty of evidence for test accuracy; moderate certainty of evidence for patient-important outcomes [footnote 6])
  2. In children with signs and symptoms of pulmonary TB, Xpert MTB/RIF should be used as an initial diagnostic test for TB and rifampicin-resistance detection in sputum, gastric aspirate, nasopharyngeal aspirate and stool rather than smear microscopy/culture and phenotypic DST.
    (Strong recommendation, moderate certainty for accuracy in sputum; low certainty of evidence for test accuracy in gastric aspirate, nasopharyngeal aspirate and stool)
  3. In adults with signs and symptoms of pulmonary TB and without a prior history of TB (≤5 years) or with a remote history of TB treatment (>5 years since end of treatment), Xpert Ultra should be used as an initial diagnostic test for TB and for rifampicin-resistance detection in sputum, rather than smear microscopy/culture and phenotypic DST.
    (Strong recommendation, high certainty of evidence for test accuracy)
  4. In adults with signs and symptoms of pulmonary TB and with a prior history of TB and an end of treatment within the last 5 years, Xpert Ultra may be used as an initial diagnostic test for TB and for rifampicin-resistance detection in sputum, rather than smear microscopy/culture and phenotypic DST.
    (Conditional recommendation, low certainty of evidence for test accuracy)
  5. In children with signs and symptoms of pulmonary TB, Xpert Ultra should be used as the initial diagnostic test for TB and detection of rifampicin resistance in sputum or nasopharyngeal aspirate, rather than smear microscopy/culture and phenotypic DST.
    (Strong recommendation, low certainty of evidence for test accuracy in sputum; very low certainty of evidence for test accuracy in nasopharyngeal aspirate)
Footnote 6: Mortality, cure, pretreatment loss to follow-up, time to diagnosis, treatment and mortality in PLHIV.

Remarks

For recommendation 2: Sputum includes expectorated and induced sputum. Studies assessing the impact of Xpert MTB/RIF on outcomes in children are lacking. The choice of the specimen will depend on the acceptability (for children, parents, health care workers and other stakeholders) and the feasibility of collecting and preparing specimens in the local context. Regarding Xpert MTB/RIF, the certainty of evidence is higher for sputum and nasopharyngeal aspirates than for other specimen types. The recommendation can be extrapolated for children living with HIV. The direct benefit from testing for rifampicin resistance in sputum (very low certainty of evidence for test accuracy) can be extrapolated to other specimens.

For recommendation 4: The justification for a conditional recommendation is based on:

  • low certainty of evidence for test accuracy;
  • uncertainty about the interpretation of Xpert Ultra trace results in patients with a prior history of disease and the associated high false-positivity rate; and
  • uncertainty about the required resources.

For patients with Xpert Ultra trace results, decisions regarding treatment initiation should include considerations of the clinical presentation and the patient context (including prior treatment history, probability of relapse and other test results).

Recommendations on Xpert MTB/RIF and Xpert Ultra as initial tests in adults and children with signs and symptoms of extrapulmonary TB

  6. In adults and children with signs and symptoms of TB meningitis, Xpert MTB/RIF or Xpert Ultra should be used in cerebrospinal fluid (CSF) as an initial diagnostic test for TB meningitis rather than smear microscopy/culture.
    (Strong recommendation, moderate certainty of evidence for test accuracy for Xpert MTB/RIF; low certainty of evidence for test accuracy for Xpert Ultra)
  7. In adults and children with signs and symptoms of extrapulmonary TB, Xpert MTB/RIF may be used in lymph node aspirate, lymph node biopsy, pleural fluid, peritoneal fluid, pericardial fluid, synovial fluid or urine specimens as the initial diagnostic test rather than smear microscopy/culture.
    (Conditional recommendation, moderate certainty of evidence for test accuracy for pleural fluid; low certainty for lymph node aspirate, peritoneal fluid, synovial fluid, urine; very low certainty for pericardial fluid, lymph node biopsy)
  8. In adults and children with signs and symptoms of extrapulmonary TB, Xpert Ultra may be used in lymph node aspirate and lymph node biopsy as the initial diagnostic test rather than smear microscopy/culture.
    (Conditional recommendation, low certainty of evidence)
  9. In adults and children with signs and symptoms of extrapulmonary TB, Xpert MTB/RIF or Xpert Ultra should be used for rifampicin-resistance detection rather than culture and phenotypic DST.
    (Strong recommendation, high certainty of evidence for test accuracy for Xpert MTB/RIF; low certainty of evidence for Xpert Ultra)
  10. In HIV-positive adults and children with signs and symptoms of disseminated TB, Xpert MTB/RIF may be used in blood as an initial diagnostic test for disseminated TB.
    (Conditional recommendation, very low certainty of evidence for test accuracy)

Remarks

For recommendation 6: This recommendation applies to all patients with signs and symptoms of TB meningitis. The recommendation in children with signs and symptoms of TB meningitis is based on very low certainty of evidence for test accuracy for Xpert MTB/RIF. No data were available on the accuracy of Xpert Ultra for TB meningitis in children.

For recommendation 7: Clinical judgement and pretest probability should guide treatment. In a high pretest probability setting (>5%), a negative test result will not rule out the condition. Available data on Xpert MTB/RIF for children have included lymph node aspirate and lymph node biopsy specimens; given the similarity of the effects, the recommendation for adults is extrapolated for children.

For recommendation 8: The composite reference standard for Xpert Ultra gave similar results when lymph node aspirate was compared with lymph node biopsy.

For recommendation 9: Clinical judgement and pretest probability should guide treatment. In a high pretest probability setting, a negative test result will not rule out the condition.

For recommendation 10: Blood was only evaluated in people living with HIV (PLHIV) and under particular processing specifications (6), using third-generation Xpert MTB/RIF cartridges, based on one study with a small number of participants. The recommendation applies only to a particular population (HIV-positive adults with signs and symptoms of disseminated TB). The GDG did not feel comfortable extrapolating this recommendation to other patient populations.
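
The remarks for recommendations 7 and 9 note that, where pretest probability is high, a negative result does not rule out disease. The arithmetic behind that statement is Bayes' theorem, and a minimal sketch follows. The sensitivity and specificity values are hypothetical placeholders chosen only for illustration; they are not accuracy estimates from this guideline, and the function name is likewise an assumption.

```python
def post_test_probability_negative(pretest: float,
                                   sensitivity: float,
                                   specificity: float) -> float:
    """Return P(disease | negative test) using Bayes' theorem."""
    for name, value in (("pretest", pretest),
                        ("sensitivity", sensitivity),
                        ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    false_negative_mass = pretest * (1.0 - sensitivity)   # diseased, test negative
    true_negative_mass = (1.0 - pretest) * specificity    # not diseased, test negative
    total_negative = false_negative_mass + true_negative_mass
    if total_negative == 0.0:
        raise ValueError("a negative result is impossible with these inputs")
    return false_negative_mass / total_negative


# HYPOTHETICAL accuracy values, for illustration only (not from the guideline).
ASSUMED_SENSITIVITY = 0.70
ASSUMED_SPECIFICITY = 0.98

if __name__ == "__main__":
    for pretest in (0.01, 0.05, 0.20, 0.50):
        residual = post_test_probability_negative(
            pretest, ASSUMED_SENSITIVITY, ASSUMED_SPECIFICITY)
        print(f"pretest {pretest:.0%} -> probability after a negative result {residual:.1%}")
```

Under these assumed values, the residual probability of disease after a negative result grows with the pretest probability, which is why clinical judgement remains necessary in high-probability settings.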

Recommendations on Xpert MTB/RIF and Xpert Ultra repeated testing in adults and children with signs and symptoms of pulmonary TB

  11. In adults with signs and symptoms of pulmonary TB who have an Xpert Ultra trace positive result on the initial test, repeated testing with Xpert Ultra may not be used.
    (Conditional recommendation, very low certainty of evidence for test accuracy)
  12. In children with signs and symptoms of pulmonary TB in settings with pretest probability below 5% and an Xpert MTB/RIF negative result on the initial test, repeated testing with Xpert MTB/RIF in sputum, gastric fluid, nasopharyngeal aspirate or stool specimens may not be used [footnote 8].
    (Conditional recommendation, low certainty of evidence for test accuracy for sputum and very low for other specimen types)
  13. In children with signs and symptoms of pulmonary TB in settings with pretest probability 5% or more and an Xpert MTB/RIF negative result on the initial test, repeated testing with Xpert MTB/RIF (for a total of two tests) in sputum, gastric fluid, nasopharyngeal aspirate and stool specimens may be used.
    (Conditional recommendation, low certainty of evidence for test accuracy for sputum and very low for other specimen types)
  14. In children with signs and symptoms of pulmonary TB in settings with pretest probability below 5% and an Xpert Ultra negative result on the initial test, repeated testing with Xpert Ultra in sputum or nasopharyngeal aspirate specimens may not be used.
    (Conditional recommendation, very low certainty of evidence for test accuracy)
  15. In children with signs and symptoms of pulmonary TB in settings with pretest probability 5% or more and an Xpert Ultra negative result on the initial test, one repeated Xpert Ultra test (for a total of two tests) in sputum and nasopharyngeal aspirate specimens may be used.
    (Conditional recommendation, very low certainty of evidence for test accuracy)

Footnote 8: In low-prevalence settings, the effect of the second test was less pronounced.

Remarks

For recommendation 11: Xpert Ultra trace results will require follow-up, including reassessing clinical symptoms and information on prior history of TB. In the case of suspected rifampicin resistance, repeated testing may provide additional benefit for detection as well as an initial attempt to assess rifampicin resistance.

For recommendation 13: The GDG felt that the implementation of the recommendation depends on the acceptability (for children, parents or caregivers, health care workers and other stakeholders) and the feasibility of conducting repeated testing in the local context. The evidence reviewed evaluated repeating the same test on the same type of specimen. However, from the data reviewed on comparing single tests on different specimen types, there appears to be no difference, regardless of which second specimen is obtained. The recommendation can be extrapolated for children living with HIV (for Xpert MTB/RIF). This includes consideration of the direct benefit from detecting rifampicin resistance in sputum samples (very low certainty of evidence for test accuracy), which the GDG felt can be extrapolated to other samples. The recommendation applies to a moderate or high pretest setting (>5%). If the first test result is positive, the test should not be repeated. In settings with moderate to high pretest probability, the incremental yield of more than two tests is unknown.

For recommendation 15: Desirable and undesirable effects were judged to be moderate, but the GDG felt that testing twice in the moderate and high pretest probability (>5%) settings on balance may provide more benefits than harms. The recommendation is applicable for sputum and nasopharyngeal aspirates. No evidence was identified for stool and gastric aspirates.
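
Recommendations 12 to 15 turn on three inputs: the assay, the specimen type and whether the pretest probability reaches 5%. The sketch below encodes only that published logic for children as a lookup. It is an illustration, not a clinical decision tool; the function and constant names are hypothetical, and the remarks above (acceptability, feasibility, clinical context) are not modelled.

```python
from enum import Enum

# Threshold stated in recommendations 12-15 ("5% or more" versus "below 5%").
PRETEST_THRESHOLD = 0.05


class Assay(Enum):
    XPERT_MTB_RIF = "Xpert MTB/RIF"
    XPERT_ULTRA = "Xpert Ultra"


# Specimen types named in recommendations 12-13 (Xpert MTB/RIF)
# and in recommendations 14-15 (Xpert Ultra).
COVERED_SPECIMENS = {
    Assay.XPERT_MTB_RIF: frozenset(
        {"sputum", "gastric fluid", "nasopharyngeal aspirate", "stool"}),
    Assay.XPERT_ULTRA: frozenset({"sputum", "nasopharyngeal aspirate"}),
}


def repeat_test_may_be_used(assay: Assay,
                            specimen: str,
                            pretest_probability: float,
                            first_result_positive: bool) -> bool:
    """Return True if one repeat test (two in total) may be used in a child."""
    if not 0.0 <= pretest_probability <= 1.0:
        raise ValueError("pretest_probability must be between 0 and 1")
    if specimen not in COVERED_SPECIMENS[assay]:
        # The recommendations do not address this assay/specimen pairing.
        raise ValueError(f"{specimen!r} is not covered for {assay.value}")
    if first_result_positive:
        # Remark for recommendation 13: a positive first test is not repeated.
        return False
    return pretest_probability >= PRETEST_THRESHOLD
```

For example, `repeat_test_may_be_used(Assay.XPERT_ULTRA, "sputum", 0.10, False)` returns `True`, whereas the same call with a pretest probability of 0.02 returns `False`.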

Recommendations on Xpert MTB/RIF and Xpert Ultra as initial tests for pulmonary TB in adults in the general population either with signs and symptoms of TB or chest radiograph with lung abnormalities or both

  16. In adults in the general population who had either signs or symptoms of TB or chest radiograph with lung abnormalities or both, the Xpert MTB/RIF or Xpert Ultra may replace culture as the initial test for pulmonary TB.
    (Conditional recommendation, low certainty of the evidence in test accuracy for Xpert MTB/RIF and moderate certainty for Xpert Ultra)
  17. In adults in the general population who had either a positive TB symptom screen or chest radiograph with lung abnormalities or both, one Xpert Ultra test may be used rather than two Xpert Ultra tests as the initial test for pulmonary TB.
    (Conditional recommendation, very low certainty of evidence for test accuracy)

Remarks

For recommendation 16: This recommendation was informed by evidence from recent national surveys of TB disease prevalence in four high TB burden countries. Indirectness of the evidence was classified as serious, given that the methods applied in TB prevalence surveys differ from usual programmatic conditions (e.g. symptom screen limited to cough for 14 days or more, and a requirement in surveys to have the results of both symptom screen and chest radiography available). In addition, inconsistency of the evidence was also classified as serious, owing to variability of the data from different countries. As a result, certainty in the estimates of effect was downgraded to low for sensitivity and moderate for specificity. The recommendation applies only to the use of Xpert MTB/RIF or Xpert Ultra for clinical case management in situations where an immediate decision on patient treatment needs to be made and recourse to supplementary tests is not available or would incur delays. It does not apply to scientific studies with other objectives, such as the reliable estimation of the prevalence of TB disease in the community, for which alternative testing algorithms are required (in particular, to address the issue of false positive results, as illustrated in Table 1.17). Recommendations about the screening and diagnostic algorithms to be used in such studies are beyond the scope of this GDG. Recommendations for the diagnostic algorithm(s) to recommend in national TB prevalence surveys specifically are being developed by WHO and are scheduled for release in 2020.

For recommendation 17: There are concerns about losing global and national capacity for culture testing – the current reference standard for identifying active TB disease. An Xpert Ultra trace result was considered as negative in these studies. More false positive results are expected for Xpert Ultra for pulmonary TB. The recommendation applies only to the use of Xpert Ultra for clinical case management. When Xpert Ultra gives a positive result, clinical management should be followed according to national guidelines. When Xpert Ultra gives a negative result, the patient should be re-evaluated clinically. In the case of a culture-positive result, clinical management should be followed according to national guidelines. In the case of a culture-negative result, the patient should be re-evaluated clinically. The recommendation does not apply to scientific studies with other objectives, such as the reliable estimation of the prevalence of TB disease in the community, in which alternative testing algorithms (e.g. using more than one test) may be required. Recommendations for the diagnostic algorithm(s) to be used in such studies are beyond the scope of this GDG. Recommendations for the diagnostic algorithm(s) to recommend in national TB prevalence surveys specifically are being developed by WHO and are scheduled for release in 2020.

Test descriptions

Xpert MTB/RIF is an automated PCR test (molecular test) using the GeneXpert platform (Fig. 2.1.1). Xpert MTB/RIF is a single test that can detect both MTBC bacteria and rifampicin resistance within 2 hours of starting the test, with minimal hands-on technical time (7).

Fig. 2.1.1. The GeneXpert four-module instrument and the Xpert MTB/RIF test cartridge.

In Xpert MTB/RIF sample processing – in contrast to conventional nucleic acid amplification tests (NAATs) – PCR amplification and detection are integrated into a single self-enclosed test unit; that is, the Xpert MTB/RIF cartridge. Following sample loading, all steps in the assay are automated and contained within the cartridge. In addition, the assay’s sample reagent, used to liquefy sputum, is tuberculocidal (i.e. it has the ability to kill TB bacteria), which largely eliminates concerns about biosafety during the test procedure. These features allow the technology to be taken out of a central laboratory or reference laboratory, and be used nearer to patients. However, Xpert MTB/RIF requires an uninterrupted and stable electrical power supply, temperature control and yearly calibration of the instrument’s modules (8).

Xpert Ultra uses the same GeneXpert platform as Xpert MTB/RIF; Cepheid developed it as the next-generation assay to overcome limitations in sensitivity for TB diagnosis. To improve assay sensitivity for the detection of MTBC, the Xpert Ultra assay incorporates two different multicopy amplification targets (IS6110 and IS1081) and has a larger DNA reaction chamber than Xpert MTB/RIF (50 μL PCR in Xpert Ultra versus 25 μL in Xpert MTB/RIF, Fig. 2.1.2). Xpert Ultra also incorporates fully nested nucleic acid amplification, more rapid thermal cycling, and improved fluidics and enzymes. This has resulted in Xpert Ultra having a limit of detection of 16 bacterial colony forming units (cfu) per millilitre (compared with 114 cfu/mL for Xpert MTB/RIF). To improve the accuracy of rifampicin-resistance detection, the Xpert Ultra test incorporates melting-temperature-based analysis. Specifically, four probes identify rifampicin-resistance mutations in the rifampicin-resistance determining region of the rpoB gene by detecting shifts in the melting temperature away from the wild-type reference range (9).

Fig. 2.1.2. (a) The Xpert MTB/RIF Ultra cartridge with its 50 μL reaction tube (green) and (b) the Xpert MTB/RIF cartridge with its 25 μL reaction tube (green).
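
The melting-temperature analysis described above can be pictured as a per-probe range check: a probe whose measured melting temperature falls outside its wild-type reference range signals a possible rpoB mutation. The sketch below illustrates only that idea. The probe names and temperature windows are hypothetical placeholders, not the manufacturer's values, and the assay's real calling logic is not reproduced here.

```python
from typing import Dict, List, Tuple

# HYPOTHETICAL wild-type melting-temperature windows in degrees Celsius.
# Four probes are used because the text states that four probes cover the
# rifampicin-resistance determining region; names and values are invented.
WILD_TYPE_TM_RANGE: Dict[str, Tuple[float, float]] = {
    "probe_1": (68.0, 72.0),
    "probe_2": (70.0, 74.0),
    "probe_3": (72.0, 76.0),
    "probe_4": (66.0, 70.0),
}


def probes_outside_wild_type(measured_tm: Dict[str, float]) -> List[str]:
    """Return the probes whose measured Tm lies outside the wild-type window."""
    shifted = []
    for probe, (low, high) in WILD_TYPE_TM_RANGE.items():
        tm = measured_tm[probe]  # KeyError if a probe reading is missing
        if tm < low or tm > high:
            shifted.append(probe)
    return shifted


def rifampicin_resistance_suspected(measured_tm: Dict[str, float]) -> bool:
    """Flag a sample when at least one probe shows a melting-temperature shift."""
    return len(probes_outside_wild_type(measured_tm)) > 0
```

With these invented windows, readings of 70.0, 72.0, 69.5 and 68.0 for probes 1 to 4 would flag `probe_3` only, because 69.5 lies below its assumed 72.0 to 76.0 window.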

Justification and evidence

The WHO Global TB Programme has initiated an update of the current guidelines and commissioned a systematic review on the use of Xpert MTB/RIF and Xpert Ultra for the diagnosis of TB in people with signs and symptoms of TB.

The population, intervention, comparator and outcome (PICO) questions were designed to form the basis for the evidence search, retrieval and analysis.

Box 2.1.1. PICO questions and subquestions

  • PICO 1: Among adults with signs and symptoms of pulmonary TB, seeking care at health care facilities, should Xpert MTB/RIF / Xpert Ultra be used as an initial test for diagnosis of pulmonary TB and rifampicin resistance?
    1.1. What is the impact of Xpert MTB/RIF on patient-important outcomes (cure, mortality, time to diagnosis and time to start treatment)?
    1.2. What is the diagnostic accuracy of Xpert MTB/RIF for pulmonary TB and rifampicin resistance, as compared with microbiological reference standard (MRS)? [footnote 10]
    1.3. What is the diagnostic accuracy of Xpert Ultra for pulmonary TB and rifampicin resistance, as compared with MRS?

  • PICO 2: Among children with signs and symptoms of pulmonary TB, seeking care at health care facilities, should Xpert MTB/RIF / Xpert Ultra be used as an initial test for diagnosis of pulmonary TB and rifampicin resistance?
    2.1. What is the diagnostic accuracy of Xpert MTB/RIF for pulmonary TB and rifampicin resistance in children, as compared with MRS and composite reference standard (CRS)? [footnote 11]
    2.2. What is the diagnostic accuracy of Xpert Ultra for pulmonary TB and rifampicin resistance in children, as compared with MRS and CRS?

  • PICO 3: Among adults with signs and symptoms of extrapulmonary TB, seeking care at health care facilities, should Xpert MTB/RIF / Xpert Ultra be used as an initial test for diagnosis of extrapulmonary TB and rifampicin resistance?
    3.1. What is the diagnostic accuracy of Xpert MTB/RIF for extrapulmonary TB and rifampicin resistance in adults, as compared with MRS and CRS?
    3.2. What is the diagnostic accuracy of Xpert Ultra for extrapulmonary TB and rifampicin resistance in adults, as compared with MRS and CRS?

  • PICO 4: Among children with signs and symptoms of extrapulmonary TB and rifampicin resistance, seeking care at health care facilities, should Xpert MTB/RIF / Xpert Ultra be used as an initial test for diagnosis of extrapulmonary TB and rifampicin resistance?
    4.1. What is the diagnostic accuracy of Xpert MTB/RIF for extrapulmonary TB and rifampicin resistance in children, as compared with MRS and CRS?
    4.2. What is the diagnostic accuracy of Xpert Ultra for extrapulmonary TB and rifampicin resistance in children, as compared with MRS and CRS?

  • PICO 5: Among people with signs and symptoms of pulmonary TB, seeking care at health care facilities, do repeated Xpert (Ultra) tests on subsequent samples as an initial test for diagnosis of pulmonary TB and rifampicin resistance increase sensitivity/specificity compared with a single initial test?
    5.1. Xpert Ultra repeated test for the diagnosis of pulmonary TB in adults with signs and symptoms of pulmonary TB who have an initial Xpert Ultra trace result, as compared with MRS?
    5.2. More than one Xpert MTB/RIF versus one Xpert MTB/RIF to diagnose pulmonary TB in children with signs and symptoms of pulmonary TB, as compared with MRS?
    5.3. More than one Xpert Ultra versus one Xpert Ultra to diagnose pulmonary TB in children with signs and symptoms of pulmonary TB, as compared with MRS?

  • PICO 6: Among adults either with signs and symptoms of TB or chest radiograph with lung abnormalities suggestive of pulmonary TB or both, should Xpert MTB/RIF or Xpert Ultra alone be used to define a case of active TB disease (10)?
    6.1. Xpert MTB/RIF to diagnose pulmonary TB in adults in the general population with signs and symptoms of pulmonary TB or chest radiograph with lung abnormalities or both, as compared with MRS.
    6.2. Xpert Ultra to diagnose pulmonary TB in adults in the general population with signs and symptoms of pulmonary TB or chest radiograph with lung abnormalities or both, as compared with MRS.
    6.3. Two Xpert Ultra versus one Xpert Ultra to diagnose pulmonary TB in adults in the general population with signs and symptoms of TB or chest radiograph with lung abnormalities or both, as compared with MRS.

Footnote 10: Culture.

Footnote 11: Positive culture or a clinical decision to initiate treatment for TB.

The systematic reviews were conducted to summarize the current literature on the diagnostic accuracy of Xpert MTB/RIF and Xpert Ultra for the diagnosis of TB and rifampicin resistance. This was done as part of the WHO process to develop updated guidelines for the use of molecular assays intended as initial tests for the diagnosis of pulmonary and extrapulmonary TB in adults and children. The data on children, where possible, were reported separately from adults.

The certainty of the evidence was assessed consistently through PICO questions, using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach, which produces an overall quality assessment (or certainty) of evidence and a framework for translating evidence into recommendations. The certainty of the evidence is rated as high, moderate, low or very low. These four categories “imply a gradient of confidence in the estimates” (10). In the GRADE approach, even if diagnostic accuracy studies are of observational design, they start as high-quality evidence.

At least two review authors independently completed the quality assessment of diagnostic accuracy studies (QUADAS)-2 assessments. Any disagreements were resolved through discussion or consultation with a third review author.

Finally, where applicable, meta-analyses were performed to estimate pooled sensitivity and specificity separately for Xpert MTB/RIF and Xpert Ultra, and separately for TB (either pulmonary or extrapulmonary) and rifampicin resistance.

Data synthesis was structured around the preset PICO questions listed below. Details of studies included in the current analysis are given in Web Annex 1.1: Xpert MTB/RIF and Xpert Ultra. A summary of the results and details of the evidence quality assessment are available in Web Annex 2.1: Xpert MTB/RIF and Xpert Ultra.

PICO 1: Among adults with signs and symptoms of pulmonary TB, seeking care at health care facilities, should Xpert MTB/RIF / Xpert Ultra be used as an initial test for diagnosis of pulmonary TB and rifampicin resistance?

1.1 What is the impact of Xpert MTB/RIF on patient-important outcomes (cure, mortality, time to diagnosis and time to start treatment)?

The aim of the review was to assess the impact on patient-important outcomes of diagnostic strategies using Xpert MTB/RIF compared with strategies using smear microscopy. The following outcomes were considered: all-cause mortality, pretreatment loss to follow-up, cure, time to diagnosis and time to treatment initiation.

For the impact of Xpert MTB/RIF on patient-important outcomes for TB, seven studies were included (16 421 participants): two individually randomized trials (Mupfumi 2014; Theron 2014), four cluster randomized trials (Churchyard 2015; Cox 2014; Ngwira LG 2017; Durovni 2014), and one individual patient data (IPD) meta-analysis (Di Tanna 2019) (see Web Annex 1.1: Xpert MTB/RIF and Xpert Ultra for details of these and other studies). All studies were conducted in high TB burden and high TB/HIV burden countries. There were two trials in South Africa (Churchyard 2015; Cox 2014), one in Zimbabwe (Mupfumi 2014), one in Malawi (Ngwira LG 2017), one in Brazil (Durovni 2014) and two multicountry studies with sites in South Africa, United Republic of Tanzania, Zambia and Zimbabwe (Theron 2014, Di Tanna 2019). All studies were conducted in outpatient settings and enrolled participants aged 18 years or older.

Web Annex 4.1: Impact of diagnostic test Xpert MTB/RIF on patient-important outcomes for tuberculosis: a systematic review.

1.2 What is the diagnostic accuracy of Xpert MTB/RIF for pulmonary TB and rifampicin resistance, as compared with MRS?

The aim of the review was to assess the diagnostic accuracy of Xpert MTB/RIF for pulmonary TB and rifampicin resistance in adults. Randomized trials, cross-sectional studies and cohort studies were included if they used respiratory specimens to evaluate Xpert MTB/RIF alone or together with Xpert Ultra against the reference standards of culture for TB detection and culture-based DST or MTBDRplus for rifampicin resistance. Only studies that enrolled adults (aged >15 years) were eligible. For the evaluation of TB detection, studies were included that evaluated the index tests in people with signs and symptoms of pulmonary TB, except for studies in PLHIV, where studies were eligible for inclusion irrespective of signs and symptoms of pulmonary TB (e.g. studies that performed TB screening in PLHIV as part of intensified case finding or before TB preventive therapy).

For detection of pulmonary TB, a total of 94 studies were identified. Of these, 85 studies (40 652 participants) evaluated Xpert MTB/RIF and nine studies (3881 participants) evaluated both Xpert Ultra and Xpert MTB/RIF. Of the 94 studies, 50 (53%) took place in high TB burden and 54 (57%) in high TB/HIV burden countries. Most studies had low risk of bias. Also, most studies had low concern about applicability because participants in these studies were evaluated in primary care facilities, local hospitals or both settings.

For detection of rifampicin resistance, 57 studies (8287 participants) evaluated Xpert MTB/RIF. Of the 57 studies, 27 took place in high MDR-TB burden countries. Most studies were judged as having low risk of bias.

Web Annex 4.2: Xpert MTB/RIF and Xpert Ultra for detecting active tuberculosis in adults with signs and symptoms of pulmonary TB: an updated systematic review.

1.3 What is the diagnostic accuracy of Xpert Ultra for pulmonary TB and rifampicin resistance, as compared with MRS?

For detection of pulmonary TB, a total of nine studies (3881 participants) evaluated both Xpert Ultra and Xpert MTB/RIF. For Xpert Ultra, a composite reference standard was also used that included clinical components as defined by the primary study authors. For detection of rifampicin resistance, eight studies (1039 participants) evaluated Xpert Ultra. The total number of Xpert Ultra studies includes one study that provided data for two cohorts; therefore, we classified these as two distinct studies, Mishra 2019a and Mishra 2019b. Most studies were judged as having high certainty of evidence.

Web Annex 4.2: Xpert MTB/RIF and Xpert Ultra for detecting active tuberculosis in adults with signs and symptoms of pulmonary TB: an updated systematic review.

PICO 2: Among children with signs and symptoms of pulmonary TB, seeking care at health care facilities, should Xpert MTB/RIF / Xpert Ultra be used as an initial test for diagnosis of pulmonary TB and rifampicin resistance?

2.1 What is the diagnostic accuracy of Xpert MTB/RIF for pulmonary TB and rifampicin resistance in children, as compared with MRS and CRS?

The initial search resulted in 835 individual records, with one additional reference identified through other sources, giving a total of 836 records, of which 707 were excluded. The remaining 129 articles were retrieved for full-text review. After full-text review, 50 studies were included in the quantitative meta-analysis; of these, 40 (80%) took place in high TB burden countries and 10 in high TB/HIV burden countries. For pulmonary TB detection, 43 studies were included that evaluated the diagnostic accuracy of Xpert MTB/RIF in children, and three that evaluated both Xpert Ultra and Xpert MTB/RIF. Forty-two studies evaluated pulmonary TB using a reference standard of culture, and one study evaluated pulmonary TB using smear microscopy only.

In terms of methodological quality, in the patient selection domain, most studies (83%) evaluating pulmonary TB were judged to have low risk of bias. In the index test domain, all studies were judged to have low risk of bias. In the flow and timing domain, most studies (88%) were judged to have low risk of bias. In the reference standard domain, with respect to the MRS, 47% of studies were judged to have unclear risk of bias because only one culture was used to exclude TB. With respect to the composite reference standard, all studies were judged to have unclear risk of bias because of imperfect accuracy of the composite reference standard and differing definitions of this standard used by the primary study authors. Regarding applicability, in the patient selection domain, 50% of studies were judged as having high or unclear risk of bias, because participants were evaluated exclusively as inpatients at tertiary care centres, or the clinical setting was unclear. With respect to applicability of the index test, most studies (72%) were judged as having low concern owing to standardized application of the index tests. Eleven studies evaluating stool as a specimen for Xpert MTB/RIF or Xpert Ultra were judged to have unclear risk of bias because of the absence of a standardized protocol for stool preparation. Applicability of the reference standard was considered as a low concern for most studies (93%).

To generate evidence about the detection of rifampicin resistance, six studies were included. All six studies (223 participants) evaluated only Xpert MTB/RIF and were conducted in high TB burden countries and in high MDR-TB burden countries. Among the studies, 50% had a low risk of bias with respect to patient selection, while all studies had a low risk of bias with respect to the reference standard. Risk of bias was considered low for the reference standard if an automated process was used or it was clear that the reference standard results were interpreted without knowledge of the index tests. For all six studies, there were applicability concerns regarding patient selection because of enrolment exclusively from inpatient or tertiary centres.

For the meta-analysis, a total of 23 studies (6612 participants) evaluated sputum specimens; 14 studies (3468 participants) evaluated gastric specimens; four studies (1125 participants) evaluated nasopharyngeal specimens; and 11 studies (1592 participants) evaluated stool specimens – all of these studies evaluated Xpert MTB/RIF alone. Three studies (753 participants) evaluated both Xpert MTB/RIF and Xpert Ultra on frozen sputum specimens. One study (195 participants) evaluated both Xpert MTB/RIF and Xpert Ultra on nasopharyngeal specimens.

2.2 What is the diagnostic accuracy of Xpert Ultra for pulmonary TB and rifampicin resistance in children, as compared with MRS and CRS?

No studies evaluated Xpert Ultra alone. Three studies (753 participants) evaluated both Xpert MTB/RIF and Xpert Ultra on frozen sputum specimens. One study (195 participants) evaluated both Xpert MTB/RIF and Xpert Ultra on nasopharyngeal specimens.

Web Annex 4.4: Xpert MTB/RIF and Xpert Ultra for detecting active tuberculosis in children: an updated systematic review.

PICO 3: Among adults with signs and symptoms of extrapulmonary TB, seeking care at health care facilities, should Xpert MTB/RIF / Xpert Ultra be used as an initial test for diagnosis of extrapulmonary TB and rifampicin resistance?

3.1 What is the diagnostic accuracy of Xpert MTB/RIF for extrapulmonary TB and rifampicin resistance in adults, as compared with MRS and CRS?

There are difficulties in obtaining extrapulmonary specimens both from children and adults, and technical limitations of conventional bacteriological methods to aid diagnosis. Thus, various non-pulmonary specimens and composite reference standards are often used in evaluating the performance of new diagnostic technologies in extrapulmonary TB.

For detection of extrapulmonary TB, 65 studies were included. A total of 63 studies (13 144 participants) evaluated Xpert MTB/RIF, including five that evaluated both Xpert MTB/RIF and Xpert Ultra. The included studies evaluated Xpert MTB/RIF in specimens comprising cerebrospinal fluid (CSF), lymph node aspirate, lymph node biopsy, pleural fluid, urine, synovial fluid, peritoneal fluid, pericardial fluid and blood.

Of the total of 65 studies, 39 (60%) took place in high TB burden and 41 (63%) in high TB/HIV burden countries. Risk of bias was judged to be low in the domains of patient selection, index test, and flow and timing; and high or unclear in the reference standard domain because many studies decontaminated sterile specimens before culture inoculation. Regarding applicability, in the patient selection domain, high or unclear concern was expressed for most studies because either the participants were evaluated exclusively as inpatients at tertiary care centres, or the clinical settings were unclear.

Web Annex 4.3: Xpert MTB/RIF and Xpert Ultra for detecting active tuberculosis in adults with signs and symptoms of extrapulmonary TB: an updated systematic review.

3.2 What is the diagnostic accuracy of Xpert Ultra for extrapulmonary TB and rifampicin resistance in adults, as compared with MRS?

Six studies (507 participants) evaluated Xpert Ultra for the detection of extrapulmonary TB. The included studies evaluated the test in specimens comprising CSF, lymph node biopsy, pleural fluid, urine and synovial fluid. Serious concerns were expressed regarding the indirectness of the evidence, related to applicability (i.e. evidence was generated in tertiary referral medical centres), and regarding imprecision, related mostly to the low numbers of participants included in studies. Certainty of evidence was generally judged as being between low and very low.

Web Annex 4.3: Xpert MTB/RIF and Xpert Ultra for detecting active tuberculosis in adults with signs and symptoms of extrapulmonary TB: an updated systematic review.

PICO 4: Among children with signs and symptoms of extrapulmonary TB, seeking care at health care facilities, should Xpert MTB/RIF / Xpert Ultra be used as an initial test for diagnosis of extrapulmonary TB and rifampicin resistance?

4.1 What is the diagnostic accuracy of Xpert MTB/RIF for extrapulmonary TB and rifampicin resistance in children, as compared with MRS and CRS?

4.2 What is the diagnostic accuracy of Xpert Ultra for extrapulmonary TB and rifampicin resistance in children, as compared with MRS and CRS?

To evaluate detection of extrapulmonary TB, studies that evaluated the diagnostic accuracy of Xpert MTB/RIF in children with signs or symptoms of lymph node TB or TB meningitis were included.

For diagnosis of lymph node TB, six studies (210 participants) evaluated Xpert MTB/RIF against an MRS of smear or culture on lymph node specimens. Two studies (105 participants) evaluated Xpert MTB/RIF against a composite reference standard for lymph node TB. For TB meningitis, six studies (241 participants) evaluated Xpert MTB/RIF against culture on CSF. In addition, two studies (155 participants) assessed Xpert MTB/RIF against a composite reference standard that included a clinical diagnosis of TB meningitis. The certainty of evidence was judged to be very low for sensitivity, and low for specificity of detection of both TB meningitis and lymph node TB.

No studies evaluating the accuracy of Xpert Ultra for detecting lymph node TB or TB meningitis were identified.

Web Annex 4.4: Xpert MTB/RIF and Xpert Ultra for detecting active tuberculosis in children: an updated systematic review.

PICO 5: Among people with signs and symptoms of pulmonary TB, seeking care at health care facilities, do repeated Xpert (Ultra) tests on subsequent samples as an initial test for diagnosis of pulmonary TB and rifampicin resistance increase sensitivity/specificity compared with a single initial test?

5.1 Xpert Ultra repeated test for the diagnosis of pulmonary TB in adults with signs and symptoms of pulmonary TB who have an initial Xpert Ultra trace result, as compared with MRS?

For adults with initial Xpert Ultra trace results, three studies were identified: Mishra 2019a (4 participants), Piersimoni 2019 (4 participants) and Dorman 2018 (42 participants) (see Web Annex 1.1: Xpert MTB/RIF and Xpert Ultra for details of included studies). Piersimoni 2019 retested the same initial sample, whereas Dorman 2018 retested a separately collected sputum sample. Mishra 2019a retested only those participants with discrepant results (i.e. Ultra trace positive/culture negative), using new specimens obtained a median of 444 days (range 245–526 days) after initial testing. Owing to limited data, a meta-analysis was not performed. The evidence was downgraded one level for serious concerns about inconsistency and two levels for very serious concerns about imprecision. Certainty of evidence was judged to be very low for both sensitivity and specificity.

Web Annex 4.2: Xpert MTB/RIF and Xpert Ultra for detecting active tuberculosis in adults with signs and symptoms of pulmonary TB: an updated systematic review.

5.2 More than one Xpert MTB/RIF versus one Xpert MTB/RIF to diagnose pulmonary TB in children with signs and symptoms of pulmonary TB, as compared with MRS?

For children, five studies (2119 participants) were included that evaluated the diagnostic accuracy of multiple Xpert MTB/RIF tests compared with a single test. Serious concerns were expressed for indirectness, because patients were enrolled from inpatient tertiary care settings, which could lead to the enrolment of children with more advanced disease. Serious concerns were also expressed for imprecision in the observed sensitivity, related to the low number of children with pulmonary TB contributing to this analysis. Overall, the certainty of evidence was judged to be very low for sensitivity and moderate for specificity.

Web Annex 4.4: Xpert MTB/RIF and Xpert Ultra for detecting active tuberculosis in children: an updated systematic review.

5.3 More than one Xpert Ultra versus one Xpert Ultra to diagnose pulmonary TB in children with signs and symptoms of pulmonary TB, as compared with MRS?

For children, one study (163 participants) was included that evaluated the diagnostic accuracy of multiple Xpert Ultra tests in sputum compared with a single test. The certainty of evidence was judged to be very low for sensitivity and low for specificity owing to serious concerns for indirectness and imprecision. In addition, one study (130 participants) was included that evaluated the diagnostic accuracy of multiple Xpert Ultra tests in nasopharyngeal aspirates compared with a single test. Overall, the certainty of evidence was judged to be very low both for sensitivity and specificity, owing to very serious concerns for indirectness and imprecision.

Web Annex 4.4: Xpert MTB/RIF and Xpert Ultra for detecting active tuberculosis in children: an updated systematic review.

PICO 6: Among adults either with signs and symptoms of TB or chest radiograph with lung abnormalities suggestive of pulmonary TB or both, should Xpert MTB/RIF or Xpert Ultra alone be used to define a case of active TB disease (10)?

The aim of the review was to assess the diagnostic accuracy of Xpert MTB/RIF and Xpert Ultra for pulmonary TB in adults (aged ≥15 years) among the general population. Data from four nationally representative and two subnational prevalence surveys for active TB disease, all cross-sectional in design, were included, giving a total of six surveys. These surveys evaluated Xpert MTB/RIF or Xpert Ultra on sputum samples against the reference standard of culture for TB. The index tests were evaluated in adults with chest X-ray abnormalities or symptoms suggestive of pulmonary TB (or both).

6.1 Xpert MTB/RIF to diagnose pulmonary TB in adults in the general population with signs and symptoms of pulmonary TB or chest radiograph with lung abnormalities or both, as compared with MRS?

The analysis reported on the results of four surveys, including 49 556 participants. Assessment of the evidence revealed serious concerns about indirectness and inconsistency.

Indirectness: the populations in these prevalence surveys differed from the general population with respect to prior testing (e.g. symptom screen was limited to cough for 14 days or more) and the availability of results of both symptom screen and chest radiography in most participants included in the studies. The evidence was downgraded one level for indirectness.

Inconsistency: the sensitivity estimate for Bangladesh was 84%, which was higher than the sensitivity estimates for the other three countries (range, 68–69%). Lower HIV prevalence in Bangladesh could only partly explain the inconsistency. The evidence was downgraded one level for inconsistency. Overall, the certainty of evidence was judged to be low for sensitivity and moderate for specificity.

6.2 Xpert Ultra to diagnose pulmonary TB in adults in the general population with signs and symptoms of pulmonary TB or chest radiograph with lung abnormalities or both, as compared with MRS.

The analysis reported on the results of four surveys, including 11 488 participants. The included countries were Myanmar, South Africa (TREATS project) and Zambia (TREATS project). The average prevalence of TB in these countries was 2.8% (range 1.6–6.7%).

Indirectness: the populations in these prevalence surveys differed from the general population with respect to prior testing (e.g. symptom screen was limited to cough for 14 days or more) and the availability of results of both symptom screen and chest radiography in most participants included in the studies. The evidence was downgraded one level for indirectness.

Imprecision: there were relatively few participants contributing to this analysis, and a wide 95% confidence interval (CI). The 95% CI around true positives and false negatives may lead to different decisions, depending on which limits are assumed. The evidence was downgraded one level for imprecision. Overall, the certainty of evidence was judged to be low for sensitivity and moderate for specificity.
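To illustrate why a wide confidence interval can lead to different decisions, the sketch below converts sensitivity limits into expected true positives and false negatives per 1000 people tested. It is a minimal Python sketch under stated assumptions: the prevalence of 2.8% is the average reported above for the included surveys, but the three sensitivity values are hypothetical placeholders, not results from the surveys, and the function name `per_1000` is introduced here only for illustration.

```python
def per_1000(sensitivity: float, prevalence: float, tested: int = 1000):
    """Return (true positives, false negatives) expected among `tested` people."""
    with_tb = prevalence * tested          # people who truly have TB
    true_pos = sensitivity * with_tb       # detected by the index test
    false_neg = with_tb - true_pos         # missed by the index test
    return round(true_pos, 1), round(false_neg, 1)

PREVALENCE = 0.028  # average TB prevalence reported for the included surveys

# Hypothetical lower limit, point estimate and upper limit of sensitivity.
HYPOTHETICAL_SENSITIVITIES = [("lower", 0.60), ("point", 0.70), ("upper", 0.80)]

for label, sens in HYPOTHETICAL_SENSITIVITIES:
    tp, fn = per_1000(sens, PREVALENCE)
    print(f"{label}: {tp} true positives, {fn} false negatives per 1000 tested")
```

With these placeholder values, the number of missed cases per 1000 tested roughly doubles between the upper and lower limit, which is the kind of spread that may change a decision depending on which limit is assumed.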

6.3 Two Xpert Ultra versus one Xpert Ultra to diagnose pulmonary TB in adults in the general population with signs and symptoms of TB or chest radiograph with lung abnormalities or both, as compared with MRS.

The analysis reported on the results of three surveys, including 5080 participants. Serious concerns were expressed about the indirectness of the available evidence. This was because most of the data were from Myanmar, and the results may not be applicable to other settings. In addition, very serious concerns were expressed about imprecision because the analysis was based on data for only a small number of individuals. The 95% CIs for two Xpert Ultra assays and one Xpert Ultra assay were wide. Overall, the certainty of evidence was judged to be very low for sensitivity and moderate for specificity.

Performance of the molecular assays

Table 2.1.1. PICO 1.1: What is the impact of Xpert MTB/RIF on patient-important outcomes (e.g. cure, mortality, time to diagnosis and time to start treatment)?

Table 2.1.2. PICO 1.2: What is the diagnostic accuracy of Xpert MTB/RIF for pulmonary TB in adults, as compared with MRS?

Table 2.1.3. PICO 1.2: What is the diagnostic accuracy of Xpert MTB/RIF for rifampicin resistance in adults with pulmonary TB, as compared with MRS?

Table 2.1.4. PICO 1.3: What is the diagnostic accuracy of Xpert Ultra for pulmonary TB, as compared with MRS?

Table 2.1.5. PICO 1.3: What is the diagnostic accuracy of Xpert Ultra for rifampicin resistance in adults with pulmonary TB, as compared with MRS?

Table 2.1.6. PICO 2.1: What is the diagnostic accuracy of Xpert MTB/RIF for pulmonary TB in children, as compared with MRS and CRS?

Table 2.1.7. PICO 2.1: What is the diagnostic accuracy of Xpert MTB/RIF for rifampicin resistance in children, as compared with MRS?

Table 2.1.8. PICO 2.2: What is the diagnostic accuracy of Xpert Ultra for pulmonary TB in children, as compared with MRS and CRS?

Table 2.1.9. PICO 3.1: What is the diagnostic accuracy of Xpert MTB/RIF for extrapulmonary TB in adults, as compared with MRS and CRS?

Table 2.1.10. PICO 3.1: What is the diagnostic accuracy of Xpert MTB/RIF for rifampicin resistance in adults with extrapulmonary TB, as compared with MRS?

Table 2.1.11. PICO 3.2: What is the diagnostic accuracy of Xpert Ultra for extrapulmonary TB in adults, as compared with MRS and CRS?

Table 2.1.12. PICO 3.2: What is the diagnostic accuracy of Xpert Ultra for rifampicin resistance in adults with extrapulmonary TB, as compared with MRS and CRS?

Table 2.1.13. PICO 4.1: What is the diagnostic accuracy of Xpert MTB/RIF for extrapulmonary TB in children, as compared with MRS?

Table 2.1.14. PICO 5.1: Xpert Ultra repeated test for the diagnosis of pulmonary TB in adults with signs and symptoms of pulmonary TB who have an initial Ultra trace result, as compared with MRS?

Table 2.1.15. PICO 5.2: More than one Xpert MTB/RIF versus one Xpert MTB/RIF to diagnose pulmonary TB in children with signs and symptoms of pulmonary TB, as compared with MRS?

Table 2.1.16. PICO 5.3: More than one Xpert Ultra versus one Xpert Ultra to diagnose pulmonary TB in children with signs and symptoms of pulmonary TB, as compared with MRS?

Table 2.1.17. PICO 6.1–6.2: Among adults in the general population with signs and symptoms of pulmonary TB or chest radiograph with lung abnormalities or both, should Xpert MTB/RIF or Xpert Ultra alone be used to define a case of active TB disease, as compared with MRS?

Table 2.1.18. PICO 6.3: Two Xpert Ultra versus one Xpert Ultra to diagnose pulmonary TB in adults in the general population with signs and symptoms of TB or chest radiograph with lung abnormalities or both, as compared with MRS.

Cost–effectiveness analysis

This section deals with the following additional question:

What are the comparative cost, affordability and cost–effectiveness of implementation of Xpert MTB/RIF, Xpert Ultra?

A systematic review was carried out, focusing on economic evaluations of molecular-based tests for the diagnosis of active TB. The tests included GeneXpert MTB/RIF (referred to as Xpert MTB/RIF) and the novel Xpert Ultra. The objective of the review was to summarize current economic evidence and further understand the costs, cost–effectiveness and affordability of these molecular tests for TB diagnosis. Twenty-eight studies were identified that met the inclusion criteria and addressed one of the PICO questions of interest. No studies assessing the cost–effectiveness of Xpert Ultra were identified. Most of the studies assessed Xpert MTB/RIF in outpatient settings in countries in Africa; however, studies among outpatients and hospitalized patients in other countries were also included, such as Brazil, China, Germany, Hong Kong Special Administrative Region (SAR), India, South Africa and the USA.

Studies employed a variety of modelling approaches, populations and settings. The included studies varied in their costing, effectiveness and epidemiological parameters, making direct comparisons across studies challenging. Furthermore, the studies varied in which costing elements, implementation costs and downstream costs they included.

Although many studies demonstrated that Xpert MTB/RIF may be cost effective in diagnosing pulmonary TB, key implementation conditions and settings had a strong effect on cost–effectiveness and must be considered when implementing this test. The cost–effectiveness of Xpert MTB/RIF was shown to be improved in certain populations and settings: those with higher TB prevalence, PLHIV, and those where rates of empirical treatment were low. Cost–effectiveness of Xpert MTB/RIF is strongly affected by factors such as the location of GeneXpert machines (i.e. centralized versus decentralized facilities), test volume, underlying TB prevalence, level of empirical treatment and pretreatment loss to follow-up.
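As a rough illustration of two of the drivers named above, the Python sketch below shows how spreading a fixed instrument cost over a larger test volume lowers the per-test cost, and how an incremental cost–effectiveness ratio (ICER) is formed from the costs and effects of two strategies. All figures and function names are hypothetical and chosen only for illustration; none are taken from the reviewed studies, which used a variety of more detailed models.

```python
def cost_per_test(fixed_annual_cost: float, cartridge_cost: float,
                  tests_per_year: int) -> float:
    """Per-test cost when a fixed annual cost is spread across all tests run."""
    if tests_per_year <= 0:
        raise ValueError("tests_per_year must be positive")
    return fixed_annual_cost / tests_per_year + cartridge_cost


def icer(cost_new: float, cost_old: float,
         effect_new: float, effect_old: float) -> float:
    """Incremental cost per additional unit of health effect."""
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("equal effects: the ICER is undefined")
    return (cost_new - cost_old) / delta_effect


# Hypothetical sites sharing the same fixed and cartridge costs.
low_volume = cost_per_test(5000.0, 10.0, 500)
high_volume = cost_per_test(5000.0, 10.0, 5000)
print(low_volume, high_volume)

# Hypothetical strategy comparison (cost and effect units are placeholders).
print(icer(cost_new=1200.0, cost_old=1000.0, effect_new=12.0, effect_old=10.0))
```

Under these placeholder inputs the low-volume site pays almost twice as much per test, which is one way in which centralized versus decentralized placement and test volume can shift a cost–effectiveness result.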

Caution should be used when generalizing cost–effectiveness and economic evaluations across settings. Local implementation conditions and settings should be taken into account, and local implementation studies may be helpful to assess the likely impact on case finding, long-term outcomes and cost–effectiveness.

There is substantial economic evidence around the implementation and scale-up of Xpert MTB/RIF in different settings, most notably among outpatients presenting with signs and symptoms of TB. Most of these studies found that Xpert MTB/RIF would probably be cost effective. Still, there were some exceptions, and it was clear that differences in implementation approaches and settings could have an important impact on cost–effectiveness. Studies employed a wide variety of modelling and analysis approaches, assumptions, diagnostic algorithms and comparators. They also assessed different study settings, making comparisons across studies and generalizations to other settings challenging.

Studies highlighted that implementation factors and settings need to be considered when generalizing cost–effectiveness results to different settings. Important factors in determining whether Xpert MTB/RIF may be cost effective in any given setting include current standard of care, level of empirical treatment, existing testing facilities, location of Xpert MTB/RIF (centralized or decentralized facilities), TB prevalence, patient volume, pretreatment loss to follow-up and existing linkage to care. Other important cost components include whether implementation costs associated with Xpert MTB/RIF scale-up are considered and whether downstream costs (e.g. for TB and MDR-TB treatment, and antiretroviral therapy and HIV care) were included.

Web Annex 4.5: Systematic literature review of economic evidence for molecular assays intended as initial tests for the diagnosis of pulmonary and extrapulmonary TB in adults and children.

User perspective

This section deals with the following question:

Are there implications for feasibility, accessibility, patient equity and human rights from the implementation of Xpert MTB/RIF, Xpert Ultra?

The results of the qualitative research show that participants place great value on the ability of Xpert to improve the diagnosis of DR-TB; they also show the impact on patients if they cannot access testing for drug resistance through this technology. The impact on case notification and the value of Xpert for finding more TB cases was less clear, owing to widespread clinical treatment, the prolonged turnaround time for results, and the challenges with feasibility and use of Xpert.

Although access has improved, not everybody who needs it can access Xpert testing. Simple laboratory procedures do not automatically translate into feasibility to implement. Instead, the feasibility of Xpert testing depends on government commitment to ensure functioning infrastructure and stable power, supply of cartridges and functioning laboratory services, investment in expertise for handling (discordant) results, effective repair services, staff with monitoring capacities, functioning sample transport, sustainable funding models and transparent donor agreements, and simple diagnostic algorithms.

Concerning acceptability, although Xpert has eased laboratory work through convenience and automation, the preference for Xpert in the laboratory can have undesired consequences for treatment monitoring with microscopy, and for reverting to microscopy if GeneXpert instruments become non-functional. Clinicians’ confidence in Xpert results is relatively high, but the challenges with feasibility and use mean that clinicians are at times deterred from ordering Xpert tests.

Summary of the results
  1. Xpert is unable to bridge disconnects or lack of capacity in general laboratory services. Participants valued the option to use a specimen other than sputum, but having GeneXpert machines available in the public sector does not necessarily mean that facilities and capacities are available to extract and make use of those specimens. For example, services for histopathology and bacteriology in one country may be disconnected, and sending a specimen to histopathology in the private sector, for instance, may mean that the sample will not return to a public sector GeneXpert machine.
  2. Xpert Ultra trace results complicate decision-making. Laboratory and clinical management of trace results was rarely straightforward. Study participants reported challenges with obtaining a second fresh sample when patients had left the facility or had been put on treatment and could not easily produce sputum. If repeat tests are conducted after trace, they cause confusion if the second test has a different result (e.g. is negative). Some laboratory managers are unsure which result to report, and clinicians need expertise and experience to conduct more extensive evaluation for trace patients. This presents challenges in peripheral settings and where turnaround times of confirmatory tests (e.g. phenotypic DST and LPA) slow down clinical decision-making.
  3. Discordant results of repeat tests and confirmatory tests can cause confusion around what should be considered the gold standard. This is particularly the case when specimen quality might be poor. Understanding and contextualizing discordant results requires continuous training, experience and expertise.
  4. Establishing a thorough TB history of patients is uncommon, and “previously treated” is defined differently. This has implications for potential false positive results through Xpert testing. Clear guidance is needed on how to define previously treated patients, how to handle their Xpert results, and how to capture outcomes in national databases accurately.
  5. The lack of trained counsellors and of information provided to patients on diagnostics have negative implications. Patients may be unwilling to accept a diagnosis and invest time and money in clinic visits, follow-up tests and treatment. Patients need better quality counselling by health workers to continue with diagnostic journeys and treatment; such counselling should include information about diagnostic technology and considerations for follow-up testing.
  6. Persistent underuse of GeneXpert machines is compounded by the challenges of delays due to sample transport, module breakdown, stock-out of cartridges or complicated diagnostic algorithms. The presence of local Cepheid agents is key for repair. However, high workload and staff turnover, combined with infrastructure and environmental conditions, still cause frequent module breakdown, and repair work can be slow or services deemed insufficient. The challenges of cartridge stock-out lead to important delays and disruption of workflows, leading to underuse.
  7. Diagnostic algorithms that are simple to follow in a specific facility (e.g. test all those with presumptive TB) are more feasible and enhance use, but this simplicity depends on cost and supplies. Cartridge stock-outs or prohibitive costs can complicate diagnostic algorithms, making them less feasible to follow and further compounding underuse. In Uganda, Xpert testing eligibility criteria had to be temporarily restricted to particular patient groups because of cartridge shortages that complicated the algorithm.
  8. Current donor agreements with governments regarding the introduction of new diagnostic technologies are not transparent enough for civil society to be able to hold governments accountable and follow up. Involving civil society in negotiating agreements and social contracts at the national and local facility levels can enhance accountability and the responsiveness of governments, leading to improved implementation processes and access to diagnostics.

Web Annex 4.6: Report on user perspectives on Xpert testing: results from qualitative research.

Research priorities

  • Evaluation of the impact of Xpert Ultra testing on patient-important outcomes (cure, mortality, time to diagnosis and time to start treatment).
  • Evaluation of the diagnostic accuracy of Xpert Ultra in gastric or stool specimens for pulmonary TB and extrapulmonary TB in children.
  • Evaluation of the combinatorial benefit of multiple specimen types. Limited data suggest that the combination of non-invasive specimens performs comparably with traditional gastric specimens or induced sputum specimens.
  • Additional operational and qualitative research to determine the best approach to less-invasive specimen collection.
  • Implementation studies on a method of suction for nasopharyngeal aspiration that is appropriate for low-skill or low-resource environments.
  • Extensive operational research into the use of stool as a diagnostic specimen in terms of integration into normal diagnostic clinical pathways, definition of laboratory protocols that successfully balance ease of implementation and diagnostic performance, and the impact of stool testing on patient-important outcomes. There is also a dearth of qualitative research identifying child and family preferences for, and the acceptability of, the different diagnostic approaches.
  • Identification of an improved reference standard that accurately defines TB disease in children and paucibacillary specimens because the sensitivity of all available diagnostics is suboptimal.
  • Development of new tools that correctly diagnose a higher proportion of child TB cases. Ideally, the new tools will be rapid, affordable, feasible, and acceptable to children and their parents.
  • Comparison of different tests, including Xpert MTB/RIF and Xpert Ultra, to determine which tests (or strategies) yield superior diagnostic accuracy. The preferred study design is one in which all participants either receive all available diagnostic tests or are randomly assigned to receive a particular test. Studies should include children and HIV-positive people. Future research should acknowledge the concern associated with culture as a reference standard and consider ways to address this limitation.
  • Development of rapid point-of-care diagnostic tests for extrapulmonary TB. Research groups should focus on developing diagnostic tests and strategies that use readily available clinical specimens such as urine rather than specimens that require invasive procedures for collection.
  • Operational research to ensure that tests are used optimally in settings of intended use.

Summary of changes between the 2013 guidance and the 2020 update

Xpert MTB/RIF assay for the diagnosis of pulmonary and extrapulmonary TB in adults and children. Policy update (2013) (11) | Molecular assays intended as initial tests for the diagnosis of pulmonary and extrapulmonary TB and rifampicin resistance in adults and children: rapid communication. Policy update (2020) (12) | Changes

Using Xpert MTB/RIF to diagnose extrapulmonary TB and rifampicin resistance in adults and children

  1. Xpert MTB/RIF may be used rather than conventional microscopy and culture as the initial diagnostic test in all adults suspected of having TB (conditional recommendation acknowledging resource implications, high-quality evidence).
  2. Xpert MTB/RIF may be used rather than conventional microscopy and culture as the initial diagnostic test in all children suspected of having TB (conditional recommendation acknowledging resource implications, very low quality evidence).
  3. Xpert MTB/RIF may be used as a follow-on test to microscopy in adults suspected of having TB but not at risk of MDR-TB or HIV-associated TB, especially when further testing of smear-negative specimens is necessary (conditional recommendation acknowledging resource implications, high-quality evidence).
  4. Xpert MTB/RIF should be used in preference to conventional microscopy and culture as the initial diagnostic test for CSF specimens from patients suspected of having TB meningitis (strong recommendation given the urgency for rapid diagnosis, very low quality evidence).
  5. Xpert MTB/RIF may be used as a replacement test for usual practice (including conventional microscopy, culture or histopathology) for testing specific non-respiratory specimens (lymph nodes and other tissues) from patients suspected of having extrapulmonary TB (conditional recommendation, very low quality evidence).

Xpert MTB/RIF and Xpert Ultra as initial tests in adults and children with signs and symptoms of pulmonary TB

  1. In adults with signs and symptoms of pulmonary TB, Xpert MTB/RIF should be used as an initial diagnostic test for TB and for rifampicin-resistance detection rather than smear microscopy/culture and DST (strong recommendation, high certainty of evidence for test accuracy and moderate certainty of evidence for patient-important outcomes).
  2. In adults with signs and symptoms of pulmonary TB without a prior history of TB (<5 years since end of treatment) or with a remote history of TB treatment (>5 years since end of treatment), Xpert Ultra should be used as the initial diagnostic test for TB and for rifampicin-resistance detection rather than smear microscopy/culture (strong recommendation, high certainty of evidence for test accuracy).
  3. In adults with signs and symptoms of pulmonary TB and a prior history of TB with an end of treatment within the past 5 years, Xpert Ultra may be used as the initial diagnostic test for TB and for rifampicin-resistance detection rather than smear microscopy/culture (conditional recommendation, low certainty of evidence for test accuracy).
  4. In children with signs and symptoms of pulmonary TB, Xpert MTB/RIF should be used as the initial diagnostic test for TB rather than smear microscopy/culture in sputum (moderate certainty of evidence in test accuracy), gastric aspirate (low certainty of evidence for test accuracy), nasopharyngeal aspirate (moderate certainty of evidence for test accuracy), or stool (low certainty of evidence for test accuracy) specimens (strong recommendation).
  5. In children with signs and symptoms of pulmonary TB, Xpert Ultra should be used as the initial diagnostic test for TB rather than smear microscopy/culture in sputum (low certainty of evidence in test accuracy) and nasopharyngeal aspirate (very low certainty of evidence for test accuracy) specimens (strong recommendation).
  1. Strong recommendation for use of Xpert MTB/RIF as an initial test for TB and rifampicin resistance in all adults and children with signs and symptoms of pulmonary TB.
  2. Xpert Ultra is now recommended as an initial test for TB and rifampicin resistance in all adults and children with signs and symptoms of pulmonary TB.
  3. In children, recommended use of Xpert MTB/RIF is expanded to gastric aspirate, nasopharyngeal aspirate and stool. Use of Xpert Ultra is expanded to nasopharyngeal aspirate.

Using Xpert MTB/RIF to diagnose pulmonary TB and rifampicin resistance in adults and children

  1. Xpert MTB/RIF should be used rather than conventional microscopy, culture and DST as the initial diagnostic test in adults suspected of having MDR-TB or HIV-associated TB (strong recommendation, high-quality evidence).
  2. Xpert MTB/RIF should be used rather than conventional microscopy, culture and DST as the initial diagnostic test in children suspected of having MDR-TB or HIV-associated TB (strong recommendation, very low quality evidence).

Xpert MTB/RIF and Xpert Ultra as initial tests in adults and children with signs and symptoms of extrapulmonary TB

  1. In adults and children with signs and symptoms of TB meningitis, Xpert MTB/RIF or Xpert Ultra should be used in CSF as an initial diagnostic test for TB meningitis (strong recommendation, moderate certainty of evidence for test accuracy for Xpert MTB/RIF, low certainty of evidence for Xpert Ultra).
  2. In adults and children with signs and symptoms of extrapulmonary TB, Xpert MTB/RIF may be used in lymph node aspirate, lymph node biopsy, pleural fluid, peritoneal fluid, pericardial fluid, synovial fluid or urine specimens as the initial diagnostic test for the corresponding form of extrapulmonary TB (conditional recommendation, moderate certainty of evidence for test accuracy for pleural fluid; low for lymph node aspirate, peritoneal fluid, synovial fluid, urine; very low for pericardial fluid, lymph node biopsy).
  3. In adults and children with signs and symptoms of extrapulmonary TB, Xpert Ultra may be used in lymph node aspirate and lymph node biopsy as the initial diagnostic test (conditional recommendation, low certainty of evidence).
  4. In adults and children with signs and symptoms of extrapulmonary TB, Xpert MTB/RIF or Xpert Ultra should be used for rifampicin-resistance detection rather than culture and DST (strong recommendation, high certainty of evidence for test accuracy for Xpert MTB/RIF; low certainty of evidence for Xpert Ultra).
  5. In HIV-positive adults and children with signs and symptoms of disseminated TB, Xpert MTB/RIF may be used in blood as a diagnostic test for disseminated TB (conditional recommendation, very low certainty of evidence for test accuracy).
  1. Improved certainty of evidence for test accuracy for Xpert MTB/RIF when used in CSF as an initial diagnostic test for TB meningitis.
  2. Low certainty of evidence for Xpert Ultra when used in CSF as an initial diagnostic test for TB meningitis.
  3. Use of Xpert MTB/RIF in lymph node aspirate, lymph node biopsy, pleural fluid, peritoneal fluid, pericardial fluid, synovial fluid or urine specimens as the initial diagnostic test for the corresponding form of extrapulmonary TB.
  4. Use of Xpert Ultra in lymph node aspirate, lymph node biopsy specimens as the initial diagnostic test for the corresponding form of extrapulmonary TB.
  5. Use of Xpert Ultra for rifampicin-resistance detection in adults and children with signs and symptoms of extrapulmonary TB.
  6. Use of Xpert MTB/RIF in blood for diagnosis of disseminated TB.

Xpert MTB/RIF and Xpert Ultra repeated testing in adults and children with signs and symptoms of pulmonary TB

  1. In adults with signs and symptoms of pulmonary TB who have an Xpert Ultra trace positive result on the initial test, repeated testing with Ultra may not be used (conditional recommendation, very low certainty of evidence for test accuracy).
  2. In children with signs and symptoms of pulmonary TB in settings with pretest probability below 5% and an Xpert MTB/RIF negative result on the initial test, repeated testing with Xpert MTB/RIF in sputum, gastric fluid, nasopharyngeal aspirate or stool specimens may not be used (conditional recommendation, low certainty of evidence for test accuracy for sputum and very low for other specimen types).
  3. In children with signs and symptoms of pulmonary TB in settings with pretest probability 5% or more and an Xpert MTB/RIF negative result on the initial test, repeated testing with Xpert MTB/RIF (for a total of two tests) in sputum, gastric fluid, nasopharyngeal aspirate and stool specimens may be used (conditional recommendation, low certainty of evidence for test accuracy for sputum and very low for other specimen types).
  4. In children with signs and symptoms of pulmonary TB in settings with pretest probability below 5% and an Xpert Ultra negative result on the initial test, repeated testing with Xpert Ultra in sputum or nasopharyngeal aspirate specimens may not be used (conditional recommendation, very low certainty of evidence for test accuracy).
  5. In children with signs and symptoms of pulmonary TB in settings with pretest probability 5% or more and an Xpert Ultra negative result on the initial test, one repeated Xpert Ultra test (for a total of two tests) in sputum and nasopharyngeal aspirate specimens may be used (conditional recommendation, very low certainty of evidence for test accuracy).
  1. Repeated Xpert Ultra testing is not recommended in adults who have an Xpert Ultra trace positive result on the initial test.
  2. Repeated Xpert MTB/RIF testing is not recommended in children in low prevalence settings.
  3. Repeated Xpert MTB/RIF testing is recommended in children in high prevalence settings in sputum, gastric fluid, nasopharyngeal aspirate and stool specimens.
  4. Repeated Xpert Ultra testing is not recommended in children in low prevalence settings, and is recommended in high prevalence settings, in sputum and nasopharyngeal aspirate specimens.

Xpert MTB/RIF and Xpert Ultra as initial tests for pulmonary TB in adults in the general population either with signs and symptoms of TB or chest radiograph with lung abnormalities or both

  1. In adults in the general population who had either signs or symptoms of TB or chest radiograph with lung abnormalities or both, the Xpert MTB/RIF or Xpert Ultra may replace culture as the initial test for pulmonary TB (conditional recommendation, low certainty of the evidence in test accuracy for Xpert MTB/RIF and moderate certainty for Xpert Ultra).
  2. In adults in the general population who had either a positive TB symptom screen or chest radiograph with lung abnormalities or both, one Xpert Ultra test may be used rather than two Xpert Ultra tests as the initial test for pulmonary TB (conditional recommendation, very low certainty of evidence for test accuracy).
Conditional recommendation on use of Xpert MTB/RIF or Xpert Ultra for individual case management in individuals with radiographic abnormalities (but not in surveys estimating burden of disease).

CSF: cerebrospinal fluid; DST: drug susceptibility testing; HIV: human immunodeficiency virus; MDR-TB: multidrug-resistant tuberculosis; TB: tuberculosis.

Truenat MTB, MTB Plus and MTB-RIF Dx assays

New molecular assays – the Truenat MTB, MTB Plus and MTB-RIF Dx assays (Molbio Diagnostics, Goa, India), hereafter referred to as Truenat – were developed in India, and may be used at the same health system level as Xpert MTB/RIF. Of the above-mentioned assays, MTB and MTB Plus are used as initial diagnostic tests for TB, whereas MTB-RIF Dx is used as a reflex test to detect rifampicin resistance for those with positive results on the initial Truenat tests. Multisite international evaluations in settings of intended use are being implemented by the Foundation for Innovative New Diagnostics (FIND), a WHO collaborating centre for the evaluation of new diagnostic technologies. Given the similarity of the operational characteristics for Xpert MTB/RIF and Truenat, the results of the latter study were reviewed within the same GDG meeting.

Recommendations

Recommendations on Truenat MTB, MTB Plus and Truenat MTB-RIF Dx in adults and children with signs and symptoms of pulmonary TB
  1. In adults and children with signs and symptoms of pulmonary TB, the Truenat MTB or MTB Plus may be used as an initial diagnostic test for TB rather than smear microscopy/culture.
    (Conditional recommendation, moderate certainty of evidence for test accuracy)
  2. In adults and children with signs and symptoms of pulmonary TB and a Truenat MTB or MTB Plus positive result, Truenat MTB-RIF Dx may be used as an initial test for rifampicin resistance rather than culture and phenotypic DST.
    (Conditional recommendation, very low certainty of evidence for test accuracy)
Remarks

For recommendation 1: The recommendation includes patients who are smear negative. There is uncertainty about the use of these assays in PLHIV. In smear-negative patients, the sensitivity is lower than in all patients. The indirect data on test accuracy in smear-negative patients (given that there are no data on PLHIV for this version of Truenat) made it possible to extrapolate this recommendation to PLHIV. However, the certainty of evidence for test accuracy would need to be lowered to account for additional indirectness. In the case of children, there were no data available to assess the accuracy of the test in different specimens, and not enough indirect evidence to extrapolate for specimens other than sputum. This recommendation is extrapolated to children for sputum, although the tests are expected to be less sensitive in children.

For recommendation 2: Truenat applies a reflex (two-step) test for rifampicin resistance. Hence, the recommendation for Truenat MTB-RIF Dx is only applicable for those patients with positive Truenat MTB or MTB Plus results.

Test descriptions

The new molecular assays – the Truenat MTB, MTB Plus and MTB-RIF Dx assays – developed in India, may be used at the same health system level as Xpert MTB/RIF. This policy focuses on the following Molbio devices and diagnostic tests (13):

  • Trueprep Auto DNA extraction system;
  • Truelab DuoDx and Truelab QuattroDx micro-PCR machines;
  • Truelab MTB chip;
  • Truelab MTB Plus chip; and
  • Truelab MTB-RIF Dx chip.

The Truenat MTB and MTB Plus assays and the rifampicin-resistance detection reflex assay (Truenat MTB-RIF Dx) (Molbio Diagnostics, India) use real-time micro-PCR for detection of M. tuberculosis and rifampicin resistance in DNA extracted from a patient’s sputum specimen (Fig. 2.1.3). The assays use automated, battery-operated devices to extract, amplify and confirm the presence of specific genomic DNA loci, allowing for the rapid diagnosis of TB infections with minimal user input. These products are intended to be operated in peripheral laboratories with minimal infrastructure, and technicians with only minimal training can easily perform these tests routinely in their facilities and report results in under 1 hour. Moreover, with these devices, PCR testing can also be initiated at the field level, on-site.

If the Truenat MTB assay result is positive, the user may then take another aliquot of extracted DNA and run the MTB-RIF Dx assay, to detect the presence of selected rifampicin-resistance-associated mutations. The diagnostic performance of these assays has been previously evaluated in microscopy centres in India (13), but a larger assessment of the operational characteristics and acceptability of the technology is needed in intended settings of use to confirm assay performance.
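The two-step (reflex) sequence described above can be sketched as a short decision function. This is a minimal illustration only: the function name `run_truenat_workflow` and its parameters are hypothetical, chosen for this example, and do not correspond to any Molbio software interface.

```python
from typing import Optional


def run_truenat_workflow(mtb_detected: bool,
                         rif_resistance_detected: Optional[bool] = None) -> str:
    """Summarize a two-step (reflex) testing sequence.

    mtb_detected: result of the initial Truenat MTB or MTB Plus test.
    rif_resistance_detected: result of the reflex MTB-RIF Dx test, or None
        if the reflex test has not been run (hypothetical encoding).
    """
    if not mtb_detected:
        # The reflex test applies only after a positive initial result.
        return "MTB not detected; MTB-RIF Dx not indicated"
    if rif_resistance_detected is None:
        return "MTB detected; run MTB-RIF Dx on another aliquot of extracted DNA"
    if rif_resistance_detected:
        return "MTB detected; rifampicin resistance detected"
    return "MTB detected; rifampicin resistance not detected"
```

The sketch makes one point explicit: a rifampicin result can only exist for specimens that were positive on the initial test, which is why the second recommendation is limited to patients with positive Truenat MTB or MTB Plus results.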

Fig. 2.1.3. Molbio equipment to run the Truenat MTB, MTB Plus and MTB-RIF Dx assays: (a) Trueprep instrument for sample preparation, (b) Truelab Uno Dx real-time PCR instrument for running the tests, and (c) chip for real-time PCR. PCR: polymerase chain reaction.

Justification and evidence

The evidence on the use of the Truenat MTB, MTB Plus and MTB-RIF Dx system was generated by multisite international evaluations in settings of intended use, implemented by FIND.

The PICO questions were designed to form the basis for the evidence search, retrieval and analysis.

Box 2.1.2. PICO questions and subquestions

PICO 1: Among people with signs and symptoms of pulmonary TB, seeking care at health care facilities, should Molbio Truenat MTB, MTB Plus and MTB-RIF Dx be used as an initial test for diagnosis of pulmonary TB and rifampicin resistance?

1.1. What is the diagnostic accuracy of Truenat MTB to diagnose pulmonary TB in adults with signs and symptoms of pulmonary TB, as compared with MRS?

1.2. What is the diagnostic accuracy of Truenat MTB Plus to diagnose pulmonary TB in adults with signs and symptoms of pulmonary TB, as compared with MRS?

1.3. What is the diagnostic accuracy of Truenat MTB-RIF Dx to diagnose rifampicin resistance in adults with signs and symptoms of pulmonary TB, as compared with MRS?

Additional questions

  1. What are the comparative cost, affordability and cost–effectiveness of implementation of Truenat MTB, MTB Plus and MTB-RIF Dx systems?
  2. Are there implications for feasibility, accessibility, patient equity and human rights from the implementation of Truenat MTB, MTB Plus and MTB-RIF Dx systems?

The evaluation study of Truenat was carried out in 19 clinical sites (each with a microscopy centre attached) and seven reference laboratories in four countries. The diagnostic accuracy of the assays was evaluated when performed in the intended settings of use (i.e. microscopy centres), against microbiological confirmation (culture) as the reference standard. As part of this assessment, the performance of the Truenat assays was also compared to Xpert MTB/RIF or Xpert Ultra, on the same specimens, in reference laboratories.

The certainty of the evidence was assessed consistently through PICO questions, using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach, which produces an overall quality assessment (or certainty) of evidence and a framework for translating evidence into recommendations. The certainty of the evidence is rated as high, moderate, low or very low. These four categories “imply a gradient of confidence in the estimates” (10). In the GRADE approach, even if diagnostic accuracy studies are of observational design, they start as high-quality evidence.
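The downgrading logic of GRADE can be illustrated with a small sketch. It assumes the standard convention that a "serious" concern in a domain lowers the rating by one level and a "very serious" concern by two. The names `LEVELS`, `PENALTY` and `grade_certainty` are hypothetical, and in practice the final rating is a panel judgement rather than a mechanical calculation.

```python
# Certainty levels in ascending order, as listed in the text above.
LEVELS = ["very low", "low", "moderate", "high"]

# Assumed convention: levels subtracted per concern in a given domain.
PENALTY = {"none": 0, "serious": 1, "very serious": 2}


def grade_certainty(concerns: dict) -> str:
    """Start at 'high' and move down according to the concerns raised,
    never going below 'very low'."""
    start = LEVELS.index("high")
    total = sum(PENALTY[severity] for severity in concerns.values())
    return LEVELS[max(0, start - total)]
```

For example, `grade_certainty({"imprecision": "serious", "inconsistency": "serious"})` returns `"low"`, and adding a "very serious" concern in a third domain brings the rating to the floor of `"very low"`.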

At least two review authors independently completed the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) assessments. Any disagreements were resolved through discussion or consultation with a third review author.

Data synthesis was structured around the preset PICO questions listed below. Details of the study included in the current analysis are given in Web Annex 1.2: Truenat MTB, MTB Plus and MTB-RIF Dx. A summary of the results and details of the evidence quality assessment are available in Web Annex 2.2: Truenat MTB, MTB Plus and MTB-RIF Dx.

PICO 7: Among people with signs and symptoms of pulmonary TB, seeking care at health care facilities, should Molbio Truenat MTB, MTB Plus and MTB-RIF Dx be used as an initial test for diagnosis of pulmonary TB and rifampicin resistance?

6.4 What is the diagnostic accuracy of Truenat MTB to diagnose pulmonary TB in adults with signs and symptoms of pulmonary TB, as compared with MRS?

Evidence for the use of Truenat MTB, MTB Plus and MTB-RIF Dx assays to diagnose pulmonary TB and rifampicin resistance in adults was generated through a multicentre prospective clinical evaluation study implemented by FIND. The study was conducted at 19 clinical sites (each with a microscopy centre attached) and seven reference laboratories in four countries. The aim was to determine the diagnostic accuracy of the Truenat assays when performed in the intended settings of use (i.e. microscopy centres), relative to microbiological confirmation (culture) as the reference standard. The performance of the Truenat assays was also compared head-to-head (on the same specimens) to Xpert or Ultra in reference laboratories, as part of this assessment. All sites performed Xpert except for sites in Peru, which performed Ultra. The analysis for Truenat MTB reported on the results for 1336 participants. Serious concerns were expressed for imprecision and inconsistency of evidence related to sensitivity. Overall, the certainty of evidence was judged to be low for sensitivity but high for specificity.

6.5 What is the diagnostic accuracy of Truenat MTB Plus to diagnose pulmonary TB in adults with signs and symptoms of pulmonary TB, as compared with MRS?

The analysis for Truenat MTB Plus reported on the results for 1336 participants. Serious concerns were expressed for imprecision for sensitivity, related to the few participants contributing to the analysis. Overall, the certainty of evidence was judged to be low for sensitivity and high for specificity.

6.6 What is the diagnostic accuracy of Truenat MTB-RIF Dx to diagnose rifampicin resistance in adults with signs and symptoms of pulmonary TB, as compared with MRS?

The analysis for Truenat MTB-RIF Dx reported on the results for 186 participants. For sensitivity, there were serious concerns about indirectness (India and Peru contributed most of the data to the determination of rifampicin resistance) and inconsistency (variable sensitivity estimates: 100% for Peru, based on seven rifampicin-resistant specimens; 100% for Ethiopia, based on one rifampicin-resistant specimen; 100% for Papua New Guinea, based on one rifampicin-resistant specimen; and 81% for India, based on 42 rifampicin-resistant specimens). These results may not be applicable to other settings. In addition, very serious concerns were expressed for imprecision, owing to the small number of participants contributing to this analysis. Overall, the certainty of evidence was judged to be very low for sensitivity. Serious concerns were expressed for indirectness for specificity, related to the low numbers of rifampicin-resistant cases and the fact that most of them were from India and Peru.
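The imprecision concern can be made concrete by computing a confidence interval for each country's sensitivity estimate. The sketch below uses the Wilson score interval, which is a choice made for this illustration and not necessarily the method used in the review. The counts for Peru, Ethiopia and Papua New Guinea follow from the reported 100% sensitivity; the count for India (34 of 42) is an assumption inferred from the reported 81%.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (about 95% for z = 1.96)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half), min(1.0, centre + half)


# (detected, total) rifampicin-resistant specimens per country.
# India's detected count is an assumption inferred from the reported 81%.
sites = {
    "Peru": (7, 7),
    "Ethiopia": (1, 1),
    "Papua New Guinea": (1, 1),
    "India": (34, 42),
}

for country, (detected, total) in sites.items():
    low, high = wilson_interval(detected, total)
    print(f"{country}: {detected}/{total} = {detected / total:.0%} "
          f"(95% CI {low:.0%} to {high:.0%})")
```

With a single resistant specimen, a point estimate of 100% is compatible with a true sensitivity as low as about 21%, and with seven specimens as low as about 65%, which shows why small numbers led to very serious concerns about imprecision.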

Web Annex 4.7: Report on the diagnostic accuracy of the Molbio Truenat tuberculosis and rifampicin-resistance assays in the intended setting of use.

Performance of the molecular assays

Table 2.1.19. PICO 1: What is the diagnostic accuracy of Molbio Truenat MTB for pulmonary TB in adults, as compared with MRS?

Table 2.1.20. PICO 2: What is the diagnostic accuracy of Molbio Truenat MTB Plus for pulmonary TB in adults, as compared with MRS?

Table 2.1.21. PICO 3: What is the diagnostic accuracy of Molbio Truenat MTB-RIF Dx for rifampicin resistance in adults, as compared with MRS?

Cost–effectiveness analysis

This section deals with the following additional question:

What are the comparative cost, affordability and cost–effectiveness of implementation of Truenat MTB, MTB Plus and MTB-RIF Dx systems?

A systematic review was carried out, focusing on economic evaluations of molecular-based tests for the diagnosis of active TB including the novel Molbio Truenat MTB test. The objective of the review was to summarize current economic evidence and further understand the costs, cost–effectiveness and affordability of these molecular tests for TB diagnosis.

Only one study assessing the cost–effectiveness of Molbio’s Truenat MTB was identified. This study suggests that Truenat MTB is likely to be cost-effective if implemented at the point of care in India. However, the study relies on several important modelling assumptions, including improved linkage to care and increased treatment initiation; these assumptions should be evaluated in pragmatic trials (as has been done for Xpert MTB/RIF implementation in South Africa).

Caution should be used when generalizing cost–effectiveness and economic evaluations across settings. Local implementation conditions and settings should be taken into account, and local implementation studies may be helpful to assess the likely impact on case finding, long-term outcomes and cost–effectiveness.

More details on economic evaluation of Truenat MTB, MTB Plus and MTB-RIF Dx systems are available in Web Annex 4.5: Systematic literature review of economic evidence for molecular assays intended as initial tests for the diagnosis of pulmonary and extrapulmonary TB in adults and children.

User perspective

This section deals with the following question:

Are there implications for patient values, feasibility, accessibility and equity from the implementation of Truenat MTB, MTB Plus and MTB-RIF Dx systems?

The available results of the qualitative research were based on Xpert MTB/RIF and Xpert MTB/RIF Ultra, mainly from the patient and policymaking perspectives (see User perspective for Xpert MTB/RIF and Xpert MTB/RIF Ultra, above). Although the largely qualitative evidence from Xpert MTB/RIF and Xpert MTB/RIF Ultra was judged to be applicable to the Truenat tests, caution should be used when generalizing the conclusions, because specific characteristics of the technology (i.e. diagnostic accuracy and use in particular patient populations) may be different. Additionally, particularities of supply chain and maintenance relevant for programme staff and managers may also differ. In general, caution should be applied when generalizing findings across settings. The Truenat implementation trial provided information on use of the test from the laboratory technicians' perspective. Results of the trial showed that the test was generally considered acceptable and feasible by laboratory staff, although some noted that the test is new and more complex to perform than Xpert MTB/RIF.

More details on qualitative evaluation of Truenat MTB, MTB Plus and MTB-RIF Dx systems are available in Web Annex 4.7: Report on the diagnostic accuracy of the Molbio Truenat tuberculosis and rifampicin-resistance assays in the intended setting of use.

Research priorities

  • Operational research to ensure that tests are used optimally in settings of intended use.
  • Evaluation of the diagnostic accuracy of Truenat (MTB, MTB Plus and MTB-RIF Dx) in specific patient populations, such as PLHIV and former TB patients, for pulmonary and extrapulmonary TB in adults and children.

Moderate complexity automated NAATs for detection of TB and resistance to rifampicin and isoniazid

Rapid detection of TB and rifampicin resistance is increasingly available as new technologies are developed and adopted by countries. However, what has also emerged is the relatively high burden of isoniazid-resistant, rifampicin-susceptible TB that is often undiagnosed. Globally, isoniazid-resistant, rifampicin-susceptible TB is estimated to occur in 13.1% (95% CI: 9.9–16.9%) of new cases and 17.4% (95% CI: 0.5–54.0%) of previously treated cases (14).

A new class of technologies has come to market with the potential to address this gap. Several manufacturers have developed moderate complexity automated NAATs for detection of TB and resistance to rifampicin and isoniazid on high throughput platforms for use in laboratories. The tests belonging to this class are faster and less complex to perform than phenotypic culture-based drug susceptibility testing (DST) and line probe assays (LPA). They have the advantage of being largely automated following the sample preparation step. Moderate complexity automated NAATs may be used as an initial test for detection of TB and resistance to both first-line TB drugs simultaneously (rifampicin and isoniazid). They offer the potential for the rapid provision of accurate results (important to patients) and for testing efficiency where high volumes of tests are required daily (important to programmes). Hence, these technologies are suited to areas with a high population density and rapid sample referral systems.

Recommendation

In people with signs and symptoms of pulmonary TB, moderate complexity automated NAATs may be used on respiratory samples for the detection of pulmonary TB, and of rifampicin and isoniazid resistance, rather than culture and phenotypic DST.

Conditional recommendation, moderate certainty of evidence for diagnostic accuracy

There are several subgroups to be considered for this recommendation:

  • The recommendation is based on evidence of diagnostic accuracy in respiratory samples of adults with signs and symptoms of pulmonary TB.
  • The recommendation applies to people living with HIV (studies included a varying proportion of such individuals); performance on smear-negative samples was reviewed but was only available for TB detection, not for rifampicin and isoniazid resistance, and data stratified by HIV status were not available.
  • The recommendation applies to adolescents and children based on the generalization of data from adults; an increased rate of indeterminate results may be found with paucibacillary TB disease in children.
  • The review did not consider extrapolation of the finding for use in people with extrapulmonary TB and testing on non-sputum samples because data on diagnostic accuracy of technologies in the class for non-sputum samples were limited.

Test descriptions

Abbott Molecular has two NAATs for TB: one for detection of Mycobacterium tuberculosis (Mtb) (RealTime MTB test), and one for detection of both rifampicin and isoniazid resistance (RealTime MTB RIF/INH). TB detection targets both the IS6110 genetic element and the pab gene. The rifampicin and isoniazid resistance test uses eight dye-labelled probes to detect variants in the rifampicin resistance determining region (RRDR) of the rpoB gene and four probes to detect isoniazid resistance, with two probes each for the katG and inhA genes. The company reports a limit of detection (LoD) of 17 colony forming units (cfu)/mL for the RealTime MTB assay and of 60 cfu/mL for the RealTime MTB RIF/INH assay (14–16). The test is performed on the m2000 platform, using the m2000sp for automated DNA extraction and the m2000rt for real-time PCR.

Fig. 2.1.4. m2000sp RealTime system (a) and RealTime MTB Amplification Reagent Kit (b).

Becton Dickinson (BD) has a multiplexed real-time PCR (BD MAX™ MDR-TB) NAAT for the detection of Mtb and resistance to both rifampicin and isoniazid. The test is performed on a platform that uses five-colour detection (14). For Mtb detection, this test targets the multicopy genomic elements IS6110 and IS1081, as well as a single-copy genomic target. To detect resistance to rifampicin, the test targets the RRDR codons 507–533 Escherichia coli nomenclature (426–452 Mtb nomenclature) of the rpoB gene; for detection of resistance to isoniazid, the test targets both the inhA promoter region and the 315 codon of the katG gene. The LoD reported by the company is 0.5 cfu/mL for Mtb detection and 6 cfu/mL for resistance detection. The test is performed on the BD MAX platform, with the DNA automatically extracted and real-time PCR performed.
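The two codon numbering systems quoted above for the rpoB RRDR differ by a constant offset, which a short sketch can make explicit. The offset of 81 is derived from the ranges given in the text (507 − 426 = 533 − 452 = 81); the function names are hypothetical and used only for this illustration.

```python
# Offset between E. coli and M. tuberculosis rpoB codon numbering,
# derived from the ranges quoted in the text (507 - 426 = 81).
RPOB_OFFSET = 81


def ecoli_to_mtb(codon: int) -> int:
    """Convert an rpoB codon number from E. coli to Mtb nomenclature."""
    return codon - RPOB_OFFSET


def mtb_to_ecoli(codon: int) -> int:
    """Convert an rpoB codon number from Mtb to E. coli nomenclature."""
    return codon + RPOB_OFFSET


def in_rrdr(mtb_codon: int) -> bool:
    """True if the codon (Mtb nomenclature) lies within codons 426 to 452."""
    return 426 <= mtb_codon <= 452
```

For example, `ecoli_to_mtb(531)` gives 450, so a mutation reported at codon 531 in E. coli nomenclature is the same position as codon 450 in Mtb nomenclature and falls inside the targeted region.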

Fig. 2.1.5. BD MAX™ System (a) and BD MAX PCR Cartridges (b).

Bruker-Hain Diagnostics has two real-time NAATs: the FluoroType® MTB, which detects Mtb, and the FluoroType MTBDR, which detects Mtb and rifampicin and isoniazid resistance. These platforms are completely independent of the GenoType MTBDR platforms. The FluoroType MTBDR test uses asymmetric excess PCR and light on/off probes. The target genes are rpoB for detection of TB and rifampicin resistance and the inhA promoter and katG gene to detect isoniazid resistance. The LoD reported by the company is 15 cfu/mL for the FluoroType MTB test and 20 cfu/mL for the FluoroType MTBDR assay (14, 17, 18). For DNA extraction, manual (FluoroLyse) and automated (GenoXtract) options are available. The platforms used for amplification and detection are FluoroCycler® for the MTB assay and FluoroCycler XT for the MTBDR assay.

Fig. 2.1.6. The FluoroType® MTB (a) and FluoroType MTBDR® (b) test principles.

Roche Diagnostics (Roche) has two NAATs: the cobas® MTB assay to detect Mtb, and the cobas® MTB-RIF/INH assay to detect drug resistance (rifampicin and isoniazid) (14). The cobas MTB assay targets both the 16S ribosomal RNA (rRNA) and esx genes for Mtb detection. The LoD reported by the company for this test was 7.6–8.8 cfu/mL. Rifampicin resistance is detected by targeting the RRDR, and isoniazid resistance by targeting the inhA promoter region and the katG gene. The tests are run on the cobas 6800/8800 systems, with the DNA automatically extracted and real-time PCR performed.

Fig. 2.1.7. cobas® 6800 or 8800 system (a) and cobas® MTB Positive Control Kit (b).

Table 2.1.22. Mycobacterium genomic regions targeted by the different assays for TB detection included in the evaluation.

In the moderate complexity class, an automated test is one that has (a) automated DNA extraction, (b) automated PCR preparation and (c) automated result interpretation, with either no pipetting steps or only one pipetting step between (a) and (c). These automated tests may require an initial manual specimen treatment step before the test material is transferred into the sample processing tube. Tests in the moderate complexity category require medical laboratories with biosafety measures in place and test-specific equipment; they also need well-trained, skilled and qualified laboratory staff to set up the tests and carry out the necessary equipment maintenance.
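The definition of an automated test given above can be read as a simple checklist. The sketch below encodes it as a predicate; the class and field names are hypothetical and only restate criteria (a) to (c) and the limit of one pipetting step.

```python
from dataclasses import dataclass


@dataclass
class AssayWorkflow:
    """Hypothetical description of a test workflow, for illustration only."""
    automated_dna_extraction: bool         # criterion (a)
    automated_pcr_preparation: bool        # criterion (b)
    automated_result_interpretation: bool  # criterion (c)
    pipetting_steps_between_a_and_c: int   # manual pipetting steps from (a) to (c)


def is_automated(workflow: AssayWorkflow) -> bool:
    """Apply the definition in the text: (a), (b) and (c) must all be automated,
    with no more than one pipetting step between (a) and (c)."""
    return (
        workflow.automated_dna_extraction
        and workflow.automated_pcr_preparation
        and workflow.automated_result_interpretation
        and workflow.pipetting_steps_between_a_and_c <= 1
    )


example = AssayWorkflow(True, True, True, pipetting_steps_between_a_and_c=1)
print(is_automated(example))
```

An initial manual specimen treatment step before transfer into the sample processing tube is not counted here, in line with the text.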

Justification and evidence

The WHO Global TB Programme initiated an update of the current guidelines and commissioned a systematic review on the use of moderate complexity automated NAATs for detection of TB and resistance to rifampicin and isoniazid in people with signs and symptoms of TB.

Three PICO questions were designed to form the basis for the evidence search, retrieval and analysis:

  1. Should moderate complexity automated NAATs be used on respiratory samples in people with signs and symptoms of pulmonary TB for detection of pulmonary TB, as compared with culture?
  2. Should moderate complexity automated NAATs be used on respiratory samples in people with signs and symptoms of pulmonary TB for detection of resistance to rifampicin, as compared with culture-based phenotypic DST?
  3. Should moderate complexity automated NAATs be used on respiratory samples in people with signs and symptoms of pulmonary TB for detection of resistance to isoniazid, as compared with culture-based phenotypic DST?

A comprehensive search of the following databases (PubMed, Embase, BIOSIS, Web of Science, LILACS and Cochrane) for relevant citations was performed. The search was restricted to the period January 2009 to July 2020. Reference lists from included studies were also searched. No language restriction was applied. Because there were few studies for the selected index tests, the diagnostic companies were contacted for reports of their internal validation data. Studies were also included from the WHO public call for submission of data. Mycobacterial culture was used as the reference standard for evaluation of Mtb detection. Resistance detection was compared with a phenotypic DST reference standard and a composite reference standard (that combines phenotypic and genotypic DST results) in studies where both had been performed.

Bivariate random-effects meta-analyses were performed using Stata software, to obtain pooled sensitivity and specificity estimates with 95% CIs for rifampicin resistance, isoniazid resistance and Mtb detection. Where only a limited number of studies were available, descriptive analyses were conducted.

For meta-analysis, studies were first meta-analysed separately for each test. Studies from all the tests were then used to obtain a pooled estimate for all technologies.

To decide whether pooling of all the tests would give meaningful estimates, various criteria for pooling were developed and agreed upon by the GDG panel before they were applied. Data were also evaluated and visualized using head-to-head comparisons of the tests with Xpert® MTB/RIF or any other WHO-recommended test.

Data for all the index platforms were only pooled to answer PICO questions if they met the preconditions given in Table 2.1.23 and fulfilled either Condition 1 or Condition 2.

Table 2.1.23. Criteria for pooling studies on moderate complexity automated NAATs.

The certainty of the evidence of the pooled studies was assessed systematically through PICO questions, using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach (19, 20). The GRADE approach produces an overall quality assessment (or certainty) of evidence and has a framework for translating evidence into recommendations; also, under this approach, even if diagnostic accuracy studies are of observational design, they start as high-quality evidence.

GRADEpro Guideline Development Tool software (19) was used to generate summary of findings tables. The quality of evidence was rated as high (not downgraded), moderate (downgraded one level), low (downgraded two levels) or very low (downgraded more than two levels), based on five factors: risk of bias, indirectness, inconsistency, imprecision and other considerations. The quality (certainty) of evidence was downgraded by one level when a serious issue was identified and by two levels when a very serious issue was identified in any of the factors used to judge the quality of evidence.
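The downgrading rule described above amounts to simple arithmetic: each serious issue removes one level and each very serious issue removes two, starting from high. The following is a minimal sketch of that rule only; the function name and the example judgements are hypothetical, and actual GRADE assessments involve panel judgement that this does not capture.

```python
# Certainty levels in order, indexed by the total number of levels downgraded.
LEVELS = ["high", "moderate", "low", "very low"]

# Levels removed per judgement, as described in the text.
DOWNGRADE = {"not serious": 0, "serious": 1, "very serious": 2}


def grade_certainty(judgements: dict) -> str:
    """Map per-factor judgements to an overall certainty rating.

    Diagnostic accuracy evidence starts as high; more than two levels of
    downgrading is rated very low.
    """
    total = sum(DOWNGRADE[judgement] for judgement in judgements.values())
    return LEVELS[min(total, 3)]


# Illustrative judgements only, not taken from the evidence tables.
example = {
    "risk of bias": "serious",
    "indirectness": "not serious",
    "inconsistency": "not serious",
    "imprecision": "not serious",
    "other considerations": "not serious",
}
print(grade_certainty(example))
```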

Data synthesis was structured around the three preset PICO questions, as outlined below. Three web annexes4 give additional information, as follows:

  • details of studies included in the current analysis (Web Annex 1.3: Moderate complexity automated NAATs);
  • a summary of the results and details of the evidence quality assessment (Web Annex 2.3: Moderate complexity automated NAATs); and
  • a summary of the GDG panel judgements (Web Annex 3.3: Moderate complexity automated NAATs).

PICO 1: Should moderate complexity automated NAATs be used on respiratory samples in people with signs and symptoms of pulmonary TB for detection of pulmonary TB, as compared with culture?

A total of 29 studies with 13 852 specimens provided data for evaluating TB detection from the five index tests (Fig. 2.1.8). Of these 29 studies, 12 were conducted on the Abbott RealTime MTB test, six on FluoroType MTB, four on FluoroType MTBDR, five on BD MAX and two on the cobas MTB test. The reference standard for each of these studies for TB detection was mycobacterial culture.

Of the 29 studies, 16 (55%) had high or unclear risk of bias because they tested specimens before inclusion in the study, used convenience sampling or did not report the method of participant selection. Thus, the evidence was downgraded one level for risk of bias. Overall, the certainty of the evidence was moderate for sensitivity and high for specificity.

Fig. 2.1.8. Forest plot of included studies for TB detection with culture as the reference standard. CI: confidence interval; FN: false negative; FP: false positive; TB: tuberculosis; TN: true negative; TP: true positive.

The overall sensitivity in these 29 studies ranged from 79% to 100%, and the specificity from 60% to 100%. The pooled sensitivity was 93.0% (95% CI: 90.9–94.7%) and the pooled specificity was 97.7% (95% CI: 95.6–98.8%).
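To make the pooled estimates more concrete, they can be converted into expected numbers of true and false results in a notional group of 1000 people tested. The sketch below uses the pooled sensitivity and specificity reported above; the prevalence of 10% is an assumption chosen purely for illustration and does not come from the review.

```python
def expected_per_1000(sensitivity: float, specificity: float,
                      prevalence: float, n: int = 1000) -> dict:
    """Expected true/false positives and negatives among n people tested."""
    with_tb = n * prevalence
    without_tb = n - with_tb
    true_pos = sensitivity * with_tb
    true_neg = specificity * without_tb
    return {
        "TP": round(true_pos, 1),
        "FN": round(with_tb - true_pos, 1),
        "TN": round(true_neg, 1),
        "FP": round(without_tb - true_neg, 1),
    }


# Pooled estimates from the text; prevalence is a hypothetical value.
print(expected_per_1000(sensitivity=0.930, specificity=0.977, prevalence=0.10))
```

With these inputs, roughly 7 of 100 people with culture-confirmed TB would be missed and about 21 of 900 people without TB would receive a false-positive result; both figures change with the assumed prevalence.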

PICO 2: Should moderate complexity automated NAATs be used on respiratory samples in people with signs and symptoms of pulmonary TB for detection of resistance to rifampicin, as compared with culture-based phenotypic DST?

A total of 18 studies with 2874 specimens provided data for resistance testing of rifampicin using moderate complexity automated NAATs (Fig. 2.1.9). Of these 18 studies, nine were conducted on the Abbott RealTime RIF/INH test, three on FluoroType MTBDR, four on BD MAX and two on the cobas MTB-RIF/INH test. The reference standard for each of these studies for resistance detection was phenotypic DST, and a composite reference standard with both phenotypic DST and sequencing results.

Eight (44%) of the 18 studies had high or unclear risk of bias because they did not report participant selection or tested specimens before inclusion in the study.

Fig. 2.1.9. Forest plot of included studies for rifampicin resistance detection with phenotypic DST as the reference standard. CI: confidence interval; DST: drug susceptibility testing; FN: false negative; FP: false positive; TB: tuberculosis; TN: true negative.

The overall sensitivity for rifampicin resistance in these 18 studies ranged from 88% to 100% and the specificity from 98% to 100%. The pooled sensitivity was 96.7% (95% CI: 93.1–98.4%) and the pooled specificity was 98.9% (95% CI: 97.5–99.5%).

In determining rifampicin resistance, the results from genetic sequencing (genotypic DST) were obtained where possible, and a composite reference standard was developed that combined the results from phenotypic and genotypic DST. For rifampicin resistance detection, the diagnostic test accuracy of moderate complexity automated NAATs was similar for phenotypic DST and the composite reference standard.

PICO 3: Should moderate complexity automated NAATs be used on respiratory samples in people with signs and symptoms of pulmonary TB for detection of resistance to isoniazid, as compared with culture-based phenotypic DST?

A total of 18 studies with 1758 specimens provided data for resistance testing of isoniazid using moderate complexity automated NAATs (Fig. 2.1.10). Of these 18 studies, nine were conducted on the Abbott RealTime RIF/INH test, three on FluoroType MTBDR, four on BD MAX and two on the cobas MTB-RIF/INH test. The reference standard for each of these studies for resistance detection was phenotypic DST, and a composite reference standard with both phenotypic DST and sequencing results.

Eight (44%) of the 18 studies had high or unclear risk of bias, because participant selection was not reported or prior testing was done on the included specimens.

Fig. 2.1.10. Forest plot of included studies for isoniazid resistance detection with phenotypic DST as the reference standard. CI: confidence interval; DST: drug susceptibility testing; FN: false negative; FP: false positive; RIF: rifampicin; TB: tuberculosis.

The overall sensitivity for isoniazid resistance in these 18 studies ranged from 58% to 100% and the specificity from 94% to 100%. The pooled sensitivity was 86.4% (95% CI: 82.1–89.8%) and the pooled specificity was 99.8% (95% CI: 98.3–99.8%).

In determining isoniazid resistance, the results from genetic sequencing (genotypic DST) were obtained where possible, and a composite reference standard was developed that combined the results from phenotypic and genotypic DST. For isoniazid resistance detection, the diagnostic test accuracy of moderate complexity automated NAATs was similar for phenotypic DST and the composite reference standard.

Cost–effectiveness analysis

This section answers the following additional question:

What is the comparative cost, affordability and cost–effectiveness of implementation of moderate complexity automated NAATs?

A systematic review was conducted, focusing on economic evaluations of moderate complexity automated NAATs. Four online databases (Embase, Medline, Web of Science and Scopus) were searched for new studies published from 1 January 2010 through 17 September 2020. The citations of all eligible articles, guidelines and reviews were reviewed for additional studies. Experts and test manufacturers were also contacted to identify any additional unpublished studies.

The objective of the review was to summarize current economic evidence and further understand the costs, cost–effectiveness and affordability of moderate complexity automated NAATs.

Several commercially available tests were included as eligible tests in the moderate complexity automated NAATs category; however, no published studies were identified assessing the costs or cost–effectiveness of any of those tests. One unpublished study comparing available data on two technologies from the moderate complexity automated NAATs class was identified, and the data from that study are described below.

Unpublished data from FIND were provided through direct communication. This costing-only study used time and motion studies combined with a bottom-up, ingredients-based approach to estimate the unit test cost for the two selected technologies.5 Time and motion studies were conducted at a reference-level laboratory in South Africa. Several important simplifying assumptions were made that may limit the generalizability of the results; for example, 50% of laboratory operations dedicated to TB, a minimum daily throughput of 24 samples or the equivalent of one BD MAX run (24 tests/run), equipment costs fixed at US$ 100 000 for both platforms, a 5% annual maintenance cost, the standard 3% discount rate and an expected useful life of 10 years.
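The equipment assumptions listed above can be turned into an annual cost using standard annuitization, a common convention in ingredients-based costing. The sketch below is an illustration under that assumption; the study's exact method is not described here, the function names are hypothetical, and the 50% allocation of laboratory operations to TB is not modelled.

```python
def annuity_factor(discount_rate: float, useful_life_years: int) -> float:
    """Present value of 1 unit paid at the end of each year of useful life."""
    return (1 - (1 + discount_rate) ** -useful_life_years) / discount_rate


def annual_equipment_cost(purchase_price: float, discount_rate: float,
                          useful_life_years: int, maintenance_share: float) -> float:
    """Annualized capital cost plus annual maintenance (share of purchase price)."""
    annualized_capital = purchase_price / annuity_factor(discount_rate, useful_life_years)
    annual_maintenance = purchase_price * maintenance_share
    return annualized_capital + annual_maintenance


# Values quoted in the text: US$ 100 000, 3% discount rate, 10 years, 5% maintenance.
cost = annual_equipment_cost(100_000, 0.03, 10, 0.05)
print(f"Approximate annual equipment cost: US$ {cost:,.0f}")
```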

Additional literature searches for economic data on similar platforms used in non-TB disease areas identified three additional studies from HIV and hepatitis C virus (HCV) with limited cost data: one (20) using Abbott RealTime HIV and two on HCV (21, 22). Data were limited to cost per unit test kit and are not transferable to test kit costs for the tests being considered in this review.

How large are the resource requirements (costs)?

Available unit test costs for the two moderate complexity automated NAATs were US$ 18.52 (US$ 13.79–40.70) and US$ 15.37 (US$ 9.61–37.40), with one study reporting cheaper per-test kit costs and higher operational costs associated with laboratory processing time. Equipment costs were strong drivers of cost variation and will vary across laboratory networks and operations. If equipment can be optimally placed or multiplexed to ensure high testing volume, the per-test cost can be minimized.

In one-way sensitivity analyses, annual testing volumes were varied from fewer than 5000 tests/year to more than 25 000 tests/year. Per-test cost was highly sensitive to testing volume when fewer than 5000 tests were conducted per year; however, unit test costs began to stabilize between 5000 and 10 000 tests/year and, above 10 000 tests/year, the unit cost estimate was robust. When equipment can be multiplexed and used at capacity, per-test cost can be minimized.
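The pattern described above follows from spreading fixed annual costs over the number of tests performed. The sketch below illustrates the shape of that relationship only; the fixed and variable cost figures are hypothetical placeholders, not values from the costing study.

```python
def unit_test_cost(annual_fixed_cost: float, variable_cost_per_test: float,
                   tests_per_year: int) -> float:
    """Per-test cost: fixed costs shared across volume, plus per-test consumables."""
    return annual_fixed_cost / tests_per_year + variable_cost_per_test


# Hypothetical inputs for illustration only.
ANNUAL_FIXED_COST = 16_000.0      # e.g. equipment and maintenance per year (US$)
VARIABLE_COST_PER_TEST = 10.0     # e.g. kit and consumables per test (US$)

for volume in (1_000, 2_500, 5_000, 10_000, 25_000):
    cost = unit_test_cost(ANNUAL_FIXED_COST, VARIABLE_COST_PER_TEST, volume)
    print(f"{volume:>6} tests/year: US$ {cost:.2f} per test")
```

Because the fixed-cost share falls in proportion to 1/volume, the per-test cost changes steeply at low volumes and flattens as volume rises, which is consistent with the qualitative finding reported above.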

What is the certainty of the evidence of resource requirements (costs)?

Available per-test cost data were unpublished but did include overheads, equipment, building, staff and consumable costs; however, complete quality assessment of the study was not possible. Test cost will vary according to testing volume and laboratory operations. There is limited evidence to assess the important variability across sites, countries and implementation approaches.

Does the cost–effectiveness of the intervention favour the intervention or the comparison?

No studies were identified that assessed cost–effectiveness for any of the moderate complexity automated NAATs, and extrapolation was not appropriate given differences in standard of care, care cascades and associated costs, operational conditions, testing volume and diagnostic accuracy. Implementation considerations (e.g. test placement, laboratory network and ability of the programme to initiate treatment quickly) are all likely to affect unit test cost and cost–effectiveness. Economic modelling is needed across various settings to understand the range of cost–effectiveness profiles of moderate complexity automated NAATs, and how they are likely to vary under different operational criteria.

Additional details on economic evidence synthesis and analysis are provided in Web Annex 4.9: Systematic literature review of economic evidence for NAATs to detect TB and DR-TB in adults and children.

User perspective

This section answers the following questions about key informants’ views and perspectives on the use of moderate complexity automated NAATs:

  • Is there important uncertainty about or variability in how much end-users value the main outcomes?
  • What would be the impact on health equity?
  • Is the intervention acceptable to key stakeholders?
  • Is the intervention feasible to implement?

User perspectives on the value, feasibility, usability and acceptability of diagnostic technologies are important in the implementation of such technologies. If the perspectives of laboratory personnel, clinicians, patients and TB programme personnel are not considered, the technologies risk being inaccessible to and underused by those for whom they are intended.

To address questions related to user perspective, two activities were undertaken:

  • A systematic review of evidence on user perspectives and experiences with NAATs for detection of TB and TB drug resistance (moderate and low complexity automated assays, and high complexity hybridization-based assays) was undertaken from July to November 2020.
  • A total of 14 semi-structured interviews with clinicians, programme officers, laboratory staff and patient advocates were conducted in India, Moldova and South Africa from October to November 2020.

The findings from these activities are discussed below.

Systematic review

A total of 27 studies were identified that met inclusion criteria, of which 21 were sampled for inclusion in the analysis. All of the sampled studies were published between 2012 and 2020. Of the 21 included studies, 18 were located in high TB burden countries: six in India, four in South Africa, two each in Kenya and Uganda, and one each in Brazil, Cambodia, Myanmar and Viet Nam. One study covered projects in nine countries (Bangladesh, Cambodia, Democratic Republic of the Congo, Kenya, Malawi, Moldova, Mozambique, Nepal and Pakistan). In addition, there was one study located in Eswatini, one in Mongolia and one in Nepal. All studies focused on Xpert MTB/RIF, except for one that focused on Xpert MTB/RIF Ultra (Xpert Ultra).

A summary of the core characteristics of studies included in this review is presented in a study characteristics table in Web Annex 4.10: User perspectives on NAATs to detect TB and DR-TB: results from qualitative evidence synthesis: systematic review.

Interviews

The aim of the interviews was to understand participants’ experiences of using the various technologies (i.e. NAATs for detection of TB and TB drug resistance) and their general TB diagnostic experiences. The three countries – India, Moldova and South Africa – were selected because they are on WHO’s list of 30 high MDR-TB burden countries (2) and because the index tests have been used to some extent in research contexts within these countries. Due to the short time frame, participants were purposively sampled and approached based on convenience through personal contacts and colleagues.

An overview of the participants is given in Table 2.1.24. To mask the identity of study participants they were coded by their country (Moldova [M], India [I] or South Africa [S]), their profession (clinician or medical doctor [M], patient advocate/representative [R], laboratory personnel [L] or programme officers [P]) and a number.

Table 2.1.24. Overview of participants for the end-users’ interviews.

Interviews were conducted using Zoom, Skype or phone. Topics discussed included:

  • current approach to diagnosing TB, MDR-TB and extensively drug-resistant TB (XDR-TB), including specific challenges;
  • experiences with using molecular TB diagnostics and the index tests specifically, including details on steps taken in the diagnostic process;
  • experiences with determining eligibility and treatment initiation, and challenges and benefits of using the index tests;
  • overall usefulness of the index tests;
  • the feasibility of implementing the index tests;
  • the potential impact of the index tests on health equity; and
  • how the potential impact of the index tests relates to current policy context.

Several important limitations of this approach were noted. Only a few participants were interviewed per country. Owing to the use of Zoom, Skype or phone for interviews, it was not possible to triangulate interview data with other evidence commonly collected through ethnographic approaches (e.g. multiple interviews and informal conversations at the same facility, observations or site visits). In addition, only some of the participants had personal experience with one or all of the index tests, and those participants who did have experience with the tests had used them in research settings rather than for routine practice.

More details on these interviews are given in Web Annex 4.11: User perspectives on nucleic acid amplification tests for tuberculosis and tuberculosis drug resistance: Interviews study.

Findings of the review and interviews

The main findings of the systematic review and interviews are given below. Where information is from the review, a level of confidence in the qualitative evidence synthesis (QES) is given; where it is from interviews, this is indicated with ‘Interviews’.

Is there important uncertainty about or variability in how much end-users value the main outcomes?

  • Patients in high burden TB settings value:

    getting an accurate diagnosis and reaching diagnostic closure (finally knowing “what is wrong with me”);

    avoiding diagnostic delays because they exacerbate existing financial hardships and emotional and physical suffering, and make patients feel guilty for infecting others (especially children);

    having accessible facilities; and

    reducing diagnosis-associated costs (e.g. travel, missing work) as important outcomes of the diagnostic.

    QES: moderate confidence

  • Moderate complexity automated NAATs meet several preferences and values of clinicians and laboratory staff, in that they:

    are faster than culture-based phenotypic DST (similar to LPA or cartridge-based tests);

    have the advantage of being automated (unlike LPA); and

    provide additional clinically relevant drug-resistance information such as high versus low resistance (unlike the current Xpert MTB/RIF cartridge).

    Interviews

What would be the impact on health equity?

  • Various factors – for example, lengthy diagnostic delays, underuse of diagnostics, lack of TB diagnostic facilities at lower levels and too many eligibility restrictions – hamper access to prompt and accurate testing and treatment, particularly for vulnerable groups.
    QES: high confidence
  • Staff and managers voiced concerns about:

    sustainability of funding and maintenance;

    complex conflicts of interest between donors and implementers; and

    the strategic and equitable use of resources, which negatively affects creating equitable access to cartridge-based diagnostics.

    QES: high confidence

  • Access to clear and comprehensible information for TB patients on what TB diagnostics are available to them and how to interpret results is a vital component of equity, and lack of such access represents an important barrier for patients.
    Interviews
  • New treatment options need to be matched with new diagnostics. It is important to improve access to treatment based on new diagnostics and to improve access to diagnostics for new treatment options.
    Interviews
  • The speed at which WHO guidelines are changing does not match the speed at which many country programmes are able to implement the guidelines. This translates into differential access to new TB diagnostics and treatment:

    between countries (i.e. between those that can and cannot quickly keep up with the rapidly changing TB diagnostic environment); and

    within countries (i.e. between patients who can and cannot afford the private health system that is better equipped to quickly adopt new diagnostics and policies).

    Interviews

  • The identified challenges with the use of NAATs for detection of TB and DR-TB, and accumulated delays, risk compromising the added value as identified by the users, ultimately leading to underuse. The challenges also hamper access to prompt and accurate testing and treatment, particularly for vulnerable groups.
    QES: high confidence

Is the intervention acceptable to key stakeholders?

  • Patients can be reluctant to test for TB or MDR-TB because of:

    stigma related to MDR-TB or having interrupted treatment in the past;

    fears of side-effects;

    failure to recognize symptoms;

    inability to produce sputum; and

    cost, distance and travel concerns related to (repeat) clinic visits.

    QES: high confidence

  • Health workers can be reluctant to test for TB or MDR-TB because of:

    TB-associated stigma and consequences for their patients;

    fear of acquiring TB;

    fear of supervisors’ reactions when reclassifying patients already on TB treatment who turn out to be misclassified;

    fear of side-effects of drugs in children; and

    community awareness of disease manifestations in children.

    QES: high confidence

  • In relation to the acceptability of moderate complexity automated NAATs:

    the automation of this class of technologies, which recognizes the high workload of laboratory staff, improves their acceptability;

    in terms of the physical size of the platform and how it fits into the laboratory space and workflow, a smaller footprint may be more acceptable; and

    the number of samples run on the system is acceptable provided that the platform is placed within a laboratory that receives a sufficient sample load to run the system.

    Interviews

Is the intervention feasible to implement?

  • The feasibility of all diagnostic technologies is challenged if there is an accumulation of diagnostic delays or underuse (or both) at every step in the process, mainly because of health system factors such as:

    non-adherence to testing algorithms, testing for TB or MDR-TB late in the process, empirical treatment, false negatives due to technology failure, large sample volumes and staff shortages, poor or delayed sample transport and sample quality, poor or delayed communication of results, delays in scheduling follow-up visits and recalling patients, and inconsistent recording of results;

    lack of sufficient resources and maintenance (i.e. stock-outs; unreliable logistics; lack of funding, electricity, space, air conditioners and sputum containers; dusty environment; and delayed or absent local repair option);

    inefficient or unclear workflows and patient flows (e.g. inefficient organizational processes, poor links between providers, and unclear follow-up mechanisms or information on where patients need to go); and

    lack of data-driven and inclusive national implementation processes.

    QES: high confidence

  • The feasibility of moderate complexity automated NAATs is also challenged by:

    how or whether the platform fits into the physical space of the laboratory (considering bench size and weight of the platform) and sample workflow;

    a poorly functioning sample transport system that affects the quality of samples; and

    the need to ensure that clinicians and laboratory staff have time to communicate effectively regarding diagnostic results if the platform is centralized, while also ensuring that the laboratory location is central enough to receive adequate numbers of samples to make the machine worth running.

    Interviews

  • Implementation of new diagnostics must be accompanied by training for clinicians to help them interpret results from new molecular tests and understand how this information is translated into prompt and proper patient management. In the past, with the introduction of Xpert MTB/RIF, this has been a challenge.
    QES: high confidence and interviews
  • Introduction of new diagnostics must be accompanied by guidelines and algorithms that support clinicians and laboratories in communicating with each other, such that they can discuss discordant results and interpret laboratory results in the context of drug availability, patient history and patient progress on a current drug regimen.
    Interviews

Implementation considerations

Factors to consider when implementing moderate complexity automated NAATs for detection of TB and resistance to rifampicin and isoniazid are as follows:

  • local epidemiological data on resistance prevalence should guide local testing algorithms, whereas pretest probability is important for the clinical interpretation of test results;
  • the cost of a test varies depending on parameters such as the number of samples in a batch and the staff time required; therefore, a local costing exercise should be performed;
  • low, moderate and high complexity tests have successively greater needs for technical competency (qualifications and skills) and staff time, which affects planning and budgeting;
  • availability and timeliness of local support services and maintenance should be considered when selecting a provider;
  • laboratory accreditation and compliance with a robust quality management system (including appropriate quality control) are essential for sustained service excellence and trust;
  • training of both laboratory and clinical staff is needed to ensure effective delivery of services and clinical impact;
  • use of connectivity solutions for communication of results is encouraged, to improve efficiency of service delivery and reduce time to treatment initiation;
  • moderate complexity automated NAATs may already be used programmatically for other diseases – for example, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), HIV and antimicrobial resistance (AMR) – which could potentially facilitate implementation of TB testing on shared platforms;
  • implementation of moderate complexity automated NAATs requires laboratories with the required infrastructure, space and efficient sample referral systems;
  • although these are automated tests, well-trained skilled staff are needed to set up assays and complete maintenance requirements; and
  • implementation of these tests should be context specific; thus, it should take into account access issues, especially in remote areas, where less centralized WHO-recommended technologies may be more appropriate.
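The first consideration above, that pretest probability matters for clinical interpretation, can be illustrated with predictive values. The sketch below applies Bayes' rule to the pooled sensitivity and specificity for rifampicin resistance reported in this section; the two prevalence values are hypothetical and chosen only to show the contrast between settings.

```python
def predictive_values(sensitivity: float, specificity: float,
                      pretest_probability: float) -> tuple:
    """Return (PPV, NPV) for a given pretest probability of resistance."""
    true_pos = sensitivity * pretest_probability
    false_pos = (1 - specificity) * (1 - pretest_probability)
    true_neg = specificity * (1 - pretest_probability)
    false_neg = (1 - sensitivity) * pretest_probability
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Pooled accuracy for rifampicin resistance from the text;
# the pretest probabilities are hypothetical.
for pretest in (0.02, 0.20):
    ppv, npv = predictive_values(0.967, 0.989, pretest)
    print(f"pretest {pretest:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

Under these assumptions, a resistant result is considerably more likely to be a false positive where resistance is rare than where it is common, which is one reason local epidemiological data are relevant to testing algorithms.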

Research priorities

Research priorities for moderate complexity automated NAATs for detection of TB and resistance to rifampicin and isoniazid are as follows:

  • diagnostic accuracy in specific patient populations (e.g. children, people living with HIV, and patients with signs and symptoms of extrapulmonary TB) and in non-sputum samples;
  • impact of diagnostic technologies on clinical decision-making and outcomes that are important to patients (e.g. cure, mortality, time to diagnosis and time to start treatment) in all patient populations;
  • impact of specific mutations on treatment outcomes among people with DR-TB;
  • use, integration and optimization of diagnostic technologies in the overall landscape of testing and care, as well as diagnostic pathways and algorithms;
  • economic studies evaluating the costs, cost–effectiveness and cost–benefit of different diagnostic technologies;
  • qualitative studies evaluating equity, acceptability, feasibility and end-user values of different diagnostic technologies;
  • effect of non-actionable results (indeterminate, non-determinate or invalid) on diagnostic accuracy and outcomes that are important to patients;
  • operational research on the advantages and disadvantages of individual technologies within the class of moderate complexity automated NAATs;
  • effect of moderate complexity automated NAATs in fostering collaboration and integration between disease programmes; and
  • the potential utility of detecting katG resistance to identify MDR-TB clones that may be missed because they do not have an RRDR mutation (e.g. the Eswatini MDR-TB clone, which has both the katG S315T and the non-RRDR rpoB I491F mutation).

2.2. Initial diagnostic tests for diagnosis of TB without drug-resistance detection

Loop-mediated isothermal amplification

A commercial molecular assay, the Loopamp Mycobacterium tuberculosis complex (MTBC) detection kit (Eiken Chemical Company, Tokyo, Japan), is based on the loop-mediated isothermal amplification (LAMP) reaction. Referred to as TB-LAMP, this is a manual assay that requires less than 1 hour to perform and can be read with the naked eye under UV light. Because it requires little infrastructure and is relatively easy to use, TB-LAMP is being explored for use as a rapid diagnostic test that would be an alternative to smear microscopy in resource-limited settings. LAMP methods have been used to detect malaria and several neglected tropical diseases.

In 2012, WHO convened a GDG on TB-LAMP recognizing that it is a manual molecular test to detect TB and could feasibly be implemented in peripheral-level microscopy laboratories with adequately trained laboratory technicians. The advantages of TB-LAMP are that it has a relatively high throughput, does not require sophisticated instruments, and has biosafety requirements similar to those of sputum-smear microscopy. Since 2012, some 20 additional studies in 17 countries have been conducted. WHO convened a GDG meeting in January 2016 to review evidence from a systematic review and meta-analysis of data from individual participants in these studies.

Recommendations

  1. TB-LAMP may be used as a replacement test for sputum-smear microscopy for diagnosing pulmonary TB in adults with signs and symptoms consistent with TB.
    (Conditional recommendation, very low quality evidence)
  2. TB-LAMP may be used as a follow-on test to smear microscopy in adults with signs and symptoms consistent with pulmonary TB, especially when further testing of sputum smear-negative specimens is necessary.
    (Conditional recommendation, very low quality evidence)
Remarks
  1. These recommendations apply to settings where conventional sputum-smear microscopy can be performed.
  2. TB-LAMP should not replace the use of rapid molecular tests that detect TB and resistance to rifampicin, especially among populations at risk of MDR-TB.
  3. The test has limited additional diagnostic value over sputum-smear microscopy for testing PLHIV who have signs and symptoms consistent with TB.
  4. These recommendations apply only to the use of TB-LAMP in testing sputum specimens from patients with signs and symptoms consistent with pulmonary TB.
  5. These recommendations are extrapolated to using TB-LAMP in children, based on the generalization of data from adults, while acknowledging the difficulties of collecting sputum specimens from children.

Test description

The amplification reaction requires four types of primers, which are complementary to six regions of the target gene. At about 65 °C, double-stranded DNA is in a condition of dynamic equilibrium. One of the LAMP primers can anneal to the complementary sequence of double-stranded target DNA, initiating DNA synthesis with the polymerase; strand displacement activity then displaces and releases a single-stranded DNA. Owing to the complementarity of the 5′-end of the forward inner primer (known as FIP) and the backward inner primer (BIP) in nearby regions of the target amplicon, loop structures are formed. This allows variously sized structures, consisting of alternately inverted repeats of the target sequence on the same strand, to be formed in rapid succession.

The addition of loop primers, which contain sequences complementary to the single-stranded loop region on the 5′-end of the hairpin structure, speeds the reaction by providing a greater number of starting points for DNA synthesis. Using loop primers, amplification by 10⁹–10¹⁰ times can be achieved within 15–30 minutes. The version of TB-LAMP that was evaluated includes loop primers for a total of six primers binding to eight locations. This requirement for homogeneous sequences at multiple binding sites preserves the specificity of the assay, even in the absence of a probe.
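As a rough illustration of the amplification figures quoted above, the sketch below converts a billion-fold to ten-billion-fold increase into an equivalent number of doublings and an average time per doubling. The doubling framing is an assumption made only for intuition: LAMP is a continuous isothermal reaction rather than a cycled one, and the function names are hypothetical.

```python
import math


def equivalent_doublings(fold_amplification: float) -> float:
    """Number of doublings that would yield the stated fold amplification."""
    return math.log2(fold_amplification)


def minutes_per_doubling(fold_amplification: float, minutes: float) -> float:
    """Average time per equivalent doubling over the stated reaction time."""
    return minutes / equivalent_doublings(fold_amplification)


# Ranges quoted in the text: 1e9 to 1e10-fold within 15 to 30 minutes.
for fold in (1e9, 1e10):
    for minutes in (15, 30):
        print(
            f"{fold:.0e}-fold in {minutes} min: "
            f"{equivalent_doublings(fold):.1f} doublings, "
            f"{minutes_per_doubling(fold, minutes):.2f} min per doubling"
        )
```

Under this simplification, the quoted range corresponds to roughly 30 to 33 doublings, each taking about half a minute to one minute.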

The LAMP method is relatively insensitive to the accumulation of DNA and DNA byproducts (pyrophosphate salts), so the reaction proceeds until large amounts of amplicon are generated. This feature makes it possible to visually detect successful amplification using double-stranded DNA-binding dyes, such as SYBR green, by detecting the turbidity caused by precipitating magnesium pyrophosphate or by using a non-inhibitory fluorescing reagent that is quenched in the presence of divalent cations. Fig. 2.2.1 shows calcein, unquenched by pyrophosphate consumption of divalent cations, fluorescing under UV light. The turbid, fluorescent product is easily seen with the naked eye.

Fig. 2.2.1. Visual display of TB-LAMP results under UV light. LAMP: loop-mediated isothermal amplification; TB: tuberculosis; UV: ultraviolet.

The test procedure has three main steps (Fig. 2.2.2):

  1. Sample preparation – bacteria are heat treated for inactivation and lysis. This step also includes the extraction of DNA.
  2. Amplification – the sample is placed in a heating block at 67 °C. At this temperature, the polymerase enzyme amplifies the target DNA.
  3. Visualization – the test-tube contains a double-stranded DNA-binding molecule that will fluoresce under UV light, meaning that detection can easily be performed with the naked eye.

Fig. 2.2.2. Description of the workflow for TB-LAMP. LAMP: loop-mediated isothermal amplification; TB: tuberculosis.

Justification and evidence

The evidence reviewed and this policy guidance apply only to the use of the commercial TB-LAMP manual assay. In accordance with WHO’s standards for assessing evidence when formulating policy recommendations, the GRADE approach was used. GRADE provides a structured framework to determine the quality of the evidence and to provide information on the strength of the recommendations, using PICO questions agreed by the GDG. PICO refers to the following four elements that should be included in questions that govern a systematic search of the evidence: the population targeted by the action or intervention (in the case of systematic reviews of the accuracy of diagnostic tests, P is the population of interest), the intervention (I is the index test), the comparator (C is the comparator test or tests) and the outcomes (O is usually sensitivity and specificity). The PICO questions for the review are given in Box 2.2.1.

Box 2.2.1. PICO questions addressed by the GDG

  1. What is the diagnostic accuracy of TB-LAMP for detecting pulmonary TB in adults when TB-LAMP is used as a replacement test for sputum-smear microscopy compared with culture as a reference standard? (Results were stratified by HIV status.)
  2. What is the diagnostic accuracy of TB-LAMP for detecting pulmonary TB in adults when TB-LAMP is used as an add-on test following negative sputum-smear microscopy compared with culture as a reference standard?
  3. What is the difference in diagnostic accuracy between TB-LAMP and the Xpert MTB/RIF assay (Cepheid, Sunnyvale, USA) for detecting pulmonary TB in reference to mycobacterial culture among all adults?
  4. What is the proportion of indeterminate or invalid results when TB-LAMP is used to detect pulmonary TB among all adults and among HIV-positive adults?
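The outcomes named in these PICO questions (sensitivity, specificity and the proportion of indeterminate or invalid results) can be sketched as simple ratios from a 2×2 comparison of the index test against the culture reference standard. The class name, function name and all counts below are hypothetical and are not data from the review.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Index test results cross-classified against the reference standard."""
    true_pos: int   # index test positive, reference standard TB
    false_pos: int  # index test positive, reference standard not TB
    false_neg: int  # index test negative, reference standard TB
    true_neg: int   # index test negative, reference standard not TB

    def sensitivity(self) -> float:
        return self.true_pos / (self.true_pos + self.false_neg)

    def specificity(self) -> float:
        return self.true_neg / (self.true_neg + self.false_pos)


def non_actionable_proportion(non_actionable: int, total_tests: int) -> float:
    """Share of tests that returned an indeterminate or invalid result."""
    return non_actionable / total_tests


# Hypothetical counts, for illustration only.
example = TwoByTwo(true_pos=80, false_pos=10, false_neg=20, true_neg=190)
print(f"sensitivity = {example.sensitivity():.2f}")
print(f"specificity = {example.specificity():.2f}")
print(f"indeterminate or invalid = {non_actionable_proportion(6, 300):.3f}")
```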

The review included all prospective studies that evaluated the use of TB-LAMP on sputum samples from adults with signs and symptoms consistent with pulmonary TB that were conducted in settings with an intermediate or high burden of TB. Twenty studies were identified, including all studies that were directly conducted by FIND or funded through FIND following a request for applications. Study participants who could not be classified as TB-positive or TB-negative based on the reference standard definitions described below were excluded.

The mycobacterial culture reference standards listed below were used to classify TB status. Eligible studies performed one or more sputum cultures on solid media (Löwenstein–Jensen) or on liquid media using the BACTEC™ mycobacterial growth indicator tube (MGIT; Becton Dickinson, Franklin Lakes, USA), or on both liquid and solid media. To account for the different number of cultures performed by studies and the different number of culture results available for participants, three hierarchical culture-based reference standards were used to assess diagnostic accuracy.

Standard 1 comprised:

  • TB: at least one positive culture confirmed to be MTBC by speciation testing.
  • Not TB: no positive and at least two negative cultures performed on two different sputum samples.

Standard 2 comprised:

  • TB: at least one positive culture confirmed to be MTBC by speciation testing.
  • Not TB: no positive and at least two negative cultures performed on at least one sputum sample.

Standard 3 comprised:

  • TB: at least one positive culture confirmed to be MTBC by speciation testing.
  • Not TB: no positive and at least one negative culture.

Across the three standards, there is an expected trade-off between the yield of a confirmed TB diagnosis (highest with Standard 1 and lowest with Standard 3) and the number of studies or participants included in the analysis (lowest with Standard 1 and highest with Standard 3). Thus, using Standard 1, the potential for false negative index test results is highest and for false positive index test results is lowest. Also, using Standard 1, the number of studies and study participants included is expected to be lowest because it excludes studies that performed only one culture, and study participants for whom only one negative culture result was available due to culture contamination; in contrast, using Standard 3, the number of studies and study participants is highest.
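The three hierarchical reference standards can be sketched as a single classification function. This is a minimal illustration, not the review's actual analysis code: the data layout, outcome labels and function name are assumptions, and only three culture outcomes are modelled.

```python
from typing import List, Optional, Tuple

# One culture result: (sputum sample identifier, outcome).
# Outcomes modelled in this sketch: "mtbc_positive", "negative", "contaminated".
Culture = Tuple[str, str]


def classify(cultures: List[Culture], standard: int) -> Optional[str]:
    """Return "TB", "Not TB", or None when the participant is unclassifiable."""
    if standard not in (1, 2, 3):
        raise ValueError("standard must be 1, 2 or 3")

    # All three standards share the same definition of TB.
    if any(outcome == "mtbc_positive" for _, outcome in cultures):
        return "TB"

    negative_samples = [s for s, outcome in cultures if outcome == "negative"]
    if standard == 1:
        # At least two negative cultures from two different sputum samples.
        meets = len(set(negative_samples)) >= 2
    elif standard == 2:
        # At least two negative cultures, from one or more sputum samples.
        meets = len(negative_samples) >= 2
    else:
        # At least one negative culture.
        meets = len(negative_samples) >= 1
    return "Not TB" if meets else None


# Hypothetical participant: two negative cultures from the same sample.
participant = [("sample-A", "negative"), ("sample-A", "negative")]
for standard in (1, 2, 3):
    print(standard, classify(participant, standard))
```

The hypothetical participant is unclassifiable under Standard 1 but counts as "Not TB" under Standards 2 and 3, which shows why the less strict standards retain more participants.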

Of the 4760 adults eligible for inclusion in the analysis, 1810 participants (38%) across seven studies qualified for Standard 1 status, 3110 participants (65%) across 10 studies qualified for Standard 2 and 4596 participants (97%) across 13 studies qualified for Standard 3 (Table 2.2.1).

The performance of the test was calculated using the three different reference standards for the following scenarios:

  1. TB-LAMP as a replacement for sputum-smear microscopy;
  2. TB-LAMP as a replacement for sputum-smear microscopy among PLHIV;
  3. TB-LAMP as an add-on test for sputum-smear microscopy negative individuals; and
  4. TB-LAMP in head-to-head comparison with Xpert MTB/RIF.

Table 2.2.1. TB-LAMP as a replacement test for smear microscopy: estimates of pooled sensitivity and specificity.

Details of studies included in the current analysis are given in Web Annex 1.4: TB-LAMP. Summary of the results and details of the evidence quality assessment are available in Web Annex 2.4: TB-LAMP.

Cost–effectiveness analysis

For the cost analysis, a bottom-up micro-costing analysis was conducted – the aim being to identify, measure and value all resources relevant to providing TB-LAMP and the Xpert MTB/RIF assay as routine diagnostic tests in peripheral laboratories in Malawi and Viet Nam. The two TB-LAMP strategies (used as a replacement test for sputum-smear microscopy and as an add-on test to sputum-smear microscopy for further testing in smear-negative patients) were compared with the base case algorithm, with sputum-smear microscopy followed by clinical diagnosis in those patients with a negative microscopy result.

The weighted average per-test cost of TB-LAMP was US$ 13.78–16.22, and for the Xpert MTB/RIF assay it was US$ 19.17–28.34 when these tests were used as routine diagnostic tests at all peripheral-level laboratories in both countries. The first-year expenditure required for implementation at peripheral laboratories with a medium workload (10–15 sputum-smear microscopy tests per day) in Viet Nam was US$ 26 917 for TB-LAMP and US$ 43 325 for the Xpert MTB/RIF assay. These costs were about US$ 3000 lower in Malawi, because of lower operating and staff costs. Likewise, TB-LAMP was a considerably cheaper test to implement, accounting for 9.33% of the reported TB control budget for 2014 in Malawi and 17.2% in Viet Nam; in comparison, implementing the Xpert MTB/RIF assay accounted for 18% of the reported TB control budget in Malawi and 37% in Viet Nam. In the cost–effectiveness analyses, both of the TB-LAMP scenarios improved case-detection rates, and both strategies were cost effective when compared with WHO’s willingness-to-pay threshold levels.
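The cost comparison above can be restated as a few ratios. The sketch below uses only the US$ figures quoted in the paragraph; the variable names are illustrative, and comparing the ends of the per-test cost ranges is a simplification rather than the method used in the costing analysis.

```python
# Figures quoted in the text (US$).
first_year_vietnam = {"TB-LAMP": 26_917, "Xpert MTB/RIF": 43_325}
per_test_cost_range = {"TB-LAMP": (13.78, 16.22), "Xpert MTB/RIF": (19.17, 28.34)}

# Difference and ratio of first-year expenditure at a medium-workload laboratory.
difference = first_year_vietnam["Xpert MTB/RIF"] - first_year_vietnam["TB-LAMP"]
ratio = first_year_vietnam["TB-LAMP"] / first_year_vietnam["Xpert MTB/RIF"]

# Per-test cost of TB-LAMP relative to Xpert MTB/RIF at each end of the ranges.
low_ratio = per_test_cost_range["TB-LAMP"][0] / per_test_cost_range["Xpert MTB/RIF"][0]
high_ratio = per_test_cost_range["TB-LAMP"][1] / per_test_cost_range["Xpert MTB/RIF"][1]

print(f"First-year difference in Viet Nam: US$ {difference}")
print(f"TB-LAMP first-year cost as a share of Xpert MTB/RIF: {ratio:.1%}")
print(f"Per-test cost ratio, low end of ranges: {low_ratio:.2f}")
print(f"Per-test cost ratio, high end of ranges: {high_ratio:.2f}")
```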

The cost–effectiveness analysis findings demonstrate that TB-LAMP is potentially a cost-effective alternative to the base case of sputum-smear microscopy plus clinical diagnosis in settings where the Xpert MTB/RIF assay cannot be implemented because of the infrastructure requirements, including a continuous power supply. However, given the inability of TB-LAMP to detect RR-TB, and its suboptimal sensitivity for detecting TB among PLHIV, national policymakers must cautiously evaluate the operational feasibility and cost considerations before introducing this technology.

Implementation considerations

The systematic review supports the use of TB-LAMP as a replacement test for smear microscopy, for diagnosing pulmonary TB in countries with an intermediate or high burden of TB. However, the Xpert MTB/RIF assay should remain the preferred diagnostic test for anyone suspected of having TB, provided that there are sufficient resources and infrastructure to support its use, given the evidence, its ability to simultaneously identify rifampicin resistance and the fact that it is automated.

  • Several operational issues accompany the implementation of TB-LAMP; for example, the need for electricity, adequate storage and waste disposal, stock monitoring and temperature control in storage settings where temperatures exceed the manufacturer’s recommendation (currently 30 °C for TB-LAMP).
  • TB-LAMP is designed and has been evaluated to detect M. tuberculosis in sputum specimens. Its use with other samples (e.g. urine, serum, plasma, CSF or other body fluids) has not been adequately evaluated.
  • Adoption of TB-LAMP does not eliminate the need for smear microscopy, which should be used for monitoring the treatment of patients with drug-susceptible TB. However, the demand for conventional sputum microscopy may decrease in settings where TB-LAMP fully or partially replaces conventional sputum microscopy.
  • TB-LAMP should not replace the Xpert MTB/RIF assay because the latter simultaneously detects M. tuberculosis and rifampicin resistance, is automated and is relatively simple to perform.
  • In settings where the Xpert MTB/RIF assay cannot be implemented (e.g. because of an inadequate electric supply, or excessive temperatures, humidity or dust), TB-LAMP may be a plausible alternative.

Research priorities

  • Evaluation of diagnostic algorithms in different epidemiological and geographical settings and patient populations.
  • Conducting of more rigorous studies with higher quality reference standards (including multiple specimen types and extrapulmonary specimens) to improve confidence in specificity estimates.
  • Determination of training needs, and assessments of competency and quality.
  • Gathering of more evidence on the impact on TB treatment initiation, morbidity and mortality.
  • Performance of country-specific cost–effectiveness and cost–benefit analyses of targeted TB-LAMP use in different programmatic settings.
  • Meeting the Standards for Reporting Diagnostic Accuracy Studies (STARD) for future studies (23).

Lateral flow urine lipoarabinomannan assay

Tests based on the detection of the lipoarabinomannan (LAM) antigen in urine have emerged as potential point-of-care tests for TB. The currently available urinary LAM assays have suboptimal sensitivity, and are therefore not suitable as general diagnostic tests for TB. However, unlike traditional diagnostic methods, they demonstrate improved sensitivity for the diagnosis of TB among individuals coinfected with HIV. The estimated sensitivity is even greater in patients with low CD4 cell counts. The lateral flow urine LAM assay (LF-LAM) strip-test – the Alere Determine TB LAM Ag (USA), hereafter referred to as AlereLAM – is currently the only commercially available urinary LAM test that potentially could be used as a rule-in test for TB in patients with advanced HIV-induced immunosuppression, and facilitate the early initiation of anti-TB treatment.

Recommendations

In inpatient settings

WHO strongly recommends using LF-LAM to assist in the diagnosis of active TB in HIV-positive adults, adolescents and children:

  1. with signs and symptoms of TB (pulmonary and/or extrapulmonary)
    (strong recommendation, moderate certainty in the evidence about the intervention effects); or
  2. with advanced HIV disease¹⁵ or who are seriously ill¹⁶
    (strong recommendation, moderate certainty in the evidence about the intervention effects); or
  3. irrespective of signs and symptoms of TB and with a CD4 cell count of less than 200 cells/mm³
    (strong recommendation, moderate certainty in the evidence about the intervention effects).

In outpatient settings

WHO suggests using LF-LAM to assist in the diagnosis of active TB in HIV-positive adults, adolescents and children:

  1. with signs and symptoms of TB (pulmonary and/or extrapulmonary) or seriously ill
    (conditional recommendation, low certainty in the evidence about test accuracy); and
  2. irrespective of signs and symptoms of TB and with a CD4 cell count of less than 100 cells/mm³
    (conditional recommendation, very low certainty in the evidence about test accuracy).

In outpatient settings

WHO recommends against using LF-LAM to assist in the diagnosis of active TB in HIV-positive adults, adolescents and children:

  1. without assessing TB symptoms
    (strong recommendation, very low certainty in the evidence about test accuracy);
  2. without TB symptoms and unknown CD4 cell count or without TB symptoms and CD4 cell count greater than or equal to 200 cells/mm³
    (strong recommendation, very low certainty in the evidence about test accuracy); and
  3. without TB symptoms and with a CD4 cell count of 100–200 cells/mm³
    (conditional recommendation, very low certainty in the evidence about test accuracy).
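The inpatient and outpatient recommendations above can be read as a decision rule. The following is a minimal sketch under stated assumptions, not a clinical tool: the function names, return strings and the order in which overlapping criteria are checked are illustrative choices. In particular, it treats "symptoms not assessed" as taking precedence over the CD4 criterion in outpatients, and it treats a CD4 count of exactly 200 as falling in the "greater than or equal to 200" group. The helper follows the footnote definition of advanced HIV disease.

```python
from typing import Optional


def has_advanced_hiv(age_years: float, cd4: Optional[int],
                     who_stage: Optional[int]) -> bool:
    """Advanced HIV disease as defined in the footnote to the recommendations."""
    if age_years < 5:
        return True  # all children under 5 years are considered advanced
    low_cd4 = cd4 is not None and cd4 < 200
    return low_cd4 or who_stage in (3, 4)


def lf_lam_guidance(setting: str, tb_symptoms: Optional[bool],
                    seriously_ill: bool, advanced_hiv: bool,
                    cd4: Optional[int]) -> str:
    """Map a patient profile to the matching recommendation (illustrative).

    tb_symptoms is None when symptoms have not been assessed;
    cd4 is None when the CD4 cell count is unknown.
    """
    if setting == "inpatient":
        low_cd4 = cd4 is not None and cd4 < 200
        if tb_symptoms or advanced_hiv or seriously_ill or low_cd4:
            return "use (strong recommendation)"
        return "not addressed by these recommendations"
    if setting == "outpatient":
        if tb_symptoms or seriously_ill:
            return "use (conditional recommendation)"
        if tb_symptoms is None:
            # Assumed precedence: assess symptoms before applying CD4 criteria.
            return "do not use (strong recommendation): symptoms not assessed"
        if cd4 is not None and cd4 < 100:
            return "use (conditional recommendation)"
        if cd4 is None or cd4 >= 200:
            return "do not use (strong recommendation)"
        return "do not use (conditional recommendation)"
    raise ValueError("setting must be 'inpatient' or 'outpatient'")


# Hypothetical outpatient without TB symptoms and a CD4 count of 150.
print(lf_lam_guidance("outpatient", False, False, False, 150))
```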

¹⁵ For adults, adolescents, and children aged 5 years or more, “advanced HIV disease” is defined as a CD4 cell count of less than 200 cells/mm³ or a WHO clinical stage 3 or 4 event at presentation for care. All children with HIV aged under 5 years should be considered as having advanced disease at presentation.

¹⁶ “Seriously ill” is defined based on four danger signs: respiratory rate of more than 30/minute, temperature of more than 39 °C, heart rate of more than 120/minute and inability to walk unaided.

Remarks
  1. The reviewed evidence and recommendations apply to the use of AlereLAM only, because other in-house LAM-based assays have not been adequately validated or used outside limited research settings. Any new or generic LAM-based assay should be subject to adequate validation in the settings of intended use.
  2. All patients with signs and symptoms of pulmonary TB who are capable of producing sputum should submit at least one sputum specimen for Xpert MTB/RIF (Ultra) assay, as their initial diagnostic test. This also includes children and adolescents living with HIV who are able to provide a sputum sample.
  3. These recommendations also apply to adolescents and children living with HIV, based on generalization of data from adults, while acknowledging that there are very limited data for these population groups.
  4. LF-LAM should be used as an add-on to clinical judgement in combination with other tests; it should not be used as a replacement or triage test.

Test description

The urine-based LF-LAM AlereLAM is a commercially available point-of-care test for active TB (24). AlereLAM is an immunocapture assay that detects LAM antigen in urine; LAM is a lipopolysaccharide present in mycobacterial cell walls that is released from metabolically active or degenerating bacterial cells during TB disease (24, 25).

AlereLAM is performed manually by applying 60 μL of urine to the test strip (the white pad marked by the arrow symbols in Fig. 2.2.3 A) and incubating at room temperature for 25 minutes. The strip is then inspected by eye for visible bands. The intensity of any visible band on the test strip is graded by comparing it with the intensities of the bands on a manufacturer-supplied reference scale card (as shown in the example in Fig. 2.2.3 B).

Fig. 2.2.3. Alere Determine TB LAM Ag tests (AlereLAM): (a) individual test strip, and (b) reference card accompanying test strips to "grade" the test result and determine positivity.

AlereLAM is being considered as a diagnostic test that may be used in combination with existing tests for the diagnosis of HIV-associated TB.

Justification and evidence

WHO commissioned a systematic review to summarize the current scientific literature on the accuracy of AlereLAM for the diagnosis of TB in PLHIV as part of a WHO process to develop updated guidelines for the use of the AlereLAM assay.

The PICO questions shown in Box 2.2.2 were designed to form the basis for the evidence search, retrieval and analysis.

Box 2.2.2. PICO questions

  1. What is the diagnostic accuracy of LF-LAM for the diagnosis of TB in all HIV-positive adults and children with signs and symptoms of TB?
    • in inpatient settings (adults, adolescents and older children)
    • in outpatient settings (adults, adolescents and older children)
    • in all settings (adults, adolescents and older children)
    • in inpatient settings (children aged ≤5 years)
    • in outpatient settings (children aged ≤5 years)
    • in all settings (children aged ≤5 years)
  2. What is the diagnostic accuracy of LF-LAM for the diagnosis of TB in all HIV-positive adults and children irrespective of signs and symptoms of TB?
    • in inpatient settings (adults, adolescents and older children)
    • in outpatient settings (adults, adolescents and older children)
    • in all settings (adults, adolescents and older children)
    • in inpatient settings (children aged ≤5 years)
    • in outpatient settings (children aged ≤5 years)
    • in all settings (children aged ≤5 years)
  3. What is the diagnostic accuracy of LF-LAM for the diagnosis of TB in adults with advanced HIV disease irrespective of signs and symptoms of TB?
    • in inpatient settings, CD4 cell count ≤200
    • in outpatient settings, CD4 cell count ≤200
    • in all settings, CD4 cell count ≤200
    • in inpatient settings, CD4 cell count ≤100
    • in outpatient settings, CD4 cell count ≤100
    • in all settings, CD4 cell count ≤100
  4. Can the use of LF-LAM in HIV-positive adults reduce mortality associated with advanced HIV disease?
    • in all settings
    • in inpatient settings
    • in outpatient settings
    • in individuals with CD4 cell count ≤200
    • in inpatient settings, CD4 cell count ≤200
    • in outpatient settings, CD4 cell count ≤200
    • in individuals with CD4 cell count ≤100
    • in inpatient settings, CD4 cell count ≤100
    • in outpatient settings, CD4 cell count ≤100
  5. Additional questions:
    • What are the comparative cost, affordability and cost–effectiveness of implementation of LF-LAM (AlereLAM versus FujiLAM) – based on review of the published literature and estimations?
    • Are there possible implications for patient equity from the implementation of LF-LAM (AlereLAM versus FujiLAM) – based on review of the published literature and estimations?
    • What are the human rights implications from the implementation of LF-LAM – based on review of the published literature and comparative analysis of the two available LF-LAM (AlereLAM versus FujiLAM)?

The review identified 15 unique published studies that assessed the accuracy of AlereLAM in adults, and integrated nine new studies identified since the original WHO and Cochrane reviews in 2015 and 2016, respectively (26, 27). All studies included in the systematic review were performed in high TB/HIV burden countries. The positive AlereLAM results were reported in accordance with the manufacturer’s updated recommendations for test interpretation (graded on a scale of 1 to 4, based on band intensity). All analyses were performed with respect to an MRS.

The 15 included studies involved 6814 participants, of whom 1761 (26%) had TB. Eight of the studies evaluated the accuracy of AlereLAM for TB diagnosis in participants with signs and symptoms suggestive of TB; these studies involved 3449 participants, of whom 1277 (37%) had TB. Seven studies evaluated the accuracy of AlereLAM for diagnosis of unselected participants who may or may not have had TB signs and symptoms at enrolment; these studies involved 3365 participants, of whom 439 (13%) had TB.

All studies were performed in high TB/HIV burden countries that were classified as low-income or middle-income countries. The studies had substantial differences in the following characteristics: study population (“studies with symptomatic participants” and “studies with unselected participants”), setting (inpatients versus outpatients), median CD4 cell count, TB prevalence, inclusion and exclusion of participants based on whether or not they could produce sputum, and whether patients were evaluated for pulmonary TB or extrapulmonary TB, or both.

Most studies reported that a valid AlereLAM result was obtained on the first attempt for all tests. Uninterpretable test results (<1%) were reported in only three studies (28–30).

Summary of the results

For TB diagnosis in HIV-positive adults presenting with signs and symptoms of TB, the diagnostic accuracy of AlereLAM is as follows:

  • in inpatient settings, sensitivity 52% (40–64%)⁶ and specificity 87% (78–93%);
  • in outpatient settings, sensitivity 29% (17–47%) and specificity 96% (91–99%); and
  • in all settings, sensitivity 42% (31–55%) and specificity 91% (85–95%).
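To show what these pooled estimates imply in practice, the sketch below converts a sensitivity and specificity pair into expected test outcomes per 1000 people tested. The prevalence of 37% is an assumption for illustration, taken from the share of participants with TB in the studies of symptomatic participants; actual prevalence varies by setting, and the function name is hypothetical.

```python
from typing import Dict


def outcomes_per_1000(sensitivity: float, specificity: float,
                      prevalence: float) -> Dict[str, float]:
    """Expected true/false positives and negatives per 1000 people tested."""
    with_tb = 1000 * prevalence
    without_tb = 1000 - with_tb
    true_pos = sensitivity * with_tb
    false_pos = (1 - specificity) * without_tb
    return {
        "true_pos": true_pos,
        "false_neg": with_tb - true_pos,
        "false_pos": false_pos,
        "true_neg": without_tb - false_pos,
    }


ASSUMED_PREVALENCE = 0.37  # illustrative assumption, see lead-in

# Pooled point estimates quoted above for symptomatic HIV-positive adults.
for setting, sens, spec in (("inpatient", 0.52, 0.87), ("outpatient", 0.29, 0.96)):
    result = outcomes_per_1000(sens, spec, ASSUMED_PREVALENCE)
    rounded = {name: round(value) for name, value in result.items()}
    print(setting, rounded)
```

The contrast between settings reflects the trade-off in the pooled estimates: more cases detected among inpatients, at the cost of more false positives.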

For TB diagnosis in HIV-positive adults, irrespective of signs and symptoms of TB, the diagnostic accuracy of AlereLAM is as follows:

  • in inpatient settings, sensitivity 62% (41–83%) and specificity 84% (48–96%);
  • in outpatient settings, sensitivity 31% (18–47%) and specificity 95% (87–99%); and
  • in all settings, sensitivity 35% (22–50%) and specificity 95% (89–98%).

For diagnosis of TB in adults with advanced HIV disease, irrespective of signs and symptoms of TB, the diagnostic accuracy of AlereLAM (limited data available) is as follows:

  • in inpatient settings, CD4 cell count ≤200, sensitivity 64% (35–87%) and specificity 82% (67–93%) (one study);
  • in outpatient settings, CD4 cell count ≤200, sensitivity 21% (8–48%) and specificity 96% (89–99%);
  • in all settings, CD4 cell count ≤200, sensitivity 26% (9–56%) and specificity 96% (87–98%);
  • in inpatient settings, CD4 cell count ≤100, sensitivity 57% (33–79%) and specificity 90% (69–97%);
  • in outpatient settings, CD4 cell count ≤100, sensitivity 40% (20–64%) and specificity 87% (68–94%); and
  • in all settings, CD4 cell count ≤100, sensitivity 47% (30–64%) and specificity 90% (77–96%).

For diagnosis of TB in HIV-positive children, the diagnostic accuracy of AlereLAM (limited data available) is as follows:

  • in all settings, including all children, for individual studies, sensitivity and specificity were:

    42% (15–72%) and 94% (73–100%) (one study conducted in an outpatient setting);

    56% (21–86%) and 95% (90–98%) (one study conducted in an inpatient setting); and

    43% (23–66%) and 80% (69–88%) (one study conducted in both inpatient and outpatient settings).

For use of AlereLAM to reduce mortality associated with advanced HIV disease (two randomized trials):

  • the pooled risk ratio for mortality was 0.85 (0.76–0.94); and
  • the absolute effect was 35 fewer deaths per 1000 (from 14 fewer to 55 fewer) (PICO 4).
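The link between the pooled risk ratio and the absolute effect can be sketched as baseline risk multiplied by one minus the risk ratio. The baseline of 230 deaths per 1000 used below is an assumption: it is not stated in the text and was chosen because it approximately reproduces the quoted absolute effects.

```python
def fewer_deaths_per_1000(baseline_per_1000: float, risk_ratio: float) -> float:
    """Absolute reduction in deaths per 1000 for a given baseline and risk ratio."""
    return baseline_per_1000 * (1 - risk_ratio)


ASSUMED_BASELINE = 230  # deaths per 1000 without LF-LAM; hypothetical value

# Point estimate and interval bounds of the pooled risk ratio quoted above.
for label, risk_ratio in (("point estimate", 0.85),
                          ("lower bound", 0.76),
                          ("upper bound", 0.94)):
    fewer = fewer_deaths_per_1000(ASSUMED_BASELINE, risk_ratio)
    print(f"{label}: RR {risk_ratio} -> {fewer:.1f} fewer deaths per 1000")
```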

Table 2.2.2 presents pooled sensitivity and specificity results for AlereLAM against an MRS grouped by the study population, TB diagnosis among “symptomatic participants” and TB diagnosis among “unselected participants”.

Table 2.2.2. AlereLAM pooled sensitivity and specificity for TB diagnosis, by study population.

Cost–effectiveness analysis

Economic evidence for the implementation and scale-up of LF-LAM is limited. The studies that have been done show a consistent trend, suggesting that LF-LAM could be cost effective in a population of African adults living with HIV (particularly among hospitalized patients).

More details are given in Web Annex 4.13: Economic evaluations of LF-LAM for the diagnosis of active tuberculosis in HIV-positive individuals: an updated systematic review.

User perspective

For a qualitative study on user perspectives, 15 semi-structured interviews were conducted during February and March 2019 with clinicians, nurses, programme officers, laboratory staff and patient advocates in Kenya, South Africa and Uganda. The results showed that LF-LAM clearly addresses a need and makes an important difference in a population in which TB is hard to diagnose. In line with the global discourse on LF-LAM, the participants in this study generally saw LF-LAM as an easy-to-use, rapid test that requires little maintenance and equipment, and crucially does not rely on sputum but on urine, a specimen that is safer to work with and easier to obtain. However, the perceived benefits of the specimen, turnaround time, user-friendliness, cost and maintenance requirements can also pose a challenge, depending on the particular situation and the capacities in which the test is used. Similarly, the infrastructure requirements are minimal but there can still be challenges with stock-outs, lack of urine containers and shelf life. Finally, even though the turnaround time is in theory only 25 minutes, in many settings, treatment is not initiated until the next day.

Overall, the results from the qualitative study suggest that the benefits outweigh the challenges, especially given the absence of viable diagnostic alternatives for this particular patient group. These results also show that it is essential to pay attention to how diagnostics are operationalized. A technology that is quicker, easier to conduct and cheaper than existing diagnostics will not necessarily be implemented more successfully.

More details are given in Web Annex 4.14: User perspectives on TB-LAM for the diagnosis of active tuberculosis: results from qualitative research.

Summary of changes between the 2015 guidance and the 2019 update

The use of lateral flow urine lipoarabinomannan assay (LF-LAM) for the diagnosis and screening of active tuberculosis in people living with HIV. Policy guidance (2015) (27) | Lateral flow urine lipoarabinomannan assay (LF-LAM) for the diagnosis of active tuberculosis in people living with HIV. Policy update (2019) (31) | Changes
LF-LAM may be used to assist in the diagnosis of TB in HIV-positive adults in patients with signs and symptoms of TB (pulmonary and/or extrapulmonary) who have a CD4 cell count ≤100 cells/μL, or HIV-positive patients who are seriously ill (footnote a) regardless of CD4 cell count or with unknown CD4 cell count (conditional recommendation, low quality of evidence).

In inpatient settings, WHO strongly recommends using LF-LAM to assist in the diagnosis of active TB in HIV-positive adults, adolescents and children:

  • with signs and symptoms of TB (pulmonary and/or extrapulmonary) (strong recommendation, moderate certainty in the evidence about the intervention effects); or
  • with advanced HIV disease (footnote b); or
  • who are seriously ill (strong recommendation, moderate certainty in the evidence about the intervention effects); or
  • irrespective of signs and symptoms of TB and with a CD4 cell count <200 (strong recommendation, moderate certainty in the evidence about the intervention effects).

Increased strength of the recommendation.

Improved quality of evidence.

Increased scope of the recommendation:

- all symptomatic or seriously ill inpatients, irrespective of CD4 cell count;

- all inpatients with advanced HIV disease; and

- inpatients with or without signs and symptoms of TB who have a CD4 cell count <200.

This recommendation also applies to HIV-positive adult outpatients with signs and symptoms of TB (pulmonary and/or extrapulmonary) who have a CD4 cell count ≤100 cells/μL, or HIV-positive patients who are seriously ill regardless of CD4 cell count or with unknown CD4 cell count, based on the generalization of data from inpatients.

In outpatient settings, WHO suggests using LF-LAM to assist in the diagnosis of active TB in HIV-positive adults, adolescents and children:

  • with signs and symptoms of TB (pulmonary and/or extrapulmonary) or seriously ill (conditional recommendation, low certainty in the evidence about test accuracy); and
  • irrespective of signs and symptoms of TB and with a CD4 cell count <100 (conditional recommendation, very low certainty in the evidence about test accuracy).

Increased scope of the recommendation:

- all outpatients with signs and symptoms of TB or seriously ill; and

- outpatients with a CD4 cell count <100, irrespective of signs and symptoms of TB.

Except as specifically described below for persons with HIV infection with low CD4 cell counts or who are seriously ill, LF-LAM should not be used for the diagnosis of TB (strong recommendation, low quality of evidence).

In outpatient settings, WHO recommends against using LF-LAM to assist in the diagnosis of active TB in HIV-positive adults, adolescents and children:

  • without assessing TB symptoms (strong recommendation, very low certainty in the evidence about test accuracy);
  • without TB symptoms and unknown CD4 cell count, or without TB symptoms and CD4 cell count ≥200 (strong recommendation, very low certainty in the evidence about test accuracy); or
  • without TB symptoms and with a CD4 cell count of 100–200 (conditional recommendation, very low certainty in the evidence about test accuracy).
Better definition of patient populations for negative recommendation against use of LF-LAM.
LF-LAM should not be used as a screening test for TB (strong recommendation, low quality of evidence).

See inpatient and outpatient recommendations above for situations in which LF-LAM is suggested for use among individuals, irrespective of signs and symptoms of TB.

See outpatient recommendations above for situations in which WHO recommends against LF-LAM use.

Clarification of recommendation for usage among individuals with and without TB signs and symptoms (i.e. irrespective of signs and symptoms):

- LF-LAM is strongly recommended for inpatients with advanced HIV disease, and individuals with a CD4 cell count <200, irrespective of symptoms; and

- LF-LAM is suggested for outpatients with a CD4 cell count <100, irrespective of symptoms.

See above for patient populations with a recommendation against usage.

This recommendation also applies to HIV-positive children with signs and symptoms of TB (pulmonary and/or extrapulmonary) based on the generalization of data from adults while acknowledging very limited data and concern regarding low specificity of the LF-LAM assay in children.

These recommendations also apply to adolescents and children living with HIV, based on generalization of data from adults, while acknowledging that data for these population groups are limited.

HIV: human immunodeficiency virus; LF-LAM: lateral flow urine lipoarabinomannan assay; TB: tuberculosis; WHO: World Health Organization.

(a) “Seriously ill” is defined based on four danger signs: respiratory rate of more than 30/minute, temperature of more than 39 °C, heart rate of more than 120/minute and unable to walk unaided.

(b) For adults, adolescents, and children aged 5 years or more, “advanced HIV disease” is defined as a CD4 cell count of less than 200 cells/mm³ or a WHO clinical stage 3 or 4 event at presentation for care. All children with HIV who are aged under 5 years should be considered as having advanced disease at presentation.
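The 2019 recommendations and footnote b summarized above can be sketched as a small lookup, purely to make the branching explicit. This is an illustrative reading of the table, not a clinical decision tool: the function names, the argument encoding (`symptoms=None` meaning "TB symptoms not assessed") and the handling of a CD4 count of exactly 200 in outpatients (treated here as ≥200) are assumptions made for the example.

```python
from typing import Optional


def has_advanced_hiv_disease(age_years: float,
                             cd4: Optional[int],
                             who_stage: Optional[int]) -> bool:
    """Footnote b: children under 5 always qualify; otherwise
    CD4 < 200 cells/mm3 or a WHO clinical stage 3 or 4 event."""
    if age_years < 5:
        return True
    low_cd4 = cd4 is not None and cd4 < 200
    late_stage = who_stage in (3, 4)
    return low_cd4 or late_stage


def lf_lam_2019(setting: str,
                symptoms: Optional[bool],
                seriously_ill: bool,
                cd4: Optional[int],
                advanced_hiv: bool = False) -> str:
    """Return the 2019 recommendation category for an HIV-positive person.

    symptoms: True/False, or None if TB symptoms were not assessed (assumed encoding).
    cd4: CD4 cell count, or None if unknown.
    """
    if setting == "inpatient":
        if symptoms or seriously_ill or advanced_hiv or (cd4 is not None and cd4 < 200):
            return "recommended (strong)"
        # The summary table gives no explicit statement for other inpatients.
        return "not covered by this summary"
    if setting != "outpatient":
        raise ValueError("setting must be 'inpatient' or 'outpatient'")

    if symptoms or seriously_ill:
        return "suggested (conditional)"
    if cd4 is not None and cd4 < 100:
        return "suggested (conditional)"  # irrespective of symptoms
    if symptoms is None:
        return "recommended against (strong)"  # symptoms not assessed
    if cd4 is None or cd4 >= 200:
        return "recommended against (strong)"
    return "recommended against (conditional)"  # no symptoms, CD4 100 to <200


if __name__ == "__main__":
    print(lf_lam_2019("outpatient", symptoms=False, seriously_ill=False, cd4=150))
```

Checking the CD4 <100 rule before the "symptoms not assessed" rule reflects the table's wording that the former applies irrespective of signs and symptoms.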

Research priorities

  • Development of simple, more accurate tests based on LAM detection, with the potential to be used for HIV-negative populations.
  • Evaluation of the use of LF-LAM in PLHIV without signs and symptoms of TB.
  • Evaluation of the use of LF-LAM in children and adolescents with HIV.
  • Evaluation of the combination of parallel use of LF-LAM and rapid qualitative CD4 cell count systems.
  • Undertaking of implementation research into the acceptance, scale-up and impact of LF-LAM in routine clinical settings.
  • Undertaking of qualitative research on user perspectives of LF-LAM for feasibility, accessibility and equity issues.
  • Undertaking of implementation research on LF-LAM integrated into HIV care packages.
  • Evaluation of the performance of LF-LAM as the HIV epidemic evolves and more people on treatment with viral load suppression are hospitalized.
  • Evaluation of the cost–effectiveness of LF-LAM.
  • Evaluation of other rapid LAM-based tests such as FujiLAM.

2.3. Follow-on diagnostic tests for detection of additional drug-resistance after TB confirmation

Low complexity automated NAATs for detection of resistance to isoniazid and second-line anti-TB agents

Among 105 countries possessing representative data on resistance to fluoroquinolones from the past 15 years, the proportion of MDR/RR-TB cases with resistance to any fluoroquinolone for which testing was done was 20.1% (95% CI: 15.5–25.0%). Thus, rapid and early testing for the detection of fluoroquinolone resistance is essential for determining eligibility for treatment with the all-oral 9–12 month standardized shorter regimen for MDR/RR-TB. However, testing for fluoroquinolone resistance is currently constrained by the limited accessibility of existing technologies (which are often only available at higher tiers of the health system) and their poor yield in paucibacillary specimens.

Low complexity automated NAATs are a new class of diagnostics intended for use as a reflex test in specimens determined to be Mtb complex (MTBC)-positive; they offer rapid DST in intermediate and peripheral laboratories. The first product in this class simultaneously detects resistance to isoniazid, fluoroquinolones, ethionamide and amikacin. Results are available in under 90 minutes, leading to faster time to results than the current standard of care, which includes LPAs and culture-based phenotypic DST.

An additional value of the tests is the accurate and rapid detection of isoniazid resistance, which is relevant for both RR-TB and rifampicin-susceptible TB; the latter is often undiagnosed and contributes to a large burden of disease. Globally, isoniazid-resistant, rifampicin-susceptible TB is estimated to occur in 13.1% (95% CI: 9.9–16.9%) of new cases and 17.4% (95% CI: 0.5–54.0%) of previously treated cases. Thus, this test could also be used as a reflex test to complement existing technologies that only test for rifampicin, allowing the rapid and accurate detection of isoniazid-resistant, rifampicin-susceptible TB.

Although these new technologies are excellent at detecting resistance to selected drugs, conventional culture-based phenotypic DST remains important to determine resistance to other anti-TB agents, particularly the new and repurposed medicines such as bedaquiline and linezolid.

Recommendations

  1. In people with bacteriologically confirmed pulmonary TB, low complexity automated NAATs may be used on sputum for the initial detection of resistance to isoniazid and fluoroquinolones, rather than culture-based phenotypic DST.
    Conditional recommendation, moderate certainty of evidence for diagnostic accuracy
  2. In people with bacteriologically confirmed pulmonary TB and resistance to rifampicin, low complexity automated NAATs may be used on sputum for the initial detection of resistance to ethionamide, rather than DNA sequencing of the inhA promoter.
    Conditional recommendation, very low certainty of evidence for diagnostic accuracy
  3. In people with bacteriologically confirmed pulmonary TB and resistance to rifampicin, low complexity automated NAATs may be used on sputum for the initial detection of resistance to amikacin, rather than culture-based phenotypic DST.
    Conditional recommendation, low certainty of evidence for diagnostic accuracy

There are several subgroups to be considered for these recommendations:

  • The recommendations are based on the evidence of diagnostic accuracy in sputum of adults with bacteriologically confirmed pulmonary TB, with or without rifampicin resistance.
  • The recommendations are extrapolated to adolescents and children, based on the generalization of data from adults.
  • The recommendations apply to people living with HIV (studies included a varying proportion of such individuals); data stratified by HIV status were not available.
  • The recommendations are extrapolated to people with extrapulmonary TB, and testing of non-sputum samples was considered appropriate, which affects the certainty. The panel did not evaluate test accuracy in non-sputum samples directly, including in children; however, extrapolation was considered appropriate given that WHO has recommendations for similar technologies for use on non-sputum samples (e.g. Xpert MTB/RIF and Xpert Ultra).
  • Recommendations for detection of resistance to amikacin and ethionamide are only relevant for people who have bacteriologically confirmed pulmonary TB and resistance to rifampicin.

Test description

The index tests are rapid, low complexity automated NAATs for detection of resistance to isoniazid and second-line anti-TB drugs.

“Automated test” in the low complexity category is defined as a test where most reagents are enclosed in a disposable sealed container to which a clinical specimen is added and almost all processes (e.g. DNA extraction or PCR procedures) are performed within the container linked to the diagnostic platform. Such automated tests may require an initial manual specimen treatment step before transfer of the material requiring testing into the cartridge.

“Low complexity” refers to a situation where no specialized biosafety infrastructure is required; only basic laboratory skills and equipment are required to perform the test.

Xpert MTB/XDR assay (Xpert MTB/XDR, Cepheid, Sunnyvale, USA) is the only index test in this review. Evidence on MeltPro® XDR-TB (MeltPro, Xiamen Zeesan Biotech Co Ltd, China) provided by the manufacturer was not sufficient for this assay to be included in this review, and no independent evaluations of MeltPro were identified.

Xpert MTB/XDR detects MTBC DNA and genomic mutations associated with resistance to isoniazid, fluoroquinolones, ethionamide and second-line injectable drugs (amikacin, kanamycin and capreomycin) in a single cartridge (see Table 2.3.1). This review does not include molecular DST for kanamycin and capreomycin because WHO does not currently recommend these second-line injectable agents for use in RR-TB or MDR-TB treatment regimens (32).7

Xpert MTB/XDR employs Cepheid’s GeneXpert platform, similar to that used by Xpert MTB/RIF and Xpert MTB/RIF Ultra. However, in the case of Xpert MTB/XDR, the platform supports multiplexing via 10-colour technology, which is different from the six-colour technology employed by Xpert MTB/RIF and Xpert MTB/RIF Ultra.

The package insert from the manufacturer explains that Xpert MTB/XDR is intended for use as a reflex test in specimens (unprocessed sputum or concentrated sputum sediments) that have been found to be MTBC-positive. The LoD for Mtb by Xpert MTB/XDR (136 cfu/mL in unprocessed sputum) is similar to that of Xpert MTB/RIF (112.6 cfu/mL), but higher than that of Xpert Ultra (15.6 cfu/mL) (33). The manufacturer states in the package insert: “Specimens with ‘MTB trace detected’ results when tested with the Xpert MTB/RIF Ultra assay are expected to be below the limit of detection of the MTB/XDR assay and are not recommended for testing with the Xpert MTB/XDR assay”. As with Xpert MTB/RIF and Xpert Ultra, Xpert MTB/XDR detects both live and dead bacteria.

Xpert MTB/XDR can report results as “Mtb not detected” or “Mtb detected”. If results are reported as “Mtb detected”, each drug is reported as resistance “detected” or “not detected”. If results are reported as “Mtb not detected”, “invalid”, “error” or “no result”, then no DST results are reported.

Table 2.3.1. Drug-related gene targets, codon regions and nucleotide sequences that determine presence of variants associated with drug resistance in the Xpert MTB/XDR assay (34).


Justification and evidence

The WHO Global TB Programme initiated an update of the current guidelines and commissioned a systematic review on the use of low complexity automated NAATs for the detection of resistance to isoniazid and second-line TB drugs in people with signs and symptoms of TB.

The PICO questions were designed to form the basis for the evidence search, retrieval and analysis:

  1. Should low complexity automated NAATs be used on sputum in people with signs and symptoms of pulmonary TB, irrespective of resistance to rifampicin, for detection of resistance to isoniazid, as compared with culture-based phenotypic DST?
  2. Should low complexity automated NAATs be used on sputum in people with signs and symptoms of pulmonary TB, irrespective of resistance to rifampicin, for detection of resistance to fluoroquinolones, as compared with culture-based phenotypic DST?
  3. Should low complexity automated NAATs be used on culture isolates in people with signs and symptoms of pulmonary TB, and detected resistance to rifampicin, for detection of resistance to ethionamide, as compared with genotypic sequencing of the inhA promoter?
  4. Should low complexity automated NAATs be used on sputum in people with signs and symptoms of pulmonary TB, and detected resistance to rifampicin, for detection of resistance to amikacin, as compared with culture-based phenotypic DST?

The databases Ovid Medline (Ovid, 1946 to present) and Embase (Ovid, 1947 to present) were searched for studies evaluating cartridge-based tests using the following search terms: tuberculosis, pulmonary AND Xpert, GeneXpert, Truenat, cartridge, point-of-care systems, drug susceptibility test, isoniazid resistance, fluoroquinolone resistance and second-line injectable drug resistance. Clinicaltrials.gov and the WHO International Clinical Trials Registry Platform were also searched for trials in progress. Searches were run up to 6 September 2020 without language restriction. On 4 November 2020, an additional search was run using the search terms Zeesan and MeltPro.

Researchers at FIND, the WHO Global TB Programme, the manufacturer and other experts in the field of TB diagnostics were contacted for information about ongoing and unpublished studies. Data submitted in response to the WHO public call were reviewed.

Drug resistance was compared against a phenotypic reference standard (or a genotypic reference standard for ethionamide resistance), as well as a composite reference standard that was constructed by combining the results of phenotypic and genotypic DST results in studies where both had been performed.

The certainty of the evidence was assessed consistently through PICO questions, using the GRADE approach (36, 37), which produces an overall quality assessment (or certainty) of evidence and a framework for translating evidence into recommendations. In the GRADE approach, even if diagnostic accuracy studies are of observational design, they start as high-quality evidence.

GRADEpro Guideline Development Tool software (20) was used to generate summary of findings tables. The quality (certainty) of evidence was rated as high (not downgraded), moderate (downgraded one level), low (downgraded two levels) or very low (downgraded more than two levels), based on five factors: risk of bias, indirectness, inconsistency, imprecision and other considerations. The quality (certainty) of evidence was downgraded one level when a serious issue was identified and by two levels when a very serious issue was identified in any of the factors used to judge the quality of evidence.
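The downgrading rule in the preceding paragraph amounts to simple arithmetic, sketched below. The function and dictionary names are hypothetical, and the sketch covers only the level counting described here, not the judgement that goes into classifying an issue as serious or very serious.

```python
LEVELS = ["high", "moderate", "low", "very low"]

# Levels removed per judgement, as described in the text.
PENALTY = {"not serious": 0, "serious": 1, "very serious": 2}


def grade_certainty(judgements: dict) -> str:
    """Map per-factor judgements to an overall certainty rating.

    Diagnostic accuracy evidence starts at "high"; each serious issue
    removes one level and each very serious issue removes two.
    More than two levels of downgrading gives "very low".
    """
    total = sum(PENALTY[j] for j in judgements.values())
    return LEVELS[min(total, len(LEVELS) - 1)]


if __name__ == "__main__":
    # One serious issue (e.g. indirectness) gives "moderate".
    print(grade_certainty({"indirectness": "serious"}))
    # One very serious and one serious issue give "very low".
    print(grade_certainty({"risk of bias": "very serious",
                           "imprecision": "serious"}))
```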

Data synthesis was structured around the four preset PICO questions, as outlined below. Three web annexes give additional information, as follows:

  • details of studies included in the current analysis (Web Annex 1.6: Low complexity automated NAATs);
  • a summary of the results and details of the evidence quality assessment (Web Annex 2.6: Low complexity automated NAATs); and
  • a summary of the GDG panel judgements (Web Annex 3.6: Low complexity automated NAATs).

PICO 1: Should low complexity automated NAATs be used on sputum in people with signs and symptoms of pulmonary TB, irrespective of resistance to rifampicin, for detection of resistance to isoniazid, as compared with culture-based phenotypic DST?

Three multinational studies with 1605 participants provided data for evaluating isoniazid resistance detection. The reference standard for each of these studies was culture-based phenotypic DST. Each study centre in the multinational studies was analysed as a separate study (Fig. 2.3.1).

Several concerns were expressed about indirectness in the study populations. First, the median prevalence of isoniazid resistance in the included studies was 67.2% (range, 26.8% [Diagnostics for Multidrug Resistant Tuberculosis in Africa – DIAMA, Benin] to 93.9% [FIND, Moldova]), which is higher than the global estimates for isoniazid resistance. Hence, applicability to settings with a lower prevalence of isoniazid resistance comes with some uncertainty. Second, there are potential differences in the mutations present in isoniazid monoresistant strains and MDR strains; that is, some studies suggest that the mutations found in monoresistant strains are more diverse than the mutations found in MDR strains. Third, although the population for this PICO question is “irrespective of rifampicin resistance”, enrolment criteria in the studies meant that most participants within the included studies had RR-TB. As a result of these concerns, certainty of evidence was downgraded one level for indirectness both for sensitivity and specificity, and the quality (certainty) of evidence was rated moderate both for sensitivity and specificity.

Fig. 2.3.1. Forest plot of included studies for isoniazid resistance detection, irrespective of rifampicin resistance with culture-based phenotypic DST as the reference standard. CI: confidence interval; DIAMA: Diagnostics for Multidrug Resistant Tuberculosis in Africa.

The sensitivity in these three studies ranged from 81% to 100% and the specificity from 87% to 100%. The pooled sensitivity was 94.2% (95% CI: 89.3–97.0%) and the pooled specificity was 98.0% (95% CI: 95.2–99.2%).
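Because the text raises concerns about applying these estimates to settings with a lower prevalence of isoniazid resistance, a worked calculation may help show how prevalence changes the expected results. The sketch below applies the pooled sensitivity and specificity to a notional cohort of 1000 people tested. The 10% prevalence is a hypothetical value chosen for illustration; the 67.2% value is the median prevalence reported for the included studies. Confidence intervals are ignored.

```python
def expected_outcomes(sensitivity: float, specificity: float,
                      prevalence: float, n: int = 1000) -> dict:
    """Expected test outcomes in a cohort of n people tested."""
    with_resistance = n * prevalence
    without_resistance = n - with_resistance
    tp = sensitivity * with_resistance       # true positives
    fn = with_resistance - tp                # false negatives
    tn = specificity * without_resistance    # true negatives
    fp = without_resistance - tn             # false positives
    return {
        "TP": tp, "FN": fn, "TN": tn, "FP": fp,
        "PPV": tp / (tp + fp),               # positive predictive value
        "NPV": tn / (tn + fn),               # negative predictive value
    }


if __name__ == "__main__":
    pooled_sens, pooled_spec = 0.942, 0.980  # pooled estimates from the text
    for prev in (0.10, 0.672):               # 0.10 is a hypothetical prevalence
        result = expected_outcomes(pooled_sens, pooled_spec, prev)
        print(prev, {k: round(v, 3) for k, v in result.items()})
```

At the lower prevalence, false positives make up a larger share of all positive results, so the positive predictive value falls even though sensitivity and specificity are unchanged.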

PICO 2: Should low complexity automated NAATs be used on sputum in people with signs and symptoms of pulmonary TB, irrespective of resistance to rifampicin, for detection of resistance to fluoroquinolones, as compared with culture-based phenotypic DST?

Three multinational studies with 1337 participants provided data for evaluation of detection of fluoroquinolone resistance. The reference standard for each of these studies was culture-based phenotypic DST. Each study centre in the multinational studies was analysed as a separate study (Fig. 2.3.2).

Specificity estimates were inconsistent, at 84% (FIND, Mumbai), 91% (FIND, New Delhi) and more than 96% for other studies. The heterogeneity in specificity estimates could not be explained. Consequently, the certainty of the evidence was downgraded one level for inconsistency; the quality (certainty) of the evidence was rated high for sensitivity and moderate for specificity.

Fig. 2.3.2. Forest plot of included studies for fluoroquinolone resistance detection, irrespective of rifampicin resistance with culture-based phenotypic DST as the reference standard. CI: confidence interval; DIAMA: Diagnostics for Multidrug Resistant Tuberculosis in Africa.

The sensitivity for fluoroquinolone resistance in these three studies ranged from 83% to 100% and the specificity from 84% to 100%. The pooled sensitivity was 93.1% (95% CI: 88.0–96.1%) and the pooled specificity was 98.3% (95% CI: 94.5–99.5%).

PICO 3: Should low complexity automated NAATs be used on culture isolates in people with signs and symptoms of pulmonary TB, and detected resistance to rifampicin, for detection of resistance to ethionamide, as compared with genotypic sequencing of the inhA promoter?

One multinational study with 434 participants provided data for evaluating resistance to ethionamide. The reference standard for this study was DNA sequencing of the inhA promoter. Each study centre in the multinational study was analysed as a separate study (Fig. 2.3.3).

The study was judged to be at very serious risk of bias in the reference standard domain because it did not include all loci (i.e. ethA, ethR and inhA promoter) required for the reference standard to classify the target condition correctly. Against a reference standard of phenotypic DST, the pooled sensitivity was considerably lower, at 51.7% (95% CI: 33.1–69.8%). Consequently, certainty of evidence was downgraded two levels for risk of bias for both sensitivity and specificity. In addition, the 95% CIs were wide for both sensitivity and specificity, which could lead to different decisions, depending on which confidence limits are assumed. Consequently, the certainty of the evidence was downgraded one level for imprecision for both sensitivity and specificity; the quality (certainty) of evidence was rated very low for both sensitivity and specificity.

Fig. 2.3.3. Forest plot of included studies for ethionamide resistance detection with genotypic DST as the reference standard. CI: confidence interval; DST: drug susceptibility testing; FIND: Foundation for Innovative New Diagnostics; FN: false negative.

The sensitivity for ethionamide resistance in this study ranged from 78% to 100% and the specificity from 97% to 100%. The pooled sensitivity was 98.0% (95% CI: 74.2–99.9%) and the pooled specificity was 99.7% (95% CI: 83.5–100.0%).

PICO 4: Should low complexity automated NAATs be used on sputum in people with signs and symptoms of pulmonary TB, and detected resistance to rifampicin, for detection of resistance to amikacin, as compared with culture-based phenotypic DST?

One multinational study with 490 participants provided data for evaluating resistance to amikacin. The reference standard for this study was culture-based phenotypic DST. Each study centre in this multinational study was analysed as a separate study (Fig. 2.3.4).

The 95% CI for sensitivity was wide, which could lead to different decisions around true positives and false negatives, depending on which confidence limits are assumed. Also, there were few participants with amikacin resistance contributing to this analysis for the observed sensitivity. Consequently, the certainty of the evidence for sensitivity was downgraded two levels for imprecision; the quality (certainty) of evidence was rated low for sensitivity and high for specificity.

Fig. 2.3.4. Forest plot of included studies for amikacin resistance detection with culture-based phenotypic DST as the reference standard. CI: confidence interval; DIAMA: Diagnostics for Multidrug Resistant Tuberculosis in Africa; DST: drug susceptibility testing.

The sensitivity for amikacin resistance in this study ranged from 75% to 95% and the specificity from 96% to 100%. The pooled sensitivity was 86.1% (95% CI: 75.0–92.7%) and the pooled specificity was 98.9% (95% CI: 93.0–99.8%).

Cost–effectiveness analysis

This section answers the following additional question:

What is the comparative cost, affordability and cost–effectiveness of implementation of low complexity automated NAATs?

A systematic review was conducted, focusing on economic evaluations of low complexity automated NAATs. Four online databases (Embase, Medline, Web of Science and Scopus) were searched for new studies published from 1 January 2010 through 17 September 2020. The citations of all eligible articles, guidelines and reviews were reviewed for additional studies. Experts and test manufacturers were also contacted to identify any additional unpublished studies.

The objective of the review was to summarize current economic evidence and further understand the costs, cost–effectiveness and affordability of low complexity automated NAATs.

Two low complexity automated NAATs were identified: the MeltPro MTB/RIF (Xiamen Zeesan Biotech Co Ltd, China) and the Xpert MTB/XDR assay (Cepheid, Sunnyvale, USA). Only data concerning Xpert MTB/XDR are included in this review. As is the case with Xpert MTB/RIF, the novel XDR assay can be used to test either unprocessed or concentrated sputum. No published studies providing direct evidence on the cost or cost–effectiveness of low complexity automated NAATs were identified.

According to direct communication from the Xpert MTB/XDR manufacturer, Cepheid, the low- and middle-income country (LMIC) cost for the XDR cartridge is expected to be US$ 19.80 ex-works. Shipping and customs costs will be additional and will be borne by the ordering nations or organizations, as is currently the case for Xpert MTB/RIF and Ultra cartridges.

As with the Xpert MTB/RIF and Ultra assays, the test cartridge costs represent just one component of the total unit test costs that must be considered, with equipment being another important consideration. The Xpert MTB/XDR test will not work on existing six-colour modules and will require laboratories to upgrade to 10-colour GeneXpert modules. There will be different upgrade options for the 10-colour system, with different price points depending on the needs and resources available. Upgrade options include:

  • a new 10-colour system – this is the most costly option, at US$ 9420 for one module to US$ 72 350 for 16 modules, including the GeneXpert platform, computer and scanner;
  • a new 10-colour satellite instrument with the GeneXpert connected to an existing system – this costs from US$ 6495 for one module to US$ 69 525 for 16 modules; and
  • converting an existing GeneXpert system from a six-colour to a 10-colour system by replacing modules – a 10-colour module kit costs US$ 3860.
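To compare the first two options on a like-for-like basis, the quoted prices can be converted to a cost per module. The sketch below uses only the prices listed above. The dictionary keys and function name are made up for the example, and the module kit price is shown as quoted because the text does not state how many kits a multi-module conversion needs.

```python
# Quoted prices in US$, keyed by number of modules (from the list above).
SYSTEM_PRICES = {
    "new 10-colour system": {1: 9420, 16: 72350},
    "10-colour satellite instrument": {1: 6495, 16: 69525},
}
MODULE_KIT_PRICE = 3860  # quoted price of one 10-colour module kit


def cost_per_module(option: str, modules: int) -> float:
    """Quoted system price divided by the number of modules."""
    return SYSTEM_PRICES[option][modules] / modules


if __name__ == "__main__":
    for option, prices in SYSTEM_PRICES.items():
        for modules in prices:
            per_module = cost_per_module(option, modules)
            print(f"{option}, {modules} module(s): US$ {per_module:,.2f} per module")
    print(f"module kit: US$ {MODULE_KIT_PRICE:,} as quoted")
```

On these figures the 16-module configurations cost less per module than the single-module ones, although the total outlay is much higher.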

Additional cost considerations for Xpert MTB/XDR include additional or repeated testing in the case of non-actionable results (indeterminate, non-determinate or invalid). The potential cost burden of this is likely to vary, depending on the proportion of such results across settings and the associated re-testing protocols.

No studies that have directly assessed the cost–effectiveness of the Xpert MTB/XDR cartridge were identified. Although extrapolation from other platforms and testing approaches for costing may be appropriate, extrapolation of cost–effectiveness data from Xpert MTB/RIF (Ultra) or other NAATs is not advised because of differences in diagnostic accuracy, costs associated with XDR treatment, and the different testing and treatment cascade of care.

Several factors are likely to influence the cost–effectiveness of Xpert MTB/XDR; they include diagnostic accuracy, which may lead to more or fewer individuals being diagnosed compared with the standard of care (which in turn will vary, depending on the local standard of care). In addition to diagnostic accuracy associated with the test itself, the diagnostic algorithm and placement of the Xpert MTB/XDR test within the algorithm has important implications.

The novel Xpert MTB/XDR provides results in less than 90 minutes. Thus, introduction of this test is likely to result in faster time to a result for genotypic DST and could affect cost–effectiveness by improving the numbers of patients initiating treatment, reducing loss to follow-up and improving survival rates. Costs associated with XDR treatment are likely to be an important driver of cost and cost–effectiveness because previous work has shown that these costs are high compared to diagnostic and other treatment costs. As larger numbers of XDR-positive individuals requiring treatment are identified, total resources required to treat these individuals will increase.

In the absence of transmission modelling studies, there is no information on the long-term population level impact of introducing Xpert MTB/XDR. Nevertheless, the benefits of identifying more cases earlier could lead to a reduction in ongoing transmission and potential cost-savings over the long term. This requires thorough investigations through transmission modelling.

How large are the resource requirements (costs)?

No published studies provided direct evidence about the total resources required. Resource requirements will include the purchase of cartridges (US$ 19.80/cartridge), upgrading of existing platforms to 10-colour modules (an upgrade that will eventually be required for all Xpert platforms: US$ 3860 to >US$ 72 350) and operational and programmatic costs associated with implementing the novel diagnostic. Resource requirements for XDR treatment (e.g. drugs, hospital capacity and staff) are also likely to increase as the number of people diagnosed increases. Total costs will vary, depending on testing volume and prevalence of XDR in the population; also, the impact on the budget will depend on the current standard of care and associated resource use.
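As a rough illustration of how the two quoted cost components combine, the sketch below adds cartridge and equipment costs for a given testing volume. The testing volume and the repeat-test fraction are hypothetical inputs. Shipping, customs, staff time, overheads and treatment costs are deliberately left out because the text gives no figures for them, so the result is a lower bound, not a budget.

```python
CARTRIDGE_PRICE = 19.80  # US$ per cartridge, ex-works (from the text)


def direct_cost(n_tests: int, equipment_cost: float,
                repeat_fraction: float = 0.0) -> float:
    """Cartridge plus equipment cost in US$.

    n_tests: number of people tested (hypothetical input).
    equipment_cost: chosen upgrade option, e.g. 3860 for one module kit.
    repeat_fraction: share of tests repeated after a non-actionable
        result (hypothetical; the text gives no estimate).
    """
    if n_tests < 0 or not 0.0 <= repeat_fraction <= 1.0:
        raise ValueError("invalid inputs")
    cartridges = n_tests * (1.0 + repeat_fraction)
    return cartridges * CARTRIDGE_PRICE + equipment_cost


if __name__ == "__main__":
    # 1000 tests with a single module kit and 5% repeat testing.
    print(round(direct_cost(1000, 3860, repeat_fraction=0.05), 2))
```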

What is the certainty of the evidence of resource requirements (costs)?

Direct costs related to the purchase of cartridges and machinery are provided from the manufacturer; however, several important items related to resource use for implementing Xpert MTB/XDR have not been investigated (e.g. staff time, overhead and operational costs). Differences in resource use between Xpert MTB/XDR and existing approaches will vary across settings using different phenotypic and genotypic DST. There is important variability in costs of staff time and operational costs (e.g. testing volume) across settings.

Does the cost–effectiveness of the intervention favour the intervention or the comparison?

No cost–effectiveness studies using Xpert MTB/XDR were identified. Extrapolation of cost–effectiveness data from Xpert MTB/RIF or other NAATs is not advised because of differences in diagnostic accuracy, in the costs associated with XDR treatment, and in the testing and treatment cascade of care.

More details on economic evidence synthesis and analysis are provided in Web Annex 4.9: Systematic literature review of economic evidence for NAATs to detect TB and DR-TB in adults and children.

User perspective

This section answers the following questions about key informants’ views and perspectives on the use of low complexity automated NAATs:

  • Is there important uncertainty about or variability in how much end-users value the main outcomes?
  • What would be the impact on health equity?
  • Is the intervention acceptable to key stakeholders?
  • Is the intervention feasible to implement?

The synthesis and analysis of qualitative evidence on end-users’ perspectives are discussed above in the section “User perspective” for moderate complexity automated NAATs (p. 61–65).

Findings of the review and interviews

The main findings of the systematic review and interviews are given below. Where information is from the review, a level of confidence in the QES is given; where it is from interviews, this is indicated with ‘Interviews’.

Is there important uncertainty about or variability in how much end-users value the main outcomes?

  • Patients in high burden TB settings value:

    getting an accurate diagnosis and reaching diagnostic closure (finally knowing “what is wrong with me”);

    avoiding diagnostic delays because they exacerbate existing financial hardships and emotional and physical suffering, and make patients feel guilty for infecting others (especially children);

    having accessible facilities; and

    reducing diagnosis-associated costs (e.g. travel, missing work) as important outcomes of the diagnostic.

    QES: moderate confidence

  • Low complexity automated NAATs, when compared with existing tests or sputum microscopy, are appreciated by health care professionals because of:

    the rapidity and accuracy of the results;

    the confidence that a result generates to start treatment and motivate patients;

    the diversity of sample types;

    the ability to detect drug resistance earlier or at all, for as many drugs as possible (altering a clinician’s risk perception of drug resistance in children), and the consequence of avoiding costlier investigations or hospital stays.

    QES: high confidence

    Compared with other available diagnostic methods, the cartridge has a quicker turnaround time for first- and second-line DST. Health care professionals value the faster turnaround time, the potential ability to reflex samples from the Xpert MTB/RIF to the Xpert MTB/XDR cartridge, and receiving information on multiple drugs and high-level or low-level resistance simultaneously, because it could enable quicker diagnosis and optimized treatment for patients.

    Interviews

  • Laboratory technicians appreciate low complexity automated NAATs for the following reasons:

    Overall, the tests improve laboratory work compared with sputum microscopy in terms of ease of use, ergonomics and biosafety.

    QES: high confidence

    These tests require minimal user steps, and the GeneXpert platform is a familiar system that people feel comfortable running and interpreting.

    Interviews

  • Laboratory managers appreciate that monitoring of laboratory work and training is easier than with sputum microscopy, and that use of low complexity automated NAATs eases staff retention because it increases staff satisfaction and is symbolic of progress within the TB world.
    QES: low confidence

What would be the impact on health equity?

The impact on health equity would be similar to that of moderate complexity automated NAATs (p. 63–64).

Is the intervention acceptable to key stakeholders?

The acceptability to key stakeholders is similar to that of moderate complexity automated NAATs (p. 64–65).

  • The identified challenges in implementing the use of low complexity automated NAATs and accumulated delays at every step may compromise the added value and benefits identified by the users (e.g. avoiding delays, keeping costs low, accurate results, information on drug resistance and easing laboratory work), ultimately leading to underuse.
    QES: high confidence
  • If these values are not met, it can be assumed that users are less likely to find low complexity automated NAATs acceptable.

Is the intervention feasible to implement?

  • Low complexity automated NAATs may decrease the workload in the laboratory in terms of freeing up time for laboratory staff. However, based on experience with Xpert MTB/RIF (Ultra), the introduction of a new class of technologies may increase the workload of laboratory staff if added onto existing work without adjusting staffing arrangements or if the new technology does not replace existing diagnostic tests.
    QES: moderate confidence
  • Low complexity automated NAATs require less user training than other DST methods (e.g. LPA and culture), making these tests more feasible to implement than methods with more user steps and those that require significant additional training.
    Interview study
    Implementation of new diagnostics must be accompanied by training for clinicians, to help them interpret results from new molecular tests and understand how this relates to the treatment of a patient. In the past, with the introduction of Xpert MTB/RIF (Ultra), this has been a challenge and has led to underuse.
    QES: high confidence and interview study
    Introduction of Xpert MTB/RIF (Ultra) has also led to overreliance on results of cartridge-based NAATs at the expense of clinical acumen.
    QES: moderate confidence
  • Introduction of new diagnostics must also be accompanied by guidelines and algorithms that support clinicians and laboratories in communicating with each other; for example, these resources allow clinicians and laboratories to discuss discordant results, and interpret laboratory results in the context of drug availability, patient history and patient progress on a current drug regimen.
    Interviews
  • An efficient sample transportation system, with sustainable funding mechanisms, is crucial for feasibility, especially if an algorithm requires multiple samples at different times from different collection points, as is the case when dealing with DR-TB. If mishandled during preparation, there is a risk that the sample may become contaminated and yield inconclusive results on molecular diagnostics. Participants cited good personnel skills, standardized operating procedures and significant laboratory infrastructure as essential in reducing sample contamination in their laboratory.
    Interviews
  • The feasibility of low complexity automated NAATs is challenged if there is an accumulation of diagnostic delays or underuse (or both) at every step in the process, mainly because of health system factors:

    non-adherence to testing algorithms, testing for TB or MDR-TB late in the process, empirical treatment, false negatives due to technology failure, large sample volumes and staff shortages, poor or delayed sample transport and sample quality, poor or delayed communication of results, delays in scheduling follow-up visits and recalling patients, and inconsistent recording of results;

    lack of sufficient resources and maintenance (e.g. stock-outs; unreliable logistics; lack of funding, electricity, space, air conditioners and sputum containers; dusty environment; and delayed or absent local repair option);

    inefficient or unclear workflows and patient flows (e.g. inefficient organizational processes, poor links between providers, and unclear follow-up mechanisms or information on where patients need to go); and

    lack of data-driven and inclusive national implementation processes.

    QES: high confidence

  • The feasibility of using low complexity automated NAATs is also challenged by the higher value placed on diagnosing MTB than on diagnosing DR-TB at the primary care level. This situation makes the NAAT less feasible as a baseline test, although it would fit at a district or intermediate level laboratory.

Implementation considerations

Factors to consider when implementing low complexity automated NAATs for detection of resistance to isoniazid and second-line anti-TB agents are as follows:

  • local epidemiological data on resistance prevalence should guide local testing algorithms, whereas pretest probability is important for the clinical interpretation of test results;
  • the cost of a test varies depending on parameters such as the number of samples in a batch and the staff time required; therefore, a local costing exercise should be performed;
  • low, moderate and high complexity tests have successively greater needs for technical competency (qualifications and skills) and staff time, which affects planning and budgeting;
  • availability and timeliness of local support services and maintenance should be considered when selecting a provider;
  • laboratory accreditation and compliance with a robust quality management system (including appropriate quality control) are essential for sustained service excellence and trust;
  • training of both laboratory and clinical staff will ensure effective delivery of services and clinical impact;
  • use of connectivity solutions for communication of results is encouraged, to improve efficiency of service delivery and time to treatment initiation;
  • rapid and early testing for the detection of fluoroquinolone resistance is essential before starting treatment with the all-oral MDR/RR-TB shorter regimen (i.e. 6–9 months); this may also become relevant (depending on the epidemiological context) if new shorter drug-susceptible TB regimens that include fluoroquinolones are introduced;
  • these tests can be used to rule in ethionamide resistance, but not to rule out resistance, because mutations conferring resistance to ethionamide are not limited to the inhA promoter region – they also include ethA, ethR and other genes;
  • culture-based phenotypic DST may still be required, particularly among those with a high pretest probability of resistance when the low complexity automated NAAT does not detect drug resistance; in addition, culture-based phenotypic DST:

    remains important to determine resistance to other anti-TB agents, particularly the new and repurposed medicines, and to monitor the emergence of additional drug resistance;

    does not apply to ethionamide because it is unreliable and poorly reproducible;

  • for second-line injectable drugs, the panel evaluated the performance in detecting resistance to amikacin only because both kanamycin and capreomycin are no longer recommended for the treatment of DR-TB; and
  • culture-based phenotypic DST may be important to confirm amikacin susceptibility in situations where it is appropriate to use this medicine, to balance risk and benefit.
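
The role of pretest probability in interpreting results, noted in the first consideration above, follows from Bayes' theorem. The sketch below is a minimal illustration; the sensitivity, specificity and pretest probability values are hypothetical placeholders, not estimates from this guideline.

```python
def predictive_values(sensitivity: float, specificity: float,
                      pretest: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given pretest probability of resistance."""
    tp = sensitivity * pretest                # true positives per person tested
    fn = (1 - sensitivity) * pretest          # false negatives
    tn = specificity * (1 - pretest)          # true negatives
    fp = (1 - specificity) * (1 - pretest)    # false positives
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical accuracy (90% sensitivity, 98% specificity) at two pretest probabilities.
for pretest in (0.05, 0.30):
    ppv, npv = predictive_values(0.90, 0.98, pretest)
    print(f"pretest={pretest:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With accuracy held constant, a negative result is less reassuring at a higher pretest probability, which is consistent with the consideration that culture-based phenotypic DST may still be required in such patients.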

Research priorities

Research priorities for low complexity automated NAATs for detection of resistance to isoniazid and second-line anti-TB agents are as follows:

  • diagnostic accuracy, in specific patient populations (e.g. children, people living with HIV, and patients with signs and symptoms of extrapulmonary TB) and in non-sputum samples;
  • impact of diagnostic technologies on clinical decision-making and outcomes that are important to patients (e.g. cure, mortality, time to diagnosis and time to start treatment) in all patient populations;
  • impact of specific mutations on treatment outcomes among people with DR-TB;
  • use, integration and optimization of diagnostic technologies in the overall landscape of testing and care, as well as diagnostic pathways and algorithms;
  • economic studies evaluating the costs, cost–effectiveness and cost–benefit of different diagnostic technologies;
  • qualitative studies evaluating equity, acceptability, feasibility and end-user values of different diagnostic technologies;
  • effect of non-actionable results (indeterminate, non-determinate or invalid) on diagnostic accuracy and outcomes that are important to patients;
  • evaluation of low complexity automated NAATs for initial TB detection, in addition to its use as a follow-on test, in all people with signs and symptoms of TB, in children and in people living with HIV; and
  • the potential utility of katG resistance detection to identify MDR-TB clones that may be missed because they do not have an RRDR mutation (e.g. the Eswatini MDR-TB clone, which has both the katG S315T and the non-RRDR rpoB I491F mutation).

First-line LPAs

In 2008, WHO approved the use of commercial LPAs for detecting MTBC in combination with resistance to rifampicin and isoniazid in sputum smear-positive specimens (direct testing) and in cultured isolates of MTBC (indirect testing). A systematic review at that time evaluated the diagnostic accuracy of two commercially available LPAs – the INNO-LiPA Rif.TB assay (Innogenetics, Ghent, Belgium), and the GenoType® MTBDRplus (version 1), hereafter referred to as Hain version 1 – and provided evidence for WHO’s endorsement (37, 38). Excellent accuracy was reported for both tests in detecting rifampicin resistance, but their diagnostic accuracy for isoniazid resistance had lower sensitivity, despite the high specificity. Because there were inadequate data to allow stratification by smear status, WHO’s recommendation for using LPAs was limited to culture isolates or smear-positive sputum specimens. Further data have since been published on the use of LPAs; newer versions of LPA technology have now been developed, such as the Hain GenoType MTBDRplus version 2, hereafter referred to as Hain version 2; and other manufacturers have entered the market, including Nipro (Tokyo, Japan), which developed the Genoscholar™ NTM+MDRTB II, hereafter referred to as Nipro.

In 2015, FIND evaluated the Nipro and the Hain version 2 LPAs, and compared them with Hain version 1. The study demonstrated equivalence among the three commercially available LPAs for detecting TB and resistance to rifampicin and isoniazid (5).

Recommendation

  1. For persons with a sputum smear-positive specimen or a cultured isolate of MTBC, commercial molecular LPAs may be used as the initial test instead of phenotypic culture-based DST to detect resistance to rifampicin and isoniazid.
    (Conditional recommendation, moderate certainty in the evidence for the test’s accuracy)
Remarks
  1. These recommendations apply to the use of LPAs for testing sputum smear-positive specimens (direct testing) and cultured isolates of MTBC (indirect testing) from both pulmonary and extrapulmonary sites.
  2. LPAs are not recommended for the direct testing of sputum smear-negative specimens.
  3. These recommendations apply to the detection of MTBC and the diagnosis of MDR-TB, but acknowledge that the accuracy of detecting resistance to rifampicin and isoniazid differs and, hence, that the accuracy of a diagnosis of MDR-TB is reduced overall.
  4. These recommendations do not eliminate the need for conventional culture-based DST, which will be necessary to determine resistance to other anti-TB agents and to monitor the emergence of additional drug resistance.
  5. Conventional culture-based DST for isoniazid may still be used to evaluate patients when the LPA result does not detect isoniazid resistance. This is particularly important for populations with a high pretest probability of resistance to isoniazid.
  6. These recommendations apply to the use of LPA in children based on the generalization of data from adults.

Test description

LPAs are a family of DNA strip-based tests that can detect the MTBC strain and determine its drug resistance profile through the pattern of binding of amplicons (DNA amplification products) to probes targeting the following: specific parts of the MTBC genome (for MTBC detection), the most common resistance-associated mutations to first-line and second-line agents, or the corresponding wild-type DNA sequence (for detection of resistance to anti-TB drugs) (38).

LPAs are based on reverse hybridization DNA strip technology and involve three steps: DNA extraction from M. tuberculosis culture isolates or directly from patient specimens, followed by multiplex PCR amplification and then reverse hybridization with visualization of amplicon binding (or lack thereof) to wild-type and mutation probes (5).

Although LPAs are more technically complex to perform than the Xpert MTB/RIF assay, they can detect isoniazid resistance. Testing platforms have been designed for a reference laboratory setting and are thus most applicable to high TB burden countries. Results can be obtained in 5 hours.

Some of these steps can be automated, making the method quicker and more robust, and reducing the risk of contamination.

The Hain version 1 and version 2 assays include rpoB probes to detect rifampicin resistance, katG probes to detect mutations associated with high-level isoniazid resistance, and inhA promoter probes to detect mutations usually associated with low-level isoniazid resistance. The probes used to detect wild-type and specific mutations are the same for both versions of the Hain LPA (Fig. 2.3.5a).
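
The probe-to-drug relationships described above can be written as a small lookup. This is a simplified sketch: the dictionary, the function name and the input format are hypothetical, and real strip interpretation also depends on control bands and wild-type probes as set out in the manufacturer's instructions.

```python
# Gene targets and associated resistance, as described for Hain versions 1 and 2.
# The inhA promoter association with low-level resistance is "usually", not always.
PROBE_TARGETS = {
    "rpoB": ("rifampicin", None),
    "katG": ("isoniazid", "high-level"),
    "inhA promoter": ("isoniazid", "low-level"),
}


def summarize(mutated_targets: set[str]) -> dict[str, list[str]]:
    """Map targets with a detected mutation to the drugs with inferred resistance."""
    result: dict[str, list[str]] = {}
    for target in sorted(mutated_targets):
        drug, level = PROBE_TARGETS[target]
        label = f"{target} ({level})" if level else target
        result.setdefault(drug, []).append(label)
    return result


profile = summarize({"rpoB", "katG"})
print(profile)
# Inferred resistance to both rifampicin and isoniazid corresponds to an MDR-TB pattern.
print("MDR-TB pattern:", set(profile) >= {"rifampicin", "isoniazid"})
```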

Similarly, the Nipro assay allows for the identification of MTBC, and resistance to rifampicin and isoniazid. The Nipro assay also differentiates M. avium, M. intracellulare and M. kansasii from other non-tuberculous mycobacteria (Fig. 2.3.5b).

The rpoB, katG and inhA promoter mutation probes are the same for the three assays, with the exception of the katG S315N mutation, which is included in the Nipro assay but not in Hain version 1 or version 2. There are some minor variations in the codon regions covered for the wild type among Hain version 1 and version 2, and the Nipro.

Fig. 2.3.5. Examples of different line probe assay strip readouts: (a) Hain GenoType MTBDRplus version 1 and version 2 (Hain Lifescience, Nehren, Germany) and (b) Nipro NTM+MDRTB Detection Kit 2 (Nipro, Tokyo, Japan).

Justification and evidence

In 2015, WHO commissioned an updated systematic review of the accuracy of commercial LPAs for detecting MTBC, and resistance to rifampicin and isoniazid. A total of 74 studies were identified, comprising 94 unique datasets (see Annex 1.3: “FL-LPA”). Of these 94 datasets, 83 evaluated Hain version 1, five evaluated Hain version 2, and six evaluated the Nipro assay. Only one of the studies performed head-to-head testing of all three target LPAs on directly tested clinical specimens and indirectly tested isolates, and these data were included as six separate datasets (39). No studies performed LPA testing on specimens and culture isolates from the same patients, precluding direct within-study comparisons.

Following the 2015 systematic review, the WHO Global TB Programme convened a GDG in March 2016 to assess the data and update the 2008 policy recommendations on using commercial LPAs to detect MTBC, and resistance to isoniazid and rifampicin. The PICO questions are given in Box 2.3.1.

LPAs were compared with a phenotypic culture-based DST reference standard, and a composite reference standard that combined the results from genetic sequencing with results from phenotypic culture-based DST. Phenotypic DST was the primary reference standard applied to all participants for all analyses. These analyses were stratified – first, by susceptibility or resistance to rifampicin or isoniazid (or both) and second, by type of LPA testing (indirect testing or direct testing).

Box 2.3.1 PICO questions

  1. Should LPAs be used to guide clinical decisions to use rifampicin in the direct testing of specimens and the indirect testing of culture isolates from patients with signs and symptoms consistent with TB?
  2. Should LPAs be used to guide clinical decisions to use isoniazid in the direct testing of specimens and the indirect testing of culture isolates from patients with signs and symptoms consistent with TB?
  3. Should LPAs be used to diagnose MDR-TB in patients with signs and symptoms consistent with TB?
  4. Should LPAs be used to diagnose TB in patients with signs and symptoms consistent with TB but for whom sputum-smear results are negative?

Several studies contributed to estimates of only sensitivity (studies with no true negatives and no false positives) or only specificity (studies with no true positives and no false negatives), but not to both. For these studies, a univariate, random-effects meta-analysis of the estimates of sensitivity or specificity was performed separately, to make optimal use of the data. The results from the univariate analysis (using all studies) were compared with the results from the bivariate analysis of the subset of studies that contributed to estimates of both sensitivity and specificity.

If there were at least four studies for index tests with data that contributed only to sensitivity or specificity, a univariate, random-effects meta-analysis was performed to assess one summary estimate, assuming no correlation between sensitivity and specificity. In cases in which there were fewer than four studies, or where substantial heterogeneity was evident on forest plots that precluded a meta-analysis, a descriptive analysis was performed for these index tests. Forest plots were visually assessed for heterogeneity among the studies within each index test and in the summary plots, for variability in estimates and the width of the prediction region (a wider prediction region suggests more heterogeneity).
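
To make the univariate step concrete, the sketch below pools study-level proportions on the logit scale with a random-effects model. It is an illustration under stated assumptions, not the review's actual method: the text does not name an estimator, so the DerSimonian-Laird estimator is assumed here, a 0.5 continuity correction is applied to every study, and the study counts are hypothetical.

```python
import math


def pooled_proportion(events: list[int], totals: list[int]) -> float:
    """Random-effects pooling of proportions on the logit scale
    (DerSimonian-Laird estimator; 0.5 continuity correction for every study)."""
    y, v = [], []
    for e, n in zip(events, totals):
        a, b = e + 0.5, n - e + 0.5
        y.append(math.log(a / b))      # logit of the study proportion
        v.append(1 / a + 1 / b)        # approximate variance of the logit
    w = [1 / vi for vi in v]           # inverse-variance (fixed-effect) weights
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))   # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (len(y) - 1)) / c) if c > 0 else 0.0
    w_re = [1 / (vi + tau2) for vi in v]                        # random-effects weights
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    return 1 / (1 + math.exp(-y_re))   # back-transform to a proportion


# Hypothetical true-positive counts and numbers of resistant isolates in four studies.
print(round(pooled_proportion([45, 88, 30, 60], [50, 92, 35, 61]), 3))
```

Because sensitivity and specificity are pooled separately, this approach ignores any correlation between them, which is the assumption stated in the paragraph above.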

The performance of the tests is summarized in Table 2.3.2. The results are based on various numbers of studies and specimens tested. In some cases, too few studies were available for meta-analysis. The results from the only head-to-head comparison of the three tests are presented in the right-hand columns for comparison. The data presented are all comparisons with phenotypic culture-based DST as the reference standard.

Table 2.3.2. Performance of the three LPA tests for detection of rifampicin and isoniazid resistance with phenotypic culture-based DST as the reference standard.

Implementation considerations

Adopting LPAs to detect rifampicin and isoniazid resistance does not eliminate the need for conventional culture and DST capacity. Culture and phenotypic culture-based DST have critical roles in monitoring patients’ responses to treatment and detecting additional resistance to second-line agents.

  • The adoption of LPA should be phased in, starting at national or central reference laboratories, or those with proven capability to conduct molecular testing. Expansion could be considered, within the context of a country’s plans for laboratory strengthening, the availability of suitable personnel in peripheral centres and the quality of specimen transport systems.
  • Adequate and appropriate laboratory infrastructure and equipment should be provided, to ensure that the required precautions for biosafety and the prevention of contamination are met – specimen processing for culture and procedures for manipulating cultures must be performed in biological safety cabinets in TB-containment laboratories.
  • Laboratory facilities for LPAs require at least three separate rooms, one each for DNA extraction, pre-amplification procedures, and amplification and post-amplification procedures. To avoid contamination, access to molecular facilities must be restricted, a unidirectional workflow must be implemented and stringent cleaning protocols must be established.
  • Appropriate laboratory staff should be trained to conduct LPA procedures. Staff should be supervised by a senior staff member with adequate training and experience in molecular assays. A programme for the external quality assessment of laboratories using LPAs should be developed as a priority.
  • Mechanisms for rapidly reporting LPA results to clinicians must be established, to provide patients with the benefit of early diagnosis. The same infrastructure used for performing LPAs can be used also to perform second-line LPAs.
  • LPAs are designed to detect TB and resistance to rifampicin and isoniazid in the direct testing of processed sputum samples, and in the indirect testing of culture isolates of MTBC. The use of LPAs with other respiratory samples (e.g. from bronchoalveolar lavage or gastric aspiration) or extrapulmonary samples (e.g. tissue samples, CSF or other body fluids) has not been adequately evaluated.
  • The availability of second-line agents is critical in the event that resistance to rifampicin or isoniazid, or both, is detected.
  • For patients with confirmed MDR/RR-TB, second-line LPAs are recommended to detect additional resistance to second-line anti-TB agents.

Research priorities

  • Development of improved understanding of the correlation between the detection of resistance-conferring mutations using culture-based DST and patient outcomes.
  • Review of evidence to confirm or revise different critical concentrations used in culture-based DST methods.
  • Determination of the limit of detection for LPA in detecting heteroresistance.
  • Determination of needs for training, assessing competency and ensuring quality assurance.
  • Gathering of more evidence on the impact on mortality of initiating appropriate treatment for MDR-TB.
  • Meeting the STARD for future diagnostic studies.
  • Performance of country-specific cost–effectiveness and cost–benefit analyses of LPA use in different programmatic settings.

Second-line LPAs

Genotypic (molecular) methods have considerable advantages for scaling up programmatic management and surveillance of DR-TB, offering rapid diagnosis, standardized testing, potential for high throughput and fewer requirements for laboratory biosafety. Molecular tests for detecting drug resistance – for example, the GenoType MTBDRsl assay (Hain Lifescience, Nehren, Germany), hereafter referred to as MTBDRsl (17) – have shown promise for the diagnosis of DR-TB. These tests are rapid (can be performed in a single working day) and detect the presence of mutations associated with drug resistance. MTBDRsl belongs to a category of molecular genetic tests called second-line LPAs (SL-LPAs).

MTBDRsl (version 1.0) was the first commercial SL-LPA for detection of resistance to second-line TB drugs. In 2015, the manufacturer developed and made commercially available version 2.0 of the MTBDRsl assay. Version 2.0 detects the mutations associated with fluoroquinolones and second-line injectable drug (SLID) resistance detected by version 1.0, and additional mutations. Once a diagnosis of MDR/RR-TB has been established, an SL-LPA can be used to detect additional resistance to second-line drugs.

The MTBDRsl assay incorporates probes to detect mutations within genes that are associated with resistance to either fluoroquinolones or SLIDs (gyrA and rrs for version 1.0 and those genes plus gyrB and the eis promoter for version 2.0). The presence of mutations in these regions does not necessarily imply resistance to all the drugs within a particular class. Although specific mutations within these regions may be associated with different levels of resistance (i.e. different minimum inhibitory concentrations) to each drug within these classes, the extent of cross-resistance is not completely understood.

Recommendations

  1. For patients with confirmed MDR/RR-TB, SL-LPA may be used as the initial test, instead of phenotypic culture-based DST, to detect resistance to fluoroquinolones.
  2. For patients with confirmed MDR/RR-TB, SL-LPA may be used as the initial test, instead of phenotypic culture-based DST, to detect resistance to the SLIDs.
Remarks
  • These recommendations apply to the use of SL-LPA for testing sputum specimens (direct testing) and cultured isolates of M. tuberculosis (indirect testing) from both pulmonary and extrapulmonary sites. Direct testing on sputum specimens allows for the earlier initiation of appropriate treatment.
  • These recommendations apply to the direct testing of sputum specimens from MDR/RR-TB, irrespective of the smear status, while acknowledging that the indeterminate rate is higher when testing smear-negative sputum specimens than with smear-positive sputum specimens.
  • These recommendations do not eliminate the need for conventional phenotypic DST capacity, which will be necessary to confirm resistance to other drugs and to monitor the emergence of additional drug resistance.
  • Conventional phenotypic DST can still be used in the evaluation of patients with negative SL-LPA results, particularly in populations with a high pretest probability for resistance to fluoroquinolones or SLID (or both).
  • These recommendations apply to the use of SL-LPA in children with confirmed MDR/RR-TB, based on the generalization of data from adults.
  • Resistance-conferring mutations detected by SL-LPA are highly correlated with phenotypic resistance to ofloxacin and levofloxacin.
  • Resistance-conferring mutations detected by SL-LPA are highly correlated with phenotypic resistance to SLID.
  • Given the high specificity for detecting resistance to fluoroquinolones and SLID, the positive results of SL-LPA could be used to guide the implementation of appropriate infection control precautions.

Test description

The SL-LPA is based on the same principle as the first-line LPA. The assay procedure can be performed directly using a processed sputum sample or indirectly using DNA isolated and amplified from a culture of M. tuberculosis. Direct testing involves the following steps:

1) Decontamination (e.g. with sodium hydroxide) and concentration of a sputum specimen by centrifugation.
2) Isolation and amplification of DNA.
3) Detection of the amplification products by reverse hybridization.
4) Visualization using a streptavidin-conjugated alkaline phosphatase colour reaction.

Indirect testing includes only Steps 2–4. The observed bands, each corresponding to a wild-type or resistance-genotype probe, can be used to determine the drug susceptibility profile of the analysed specimen. The assay can be performed and completed within a single working day.

The index test used was MTBDRsl, and the different characteristics of versions 1.0 and 2.0 are presented in Table 2.3.3. SL-LPAs detect specific mutations associated with resistance to the class of fluoroquinolones (including ofloxacin, levofloxacin, moxifloxacin and gatifloxacin) and SLIDs (including kanamycin, amikacin and capreomycin) in the MTBC. Version 1.0 detects mutations in the gyrA quinolone resistance-determining region (codons 85–97) and rrs (codons 1401, 1402 and 1484). Version 2.0 additionally detects mutations in the gyrB quinolone resistance-determining region (codons 536–541) and the eis promoter region (codons −10 to −14) (40). Mutations in these regions may cause additional resistance to the fluoroquinolones or SLIDs, respectively; thus, version 2.0 is expected to have improved sensitivity for resistance to these drug classes. Mutations in some regions (e.g. the eis promoter region) may be responsible for causing resistance to one drug in a class more than other drugs within that class. For example, the eis C14T mutation is associated with kanamycin resistance in strains from Eastern Europe (41). Version 1.0 also detects mutations in embB that may encode for resistance to ethambutol. Because ethambutol is a first-line drug and was omitted from version 2.0, this review did not determine the accuracy for ethambutol resistance.

Table 2.3.3. Characteristics of GenoType MTBDRsl versions 1.0 and 2.0, as per manufacturer.

More data are needed to better understand the correlation of the presence of certain fluoroquinolone resistance-conferring mutations with phenotypic DST resistance and with patient outcomes.

Fig. 2.3.6 shows an example of MTBDRsl results for version 1.0 and 2.0. A band for the detection of the MTBC (the “TUB” band) is included, as well as two internal controls (conjugate and amplification controls), and a control for each gene locus (version 2.0: gyrA, gyrB, rrs, eis). The two internal controls plus each gene locus control should be positive, otherwise the assay cannot be evaluated for that particular drug. A result can be indeterminate for one locus but valid for another (on the basis of a gene-specific locus control failing).
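The evaluation rule described above can be sketched in Python as follows. This is an illustrative sketch only and is not the manufacturer's reading algorithm: the function name, the band names used as dictionary keys and the returned labels are assumptions made for this example. Only the version 2.0 gene loci (gyrA, gyrB, rrs and eis) and the requirement that both internal controls and the locus control be positive are taken from the text.

```python
# Illustrative sketch only; not the manufacturer's reading algorithm.
# Band names and return labels are assumptions made for this example.

LOCI_V2 = ("gyrA", "gyrB", "rrs", "eis")  # version 2.0 gene loci named in the text


def evaluable_loci(bands):
    """Return {locus: "evaluable" | "indeterminate"} for one strip.

    `bands` maps a control band name to True (band present) or False.
    Expected keys: "conjugate_control", "amplification_control" and one
    "<locus>_control" entry per gene locus. Missing keys count as absent.
    """
    # Both internal controls must be positive for any locus to be read.
    internal_ok = bool(bands.get("conjugate_control", False)) and bool(
        bands.get("amplification_control", False)
    )
    result = {}
    for locus in LOCI_V2:
        # A failed locus control makes only that locus indeterminate.
        locus_ok = bool(bands.get(f"{locus}_control", False))
        result[locus] = "evaluable" if internal_ok and locus_ok else "indeterminate"
    return result
```

For example, a strip on which every control band is present except the eis locus control would be reported as evaluable for gyrA, gyrB and rrs, and indeterminate for eis.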

Fig. 2.3.6. Examples of different GenoType MTBDRsl strip readouts.

The manufacturer supplies a template to help the user read the strips; the banding patterns are scored by eye, transcribed and reported. In high-volume settings, the GenoScan®, an automated reader, can be incorporated to interpret the banding patterns automatically and give a suggested interpretation. If the operator agrees with the interpretation, the results are automatically uploaded, thereby reducing possible transcription errors.

Justification and evidence

In March 2016, the WHO Global TB Programme convened a GDG to assess available data on the use of the MTBDRsl assay. WHO commissioned a systematic review on the accuracy and clinical use of assays for the detection of mutations associated with resistance to fluoroquinolones and SLIDs in people with MDR/RR-TB.

The PICO questions in Box 2.3.2 were designed to form the basis for the evidence search, retrieval and analysis.

Box 2.3.2. PICO questions

  1. Should the MTBDRsl test be used to guide clinical decisions to use fluoroquinolones in patients with confirmed MDR/RR-TB?
    • Direct testing (stratified by smear grade: smear negative; scanty; 1+; ≥2+).
    • Indirect testing.
  2. Should the MTBDRsl test be used to guide clinical decisions to use SLIDs in patients diagnosed with MDR/RR-TB?
    • Direct testing (stratified by smear grade: smear negative; scanty; 1+; ≥2+).
    • Indirect testing.

Twenty-nine unique studies were identified; of these, 26 evaluated the MTBDRsl version 1.0 assay (including 21 studies from the original Cochrane review). Three studies (one published and two unpublished) evaluated version 2.0. Data for version 1.0 and version 2.0 of the MTBDRsl assay were analysed separately. A phenotypic culture-based DST reference standard was used for the primary analyses. These analyses were stratified first by susceptibility or resistance to a particular drug, and second by type of SL-LPA testing (indirect testing or direct testing).

Performance of SL-LPA on sputum specimens and culture isolates

In patients with MDR/RR-TB, a positive SL-LPA result for fluoroquinolone resistance (as a class) or SLID resistance (as a group) can be treated with confidence. The diagnostic accuracy of SL-LPA is similar when performed directly on sputum specimens or indirectly on cultured isolates of M. tuberculosis.

Given the confidence in a positive result and the ability of the test to provide rapid results, the GDG felt that SL-LPA may be considered for use as an initial test for resistance to the fluoroquinolones and, when relevant, SLIDs. However, when the test shows a negative result, phenotypic culture-based DST may be necessary, especially in settings with a high pretest probability for resistance to either fluoroquinolones or SLIDs (or both). The use of SL-LPA in routine care should shorten the time to the diagnosis of resistance to fluoroquinolones and, where relevant, SLIDs, especially when used for the direct testing of sputum specimens of patients with confirmed MDR/RR-TB. Early detection of drug resistance should allow for the earlier initiation of appropriate patient therapy and improved patient health outcomes. Overall, the test performs well in the direct testing of sputum specimens from patients with confirmed MDR/RR-TB, although the indeterminate rate is higher when testing smear-negative sputum specimens compared with smear-positive sputum specimens.

When the MTBDRsl assay is used in the direct testing of smear-negative sputum specimens from a population of patients with confirmed DR-TB, up to 44% of the results may be indeterminate (fewer with version 2.0, although the data are very limited) and hence require repeat or additional testing.

However, if the same test were to be applied to the testing of smear-negative sputum specimens from patients without confirmed TB or DR-TB (i.e. patients suspected of having DR-TB), the indeterminate rate for the test would be significantly higher. Given the test’s sensitivity and specificity when an SL-LPA is done directly on sputum, the GDG felt that SL-LPAs can be used for the testing of all sputum specimens from patients with confirmed MDR/RR-TB, irrespective of whether the microscopy result is positive or negative.

Table 2.3.4. Accuracy of GenoType MTBDRsl (version 1.0) for fluoroquinolone and SLID resistance and XDR-TB, indirect and direct testing (smear-positive specimens), culture-based DST reference standard.

Results for version 2.0 are not presented here because too few studies were available: the data for MTBDRsl version 2.0 were either too sparse or too heterogeneous to combine in a meta-analysis or to compare indirect and direct testing.

Three studies evaluated the MTBDRsl version 2.0 in 562 individuals, including 111 confirmed cases of TB with fluoroquinolone resistance by indirect testing on a culture of M. tuberculosis compared with a phenotypic culture-based DST reference standard. Estimates of sensitivity ranged from 84% to 100% and specificity from 99% to 100%.

See Web Annex 4.16: Drug concentrations used in culture-based DST SL-LPA for details of the drug concentrations used in culture-based DST to evaluate the performance of SL-LPAs in each included study.

Implementation considerations

The SL-LPA should only be used to test specimens from patients with confirmed MDR/RR-TB. Adoption of SL-LPAs does not eliminate the need for conventional culture and DST capability. Despite the good specificity of SL-LPAs for the detection of resistance to fluoroquinolones and the SLIDs, culture and phenotypic DST are required to completely exclude resistance to these drug classes as well as to other second-line drugs. The following implementation considerations apply:

  • SL-LPAs cannot determine resistance to individual drugs in the class of fluoroquinolones. Resistance-conferring mutations detected by SL-LPAs are highly correlated with phenotypic resistance to ofloxacin and levofloxacin.
  • Mutations in some regions (e.g. the eis promoter region) may be responsible for causing resistance to one drug in a class more than other drugs within that class. For example, the eis C14T mutation is associated with kanamycin resistance in strains from Eastern Europe.
  • SL-LPAs should be used in the direct testing of sputum specimens, irrespective of whether samples are smear negative or smear positive.
  • SL-LPAs are designed to detect TB and resistance to fluoroquinolones and SLIDs from sputum samples. Other respiratory samples (e.g. bronchoalveolar lavage and gastric aspirates) or extrapulmonary samples (tissue samples, CSF or other body fluids) have not been adequately evaluated.
  • Culture and phenotypic DST play a critical role in the monitoring of a patient’s response to treatment, and in detecting additional resistance to second-line drugs during treatment.
  • SL-LPAs are suitable for use at the central or national reference laboratory level; they can also be used at the regional level if the appropriate infrastructure can be ensured (three separate rooms are required).
  • All patients identified by SL-LPAs should have access to appropriate treatment and ancillary medications.

Research priorities

  • Development of improved understanding of the correlation between the detection of resistance-conferring mutations with phenotypic DST results and with patient outcomes.
  • Development of improved knowledge of the presence of specific mutations detected with SL-LPA correlated with minimum inhibitory concentrations for individual drugs within the classes of fluoroquinolones and SLIDs.
  • Determination of the limit of detection of SL-LPA for the detection of heteroresistance.
  • Gathering of more evidence on the impact of MTBDRsl on appropriate MDR-TB treatment initiation and mortality.
  • Adherence of future studies to the recommendations in the STARD statement (42), to improve the quality of reporting.
  • Performance of country-specific cost–effectiveness and cost–benefit analyses of the use of SL-LPA in different programmatic settings.

High complexity reverse hybridization-based NAATs for detection of pyrazinamide resistance

Pyrazinamide is an important antibiotic for the treatment of both drug-susceptible TB and DR-TB because of its unique ability to eradicate persisting bacilli and its synergistic properties with other antibiotics. Mono-resistance to pyrazinamide is rare; however, pyrazinamide resistance is strongly associated with MDR/RR-TB, with an estimated 30–60% of MDR/RR-TB also resistant to pyrazinamide. Thus, for people diagnosed with RR-TB, it is important to detect the presence of pyrazinamide resistance so that clinicians can make an informed decision on whether to include or exclude pyrazinamide in the treatment regimen. The high complexity hybridization-based NAAT may be used for diagnosis of pyrazinamide resistance on patient isolates; however, performance of this test requires appropriate infrastructure and skilled staff.

Recommendation

In people with bacteriologically confirmed TB, high complexity reverse hybridization-based NAATs may be used on Mtb culture isolates for detection of pyrazinamide resistance rather than culture-based phenotypic DST.

Conditional recommendation, very low certainty of evidence for diagnostic accuracy

In terms of subgroups to be considered for this recommendation, no special considerations are required (e.g. for children, people living with HIV and those with extrapulmonary TB), given that the test is recommended for use on culture isolates.

Test description

Nipro (Osaka, Japan) developed Genoscholar™ PZA-TB, an LPA with reverse hybridization-based technology for detection of pyrazinamide resistance (43). This assay is a commercially available rapid molecular test for detection of pyrazinamide resistance. Unlike the MTBDRplus and MTBDRsl LPAs, the Genoscholar PZA-TB LPA does not include specific mutant probes, because resistance mutations are widespread across the entire pncA gene with no predominant mutations. Instead, the Genoscholar PZA-TB assay targets a 700 base pair (bp) fragment covering the entire pncA gene and promoter region up to nucleotide −18 of the wild-type H37Rv reference strain.

Fig. 2.3.7. Nipro GenoScholar PZA-TB II strip (a) and Nipro GenoScholar PZA-TB II kit contents (b).

DNA extracted from cultures is amplified with primers by PCR. Amplified DNA is then hybridized to complementary oligonucleotide probes that are bound on a membrane strip. Streptavidin labelled with alkaline phosphatase is then added, to bind to any hybrids formed in the previous step. Next, a substrate is added, and an enzymatic reaction results in purple bands, which are visually interpreted. The absence of wild-type probe binding indicates the presence of a mutation. The first version of the assay contained 47 probes, which covered the pncA promoter and open reading frame. The second version contained 48 probes, three of which (pncA 16, 17 and 35) represent silent mutations known to be genetic markers not associated with pyrazinamide resistance: Gly60Gly (probe 16), Ser65Ser (probe 17) and Thr142Thr (probe 35).

Justification and evidence

The Genoscholar PZA-TB LPA assay, which is already commercially available, could potentially be implemented for diagnosis of pyrazinamide resistance in routine care. However, limited data have been published on the diagnostic accuracy of the assay. This systematic review with meta-analysis aimed to assist in collating all the available data to understand the diagnostic accuracy of the pyrazinamide LPA assay for detection of pyrazinamide resistance in TB patients, to guide policy-makers and clinicians.

The WHO Global TB Programme initiated an update of the current guidelines and commissioned a systematic review on the use of high complexity reverse hybridization-based NAATs for detection of pyrazinamide resistance in people with signs and symptoms of TB.

Two PICO questions were designed to form the basis for the evidence search, retrieval and analysis:

  1. Should high complexity reverse hybridization-based NAATs on sputum be used to diagnose pyrazinamide resistance in patients with microbiologically confirmed pulmonary TB, irrespective of resistance to rifampicin, as compared with culture-based phenotypic DST or composite reference standard?
  2. Should high complexity reverse hybridization-based NAATs on isolates be used to diagnose pyrazinamide resistance in patients with microbiologically confirmed pulmonary TB, irrespective of resistance to rifampicin, as compared with culture-based phenotypic DST?

The databases searched were PubMed, Web of Science and Embase, and they were searched without language or date restrictions. The search query was (PZA OR pyrazinamide OR pncA) AND (tuberculosis) AND (“line-probe assay” OR LPA OR “hybridization-based technology”). In addition, we approached Nipro (Osaka, Japan) to identify non-published data.

The microbiological reference standard was defined either as phenotypic culture-based DST performed using BD MGIT 960 PZA liquid assay or another acceptable phenotypic assay, or as genotypic DST performed using either targeted sequencing of the pncA gene or whole genome sequencing. In the case of genotypic DST, all samples with a pncA wild type were defined as being susceptible, while any variant in pncA was considered resistant, which implicitly would categorize “silent” mutations as resistant. In contrast, the composite reference standard was defined by classifying all samples with pncA wild type, pncA silent mutations and neutral mutations as being susceptible, while any other variant in pncA was considered resistant (44).
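The difference between the genotypic and composite reference standards can be illustrated with a minimal Python sketch. The function names and the four variant-class labels ("wild_type", "silent", "neutral" and "other") are hypothetical and were chosen for this example; the classification rules themselves follow the definitions given above.

```python
# Illustrative sketch of the two classification rules described above.
# The variant-class labels are hypothetical names chosen for this example.

SUSCEPTIBLE = "susceptible"
RESISTANT = "resistant"

VARIANT_CLASSES = {"wild_type", "silent", "neutral", "other"}


def genotypic_reference(variant_class):
    """Genotypic DST: pncA wild type is susceptible; any variant is resistant."""
    if variant_class not in VARIANT_CLASSES:
        raise ValueError(f"unknown variant class: {variant_class!r}")
    return SUSCEPTIBLE if variant_class == "wild_type" else RESISTANT


def composite_reference(variant_class):
    """Composite standard: wild type, silent and neutral are susceptible."""
    if variant_class not in VARIANT_CLASSES:
        raise ValueError(f"unknown variant class: {variant_class!r}")
    if variant_class in {"wild_type", "silent", "neutral"}:
        return SUSCEPTIBLE
    return RESISTANT
```

Under these rules the two standards disagree only for silent and neutral variants, which the genotypic standard classifies as resistant and the composite standard classifies as susceptible.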

The certainty of the evidence was assessed consistently through PICO questions, using the GRADE approach (36, 37), which produces an overall quality assessment (or certainty) of evidence and a framework for translating evidence into recommendations. In the GRADE approach, even if diagnostic accuracy studies are of observational design, they start as high-quality evidence.

GRADEpro Guideline Development Tool software (20) was used to generate summary of findings tables. The quality of evidence was rated as high (not downgraded), moderate (downgraded one level), low (downgraded two levels) or very low (downgraded more than two levels), based on five factors: risk of bias, indirectness, inconsistency, imprecision and other considerations. The quality (certainty) of evidence was downgraded by one level when a serious issue was identified and by two levels when a very serious issue was identified in any of the factors used to judge the quality of evidence.
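The downgrading arithmetic described above can be sketched in Python as follows. This is a simplified illustration and not part of the GRADEpro software: the function name and the judgement labels are assumptions, and in practice the final rating is a panel judgement rather than a purely mechanical sum.

```python
# Simplified illustration of the downgrading rule described in the text.
# Function name and judgement labels are assumptions for this example.

LEVELS = ["high", "moderate", "low", "very low"]

# Levels deducted per judgement: one for a serious issue, two for very serious.
DOWNGRADE = {"not serious": 0, "serious": 1, "very serious": 2}


def grade_certainty(judgements):
    """Return the certainty rating for a dict of {factor: judgement}.

    Diagnostic accuracy evidence starts at "high"; more than two levels of
    downgrading in total gives "very low".
    """
    total = sum(DOWNGRADE[judgement] for judgement in judgements.values())
    return LEVELS[min(total, len(LEVELS) - 1)]
```

For example, evidence judged serious for risk of bias, indirectness and inconsistency would be downgraded by three levels in total and rated very low.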

Data synthesis was structured around the two preset PICO questions, as outlined below. Three web annexes give additional information, as follows:

  • details of studies included in the current analysis (Web Annex 1.9: High complexity reverse hybridization-based NAATs);
  • a summary of the results and details of the evidence quality assessment (Web Annex 2.9: High complexity reverse hybridization-based NAATs); and
  • a summary of the GDG panel judgements (Web Annex 3.9: High complexity reverse hybridization-based NAATs).

PICO 1: Should high complexity reverse hybridization-based NAATs on sputum be used to diagnose pyrazinamide resistance in patients with microbiologically confirmed pulmonary TB, irrespective of resistance to rifampicin, as compared with culture-based phenotypic DST or composite reference standard?

Three studies with a total of 122 participants provided data for evaluation of these NAATs for detection of pyrazinamide resistance, including two studies (101 participants) with a phenotypic culture-based reference standard and one study (21 participants) with a genotypic reference standard. The number of studies and participants was considered insufficient to draw a conclusion on the diagnostic accuracy of high complexity reverse hybridization-based NAATs on sputum.

PICO 2: Should high complexity reverse hybridization-based NAATs on isolates be used to diagnose pyrazinamide resistance in patients with microbiologically confirmed pulmonary TB, irrespective of resistance to rifampicin, as compared with culture-based phenotypic DST?

Seven studies with a total of 964 participants provided data for evaluation of these NAATs for detection of pyrazinamide resistance compared with a phenotypic culture-based reference standard (Fig. 2.3.8).

The studies suffered from selection bias because they selected isolates with a wide range of different pncA mutations rather than a representative sample from a population. Thus, the evidence was downgraded by one level for risk of bias. The included studies did not directly address the review question; hence, the evidence was downgraded one level for indirectness. The Burhan trial and the Rienthong study are outliers for their sensitivities compared with the other studies; hence, the evidence was downgraded one level for inconsistency. Taking these judgements together, the quality (certainty) of evidence was rated very low for sensitivity and low for specificity.

Fig. 2.3.8. Forest plot of included studies for pyrazinamide resistance detection, irrespective of rifampicin resistance, with culture-based phenotypic DST as the reference standard. CI: confidence interval; DST: drug susceptibility testing; FN: false negative.

The overall sensitivity for pyrazinamide resistance in these seven studies ranged from 36% to 100% and the specificity from 96% to 100%. The pooled sensitivity was 81.2% (95% CI: 75.4–85.8%) and specificity was 97.8% (95% CI: 96.5–98.6%).
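To illustrate how the pooled estimates translate into predictive values at a given pretest probability, the following Python sketch applies the standard Bayes formulae. It is an illustration only: the function name is an assumption, and the prevalence values in the usage note are taken from the 30–60% range cited earlier in this section for pyrazinamide resistance among MDR/RR-TB, not from the included studies.

```python
# Illustrative calculation using the standard formulae for predictive values.
# The function name is an assumption made for this example.

POOLED_SENSITIVITY = 0.812  # pooled estimate reported in the text
POOLED_SPECIFICITY = 0.978  # pooled estimate reported in the text


def predictive_values(prevalence, sensitivity=POOLED_SENSITIVITY,
                      specificity=POOLED_SPECIFICITY):
    """Return (PPV, NPV) for a pretest probability strictly between 0 and 1."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv
```

With the pooled estimates, `predictive_values(0.30)` gives a positive predictive value of about 94% and a negative predictive value of about 92%; at a prevalence of 60% the negative predictive value falls to about 78%. A negative result is therefore less reassuring where the pretest probability of resistance is high.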

More details on diagnostic accuracy of the high complexity reverse hybridization-based NAATs, including comparison with genotypic and composite reference standards are available in Web Annex 4.17: High complexity reverse hybridization-based NAATs: diagnostic accuracy for detection of resistance to pyrazinamide. A systematic review.

Cost–effectiveness analysis

This section answers the following additional question:

What is the comparative cost, affordability and cost–effectiveness of implementation of high complexity reverse hybridization-based NAATs?

A systematic review was carried out, focusing on economic evaluations of high complexity reverse hybridization-based NAATs. Four online databases (Embase, Medline, Web of Science and Scopus) were searched for new studies published from 1 January 2010 through 17 September 2020. The citations of all eligible articles, guidelines and reviews were reviewed for additional studies. The experts and test manufacturers were also contacted to identify any additional unpublished studies.

The objective of the review was to summarize current economic evidence and further understand the costs, cost–effectiveness and affordability of high complexity reverse hybridization-based NAATs.

No published studies were identified assessing costs or cost–effectiveness using the commercially available high complexity hybridization-based NAAT (Genoscholar PZA-TB II, Nipro Japan). Indirect evidence was available from several sources. Four studies examining other commercially available LPAs (Genotype MTBDRsl and MTBDRplus, Hain Lifescience) were identified.

The Genoscholar PZA LPA was developed for use with the Nipro automated MultiBlot; however, a recent unpublished trial8 demonstrated that the Twincubator by Hain Lifescience could be used successfully with this LPA. This finding could make it easier to implement the Genoscholar PZA LPA in selected settings where Hain Lifescience equipment is already in use.

How large are the resource requirements (costs)?

No direct evidence from published studies was found regarding the total resources required. Resource requirements will include the purchase of test kits (Genoscholar PZA LPA: US$ 16/test kit consumables only), and the equipment, which is available for US$ 14 000. Operational costs are frequently several times greater than test kit costs (and will vary across settings), but are usually not accounted for. Nipro hopes that further reductions in test costs can be achieved when the Genoscholar PZA-TB II product is distributed globally.

Unit test costs for the Genotype MTBDRsl and MTBDRplus ranged from US$ 23.46 to US$ 108.70 (45–48), with higher unit test costs in countries such as China and South Africa, largely driven by higher staff wages and operational costs. Extrapolations from unit test costs using different LPAs should be done with caution, and they are not intended to be directly transferrable estimates. Nevertheless, these indirect data do suggest that the total unit test cost of the Genoscholar PZA-TB II is likely several-fold higher than the unit test kit consumable cost of US$ 16.

Total costs will vary, depending on testing volume, numbers eligible for testing and prevalence of pyrazinamide resistance in the population. The impact on the budget will depend on the current standard of care, diagnostic and care pathways, and associated resource use.

What is the certainty of the evidence of resource requirements (costs)?

Direct costs related to test kits and machinery are available, whereas several important items related to resource use (e.g. staff time, and overhead and operational costs associated with implementing Genoscholar PZA-TB II) have not been investigated. Differences in resource use between Genoscholar PZA-TB II and existing approaches will vary across settings that are using different phenotypic and genotypic DST. Also, there is important variability in costs of staff time and operation (e.g. testing volume) across settings.

Does the cost–effectiveness of the intervention favour the intervention or the comparison?

No cost–effectiveness studies were identified using the Genoscholar PZA-TB II. Extrapolation of cost–effectiveness data from other LPAs is not advised owing to differences in diagnostic accuracy, resistance prevalence, and the testing and treatment cascade of care.

More details on economic evidence synthesis and analysis are given in Web Annex 4.9: Systematic literature review of economic evidence for NAATs to detect TB and DR-TB in adults and children.

User perspective

This section answers the following questions about key informants’ views and perspectives on the use of high complexity reverse hybridization-based NAATs:

  • Is there important uncertainty about or variability in how much end-users value the main outcomes?
  • What would be the impact on health equity?
  • Is the intervention acceptable to key stakeholders?
  • Is the intervention feasible to implement?

The synthesis and analysis of qualitative evidence on end-users’ perspectives are discussed above in the section “User perspective” for moderate complexity automated NAATs.

Findings of the review and interviews

The main findings of the systematic review and interviews are given below. Where information is from the review, a level of confidence in the QES is given; where it is from interviews, this is indicated with ‘Interviews’.

Is there important uncertainty about or variability in how much end-users value the main outcomes?

  • Patients in high burden TB settings value:
    • getting an accurate diagnosis and reaching diagnostic closure (finally knowing “what is wrong with me”);
    • avoiding diagnostic delays because they exacerbate existing financial hardships and emotional and physical suffering, and make patients feel guilty for infecting others (especially children);
    • having accessible facilities; and
    • reducing diagnosis-associated costs (e.g. travel, missing work) as important outcomes of the diagnostic.
    QES: moderate confidence

  • The high complexity reverse hybridization-based NAATs meet some preferences and values of laboratory staff and clinicians, in that the current test:
    • provides quicker results about pyrazinamide resistance than other available methods (e.g. culture DST);
    • can provide information on different concentration levels; and
    • targets a drug that is widely used in first-line TB treatment.
    Interviews

What would be the impact on health equity?

The impact on health equity would be similar to that of moderate complexity automated NAATs, plus the following:

  • Lengthy diagnostic delays, underuse of diagnostics, lack of TB diagnostic facilities at lower levels and too many eligibility restrictions hamper access to prompt and accurate testing and treatment, particularly for vulnerable groups.
    QES: high confidence
    Applicability to three index tests also confirmed in interviews
  • Staff and managers voiced concerns about the sustainability of funding and maintenance, complex conflicts of interest between donors and implementers, and the strategic and equitable use of resources, which makes it difficult to ensure equitable access to cartridge-based diagnostics.
    QES: high confidence
  • For patients, access to clear, comprehensible and dependable information on what TB diagnostics are available to them and how to interpret results is a vital component of equity; lack of such access represents an important barrier for patients.
    Interviews
  • New treatment options need to be matched with new diagnostics: it is important to improve access to treatment based on new diagnostics, and to improve access to diagnostics for new treatment options.
    Interviews
  • The speed at which WHO guidelines are changing does not match the speed at which many country programmes are able to implement the guidelines. This translates into differential access to new TB diagnostics and treatment:
    • between countries (i.e. between those that can and cannot quickly keep up with the rapidly changing TB diagnostic environment); and
    • within countries (i.e. between patients who can and cannot afford the private health system that is better equipped to quickly adopt new diagnostics and policies).
    Interviews

Is the intervention acceptable to key stakeholders?

  • Acceptability of a high complexity reverse hybridization-based NAAT depends on how well the test performs on different samples, because laboratory staff question how well LPA methods work on smear-negative samples. If samples need to be cultured before the pyrazinamide LPA is run, this may undermine the benefits of this method’s quicker turnaround time compared with phenotypic DST for pyrazinamide. Acceptability also depends on how well the test actually detects mutations specific to pyrazinamide resistance; clinicians and laboratory staff may require further clarification and justification in some settings as to why this specific drug test is being prioritized, given that it is not currently part of routine DST.
  • Specific feasibility challenges (training and infrastructure requirements, sample quality, result interpretation system), general feasibility challenges (as identified in the interview study and QES, respectively) and accumulated delays risk undoing the added value and benefits identified by the users (e.g. avoiding delays and drug-resistance information).
    QES: high confidence and interviews

Is the intervention feasible to implement?

  • The feasibility of implementing the pyrazinamide LPA is challenged by the significant training and laboratory infrastructure required to implement this method. Feasibility also hinges on the availability of an automated interpretation system, because the result is difficult to interpret.
    Interviews

Implementation considerations

Factors to consider when implementing a high complexity hybridization-based NAAT for detection of pyrazinamide resistance are as follows:

  • There are specific concerns about the complexity and difficulty of interpretation. The large number of bands makes it difficult to read the result of the high complexity reverse hybridization-based NAAT.
  • Local epidemiological data on resistance prevalence should guide local testing algorithms, whereas pretest probability is important for the clinical interpretation of test results.
  • The cost of a test varies, depending on the number of samples in a batch, staff time and other parameters requiring a local costing exercise to be performed.
  • Low, moderate, and high complexity tests have a successive increase in technical competency needs (qualifications and skills) and staff time, impacting planning and budgeting.
  • Availability and timeliness of local support service and maintenance should be considered when selecting a provider.
  • Laboratory accreditation and compliance with a robust quality management system (including appropriate quality control) are essential for sustained service excellence and trust.
  • Training of both laboratory and clinical staff will ensure effective delivery of services and clinical impact.
  • Use of connectivity solutions for communication of results is encouraged, to improve efficiency of service delivery and time to treatment initiation.
  • Based on a multinational, population-based study, levels of pyrazinamide resistance varied widely in the surveyed settings (3.0–42.1%). In all settings, pyrazinamide resistance was significantly associated with rifampicin resistance (49).
  • Implementation of a high complexity hybridization-based NAAT requires laboratories with the required infrastructure, space and functional sample referral systems.
  • Because there are several manual steps involved, well-trained staff are needed to set up assays and maintain instruments. Special training and experience are required for reading of banding patterns on the strip.

Research priorities

Research priorities for a high complexity hybridization-based NAAT for detection of pyrazinamide resistance are as follows:

  • diagnostic accuracy of high complexity hybridization-based NAATs in direct testing on sputum and non-sputum samples in people with signs and symptoms of TB, with or without resistance to rifampicin;
  • impact of diagnostic technologies on clinical decision-making and outcomes important to patients (e.g. cure, mortality, time to diagnosis and time to start treatment) in all patient populations;
  • impact of specific mutations on treatment outcomes among people with DR-TB;
  • use, integration and optimization of diagnostic technologies in the overall landscape of testing and care, as well as diagnostic pathways and algorithms;
  • economic studies evaluating the costs, cost–effectiveness and cost–benefit of diagnostic technologies;
  • qualitative studies evaluating equity, acceptability, feasibility and end-user values of diagnostic technologies; and
  • interpretation of the results from a high complexity hybridization-based NAAT compared with sequencing and newer evidence on genotypic and phenotypic associations.

Targeted next-generation sequencing NEW

Targeted NGS technology couples amplification of selected genes with NGS technology to detect resistance to many drugs with a single test. Also, since targeted NGS can interrogate entire genes to identify specific mutations associated with resistance, tests based on this technology may be more accurate than existing WRDs. In addition, new tests based on NGS can detect resistance to new and repurposed drugs that are not currently included in any other molecular assays. Hence, tests based on targeted NGS offer great potential to provide comprehensive resistance detection matched to modern treatment regimens.

Recommendations

  1. In people with bacteriologically confirmed pulmonary TB disease, targeted next-generation sequencing technologies may be used on respiratory samples to diagnose resistance to rifampicin, isoniazid, fluoroquinolones, pyrazinamide and ethambutol rather than culture-based phenotypic drug susceptibility testing.
    (Conditional recommendation, certainty of evidence moderate [isoniazid and pyrazinamide], low [rifampicin, fluoroquinolones and ethambutol])

Remarks

  • Priority should be assigned to those at higher risk of resistance to first-line treatment medications, including individuals who:

    continue to be smear or culture positive after 2 or more months of treatment, or experience treatment failure;

    have previously had TB treatment;

    are in contact with a person known to have resistance to TB drugs; or

    reside in settings or belong to subgroups where there is a high probability of resistance to either rifampicin, isoniazid or fluoroquinolone (used in new shorter regimens), or where there is a high prevalence of M. tuberculosis strains harbouring mutations not detected by other rapid molecular tests.

  • This recommendation is conditional because of the lack of data on health benefits, the variable certainty of evidence on diagnostic accuracy, and the fact that accuracy is suboptimal for certain drugs. In addition, because this is a new technology that has not yet been widely implemented, there is still limited and variable evidence on costs, cost–effectiveness and feasibility of implementation.

  2. In people with bacteriologically confirmed rifampicin-resistant pulmonary TB disease, targeted NGS technologies may be used on respiratory samples to diagnose resistance to isoniazid, fluoroquinolones, bedaquiline, linezolid, clofazimine, pyrazinamide, ethambutol, amikacin and streptomycin rather than culture-based phenotypic drug susceptibility testing.
    (Conditional recommendation, certainty of evidence high [isoniazid, fluoroquinolones and pyrazinamide], moderate [ethambutol], low [bedaquiline, linezolid, clofazimine and streptomycin], very low [amikacin])

Remarks

  • Priority should be given to those at a higher risk of resistance to medications used for the treatment of RR-TB, including individuals who:

    continue to be smear or culture positive after 2 months or more of treatment or have experienced treatment failure;

    have previously had TB treatment, including with the new and repurposed drugs;

    are in contact with a person known to have resistance to TB drugs, including the new and repurposed drugs; or

    have pre-XDR-TB with resistance to fluoroquinolones.

  • As above, this recommendation is conditional because of the lack of data on health benefits, the variable certainty of evidence on diagnostic accuracy, the fact that accuracy is suboptimal for certain drugs, and limited and variable evidence on costs, cost–effectiveness and feasibility of implementation.

The products and drugs for which eligible data met the class-based performance criteria are listed below:

Deeplex® Myc-TB (Genoscreen, France): rifampicin, isoniazid, pyrazinamide, ethambutol, fluoroquinolones, bedaquiline, linezolid, clofazimine, amikacin and streptomycin

AmPORE-TB® (Oxford Nanopore Diagnostics, United Kingdom): rifampicin, isoniazid, fluoroquinolones, linezolid, amikacin and streptomycin

TBseq® (Hangzhou ShengTing Medical Technology Co., China): ethambutol

Where a product has not yet met the requirements for a specific drug (i.e., the drug is not listed), further improvements to the product are needed, and a review of the evidence is necessary before clinical use.

Test description

Three products met the inclusion criteria for detection of drug resistance to at least one of the anti-TB drugs under evaluation.

  • The Deeplex® Myc-TB test (Genoscreen, France) is a targeted NGS-based kit for the simultaneous identification of mycobacterial species, genotyping and prediction of drug resistance of MTBC strains, directly applicable on sputum samples (50). The assay relies on deep sequencing of a 24-plex amplicon mix, and it targets 18 MTBC gene regions associated with resistance to a number of anti-TB drugs. Mycobacterial species identification is performed by targeting the hsp65 gene; the spoligotyping target (CRISPR/Direct Repeat locus) and phylogenetic single nucleotide polymorphisms (SNPs) in targets associated with drug resistance are used for MTBC strain genotyping. The assay is performed using the Nextera XT and DNA Flex library preparation kits on the iSeq 100, MiniSeq, MiSeq and NextSeq sequencing platforms (Illumina). The solution includes an automated analysis pipeline of the sequencing data in a secure online application with integrated databases for results interpretation.
  • The AmPORE-TB® test (Oxford Nanopore Diagnostics, United Kingdom) – previously referred to as Nano-TB – is a targeted NGS-based kit for the simultaneous identification of mycobacterial species and the detection of MTBC genetic variants associated with antimicrobial resistance in DNA extracted from sputum samples.9 The assay relies on sequencing of a 27-plex amplicon mix: 24 drug-resistance targets, a genotyping target, a non-tuberculous mycobacteria (NTM) identification target (hsp65) and an internal control. The 24 drug-resistance targets are MTBC gene regions that are associated with resistance to various TB drugs. Mycobacterial species identification is performed by targeting the hsp65 gene; the spoligotyping target (CRISPR/Direct Repeat locus) is used for MTBC strain genotyping. The assay is performed using the OND AmPORE-TB kit (OND-TBDR001-XX) and Flow Cells (OND-FLO-MIN001-XX) on the GridION Diagnostic Sequencing System (OND). The sequencing control software on the device can automatically start and report the results for the analysis workflows installed. The AmPORE-TB includes analysis software pre-installed on a device that processes readouts produced by the sequencing control software and creates an easy-to-interpret report, all performed locally on the device.
  • The TBseq® test (Hangzhou ShengTing Medical Technology Co., China) is a kit based on targeted NGS that is used for the simultaneous identification of mycobacterial species and the prediction of drug resistance of MTBC strains; it is directly applicable to clinical specimens such as sputum and bronchoalveolar lavage fluid (51). The assay relies on deep sequencing of a multiplex amplification mix, and it targets 21 MTBC genes associated with resistance to various TB drugs. Mycobacterial species identification is performed by targeting the 16S and hsp65 gene regions. The assay is performed using the Universal Gene Sequencing Kit (ShengTing) to generate libraries that are sequenced on either a MinION or a GridION platform (Oxford Nanopore Technologies). The solution includes automated analysis software (Nano TNGS V1.0) for sequencing data processing and a secure online application (TBseq® Web App) with integrated databases for interpretation of results.
Image ch2f22

Source: Reproduced with permission of GenoScreen, © 2024. All rights reserved.

Image ch2f23

Source: Reproduced with permission of Oxford Nanopore Technologies plc, © 2024. All rights reserved.

Image ch2f24

Source: Reproduced with permission of Hangzhou ShengTing Medical Technology Co., © 2024. All rights reserved.

Justification and evidence

Diagnostic accuracy and health benefits

Two health questions were designed using the PICO approach, to form the basis for the evidence search, retrieval and analysis.

  1. Should targeted NGS as the initial test be used to diagnose drug resistance in individuals with bacteriologically confirmed pulmonary TB disease? This question applies to:

    rifampicin, using a composite reference standard of phenotypic DST and whole genome sequencing (WGS), and Xpert MTB/RIF® or Xpert Ultra®;

    isoniazid, using phenotypic DST as the reference standard;

    levofloxacin, using phenotypic DST as the reference standard;

    moxifloxacin, using phenotypic DST as the reference standard;

    pyrazinamide, using a composite reference standard of phenotypic DST and WGS; and

    ethambutol, using a composite reference standard of phenotypic DST and WGS.

  2. Should targeted NGS be used to diagnose drug resistance in individuals with bacteriologically confirmed rifampicin-resistant pulmonary TB disease? This question applies to:

    isoniazid, using phenotypic DST as the reference standard;

    levofloxacin, using phenotypic DST as the reference standard;

    moxifloxacin, using phenotypic DST as the reference standard;

    pyrazinamide, using a composite reference standard of phenotypic DST and WGS;

    bedaquiline, using phenotypic DST as the reference standard;

    linezolid, using phenotypic DST as the reference standard;

    clofazimine, using phenotypic DST as the reference standard;

    amikacin, using phenotypic DST as the reference standard;

    ethambutol, using a composite reference standard of phenotypic DST and WGS; and

    streptomycin, using phenotypic DST as the reference standard.

A broad search was conducted to find, appraise and synthesize evidence about health benefits and the diagnostic test accuracy of targeted NGS compared with phenotypic drug susceptibility testing for patients with bacteriologically confirmed TB or with bacteriologically confirmed rifampicin-resistant pulmonary TB disease. A comprehensive search of three databases (Medline, Ovid Embase and Scopus) for relevant citations was performed. No date restriction was applied; the search was initially performed on 7 September 2022 and repeated on 17 January 2023. In addition, WHO made a public call for data and contacted well-known experts in the field to ask whether they had, or knew of, unpublished data that could contribute.

No data were found for the impact of targeted NGS on patient-level health effects. For the analysis of diagnostic accuracy, because few data were available in the literature, all data identified from the literature were included after correspondence with the authors. Hence, no manual data extraction from publications was required. A post-hoc decision was made to perform only an individual patient data (IPD) meta-analysis; thus, any study that could not provide IPD was excluded. Two report authors made independent assessments of methodological quality using QUADAS-2. Disagreements were resolved by discussion and uncertainties or disagreements were reviewed by an independent third party.

Subanalyses were performed to assess the diagnostic test accuracy in PLHIV and for semiquantitative results (derived from cycle thresholds) from Xpert MTB/RIF® or Xpert Ultra®, where “very low” or “low” concentrations of M. tuberculosis were compared with “medium” or “high” concentrations. The very low or low semiquantitative categories represent paucibacillary disease states, such as those frequently observed in paediatric TB.

Data were included from both published and unpublished prospective, observational clinical studies of targeted NGS platform diagnostic accuracy. All studies where targeted NGS had been performed directly from processed clinical samples were included, whereas those performed exclusively on cultured isolates were excluded. All studies were required to have comparator phenotypic DST data as a reference; in the cases of rifampicin, ethambutol and pyrazinamide, studies were required to also have WGS, to allow a composite reference to be generated. Rifampicin resistance results and semiquantitative results from Xpert MTB/RIF® or Xpert Ultra® were requested from all studies.

Given that this was a review of the diagnostic accuracy of a class of diagnostic platforms, all the data from each platform alone were analysed to assess which to include in an analysis to inform a class recommendation. Where the performance of any one platform appeared to be an outlier for sensitivity or specificity, that platform was excluded from subsequent meta-analyses. A platform was considered to be an outlier for a particular drug if the point estimate for sensitivity was more than 10 percentage points worse than the best performing platform, or where the point estimate for specificity was more than 5 percentage points worse.
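The outlier rule described above can be illustrated with a short sketch. It assumes that "best performing" is taken separately for sensitivity and for specificity, which the text does not state explicitly; the platform names and point estimates are hypothetical and are not results from the review.

```python
from typing import Dict, List

# Margins taken from the text: >10 percentage points for sensitivity,
# >5 percentage points for specificity.
SENS_MARGIN = 10.0
SPEC_MARGIN = 5.0


def find_outliers(estimates: Dict[str, Dict[str, float]]) -> List[str]:
    """Return platforms whose point estimates fall outside the margins.

    `estimates` maps a platform name to its point estimates (in percent)
    for one drug. Assumption: the comparison is made against the best
    sensitivity and the best specificity independently.
    """
    best_sens = max(e["sensitivity"] for e in estimates.values())
    best_spec = max(e["specificity"] for e in estimates.values())
    outliers = []
    for name, e in estimates.items():
        sens_gap = best_sens - e["sensitivity"]
        spec_gap = best_spec - e["specificity"]
        # "More than" the margin: a gap exactly equal to it is not an outlier.
        if sens_gap > SENS_MARGIN or spec_gap > SPEC_MARGIN:
            outliers.append(name)
    return outliers


# Hypothetical point estimates for a single drug (not data from the review).
example = {
    "platform_A": {"sensitivity": 96.0, "specificity": 98.0},
    "platform_B": {"sensitivity": 85.0, "specificity": 97.0},
    "platform_C": {"sensitivity": 95.0, "specificity": 92.0},
}
print(find_outliers(example))
```

In this hypothetical example, platform_B is flagged on sensitivity (11 points below the best) and platform_C on specificity (6 points below the best).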

An IPD meta-analysis was performed instead of a classical meta-analysis, because the studies identified in the literature were generally too small to contribute to a classical meta-analysis, particularly for the new and repurposed drugs. In addition, this type of approach allowed for relevant co-variables to be included in the model; it could also control for repeated testing on the same samples using different platforms, which was the case for much of the available data.

For each dependent variable, a multivariable model included a number of co-variables as fixed effects. These included rifampicin resistance as determined by Xpert MTB/RIF® or Xpert Ultra® for all drugs other than rifampicin; semiquantitative cycle threshold (CT) value from Xpert MTB/RIF® or Xpert Ultra®; and a co-variable to indicate which samples featured in duplicate, meaning that some samples were sequenced on two different platforms and thus were represented twice in the analysis. For models looking specifically at diagnostic test accuracy in PLHIV, the HIV test result was included as a co-variable. Finally, the study site was included as a random effect. The models were run in Stata (version 17) using the melogit command, and the outputs were transformed using the margins command. Models were run for all PICO questions for sensitivity and specificity.
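One way to write down a model of this kind is sketched below. The notation is an assumption made for illustration and is not taken from the review: for sensitivity, the outcome $Y_{ij}$ is taken to indicate whether targeted NGS reported resistance for sample $i$ at site $j$ among reference-resistant samples (and analogously for specificity among reference-susceptible samples); $\mathrm{RR}_{ij}$ is the Xpert rifampicin result, $\mathrm{CT}_{ij}$ the semiquantitative category, $\mathrm{Dup}_{ij}$ the duplicate-sample indicator, and $u_j$ the site-level random intercept.

```latex
\operatorname{logit}\,\Pr(Y_{ij} = 1)
  = \beta_0
  + \beta_1\,\mathrm{RR}_{ij}
  + \beta_2\,\mathrm{CT}_{ij}
  + \beta_3\,\mathrm{Dup}_{ij}
  + u_j,
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2)
```

For the models in PLHIV, an additional fixed-effect term for HIV status would be added in the same way. Under this reading, pooled estimates are obtained by averaging predicted probabilities over the co-variables.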

The certainty of the evidence of the pooled studies was assessed systematically for each of the PICO questions using the GRADE approach, which produces an overall quality assessment (or certainty) of evidence and has a framework for translating evidence into recommendations.

The GRADEpro Guideline Development Tool software (20) was used to generate summary of findings tables for the sensitivity and specificity of each drug. The numbers of samples classified as true positive, false positive, false negative or true negative were then calculated across a range of three prevalences of drug resistance, chosen to be representative of different global settings. The quality of evidence was rated as high (not downgraded), moderate (downgraded one level), low (downgraded two levels) or very low (downgraded more than two levels), based on five factors: risk of bias, indirectness, inconsistency, imprecision and other considerations. The quality (certainty) of evidence was downgraded by one level when a serious issue was identified and by two levels when a very serious issue was identified in any of the factors used to judge the quality of evidence.
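The calculation of classified samples across prevalences can be sketched as follows. This is a minimal illustration of the standard arithmetic and not a reproduction of the GRADEpro output; the cohort size of 1000, the three prevalence values and the accuracy figures are assumptions chosen for the example.

```python
def expected_counts(sensitivity: float, specificity: float,
                    prevalence: float, cohort: int = 1000) -> dict:
    """Expected test classifications in a cohort at a given prevalence.

    All proportions are given as fractions between 0 and 1. Counts are
    rounded to whole samples for display, so they may not always sum
    exactly to `cohort`.
    """
    resistant = cohort * prevalence
    susceptible = cohort - resistant
    true_pos = resistant * sensitivity
    false_neg = resistant - true_pos
    true_neg = susceptible * specificity
    false_pos = susceptible - true_neg
    return {
        "TP": round(true_pos),
        "FN": round(false_neg),
        "FP": round(false_pos),
        "TN": round(true_neg),
    }


# Hypothetical accuracy estimates and prevalences (assumptions for this
# sketch, not values taken from the summary of findings tables).
for prev in (0.02, 0.10, 0.15):
    print(prev, expected_counts(sensitivity=0.95, specificity=0.96,
                                prevalence=prev))
```

At low prevalence, false positives among the many susceptible samples can outnumber true positives even when specificity is high, which is why the counts are presented for several settings.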

The data sources for the IPD data analysis are shown in Fig. 2.3.9. The analysis included data from published studies, a large multicountry trial conducted by FIND, and several other studies across multiple countries. Most of the studies only evaluated the Deeplex assay, while the FIND trial evaluated both the Deeplex and the AmPORE-TB. Only one study evaluated TBseq. For each drug, one or two platforms were dropped from the analysis based on the overall number of resistant or susceptible samples available for that platform and drug, or because the accuracy of the platform did not meet the diagnostic test accuracy criteria for inclusion when compared with the best performing platform.

Fig. 2.3.9. Studies included in the IPD meta-analysis for targeted NGS. ERJ: European Respiratory Journal; FIND: Foundation for Innovative New Diagnostics; IPD: individual patient data; NGS: next-generation sequencing; NICD: National Institute for Communicable Diseases.

Data synthesis was structured around the two preset PICO questions, as outlined below.

PICO 1: Should targeted NGS as the initial test be used to diagnose drug resistance in patients with bacteriologically confirmed pulmonary TB disease?

The available evidence included in the final pooled analysis varied by drug, from 12 studies with 1440 participants for the sensitivity of isoniazid to three studies with 269 participants for the specificity of pyrazinamide (Table 2.3.5). The pooled estimates were determined using a multivariable, mixed-effects model. All drugs were downgraded by one level for indirectness for sensitivity and specificity, because all studies were enriched for rifampicin resistance, leading to applicability concerns. In addition, for rifampicin, levofloxacin and pyrazinamide, specificity was downgraded a further level for imprecision; however, for ethambutol, it was downgraded for risk of bias because different samples were used for the index and reference tests. The overall certainty of the evidence for test accuracy ranged from moderate to very low.

The test performance was determined to be accurate for all drugs included in the assessment, with a pooled sensitivity of at least 95% for isoniazid, moxifloxacin and ethambutol, more than 93% for rifampicin and levofloxacin, and 88% for pyrazinamide. The pooled specificity was at least 96% for all drugs.

The reference standard was culture-based phenotypic DST for isoniazid, levofloxacin and moxifloxacin, and a combination of phenotypic DST and WGS for rifampicin, pyrazinamide and ethambutol. The percentage of tests with indeterminate results ranged from 9% (levofloxacin and moxifloxacin) to 18% (pyrazinamide), with higher indeterminate rates in samples with lower bacterial load (semiquantitative category low or very low).

Table 2.3.5. The accuracy and certainty of evidence of targeted NGS for the detection of resistance to anti-TB drugs among bacteriologically confirmed pulmonary TB.

There were no data on the impact of targeted NGS on patient outcomes such as time to treatment or treatment outcome.

PICO 2: Should targeted NGS be used to diagnose drug resistance in patients with bacteriologically confirmed rifampicin-resistant pulmonary TB disease?

The available evidence varied by drug, from 12 studies with 1440 participants for sensitivity of isoniazid to three studies with 31 participants for sensitivity of bedaquiline (Table 2.3.6). The pooled estimates were determined using a multivariable, mixed-effects model.

The overall certainty was high for some of the drugs. Levofloxacin was downgraded one level for inconsistency. Bedaquiline and linezolid were downgraded by two levels for imprecision in sensitivity because the number of resistant samples was below the threshold set and the confidence intervals were wide. Clofazimine was also downgraded by two levels, one for inconsistency (because two studies were outliers) and another level for imprecision (because the confidence intervals were wide). Amikacin was downgraded by one level for sensitivity and specificity because critical concentrations outside those recommended by WHO were used for a large proportion of samples. Amikacin sensitivity was further downgraded by two more levels, one for inconsistency and the other for imprecision. Ethambutol was downgraded by one level for risk of bias because different samples were used for the index and reference tests. Streptomycin specificity was downgraded by two levels, one for inconsistency and the other for imprecision. The overall certainty of the evidence for test accuracy ranged from high to very low.
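The mapping from downgrades to a certainty rating, as described earlier in this section, can be sketched as a small helper. The function name and input structure are hypothetical; the example inputs follow the downgrades listed in the paragraph above for the sensitivity of three drugs.

```python
LEVELS = ["high", "moderate", "low", "very low"]


def grade_certainty(downgrades: dict) -> str:
    """Map GRADE downgrades to an overall certainty rating.

    `downgrades` maps a factor (e.g. "imprecision") to the number of
    levels removed for it: 1 for a serious issue, 2 for a very serious
    issue. More than two levels in total gives "very low".
    """
    for factor, levels in downgrades.items():
        if levels not in (0, 1, 2):
            raise ValueError(f"invalid downgrade for {factor}: {levels}")
    total = sum(downgrades.values())
    return LEVELS[min(total, 3)]


# Downgrades for sensitivity as listed in the text above.
print(grade_certainty({"risk of bias": 1}))                 # ethambutol
print(grade_certainty({"imprecision": 2}))                  # bedaquiline
print(grade_certainty({"indirectness": 1,                   # amikacin
                       "inconsistency": 1,
                       "imprecision": 1}))
```

Labelling the amikacin critical-concentration issue as "indirectness" is an assumption of this sketch; the text states only that one level was removed for it.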

The test performance among people with RR-TB was determined to be accurate for isoniazid, levofloxacin, moxifloxacin, ethambutol and streptomycin (pooled sensitivity ≥95%) and acceptable for pyrazinamide (90%), bedaquiline (68%), linezolid (69%), clofazimine (70%) and amikacin (87%). The pooled specificity was 95% or greater for all drugs except streptomycin (75%). The reference standard was culture-based phenotypic DST for all drugs except for ethambutol and pyrazinamide, where a combination of phenotypic DST and WGS was used. The percentage of tests with indeterminate results ranged from 9% (levofloxacin and moxifloxacin) to 21% (ethambutol); indeterminate rates were higher in samples with a lower bacterial load (semiquantitative category low or very low).

Table 2.3.6. The accuracy and certainty of evidence of targeted NGS for the detection of resistance to anti-TB drugs among bacteriologically confirmed rifampicin-resistant pulmonary TB.

There were no data on the impact of targeted NGS on patient outcomes such as time to treatment or treatment outcome.

Three web annexes give additional information, as follows:

  • details of studies included in the current analysis (Web Annex 4.18: Review of the diagnostic accuracy of targeted NGS technologies for detection of drug resistance among people diagnosed with TB);
  • a summary of the results and details of the evidence quality assessment (Web Annex 2.10: GRADE profiles of targeted next-generation sequencing for detection of TB drug resistance); and
  • a summary of the GDG panel judgements (Web Annex 3.10: Evidence to decision tables: targeted next-generation sequencing for detection of TB drug resistance).

Cost–effectiveness analysis

The cost and cost–effectiveness data for targeted NGS were assessed through a systematic review of the published literature and a generalized model-based cost–effectiveness analysis commissioned by WHO.

The systematic review on the cost and cost–effectiveness of using either targeted NGS or WGS to diagnose DR-TB searched three databases: PubMed, Embase and Scopus. The search was run on 30 October 2022 and had no time restriction. All costing data were inflated to 2021 US dollars. Findings were synthesized descriptively, given the considerable degree of heterogeneity in study methodology and outcomes. Among the studies included in the systematic review, three were on targeted NGS only, three were on targeted NGS and WGS, and four were on WGS only. For targeted NGS, based on a single study (n=1), the cost per sample was between US$ 69.64 for Illumina MiSeq on 24 samples and US$ 73.47 for Nanopore MinION on 12 samples; however, this costing was limited to only some components and did not include human resource costs or overhead costs. For WGS (n=5), cost per sample ranged from US$ 63.00 on Nanopore MinION to US$ 277.00 on Illumina MiSeq; because studies included different sets of component costs, comparisons were challenging. Based on the review, the most significant cost component was the sequencing step, and the largest component costs were reagents and consumables, including those necessary for sequencing, sample processing and targeted NGS library preparation steps. Study authors identified four major cost drivers: use of different sequencers, depth and breadth of coverage, inefficiencies in initial sample runs, and economies of scale via batching or cross-batching.

The cost data from the systematic review were limited; therefore, an empirical unit costing was performed, in consultation with manufacturers and FIND. At the time of this work, only pricing for Deeplex Myc-TB was available and it was used for estimation of cost for the class. Unit costs included consumables, equipment, staffing and overheads (where available); also, costs assumed targeted NGS testing for all drugs. Based on the empirical analysis, the cost of targeted NGS was estimated to be:

  • US$ 134 to US$ 257 in South Africa;
  • US$ 120 to US$ 198 in Georgia; and
  • US$ 121 to US$ 175 in India.

These costs are dependent on patient volume, batching and negotiated cost per targeted NGS kit.

Recognizing the lack of economic evidence on this topic, a hypothetical cost–effectiveness modelling study was undertaken to assess the cost–effectiveness (Objective 1) and affordability (Objective 2) of these tests for the diagnosis of DR-TB in various high TB burden settings.

Objective 1: To assess the potential cost–effectiveness of introducing the targeted NGS technology for the diagnosis of DR-TB in Georgia, India and South Africa.

This assessment included modelling the cost–effectiveness of targeted NGS in three separate scenarios with distinct comparison options:

  • Cost–effectiveness of targeted NGS for DST among individuals with RR-TB after a rapid molecular test for rifampicin resistance as a replacement for phenotypic DST (PICO 2).
  • Cost–effectiveness of targeted NGS for DST among individuals with RR-TB after a rapid molecular test for rifampicin resistance as a replacement for current in-country DST practice (PICO 2).
  • Cost–effectiveness of targeted NGS as the initial test for TB drug resistance in patients with bacteriologically confirmed TB compared with rapid molecular testing for drug resistance and phenotypic DST in a high DR-TB burden setting (PICO 1).

In the first scenario, targeted NGS was compared with universal phenotypic DST; in the second scenario, targeted NGS was compared with current in-country phenotypic DST practice among individuals with detected rifampicin resistance (PICO 2). This was done across three countries: Georgia, India and South Africa. Current DST practice in Georgia and South Africa includes Xpert XDR® followed by phenotypic DST; in India it includes LPAs and phenotypic DST done in parallel. A final scenario included targeted NGS compared to rapid molecular testing for drug resistance and phenotypic DST as initial tests for TB drug resistance among all TB patients (PICO 1) but was modelled for only one setting, Georgia – a high DR-TB burden setting. Epidemiological data were sourced from published literature; targeted NGS diagnostic accuracy data were sourced from the systematic review and IPD analysis conducted for this guideline. Economic data were sourced from published literature and a systematic and scoping review done in parallel by our team and supplemented with empirical data collection.

A decision analysis modelling approach was used to estimate the incremental cost–effectiveness of using targeted NGS for the diagnosis of DR-TB compared with various existing DST scenarios. This was done from the perspective of the health care system and accounts only for the health care system costs required to diagnose and treat TB. The estimation did not account for societal costs, or any direct or indirect costs incurred by patients. In addition, costs for sample transportation were not included in this analysis. The primary outcome was the incremental cost–effectiveness ratio (ICER), which was calculated as the incremental cost in US dollars per disability-adjusted life year (DALY) averted.
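The ICER calculation can be sketched as follows. This is a generic illustration of the ratio and of the usual dominance checks, not the model used for this guideline; the function name, the per-patient costs and the DALY values in the example are hypothetical.

```python
from typing import Optional, Tuple


def icer(cost_new: float, cost_old: float,
         dalys_new: float, dalys_old: float) -> Tuple[str, Optional[float]]:
    """Incremental cost per DALY averted for a new strategy vs a comparator.

    Returns a label and, where a ratio is meaningful, the ICER.
    DALYs averted = DALYs under the comparator minus DALYs under the
    new strategy (fewer DALYs means more health gained).
    """
    delta_cost = cost_new - cost_old
    dalys_averted = dalys_old - dalys_new
    if delta_cost == 0 and dalys_averted == 0:
        return "no difference", None
    if delta_cost <= 0 and dalys_averted >= 0:
        return "dominant", None      # no more costly and no less effective
    if delta_cost >= 0 and dalys_averted <= 0:
        return "dominated", None     # no less costly and no more effective
    if delta_cost > 0:
        return "more costly, more effective", delta_cost / dalys_averted
    return "less costly, less effective", delta_cost / dalys_averted


def is_cost_effective(ratio: float, wtp_threshold: float) -> bool:
    """Decision rule for the 'more costly, more effective' case."""
    return ratio <= wtp_threshold


# Hypothetical per-patient costs (US$) and DALYs, for illustration only.
label, ratio = icer(cost_new=1200.0, cost_old=1000.0,
                    dalys_new=0.49, dalys_old=0.51)
print(label, ratio)
```

A strategy that is labelled "dominated" has higher (or equal) costs and yields no additional health gain, so no ratio is reported for it; a "dominant" strategy is the reverse.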

Main findings for PICO 1: Using targeted NGS as an initial test

Using targeted NGS as an initial test for DST in the high DR-TB burden setting of Georgia led to more health gains (DALYs = 0.49) compared with Xpert MTB/RIF or Xpert Ultra followed by phenotypic DST (DALYs = 0.51). The ICER per DALY averted was US$ 9261 (95% uncertainty range [UR]: US$ 5258–32 040/DALY averted), which was considered cost effective at a willingness-to-pay (WTP) threshold of three times the country GDP per capita (US$ 15 609), with 80% of simulated iterations falling below the WTP threshold.

Main findings for PICO 2: Using targeted NGS among those with RR-TB

When targeted NGS was used as a replacement for universal phenotypic DST among RR-TB patients, it was dominated by phenotypic DST, with targeted NGS having higher costs and leading to fewer health gains. This finding was driven by the high diagnostic accuracy of phenotypic DST (which was assumed to be universal in this scenario), and an assumption of no difference in loss to follow-up between targeted NGS and phenotypic DST. When in-country DST practice was used as the comparator (instead of universal phenotypic DST), targeted NGS led to more health gains than in-country DST across all three countries. Targeted NGS was cost effective in South Africa (ICER: US$ 15 619/DALY averted, 95% UR: cost saving to US$ 114 782/DALY averted, at a WTP threshold of US$ 21 165), but was not cost effective in Georgia (ICER: US$ 18 375/DALY averted, UR: cost saving to US$ 158 972/DALY averted, at a WTP threshold of US$ 15 065). In India, where LPA, liquid culture and DST are being used as part of in-country DST, targeted NGS dominated the country’s current DST practice, with lower costs and more health gains (95% UR: cost saving to US$ 60 083/DALY averted).

Main findings: scenario analyses

Several key scenario analyses were investigated. In the base case approach, loss to follow-up was assumed to be equivalent between phenotypic DST and targeted NGS; in a scenario where there was no loss to follow-up in targeted NGS compared with 10% in phenotypic DST, targeted NGS was cost effective in South Africa (ICER: US$ 13 004/DALY averted, WTP: US$ 21 165) and Georgia (ICER: US$ 13 640/DALY averted, WTP: US$ 15 069) and targeted NGS still dominated in-country DST practice in India. In scenarios where sequencing platforms are used for multiple different diseases to reduce the unit test cost of targeted NGS, the cost–effectiveness of targeted NGS improves in all three countries. A batching scenario was investigated, with an assumed 20% fewer samples per targeted NGS run, and led to an increased unit test cost for targeted NGS; in this scenario, the targeted NGS approach retained cost–effectiveness only in South Africa. When a 50% price reduction in targeted NGS test kit cost was assumed, targeted NGS cost–effectiveness further improved in all countries.

Objective 2: To assess the financial impact of introducing targeted NGS as a replacement for existing DST for diagnosis of DR-TB among TB patients across three countries: Georgia, India and South Africa.

A budget impact assessment was undertaken to estimate the financial consequences of adopting targeted NGS for DST for all patients diagnosed with TB, and replacing in-country DST practice in Georgia (PICO 1). The analysis suggested that implementing targeted NGS for all patients diagnosed with TB would be more expensive than testing all patients with Xpert MTB/RIF or Xpert Ultra, followed by phenotypic DST (see Fig. 2.3.10).

Fig. 2.3.10. Budget impact assessment results comparing current standard practice for DST with implementation of targeted NGS for all patients diagnosed with TB in Georgia. DST: drug susceptibility testing; NGS: next-generation sequencing; pDST: phenotypic DST.

A budget impact assessment was undertaken to estimate the financial consequences of adopting targeted NGS for DST after a rapid molecular test for rifampicin resistance, and replacing in-country DST practice in Georgia, India and South Africa (PICO 2). In-country DST practice included Xpert XDR combined with phenotypic DST in Georgia and South Africa, and LPA combined with phenotypic DST in India; the assessment covered a 1-year and a 5-year period. It was assumed that the eligible RR-TB patient populations requiring DST were 58 837, 8200 and 187 in South Africa, India and Georgia, respectively, and that the TB reduction rate over the 5 years was stable (2). To estimate the impact on the country-specific budget, the economic costs generated by the model were multiplied by the number of patients.

Results from a 1-year budget impact assessment for PICO 2 are presented in Fig. 2.3.11. In India, it was estimated that implementing targeted NGS would cost about US$ 57 130 727 – slightly lower than the current practice of LPA combined with phenotypic DST, which has a cost of US$ 57 719 097. In South Africa, it was estimated that implementing targeted NGS would result in a rise in budget to about US$ 27 888 200, slightly more than LPA combined with phenotypic DST, which has a cost of US$ 26 428 600. Finally in Georgia, where there are fewer bacteriologically confirmed patients, it was estimated that implementing targeted NGS would cost about US$ 592 221, slightly more than LPA combined with phenotypic DST, which has a cost of US$ 568 480.
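
The 1-year figures above can be compared by taking the absolute and relative difference between the two budgets for each country. The sketch below uses only the totals quoted in the preceding paragraph. The variable and function names are illustrative assumptions, and rounding to one decimal place is a presentation choice rather than part of the assessment.

```python
# Reported 1-year budget estimates in US$, as quoted in the text.
# Keys and function names are illustrative, not taken from the model.
budgets = {
    "India": {"targeted_ngs": 57_130_727, "current": 57_719_097},
    "South Africa": {"targeted_ngs": 27_888_200, "current": 26_428_600},
    "Georgia": {"targeted_ngs": 592_221, "current": 568_480},
}


def budget_difference(targeted_ngs: int, current: int) -> tuple[int, float]:
    """Return (absolute difference in US$, difference as % of current practice)."""
    diff = targeted_ngs - current
    return diff, round(100 * diff / current, 1)


summary = {country: budget_difference(**v) for country, v in budgets.items()}

for country, (diff, pct) in summary.items():
    print(f"{country}: {diff:+,} US$ ({pct:+.1f}%)")
```

A negative value indicates that targeted NGS is estimated to cost less than current practice, as reported for India; positive values correspond to the modest increases reported for South Africa and Georgia.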

Fig. 2.3.11. Budget impact assessment results comparing current standard practice for DST to implementing targeted NGS for patients with RR-TB in India, South Africa and Georgia.

User perspective

A rapid review was commissioned to identify and synthesize qualitative evidence on the use of targeted NGS for the detection of TB drug resistance; in particular, the aim was to examine the implementation considerations related to acceptability, feasibility, and values, preferences and equity. The review searched Medline with no year or language limits. The search was run on 19 August 2022, and then rerun on 10 October 2022 to include WGS-related studies for the detection of TB drug resistance. The systematic search identified three records, and the open, hand and expert searches identified a further 27. On full-text review of these 30 records, none were found to be eligible for inclusion, so the review had no studies to analyse or synthesize. Given that no direct evidence was found, note was made of a Cochrane qualitative evidence synthesis published in 2022 that examined recipient and provider perspectives on rapid molecular tests for TB and drug resistance (52); that study provides relevant (though indirect) evidence on the subject. The authors noted that people with TB valued reaching diagnostic closure with an accurate diagnosis, avoiding diagnostic delays and keeping diagnostic-associated costs low, whereas health care providers valued aspects of accuracy and the resulting confidence in low-complexity NAAT results, rapid turnaround times and low costs to people seeking a diagnosis.

To address the direct evidence gap, WHO commissioned an additional qualitative cross-sectional study comprising semi-structured interviews, primarily with laboratory staff and management personnel directly involved with implementing targeted NGS in the three FIND trial sites, as well as with three global experts involved in TB care and diagnostics. In total, there were 17 respondents, and the work was conducted during September to October 2022. The objective was to explore the perceptions and experiences of those implementing targeted NGS technology, with respect to acceptability, feasibility, and values, preferences and equity. The main findings are summarized below.

Acceptability

A consistently positive sentiment was expressed for the acceptability and potential utility of targeted NGS technology. Targeted NGS was seen as a “major advancement” in molecular MDR-TB diagnostics.

  • The main reasons for the high level of acceptability were the comprehensiveness (resistance diagnosis for more drugs and for the newest and repurposed drugs), the convenience of using a sputum sample (as compared with culture samples), and the rapidity (quick results compared with phenotypic testing times; 3–5 days as compared with 4–6 weeks).
  • There was also the sense that there is a good window of opportunity to benefit from the utility of targeted NGS technology; that is, the technology is arriving at the right time, given that resistance to newer TB drugs is likely to increase as the use of these drugs becomes routine.

Feasibility

Although there was high praise for the capability and potential utility of targeted NGS technology, several challenges were identified when testing samples using the targeted NGS platforms, which may limit the feasibility of targeted NGS for routine uptake at the present time. The overall sentiment was that the targeted NGS technology needs to be further developed before it can be considered fully ready for operational use.

The following feasibility challenges were identified:

  • Start-up and setting-up challenges: Multiple problems were identified with starting and setting up the technology. These problems related to the newness of the technology and the trial setting, importing technology and specialist supplies, lack of in-country technical assistance for problem-solving and need for more hands-on training practice.
  • High technical complexity of the test: Targeted NGS technology was seen as a high complexity molecular test that was technically challenging. For example, preparing the sample for sequencing involves multiple steps that require attention to detail and precision, leaving little room for error. Preparation of the library is particularly complex for the Deeplex platform, although both the Deeplex and the Nanopore platforms are quite complex. In both platforms, it was thought that there were too few opportunities for early recognition and correction of errors, increasing the risk of failed runs.
  • Specialized laboratory infrastructure and human resource requirements: Because targeted NGS is a molecular-based testing platform, it requires highly specialized laboratory infrastructure (e.g. multiple rooms to prevent amplicon contamination and specialized cold storage facilities). Also, highly specialized molecular and medical scientists are needed to perform the tests. In LMIC settings, such specialized laboratory infrastructure and staff may only be available at centralized laboratories (i.e. not at regional laboratories).
  • Special requirements for operating the test: In addition to highly specialized laboratory infrastructure and staff, the testing technology also requires an uninterrupted supply of electricity, high internet connectivity, high computer capacity, clean water and temperature controls – requirements that may pose challenges in some LMIC settings.
  • Supply chain challenges: Major challenges were reported relating to the required supply chain for implementing targeted NGS. Procurement bottlenecks and delays coupled with shelf-life limitations of reagents jeopardize continuous access to specialist supplies.
  • Data management and storage requirements: There were concerns that data analysis and data storage requirements were not fully developed, including systems for backing up data, ownership of data and security of data. Another issue that needs to be considered is how targeted NGS and routine laboratory information systems can be interlinked.
  • Continuous updating of the WHO catalogue of mutations is required: There was agreement that the usefulness of the targeted NGS technology depends on the informational support provided by the WHO catalogue of mutations (53), which allows for meaningful interpretation of resistance data; thus, there is a need for the WHO catalogue to be continuously updated.
  • Feasibility concerns differed for the different targeted NGS platforms: The overall sentiment was that all targeted NGS platforms needed to be further developed before they are fully ready for operational use, some more than others. The high level of technical complexity of the sample preparation stages (mainly the library preparation stage) was considered a key challenge for the Deeplex platform, and the need for improved computer analysis and storage capacity was a challenge for the Oxford Nanopore platform, although both required a high level of precision and attention to detail. There is also a need to incorporate steps for early error recognition.

Values, preferences and equity

The overall sentiment is that MDR-TB diagnostic technology needs to balance accuracy, speed, affordability, equity and cost–effectiveness, and that targeted NGS technology would need to address these considerations before it can be implemented in LMIC settings. These considerations were consistent across the different stakeholder groups who participated in the study.

  1. Centralized versus decentralized placement may have equity implications for access: Given the high-level specialized laboratory infrastructure, specialized human resources and technical complexity needed for targeted NGS, the technology may be suitable for placement only at centralized reference laboratories. This may raise equity concerns if it means less access for regions of a country that lack reference laboratories. It may also have implications for costs (e.g. costs for transport of sputum), probability of sample loss and time to results.
  2. Affordability and cost–effectiveness are major concerns: There was a major concern about the financial costs of the targeted NGS technology and the affordability for LMIC. Participants were worried about the cost of the equipment and the costs of ongoing specialist supplies (especially reagents), as well as the cost of maintaining equipment. They noted that costing calculations should be comprehensive and should include the cost of special consumables, extra general laboratory consumables and additional infrastructure needs (e.g. extra space, temperature control and internet connectivity). There were concerns that cost–effectiveness calculations should be comprehensive and should include assessment of the impact of the use of targeted NGS testing on improving TB outcomes.
  3. The MDR/RR-TB case burden of a country could influence equitable access at centralized levels: In some settings with high caseloads, the targeted NGS technology capacity in central laboratories may not be sufficient for processing large caseloads in good time; also, in settings with low caseloads, waiting for sufficient samples to batch-test will cause delays.
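
The trade-off between caseload and batching described in the last point can be illustrated with simple arithmetic. The sketch below assumes a steady weekly arrival of samples and a fixed number of samples per sequencing run. Both functions and all numbers (including the batch size of 24) are hypothetical and are not drawn from the study or from any platform specification.

```python
import math


def batch_wait_days(samples_per_week: float, batch_size: int) -> float:
    """Days needed to accumulate one full batch at a steady arrival rate."""
    if samples_per_week <= 0:
        raise ValueError("samples_per_week must be positive")
    return 7 * batch_size / samples_per_week


def runs_needed_per_week(samples_per_week: float, batch_size: int) -> int:
    """Number of sequencing runs per week needed to keep up with arrivals."""
    return math.ceil(samples_per_week / batch_size)


BATCH_SIZE = 24  # hypothetical samples per run

# Low-caseload setting: few samples arrive, so a full batch takes weeks to fill.
low = batch_wait_days(samples_per_week=4, batch_size=BATCH_SIZE)

# High-caseload setting: many runs are needed, which may exceed laboratory capacity.
high = runs_needed_per_week(samples_per_week=240, batch_size=BATCH_SIZE)

print(f"Low caseload: {low:.0f} days to fill one batch")
print(f"High caseload: {high} runs needed per week")
```

Under these assumed values, a laboratory receiving four samples a week would wait six weeks to fill a run, whereas a laboratory receiving 240 samples a week would need 10 runs a week; both situations can delay results, for opposite reasons.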

Implementation considerations

Although the available evidence supports the use of targeted NGS to detect drug resistance after TB diagnosis, in order to guide clinical decision-making for DR-TB treatment, the following factors need to be considered when implementing these tests:

  • Regulatory approval from national regulatory authorities or other relevant bodies is required before implementation of these diagnostic tests.
  • In its current format, targeted NGS is a high complexity test that is most suitable for centralized laboratories equipped with specialized skills and infrastructure.
  • Targeted NGS tests do not replace existing rapid tests that are more accessible and easier to perform for detecting resistance to rifampicin, isoniazid and fluoroquinolones. However, if targeted NGS can be performed rapidly, it can be considered as an alternative initial option for prioritized populations. Those who will benefit most from these tests are individuals who require rapid and comprehensive DST but have limited access to phenotypic DST.
  • Priority should be given to samples with a high bacillary load as determined by initial bacteriological tests (e.g. semiquantitative high/medium or smear-positive grading). In situations where the bacillary load is low (e.g. semiquantitative low/very low/trace or smear-negative grading), the recommendations still hold, although rates of indeterminate results are likely to be higher; therefore, phenotypic DST is likely still required for samples with a low bacillary load.
  • Similarly, the recommendations apply to children, adolescents and PLHIV, although these populations have a higher frequency of samples with low bacterial load, so indeterminate results are likely to be more common.
  • The recommendation is based on data obtained from sputum and bronchoalveolar lavage specimens, and can be extrapolated to other lower respiratory tract samples (e.g. endotracheal aspirates). However, further research is needed to evaluate the use of these tests on alternative sample types for diagnosing pulmonary TB in children (e.g. nasopharyngeal and stool samples) and diagnosing extrapulmonary TB.
  • Since sensitivity for bedaquiline, linezolid and clofazimine resistance is suboptimal, consideration of the pretest probability is important in interpreting the targeted NGS results for these drugs. Further testing of samples with a susceptible result (using culture-based phenotypic DST) would be warranted, particularly when the risk of resistance is high. Since specificity is high, a result that indicates resistance may be used to guide the therapy, particularly among those at risk for resistance. In the case of pretomanid, the basis for resistance has not been fully elucidated; hence, culture-based DST is also required for this drug.
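
The role of pretest probability in the last point can be made concrete with Bayes' theorem. The sketch below computes the probability that resistance is present after a susceptible result and after a resistant result. The sensitivity (0.70) and specificity (0.98) are hypothetical placeholders chosen only to represent "suboptimal sensitivity, high specificity"; they are not the accuracy estimates from the evidence review, and the function names are illustrative.

```python
def p_resistant_given_susceptible(pretest: float, sens: float, spec: float) -> float:
    """Probability of true resistance despite a 'susceptible' test result."""
    false_negative = (1 - sens) * pretest
    true_negative = spec * (1 - pretest)
    return false_negative / (false_negative + true_negative)


def p_resistant_given_resistant(pretest: float, sens: float, spec: float) -> float:
    """Probability of true resistance after a 'resistant' test result."""
    true_positive = sens * pretest
    false_positive = (1 - spec) * (1 - pretest)
    return true_positive / (true_positive + false_positive)


SENS, SPEC = 0.70, 0.98  # hypothetical values, for illustration only

for pretest in (0.05, 0.30):  # hypothetical low-risk and high-risk patients
    missed = p_resistant_given_susceptible(pretest, SENS, SPEC)
    confirmed = p_resistant_given_resistant(pretest, SENS, SPEC)
    print(f"pretest={pretest:.2f}: resistance after susceptible result={missed:.3f}, "
          f"after resistant result={confirmed:.3f}")
```

With these assumed values, a susceptible result leaves a residual probability of resistance of under 2% when the pretest probability is 5%, but of more than 11% when it is 30%, which illustrates why further phenotypic DST is more important when the risk of resistance is high.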

Research priorities

Several key research priorities emerged from the reviews of the available evidence on targeted NGS for detecting TB drug resistance. They fall into three main categories: clinical research, implementation research, and monitoring and evaluation.

Clinical research:

  • Conduct clinical trials to assess the impact of targeted NGS on patient-important outcomes (see footnote 10).
  • Evaluate the accuracy and impact on patient-important outcomes of targeted NGS among populations of individuals diagnosed with TB, across a range of prevalences of rifampicin or other drug resistance.
  • Assess the accuracy and impact on patient-important outcomes of targeted NGS for detecting resistance to new and repurposed drugs, including pretomanid, across varied geographical and epidemiological settings.
  • Assess the accuracy and impact on patient-important outcomes of targeted NGS for analysing extrapulmonary samples, including CSF for meningitis, non-sputum samples (e.g. nasopharyngeal aspirate, gastric aspirate or stool) for children, and alternative sample types (e.g. tongue swabs) in both adults and children.
  • Undertake additional qualitative and quantitative research to further understand the perspectives of end-users and clinicians regarding the acceptability and feasibility of using targeted NGS.

Implementation research:

  • Develop and evaluate effective and efficient implementation models by integrating targeted NGS into laboratory networks and optimizing algorithms, with the aim of enhancing timely access to testing and treatment initiation, and improving patient outcomes.
  • Develop strategies to enhance the efficiency of targeted NGS testing, including sample processing and concentration techniques, determining optimal thresholds of bacterial load from initial tests before performing targeted NGS, and employing molecular transport medium for the ambient storage and transfer of samples to testing sites.
  • Regularly update the WHO catalogue of mutations (53), incorporating additional genetic targets and including new drugs (e.g. pretomanid) to enhance the sensitivity and specificity of targeted NGS.
  • Explore technological advancements to simplify the testing process, automate steps (especially library preparation), develop decentralized targeted NGS solutions and investigate potential synergies with existing initial tests (e.g. using leftover DNA or smear-positive slides).
  • Conduct comprehensive mapping of sequencing capacity within countries and perform diagnostic network optimization exercises. Placement of the technology should consider the demand for sequencing across multiple diseases, facilitating cross-disciplinary use of the machines and shared costs.
  • Compile and use lessons learned from applying targeted NGS technology in other diseases (e.g. COVID-19) to develop effective implementation strategies for TB.

Monitoring and evaluation:

  • Standardize the nomenclature for reporting of results across different targeted NGS technologies, for integration into health information data systems.
  • Ensure separate recording of true failures and unclassified mutations, and monitor trends over time as an essential component of result reporting.
  • Regularly monitor performance data, including overall resistance rates, resistance rates by specific drugs or targets and turnaround times (both total and in-laboratory).
  • Incorporate quality monitoring measures, such as tracking indeterminate rates, sequencing coverage and depth, and participating in external quality assurance programmes.
  • Establish an external quality assurance programme for sequencing that covers all relevant targets of interest.
  • Integrate the sequencing data generated into existing surveillance systems to monitor the prevalence and trends in drug resistance effectively. Share the data to update the WHO mutation catalogue.
  • Collect cost data to address important questions, such as the costs associated with introducing and scaling up targeted NGS in different settings, the trade-offs between turnaround time and batching, and the optimal balance in various settings.
  • Assess the impact of multidisease testing on programme operations and costs, including disease-specific testing volumes, turnaround times, costing, resource sharing and resource requirements.
  • Evaluate the impact of targeted NGS implementation on time to treatment initiation or modification, treatment outcomes and overall cost–effectiveness.
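
Several of the monitoring indicators listed above (indeterminate rates, true failures recorded separately, and in-laboratory turnaround time) can be derived from a simple per-sample log. The sketch below is one possible layout, not a prescribed reporting format: the `SampleRecord` fields, the outcome labels and the example records are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from statistics import median


@dataclass
class SampleRecord:
    received: date  # date the sample reached the sequencing laboratory
    reported: date  # date the result was reported
    outcome: str    # "resistant", "susceptible", "indeterminate" or "failed"


def outcome_rates(records: list[SampleRecord]) -> dict[str, float]:
    """Share of samples per outcome, keeping failures and indeterminates separate."""
    counts = Counter(r.outcome for r in records)
    return {outcome: n / len(records) for outcome, n in counts.items()}


def median_turnaround_days(records: list[SampleRecord]) -> float:
    """Median in-laboratory turnaround time in days."""
    return median((r.reported - r.received).days for r in records)


# Hypothetical example records, for illustration only.
log = [
    SampleRecord(date(2024, 1, 2), date(2024, 1, 5), "susceptible"),
    SampleRecord(date(2024, 1, 2), date(2024, 1, 7), "resistant"),
    SampleRecord(date(2024, 1, 3), date(2024, 1, 7), "indeterminate"),
    SampleRecord(date(2024, 1, 4), date(2024, 1, 10), "failed"),
]

rates = outcome_rates(log)
turnaround = median_turnaround_days(log)
print(rates, turnaround)
```

Tracking these values per reporting period would allow trends to be monitored over time; total turnaround time would additionally need the date of sample collection, which this sketch does not model.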

Footnotes

1

Based on PICO questions 3 and 4.

2

Based on PICO question 5.

3

When not specified, this term applies to both Xpert MTB/RIF and Xpert Ultra.

4

A complete list of web annexes is provided on pages iv–vi.

5

Data courtesy of H Sohn and W Stevens at FIND (unpublished).

6

The numbers in brackets show the 95% credible interval (CrI).

7

Kanamycin and capreomycin are not to be included in the treatment of MDR/RR-TB patients on longer regimens. Conditional recommendation, very low certainty in the estimates of effect.

8

Oxford Nanopore Diagnostics provided a draft protocol for the test.

9

Oxford Nanopore Diagnostics provided a draft protocol for the test.

10

Mortality, cure, loss to follow-up, time to diagnosis and time to treatment.

© World Health Organization 2024.

Sales, rights and licensing. To purchase WHO publications, see https://www.who.int/publications/book-orders. To submit requests for commercial use and queries on rights and licensing, see https://www.who.int/copyright.

Third-party materials. If you wish to reuse material from this work that is attributed to a third party, such as tables, figures or images, it is your responsibility to determine whether permission is needed for that reuse and to obtain permission from the copyright holder. The risk of claims resulting from infringement of any third-party-owned component in the work rests solely with the user.

Some rights reserved. This work is available under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 IGO licence (CC BY-NC-SA 3.0 IGO; https://creativecommons.org/licenses/by-nc-sa/3.0/igo).

Under the terms of this licence, you may copy, redistribute and adapt the work for non-commercial purposes, provided the work is appropriately cited, as indicated below. In any use of this work, there should be no suggestion that WHO endorses any specific organization, products or services. The use of the WHO logo is not permitted. If you adapt the work, then you must license your work under the same or equivalent Creative Commons licence. If you create a translation of this work, you should add the following disclaimer along with the suggested citation: “This translation was not created by the World Health Organization (WHO). WHO is not responsible for the content or accuracy of this translation. The original English edition shall be the binding and authentic edition”.

Any mediation relating to disputes arising under the licence shall be conducted in accordance with the mediation rules of the World Intellectual Property Organization (http://www.wipo.int/amc/en/mediation/rules/).

Bookshelf ID: NBK602012
