SUMMARY OF MANUFACTURER’S INDIRECT COMPARISON

NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Entyvio (Vedolizumab) [Internet]. Ottawa (ON): Canadian Agency for Drugs and Technologies in Health; 2016 Dec.

APPENDIX 6: SUMMARY OF MANUFACTURER’S INDIRECT COMPARISON

Introduction

Background

Given the absence of head-to-head studies that have compared vedolizumab against other relevant biologics for moderate to severe Crohn’s disease in this CDR review, the objective of this Appendix is to summarize and critically appraise the evidence available regarding the comparative efficacy and safety of vedolizumab versus infliximab and adalimumab through indirect comparison (IDC). Both induction and maintenance treatment in adult patients with moderate to severely active Crohn’s disease are evaluated in the review.

Methods for Manufacturer’s Indirect Comparison

Study eligibility and selection process

The manufacturer reported that the IDC is based on a systematic literature review; however, its methodology was poorly reported. There is limited information about the methods used for the literature search, study selection, data extraction, and risk of bias assessment. The description of the literature search omitted the search terms, the electronic search strategy, the dates of the original and updated literature searches (it was noted that the current review was conducted as an update), and any limits or filters applied in the search. It can be inferred from the results section that the literature search included the following: Embase, MEDLINE, the Cochrane Library, the manufacturer’s internal database, clinical trial registries (e.g., clinicaltrials.gov), and manual searching based on reference lists of retrieved articles.

The inclusion criteria for the systematic review are summarized in Table 40. The eligibility criteria for the systematic review were reported only at a high level in the introduction and are absent from the methods section. CADTH reviewers extracted information from the results section of the report to populate the summary provided in Table 40.

Table 40

Inclusion Criteria for the Manufacturer’s Indirect Comparison.

In general, inclusion was limited to placebo-controlled trials investigating one or more of the following: vedolizumab, infliximab, or adalimumab. The methods section does not describe how any of the following characteristics were considered in the study selection process: definitions for moderate to severe Crohn’s disease, dosage regimens of adalimumab and infliximab, study durations, and previous exposure to pharmacotherapy in the management of Crohn’s disease. However, an examination of the reasons for study exclusion suggests that studies were excluded if they were perceived to lack comparability with the vedolizumab pivotal studies with regard to patient characteristics (e.g., fistulizing Crohn’s disease or recent resective surgery) or trial end points (e.g., absence of induction phase outcome or recurrence of Crohn’s disease following surgery). In addition, at least one study was excluded for using an infliximab dosing regimen that exceeds the recommended induction dosage in Canada (i.e., 10 mg/kg rather than 5 mg/kg). Studies were also excluded if they investigated combination usage of a tumour necrosis factor (TNF) alpha antagonist with other agents (e.g., azathioprine).

Quality assessment of included studies

Quality assessment of the individual included studies was performed, but the specific instrument used was not identified in the methods section. Based on the results section, the following characteristics were considered in the manufacturer’s quality assessment: Allocation concealment, blinding, withdrawals, and the use of an intention-to-treat (ITT) analysis.

Indirect comparison methods

The manufacturer conducted IDCs of vedolizumab versus infliximab and adalimumab using the Bucher method, with placebo as the common comparator. All of the outcomes that were evaluated in the IDC were dichotomous outcomes, and differences between treatments were reported as relative risks (RRs). In instances where there are data from multiple clinical studies (e.g., GEMINI II and GEMINI III), the results from the individual studies were pooled using a random-effects model. The pooled estimate was subsequently used in the Bucher calculations. Analyses were conducted using the overall study populations, and subgroup analyses were conducted for patients who were TNF alpha antagonist-naive and those who had experienced failure with or intolerance to one or more TNF alpha antagonists. A summary of the IDCs conducted by the manufacturer is provided in Table 41 for the induction studies and in Table 42 for the maintenance studies.
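The Bucher calculation described above can be sketched as follows. This is a minimal illustration, not the manufacturer’s implementation: it assumes each direct comparison against placebo is available as an RR with a 95% CI, recovers the standard error of the log RR from that CI, and combines the two comparisons on the log scale. The function names and the input values are hypothetical and are not taken from the IDC report.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def se_from_ci(lower, upper):
    """Standard error of log(RR), recovered from a 95% CI on the RR scale."""
    return (math.log(upper) - math.log(lower)) / (2 * Z95)


def bucher_indirect(rr_a, ci_a, rr_b, ci_b):
    """Indirect RR of treatment A vs. treatment B via a common comparator.

    rr_a, ci_a: RR and (lower, upper) 95% CI for A vs. placebo
    rr_b, ci_b: RR and (lower, upper) 95% CI for B vs. placebo
    Returns (RR, lower, upper) for A vs. B.
    """
    # The difference of the log RRs gives the indirect log RR.
    log_rr = math.log(rr_a) - math.log(rr_b)
    # The variances of the two direct estimates add.
    se = math.sqrt(se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2)
    return (
        math.exp(log_rr),
        math.exp(log_rr - Z95 * se),
        math.exp(log_rr + Z95 * se),
    )


# Hypothetical inputs for illustration only (NOT values from the report).
rr, lo, hi = bucher_indirect(1.5, (1.0, 2.25), 3.0, (1.5, 6.0))
print(f"Indirect RR {rr:.2f} (95% CI, {lo:.2f} to {hi:.2f})")
```

Because the variances of the two direct estimates are summed, the indirect CI is always wider than either direct CI on the log scale, which is relevant when interpreting intervals that do not exclude unity.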

Table 41

Outcomes and Populations Evaluated in Indirect Comparison for Induction Studies.

Table 42

Outcomes and Populations Evaluated in Indirect Comparison for Maintenance Studies.

Results

Study and patient characteristics

In total, nine unique placebo-controlled randomized controlled trials were included in the manufacturer’s IDC. The following studies were included in the manufacturer’s evaluation of induction phase end points: Two studies of vedolizumab (GEMINI II and GEMINI III), one study of infliximab (T16),⁸² and three studies of adalimumab (CLASSIC-I,⁸³ GAIN,⁸⁴ and Watanabe et al.⁸⁵). The following studies were included in the manufacturer’s evaluation of maintenance phase end points: one study of vedolizumab (GEMINI II), one study of infliximab (ACCENT I⁶⁷), and three studies of adalimumab (CLASSIC II,⁵⁴ CHARM,⁸⁶ and Watanabe et al.⁸⁵). Study characteristics of the randomized controlled trials included in the IDC are summarized in Table 43. The vedolizumab trials (GEMINI II and GEMINI III) have already been detailed in this review. All of the included studies were used in the IDCs for the efficacy end points, with the exception of the CLASSIC II maintenance study, which was excluded due to the requirement for demonstrating clinical response twice for inclusion in the maintenance phase (i.e., at weeks 4 and 8). However, CLASSIC II was included in the IDC for safety end points.

Table 43

Select Study Characteristics Included in the Indirect Comparison.

Induction therapy

The results of the IDCs for induction therapy are summarized in Figure 6. In the overall treatment population, the manufacturer’s IDC of vedolizumab versus infliximab provided estimates of effect favouring treatment with infliximab for inducing clinical remission (RR 0.15; 95% confidence interval [CI], 0.02 to 1.11) and clinical response (RR 0.29; 95% CI, 0.12 to 0.74). Similar results were reported for the TNF-naive subgroup analyses, with RRs of 0.19 (95% CI, 0.02 to 1.48) and 0.28 (95% CI, 0.11 to 0.73) for inducing clinical remission and clinical response, respectively. The manufacturer reported that vedolizumab was noninferior to infliximab for the induction of clinical remission in both the overall and TNF-naive populations and inferior for the induction of clinical response in both populations.

Figure 6

Results From the Indirect Comparison of Induction Studies. ADA = adalimumab; CI = confidence interval; IFX = infliximab; ITT = intention-to-treat; RR = relative risk; TNF = tumour necrosis factor alpha antagonist; VDZ = vedolizumab; vs. = versus.

Similar to the comparison against infliximab, the indirect estimate of effect for clinical remission favoured adalimumab over vedolizumab (RR 0.61; 95% CI, 0.34 to 1.08), although the upper bound of the CI did not exclude unity. The indirect estimates for enhanced clinical response (RR 0.84; 95% CI, 0.55 to 1.28) and clinical response (RR 0.87; 95% CI, 0.67 to 1.14) were closer to unity. Results were similar in the TNF-failure subgroups. For all comparisons in the induction phase, the manufacturer reported that vedolizumab was noninferior to adalimumab.

Maintenance therapy

The results of the IDCs for maintenance therapy are summarized in Figure 7. The manufacturer reported that vedolizumab was noninferior to infliximab for maintenance of clinical remission (RR 0.87; 95% CI, 0.49 to 1.69) and durable clinical remission (RR 0.66; 95% CI, 0.30 to 1.45). In contrast, the manufacturer reported that vedolizumab was inferior to infliximab for maintaining clinical response (RR 0.52; 95% CI, 0.30 to 0.92). Results were similar for the TNF-naive subgroup; however, the indirect estimate for maintaining clinical response did not exclude unity, and therefore, the manufacturer claimed that vedolizumab was noninferior to infliximab.

Figure 7

Results From the Indirect Comparison of Maintenance Studies. ADA = adalimumab; CI = confidence interval; IFX = infliximab; ITT = intention-to-treat; RR = relative risk; TNF = tumour necrosis factor alpha antagonist; VDZ = vedolizumab; vs. = versus.

The indirect estimate of effect for maintaining clinical remission favoured adalimumab compared with vedolizumab; however, the CI did not exclude unity (RR 0.58; 95% CI, 0.37 to 1.01). Therefore, the manufacturer reported that vedolizumab was noninferior to adalimumab for the maintenance of clinical remission. The manufacturer reported that vedolizumab was inferior to adalimumab for enhanced clinical response (RR 0.56; 95% CI, 0.35 to 0.90) and clinical response (RR 0.51; 95% CI, 0.32 to 0.79) and noninferior to adalimumab for corticosteroid-free clinical remission (RR 0.41; 95% CI, 0.13 to 1.28). For the TNF-failure subpopulation, the manufacturer reported that vedolizumab was noninferior to adalimumab for both clinical remission and enhanced clinical response.

Harms

The manufacturer conducted a number of IDCs for safety end points in both the induction and maintenance phases. RRs for the IDCs are summarized in Figure 8. The manufacturer reported that vedolizumab was noninferior to the comparators for all safety end points with the exception of being associated with a reduced risk of withdrawals due to adverse events compared with infliximab in the maintenance phase and a greater risk of serious adverse events compared with infliximab and adalimumab in the maintenance phase.

Figure 8

Results From the Indirect Comparison of Harms From the Induction and Maintenance Studies. ADA = adalimumab; AE = adverse event; CI = confidence interval; IFX = infliximab; RR = relative risk; SAE = serious adverse event; VDZ = vedolizumab; vs. = versus.

Critical Appraisal of Manufacturer’s Indirect Comparison

The quality of data reported in the manufacturer’s IDC was assessed according to the recommendations provided by the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) Task Force on Indirect Treatment Comparisons.⁸⁷ A summary of heterogeneity is provided in Table 44 for the induction studies and Table 45 for the maintenance studies. The manufacturer’s rationale for conducting the IDC (i.e., absence of head-to-head studies) and the objectives of the IDC (i.e., comparisons of vedolizumab against infliximab and adalimumab) are clearly reported in the manufacturer’s submission. However, the manufacturer did not provide a rationale for electing to submit an IDC using the Bucher method, as opposed to the network meta-analysis (NMA) that was submitted to NICE (National Institute for Health and Care Excellence) and the Scottish Medicines Consortium for the same indication.

Table 44

Heterogeneity in the Induction Studies Included in the Manufacturer’s Indirect Comparison.

Table 45

Heterogeneity in the Maintenance Studies Included in the Manufacturer’s Indirect Comparison.

Study characteristics

The GEMINI II and GEMINI III studies (concluded in 2012) were conducted later than trials for the comparators, particularly for infliximab (T16 concluded in 1996;⁸² ACCENT I concluded in 2001⁶⁷). It is possible that the clinical management of Crohn’s disease has evolved over the period since the introduction of the first biologic, introducing heterogeneity between the included studies.

The only study included in the induction phase IDC for infliximab (i.e., T16) evaluated the efficacy end points after a single 5 mg/kg induction dose of infliximab.⁸² This is not reflective of the induction dosage regimen recommended in the Canadian product monograph (i.e., 5 mg/kg at weeks 0, 2, and 6).¹³,¹⁴ In contrast, vedolizumab was administered multiple times prior to the evaluation of efficacy end points (i.e., at weeks 0 and 2 in GEMINI II and at weeks 0, 2, and 6 in GEMINI III).¹⁵,²¹ These additional doses of active treatment may bias the study results in favour of vedolizumab. The ACCENT I maintenance study that was used for the IDC comparing vedolizumab with infliximab used a maintenance dose of 5 mg/kg every eight weeks, which is consistent with recommendations in the Canadian product monograph;¹³,¹⁴ however, clinical response in the induction phase was evaluated at two weeks, after a single 5 mg/kg infusion at week 0.⁶⁷ Patients who demonstrated a clinical response at week 2 were subsequently randomized to receive infusions of either infliximab or placebo at weeks 2 and 6, and every eight weeks thereafter. Therefore, patients in the placebo group of ACCENT I received only a single infusion of active treatment (i.e., at week 0) compared with the two infusions of active treatment in the GEMINI II trial (i.e., at weeks 0 and 2).¹⁵ This difference in exposure to the active treatments within the placebo groups is a significant source of heterogeneity between the vedolizumab and infliximab trials and could contribute to the reduced placebo-response rates reported in ACCENT I compared with those reported in GEMINI II.

The induction phase trials⁸³–⁸⁵ that were included in the IDC for adalimumab each had one treatment group that used induction doses that were consistent with recommendations in the Canadian product monograph (i.e., 160 mg at week 0 and 80 mg at week 2).²⁴ Dosing in at least one treatment group of the maintenance phase of the adalimumab trials was also consistent with recommendations in the Canadian product monograph (i.e., 40 mg every two weeks); however, the doses provided in the induction phase of the maintenance trials were below the recommended doses: all patients in CHARM received adalimumab at doses of 80 mg at week 0 and 40 mg at week 2,⁸⁶ and patients in Watanabe could have received 160 mg/80 mg or 80 mg/40 mg at weeks 0 and 2 (respectively).⁸⁵ Similar to the infliximab comparison, these differences in exposure to active treatment within the placebo groups are a significant source of heterogeneity between the vedolizumab and adalimumab trials and could contribute to the reduced placebo-response rates reported in the CHARM and Watanabe trials compared with those reported in GEMINI II. Patients in CLASSIC II also received induction doses below those recommended in Canada (i.e., 80 mg at week 0 and 40 mg at week 2).⁵⁴ As noted previously, this study was not used in the IDC for efficacy evaluations due to the requirement for demonstrating clinical response twice for inclusion in the maintenance phase (i.e., at both week 4 and week 8); however, it was included in the pairwise frequentist meta-analyses used to calculate the pooled relative risks for the various safety end points. For all safety comparisons, using less than the recommended doses of infliximab and adalimumab could underestimate the comparative harms associated with these treatments (with the exception of those associated with disease exacerbation).

As shown in Table 43, induction of clinical remission was evaluated at four weeks in the adalimumab and infliximab trials and six weeks in the vedolizumab trials. The Australian Pharmaceutical Benefits Advisory Committee noted that this difference could potentially favour vedolizumab.¹⁶ In the maintenance trials, there were also differences in the timing used to evaluate response to induction treatment prior to enrolment in the maintenance phase. All of the maintenance trials used in the IDC efficacy evaluations used clinical response (i.e., a reduction of at least 70 in CDAI score) as the threshold for inclusion; however, this was assessed at week 6 in the vedolizumab trial (GEMINI II), week 4 in the adalimumab trials (CHARM and Watanabe),⁸⁵,⁸⁶ and week 2 in the infliximab trial (ACCENT I).⁶⁷ In addition, the efficacy end points in the maintenance phase studies were also evaluated at different time points (46 weeks with vedolizumab, 52 weeks with infliximab, and 52 to 56 weeks with adalimumab). Given that patients who failed to complete the trials were considered to be nonresponders in all of the included studies and that the proportion of patients who withdrew for any reason (including loss of efficacy and patients who were lost to follow-up) increased with time, having an earlier end point evaluation in the maintenance phase could favour vedolizumab treatment compared with the alternatives.

Study populations

The population of interest for the current CDR submission is patients with moderately to severely active Crohn’s disease who have had an inadequate response to alternative therapies (as per the indication under review for vedolizumab). Mean baseline CDAI scores for the induction phase studies were all within the moderate to severe range and were generally similar across the different studies. However, there is substantial heterogeneity in the characteristics of the different study populations, including clinically relevant parameters such as prior exposure to TNF antagonists and concomitant use of corticosteroids.

Patients were enrolled in the maintenance phase studies only if they had demonstrated a response to the active treatment in the induction phase. This introduces variation within the placebo groups across the studies, as the patients who were randomized to receive placebo in the maintenance phase had been previously treated with a different biologic therapy (i.e., vedolizumab, adalimumab, or infliximab). There were also differences in the placebo-response rates for maintaining clinical remission across the studies (12% and 9% in the adalimumab trials; 14% in the infliximab trial; and 22% in the vedolizumab trial). The reason for these differences in the baseline risk for inducing and maintaining clinical remission is unclear; however, the manufacturer of vedolizumab has suggested that the differences in the maintenance phase could be attributed to the longer-lasting effect of vedolizumab compared with the TNF alpha antagonists (i.e., remission induced as a result of vedolizumab treatment is maintained longer than remission induced with the TNF alpha antagonists following removal of active treatment). Overall, these differences are an important source of between-study heterogeneity, and their implications for the results of the IDC are unclear.

The placebo-response rate for inducing clinical remission was lower in the infliximab trial (4%) compared with the trials for vedolizumab (7% to 12%) and adalimumab (7% to 13%). Similar to the maintenance phase analyses, the reasons for the differences in placebo-response rates are unclear for the induction phase, and the analyses were not adjusted for differences in the placebo-response rates.

As infliximab was the first biologic to be approved for use in the treatment of Crohn’s disease, all patients enrolled in the infliximab trials were naive to biologic therapy for Crohn’s disease. In contrast, the study populations of the GEMINI II and GEMINI III studies were composed of 50% and 75% patients, respectively, who had previously failed at least one TNF alpha antagonist. In addition, as shown in Table 12, a significant proportion of the patients in the vedolizumab trials had failed treatment with two TNF alpha antagonists (approximately 20% in GEMINI II and 40% in GEMINI III) or three TNF alpha antagonists (5% in GEMINI II and 8% in GEMINI III). Some of the adalimumab trials included patients with prior exposure to TNF antagonists, although few would have failed multiple TNF alpha antagonists, unlike the patients enrolled in the vedolizumab trials. These differences in prior exposure to biologic therapy for Crohn’s disease may be clinically relevant and may be an indication that the study populations of the GEMINI II and GEMINI III trials are composed of patients with Crohn’s disease that is more refractory to treatment.

There were differences between the induction studies in the proportion of patients using corticosteroids at baseline. Usage of corticosteroids was reported for 60% of patients in the infliximab trial,⁸² approximately 50% in the vedolizumab trials, and ranged from 23% to 39% in the adalimumab trials. The clinical expert consulted by CDR indicated that dependence on corticosteroids was more common in Crohn’s disease patients before the introduction of TNF alpha antagonists. Hence, the greater usage of corticosteroids in the T16 infliximab trial may be a reflection of clinical practice at that time (i.e., 1996),⁸² when there were fewer alternative treatments for patients with refractory Crohn’s disease. The use of corticosteroids in the maintenance phase was similar in the vedolizumab trial (GEMINI II, 53%),¹⁵ the infliximab trial (ACCENT I, 52%),⁶⁷ and the smallest of the adalimumab trials (CLASSIC II, 49%).⁵⁴ For the remaining adalimumab studies, corticosteroid usage was slightly lower in CHARM (42%)⁸⁶ and substantially lower in Watanabe (16%).⁸⁵ As noted previously, the manufacturer conducted pairwise frequentist meta-analyses to calculate the differences for adalimumab versus placebo. As expected, given the differences in sample size, the results from CHARM contributed to more than 90% of the estimated treatment effect for all efficacy end points for adalimumab.⁵⁰ This weighting of the response may help mitigate the potential impact of the large disparity in corticosteroid usage between Watanabe and the other trials. Overall, there is insufficient evidence to evaluate whether or not these across-trial imbalances in corticosteroid usage could influence the results of the IDCs, particularly in the induction phase analyses where the differences were most pronounced.
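The dominance of a single large trial in a pooled estimate can be illustrated with a short sketch of random-effects pooling. This assumes the DerSimonian-Laird estimator, which is one common choice; the report does not state which random-effects estimator the manufacturer used. The study inputs below are hypothetical log RRs and standard errors, not values from CHARM, Watanabe, or any other included trial.

```python
import math


def dersimonian_laird(log_rrs, ses):
    """Random-effects pooling of log RRs (DerSimonian-Laird estimator).

    Returns (pooled log RR, its standard error, tau^2, weight shares).
    """
    w = [1 / se ** 2 for se in ses]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * y for wi, y in zip(w, log_rrs)) / sum(w)
    # Cochran's Q measures dispersion of study effects around the fixed estimate.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_rrs))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero; zero when only one study.
    tau2 = max(0.0, (q - (len(w) - 1)) / c) if c > 0 else 0.0
    w_re = [1 / (se ** 2 + tau2) for se in ses]  # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_re, log_rrs)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    shares = [wi / sum(w_re) for wi in w_re]
    return pooled, se_pooled, tau2, shares


# Hypothetical inputs: one large, precise trial and one small, imprecise trial.
log_rrs = [math.log(2.0), math.log(2.5)]
ses = [0.10, 0.40]
pooled, se_pooled, tau2, shares = dersimonian_laird(log_rrs, ses)
print(f"Pooled RR {math.exp(pooled):.2f}; weight shares {shares}")
```

With inputs like these, the precise trial carries most of the weight, so a baseline imbalance confined to the small trial has limited influence on the pooled estimate.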

Systematic review methods

The methods for the literature search were incomplete, with inadequate reporting of the following information: Electronic search strategy, search terms, dates associated with the original and updated literature searches, and the limitations used in the search. Overall, eligibility criteria for the IDC were poorly reported. The report contains a general PICO (population, intervention, comparison, outcomes) statement in the objective section, but provides no details in the methods section. However, the relevant information can be inferred based on the study inclusions and the exclusion reasons provided for the individual studies. Definitive statements about the following criteria are absent from the report: definitions for moderate to severe Crohn’s disease, acceptable dosage regimens of interventions and comparators, and the minimum study durations. The end points of interest for the review are identified only as “key efficacy and safety outcomes” in the methods section; however, the end points that were evaluated are adequately described in the IDC report.

Methods for the systematic review are poorly reported, with just a statement indicating that “relevant methods suitable for determining the evidence base that can be included in a health technology assessment (HTA) submission” were used. There is no description of methodology used for study selection, data extraction, or critical appraisal. Quality assessment of individual studies was performed, but the specific instrument that was used was not identified in the methods section. Based on the results section, the following characteristics were included in the quality assessment: Allocation concealment, blinding, follow-up, and use of an ITT analysis.

Analysis methods

The methodological description of the Bucher analyses was adequately reported, and both direct (placebo comparisons) and indirect (active comparisons) estimates of effect are presented in the report (as applicable). The results of the IDCs for both efficacy and safety end points were adequately reported in summary tables, as RRs with 95% CIs. Study-level results and direct pairwise meta-analyses were also presented.

There was no description or justification for the manufacturer’s noninferiority assessments, which were reported for all efficacy and safety end points. There is no discussion of the noninferiority margin that was used (though it appears that any indirect estimates where the upper bound of the CI did not exclude unity were considered to be noninferior by the manufacturer). It is unclear if the manufacturer’s IDCs were adequately powered to evaluate noninferiority for any of the outcomes that were assessed. Similar concerns were noted by the Australian Pharmaceutical Benefits Advisory Committee, which appraised the same indirect comparative data.¹⁶ In addition, all of these analyses were conducted using the ITT analysis populations from the included trials; however, per-protocol data sets are typically considered to be more conservative when establishing the noninferiority of two treatments. The individual trials included in the manufacturer’s IDC were relatively short-term studies that were not powered or designed to conduct robust statistical evaluations of adverse events. Therefore, it is unclear if the IDCs that were calculated using the effect sizes derived from the individual studies had sufficient statistical power to support the manufacturer’s claims of noninferiority.
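The concern about the missing margin can be made concrete with a small sketch that contrasts two decision rules. The first is the rule the manufacturer appears to have applied, as inferred by the reviewers; the second is a conventional margin-based check. The margin of 0.80 is a hypothetical value chosen only for illustration, since the report states none. The RRs and CIs are the overall-population induction estimates reported earlier in this appendix.

```python
def apparent_rule(ci_upper):
    """Rule the manufacturer appears to have used (inferred, not documented):
    'noninferior' whenever the upper CI bound does not exclude 1."""
    return "noninferior" if ci_upper >= 1.0 else "inferior"


def margin_rule(ci_lower, margin):
    """Conventional check: noninferiority is shown only if the lower CI bound
    lies above a prespecified margin (here RR < 1 favours the comparator)."""
    return "noninferior" if ci_lower > margin else "not demonstrated"


MARGIN = 0.80  # hypothetical margin; the IDC report does not specify one

# (label, RR, lower 95% bound, upper 95% bound), overall induction population
estimates = [
    ("VDZ vs. IFX, clinical remission", 0.15, 0.02, 1.11),
    ("VDZ vs. IFX, clinical response", 0.29, 0.12, 0.74),
    ("VDZ vs. ADA, clinical remission", 0.61, 0.34, 1.08),
]

results = {
    label: (apparent_rule(hi), margin_rule(lo, MARGIN))
    for label, rr, lo, hi in estimates
}
for label, (apparent, with_margin) in results.items():
    print(f"{label}: apparent rule = {apparent}; margin rule = {with_margin}")
```

Under the apparent rule, a wide interval is sufficient to yield a noninferiority claim even when the point estimate is far from unity, whereas a margin-based rule would not support that claim for any of these three estimates at the assumed margin.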

Subgroup analyses were conducted based on whether or not the patients were treatment-experienced or treatment-naive with TNF alpha antagonists, which are relevant patient characteristics. The manufacturer provides a description of some potential sources of bias in the IDC (e.g., differences in placebo-response rates); however, there were no sensitivity analyses conducted to investigate the potential effects of such bias (though the limited number of studies that were considered to be comparable would likely preclude the conduct of these analyses).

Conclusion

The manufacturer submitted an IDC of vedolizumab versus infliximab and adalimumab using the Bucher method, with placebo as the common comparator. The manufacturer reported that vedolizumab was noninferior to infliximab for inducing and maintaining clinical remission, but inferior for inducing and maintaining clinical response. The manufacturer also reported that, compared with adalimumab, vedolizumab was noninferior for inducing and maintaining clinical remission and corticosteroid-free clinical remission and inducing clinical response. Vedolizumab was inferior to adalimumab for maintaining enhanced clinical response and clinical response.

The manufacturer’s claims of noninferiority are limited by the absence of any pre-specified noninferiority margins or considerations of the statistical power required to make such conclusions. In addition, there is substantial heterogeneity in the study designs and patient characteristics across the studies included in the IDC. Overall, given the limitations of the manufacturer’s analysis and the heterogeneity across studies, the comparative efficacy of these agents is uncertain in both the induction and maintenance phases of treatment. Therefore, there is uncertainty with the manufacturer’s conclusions about the noninferiority or inferiority of vedolizumab compared with infliximab and adalimumab.

Except where otherwise noted, this work is distributed under the terms of a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International licence (CC BY-NC-ND), a copy of which is available at http://creativecommons.org/licenses/by-nc-nd/4.0/

Bookshelf ID: NBK424352
