Results

NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Clinical Review Report: Insulin Degludec (Tresiba): (Novo Nordisk Canada Inc): Indication: For once-daily treatment of adults with diabetes mellitus to improve glycemic control [Internet]. Ottawa (ON): Canadian Agency for Drugs and Technologies in Health; 2017 Dec.


Findings From the Literature

A total of 20 studies were identified from the literature for inclusion in the systematic review (Figure 1). The included studies are summarized in Table 11 and described in the Included Studies section. A list of excluded studies is presented in Appendix 3.

Figure 1

Flow Diagram for Inclusion and Exclusion of Studies.

Table 11

Details of Included Studies — DEVOTE, SWITCH-1, and SWITCH-2.

Included Studies

Description of Studies

Fifteen manufacturer-sponsored randomized controlled trials (RCTs) plus five extensions were included in this systematic review, four of which were double blinded and the remainder open label. The studies featured different populations, T1DM and T2DM. Another four RCTs were excluded because either their small sample size or their short duration suggested they were unlikely to add any additional information beyond what was obtained in the studies summarized in the report.

DEVOTE was a double-blind RCT that compared IDeg with IGlar, both in a basal-bolus regimen, in a population of 7,637 patients with T2DM and cardiovascular disease. The primary outcome was the time to first major adverse cardiovascular event (MACE), testing the noninferiority of IDeg to IGlar, with a margin for noninferiority of 1.3 for the upper limit of the 95% confidence interval (CI) for the hazard ratio. DEVOTE was an event-driven study, targeting 633 MACEs before study completion, and it lasted a mean of 24 months. Confirmatory secondary end points, which all tested the superiority of IDeg to IGlar, included the number of confirmed hypoglycemic episodes and the occurrence of at least one hypoglycemic episode within a participant. The study had a two-week screening period, as well as a 30-day follow-up period at the end of study. DEVOTE is by far the largest of the studies included in this review.

The SWITCH studies, SWITCH-1 (T1DM) and SWITCH-2 (T2DM), employed a crossover design, with patients randomized to start on either IDeg or IGlar and then cross over to the other intervention after 32 weeks of therapy, resulting in a total treatment period of 64 weeks. The primary outcome of the SWITCH studies was the proportion of participants with severe or blood glucose–confirmed symptomatic hypoglycemic episodes during the maintenance period, that is, after 16 weeks of treatment. SWITCH-1 tested the noninferiority of IDeg to IGlar, with noninferiority confirmed if the upper bound of the 95% CI for the rate ratio was ≤ 1.10. SWITCH-2 tested the superiority of IDeg to IGlar. In each study, before testing of the primary outcome could proceed, noninferiority had to be confirmed for the secondary supportive end point of change from baseline in glycated hemoglobin (A1C) after 32 weeks of therapy. The margin for noninferiority was 0.4%, the same margin used in the BEGIN trials, described below. Each of the SWITCH studies had a two-week screening period and a one-week follow-up.

All of the BEGIN trials that compared IDeg with another basal insulin (described in more detail below) were noninferiority trials that tested the noninferiority of IDeg to a comparator for the primary outcome of change from baseline in A1C. Most of the trials had confirmatory secondary outcomes that compared the superiority of IDeg against the comparator. The extension trials, where they occurred, focused on the long-term safety of IDeg; therefore, there were no efficacy outcomes assessed. In the extensions, all patients continued in their originally randomized groups.

Type 1 Diabetes Mellitus

Studies 3583 (BBT1 [basal-bolus T1DM] Long, 52-week treatment period), 3770 (Flex T1, 26 weeks), and 3585 (BBT1, 26 weeks) were all open label, and all had extensions where patients continued on their originally randomized treatments. Study 3585 had IDet as a comparator, while the other two studies compared IDeg with IGlar. In Flex T1, the objective of the study was to compare a flexible dosing regimen of IDeg with a regular IDeg regimen or with IGlar. The studies had one-week screening and one-week follow-up. The primary outcome of Studies 3770, 3583, and 3585 was the change from baseline to end of treatment in A1C. All of these studies tested the noninferiority of IDeg to a comparator, with a margin for noninferiority of 0.4% for the change from baseline in A1C. All but Study 3770 had confirmatory secondary end points that tested the superiority of IDeg to their respective comparators.

Type 2 Diabetes Mellitus

Insulin-Naive

Five open-label RCTs and one double-blind RCT enrolled patients with T2DM who were insulin-naive. All participants were receiving OADs. The comparator in four of the studies was IGlar, while in Study 3580 (BEGIN Early) the comparator was sitagliptin and in Study 3944, the only double-blind RCT, the comparator was placebo. Study 3672 (BEGIN Low Volume) compared the more concentrated formulation, IDeg 200 U/mL, with IGlar, while the other studies used the standard 100 U/mL concentration. Five studies were 26 weeks in duration, while Study 3579 (BEGIN Once Long) was a 52-week study. All studies except Study 3944 had a one-week screening period and at least a one-week follow-up, while Study 3944 also had a 15-week run-in where participants were initiated on liraglutide, which they would all continue once the double-blind period started. One trial (Study 3579) had an extension while the others did not. The primary outcome in all studies was the change in A1C from baseline to end of study. Studies with IGlar as a comparator tested the noninferiority of IDeg to IGlar for the primary end point, while Studies 3580 and 3944 tested the superiority of IDeg for the primary outcome. Confirmatory secondary outcomes, which all tested the superiority of IDeg to a comparator, included the number of treatment-emergent severe or minor hypoglycemic episodes, change from baseline in fasting plasma glucose (FPG), within-patient variability as measured by coefficient of variation in self-measured FPG, and responders without hypoglycemic episodes (A1C < 7.0% at end of trial and no severe or minor hypoglycemic episodes during the last 12 weeks of treatment including only patients exposed for at least 12 weeks).
The confirmatory secondary outcomes in Study 3580 were change from baseline in FPG, frequency of responders (A1C < 7.0% at end of trial), and frequency of responders without hypoglycemic episodes (A1C < 7.0% at end of trial and no severe or minor hypoglycemic episodes during the last 12 weeks of treatment).

Basal Insulin Only (± OADs)

Two open-label RCTs (Studies 3668 and 3943) were included with this population. In Study 3668, all participants received metformin, with or without a dipeptidyl peptidase-4 inhibitor, and IDeg was compared with IGlar over a 26-week treatment course. The trial had a one-week screening period and at least one week of follow-up. The primary outcome was change from baseline in A1C to the study end point at 26 weeks; the study tested the noninferiority of IDeg to IGlar using the same noninferiority margin as the other BEGIN trials, 0.4%. Secondary end points, none of which appeared to be confirmatory, included the number of confirmed hypoglycemic episodes, change from baseline in FPG, within-patient variability in pre-breakfast, self-measured plasma glucose, and responders without hypoglycemic episodes (A1C < 7.0% at end of trial and no confirmed episodes during the last 12 weeks of treatment, including only patients exposed for at least 12 weeks). Study 3943 was a noninferiority, open-label RCT with a crossover design featuring two treatment periods of 16 weeks each where IDeg was compared with IGlar. Participants were all on metformin plus or minus an additional OAD. The trial had a one-week screening period and a 16-week run-in where participants discontinued their OAD (other than metformin) and were initiated on a regimen of IGlar, as well as a one-week follow-up. The purpose of the run-in was to establish which participants required a “high” dose of IGlar (> 81 units), as this was the population of interest for the study. The primary outcome again tested noninferiority for the change from baseline in A1C. Secondary outcomes were A1C responders, change from baseline in FPG, self-measured plasma glucose, and patient-reported outcomes; however, none of these appeared to be confirmatory.

Basal-Bolus Insulin (± OADs)

One noninferiority open-label RCT (Study 3582) was included with this population. Participants were receiving metformin, plus or minus pioglitazone, and IDeg was compared with IGlar. Participants were on a regimen that combined these basal insulins with insulin aspart. The primary outcome was the change in A1C from baseline to the study end point at 52 weeks. Confirmatory secondary end points included change from baseline in FPG after 52 weeks, frequency of responders (A1C < 7.0% at end of trial), and frequency of responders without hypoglycemic episodes (A1C < 7.0% at end of trial and no severe or minor hypoglycemic episodes during the last 12 weeks of treatment).

Populations

Inclusion and Exclusion Criteria

Participants in DEVOTE had T2DM, with an A1C of 7% or more (or below 7% if receiving current insulin therapy of at least 20 units daily). Participants were currently treated with one or more oral or injectable antidiabetes drugs. They had to be at least 50 years old and have evidence of cardiovascular disease or chronic kidney disease (Table 11).

In the trials in T1DM, participants had to have been treated on a basal-bolus regimen for at least 12 months, with an A1C of 10% or less and a BMI of 35 kg/m² or less. For a high-level summary of these trials, see Table 12; for detailed summaries, see Table 39 and Table 40.

Table 12

Type 1 Diabetes Mellitus and Type 2 Diabetes Mellitus Trial Details.

In the T2DM trials in insulin-naive patients, participants had to have had T2DM for at least six months, an A1C with a lower limit of 7.0% or 7.5% and an upper limit of 10%, and a maximum BMI of 40 kg/m² to 45 kg/m². All were receiving OADs for at least three months before randomization in a regimen that typically featured metformin with or without another OAD. See Table 12 for a high-level summary of study designs; for detailed summaries, see Table 41, Table 42, Table 43, Table 44, and Table 45.

Participants in Study 3668 (basal only) had to have had T2DM for at least six months and be on OAD monotherapy, insulin monotherapy, or a combination of the two. The only allowed OADs were metformin, insulin secretagogues, or pioglitazone. Participants on OAD alone had to have an A1C between 7% and 11%. Those on combination basal insulin and OAD or basal insulin monotherapy were to be between 7% and 10%. In the other basal-only study, participants were to have an A1C of at least 7.5% and be on metformin with or without another OAD. For a high-level summary of these two studies, see Table 12; for detailed summaries, see Table 46 and Table 47.

In the basal-bolus study, participants were to have had T2DM for at least six months and could be on any insulin regimen, with or without OADs, for at least three months before randomization. Their A1C had to be between 7.5% and 11%, and their maximum BMI was to be 40 kg/m². For a high-level summary, see Table 12; for a detailed summary, see Table 48.

Baseline Characteristics

Trials with populations with T1DM tended to feature younger participants (early to mid-40s) versus studies that focused on T2DM (mid-50s to mid-60s), which aligns with the onset and progression of the disease subtypes. Among T2DM studies, the oldest populations were in DEVOTE (65 years of age) and SWITCH-2 (61 years of age) (Table 13, Table 14).

Table 13

Summary of Baseline Characteristics — DEVOTE.

Table 14

Summary of Baseline Characteristics — SWITCH-1 and SWITCH-2.

Across all studies, the majority of participants were male, and most were Caucasian, with the exception of Studies 3585 and 3586, where almost all participants were Asian (non-Indian), and Study 3587, where about two-thirds were Asian (non-Indian).

In the T1DM studies, between 14% and 29% of participants had diabetes complications at baseline. In DEVOTE, 86% of participants had established cardiovascular or chronic kidney disease, while in the other T2DM studies, the proportion of participants with diabetes complications varied widely at baseline, from a low of around 10% in Studies 3579 and 3580 to a high of about 40% in Study 3586.

Baseline characteristics were generally balanced between groups within studies. The most common baseline parameter to differ between groups was gender, with the largest difference between groups found in Study 3668, where 59% of participants in the IDeg-Flex (IDeg flexible dosing regimen) group and 48% of participants in the IGlar group were male.

Table 15

Summary of Baseline Characteristics — Type 1 Diabetes Mellitus (Studies 3770, 3583, and 3585).

Table 16

Summary of Baseline Characteristics — Type 2 Diabetes Mellitus, Insulin-Naive (Studies 3579, 3580, 3586, and 3672).

Table 17

Summary of Baseline Characteristics — Type 2 Diabetes Mellitus, Insulin-Naive (Studies 3587 and 3944).

Table 18

Summary of Baseline Characteristics — Type 2 Diabetes Mellitus, Basal Insulin (Studies 3668 and 3943).

Table 19

Summary of Baseline Characteristics — Type 2 Diabetes Mellitus, Basal-Bolus (Study 3582).

Interventions

The studies generally employed a treat-to-target strategy for insulin dosing. For example, in SWITCH-1, participants’ basal insulin dose was titrated to a target self-measured plasma glucose of 4.0 mmol/L to 5.0 mmol/L. A dose reduction was to be implemented if one or more of the pre-breakfast glucose values was < 4.0 mmol/L. Bolus insulin (insulin aspart) was titrated individually based either on carbohydrate counting or by using a sliding scale based on the lowest of three pre-meal or bedtime glucose values. Participants were typically given an algorithm they used to adjust their insulin regimens throughout the trial.

The majority of studies randomized participants in a 1:1 manner; however, Studies 3585, 3586, and 3587 randomized participants in a 2:1 manner (IDeg to comparator), and Studies 3579, 3583, and 3582 randomized 3:1. Two studies, 3770 and 3668, included an IDeg-Flex regimen in addition to the standard IDeg daily regimen, where participants’ injections were to be given in a rotating schedule with eight-hour to 40-hour intervals between doses. These studies randomized participants 1:1:1. Where stratification was reported or performed, the most common stratification variable was region (Studies 3587, 3944, 3586, 3585). Study 3580 stratified by use of pioglitazone at screening, Study 3582 by prior insulin regimen, and Study 3668 by prior treatment (insulin, OAD, or both). The SWITCH studies and DEVOTE did not report whether stratification occurred.

Type 2 Diabetes Mellitus

Inclusion criteria specified adequate minimum doses of OADs to ensure that antidiabetes therapy was optimized before intervention and that inadequacy of glycemic control at baseline was not due to suboptimal dosing of OAD treatment. No washout period was applied. During the trials, patients continued on the pre-specified OADs at unchanged doses unless dose reduction was required for safety reasons.

Two studies, 3944 and 3943, had extensive run-in periods. In Study 3944, the 15-week run-in was used to initiate participants on liraglutide, which was to become the standard adjunctive therapy, added to IDeg and placebo, during the treatment phase. In Study 3943, the 16-week run-in was used to determine which participants needed a high (> 80 units) dose of IGlar to maintain glycemic control.

Outcomes

The primary outcome of DEVOTE was time from randomization to first occurrence of an event adjudication committee (EAC)–confirmed three-component MACE: cardiovascular death, non-fatal myocardial infarction, or non-fatal stroke.

Confirmed hypoglycemic episodes consisted of episodes of severe hypoglycemia as well as minor hypoglycemic episodes with a confirmed plasma glucose value of < 3.1 mmol/L. Hypoglycemic episodes were defined as nocturnal if the time of onset was between 00:01 a.m. and 05:59 a.m., inclusive. Severe hypoglycemia was defined as an episode requiring the assistance of another person to actively administer carbohydrate or glucagon, or take other resuscitative actions.

Blood samples for A1C were analyzed using a Bio-Rad high performance liquid chromatography method at a central laboratory. A1C samples were collected at multiple visits in the main trial and extensions (where applicable). The assay method used was a National Glycohemoglobin Standardization Program–certified method. Blood samples for FPG were analyzed using a Roche enzymatic method at a central laboratory. FPG samples were collected at multiple visits in the main trials and extensions (where applicable). The patients were to attend these visits in a fasting state.

Within-patient variability as measured by coefficient of variation was to be derived from pre-breakfast plasma glucose values after 26 weeks of treatment. Logarithm-transformed self-measured plasma glucose values were to be analyzed as repeated measures in a linear mixed model with treatment, antidiabetes treatment at screening, sex, and region as fixed factors, age as a covariate, and patient as random factor. The model was to assume independent within-patient and between-patient errors with variances depending on treatment. Within-patient variability as measured by coefficient of variation for a treatment could be calculated from the corresponding residual variance. The CI for the coefficient of variation ratio between treatments was to be calculated using the delta method.
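The step from a residual variance to a coefficient of variation can be illustrated with a minimal sketch. It assumes the values are log-normally distributed and that the variance is on the natural-log scale, in which case the coefficient of variation is sqrt(exp(variance) − 1). The function name and the example variances are hypothetical and are not taken from the trials.

```python
import math


def cv_from_log_variance(residual_variance: float) -> float:
    """Coefficient of variation (as a fraction) implied by a residual
    variance estimated on the natural-log scale, assuming log-normal data."""
    if residual_variance < 0:
        raise ValueError("variance must be non-negative")
    return math.sqrt(math.exp(residual_variance) - 1.0)


# Hypothetical within-patient residual variances for two treatments.
cv_treatment_a = cv_from_log_variance(0.04)
cv_treatment_b = cv_from_log_variance(0.05)

# Ratio of coefficients of variation between treatments (A relative to B).
cv_ratio = cv_treatment_a / cv_treatment_b
```

The sketch covers only the point estimate; the confidence interval for the ratio, which the report says was obtained with the delta method, is not reproduced here.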

Changes in patients’ health-related quality of life (HRQoL) and treatment-related impacts of minor hypoglycemic episodes on patients’ daily function and well-being were evaluated using the Short Form (36) Health Survey, version 2.0 (SF-36v2) and Treatment-Related Impact Measure — Hypoglycemic Events (TRIM-HYPO) questionnaires, respectively. Responses for the SF-36v2 were measured on standardized scales from 1998 based on the US general population, with a mean of 50 and standard deviation of 10. Responses for the TRIM-HYPO were standardized to a scale of 0 to 100. In the SF-36v2 questionnaire, higher scores indicate a better HRQoL. In the TRIM-HYPO questionnaire, lower scores indicate better daily function and well-being for the patient.
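Norm-based scoring of this kind can be sketched as follows, under the assumption that a scale score is first converted to a z-score against the reference population and then rescaled to a mean of 50 and a standard deviation of 10. The function name and the reference mean and standard deviation in the example are hypothetical placeholders, not the published 1998 US norms.

```python
def norm_based_score(scale_score: float, population_mean: float,
                     population_sd: float) -> float:
    """Rescale a 0-to-100 scale score so that the reference population
    has a mean of 50 and a standard deviation of 10."""
    if population_sd <= 0:
        raise ValueError("population_sd must be positive")
    z = (scale_score - population_mean) / population_sd
    return 50.0 + 10.0 * z


# Hypothetical reference values, for illustration only.
score_at_norm = norm_based_score(70.0, population_mean=70.0, population_sd=20.0)
score_one_sd_above = norm_based_score(90.0, population_mean=70.0, population_sd=20.0)
```

On this scale, a score above 50 is better than the reference population average and a score below 50 is worse.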

SF-36 is a generic health assessment questionnaire that has been used in clinical trials to study the impact of chronic disease on HRQoL. SF-36 consists of 36 items representing eight dimensions: physical functioning, role physical, bodily pain, general health, vitality, social functioning, role emotional, and mental health. Item response options are presented on a 3-point to 6-point Likert-like scale. Each item is scored on a 0 to 100 range and item scores are averaged together to create the eight domain scores. SF-36 also provides two component summaries, the physical component summary and the mental component summary, which are created by aggregating the eight domains according to a scoring algorithm. On any of the scales, an increase in score indicates improvement in health status. Based on clinical anchor data, the SF-36 User’s Manual proposed the following minimal important differences for general use of the SF-36v2: a change of 2 to 4 points in each domain or 2 to 3 points in each component summary.²⁶ No minimal clinically important difference (MCID) in patients with T1DM or T2DM was found in the literature.

Treatment-Related Impact Measure — Diabetes (TRIM-D) is a diabetes-specific questionnaire developed to assess the full impact of diabetes treatment on patients’ quality of life. This patient-reported outcome measure consists of 28 items encompassed in five domains: treatment burden (six items), daily life (five items), diabetes management (five items), psychological health (eight items), and compliance (four items). Response options are presented on a 5-point Likert-like scale. An increase in score indicates an improvement in health state. Domains can be scored individually or the measure can be scored as a total of these domains.²⁷,²⁸ No MCID has been determined for the TRIM-D.

TRIM-HYPO is a patient-reported outcome measure developed to measure the impact of non-severe hypoglycemic events on patients’ HRQoL arising from the use of insulin to treat both forms of diabetes (T1DM and T2DM). TRIM-HYPO is a self-reported questionnaire comprising 33 Likert-like scale items (scored 1 to 5) in five domains: daily functioning, emotional well-being, diabetes management, work productivity, and sleep disruption. Domains are scored individually. A total score is also calculated, using three of the five domains (daily functioning, emotional well-being, diabetes management), as work productivity and sleep disruption do not apply to all patients. Lower scores on the TRIM-HYPO indicate a better health state. Raw scores are obtained by aggregating scale items into their respective domain scales. A weighted score is then generated, based on the number of non-severe hypoglycemic occurrences in the past 30 days: the higher the number, the greater the impact on the weighted score. This weighting helps account for the difference in HRQoL of patients experiencing few events versus those experiencing many hypoglycemic events. A standard algorithm method transforms the weighted scores into a 0 to 100 score.²⁹ No MCID has been determined for the TRIM-HYPO.

Statistical Analysis

All included studies carried out power calculations to determine sample size, and all studies randomized sufficient numbers of patients to ensure adequate power for assessing the primary end point.

DEVOTE

The primary end point (time from randomization to first occurrence of an EAC-confirmed, three-component MACE) was presented descriptively in a Kaplan–Meier plot and analyzed using a Cox proportional hazard regression with treatment (IDeg and IGlar) as a factor. The hazard ratio and the corresponding two-sided 95% CI were estimated. Noninferiority of IDeg to IGlar was considered confirmed if the upper limit of the two-sided 95% CI for the hazard ratio was below 1.3 or, equivalently, if the P value for the one-sided test of the null hypothesis (hazard ratio ≥ 1.3) against the alternative hypothesis (hazard ratio < 1.3) was < 2.5%. This is the margin recommended by the FDA for evaluating the cardiovascular safety of new antihyperglycemic drugs.³⁰ Results were presented for the full analysis set (FAS) population as well as for the per-protocol population.
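The equivalence between the confidence-interval criterion and the one-sided P value criterion can be illustrated with a minimal sketch. It assumes a Wald-type interval built from an estimated log hazard ratio and its standard error. The function name and the input values are hypothetical and are not DEVOTE results.

```python
import math
from statistics import NormalDist


def noninferiority_check(log_hr: float, se: float, margin: float = 1.3,
                         alpha_two_sided: float = 0.05) -> dict:
    """Wald-type noninferiority check for a hazard ratio against a margin."""
    z = NormalDist().inv_cdf(1 - alpha_two_sided / 2)
    lower = math.exp(log_hr - z * se)
    upper = math.exp(log_hr + z * se)
    # One-sided test of H0: HR >= margin against H1: HR < margin.
    p_one_sided = NormalDist().cdf((log_hr - math.log(margin)) / se)
    return {
        "hazard_ratio": math.exp(log_hr),
        "ci": (lower, upper),
        "p_one_sided": p_one_sided,
        "noninferior": upper < margin,
    }


# Hypothetical estimates, for illustration only.
passing = noninferiority_check(log_hr=math.log(0.9), se=0.08)
failing = noninferiority_check(log_hr=math.log(1.2), se=0.08)
```

With a Wald interval, the upper limit falls below the margin exactly when the one-sided P value is below half of the two-sided alpha, which is why the two criteria give the same decision.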

Any EAC-confirmed MACE occurring after a patient’s first EAC-confirmed MACE did not contribute to the analysis (i.e., time to first event only). Where an EAC-confirmed cardiovascular death was linked by the EAC to an earlier myocardial infarction or stroke, the patient contributed to the analysis with time to the cardiovascular death. If a patient did not experience any EAC-confirmed MACE, the time was censored at the patient’s individual end-of-trial date.

Patients were allowed to go on and off randomized treatment during the trial (resulting in “on-treatment” and “off-treatment” periods). Sensitivity analyses were made using the same Cox regression model as the primary analysis, but including only EAC-confirmed MACEs occurring during an on-treatment period.

Four sensitivity analyses were performed, covering two types of censoring mechanisms: strict censoring (censor at time of first EAC-confirmed MACE if occurring during an off-treatment period) and censoring where the first EAC-confirmed MACE occurring during an off-treatment period was ignored. These censoring mechanisms were applied to two types of on-treatment definition:

on-treatment: EAC-confirmed MACEs occurring on randomized treatment
on-treatment + 30 days: EAC-confirmed MACEs occurring on randomized treatment plus up to 30 days of a subsequent off-treatment period.

Provided that noninferiority for the primary end point was confirmed, the number of EAC-confirmed severe hypoglycemic episodes was analyzed using a negative binomial regression model with log-link function and the logarithm of the observation time as offset. The model included treatment (IDeg versus IGlar) as a fixed factor, and was fitted using the FAS. Superiority was considered confirmed if the upper limit of the two-sided 95% CI for the rate ratio was below 1.0 or, equivalently, if the P value for the one-sided test of the null hypothesis (rate ratio ≥ 1.0) against the alternative hypothesis (rate ratio < 1.0) was less than 2.5%.
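A model of this form can be written out as follows. The notation is an assumption made here for illustration and does not come from the report: Y_i is the number of episodes for patient i, t_i is that patient's observation time, and x_i is 1 for IDeg and 0 for IGlar.

```latex
\log \mathbb{E}[Y_i] = \log t_i + \beta_0 + \beta_1 x_i,
\qquad
\text{rate ratio (IDeg vs. IGlar)} = e^{\beta_1}
```

Because the offset enters with a fixed coefficient of 1, the model describes episodes per unit of observation time, and the exponentiated treatment coefficient is the rate ratio to which the superiority criterion is applied.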

Several subgroup analyses were reported for the primary outcome, of which one was relevant to our protocol: cardiovascular risk group (patients with established cardiovascular disease or chronic kidney disease versus patients with risk factors for cardiovascular disease).

Multiplicity

The primary and secondary confirmatory end points were tested in a predefined hierarchical order to control the overall type I error. In this testing sequence, it was necessary to fulfill the test criteria (i.e., to reject the corresponding null hypothesis) in order to go to the next step. If the corresponding null hypothesis was not rejected, the testing was stopped, and no further hypotheses were tested.

Step 1: Noninferiority of IDeg versus IGlar for the primary end point
Step 2: Superiority of IDeg versus IGlar for the number of EAC-confirmed severe hypoglycemic episodes
Step 3: Superiority of IDeg versus IGlar for the occurrence of at least one EAC-confirmed severe hypoglycemic episode in a patient.
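The stopping rule of such a fixed-sequence procedure can be sketched in a few lines. The function name and the example outcomes are hypothetical and do not represent the trial results.

```python
from typing import List, Sequence, Tuple


def fixed_sequence_test(steps: Sequence[Tuple[str, bool]]) -> List[str]:
    """Return the hypotheses confirmed under a fixed-sequence procedure.

    `steps` is an ordered sequence of (label, null_rejected) pairs. Testing
    stops at the first step whose null hypothesis is not rejected, and no
    later step is evaluated.
    """
    confirmed = []
    for label, null_rejected in steps:
        if not null_rejected:
            break
        confirmed.append(label)
    return confirmed


# Hypothetical outcomes: step 2 fails, so step 3 is never tested.
example = fixed_sequence_test([
    ("Step 1: noninferiority, primary end point", True),
    ("Step 2: superiority, number of severe episodes", False),
    ("Step 3: superiority, at least one severe episode", True),
])
```

Because each hypothesis is tested only if all earlier ones were rejected, every test can use the full significance level without inflating the overall type I error.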

Because the statistical tests and results of the interim analysis did not affect the continuation of the trial or the statistical tests and results of the full trial data (as stated in the statistical analysis plan for the interim analysis), there was no need to adjust the alpha level for the statistical tests of the full trial data.

Missing Data

In DEVOTE, a tipping-point analysis was made to address the impact of missing information for patients not completing the trial. Events were added for all patients randomized to IDeg not having an EAC-confirmed MACE in the primary analysis who were non-completers (i.e., 66 patients) and lost to follow-up (i.e., four patients).

As these patients had observation periods of different lengths, the order in which they were added to the analysis could potentially have an impact. Hence, patients were sorted based on the duration of their observation times using the following approaches:

forward imputation of events, in which events were imputed for patients with the shortest observation period first and the longest observation last
backward imputation, where events were imputed for patients with the longest observation period first and the shortest observation last
imputation using median time to event in reference group for patients with non-informative censoring and observation time less than median time in reference.

Finally, the tipping point was established by adding first EAC-confirmed MACEs to the IDeg group until the tipping point (i.e., upper limit of the one-sided 95% CI for hazard ratio > 1.3) was reached. Each added EAC-confirmed MACE was assumed to have an onset date on the day following the patient’s end-of-trial date.
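The idea of a tipping-point search can be illustrated with a deliberately simplified sketch. It does not refit a Cox model as the trial analysis did. Instead it approximates the hazard ratio by the ratio of crude event rates, uses sqrt(1/e1 + 1/e0) as an approximate standard error of the log ratio, and holds person-time fixed, so the ordering of patients by observation time plays no role. The function names, the z value, and all counts are hypothetical.

```python
import math
from typing import Optional


def upper_limit(events_ideg: int, events_iglar: int, time_ideg: float,
                time_iglar: float, z: float = 1.96) -> float:
    """Approximate upper confidence limit for the crude rate ratio."""
    rate_ratio = (events_ideg / time_ideg) / (events_iglar / time_iglar)
    se_log = math.sqrt(1 / events_ideg + 1 / events_iglar)
    return rate_ratio * math.exp(z * se_log)


def tipping_point(events_ideg: int, events_iglar: int, time_ideg: float,
                  time_iglar: float, max_added: int,
                  margin: float = 1.3) -> Optional[int]:
    """Smallest number of events added to the IDeg group at which the
    upper limit exceeds the margin, or None if it is never exceeded."""
    for added in range(max_added + 1):
        limit = upper_limit(events_ideg + added, events_iglar,
                            time_ideg, time_iglar)
        if limit > margin:
            return added
    return None


# Hypothetical event counts and person-years, for illustration only.
tp = tipping_point(300, 330, 7000.0, 7000.0, max_added=100)
```

If the number returned is larger than the number of patients with missing follow-up, the noninferiority conclusion cannot be overturned by any assumption about those patients within this simplified approximation.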

SWITCH (Studies 3995 and 3998)

Analyses of all end points were based on the FAS. Efficacy end points and patient-reported outcomes were summarized using the FAS.

Multiplicity

In both SWITCH studies, before testing the primary end point, the secondary supportive efficacy end point (“Change from baseline in A1C after 32 weeks of treatment”) was tested for noninferiority as a prerequisite for testing the primary end point. The analysis was made for each treatment period separately. Analysis was based on a mixed model for repeated measurement; treatment, sex, region, pre-trial insulin treatment regimen, visit, and dosing time were fixed effects, and age and baseline A1C were covariates.

Noninferiority was considered confirmed if the upper bound of the two-sided 95% CI for the treatment difference in A1C (IDeg once daily + insulin aspart [IAsp] minus IGlar once daily + IAsp) was equal to or below 0.40% or if the P value for the one-sided test of the null hypothesis (treatment difference > 0.40%) against the alternative hypothesis (treatment difference ≤ 0.40%) was less than 2.5%.

Upon confirmation of noninferiority for both treatment periods, the primary end point was tested for noninferiority in SWITCH-1 and for superiority in SWITCH-2. The number of treatment-emergent severe or blood glucose–confirmed symptomatic hypoglycemic episodes during the maintenance period was analyzed using a Poisson model with patient as a random effect; treatment, period, sequence, and dosing time as fixed effects; and time exposure to trial drug in each counting period for hypoglycemic episodes as an offset.

Noninferiority was considered confirmed if the upper bound of the two-sided 95% CI for the rate ratio (IDeg/IGlar) was ≤ 1.10 or if the P value for the one-sided test of the null hypothesis (rate ratio > 1.10) against the alternative hypothesis (rate ratio ≤ 1.10) was less than 2.5%, where rate ratio is the estimated rate ratio of IDeg to IGlar. If noninferiority was confirmed, the superiority of IDeg to IGlar was investigated outside of the test hierarchy. Superiority was considered confirmed if the upper bound of the two-sided 95% CI was < 1.00.

Two confirmatory secondary end points were tested provided that superiority was confirmed for the primary end point. The confirmatory secondary end points are given below together with the direction of the test.

The following safety end points were assessed in the maintenance period (i.e., after 16 weeks of treatment) and in each treatment period (weeks 16 to 32 and weeks 48 to 64):

number of treatment-emergent severe or blood glucose–confirmed symptomatic nocturnal hypoglycemic episodes
proportion of patients with one or more severe hypoglycemic episodes.

The number of treatment-emergent severe or blood glucose–confirmed symptomatic nocturnal hypoglycemic episodes during the maintenance period was tested using the same model and sensitivity analyses as for the primary end point. In SWITCH-1, this was a noninferiority analysis, as for the primary outcome, and in SWITCH-2, this was a superiority analysis, as for the primary outcome.

The proportion of patients with one or more severe hypoglycemic episodes in the maintenance period was tested for superiority in both SWITCH studies using McNemar’s test, in which the proportion of patients with severe hypoglycemic episodes treated with IDeg was tested against the proportion of patients with severe hypoglycemic episodes treated with IGlar and not treated with IDeg.
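In a crossover design, McNemar's test uses only the discordant patients: those with an episode under one treatment but not the other. The following minimal sketch computes an exact two-sided P value from those two counts. The two-sided form is a choice made here for illustration, and the function name and counts are hypothetical.

```python
from math import comb


def mcnemar_exact(only_on_ideg: int, only_on_iglar: int) -> float:
    """Exact two-sided McNemar P value from discordant-pair counts.

    Under the null hypothesis, each discordant patient is equally likely
    to have had the episode on either treatment, so the smaller count
    follows a Binomial(n, 0.5) distribution.
    """
    n = only_on_ideg + only_on_iglar
    if n == 0:
        return 1.0
    k = min(only_on_ideg, only_on_iglar)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts, for illustration only.
p_balanced = mcnemar_exact(5, 5)
p_extreme = mcnemar_exact(0, 10)
```

Patients who had episodes under both treatments, or under neither, do not contribute to the test statistic.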

Missing Data

Patients who withdrew or dropped out of the trial were explored with the purpose of investigating whether, in particular, the population that dropped out before the first maintenance period was different from the population exposed in the first maintenance period, and whether there were any differences in dropout between the two treatments. This analysis was added to the statistical analysis plan.

In the primary analysis, patients who were not exposed in the second maintenance period contributed to the estimation of the treatment difference. This implies that these patients were assumed to behave like patients who were exposed in both maintenance periods; that is, a “missing completely at random” assumption. To investigate how this assumption influenced the primary results, a sensitivity analysis was added that included only patients who were exposed in both maintenance periods. This analysis follows the randomization principle in that the same patients were analyzed on both treatments. The treatment estimate from this analysis is an unbiased estimate in the subset of patients who were exposed to the maintenance period for both treatments, under the assumption that missing data for patients who drop out in the second maintenance period are missing at random. Since data from patients who were exposed only in the first maintenance period were excluded, the pragmatic effectiveness principle is violated.

BEGIN Trials (Studies 3770 [Plus Extension], 3585 [Plus Extension], 3583 [Plus Extension], 3579, 3580, 3672, 3586, 3587, 3944, 3668, 3943, and 3582 [Plus Extension])

A1C was analyzed centrally using a National Glycohemoglobin Standardization Program–certified method. The primary objective in all the therapeutic confirmatory trials was to confirm the efficacy of IDeg with respect to glycemic control as measured by change in A1C from baseline to end of trial between IDeg and an active comparator. The primary end point was analyzed using an analysis of variance method with treatment, antidiabetes therapy at screening, sex, and region as fixed factors, and age and baseline A1C as covariates. In the three-arm trials, the primary analysis was IDeg-Flex versus the comparator IGlar. It is not clear how multiplicity was adjusted for when the IDeg-Flex group was compared with the IDeg group.

All efficacy analyses, as well as analyses of hypoglycemia and body weight, were based on the FAS and followed the intention-to-treat principle, with patients contributing to the evaluation “as randomized.” Unless otherwise specified, missing values (including intermittent missing values) were imputed using the last observation carried forward (LOCF) method, as recommended for its transparency in the FDA guidance. Both baseline and post-baseline values were used for LOCF, in line with the intention-to-treat principle. As adherence to the intention-to-treat principle could bias the results toward the null (i.e., no difference between treatments), the noninferiority assessments for A1C were also confirmed using a per-protocol analysis set, which included only patients treated for at least 12 weeks with a valid post-baseline A1C assessment. In addition, a post hoc analysis was performed that included only patients who completed the trial.
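The LOCF rule described above can be sketched for a single patient as follows. This is a minimal illustration, not the manufacturer's implementation: the function name and the A1C values are hypothetical, and missing visits are represented as None.

```python
from typing import List, Optional, Sequence


def locf(baseline: float, visits: Sequence[Optional[float]]) -> List[float]:
    """Impute missing visit values by carrying the last observation forward.

    The baseline value is carried forward when no post-baseline value has
    been observed yet, so both baseline and post-baseline values are used.
    Intermittent gaps are filled the same way as trailing gaps.
    """
    filled = []
    last_observed = baseline
    for value in visits:
        if value is not None:
            last_observed = value
        filled.append(last_observed)
    return filled


# Hypothetical A1C series (%): one intermittent gap, then dropout.
imputed = locf(8.2, [7.9, None, 7.5, None, None])
```

Under this rule, a patient who drops out before any post-baseline assessment contributes a change from baseline of zero.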

With the exception of Studies 3580 and 3944, all trials were noninferiority trials, and efficacy was considered confirmed if the upper bound of the two-sided 95% CI for the estimated treatment difference for A1C (IDeg versus comparator) was ≤ 0.4%. This limit — which, according to the manufacturer, is in agreement with the FDA guidance on diabetes — has been used in previous submissions for other insulin products (NovoRapid/NovoLog, NovoMix, and Levemir). Study 3580 tested the superiority of IDeg to sitagliptin, and Study 3944 tested the superiority of IDeg to placebo.
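The noninferiority criterion reduces to comparing the upper confidence limit with the 0.4% margin. The sketch below assumes a normal-approximation interval built from an estimate and its standard error; the trials derived their intervals from the analysis of variance model, so this is a simplification, and the function name and input values are hypothetical.

```python
from statistics import NormalDist
from typing import Tuple


def confirms_noninferiority(
    estimate: float,
    std_error: float,
    margin: float = 0.4,
    confidence: float = 0.95,
) -> Tuple[bool, float]:
    """Check whether the upper bound of a two-sided CI is within the margin.

    estimate: estimated treatment difference in A1C (IDeg minus comparator),
              in percentage points
    Returns (noninferiority confirmed, upper confidence limit).
    """
    # Two-sided interval: 0.975 quantile for a 95% confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    upper = estimate + z * std_error
    return upper <= margin, upper


# Hypothetical estimate and standard error, for illustration only.
confirmed, upper_limit = confirms_noninferiority(estimate=0.10, std_error=0.10)
```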

There were five extensions among the BEGIN trials. All three studies in T1DM had extensions, as did T2DM Studies 3579 (insulin-naive) and 3582 (basal-bolus). In all cases, the primary objective of the extensions was to assess safety and tolerability; therefore, they focused on outcomes such as hypoglycemia, adverse events, body weight, and insulin dose, which were all considered to be primary end points. The extensions did not have a primary efficacy variable. The data from the core studies and extensions were to be combined and analyzed as one trial using the original baseline values from core trials. This combined data set was to be the basis for derivation, analyses, and presentation of end points. An additional analysis set, the extension trial set, was reported for efficacy analyses.

Data from study site 109 in Study 3582 were excluded from the analysis due to concerns about the quality of the data after an audit.

Multiplicity

The overall type I error rate was controlled using a hierarchical testing procedure. Hence, if noninferiority was confirmed for the primary end point (superiority in Studies 3580 and 3944), the therapeutic confirmatory trials (except for Studies 3668 and 3770) aimed at demonstrating superiority for a number of confirmatory secondary end points, ordered on the basis of their clinical relevance within the respective treatment regimens and populations investigated. Consequently, superiority could be confirmed only for end points where all previous hypotheses had been confirmed, and the term “superior” is used solely if statistical superiority was confirmed based on hierarchical testing. If superiority was not confirmed, the result was considered to have the same level of evidence as the remaining, non-confirmatory end points.
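The logic of a hierarchical (fixed-sequence) testing procedure can be sketched as below. The significance level of 0.05, the end point labels, and the p values are assumptions for illustration; the report does not give the level or the ordered list of end points for each trial in this passage.

```python
from typing import List, Sequence, Tuple


def hierarchical_test(
    ordered_tests: Sequence[Tuple[str, float]], alpha: float = 0.05
) -> List[str]:
    """Return the end points confirmed under fixed-sequence testing.

    ordered_tests: (end point label, p value) pairs in the prespecified order.
    Each hypothesis is tested at the full alpha, but only if every earlier
    hypothesis was confirmed; testing stops at the first failure.
    """
    confirmed = []
    for label, p_value in ordered_tests:
        if p_value >= alpha:
            break  # later end points are treated as non-confirmatory
        confirmed.append(label)
    return confirmed


# Hypothetical sequence: the fourth end point has a small p value but
# cannot be confirmed because the third end point failed.
result = hierarchical_test(
    [
        ("primary", 0.010),
        ("secondary 1", 0.030),
        ("secondary 2", 0.200),
        ("secondary 3", 0.001),
    ]
)
```

Because testing stops at the first unconfirmed hypothesis, each test can use the full significance level without inflating the overall type I error rate.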

Missing Data