
Grant AM, Boachie C, Cotton SC, et al. Clinical and economic evaluation of laparoscopic surgery compared with medical management for gastro-oesophageal reflux disease: 5-year follow-up of multicentre randomised trial (the REFLUX trial). Southampton (UK): NIHR Journals Library; 2013 Jun. (Health Technology Assessment, No. 17.22.)

Chapter 4 Comparison of the REFLUX trial with other randomised trials of laparoscopic surgery compared with medical management for gastro-oesophageal reflux disease

Introduction

The REFLUX trial is one of four randomised trials that have compared laparoscopic surgery with medical management of GORD. Although the REFLUX trial has similarities to the other trials, its design is the most pragmatic [42] and this is reflected in significant differences in comparison with the other trials. The characteristics of the four trials are summarised in some detail in Appendix 5; key similarities and differences in characteristics between the REFLUX trial and the other trials will be highlighted here. This overview draws heavily on the relevant Cochrane review [43], two of whose authors are authors of this report, but incorporates reports published since the Cochrane review, identified primarily through an updated search using a similar strategy to the one described in the Cochrane review.

The three comparable trials

The Anvari et al. trial [44–46] is a publicly funded single-centre trial conducted in Canada, led by upper gastrointestinal surgeons. It is the smallest of the four trials (104 randomised). The two intervention policies were standardised and the surgery was undertaken by only four surgeons (Table 23). Reflecting this, nearly all participants – unlike in the REFLUX trial – were managed in the way allocated. Like the REFLUX trial, its primary outcome was a GORD-related QoL instrument (the GERSS or Gastro-Esophageal Reflux Symptom Score), and HRQoL was measured with the same instruments as in REFLUX (SF-36 and EQ-5D). The first report described the trial up to 12 months after surgery [44], and recent papers have reported 3-year results [45] and an economic evaluation [46]. At 3 years, participants in the medical group were offered surgery and a large proportion (42%) accepted; hence, although further follow-up is reported to be ongoing, it will be of limited usefulness in comparing laparoscopic surgery with medical management.

The LOng-Term Usage of esomeprazole versus Surgery for treatment of chronic GERD (LOTUS) trial [47–50] is the largest of the four trials (554 randomised). The study was funded by a pharmaceutical company, AstraZeneca, and the reports all include authors based in the company. The trial involved 39 centres in 11 European countries and was led by an upper gastrointestinal surgeon. The trial is described as ‘not designed as a superiority or equivalence trial but, rather, was an exploratory study to estimate the efficacy of laparoscopic anti-reflux surgery and PPI treatment in PPI responders’. Unlike in the REFLUX trial, all participants had shown response to PPI treatment in a run-in phase, and both clinical management policies were strictly standardised (see Table 23).

The method by which the total fundoplication approach was standardised has been described in detail [50]. In the medically managed group, the only PPI used was esomeprazole, initially at the standard dose of 20 mg. Both the surgical and medically treated patients were followed up by the investigators at 6-monthly intervals and symptoms were assessed using the Gastrointestinal Symptoms Rating Scale (GSRS) questionnaire. In the medically treated group, esomeprazole could be increased to 40 mg once a day and then to 20 mg twice a day if symptom control was insufficient. Another key difference from the REFLUX trial was that the primary outcome measure was ‘treatment failure’. A single definition of treatment failure could not be used for both trial groups; rather, it was defined separately for each group (including, in the medical group, need for escalation of medication and, in the surgical group, need for regular medication). The concern is that the thresholds for these may not reflect similar levels of GORD. A GORD-specific QoL instrument (Quality of Life in Reflux and Dyspepsia or QOLRAD [54]) was among the secondary outcomes but was given relatively little emphasis in the reporting of the trial. No HRQoL instruments were used and there was no economic evaluation. Although the main analysis was said to be carried out on an ITT basis, it seems that the 40 people allocated surgery who did not receive it were excluded from analyses. Results were first reported after 3 years' follow-up [47] and recently 5-year data have been published [48].

The Mahon et al. trial [51–53] was a two-centre UK trial led by and involving two upper gastrointestinal surgeons. It is not clear how the main trial was funded but supplementary funds were provided by Janssen Pharmaceuticals ‘for physiological studies’ and by Ethicon Endo-Surgery for the economic analysis [52]. In total, 217 people were randomised; the sequence was ‘computerised’ but the randomisation process and extent of concealment were not described. The two surgeons used a similar Nissen fundoplication method (see Table 23) and there was the option of four different PPI regimens depending on what PPI a participant had been taking prior to the trial. A range of outcome measures were reported and these included a gastrointestinal symptom score (GSRS) and a HRQoL measure, the Psychological General Well-Being Index (PGWI) [55]. All those allocated to medical management were offered surgery after 1 year (and apparently this was made clear to potential participants before trial entry) and the majority [54/94 (57%)] then had surgery. The 1-year follow-up was thus essentially the end of this randomised trial, even though a further follow-up has been reported [53].

Gastro-oesophageal reflux disease-related quality-of-life and symptom scores

Data available for each of the trials that describe GORD QoL or symptom scores at 1, 3 and 5 years' follow-up are summarised in Tables 24–26. Although it is not possible to combine data because different instruments (or subscales of instruments) were used in the trials, the results are consistent.

At 1 year there are eligible data from all four trials (see Table 24). In each case there are highly statistically significant differences all favouring the surgically managed groups. As mentioned above, the randomised element of the Mahon et al. trial [51–53] ended at 1 year but data at 3 years are available for the other three trials (see Table 25). Again, all favour the surgical group and this was statistically significant in both the LOTUS [47–50] and the REFLUX [13] trials.

Only the LOTUS and (now) the REFLUX trial have reported 5-year follow-up. GORD-related QoL scores significantly favour the surgical groups in both trials (see Table 26).

Health-related quality of life

No general HRQoL measure has been reported for the LOTUS trial [47–50]. Data for the other three trials are shown in Tables 27–29. The SF-36 was used in the Anvari et al. trial [44–46] as it was in the REFLUX trial [13]. Unfortunately, it is reported only as the two summary component scores, physical (PCS) and mental (MCS), plus the ‘general health’ domain score. For comparability, in Tables 27 and 28 the same score formats are shown for the REFLUX trial but it should be borne in mind that the eight domain scores shown in Chapter 3 for the REFLUX trial are more informative.

At 1 year, in both trials, the PCS and MCS favour the surgical group, although only the difference in the PCS in the REFLUX trial [13] is statistically significant. Both trials showed marked differences in the ‘general health’ domain score. There was also a statistically significant difference favouring surgery in the Mahon et al. trial [51–53] (based on the PGWI).

Although EQ-5D data were collected in the Anvari et al. trial [44–46], they were not reported in a way that allows interpretation. At baseline, scores were markedly lower in the surgery group [mean 0.68 (SD 0.28) vs 0.76 (SD 0.21)] and the reason for this imbalance is not clear. At 1 year the equivalent results were 0.79 (SD 0.23) compared with 0.81 (SD 0.19), that is, still lower in the surgery group. As shown in Table 27, in the REFLUX trial [13], the mean 1-year EQ-5D score was higher in the surgery group (p = 0.07).

At 3 years, the report of the Anvari et al. trial mentions collection of the SF-36 ‘every 3 months’ but the only data reported are for the ‘general health’ domain score. This, as in the REFLUX trial, significantly favours the surgical group (see Table 28). There is no mention of collection of EQ-5D data in the 3-year follow-up of the Anvari et al. trial. At 5 years, the only data describing generic HRQoL are from the REFLUX trial (as the LOTUS trial has not included a measure) (see Table 29).

Individual symptoms of gastro-oesophageal reflux disease or its management

Data describing individual symptoms are available for all trials, although only dysphagia was reported in the Mahon et al. trial [51–53].

Heartburn

As would be expected from the overall GORD-related QoL and symptom scores, all three trials providing data reported less heartburn in their surgical groups. At 1 year in the Anvari et al. trial [44–46], the GERSS heartburn subscore is lower in the surgical group (p < 0.001); in the LOTUS trial [47–50] there is clearly less heartburn in the surgical group but data are presented only graphically; and in the REFLUX trial [13] heartburn rates in the surgical group are around half those in the medical group. At 3 years, Anvari et al. [44–46] report significantly more heartburn-free days in the surgical group (p = 0.008); in the LOTUS trial [47–50], less heartburn in the surgical group is shown graphically and the p-value is reported as < 0.001; and in the REFLUX trial [13] 51% of the randomised surgical group compared with 75% of the randomised medical management group report any heartburn (see Table 15). At 5 years, data are available only from the LOTUS and REFLUX trials. In LOTUS [47–50], 8% in the surgery group compared with 16% in the medical group are reported to have heartburn, ‘although there was no significant difference in the severity of heartburn (p = 0.14)’. In the REFLUX trial [13], 41% in the surgery group compared with 74% in the medical group reported any heartburn (see Table 15).

Regurgitation

Again, as would be expected from the overall GORD-related QoL and symptom scores, all three trials providing data reported less regurgitation in the surgical groups. At 1 year in the Anvari et al. trial [44–46], the GERSS regurgitation subscore is significantly lower in the surgical group (p = 0.002); in the LOTUS trial [47–50], graphical presentation clearly indicates less regurgitation in the surgical group, although no figures are reported; and in the REFLUX trial [13], regurgitation rates in the surgical group are half those in the medical group. At 3 years, information is available only for the LOTUS and REFLUX trials and both report lower rates in the surgical groups. At 5 years in the LOTUS trial, 2% in the surgical group compared with 13% in the medical group (p < 0.001) have regurgitation, and in the REFLUX trial 25% in the surgical group compared with 37% in the medical group report any regurgitation.

Dysphagia

As mentioned in Chapter 1, dysphagia following both open fundoplication and laparoscopic fundoplication has been reported. At 1 year, Anvari et al. [44–46] report a higher GERSS dysphagia subscore in the surgical group but this was not statistically significant (p = 0.8); in the LOTUS trial [47–50] there were more reports of dysphagia in the surgical group but data were presented only graphically; in the Mahon et al. trial [51–53], dysphagia persisting beyond 3 months was reported in 5 out of 104 (4.8%) having surgery; and in the REFLUX trial [13], rates of ‘difficulty swallowing’ were the same in the two randomised groups. At 3 and 5 years, information is available only from the LOTUS and REFLUX trials. In the LOTUS trial there is more dysphagia in the surgical group (p < 0.001) at both time points: at 5 years 11% in the surgical group report dysphagia compared with 5% in the medical group. In the REFLUX trial, one further participant had undergone oesophageal dilatation (see Table 12), but the numbers reporting difficulty swallowing were the same in the two randomised groups (see Table 15, e.g. any difficulty swallowing 24.2% vs 23.9%).

Flatulence

Flatulence has also been reported as more common after both open and laparoscopic fundoplication. Information is available only from the LOTUS [47–50] and REFLUX [13] trials. In the LOTUS trial, flatulence was more common in the surgery group than in the medical management group at 1, 3 and 5 years. At 5 years, the rates are 57% in the surgical group and 40% in the medical group (p < 0.001). In the REFLUX trial, rates of ‘wind from the lower bowel’ are not statistically significantly different between the groups [more than three times per week: 65.0% in the randomised surgical group vs 59.4% in the randomised medical group at 1 year; 57.6% vs 58.5% at 3 years; and 65.3% vs 60.0% at 5 years (see Table 15 for more detail)].

Other symptoms

In the LOTUS trial [47–50], ‘bloating’ was reported more commonly in the surgical group (40% vs 28% at 5 years). In contrast, ‘bloating/trapped wind’ was reported less commonly in the surgical group in the REFLUX trial [13] (at 1 year: 72.1% vs 82.4%). A particular concern following fundoplication is an inability to vomit despite wanting to. In the REFLUX trial we attempted to address this through a question on ‘frequency of wanting to be sick but being physically unable to’ and found no difference between the groups (see Table 15).

Surgical complications

Like all procedures involving surgery under general anaesthesia, laparoscopic fundoplication carries risks. Table 30 summarises intra- and early postoperative complications reported in the four trials.

Conversion to an open procedure

The decision to convert from a laparoscopic to an open approach is usually indicative of difficulties experienced during the procedure. Rates varied from 0% in the Anvari et al. trial [44–46] to 2.4% in the LOTUS trial [47–50] (see Table 30).

Intraoperative complications

In the Mahon et al. [51–53] and REFLUX [13] trials combined, the 10 intraoperative complications reported (overall rate 2.3%) were injuries to the spleen (n = 3), liver (n = 3), pleura (n = 3) and oesophagus (n = 1). In the LOTUS trial [47–50] it was unclear whether intraoperative complications occurred or whether they were incorporated within all postoperative complications; however, the report noted that 29 participants encountered a variety of operative difficulties that were described as ‘trivial’.
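The pooled rate quoted above can be reproduced from the counts in Table 30. The following is a minimal sketch, assuming that the 2.3% was obtained by dividing the summed intraoperative complications by the summed numbers operated on in the Mahon et al. trial and in the REFLUX randomised and preference groups; the variable names are illustrative only.

```python
# Counts taken from Table 30: (intraoperative complications, n having operation).
# Assumption: the pooled rate is total events divided by total operations.
groups = {
    "Mahon": (4, 109),
    "REFLUX randomised": (2, 111),
    "REFLUX preference": (4, 218),
}

events = sum(e for e, _ in groups.values())
operated = sum(n for _, n in groups.values())
pooled_rate_pct = 100 * events / operated

print(f"{events} of {operated} operations: {pooled_rate_pct:.1f}%")
# prints: 10 of 438 operations: 2.3%
```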

Early postoperative complications

In the Anvari et al. trial [44–46], seven (14%) participants had postprandial bloating, two of whom were treated with a single dilatation of the wrap. No details are given of the postoperative complications in the LOTUS trial. In the Mahon et al. trial [51–53] there were three wrap migrations, two respiratory tract infections and one case of a sutured nasogastric tube. In the REFLUX trial [13], one participant in the randomised group and two in the preference group were admitted to a high-dependency unit immediately after the surgical procedure.

Reoperations

By the time of the 3-year follow-up in the Anvari et al. trial [44–46], 4 of 51 (7.8%) participants had undergone a second fundoplication operation. Four (3.7%) in the Mahon et al. trial [51–53] required reoperation within 3 months of their first fundoplication, one of whom had a gastric resection because of necrosis. It is not clear if anyone in the LOTUS trial [47–50] had a reoperation. As shown in Table 11, in the REFLUX trial [13], 5 of the 112 (4.5%) randomised to surgery who actually had a fundoplication had a second reflux-related operation, and this applied to 16 (4.4%) of the total 364 participants in the study who had a laparoscopic fundoplication.

Other late postoperative complications

Dilatation of the wrap was reported for two (3.9%) people in the Anvari et al. trial [44–46] and four (3.7%) in the Mahon et al. trial [51–53]. It is not stated whether or not dilatation occurred in the LOTUS trial [47–50]. In the REFLUX trial [13], two (1.8%) participants in the randomised surgical group (plus two in the preference surgical group – giving an overall rate of 1.1%) had stricture dilatation or food disimpaction (see Table 12). There were three cases (0.8%) of repair of incisional hernia in the REFLUX trial – all in the preference group – but this complication was not mentioned in the other trials' reports. There were no deaths in any of the trials associated with surgical or medical management.

Surgery-related mortality

No perioperative deaths were reported among the 771 people in the four trials who had fundoplication surgery.
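Observing no deaths in a finite series does not by itself establish that the risk is zero. As an illustration that is not part of the trial reports, the sketch below computes a one-sided 95% upper confidence limit for the perioperative death rate when no events are seen among 771 operations, using the exact binomial bound 1 − 0.05^(1/n) and the ‘rule of three’ approximation 3/n. The function names are hypothetical.

```python
def exact_upper_bound(n: int, alpha: float = 0.05) -> float:
    """One-sided upper limit for a proportion when 0 events occur in n cases.

    Solves (1 - p) ** n = alpha for p.
    """
    return 1 - alpha ** (1 / n)


def rule_of_three(n: int) -> float:
    """Common approximation to the 95% upper limit with zero events."""
    return 3 / n


n_operations = 771  # people who had fundoplication across the four trials
print(f"exact: {100 * exact_upper_bound(n_operations):.2f}%")
print(f"rule of three: {100 * rule_of_three(n_operations):.2f}%")
```

Under these assumptions both limits are a little under 0.4%, so the absence of deaths in the trials cannot exclude a low but non-zero perioperative mortality.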

Discussion

Of the four trials, the REFLUX trial is the most pragmatic in design. It involved a large proportion of UK centres where laparoscopic anti-reflux surgery is undertaken and the surgery was undertaken by NHS upper gastrointestinal surgeons within these centres, all of whom had experience of carrying out the procedure. The exact method of fundoplication was left to the discretion of the surgeon, so he or she was comfortable with the approach. After surgery and, in the medically treated patients, after optimisation of their PPI medication, care of the participants was the responsibility of GPs. The principal measure of outcome was a patient-reported disease-specific QoL measure. Unlike the other trials, the REFLUX trial was coordinated from an accredited trials unit, local recruitment was led by gastroenterologist/gastrointestinal surgeon partnerships rather than by gastrointestinal surgeons alone, and the trial was publicly funded through the HTA programme rather than by industry.

In respect of potential benefits of surgery, the four trials appear to be consistent. All show significantly better relief of GORD symptoms for as long as the length of their current follow-up. (Surprisingly, the LOTUS trial report [48] does not draw attention to this but, judged on data describing the QOLRAD reported in an e-table, there are significant differences between the groups in all dimensions of this instrument, favouring surgery.) Data available describing the principal symptoms of GORD (heartburn and regurgitation) show large differences, again favouring surgery. Only limited data are available from generic QoL measures, and much of this is from the REFLUX trial; although differences are less marked than for the GORD-related QoL instruments, they are consistent with benefit from surgery.

The four trials are broadly consistent in respect of intraoperative and early postoperative complications: a small number of operations are converted to an open procedure, a small number of laparoscopic procedures have associated visceral injuries, a small number of people have problems postoperatively and a small number require dilatation of the wrap. The REFLUX trial suggests that 4.5% have reoperations and the other trials are broadly consistent with this. None of the trials had a reported perioperative death. Data from the Finnish Registry [56] suggest a mortality of 0.1%, but this is based on a single case among 1162 people who had laparoscopic fundoplication; furthermore, the registry included all cases of fundoplication and hence went beyond the sorts of patients recruited to the REFLUX trial.

The other trials, particularly the LOTUS trial, show higher rates of dysphagia and flatulence following laparoscopic fundoplication than in the medically managed group. As mentioned above, a small number of participants in the REFLUX trial did have a dilatation procedure, presumably because of difficulty swallowing, but this was not reflected in responses to the REFLUX questionnaire, suggesting that there were only a few isolated cases of dysphagia following surgery in this trial. Similarly, there were no significant differences in flatulence in the REFLUX trial.

Hence, taking all four trials together, it is now possible to give a clear picture of most of the potential benefits and risks of laparoscopic fundoplication, at least up to 5 years. There are, however, differing resource implications of surgery and medical management. In the next chapter we explore whether or not the benefits of surgery in patients with established GORD requiring long-term PPI therapy for reasonable control and suitable for either clinical policy (average age around 45 years) are sufficient to outweigh any differences in costs.

Tables

TABLE 23

Surgical procedure/experience in the four trials^a

| Trial | Surgeon experience | Crural repair | Gastric division | No. of surgeons participating |
|---|---|---|---|---|
| Anvari [44–46] | > 50 procedures performed | Not reported | Short vessels divided | 4 |
| LOTUS [47–50] | > 40 procedures performed and current workload ≥ 20 per annum | Protocol specified posterior repair | Protocol specified division | 40 trained |
| Mahon [51–53] | ‘Experienced’ | Yes, all patients | Short vessels divided | 2 |
| REFLUX [13] | > 50 procedures performed | Surgeon discretion | Surgeon discretion | Not reported |

^a Adapted from Wileman et al. [43]

TABLE 24

Gastro-oesophageal reflux disease-related QoL and symptom scores at 1 year

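| Trial / measure | Surgical n | Surgical mean (SD) | Medical n | Medical mean (SD) | Mean difference (95% CI) | p-value |
|---|---|---|---|---|---|---|
| Anvari [44–46] | | | | | | |
| GERSS | 52 | 8.3 (8.4) | 52 | 13.6 (9.5) | −5.3 (−8.7 to −2.0) | 0.002 |
| LOTUS [47–50] | | | | | | |
| QOLRAD: Vitality | 203 | 6.84 (0.52) | 220 | 6.42 (0.92) | 0.42 (0.28 to 0.56) | < 0.001 |
| QOLRAD: Food and drink | 203 | 6.78 (0.60) | 220 | 6.34 (0.98) | 0.44 (0.28 to 0.60) | < 0.001 |
| QOLRAD: Sleep | 203 | 6.87 (0.49) | 220 | 6.53 (0.76) | 0.34 (0.22 to 0.46) | < 0.001 |
| QOLRAD: Physical/social | 203 | 6.93 (0.36) | 220 | 6.72 (0.52) | 0.21 (0.12 to 0.30) | < 0.001 |
| GSRS: Reflux dimension | 248 | 1.18 (0.44) | 266 | 1.66 (0.88) | −0.48 (−0.60 to −0.36) | < 0.001 |
| Mahon [51–53] | | | | | | |
| GSRS | 80 | 37.0 (5.4) | 86 | 35.0 (7.3) | 2.00 (0.003 to 3.94) | 0.003 |
| REFLUX [13] | | | | | | |
| REFLUX QoL | 178 | 84.6 (17.9) | 179 | 73.4 (23.3) | 14.0 (9.6 to 18.4) | < 0.001 |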
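The mean differences and confidence intervals in Table 24 can be approximated from the reported group sizes, means and SDs. The sketch below is illustrative only: it assumes an unadjusted difference in means with an unpooled standard error and a normal approximation, which may not be the method used in the individual trial reports. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mean_difference_ci(n1, mean1, sd1, n2, mean2, sd2, level=0.95):
    """Unadjusted difference in means (group 1 minus group 2) with a
    normal-approximation confidence interval and unpooled standard error."""
    diff = mean1 - mean2
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, diff - z * se, diff + z * se


# LOTUS, QOLRAD vitality at 1 year (Table 24): surgical vs medical
diff, lower, upper = mean_difference_ci(203, 6.84, 0.52, 220, 6.42, 0.92)
print(f"{diff:.2f} ({lower:.2f} to {upper:.2f})")
# prints: 0.42 (0.28 to 0.56)
```

For this row the sketch agrees with the tabulated values. Where the published difference is not the raw difference in group means (for example, the REFLUX QoL score), this calculation will not reproduce it.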

TABLE 25

Gastro-oesophageal reflux disease-related QoL and symptom scores at 3 years

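| Trial / measure | Surgical n | Surgical mean (SD) | Medical n | Medical mean (SD) | Mean difference (95% CI) | p-value |
|---|---|---|---|---|---|---|
| Anvari [44–46] | | | | | | |
| GERSS | 49 | 6.21 (8.66) | 44 | 9.05 (10.40) | −2.84 (−6.77 to 1.09) | 0.166 |
| LOTUS [47–50] | | | | | | |
| QOLRAD: Vitality | 181 | 6.90 (0.31) | 189 | 6.53 (0.85) | 0.37 (0.24 to 0.50) | < 0.001 |
| QOLRAD: Food and drink | 181 | 6.85 (0.40) | 189 | 6.38 (0.91) | 0.47 (0.33 to 0.61) | < 0.001 |
| QOLRAD: Sleep | 181 | 6.92 (0.33) | 189 | 6.53 (0.82) | 0.39 (0.26 to 0.52) | < 0.001 |
| QOLRAD: Physical/social | 181 | 6.94 (0.25) | 189 | 6.74 (0.58) | 0.20 (0.11 to 0.29) | < 0.001 |
| Mahon [51–53] – trial terminated at 1 year | | | | | | |
| REFLUX [13] | | | | | | |
| REFLUX QoL | 132 | 87.0 (15.0) | 134 | 79.7 (20.1) | 9.0 (4.9 to 13.1) | < 0.001 |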

TABLE 26

Gastro-oesophageal reflux disease-related QoL and symptom scores at 5 years

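| Trial / measure | Surgical n | Surgical mean (SD) | Medical n | Medical mean (SD) | Difference (95% CI) | p-value |
|---|---|---|---|---|---|---|
| Anvari [44–46] – no data available | | | | | | |
| LOTUS [47–50] | | | | | | |
| QOLRAD: Vitality | 160 | 6.86 (0.44) | 179 | 6.49 (0.99) | 0.37 (0.20 to 0.54) | < 0.001 |
| QOLRAD: Food and drink | 160 | 6.80 (0.51) | 179 | 6.47 (0.80) | 0.33 (0.18 to 0.48) | < 0.001 |
| QOLRAD: Sleep | 160 | 6.89 (0.47) | 179 | 6.61 (0.72) | 0.28 (0.15 to 0.41) | < 0.001 |
| QOLRAD: Physical/social | 160 | 6.94 (0.23) | 179 | 6.75 (0.51) | 0.19 (0.10 to 0.28) | < 0.001 |
| Mahon [51–53] – trial terminated at 1 year | | | | | | |
| REFLUX [13] | | | | | | |
| REFLUX QoL | 127 | 86.7 (13.8) | 119 | 80.7 (20.3) | 6.42 (1.61 to 11.23) | 0.009 |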

TABLE 27

Health-related quality of life at 1 year

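| Trial / measure | Surgery n | Surgery mean (SD) | Medical n | Medical mean (SD) | Difference (95% CI) | p-value |
|---|---|---|---|---|---|---|
| Anvari [44–46] | | | | | | |
| SF-36: PCS | 52 | 46.4 (10.9) | 52 | 43.9 (10.3) | 3.15 (−0.94 to 7.23) | 0.13 |
| SF-36: MCS | 52 | 52.7 (10.9) | 52 | 51.5 (9.1) | 0.98 (−2.8 to 4.76) | 0.61 |
| SF-36: General health domain score | 52 | 75.4 (23.2) | 52 | 66.4 (23.6) | 12.3 (3.7 to 20.8) | 0.005 |
| LOTUS [47–50] – not reported | | | | | | |
| Mahon [51–53] | | | | | | |
| PGWB | 79 | 106.2 (16.3) | 86 | 100.4 (18.9) | 5.8 (0.43 to 11.17), adjusted 7.1 (2.5 to 11.7) | |
| REFLUX [13] | | | | | | |
| SF-36: PCS | 150 | 48.0 (10.2) | 161 | 45.1 (9.7) | 3.51 (1.77 to 5.25) | < 0.001 |
| SF-36: MCS | 150 | 46.6 (12.8) | 161 | 45.1 (13.1) | 1.63 (−0.79 to 3.85) | 0.195 |
| SF-36: General health domain score | 178 | 45.2 (11.1) | 179 | 40.7 (11.2) | 4.8 (2.7 to 6.8) | < 0.001 |
| EQ-5D | 178 | 0.75 (0.25) | 179 | 0.71 (0.27) | 0.047 (−0.004 to 0.097) | 0.07 |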

TABLE 28

Health-related quality of life at 3 years

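| Trial / measure | Surgery n | Surgery mean (SD) | Medical n | Medical mean (SD) | Difference (95% CI) | p-value |
|---|---|---|---|---|---|---|
| Anvari [44–46] | | | | | | |
| SF-36: PCS – not reported | | | | | | |
| SF-36: MCS – not reported | | | | | | |
| SF-36: General health domain score | 49 | 78.50 (19.76) | 44 | 71.41 (21.73) | 12.19^a (2.65 to 21.72) | 0.0124 |
| LOTUS [47–50] – not reported | | | | | | |
| Mahon [51–53] – trial terminated at 1 year | | | | | | |
| REFLUX [13] | | | | | | |
| SF-36: PCS | 128 | 47.2 (9.9) | 127 | 46.6 (10.0) | 1.43 (−0.45 to 3.32) | 0.136 |
| SF-36: MCS | 128 | 48.9 (10.6) | 127 | 45.6 (12.6) | 4.05 (1.57 to 6.52) | 0.001 |
| SF-36: General health score | 132 | 45.3 (10.0) | 134 | 42.4 (11.8) | 3.69 (1.50 to 5.87) | 0.001 |
| EQ-5D | 132 | 0.803 (0.231) | 134 | 0.747 (0.262) | 0.070 (0.0015 to 0.126) | 0.013 |

^a Presumably adjusted.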

TABLE 29

Health-related quality of life at 5 years

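| Trial / measure | Surgery n | Surgery mean (SD) | Medical n | Medical mean (SD) | Difference (95% CI) | p-value |
|---|---|---|---|---|---|---|
| Anvari [44–46] – no data available | | | | | | |
| LOTUS [47–50] – not reported | | | | | | |
| Mahon [51–53] – trial terminated at 1 year | | | | | | |
| REFLUX [13] | | | | | | |
| SF-36: PCS | 113 | 46.1 (9.9) | 109 | 46.1 (10.5) | 1.47 (−0.84 to 3.79) | 0.211 |
| SF-36: MCS | 113 | 47.8 (11.7) | 109 | 47.9 (11.7) | 1.27 (−1.36 to 3.90) | 0.343 |
| SF-36: General health domain score | 117 | 44.1 (10.3) | 111 | 43.2 (11.5) | 2.76 (0.21 to 5.31) | 0.034 |
| EQ-5D | 127 | 0.774 (0.259) | 119 | 0.761 (0.282) | 0.047 (−0.013 to 0.108) | 0.126 |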

TABLE 30

Intra- and early postoperative events in the four trials^a

| Trial | n having operation | Conversion, n (%) | Intraoperative complications, n (%) | Postoperative complications, n (%) |
|---|---|---|---|---|
| Anvari [44–46] | 51 | 0 (0.0) | 0 (0.0) | 7 (13.7) |
| LOTUS [47–50] | 248 | 6 (2.4) | Unclear | 7 (2.8) |
| Mahon [51–53] | 109 | 1 (0.9) | 4 (3.7) | 6 (5.5) |
| REFLUX [13]: Randomised | 111 | 2 (1.8) | 2 (1.8) | 1 (0.9) |
| REFLUX [13]: Preference | 218 | 0 (0.0) | 4 (1.8) | 2 (0.9) |

^a Adapted from Wileman et al. [43]

Copyright © Queen's Printer and Controller of HMSO 2013. This work was produced by Grant et al. under the terms of a commissioning contract issued by the Secretary of State for Health. This issue may be freely reproduced for the purposes of private research and study and extracts (or indeed, the full report) may be included in professional journals provided that suitable acknowledgement is made and the reproduction is not associated with any form of advertising. Applications for commercial reproduction should be addressed to: NIHR Journals Library, National Institute for Health Research, Evaluation, Trials and Studies Coordinating Centre, Alpha House, University of Southampton Science Park, Southampton SO16 7NS, UK.

Included under terms of UK Non-commercial Government License.

Bookshelf ID: NBK260648