
Gregson BA, Rowan EN, Francis R, et al.; on behalf of the STITCH(TRAUMA) investigators. Surgical Trial In Traumatic intraCerebral Haemorrhage (STITCH): a randomised controlled trial of Early Surgery compared with Initial Conservative Treatment. Southampton (UK): NIHR Journals Library; 2015 Sep. (Health Technology Assessment, No. 19.70.)

Appendix 4 Health economic analysis

Prepared by Dr Dwayne Boyers and Professor Paul McNamee

Introduction

The objective of the economic evaluation is to assess the cost-effectiveness of a strategy of Early Surgery compared with Initial Conservative Treatment in the management of traumatic ICH. This chapter reports the quality of life outcomes, resource use, costs and cost-effectiveness analyses performed alongside the STITCH(TRAUMA) randomised controlled trial, over a 6-month follow-up period.

It should be noted that the original intention of the economic component of the study was to conduct a cost–utility analysis from the UK NHS perspective, using cost and outcome data collected only on UK patients. We anticipated recruiting 150 UK patients into the trial to achieve this goal. However, UK recruitment achieved only six patients, which was not sufficient to produce a meaningful assessment of costs and outcomes for UK patients. An alternative analysis was therefore conducted, which used data from all participating centres in all countries recruiting into the trial. This approach has the advantage of being more relevant to a wider group of decision-makers. However, such an approach required the collection of additional cost data in non-UK sites, which proved challenging for some centres. Results are presented in terms of all patients recruited as a single analysis, as well as subgroup analyses based on World Bank country income group classifications (low-, lower middle-, upper middle- and high-income countries).

Methods

Resource use and costs

The main analysis focuses on results from an international perspective. However, given that the original objective of the cost-effectiveness analysis was to report from a UK perspective, we begin by reporting UK resource use for the six patients recruited into the trial.

Resource use and costs based on a UK analysis

Descriptive results are presented for the resource use and costs associated with UK participants for information only. Costs are assigned to resource use data based on 2013 Healthcare Resource Group (HRG) Payment by Results data.1 We apply the non-elective cost associated with intracranial procedures for trauma with a diagnosis of intracranial injury (HRG code AA02).39 Costs with (£6231 up to 43 days’ admission + £207 per day thereafter) and without (£4126 up to 18 days’ admission + £207 per day thereafter) complications are presented. As our data are presented in days in hospital, we calculate cost per day as the tariff value divided by the trim-point time. We then take the average of the resulting per-day costs for those with and without complications. This gives an overall cost of £187 per day, applied to resource use over the initial episode of care. Further, data from the patient questionnaire are used to estimate the number and length of hospital readmissions at 6 months’ follow-up. As the exact reason for readmission was not reported, we have assumed that the same HRG codes would also apply to these episodes of care.
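The per-day figure can be checked with a short calculation. The sketch below is illustrative only: it uses the tariff values and trim points quoted above and assumes a simple unweighted mean of the two per-day costs, which is the reading that reproduces the £187 figure. The function and variable names are ours, not part of the original analysis.

```python
# Tariff values and trim points quoted in the text (HRG code AA02).
TARIFF_WITH_CC = 6231.0      # GBP, covers admissions up to 43 days
TRIM_WITH_CC = 43            # days
TARIFF_WITHOUT_CC = 4126.0   # GBP, covers admissions up to 18 days
TRIM_WITHOUT_CC = 18         # days


def cost_per_day(tariff: float, trim_point_days: int) -> float:
    """Spread a tariff evenly over its trim-point period."""
    return tariff / trim_point_days


per_day_with = cost_per_day(TARIFF_WITH_CC, TRIM_WITH_CC)
per_day_without = cost_per_day(TARIFF_WITHOUT_CC, TRIM_WITHOUT_CC)

# Assumption: an unweighted mean of the two per-day costs.
average_per_day = (per_day_with + per_day_without) / 2

print(f"with complications:    {per_day_with:.2f} GBP/day")
print(f"without complications: {per_day_without:.2f} GBP/day")
print(f"average:               {average_per_day:.0f} GBP/day")
```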

Resource use and costs based on international analysis

All data from the six UK trial participants are used in the costing analysis together with additional data collected from the other trial participating countries. Collection of these supplementary data required the development of a site-specific questionnaire, administered to all participating centres in the trial, to collect data on resource use and unit costs of care. The supplementary questionnaire used for the analysis is included as Appendix 5. Not all sites responded to this questionnaire, which generated a substantial amount of missing data. Data were returned for 16 out of 31 sites (52%), representing 115 out of 168 (68%) patients recruited to the study. This required the imputation of cost and resource use data following some plausible assumptions, namely:

  1. Where resource use and cost data were missing for some centres recruiting within a country, we have imputed weighted averages based on the number of patients recruited at centres where data are available.
  2. Where resource use and cost data were missing for all centres within a country, we have pragmatically imputed weighted average data from all countries in the same income group, with country income subgroups determined according to World Bank classifications.
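The two imputation rules can be sketched as follows. This is a minimal illustration rather than the procedure actually run for the trial: the site records, field names and cost values are hypothetical, and we assume that the weights in rule 2 are also patient numbers at reporting centres.

```python
def weighted_average(pairs):
    """Patient-weighted mean of (unit_cost, n_patients) pairs."""
    total_patients = sum(n for _, n in pairs)
    return sum(cost * n for cost, n in pairs) / total_patients


def impute_unit_cost(target, sites):
    """Return a unit cost for `target`, imputing it if the site did not report one."""
    if target["unit_cost"] is not None:
        return target["unit_cost"]
    reporting = [s for s in sites if s["unit_cost"] is not None]
    # Rule 1: use reporting centres in the same country, if there are any.
    same_country = [s for s in reporting if s["country"] == target["country"]]
    # Rule 2: otherwise use reporting centres in the same income group.
    same_group = [s for s in reporting if s["income_group"] == target["income_group"]]
    donors = same_country or same_group
    if not donors:
        raise ValueError("no reporting centre available for imputation")
    return weighted_average([(s["unit_cost"], s["patients"]) for s in donors])


# Hypothetical sites; countries X, Y and Z and all costs are invented.
sites = [
    {"site": "A", "country": "X", "income_group": "lower middle", "patients": 10, "unit_cost": 100.0},
    {"site": "B", "country": "X", "income_group": "lower middle", "patients": 30, "unit_cost": 200.0},
    {"site": "C", "country": "X", "income_group": "lower middle", "patients": 5, "unit_cost": None},
    {"site": "D", "country": "Y", "income_group": "lower middle", "patients": 8, "unit_cost": None},
    {"site": "E", "country": "Z", "income_group": "lower middle", "patients": 10, "unit_cost": 300.0},
]

cost_c = impute_unit_cost(sites[2], sites)  # rule 1: sites A and B
cost_d = impute_unit_cost(sites[3], sites)  # rule 2: sites A, B and E
print(cost_c, cost_d)
```

Site C borrows from the two reporting centres in its own country, whereas site D, whose country returned no data, falls back to the income-group average.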

The impact of these data imputation methods is tested in sensitivity analyses, described in the sensitivity analyses section of this chapter.

The costing analysis is reported on the intention-to-treat principle and undertaken from an international health services perspective. The likely major drivers of costs (i.e. surgery, hospital stay and readmissions) are included. Surgery resource use (including staff time and overheads) and unit costs (e.g. surgeon’s salary and cost of theatre use) were collected using the site-specific questionnaires. Other health-care resource use data were collected, including days in intensive care, high-dependency units and general wards (all sourced from individual case report forms) and hospital readmissions (sourced from the participant 6-month questionnaire). The analysis follows recommendations from Drummond et al.29 and Manca et al.,30 reporting resource use and cost data at the country level. The costing of these hospital resource use data is undertaken in two stages. First, we apply country-specific unit costs for nights in hospital (intensive care unit, high-dependency unit, general ward) to resource use data to generate total costs. National average unit costs were not available for the majority of countries in the trial. Costs were therefore sourced directly from finance departments at specific sites and applied to resource use data to generate estimates of costs. Then, country-specific unit costs were transformed into international dollar costs (2013 values), using the Campbell & Cochrane Economics Methods Group purchasing power parity calculator.31

To account for the highly skewed nature of cost data (i.e. a small proportion of patients incurring very high costs), we use generalised linear model (GLM) regressions, specifying a gamma family and identity link, which best fits the distribution of the cost data. The choice of base-case model for the analysis was made on the basis of the lowest Akaike information criterion score, a method which is recommended as standard best practice.30 Heteroskedasticity-robust standard errors were used for all analyses. The model estimates the impact of treatment group (Early Surgery compared with Initial Conservative Treatment) on costs, adjusting for patient characteristics (age and sex). GLMs were bootstrapped using non-parametric bootstrapping techniques (n = 1000 repetitions) in Stata Version 13 (StataCorp LP, College Station, TX, USA) to generate data for developing cost-effectiveness acceptability curves (CEACs).31

Subgroup analyses of costs

Owing to the differences in organisation of care across countries and the value of money differences across jurisdictions, it is likely that the base-case analysis (a single analysis of all trial participants), while statistically more efficient, may have little relevance to informing decision-making at an individual country level. Results are, thus, also reported for groups of countries based on a measure of their development. For this purpose, all countries within the trial were ranked in ascending order of gross national income (GNI) per capita, according to World Bank classifications. The classification groups are low income, GNI per capita equal to Int$1005 or less (including Nepal); lower middle income, GNI per capita between Int$1006 and Int$3975 (including India); upper middle income, GNI per capita between Int$3976 and Int$12,275 (including China); and high income, GNI per capita of Int$12,276 or more (including most Western European countries). The logic is that countries with similar GNI will deliver broadly comparable levels of care, and reporting in this way improves the relevance of the costing for local policy-makers. It also provides an intuitive grouping of countries in the absence of enough data to conduct a country-specific analysis. This approach has been used successfully in a previous study.40
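The grouping rule amounts to a simple threshold lookup. The sketch below encodes the four bands exactly as stated above; the function name is ours and only boundary values are used as inputs, so no country-level GNI figures are implied.

```python
def income_group(gni_per_capita: float) -> str:
    """Map GNI per capita to the income group bands quoted in the text."""
    if gni_per_capita <= 1005:
        return "low"
    if gni_per_capita <= 3975:
        return "lower middle"
    if gni_per_capita <= 12275:
        return "upper middle"
    return "high"


# Boundary values taken from the thresholds above.
for value in (1005, 1006, 3975, 3976, 12275, 12276):
    print(value, income_group(value))
```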

Quality-adjusted life-years

The EQ-5D-3L generic quality of life instrument41 was administered to all trial participants at 6 months’ follow-up. Responses to the EQ-5D-3L questionnaire were valued using UK general population tariffs32 to generate a utility score for every patient within the trial. We assumed that all patients suitable for randomisation were in an unconscious state and would thus have a baseline utility of –0.402.42 Given a lack of published tariff data across all trial recruiting countries, we have assumed that the UK tariffs offer a reasonable reflection of quality of life scores across all trial participants. This assumption is associated with limitations and assumes preferences for health states valued on the EQ-5D-3L are similar across countries. However, this approach offered the most pragmatic solution and has the advantage of applying tariffs derived from a standard method (i.e. time trade-off) to all quality of life data.

Quality-of-life data derived from the EQ-5D-3L are combined with mortality data from the trial, using the standard assumption that all participants who have died in the trial will have a utility value of 0. QALYs are then calculated on the basis of these assumptions using an area under the curve approach, assuming linear interpolation of utility between time points [baseline (assumed –0.402) and 6 months’ follow-up]. Where the date of death was available, the QALY calculation has been modified to include this additional information. This introduces some asymmetries into the calculation between those who died and those who survived. The impact of assuming a linear interpolation between time points for all patients is tested in the sensitivity analysis.
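A minimal sketch of the area under the curve calculation is given below. It assumes a 182.5-day follow-up and a 365-day year, as used in the worked example in the sensitivity analysis results later in this chapter; the function names and the survivor utility of 0.6 are hypothetical.

```python
BASELINE_UTILITY = -0.402   # assumed utility at randomisation (unconscious)
FOLLOW_UP_DAYS = 182.5      # 6 months
DAYS_PER_YEAR = 365.0


def qaly_survivor(utility_6m: float) -> float:
    """Trapezium between baseline and the 6-month utility score."""
    return (BASELINE_UTILITY + utility_6m) / 2 * (FOLLOW_UP_DAYS / DAYS_PER_YEAR)


def qaly_died_base_case(day_of_death: float) -> float:
    """Base case: utility moves linearly to 0 at death, then stays at 0."""
    return (BASELINE_UTILITY + 0.0) / 2 * (day_of_death / DAYS_PER_YEAR)


def qaly_died_sensitivity() -> float:
    """Sensitivity analysis: ignore date of death, interpolate to 0 at 6 months."""
    return qaly_survivor(0.0)


print(qaly_survivor(0.6))          # hypothetical survivor with utility 0.6
print(qaly_died_base_case(10))     # death 10 days after randomisation
print(qaly_died_sensitivity())
```

Under the base case, a death shortly after randomisation accrues far fewer negative QALYs than the same death treated as a 6-month interpolation, which is the asymmetry discussed in the text.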

Differences in QALY estimates across groups are analysed using ordinary least squares (OLS) linear regression models. Regressions are adjusted for patient characteristics, namely age and sex, and are bootstrapped (1000 repetitions in Stata) to account for the non-normality of QALY data and to generate data for CEACs.43 Heteroskedasticity-robust standard errors are applied to all models. Subgroup analyses are presented for the utility scores and QALYs for each country income group, as outlined in the Subgroup analyses of costs section. This facilitates the production of incremental cost-effectiveness ratios (ICERs) for each country income group.

Cost–utility analysis

The health economic evaluation is a cost–utility analysis, reporting results as incremental cost per QALY gained for Early Surgery compared with Initial Conservative Treatment. The cost per QALY is presented using the ICER, calculated as the coefficient of treatment effect on costs divided by the coefficient of treatment effect on QALYs from the respective linear regression models. Estimates of the ICER should then be compared with the normal decision-making practice in individual countries or regions.
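The ratio itself is straightforward to compute. The sketch below uses the base-case coefficients reported in the Results section of this chapter (incremental cost of Int$1774 and incremental QALYs of 0.019); the function name is ours.

```python
def icer(incremental_cost: float, incremental_qalys: float) -> float:
    """Incremental cost per QALY gained."""
    if incremental_qalys == 0:
        raise ValueError("ICER is undefined when incremental QALYs are zero")
    return incremental_cost / incremental_qalys


# Base-case treatment-effect coefficients reported in the Results section.
base_case_icer = icer(1774.0, 0.019)
print(f"Int${base_case_icer:,.0f} per QALY gained")
```

The ratio is only straightforward to interpret when the intervention is both more costly and more effective; with negative incremental QALYs or costs, the position on the cost-effectiveness plane should be examined instead.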

Costs and QALY differences are presented on the cost-effectiveness plane and CEACs are derived from the net benefit statistic to illustrate the probability of Early Surgery being the most cost-effective option. All statistical analyses were conducted using Stata, and Microsoft Excel® (Microsoft Corporation, Redmond, WA, USA) was used for calculation of CEACs. Owing to the short period of follow-up of only 6 months, no discounting of costs or QALYs was necessary. No extrapolation to a longer-term time horizon was conducted. Owing to the acute nature of the clinical indication, and the likely recovery time after surgery, it is likely that all patients will either have recovered or died during the trial follow-up period, with no substantial additional costs or QALYs to be accrued over a lifetime horizon.

Sensitivity analyses and assumptions

Resource use and costs

A range of sensitivity analyses were undertaken in order to test the impact of assumptions and uncertainty surrounding resource use and cost data. Specifically:

  1. We have tested the impact of variability of resource use and costs at different sites within a country. We conducted an assessment where costs for a given country were estimated on the basis of the highest and lowest cost sites within a country, with those costs applied to all sites in that country. This analysis helps to assess the variability in the organisation of care at different sites within a country and will to some extent address the impact of any differences in the private/public provision of care at different sites.
  2. We have tested the impact of variability of resource use and costs at a regional level, where a region is defined by the World Bank country income classifications outlined above. The aim of this sensitivity analysis is to address the uncertainty across countries within an income group and present a plausible range of costs and ICERs for countries (including those not participating in the trial) at a regional level.

These and other assumptions used for the analysis of cost data are presented in Table 14. The table outlines the assumptions made together with justifications for each assumption and any sensitivity analyses conducted to assess the impact on costs and cost-effectiveness.

Quality of life and quality-adjusted life-years

European Quality of Life-5 Dimensions-3 level follow-up responses and death data were fully recorded for all respondents who entered the trial and therefore QALY data were complete, with no missing data needing to be imputed. Owing to a large number of deaths within the trial, we have conducted sensitivity analyses exploring the impact of alternative methods of extrapolation of the EQ-5D-3L utility data for calculation of QALYs in the trial. The base-case analysis imputes a utility score of ‘0’ from the date of death to 6-month follow-up for those who died. Survivors’ QALYs are calculated on the basis of linear interpolation between time points. This reflects the fact that there may, in theory, be a QALY benefit to dying earlier in the trial, as opposed to remaining in a health state valued worse than death for a longer period of time. However, the use of these data in this way creates an asymmetry of information between those who survived and those who died over the trial follow-up, as we have more precise information for those who died. We, therefore, conducted an alternative analysis, in which all patients accrued QALY gains using the same linear interpolation, regardless of whether they survived or died. Although this method addresses the issues of asymmetry in the information available, it does not make use of all information available to us for QALY calculation. Assumptions surrounding QALY calculations, justifications and associated sensitivity analyses conducted where appropriate are outlined in Table 15.

Analysis models of data and the impact of crossovers within the trial

In addition to the sensitivity analyses carried out on the resource use, unit cost and QALY calculations outlined in Tables 14 and 15, we conduct two further sensitivity analyses investigating the impact on results of (1) using an alternative non-parametric bootstrapped OLS regression model to account for the non-normal distribution of both cost and QALY data and (2) including interaction terms in the base-case analysis model to address the impact of crossovers on incremental costs, incremental QALYs and on cost-effectiveness.

Sampling uncertainty

Non-parametric bootstrapping techniques,44 based on 1000 repetitions of the GLM and OLS regressions, were undertaken to determine the cost and QALY differences, respectively, between Early Surgery and Initial Conservative Treatment. Data from these bootstrapped regressions for cost and QALY differences were used to develop CEACs and to present scatterplots of cost and QALY differences on the cost-effectiveness plane.44 CEACs are calculated using a net benefit approach and indicate the probability of an intervention being cost-effective at various threshold values of willingness to pay (WTP) for a QALY gain. They are especially useful when making decisions on cost-effectiveness on the balance of probabilities, and when incremental costs or effects fail to meet the traditional level of statistical significance (i.e. 95% confidence).
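The net benefit approach can be sketched in a few lines. For each bootstrap replicate, the incremental net monetary benefit is WTP × incremental QALYs minus incremental cost, and the CEAC value at a given WTP is the share of replicates with a positive net benefit. The four replicates below are invented purely to make the arithmetic easy to follow; they are not trial data, and the trial calculation was carried out in Microsoft Excel rather than with this code.

```python
def prob_cost_effective(draws, wtp: float) -> float:
    """Share of bootstrap draws with positive incremental net benefit."""
    favourable = sum(
        1 for d_cost, d_qaly in draws if wtp * d_qaly - d_cost > 0
    )
    return favourable / len(draws)


def ceac(draws, thresholds):
    """CEAC as a list of (WTP threshold, probability cost-effective) pairs."""
    return [(wtp, prob_cost_effective(draws, wtp)) for wtp in thresholds]


# Hypothetical (incremental cost, incremental QALY) bootstrap replicates.
draws = [(100.0, 0.01), (200.0, 0.01), (-50.0, 0.0), (300.0, -0.01)]

curve = ceac(draws, [0, 15_000, 30_000])
for wtp, probability in curve:
    print(wtp, probability)
```

In practice the list of draws would hold the 1000 paired coefficients from the bootstrapped cost and QALY regressions, and the thresholds would span the WTP range of interest.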

Results

Resource use and costs – site-specific costing questionnaire

Full hospital resource use data were available for all patients within the trial (n = 168). These data were supplemented by unit cost information and surgical resource use data collected from the additional site questionnaires. The questionnaires were sent to all 31 recruiting sites in 13 countries worldwide. Data were returned for 16 out of 31 sites (52%), representing 115 out of 168 (68%) patients recruited to the study. Completeness of data from the questionnaires is outlined in Figure 11.

Data from the site-specific questionnaires returned were used to make plausible assumptions about resource use and unit costs in other countries where data were missing. These assumptions and sensitivity analyses undertaken have been outlined in Table 14.

The results of the surgery costing exercise show wide variation both within countries, for example India (Figure 12), and across countries (Figure 13).

Figure 12 shows substantial variation in costs across the two sites which reported data for India. This is because of different public/private mixes of care at these hospitals and illustrates the impact this variability can have on the cost estimates at a country level. The variation between sites in India is the most extreme of all the countries recruiting to the trial; however, it illustrates the need to conduct sensitivity analyses for the imputation of data at all the other centres that did not contribute data to the resource use and costing questionnaire. Figure 13 illustrates the variability in costing across countries.

There is wide variability across the different countries recruiting to the trial. As expected, European countries have much higher treatment costs than lower-income countries. However, even within country income groups, there appears to be substantial variability. The most extreme variation within a country income group is between the UK and the Czech Republic, which again illustrates the importance of sensitivity analyses to assess the impact of this variation on total cost estimates for the final cost-effectiveness calculations.

Resource use and costs: UK analysis

The resource use and costs associated with the six patients recruited from the UK into the study are summarised using descriptive statistics in Table 16 and are presented for information only.

The remainder of this chapter refers to the costing and cost-effectiveness analysis from an international health-care provider perspective, using data from all participants recruited into the study.

Resource use and costs: international analysis

The results of the base-case costing analysis, showing mean [standard deviation (SD)] resource use and mean (SD) costs in international dollars, based on the intention-to-treat principle are reported in Table 17.

For the whole sample, a comparison of raw mean costs shows that Early Surgery was, on average, Int$476 more costly than Initial Conservative Treatment. Using the generalised linear model (GLM) estimates, with adjustment for patient characteristics of age and sex, Early Surgery is Int$1774 more costly [95% confidence interval (CI) –Int$132 to Int$3679]. The difference between groups is not statistically significant at the traditional 5% level, but is significant at the 10% level. This provides weak evidence that Early Surgery is more expensive than Initial Conservative Treatment. This perhaps indicates that patients in the Early Surgery arm, who are more likely to survive, are consequently more likely to incur greater health-care costs through longer-term treatment and rehospitalisations.
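The statement about significance levels can be checked approximately from the reported interval. The sketch below is a back-of-envelope calculation, not the test used in the trial analysis: it assumes the 95% CI is symmetric and based on a normal approximation, recovers the implied standard error and derives a two-sided p-value.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


estimate = 1774.0             # adjusted incremental cost (Int$)
lower, upper = -132.0, 3679.0  # reported 95% CI (Int$)

# Assumption: a symmetric normal-based interval, so the half-width is 1.96 SE.
standard_error = (upper - lower) / (2 * 1.96)
z_score = estimate / standard_error
p_value = 2 * (1 - normal_cdf(abs(z_score)))

print(f"SE = {standard_error:.0f}, z = {z_score:.2f}, p = {p_value:.3f}")
```

Under these assumptions the p-value falls between 0.05 and 0.10, which is consistent with the description of the result as significant at the 10% level but not at the 5% level.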

Costing subgroup analysis

Table 18 presents the cost breakdown of resource use and costs by income subgroups based on World Bank income classifications presented in the Methods section.

Sample sizes were small for country subgroups and results should be interpreted with caution. This is particularly true for low- and high-income subgroups, which recruited 30 participants. Results should therefore be treated as exploratory. Owing to the small sample sizes, there is no evidence of significant differences in costs at the country income subgroup level.

Quality-adjusted life-years

For the purposes of this analysis, we assume that all patients in the trial started with an EQ-5D-3L health state of unconscious, corresponding to a baseline utility value of –0.402, applied to all patients in the trial. Figure 14 details the results of the responses to the individual EQ-5D-3L domains at 6 months’ follow-up. Data are presented on the basis of the percentage of respondents to the questionnaire who reported any problems on any of the EQ-5D-3L domains, broken down by randomised group.

The results show that a greater proportion of respondents in the Early Surgery group experienced at least some problems in each of the five domains of the EQ-5D-3L, when compared with respondents in the Initial Conservative Treatment group. The most notable differences were in the Usual Activities, Pain/Discomfort and Anxiety/Depression domains, in which at least 20% more respondents in the Early Surgery group had at least some problems. In contrast, the proportion of patients who had died over the course of the 6-month follow-up period was twice as high in the Initial Conservative Treatment group as in the Early Surgery group (33% vs. 15% respectively). These data suggest that as more people survive in the Early Surgery group, they continue to have some problems in their quality of life at 6 months’ follow-up.

Table 19 presents the descriptive statistics for the QALY analysis, based on the assumption of all patients commencing with a baseline utility score of –0.402, equivalent to being unconscious. Owing to the non-normal nature of the QALY data, means (SD) are presented together with medians (interquartile range) for information. Differences between groups are estimated using the bootstrapped regressions described in the Methods section.

Based on the regression analysis, incorporating date of death into the QALY calculation and adjusting for patients’ characteristics of age and sex, we find that patients randomised to the Early Surgery group had an average gain of 0.019 QALYs over a 6-month period (95% bootstrapped CI –0.004 to 0.043) when compared with those randomised to Initial Conservative Treatment. This is equivalent to an incremental QALY gain of 3.5 days over a 6-month period. The broad QALY gains are driven primarily by the increased chance of survival in the Early Surgery group.

Subgroup analysis of quality-adjusted life-years data

Again applying UK-specific population weights to the quality of life scores from the EQ-5D-3L questionnaire, Table 20 presents the results by country income subgroup as classified by the World Bank country income subgroups described in the Methods section.

For all income subgroups, utility scores were, on average, higher in the Early Surgery group than in the Initial Conservative Treatment group at 6 months’ follow-up. There were negligible differences in raw mean QALYs for the low- and high-income countries, although sample sizes were very small and the regression outputs should be interpreted with caution. However, lower middle- and upper middle-income countries tended to show average QALY gains for those in the Early Surgery group. The results indicate that, although the QALY gains do not reach statistical significance, patients in these groups are likely to experience QALY gains from Early Surgery. The results are promising for Early Surgery and indicate broad generalisability across the largest recruiting countries. Owing to the very small sample sizes recruited, it is not possible to draw conclusions on QALY outcomes for the low- and high-income countries.

Cost-effectiveness

The results of the base-case cost-effectiveness analysis are presented in Table 21.

For the base-case analysis, the incremental costs for the whole group together were Int$1774, with average QALY gains of 0.019. Therefore, on this basis, the base-case ICER is Int$1774/0.019 = Int$93,368 per QALY gained for Early Surgery when compared with Initial Conservative Treatment.

However, the ICER in itself should be interpreted with caution, given the range of countries which contributed data to the costing process, and also relating to alternative thresholds of cost-effectiveness which may be used by decision-makers in different jurisdictions. This complicates interpretation of the ICER. It may thus be more informative to examine the incremental costs and QALYs for individual subgroups based on their country income classification. These data are presented in Table 22. Owing to uncertainty in incremental costs and effects, we have not presented ICERs in this table. Decision-makers are instead referred to the CEAC presented in Figure 17, which illustrates the uncertainty in cost-effectiveness at the subgroup level.

As expected, there are substantial differences in costs across the subgroups. It appears that incremental costs of Early Surgery versus Initial Conservative Treatment could be decreasing for more developed countries. However, the results could equally be skewed by outliers in the data, which would have a large impact, given the small numbers. Despite substantial uncertainty in the presented results at a subgroup level, the data, on balance, suggest that favourable results could be achieved for Early Surgery in all subgroups with the exception of low-income countries.

Sensitivity analyses

A range of sensitivity analyses were undertaken focusing on the assumptions used to impute data from the site-specific questionnaire, QALY calculation methods and methods of analysis of the data. The impact of these analyses on cost-effectiveness results is outlined in Table 23. Sensitivity analyses on cost and QALY calculations are based on the assumptions outlined in the Methods section and cross-referenced in Table 23. Further analyses explore the impact of the model of analysis of cost data and the impact of crossovers within the study. Two ICERs are produced for each sensitivity analysis, the first based on a comparison of raw mean data across groups and the second based on the incremental costs and QALYs calculated using the regression analyses described in the Methods section.

Owing to the small sample size in two of the income subgroups, sensitivity analyses are performed only on the whole sample of patients recruited into the trial.

The results of the sensitivity analyses show some important impacts on cost-effectiveness calculations depending on the assumptions used in the analysis. For the costing assumptions, as expected, imputing higher cost estimates increased the ICER, and imputing lower cost estimates reduced it. Based on the range of cost imputations tested in the sensitivity analyses, the ICER ranged from Int$15,227 to Int$141,724. This serves to illustrate the impact of within-country variation in reported unit costs and resource use from the centre-specific questionnaires, and the impact of assumptions around missing data on cost estimates used for the cost-effectiveness analysis. This variation is probably driven by the differing public/private mix of care in specific countries, especially in relation to data provided for India. The results were less sensitive to cross-country variation within World Bank country income groups. This adds some confidence to the reliability of cross-country comparisons within income subgroups and suggests results may to some extent be generalisable to other countries within income groups.

The sensitivity analysis with one of the greatest impacts on the ICER related to the method of QALY calculation. The base-case analysis makes use of all available information, imputing a utility score of ‘0’ for those who have died, from the date of death to the end of follow-up. We chose to use the date of death for QALY estimation because, given the initial negative utility, patients could in theory attain a QALY advantage from dying earlier in the follow-up period. However, the use of date of death will create an asymmetry of information on the time point at which utility is accrued in the trial between those who have died and those who survived, as we have no comparable information for survivors, other than at their 6-month follow-up. Therefore, in order to illustrate the impact of this asymmetry, we have conducted an analysis assuming a linear interpolation between time points when calculating QALYs. This effectively ignores the date of death in the calculations but addresses the asymmetry of information. Take, for example, a patient who died 10 days after surgery. In the base-case analysis, they would accrue [(–0.402 + 0)/2] × (10/365) = –0.005507 QALYs over the first 10 days, plus 0 QALYs between death and 6 months. For the sensitivity analysis, the equivalent calculation would be [(–0.402 + 0)/2] × (182.5/365) = –0.1005 QALYs. These results show substantial differences in QALY calculation depending on whether or not date of death is accounted for.

In addition to the base-case GLM, we conducted exploratory analysis on the impact of the choice of analysis model on trial outcomes. For example, running a bootstrapped OLS regression model to account for the skewed distribution of the cost data shows Early Surgery being Int$314 more costly (95% bootstrapped CI –Int$5139 to Int$5766). The resultant ICER falls to Int$16,526 per QALY gained. However, this analysis should be interpreted only as exploratory, and closer examination of the data confirms that the GLM with gamma distribution is the most appropriate fit to the data. Nevertheless, the analysis serves to illustrate the potential uncertainty and impact of analysis model on the costing results.

Running an alternative model as a sensitivity analysis, which includes interaction terms to address crossovers in the GLM, suggests on average higher costs and lower quality of life for groups which have crossed over in both directions, therefore indicating that crossovers have an important impact on results. Increases in costs and deterioration in QALYs were significant at the 10% and 5% levels respectively for crossovers from initial conservative management to surgery. These results would be expected, given that crossovers to surgery were likely a result of emergency circumstances. In the model which adjusts for crossovers, being randomised to Early Surgery and receiving the randomised allocation is associated with incremental costs of Int$1021 (95% CI –Int$367 to Int$2368) compared with being randomised to and receiving conservative management.

Sampling uncertainty

In order to address sampling uncertainty in our estimates, we present the bootstrapped iterations of incremental costs and incremental outcomes on a scatterplot of the cost-effectiveness plane. These are also presented in the form of a CEAC, outlining the probability of the Early Surgery intervention being a cost-effective use of scarce health-care resources. Figures 15 and 16 illustrate the probability of cost-effectiveness for the base-case analysis.

Figures 15 and 16 indicate the probability of cost-effectiveness of Early Surgery compared with Initial Conservative Treatment and illustrate the sampling uncertainty surrounding the cost-effectiveness calculations. The scatterplot (Figure 15) shows that there is a high probability of Early Surgery delivering improved QALYs; however, there is much uncertainty surrounding the incremental cost estimates. Figure 16 shows the probability of Early Surgery being cost-effective at given threshold values of WTP for a QALY gain. The probability of cost-effectiveness increases as WTP increases, from approximately 50% at a threshold of Int$50,000 per QALY gained to approximately 65% at a threshold of Int$100,000. However, this graph should be interpreted with caution. Conclusions on cost-effectiveness would depend on a number of issues, including (1) how well these costs, which are based on all recruiting countries, reflect individual country circumstances and (2) how much decision-makers are willing to pay, in international dollars, for a QALY gain in their country. In the light of these two issues, Figure 17 presents the CEACs calculated for each of the individual income subgroups recruiting into the trial and may be more informative at a local decision-making level.
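As a rough sketch of how a CEAC is derived from bootstrap replicates, the example below counts, at each WTP threshold, the share of replicates with a positive incremental net monetary benefit (threshold × incremental QALYs − incremental costs). The function name `ceac` is hypothetical, and the replicates are simulated from independent normal distributions loosely centred on the base-case estimates; that simulation is an assumption made purely for illustration and does not reproduce the trial's bootstrap samples or the curves in Figures 16 and 17.

```python
import random


def ceac(delta_costs, delta_qalys, thresholds):
    """Probability of cost-effectiveness at each willingness-to-pay threshold.

    A replicate counts as cost-effective when its incremental net monetary
    benefit, wtp * delta_qaly - delta_cost, is greater than zero.
    """
    curve = {}
    for wtp in thresholds:
        favourable = sum(
            1 for dc, dq in zip(delta_costs, delta_qalys) if wtp * dq - dc > 0
        )
        curve[wtp] = favourable / len(delta_costs)
    return curve


# Simulated (hypothetical) replicates; NOT the trial's bootstrap output.
rng = random.Random(0)
delta_costs = [rng.gauss(1774, 970) for _ in range(1000)]    # Int$
delta_qalys = [rng.gauss(0.019, 0.012) for _ in range(1000)]  # QALYs

curve = ceac(delta_costs, delta_qalys, thresholds=[0, 25_000, 50_000, 100_000])
for wtp, probability in curve.items():
    print(f"WTP Int${wtp:,}: probability cost-effective = {probability:.2f}")
```

At a threshold of zero the calculation reduces to the share of replicates in which Early Surgery is cost-saving, which is why a CEAC need not start at zero.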

The data broken down by subgroup (Figure 17) indicate wide variation in the probability of cost-effectiveness depending on the country income subgroup considered. The probability of cost-effectiveness appears to be lowest for the low-income country group. However, data for both the low- and high-income groups are based on very small numbers recruited and are therefore too unreliable to support firm conclusions. High-income (e.g. UK, Germany), upper middle-income (e.g. China) and lower middle-income (e.g. India) countries have a probability of cost-effectiveness between 70% and 80% at an Int$50,000 threshold value of WTP for a QALY gain.

Interpretation of these country income subgroup CEACs is likely to depend on the wealth of individual countries. Countries should draw conclusions from the data presented for their income subgroup, in conjunction with the threshold values of WTP for a QALY normally applied in their jurisdiction. The WHO Choosing Interventions that are Cost-Effective (CHOICE) project has issued guidance to assist with this process, suggesting that an ICER less than three times gross domestic product (GDP) per capita might be considered cost-effective and an ICER that is less than GDP per capita would be considered highly cost-effective.

Discussion

Our analysis is undertaken from an international health services perspective, with costs and QALYs presented by country income subgroup, according to World Bank classifications. Our analysis follows a similar approach to one previously used to report international data in major surgery trials.28 Our results suggest that an improvement in average QALY outcomes may be achievable at additional cost with Early Surgery, with many of the QALY analyses falling just short of statistical significance at the 5% level. However, in one sensitivity analysis which varied the QALY calculation method, incremental QALYs were significant, highlighting the importance of how the date of death is treated within the calculations. In terms of costs, there were no significant differences between groups. The results are promising for the potential cost-effectiveness of Early Surgery and further studies are warranted to confirm these findings. These results indicate that, had the trial recruited to its original target, there is a high probability that we would have seen significant improvements in QALYs associated with Early Surgery.

The results of the analyses and the probability of cost-effectiveness are probably best interpreted at the income subgroup level. Although the subgroup analyses lack any statistical power, they are perhaps more relevant to decision-makers locally. The probability of cost-effectiveness at alternative threshold values of WTP for a QALY gain should be interpreted in the light of guidelines for cost-effectiveness in individual countries. Not all countries will have formal cost-effectiveness criteria for deciding on best practice guidelines. However, some studies7 have determined threshold values of WTP for a QALY gain. In addition, the WHO has suggested three levels of cost-effectiveness based on GDP per capita in that country. According to these guidelines, an intervention is classed as very cost-effective if its ICER is less than GDP per capita, cost-effective if its ICER is 1–3 times GDP per capita and not cost-effective if its ICER is more than 3 times GDP per capita.38 The thresholds for 2013 could be calculated for any individual country in the trial, based on data from the World Bank, GDP per capita 2012 international dollars.28 The threshold for each country income subgroup could be taken as the average of the GDP per capita in all of the trial participating countries in that country group. On this basis, the thresholds for low-income countries would be Int$1457 (very cost-effective) and Int$4371 (cost-effective); for lower middle-income countries Int$4389 (very cost-effective) and Int$13,167 (cost-effective); for upper middle-income countries Int$14,763 (very cost-effective) and Int$44,289 (cost-effective); and for high-income countries Int$32,363 (very cost-effective) and Int$97,089 (cost-effective).
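The threshold arithmetic above can be reproduced with a short sketch. The GDP per capita figures are the subgroup averages quoted in the text; the function names, the handling of ICERs that fall exactly on a boundary and the example ICER of Int$20,000 are assumptions introduced only for illustration.

```python
# Average GDP per capita (Int$) of trial countries in each income subgroup,
# as quoted in the text.
GDP_PER_CAPITA = {
    "low": 1457,
    "lower middle": 4389,
    "upper middle": 14763,
    "high": 32363,
}


def who_thresholds(gdp_per_capita):
    """Return the (very cost-effective, cost-effective) ICER ceilings."""
    return gdp_per_capita, 3 * gdp_per_capita


def classify(icer, gdp_per_capita):
    """Classify a positive ICER against the GDP-based thresholds.

    Boundary handling (< versus <=) is an assumption; dominated or dominant
    interventions (negative ICERs) are not handled by this sketch.
    """
    very, acceptable = who_thresholds(gdp_per_capita)
    if icer < very:
        return "very cost-effective"
    if icer <= acceptable:
        return "cost-effective"
    return "not cost-effective"


HYPOTHETICAL_ICER = 20_000  # Int$ per QALY, illustrative value only

for group, gdp in GDP_PER_CAPITA.items():
    very, acceptable = who_thresholds(gdp)
    verdict = classify(HYPOTHETICAL_ICER, gdp)
    print(f"{group}: Int${very:,} / Int${acceptable:,} -> {verdict}")
```

The same ICER can therefore be classed differently in each income subgroup, which is the reason for reading the subgroup CEACs against local thresholds.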

This assessment of threshold values is designed to be a broad indicator of cost-effectiveness for individual countries and is based on the assumption that WTP for a QALY gain and WTP to avert a disability-adjusted life-year loss would be similar. This assumption may not hold in reality, given the different make-up of the two measures. However, although they are not identical, they have many similarities in their composition and could offer a broad estimate of the value placed on health in individual countries which may not have any formal method of deciding on cost-effectiveness.

Based on the results of the study and the WHO guidelines for cost-effectiveness,38 one could interpret the Early Surgery intervention as offering a high probability of cost-effectiveness in both high- and upper middle-income countries. There may also be a high probability of cost-effectiveness in lower middle-income countries; however, based on the CEAC analysis, this conclusion would be more sensitive to the threshold value of cost-effectiveness imposed by decision-makers.

The results are based on a number of assumptions which have the potential to greatly influence the final cost-effectiveness results, as is evident from our sensitivity analyses. The data should therefore be interpreted as a preliminary indication of cost-effectiveness, based on currently available evidence. A larger trial population would provide more robust evidence.

Conclusions

The cost-effectiveness analysis indicates that Early Surgery may be associated with additional QALYs and increases in health-care expenditures. However, differences in costs and QALYs do not reach statistical significance. The results of our analyses, especially in relation to costs, should be interpreted with caution, in light of the assumptions outlined in this chapter. Further research is required to determine more conclusively whether Early Surgery is more cost-effective than Initial Conservative Treatment.

Figures

FIGURE 11. Data completeness (supplementary costing questionnaire) for surgical costs questionnaire, by patients recruited.

FIGURE 12. Within country variation of cost of surgery (example of Indian centres). Note that only two sites in India reported complete surgery costing data.

FIGURE 13. Across country variation in surgery costs (Int$).

FIGURE 14. Responses to the EQ-5D-3L.

FIGURE 15. Scatterplot of the cost-effectiveness plane: Early Surgery vs. Initial Conservative Treatment (all randomised patients – all countries). ES, Early Surgery; ICT, Initial Conservative Treatment.

FIGURE 16. Cost-effectiveness acceptability curve: base-case analysis. ES, Early Surgery; ICT, Initial Conservative Treatment.

FIGURE 17. Cost-effectiveness acceptability curves for base-case and country income subgroups. ES, Early Surgery; ICT, Initial Conservative Treatment.

Tables

TABLE 14

Assumptions and sensitivity analyses for resource use and unit costs estimation

| Assumption no. | Issue arising | Assumption made | Justification for assumption | How uncertainty is incorporated | Implication for interpretation of results |
|---|---|---|---|---|---|
| C1 | Missing data from site-specific questionnaire regarding surgical costs and resource use | If only one centre in a country has reported data, then assume these data are applicable to all centres in that country. If more than one country has data available, impute the weighted average for missing data in that country | Provides the best possible assessment of average costs for each country, weighted for the largest recruiting centres in an individual country | Conduct sensitivity analysis imputing data from the highest and lowest resource use and cost centre data to all centres in that country | Sensitivity analysis will provide a within-country range of possible costs which can be presented to local decision-makers. This will account for differences in public/private mix of care which may impact on costs |
| C2 | Surgery and hospital stay costing. Payment methods to hospitals and reimbursement methods differ across countries | Take data reported in the questionnaires as being comparable | This assumption is made because the questionnaire was designed in a way to obtain similar information from all sites, although in practice some sites may have reported overheads while others may not | Conduct sensitivity analyses imputing the highest and lowest resource use and costs estimates from within a country income group to all countries in that income group | This assumption has implications for cross-country comparisons of cost-effectiveness. However, the questionnaire design provides a reasonably comparable and standardised method of asking resource use and costs across all centres. Sensitivity analyses assess the cross-country variation and its impact on cost-effectiveness |
| C3 | Missing data for all centres in a given country (e.g. Malaysia/Lithuania) | Assume that costs are equal to the average of all countries contributing data to that income group of countries (low, lower middle, upper middle, high) | The logic is that countries in the same income group will have broadly similar organisations of care and will incur similar values of costs | Sensitivity analysis conducted in C2 above deals with uncertainty in this assumption | Taken together, the base-case and sensitivity analyses will provide a plausible minimum and maximum range in which costs and incremental costs may fall, allowing the presentation of a range for the most likely ICER |
| C4 | HDU/ICU/ward unit cost data from the additional costing questionnaire. Missing data from the questionnaire | Take weighted averages as above wherever possible imputing for (a) site data to all sites in the country and (b) country data to all countries in an income group | As above | Sensitivity analyses conducted for C1 and C2 above address uncertainty in this assumption | Base-case and sensitivity analyses give a plausible range for the calculations |

HDU, high-dependency unit; ICU, intensive care unit.

TABLE 15

Assumptions and sensitivity analyses for QALYs

| Assumption no. | Issue arising | Assumption made | Justification for assumption | How uncertainty is incorporated | Implication for interpretation of results |
|---|---|---|---|---|---|
| Q1 | No baseline utility data collected due to severity of initial injuries of patients in the trial | All patients enter in an unconscious state and thus have a baseline utility equivalent to that of the unconscious state, valued at –0.402 | This is reflective of the likely state of health, or close to the likely state of health, of patients at the time of randomisation | Owing to the lack of alternative data, we have not conducted sensitivity analyses on this assumption. As all other clinical baseline estimates were similar at baseline, it is likely that our assumption will not bias outcomes | Assuming equality of baseline utility could in theory create a bias for or against Early Surgery. However, given similarity across groups of baseline clinical characteristics, it is unlikely that any biases would change cost-effectiveness results |
| Q2 | Most countries in the trial do not have published EQ-5D valuation tariffs | UK tariffs can be used for QALY calculations for all participants in the trial | A similar method is used and applied to all (i.e. the time trade-off). Using differing methods (e.g. visual analogue scale/standard gamble) could introduce greater biases | No sensitivity analyses undertaken, as alternative tariffs not available for participating countries | Results of the QALY analysis are important to all countries, but could be re-run if new time trade-off tariffs for individual countries become available |
| Q3 | There were many deaths in the trial. QALYs may depend on how deaths are included in the calculations | A linear interpolation between time points was assumed | Standard method of analysis | Sensitivity analysis explores the impact of using date of death directly in the QALY calculations | Any differences between the methods of interpolating between time points for QALY calculation should be taken account of in interpretation of the results. The base-case and sensitivity analysis together will give a range for the ICER |

TABLE 16

UK-specific resource use and cost data (£)

| Cost item | Early Surgery, mean (SD) resource use | Initial Conservative Treatment, mean (SD) resource use | Early Surgery, mean costs per patient (£) [a] | Initial Conservative Treatment, mean costs per patient (£) [a] | Mean cost difference (Early Surgery vs. Initial Conservative Treatment) (£) |
|---|---|---|---|---|---|
| n | 2 | 4 | 2 | 4 | |
| Cost surgery | | | | | |
| Cost ICU | 7 (9.9) | 3.5 (7) | | | |
| Cost HDU | 3.5 (4.9) | 1 (1.5) | | | |
| Cost ward | 28 (24) | 49.3 (61) | | | |
| Cost of initial episode of care (without CC – with CC) | Mean total days = 38.5 | Mean total days = 53.8 | 7199.50 | 10,061 | –2861.50 |
| Cost hospital readmission (days) | 32.5 (46) | 4 (8) | 6077.50 | 748 | 5329.50 |
| Total costs | | | 13,277 | 10,809 | 2468 |

CC, complications and comorbidities; HDU, high-dependency unit; ICU, intensive care unit; SD, standard deviation.

[a] Based on a unit cost per day of £187, calculated as the average of tariffs with and without complications and divided through by the trim-point times to derive a cost per day.
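The cost columns of Table 16 follow from multiplying mean days by the £187 unit cost per day. The sketch below reproduces that arithmetic; the variable and function names are hypothetical, and the totals are rounded to the nearest pound as in the table.

```python
UNIT_COST_PER_DAY = 187  # GBP per day, from the table footnote


def episode_cost(mean_days: float) -> float:
    """Mean cost per patient for an episode of the given mean length of stay."""
    return mean_days * UNIT_COST_PER_DAY


# Mean days per patient taken from Table 16.
es_initial = episode_cost(38.5)   # Early Surgery, initial episode
ict_initial = episode_cost(53.8)  # Initial Conservative Treatment, initial episode
es_readmit = episode_cost(32.5)   # Early Surgery, readmission
ict_readmit = episode_cost(4)     # Initial Conservative Treatment, readmission

es_total = es_initial + es_readmit
ict_total = ict_initial + ict_readmit

print(f"Early Surgery total: GBP {es_total:,.0f}")
print(f"Initial Conservative Treatment total: GBP {ict_total:,.0f}")
print(f"Difference: GBP {es_total - ict_total:,.0f}")
```

With only six UK patients, the difference is driven largely by readmission days in the Early Surgery arm, which illustrates why the UK-only analysis was not considered meaningful.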

TABLE 17

Base-case cost analysis (including all sites recruiting to the trial)

All countries

| Costs (Int$) | Early Surgery (n = 82): resource use, mean (SD) | Early Surgery (n = 82): costs (Int$), mean (SD) | Initial Conservative Treatment (n = 86): resource use, mean (SD) | Initial Conservative Treatment (n = 86): costs (Int$), mean (SD) | Raw difference of means (Int$) | Adjusted difference of means (Int$) (95% CI) |
|---|---|---|---|---|---|---|
| Cost surgery | | 981 (1678) | | 515 (1206) | | |
| Cost ICU | 4.18 (4.2) | 2808 (5762) | 4.06 (4.61) | 2988 (6131) | | |
| Cost HDU | 1.72 (2.55) | 385 (1053) | 1.76 (3.01) | 461 (1445) | | |
| Cost ward | 11.88 (15.95) | 3595 (10,206) | 14.24 (29.43) | 3997 (13,789) | | |
| Cost readmission | 4.23 (14.43) | 1145 (5775) | 2.42 (9.63) | 421 (1720) | | |
| Total cost | | 8812 (18,032) [a] | | 8336 (18,685) [a] | 476 | 1774 (–132 to 3679) |

HDU, high-dependency unit; ICU, intensive care unit.

[a] Total mean cost is not equal to the sum of the resource use. This is because of the use of Diagnosis Related Group costs per episode of care, applied to resource use in Germany.

TABLE 18

Costing subgroup analysis (by country income subgroup)

Low-income countries: Early Surgery (n = 6), Initial Conservative Treatment (n = 10)

| Costs (Int$) | Early Surgery: resource use, mean (SD) | Early Surgery: costs, mean (SD) | Initial Conservative Treatment: resource use, mean (SD) | Initial Conservative Treatment: costs, mean (SD) | Raw difference of means | Adjusted difference of means (95% CI) [a] |
|---|---|---|---|---|---|---|
| Cost surgery | | 142 (0) | | 14 (45) | | |
| Cost ICU | 0.83 (1.60) | 203 (391) | 1.20 (2.70) | 293 (659) | | |
| Cost HDU | 3.83 (0.75) | 468 (92) | 3.5 (2.12) | 427 (259) | | |
| Cost ward | 5.33 (1.03) | 325 (63) | 6.30 (6.43) | 384 (392) | | |
| Cost readmission | 0.00 (0.00) | 0 (0) | 0.00 (0.00) | 0 (0) | | |
| Total cost | | 1139 (418) | | 1118 (614) | 20 | 205 (–48 to 459) |

Lower middle-income countries: Early Surgery (n = 40), Initial Conservative Treatment (n = 39)

| Costs (Int$) | Early Surgery: resource use, mean (SD) | Early Surgery: costs, mean (SD) | Initial Conservative Treatment: resource use, mean (SD) | Initial Conservative Treatment: costs, mean (SD) | Raw difference of means | Adjusted difference of means (95% CI) [a] |
|---|---|---|---|---|---|---|
| Cost surgery | | 439 (511) | | 176 (369) | | |
| Cost ICU | 3.20 (3.78) | 580 (1449) | 2.38 (3.39) | 227 (314) | | |
| Cost HDU | 1.93 (2.84) | 87 (128) | 1.97 (3.08) | 168 (513) | | |
| Cost ward | 5.70 (5.84) | 64 (98) | 6.31 (5.29) | 125 (364) | | |
| Cost readmission | 0.35 (2.21) | 3 (20) | 0.13 (0.59) | 1 (5) | | |
| Total cost | | 1174 (1583) | | 697 (964) | 477 | 192 (–1010 to 395) |

Upper middle-income countries: Early Surgery (n = 28), Initial Conservative Treatment (n = 30)

| Costs (Int$) | Early Surgery: resource use, mean (SD) | Early Surgery: costs, mean (SD) | Initial Conservative Treatment: resource use, mean (SD) | Initial Conservative Treatment: costs, mean (SD) | Raw difference of means | Adjusted difference of means (95% CI) [a] |
|---|---|---|---|---|---|---|
| Cost surgery | | 1089 (1174) | | 822 (1031) | | |
| Cost ICU | 5.43 (3.79) | 4272 (4134) | 6.93 (4.49) | 6010 (5588) | | |
| Cost HDU | 0.93 (1.84) | 643 (1261) | 1.17 (3.26) | 821 (2295) | | |
| Cost ward | 16.86 (16.30) | 3603 (4132) | 14.27 (26.05) | 3080 (6267) | | |
| Cost readmission | 8.39 (20.55) | 997 (2986) | 6.23 (15.48) | 805 (1881) | | |
| Total cost | | 10,603 (7517) | | 11,538 (10,149) | –936 | –1798 (–8378 to 781) |

High-income countries: Early Surgery (n = 8), Initial Conservative Treatment (n = 7)

| Costs (Int$) | Early Surgery: resource use, mean (SD) | Early Surgery: costs, mean (SD) | Initial Conservative Treatment: resource use, mean (SD) | Initial Conservative Treatment: costs, mean (SD) | Raw difference of means | Adjusted difference of means (95% CI) [a] |
|---|---|---|---|---|---|---|
| Cost surgery | | 4927 (3617) | | 2020 (3542) | | |
| Cost ICU | 7.25 (5.95) | 13,432 (14,847) | 5.14 (6.72) | 10,310 (15,989) | | |
| Cost HDU | 1.88 (3.18) | 1089 (2668) | 0.57 (0.98) | 622 (964) | | |
| Cost ward | 30.25 (31.45) | 23,671 (24,462) | 69.71 (68.18) | 34,662 (35,806) | | |
| Cost readmission | 12.27 (22.53) | 8233 (16,895) | 2.29 (6.05) | 1719 (4547) | | |
| Total cost | | 46,489 (38,880) [b] | | 47,483 (46,221) [b] | –994 | –9679 (–41,613 to 22,256) |

HDU, high-dependency unit; ICU, intensive care unit.

[a] Mean difference (95% CI) based on GLM regressions with family (gamma), link (identity).

[b] Total mean cost is not equal to the sum of the resource use. This is because of the use of Diagnosis Related Group costs per episode of care, applied to resource use in Germany.

TABLE 19

Utility values based on EQ-5D-3L responses

| Utility | Early Surgery (n = 82): mean (SD) | Early Surgery (n = 82): median (IQR) | Initial Conservative Treatment (n = 86): mean (SD) | Initial Conservative Treatment (n = 86): median (IQR) | Raw difference of means | Adjusted difference of means [a] |
|---|---|---|---|---|---|---|
| Baseline utility [b] | –0.402 (0.000) | –0.402 (–0.402 to –0.402) | –0.402 (0.000) | –0.402 (–0.402 to –0.402) | | |
| 6-month utility | 0.663 (0.374) | 0.796 (0.516 to 1.000) | 0.530 (0.454) | 0.710 (0.000 to 1.000) | | |
| QALY (over 6 months' follow-up) | 0.078 (0.074) | 0.0985 (0.0285 to 0.1495) | 0.06 (0.085) | 0.077 (–0.004 to 0.1495) | 0.018 | 0.019 (95% bootstrapped CI –0.004 to 0.043) |

HDU, high-dependency unit; ICU, intensive care unit; IQR, interquartile range.

[a] Mean QALY gain adjusted for patient characteristics of age and sex with 95% bootstrapped CI to account for non-normality of the QALY data.

[b] Based on the assumption that all patients attending for care have a baseline utility of –0.402 (i.e. are in an unconscious state at baseline).
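The QALY values in Table 19 appear consistent with a simple area-under-the-curve calculation: a straight line from the assumed baseline utility of –0.402 to the 6-month utility, over 0.5 years. The sketch below is an illustration under that reading, not the authors' code, and the function name `qaly_linear` is hypothetical. It reproduces the Early Surgery median and interquartile QALY values from the corresponding 6-month utilities.

```python
BASELINE_UTILITY = -0.402  # assumed unconscious state at randomisation
FOLLOW_UP_YEARS = 0.5      # 6-month follow-up


def qaly_linear(utility_6m: float,
                baseline: float = BASELINE_UTILITY,
                years: float = FOLLOW_UP_YEARS) -> float:
    """QALYs as the area under a straight line joining two utility values."""
    return (baseline + utility_6m) / 2 * years


# 6-month utilities for Early Surgery (median and IQR bounds, Table 19).
for utility in (0.516, 0.796, 1.000):
    print(f"6-month utility {utility:.3f} -> QALY {qaly_linear(utility):.4f}")
```

Because every patient starts from a negative utility, even full health at 6 months yields only about 0.15 QALYs over the follow-up period, which helps explain the small absolute QALY differences between arms.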

TABLE 20

Quality-adjusted life-years and incremental QALYs by country income subgroup

| Country income group | Early Surgery: n | Early Surgery: baseline, mean (SD) | Early Surgery: 6-month, mean (SD) | Early Surgery: QALY, mean (SD) | Initial Conservative Treatment: n | Initial Conservative Treatment: baseline, mean (SD) | Initial Conservative Treatment: 6-month, mean (SD) | Initial Conservative Treatment: QALY, mean (SD) | QALY difference (95% bootstrap CI) [a] |
|---|---|---|---|---|---|---|---|---|---|
| All countries | 82 | –0.402 (0) | 0.663 (0.374) | 0.078 (0.074) | 86 | –0.402 (0) | 0.53 (0.454) | 0.06 (0.085) | 0.019 (–0.004 to 0.043) |
| Low income | 6 | –0.402 (0) | 0.918 (0.094) | 0.129 (0.023) | 10 | –0.402 (0) | 0.876 (0.331) | 0.129 (0.051) | –0.018 (–0.055 to 0.019) |
| Lower middle income | 40 | –0.402 (0) | 0.658 (0.369) | 0.082 (0.062) | 39 | –0.402 (0) | 0.548 (0.453) | 0.069 (0.075) | 0.015 (–0.014 to 0.045) |
| Upper middle income | 28 | –0.402 (0) | 0.630 (0.369) | 0.07 (0.074) | 30 | –0.402 (0) | 0.391 (0.460) | 0.029 (0.094) | 0.037 (–0.006 to 0.08) |
| High income | 8 | –0.402 (0) | 0.606 (0.514) | 0.051 (0.129) | 7 | –0.402 (0) | 0.575 (0.378) | 0.052 (0.081) | 0.008 (–0.283 to 0.299) |

[a] Mean differences between arms for QALY calculations based on OLS regression models adjusted for patient characteristics (age and sex) and presented alongside 95% bootstrapped CIs.

TABLE 21

Base-case cost-effectiveness results

| Base case result | Early Surgery, mean (SD) | Initial Conservative Treatment, mean (SD) | Raw difference of means | Adjusted difference of means (95% CI) [a] |
|---|---|---|---|---|
| Costs (Int$) | 8812 (18,032) | 8336 (18,685) | 476 | 1774 (–132 to 3679) |
| QALYs | 0.078 (0.074) | 0.060 (0.085) | 0.018 | 0.019 (–0.004 to 0.043) |
| ICER (Int$) | | | 26,444 | 93,368 |

[a] Adjusted differences based on the above described GLM model for costs and OLS model for QALYs.

TABLE 22

Cost-effectiveness analysis by subgroup of the population

| Country group | Early Surgery costs (Int$) | Initial Conservative Treatment costs (Int$) | Incremental costs [a] (Int$) | Early Surgery QALYs | Initial Conservative Treatment QALYs | Incremental QALYs [a] |
|---|---|---|---|---|---|---|
| Base-case analysis | 8812 | 8336 | 1774 | 0.078 | 0.06 | 0.019 |
| Low income | 1139 | 1118 | 205 | 0.129 | 0.129 | –0.018 |
| Lower middle income | 1174 | 697 | 192 | 0.082 | 0.069 | 0.015 |
| Upper middle income | 10,603 | 11,538 | –1798 | 0.070 | 0.029 | 0.037 |
| High income | 46,489 | 47,483 | –9679 | 0.051 | 0.052 | 0.008 |

[a] Based on GLM and OLS regressions for costs and QALYs respectively, with adjustment for patient characteristics of age and sex.

TABLE 23

Impact of sensitivity analyses on cost-effectiveness results

| Sensitivity analysis | Early Surgery costs (Int$) | Initial Conservative Treatment costs (Int$) | Incremental costs (Int$) [a] | Early Surgery QALYs | Initial Conservative Treatment QALYs | Incremental QALYs [b] | ICER (unadjusted data) (Int$/QALY) | ICER (modelled data) (Int$/QALY) |
|---|---|---|---|---|---|---|---|---|
| Base-case analysis | 8812 | 8336 | 1774 | 0.078 | 0.06 | 0.019 | 26,444 | 93,368 |
| Assumptions regarding missing data from centre-specific questionnaires (see Table 14) | | | | | | | | |
| Assumption C1: Lowest resource use and cost data for individual centres applied to all centres in that country | 5940.00 | 5207.32 | 289.32 | 0.078 | 0.06 | 0.019 | 40,704 | 15,227 |
| Assumption C1: Highest resource use and cost data for individual centres applied to all centres in that country | 10,404.77 | 9256.37 | 2692.75 | 0.078 | 0.06 | 0.019 | 63,800 | 141,724 |
| Assumption C2: Lowest resource use and cost data for any country in a country income group, applied to all countries in that income group | 5362.30 | 3613.50 | 517.30 | 0.078 | 0.06 | 0.019 | 97,156 | 27,226 |
| Assumption C2: Highest resource use and cost data for any country in a country income group, applied to all countries in that income group | 16,291.51 | 14,211.96 | 1780.15 | 0.078 | 0.06 | 0.019 | 115,531 | 93,692 |
| Sensitivity analyses (see Table 15) | | | | | | | | |
| Assumption Q3: Assume QALYs calculated with linear extrapolation from baseline directly to date of death, as opposed to current assumption of linear extrapolation to 6 months for those who died in the trial | 8812 | 8336 | 1774 | 0.065 | 0.032 | 0.0351 | 14,303 | 50,541 |
| Using a standard OLS regression method, with bootstrapped CI for costs [c] | 8812 | 8336 | 314 [c] | 0.078 | 0.06 | 0.019 | 26,444 | 16,526 |
| Interaction terms in model to account for crossovers | 9316 | 5566 | 1021 | 0.087 | 0.078 | 0.011 | 416,667 | 92,818 |

[a] Incremental costs calculated on the basis of GLM regression models, gamma family, link identity, and adjusting for patient characteristics of age and sex.

[b] Incremental QALYs calculated on the basis of OLS regression models, with 95% bootstrapped CIs and adjustment for patient characteristics of age and sex.

[c] Analysis based on OLS regression with bootstrapped CI, adjusted for age and sex characteristics.

Copyright © Queen’s Printer and Controller of HMSO 2015. This work was produced by Gregson et al. under the terms of a commissioning contract issued by the Secretary of State for Health. This issue may be freely reproduced for the purposes of private research and study and extracts (or indeed, the full report) may be included in professional journals provided that suitable acknowledgement is made and the reproduction is not associated with any form of advertising. Applications for commercial reproduction should be addressed to: NIHR Journals Library, National Institute for Health Research, Evaluation, Trials and Studies Coordinating Centre, Alpha House, University of Southampton Science Park, Southampton SO16 7NS, UK.

Included under terms of UK Non-commercial Government License.

Bookshelf ID: NBK315829