Included under terms of UK Non-commercial Government License.
NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.
Murray DW, MacLennan GS, Breeman S, et al.; on behalf of the KAT group. A randomised controlled trial of the clinical effectiveness and cost-effectiveness of different knee prostheses: the Knee Arthroplasty Trial (KAT). Southampton (UK): NIHR Journals Library; 2014 Mar. (Health Technology Assessment, No. 18.19.)
Patellar resurfacing versus no patellar resurfacing
Currently there is great variability in the use of resurfacing both in the NHS and world-wide. This is primarily because some surgeons believe in resurfacing and some do not. In addition, a small proportion of surgeons resurface the patella in some patients and not others. With some designs of knee replacement, the trochlea is anatomically shaped. This design is considered patella-friendly and to perform well without patella replacement. Previous studies have not clearly demonstrated whether or not it is preferable to resurface the patella, or whether this depends on the design of the knee replacement, the state of the patella or other patient factors.
In this pragmatic study, which is substantially larger than previous RCTs, we found no significant difference in clinical outcome, in terms of pain and function (assessed by OKS, EQ-5D or SF-12), complications, readmission or reoperations between patients with and without patellar resurfacing (Table 41). There was also no significant difference in the incidence of patella-related reoperations. However, as there was a non-significant trend towards improved quality of life (0.187 QALYs per patient treated) and decreased costs (£104 per patient treated) associated with resurfacing, patellar resurfacing was cost-effective. The KAT results indicate a 96% probability that patellar resurfacing is cost-effective at a £20,000/QALY ceiling ratio. Sensitivity analyses indicated that this conclusion was generally robust. Subgroup analyses also suggested patellar resurfacing is more cost-effective in patients aged < 70 years, although it remains good value for money in patients aged ≥ 70 years. The study, therefore, provides an evidence base supporting routine resurfacing of the patella in all patients.
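The link between the point estimates quoted above and the cost-effectiveness conclusion can be sketched as a net benefit calculation. This is a minimal illustration that uses only the reported means (0.187 QALYs gained, £104 saved) and the £20,000/QALY ceiling ratio; the function names are hypothetical, and the sketch ignores the uncertainty that the trial analysis accounts for.

```python
# Point estimates reported in the text (resurfacing vs. no resurfacing).
DELTA_QALYS = 0.187        # incremental QALYs per patient treated
DELTA_COST = -104.0        # incremental cost in GBP (negative means a saving)
CEILING_RATIO = 20_000.0   # GBP the payer is assumed willing to pay per QALY


def incremental_net_benefit(delta_qalys, delta_cost, ceiling_ratio):
    """Monetary value of the QALYs gained minus the extra cost incurred."""
    return ceiling_ratio * delta_qalys - delta_cost


def dominates(delta_qalys, delta_cost):
    """True when an option is both more effective and less costly."""
    return delta_qalys > 0 and delta_cost < 0


inb = incremental_net_benefit(DELTA_QALYS, DELTA_COST, CEILING_RATIO)
print(f"Incremental net benefit: £{inb:,.0f} per patient")
print("Resurfacing dominates no resurfacing:", dominates(DELTA_QALYS, DELTA_COST))
```

A positive incremental net benefit means the option is cost-effective at the chosen ceiling ratio. When an option dominates, no ratio needs to be quoted, because it is both better and cheaper on the point estimates.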
We did not find evidence that the outcome of patellar resurfacing is influenced by whether the femoral component had a trochlea designed to fit an anatomical patella button or a domed patella button; the trial findings therefore apply whether or not the femoral component is considered to be patella-friendly. We also found that late patellar resurfacing had little or no benefit, suggesting that, if a patient has not had patellar resurfacing, late resurfacing should be avoided if possible.
Further research is needed: with increasing follow-up, there was an increasing number of reoperations for complications of resurfacing and a decreasing number of late patellar resurfacing procedures. Some of the complications resulting from resurfacing, such as patella fracture, require complex reconstructions and may be associated with poor outcomes. The operations for patella complications are undertaken in patients who have had resurfacing, whereas the late resurfacings are undertaken in patients who have not had resurfacing. Therefore, there is a concern that after 10 years the rate of complications and reoperations in the resurfaced patella group will increase more than in the non-resurfaced group. If there is a substantial increase in the reoperation rate in the resurfaced group, particularly if it is associated with a worsening clinical outcome resulting from resurfacing complications, our conclusion that the patella should routinely be resurfaced would change. Follow-up to 15 and 20 years is required.
Late patellar resurfacing, overall, had little effect on outcome. However, this does not necessarily mean that no patients improved after late resurfacing. Further research is required to understand the factors associated with a good or poor outcome after late resurfacing. If guidelines that advised against late resurfacing of the patella were made and adhered to, the benefit of resurfacing might disappear. We found some evidence of an interaction between patellar resurfacing and mobile bearings and all-polyethylene tibias. This needs to be explored in more depth to determine if this is a real effect, or just chance.
Mobile bearing versus fixed bearing
Mobile bearings were introduced to minimise wear. They achieve this by having larger areas of contact and thus lower contact stresses. However, their advantage of decreased wear may be nullified by them having more articulating surfaces. Improved wear should result in a decrease in long-term failure rate. Mobile bearings can also be used to alter the kinematics of the knee replacement. Improved kinematics should result in an improved functional outcome. The main theoretical disadvantage is instability and dislocation of the mobile bearing. In addition, mobile-bearing devices tend to be more expensive than fixed-bearing devices. Previous studies have shown no clear advantage or disadvantage of mobile bearings.
We found no definite advantage or disadvantage of mobile bearings in terms of postoperative functional status, quality of life, reoperation and revision rates, or cost-effectiveness (see Table 41). We did, however, identify two disadvantages of mobile bearings that could encourage surgeons to use fixed-bearing devices. First, there was a 2% incidence of instability or bearing dislocation in the mobile bearing group and none in the fixed bearing group. Second, although there was no significant difference in overall costs in the long term, there was a short-term saving for the hospital, as fixed bearings are appreciably cheaper than mobile bearings.
Further follow-up of the cohort would allow assessment of the long-term benefits, risks and costs of mobile bearings. The main theoretical advantage of mobile bearings is decreased wear. Wear can cause failure of knee replacement either mechanically, if the bearing is worn through, or through loosening and osteolysis. Both modes of failure require revision surgery. Failure due to wear tends to occur in the second decade after knee replacement. Therefore, if decreased wear were a real as well as a theoretical advantage of mobile bearings, it would probably be seen in the second decade. Follow-up of the patients in KAT at least to 15 years would clarify this.
Within the health economic analysis, trends were observed which, if they persist in the long term, will have important implications. The current evidence suggests that patients treated with mobile bearings are expected to have marginally higher QALYs which are sufficient to justify the small increased cost. There is, however, substantial uncertainty around this finding. In particular, there is some evidence that the benefits of mobile bearings are short lived, with the group assigned to mobile bearings tending to have higher costs and accrue fewer QALYs from the fourth year after TKR onwards. In the secondary analyses of the subgroup of patients < 70 years, the findings were somewhat stronger than those in the cohort as a whole. In particular, there was an estimated 86% probability that mobile bearings were cost-effective at a £20,000/QALY ceiling ratio. If mobile-bearing knee replacements are cost-effective, they are likely to be most cost-effective in the young active patients, as theoretically they should provide better function and longevity. It may be, therefore, in the long term that mobile bearings are cost-effective in patients aged < 70 years, whereas in patients aged ≥ 70 years fixed bearings may dominate, generating more QALYs and being less expensive. Again, longer term follow-up would help to determine if this is the case.
All-polyethylene versus metal-backed
Currently metal-backed tibial components are used for most knee replacements. Previous randomised trials and meta-analyses of these trials found no difference in clinical outcome between the two types of tibial component. As all-polyethylene components are substantially cheaper than metal-backed components, the general recommendation within the orthopaedics community is that, in the elderly, all-polyethylene devices should be used so as to save money.33,34 There have, however, not been any formal economic analyses to support this recommendation.
We found that the functional results with a metal-backed tibia were better than those with an all-polyethylene tibia (see Table 41). This difference was statistically significant when the function was assessed with the EQ-5D and SF-12, but not with the OKS. The complication and reoperation rates were similar. There was a non-significant trend towards a higher major reoperation rate with the all-polyethylene tibia. The economic analysis indicated that the initial cost saving, resulting from the all-polyethylene tibia being cheaper, was offset by higher subsequent costs such that overall the costs of the two types of tibia were similar. However, as metal-backed components were found to be more effective, there was a 91% probability that metal backing is cost-effective compared with all-polyethylene components, costing £35 per QALY gained. Previous recommendations suggested that metal-backed tibias would be less cost-effective than the all-polyethylene tibias in older people; however, we found the opposite: metal-backed tibial components were more cost-effective in patients ≥ 70 years than in younger patients, but were cost-effective in both age groups. This suggests that routinely using the metal-backed tibia in all patients would be good value for money. Hence, we believe that the previous recommendation that all-polyethylene tibias should be used to save money in the elderly is incorrect. Although initially they save money for the hospital, overall they will cost the health service more and are less effective.
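The comparison above falls in the quadrant where one option (all-polyethylene) is slightly cheaper but less effective, which reverses the usual decision rule. The sketch below works through that logic using the rounded point estimates from Table 41; the function names are hypothetical. The rounded inputs give a ratio close to, but not exactly, the £35 quoted in the text, presumably because the published figure was derived from unrounded estimates (an assumption).

```python
# Rounded point estimates from Table 41 (all-polyethylene vs. metal-backed).
DELTA_QALYS = -0.293       # all-polyethylene yields fewer QALYs
DELTA_COST = -10.0         # all-polyethylene is marginally cheaper (GBP)
CEILING_RATIO = 20_000.0   # GBP per QALY


def savings_per_qaly_lost(delta_qalys, delta_cost):
    """Ratio of money saved to QALYs forgone for a cheaper, less effective option."""
    if not (delta_qalys < 0 and delta_cost < 0):
        raise ValueError("only defined for a cheaper and less effective option")
    return delta_cost / delta_qalys


def cheaper_option_acceptable(delta_qalys, delta_cost, ceiling_ratio):
    """A cheaper, less effective option is acceptable only if the saving per
    QALY lost is at least as large as the ceiling ratio."""
    return savings_per_qaly_lost(delta_qalys, delta_cost) >= ceiling_ratio


ratio = savings_per_qaly_lost(DELTA_QALYS, DELTA_COST)
acceptable = cheaper_option_acceptable(DELTA_QALYS, DELTA_COST, CEILING_RATIO)
print(f"Saving per QALY lost: about £{ratio:.0f}")
print("All-polyethylene acceptable at £20,000/QALY:", acceptable)
```

Because the saving per QALY lost is far below the ceiling ratio, the small saving does not compensate for the loss in quality of life. This matches the conclusion that metal backing is the better value option.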
Further follow-up would provide very useful clarification. Theoretically, one would expect differences in the revision rates of all-polyethylene and metal-backed tibias in the long term. All-polyethylene designs are likely to have fewer problems due to wear, as they tend to have thicker polyethylene and as there is no possibility of backside wear between the polyethylene and the metal backing. In addition, the transmission of load to the proximal tibia is different, so there may be a difference in loosening rates. Up to 10 years we found a non-significantly higher incidence of major reoperations in the all-polyethylene group. As the incidence of revision tends to increase with time, longer follow-up would clarify whether or not this is a real difference. There was also some conflicting evidence about the functional advantages of the metal-backed tibia. Although the patterns of results were similar, the OKS did not demonstrate a significant advantage, whereas the EQ-5D and SF-12 did. Further follow-up would clarify this.
Unicompartmental versus total knee replacement
The question of whether unicompartmental knee replacements should be widely used or not remains a topical and controversial issue. Potentially, they could offer appreciable advantages compared with TKRs. Unfortunately, because of inadequate recruitment, we were not able to address this subject. The experience gained from KAT has, however, been very useful, as it provided the necessary background information for planning of another study, TOPKAT, to address this issue. TOPKAT finished recruitment in September 2013.
General implications for clinical practice from the trial as a whole
Taken together, the results of the randomisations provide evidence to support routine resurfacing of the patella and the use of metal-backed tibial components, and suggest mobile bearings should be used with caution and probably only in younger patients.
In each of the randomisations, some differences among the various arms were observed. For the functional outcome scores, the differences tended to be relatively small. For reoperations and revisions, although the relative differences were large, the absolute differences were small because the overall reoperation and revision rates were low. For the health economic outcomes, the differences were clearer. Surgeons should be aware of this when selecting implants and should adopt more expensive devices only when there is evidence to support this.
If failure is defined as a reoperation or OKS being less than it was preoperatively, then at 10 years the cumulative failure rate is about 30%. This is a relatively large figure and patients should be warned about this preoperatively. Further work is also needed to improve implant design and techniques. However, it does not necessarily mean that 30% of patients end up with a poor result. This is partly because with time the OKS may improve, and also because patients usually have a satisfactory outcome from reoperations or revision surgery. At 10 years about 90% of patients have a better OKS than they did preoperatively.
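The distinction drawn above, between ever having met the failure definition and the status at final follow-up, can be made concrete with a small sketch. The patient records below are invented purely for illustration and are not KAT data; the class and function names are hypothetical. A higher OKS is assumed to indicate a better outcome.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Patient:
    preop_oks: int           # Oxford Knee Score before surgery
    followup_oks: List[int]  # scores at successive follow-ups, in time order
    reoperated: bool         # any reoperation during follow-up


def ever_failed(p: Patient) -> bool:
    """Cumulative failure: a reoperation, or any follow-up OKS below baseline."""
    return p.reoperated or any(score < p.preop_oks for score in p.followup_oks)


def better_at_last_followup(p: Patient) -> bool:
    """Status at the end of follow-up, regardless of earlier setbacks."""
    return p.followup_oks[-1] > p.preop_oks


# Invented toy cohort (NOT trial data).
cohort = [
    Patient(18, [30, 34, 35], False),  # steady improvement
    Patient(20, [15, 28, 33], False),  # early dip below baseline, then recovery
    Patient(16, [29, 31, 30], True),   # reoperated, satisfactory score afterwards
    Patient(22, [25, 21, 19], False),  # ends below baseline
    Patient(17, [33, 36, 38], False),  # steady improvement
]

cumulative_failure_rate = sum(ever_failed(p) for p in cohort) / len(cohort)
better_at_end_rate = sum(better_at_last_followup(p) for p in cohort) / len(cohort)
print(f"Cumulative failure rate: {cumulative_failure_rate:.0%}")
print(f"Better than baseline at last follow-up: {better_at_end_rate:.0%}")
```

In the toy cohort the two rates do not sum to 100%, because patients who dip below baseline or undergo a reoperation can still finish with a better score than they started with.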
Comparing the KAT population with data from the national Patient Reported Outcome Measures (PROMs) data set, which covers around 85% of participants undergoing TKR in England in 2010–11, suggests that KAT participants are typical of those undergoing TKR this decade. KAT participants had a mean baseline OKS of 18.0 (cf. 19.0 in PROMs) and a baseline EQ-5D of 0.38 (cf. 0.41 in PROMs).91 Postoperative scores seen in KAT at 12 months were also similar to those observed in the national PROMs data set at 6 months (OKS 34.1 vs. 33.8 in PROMs; EQ-5D 0.73 vs. 0.70 in PROMs).91
The length of stay observed for KAT procedures (mean 10 days; standard deviation 5 days) is typical of that observed across England and Wales in 2000–3.94 However, the average length of stay has fallen substantially in the past 10 years, such that the mean hospital stay for primary knee replacement is now 5.3 days.82 As a result of their longer length of stay, the average total cost of the inpatient stay for primary TKR estimated in KAT (mean £7070; standard deviation £1873) is substantially higher than the current national average in England and Wales (£6080, based on HRGs HB21A-C in 2010–11).84 However, if the length of stay among KAT participants had been the same as that seen in recent years, the mean estimated cost of KAT primary admissions would have been reduced to £5526 (standard deviation £1212): £554 lower than the current national average. The reason for this difference is unclear. Differences in costing methodology could be one explanation. In particular, our analysis used Scottish data on operating theatre costs, because of a lack of available data on the cost of operating theatre time in England, and based the cost of the inpatient stay on the cost per excess bed-day to avoid double counting. However, the difference may also reflect changes in resource use over time, such as the higher cost of knee replacement components now than in KAT, a greater usage of regional anaesthesia, or more physiotherapy and other rehabilitation resources so as to achieve an early discharge.
Whereas the clinical results are likely to be applicable world-wide, the findings of economic evaluations are generally more sensitive to changes in relative prices and clinical practice and are specific to a UK setting. There may also be variations in clinical practice and procurement policies within the UK that could affect cost-effectiveness. In particular, the discounts that hospitals receive off component list prices and the loan charges incurred for instruments vary between hospitals, with low-volume centres typically incurring higher costs. There are also substantial variations in component price among manufacturers, which may increase variations among hospitals or surgeons who predominantly use components by one or two manufacturers. Other variations in hospital care, such as variations in recovery room use, were also observed. The indications and rates of revision surgery are also likely to vary among centres, although such variations cannot easily be identified within a sample of this size. Other unit costs will also vary geographically: particularly between Scotland and England and between London and provincial towns. However, given that the economic results were primarily driven by the magnitude and direction of quality of life differences and were insensitive to even substantial changes in the cost of components and hospital care, the findings from KAT are likely to have wider relevance than other evaluations in which costs comprise the major driver. Variation between centres also has equity implications. At present, a shortage of data on the relative merits of different prostheses leads to marked variation among surgeons in the types of prostheses used.
At present, surgeons also take account of several patient characteristics when deciding on the most appropriate type of prosthesis, such as disease severity, deformity, diagnosis, age and activity. In particular, more costly component designs, such as metal-backed components and mobile bearings, are predominantly given to younger participants, who are more active and more likely to outlive their prostheses. In KAT, secondary subgroup analyses suggested that patellar resurfacing, mobile bearings and all-polyethylene tibial components were less cost-effective in participants aged ≥ 70 years than in younger participants. However, patellar resurfacing and metal backing were nonetheless cost-effective for both age groups, suggesting that allocation by age is not appropriate. However, subgroup analyses did suggest that mobile bearings were dominated by fixed bearings in older participants, but dominant in younger participants, suggesting that age and activity may be an important consideration for this aspect of component design, although further research is needed.
The results also have implications for hospital and commissioning budgets. Although economic results suggest that patellar resurfacing and metal backing are cost-effective from an NHS perspective, both aspects of prosthesis design increase costs during participants’ primary hospital stay, which are offset by reductions in subsequent care and improvements in quality of life.
General research implications from the trial as a whole
The trial used a partial factorial design, which has been used in only a handful of trials to date, including the Women’s Health Initiative92 and the UK Prospective Diabetes Study.95 This study design enabled us to address three distinct research questions in the same study, increasing our effective sample size by recruiting some participants to two comparisons and avoiding the need to incur the fixed costs of trial administration and analysis for each comparison. These benefits are of particular relevance to orthopaedics, for which long follow-up time is essential and component designs raise a series of inter-related research questions.
The partial factorial design also enabled an exploratory assessment of interactions between patellar resurfacing and the other aspects of component design. Although the trial was extremely underpowered for this analysis, this sensitivity analysis suggested substantial qualitative interactions among the comparisons that could change the conclusions of the metal backing and patellar resurfacing comparisons. Although the results of these sensitivity analyses could be explained by chance and should be interpreted with caution, they nonetheless highlight an important area for future research. Although it may not be feasible to conduct a fully factorial trial adequately powered to detect interactions, preliminary work to explore the potential interaction between patellar resurfacing and metal backing or mobile bearings may be warranted.
The partial factorial study design also introduces challenges for the trial-based economic evaluation. KAT data are being used in ongoing research to explore the appropriate methodology for economic evaluation of factorial design trials,96 which could help improve the quality of subsequent research. Orthopaedic research also raises additional challenges for trial-based economic evaluation: particularly in relation to valuing joint prostheses and operating theatre time and dealing with data collected over a 10-year trial period.
Limitations
The study was designed about 15 years ago. Therefore, the questions that were considered to be important then may not be relevant today. However, the questions are, in fact, still important particularly as there are limited funds available for health care. The prostheses used in the study are no longer commonly used today. However, as the questions were generic and as there have only been small changes in prosthetic design, this makes no difference to the conclusions. Similarly, clinical practice has not changed substantially, except that larger numbers of knee replacements are implanted and the inpatient stay is shorter, so this should not affect the conclusions. Traditional randomised trials in orthopaedics have had tight inclusion and exclusion criteria and have included surgeon-based outcome measures as well as radiographs. KAT is very different as it is pragmatic in nature and is therefore better at guiding health policy. A great strength of KAT is the detailed health economic analysis; however, the resource-use data collection focused on the main drivers and excluded non-knee-related costs, pain medication and mobility aids. In addition, we did not have accurate data on the discounts that hospitals receive. We therefore assumed that there was a flat rate of discount across all components, which may not be the case in practice. The partial factorial design of the study means that we cannot easily allow for interactions among treatment factors or have the power to accurately estimate or exclude such interactions.
Analysis of the non-randomised data
The comprehensive range of data on clinical characteristics, quality of life and resource use that have been collected for KAT could be used to address additional research questions related to knee replacement.
The trial data were used as an observational data set to explore how the cost-effectiveness of TKR varies with baseline characteristics and to assess the evidence base underpinning the eligibility criteria for TKR that had recently been introduced by a number of primary care trusts.97 This research demonstrated that, although the costs and benefits of TKR vary with OKS, TKR is highly cost-effective for participants of ASA grades 1–2 who had baseline OKS < 40 and for ASA grade 3 participants with OKS < 35. The study also showed that the cost-effectiveness of TKR was independent of BMI and of disease in other joints. This study was published in BMJ Open and presented at a number of national and international meetings. EQ-5D and OKS data from KAT were also used alongside data from the national PROMs programme to develop a mapping algorithm that can be used to estimate EQ-5D responses and utilities from patients’ responses to the OKS questionnaire,98 thereby facilitating future research assessing cost-effectiveness on older data sets that include OKS but not EQ-5D. KAT data were also used to explore the potential clustering effects of surgeon and/or centre in surgical trials and contributed to a database of intracluster correlation coefficients to aid in the design of future randomised surgical trials.99 There is also a collaboration between KAT and COAST (another study funded by NIHR) in which KAT data are being used to develop a predictive model of knee replacement outcome.
Further research
The three main priorities for further research are:
- Continue follow-up of KAT patients up to a minimum of 15 years.
- Additional detailed analysis of the 10-year KAT data set.
- Further RCTs in joint replacement based on the experience gained from KAT. A good example of this is TOPKAT, a study designed to determine whether total or partial knee replacement is better.
The analysis of the median 10-year follow-up data from KAT patients has gone a long way towards providing a substantially firmer evidence base to guide answers to the questions addressed by the randomisations within KAT. However, further follow-up to a minimum of 15 years and analysis of these data should result in stronger conclusions, which should provide the basis for more detailed, stronger and more complete recommendations. There are two reasons for this. First, differences among the arms of the various randomisations may appear or become more marked in the second decade post knee replacement, and, second, the power of the study will increase with longer follow-up, which will allow more detailed subgroup analysis. Failure due to many causes, such as component loosening, polyethylene wear and osteolysis, tends to occur more frequently in the second decade than the first. Therefore, any design features, such as patellar resurfacing, mobile bearings and metal backing of the tibia, that influence these failure mechanisms are likely to have a greater effect on revision rates in the second decade than the first. In the longer term, outcome scores following knee replacement tend to drop. Therefore, functional differences among different designs of knee replacement may become more marked in the second decade. With time there will be more reoperations and revisions, which will increase the power of the study. For the standard analysis of the outcome scores, increased observations over time will not increase power, although the power of the marginal benefit calculation may increase with longer follow-up. Similarly, as costs and QALYs accumulate over time, the power of the health economic analysis may also increase if the follow-up is increased to 15 and 20 years.
Ongoing follow-up is particularly important for the patellar resurfacing randomisation, as we have found that with increasing follow-up there were an increasing number of reoperations for complications of resurfacing and a decreasing number of late patellar resurfacing procedures. We are therefore concerned that after 15 years there may be an increasing number of problems with resurfacing the patella that may change our findings, suggesting that the patella should not be resurfaced routinely. With the mobile bearing randomisation, we found that after up to 10 years there was no definite difference between the two arms. However, the main advantage of the mobile bearing, which is decreased wear, is most likely to manifest in the second decade. Long-term follow-up could also explore the trend towards lower QALYs with mobile bearings beyond year 5 and explore the trend towards mobile bearings having a better cost-effectiveness in younger people than in older people. Follow-up to 15 and 20 years should clarify these issues. In the metal-backed versus all polyethylene randomisation, the 10-year results suggest a clear health economic advantage for the metal-backed tibia. There was, however, no clear clinical difference. Up to 10 years, we found a non-significantly higher incidence of major reoperations in the all polyethylene group. As the incidence of revision tends to increase with time, longer follow-up will clarify whether or not this is a real difference. There was also some conflicting evidence about the functional advantages of the metal-backed tibia. The OKS did not demonstrate a significant advantage, whereas the EQ-5D and SF-12 did, although the patterns of results were very similar. Further follow-up would also clarify this.
We believe the median 10-year KAT data set is the best data set for knee replacement that exists. It contains detailed data on patient demographics, surgical findings and management and implant characteristics for a very large number of patients. It also contains data from annual follow-up about clinical scores, complications, reoperations, costs and resource use. Further observational analysis of the data set, which should ideally be extended to a minimum of 10 years, could be undertaken to describe the natural history of knee replacement and to answer many of the key outstanding questions relating to TKR. For example, KAT data could be used to identify patient, centre, surgical and implant factors associated with a poor outcome, in terms of clinical score or reoperation rate, which would help surgeons improve the results of knee replacements. It could be used to determine the optimum way to follow up knee replacement patients. It could be used to develop a detailed long-term health economic model of knee replacement and thus to improve the cost-effectiveness of knee replacement. It could be used to explore important observations made in the study such as that, when failure is defined as reoperation or a worse OKS than preoperatively, the cumulative failure rate at 10 years is about 30% and that the various outcome measures discriminate differently among knee replacement designs.
Tables
TABLE 41
Outcome | Patellar resurfacing vs. no resurfacing | Mobile vs. fixed bearings | All-polyethylene vs. metal-backed tibial components |
---|---|---|---|
Functional (OKS) | Small but consistent difference in favour of patellar resurfacing; 95% CI suggests MCID unlikely; treatment effect not modified by patella shape | Similar between groups | Consistent benefit favouring metal-backed, not statistically significant |
Quality of life (EQ-5D utility, SF-12 PCS and MCS) | Similar between groups | Similar between groups | Similar pattern to OKS but statistically significant differences found |
Reoperation | Similar between groups | Similar in both groups; however, five participants required reoperation for instability or dislocation in the mobile bearing group | Similar between groups |
Incremental QALYs (95% CI) | 0.187 (–0.025 to 0.399; p = 0.08) | 0.051 (–0.333 to 0.435; p = 0.79) | –0.293 (–0.706 to 0.119; p = 0.16) |
Incremental costs (95% CI) (£) | –104 (–630 to 423; p = 0.70) | 85 (–911 to 1081; p = 0.87) | –10 (–872 to 851; p = 0.98) |
Base-case cost-effectiveness result | Patellar resurfacing dominates no resurfacing, with a 96% probability of being cost-effective | Mobile bearings cost £1666 per QALY gained vs. fixed bearing, with a 59% probability of being cost-effective | All-polyethylene saves £35 per QALY lost vs. metal-backed, with a 9% probability of being cost-effective |
Sensitivity analysis results | Complete case finds resurfacing not cost-effective | Complete case and per-protocol analyses find mobile bearings dominated | Conclusions robust to changes in methods other than assumptions about interactions |
Subgroup analysis results | Cost-effective in both age subgroups. Probability of being cost-effective: 97% in participants < 70 years, 74% in participants ≥ 70 years | Cost-effective in participants < 70 years (86% probability), but not ≥ 70 years (24% probability) | All-polyethylene is a poor use of resources in both age subgroups. Probability of being cost-effective: 46% in participants < 70 years, 5% in participants ≥ 70 years |
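
The probabilities of cost-effectiveness in Table 41 can be related to the incremental QALYs and costs through a rough normal approximation. The sketch below is an illustration only and is not necessarily the method used in KAT. It assumes that the 95% CIs are symmetric normal intervals and that incremental costs and QALYs are uncorrelated, both simplifications introduced here. The function names are hypothetical.

```python
import math

Z95 = 1.96                 # normal quantile assumed to underlie the 95% CIs
CEILING_RATIO = 20_000.0   # GBP per QALY


def se_from_ci(lower, upper):
    """Standard error implied by a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * Z95)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_cost_effective(dq, dq_ci, dc, dc_ci, ceiling=CEILING_RATIO):
    """Approximate P(incremental net benefit > 0).

    Assumption: incremental costs and QALYs are independent and normal.
    """
    mean_nb = ceiling * dq - dc
    se_nb = math.hypot(ceiling * se_from_ci(*dq_ci), se_from_ci(*dc_ci))
    return normal_cdf(mean_nb / se_nb)


# (incremental QALYs, QALY CI, incremental cost in GBP, cost CI) from Table 41
comparisons = {
    "Patellar resurfacing vs. none": (0.187, (-0.025, 0.399), -104, (-630, 423)),
    "Mobile vs. fixed bearing": (0.051, (-0.333, 0.435), 85, (-911, 1081)),
    "All-polyethylene vs. metal-backed": (-0.293, (-0.706, 0.119), -10, (-872, 851)),
}

results = {name: prob_cost_effective(*vals) for name, vals in comparisons.items()}
for name, p in results.items():
    print(f"{name}: approx. {p:.0%} probability of being cost-effective")
```

Under these assumptions the approximation lands close to the base-case probabilities in the table. Small discrepancies are to be expected, because the published values rest on the full joint distribution of costs and QALYs rather than on rounded summary figures.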