Included under terms of UK Non-commercial Government License.
NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.
Blom AW, Artz N, Beswick AD, et al. Improving patients’ experience and outcome of total joint replacement: the RESTORE programme. Southampton (UK): NIHR Journals Library; 2016 Aug. (Programme Grants for Applied Research, No. 4.12.)
Parts of this chapter have been reproduced from Wylde and colleagues.262 © 2012 Wylde et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Some parts have also been reproduced with permission from Wylde V, Lenguerrand E, Brunton L, Dieppe P, Gooberman-Hill R, Mann C, et al. Does measuring the range of motion of the hip and knee add to the assessment of disability in people undergoing joint replacement? Orthop Traumatol Surg Res, vol. 100, pp. 183–6.263 Copyright © 2014 Elsevier Masson SAS. All rights reserved. Parts of this chapter have also been reproduced from Lenguerrand and colleagues.264 © 2016 Lenguerrand et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Abstract
Background
In the ADAPT study we aimed to compare outcome measures over time in patients with hip or knee replacement and assess how well they measure impairment, activity limitation and participation.
Methods
Outcome measures were studied prospectively in 263 patients receiving joint replacement. Function was assessed prior to surgery and at 3 and 12 months using patient-completed questionnaires, clinician-administered tools and performance tests.
Results
A clinically significant improvement occurred in about 90% of patients with hip replacement and 70% of those with knee replacement. Patients with severe disease at the time of surgery were more likely to have substantial improvements in pain and functional ability.
Pain and function measures were highly correlated. People with anxiety or depression may assess themselves as being worse off than objective measures suggest. Measures of function may need adjustment for pain, psychological status, age and perhaps muscle strength to obtain a satisfactory picture of functional loss. Results suggested that physical function should be measured with both a PROM and a performance test. ROM is commonly assessed in clinical practice but did not correlate well with other measures of disease severity.
Conclusions
The ADAPT study highlighted the importance of different methods of assessing pain and function in patients receiving hip and knee replacement. Different pain characteristics predicted long-term pain in hip and knee replacement. Outcomes after joint replacement should be assessed with a patient-reported outcome and a functional test.
Background
In medical research it is conventional to use ‘outcome measures’ to assess change over time and the response to any intervention, such as a joint replacement.265 Outcome measures are an artificial construct, as everyone’s life and health change continuously and well-being is influenced by many factors other than a specific illness or its treatment.266 Nevertheless, we need outcome measures at appropriate time points when comparing the value of different approaches to health care. Research suggests that most of the benefit that can accrue from a successful joint replacement has occurred by 12 months after the operation.48 In addition, we need to measure the long-term costs of providing treatments, which allows us to assess whether or not incremental health benefits are worth the incremental costs required to provide them.
It is essential that patient-reported outcomes after joint replacement are continuously reviewed and monitored to improve practice and optimise the results of surgery. However, the use of many different outcome measures can lead to difficulty in applying evidence to clinical practice and renders comparisons across studies and meta-analyses problematic.267 One recent systematic review found extensive variation in the outcome measures used in RCTs of joint replacement.119
Measurement by clinician or patient report
The outcome after a hip or knee replacement can be assessed in different ways and can be classified according to who makes the judgement – a clinician, the patient alone, a ‘significant other’, or a mixture of two or more groups. In early studies, adverse events such as infection and prosthesis survival were the main issues of concern.45 As prosthesis design and the control of adverse events improved, these issues became less important and attention turned towards clinician-administered tools, such as the Harris Hip Score (HHS)268 and AKSS,113 and more recently towards PROMs.121 Clinician-administered tools have been widely criticised because of the recognised discordance between views of patients and clinicians.269,270
Research studies in joint replacement frequently use PROMs assessing different domains. Patients rate their pain, function, HR-QoL, social participation, mental health and satisfaction with the outcome of health-care interventions. In England, following the report of Lord Darzi,121 PROMs are routinely collected after elective surgery.271
General or specific measures
Outcome measures may be general, reflecting overall pain, function and well-being, or specific, relating to an arthritic hip or knee. ‘Joint specific’ measures are used to assess the effectiveness of an intervention targeting a joint (such as joint replacement); however, it is also important to find out if a patient’s general QoL has been affected.
It is generally agreed that we should assess pain and function when treating arthritis, as these are the two problems that bother people most. But deciding which aspects of pain (e.g. activity-related pain, night pain or resting pain) and which types of function (e.g. stair climbing, shopping, getting on a bus or playing golf) to measure causes researchers endless problems. In addition, these often depend on issues such as culture and context.
There are many different outcome measures for use in assessment of health status and the response to interventions for people with arthritis and such instruments need to be validated before use. We need to be sure that they measure the outcome appropriately and that they are reproducible, responsive to change, consistent and acceptable to patients. Many researchers choose an instrument because it is widely used by others and this will help them to compare their results with those in the published literature.
Utilities
In health care, we attribute a measure of ‘utility’ to the time patients spend with a particular QoL profile. Utility is an economic term to describe the benefit that individuals derive from consuming goods or services. Because goods and services are scarce, individuals are faced with choices and their preferences are revealed by choosing to consume some goods and services over others. Individuals would rationally prefer goods and services that provide them with a higher utility level. In terms of health and health care, we consider that each patient has a given health profile that gives them a certain amount of health benefit or ‘utility’. Better health profiles are those for which patients have a higher QoL and, therefore, higher utility scores as well. We measure each individual’s health profile by asking patients to fill in generic HRQoL questionnaires. Such questionnaires can be filled in a myriad of ways, each corresponding to a different health profile. These are then sent for valuation to a random sample of individuals from the same population with a particular health profile. The weighted average of values for each health profile is the ‘utility score’ that a particular society has attributed to the specific health profiles.
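As a rough illustration of the final step described above, the sketch below computes a utility score for one health profile as a weighted average of the values assigned to it. The function name `utility_score`, the example valuations and the weights are hypothetical; the text does not specify how the weights are derived.

```python
def utility_score(valuations, weights):
    """Weighted average of valuations for one health profile.

    Both arguments are hypothetical inputs for illustration: `valuations`
    are the values respondents assign to the profile and `weights` are
    the weights attached to those respondents.
    """
    if len(valuations) != len(weights) or not valuations:
        raise ValueError("need equally sized, non-empty inputs")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(valuations, weights)) / total_weight


# Illustrative numbers only, not taken from the study
example = utility_score([0.6, 0.8], [1, 3])
```

With equal weights this reduces to a simple mean of the valuations.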
Outcomes used in RESTORE
In Table 15, we summarise the key outcome measures used in the RESTORE programme.
The main issues of concern to patients undergoing total hip or knee joint replacement include pain and functional problems that are related to the joint disease, as well as general QoL and satisfaction with the surgery. In this chapter we consider joint specific pain and function. Pain is a purely subjective domain, so that we are dependent on patient self-report to assess it. In contrast, function can be assessed in many different ways, which include patient report, observation of specific or general activities, measurement of certain activities and third party observations and perceptions.
The WHO introduced the ICF,80 which provides a theoretical framework on which to base the assessment of function. This framework splits function into three separate domains: impairment, activities limitations and participation restrictions. The value of this in the context of total hip or knee replacement can be illustrated by taking the example of climbing a step, a common problem for people considering a total hip or knee replacement. The impairments might include reduced joint movement, pain on movement and muscle weakness; the resulting activities limitations might be difficulty climbing stairs or difficulty getting onto a bus. Consequent participation restrictions might be inability to get to the shops or to go to stay with grandchildren because of the need to use stairs. Research has shown that the relationships between the impairment, activities limitations and participation restrictions domains of the ICF are not simple, with other factors such as self-efficacy and comorbidities acting as independent determinants of the relationships between these variables.300
It has been recommended that a combination of outcome measures should be used to assess function after total hip or knee replacement.301,302 However, there are many reasons not to use a wide number of measures with every patient, whether in clinical practice or research. These include patient fatigue and burden, time constraints of clinic and research appointments, and time taken to process and analyse multiple information sources. Therefore, there is a need for guidance about which outcome measures are the most useful in assessing function before and after total hip or knee replacement.
The aims of the ADAPT study were to compare the properties and responsiveness of a selection of commonly used measures that are either self-assessment tools or functional tests, to examine how well they relate to the ICF concepts of impairment, activities limitations and participation restrictions, and to explore the changes in the measures and domains of outcome over time.
Methods
The ADAPT study is a single-centre cohort study at the AOC. Based in the south-west of England, this is one of the largest elective orthopaedic units in the UK, with approximately 800 hip operations and 800 knee operations performed in 2011.303 The ADAPT study was approved by Southwest 4 Research Ethics Committee (09/H0102/72) and all participants provided their informed, written consent to take part. The study was registered on the NIHR Clinical Research Network Portfolio (UKCRN ID 8311).
Inclusion/exclusion
Recruitment into the study began in February 2010 and finished in November 2011. Patients listed for one of the following operations were eligible: primary TKR, revision TKR, unicompartmental knee replacement, patellofemoral replacement, primary THR, revision THR or hip resurfacing. We included patients with different surgical procedures so that functional measures could be assessed across a range of people with diverse issues and degrees of functional loss.
Patients were excluded from the study if they lacked the capacity to provide informed consent. This was assessed by the research nurse in accordance with guidance from the Integrated Research Application System, which is responsible for providing ethical approval in the UK, and the Mental Capacity Act of 2005.304 The research nurse judged that a patient lacked capacity if the patient met one of the following criteria: (1) they could not understand the information relevant to the decision to participate; (2) they were unable to retain the information about the study; (3) they were unable to use or weigh that information as part of the decision-making process; or (4) they were unable to communicate their decision about participation (whether by talking, using sign language or any other means). Another exclusion criterion was functional limitation so severe that the patient was unable to walk, because this would prevent the patient from attempting any of the functional tests. This was assessed in the discussion between the research nurse and the potential participant and was always a mutual decision by the researcher and the patient. Being unable to complete questionnaires in the English language was also an exclusion criterion because not all the validated questionnaires have been translated into other languages.
Participant recruitment
Potential participants were identified from the joint replacement waiting list by the code for the intended operation and sent a postal invitation. Patients who returned a reply form were telephoned by a research nurse to discuss participation. This included a full explanation of study involvement and a preliminary eligibility assessment by asking about the intended operation and assessing understanding of the information provided. Those who did not reply or who were missed from the initial postal invitation list owing to late scheduling of hospital appointments were approached by a research nurse when they attended the pre-operative assessment clinic. These patients were identified by daily checking of the clinic lists. If they were interested, eligibility was assessed and a full explanation of study involvement was provided. The first appointment was arranged then or patients were telephoned a few days later if they needed time to consider. Demographic data (age, sex and postcode) were recorded from all patients.
Assessment times
Participants attended a research appointment, lasting approximately 1 hour, at the AOC. Appointments were scheduled before surgery and then at 3 months and 1 year after surgery. Assessments were conducted at 3 months post operation to coincide with the standard clinical review, by which time most patients should have experienced a large improvement in pain and function.48 Assessments were also conducted at 1 year post operation as outcomes can continue to improve up until this time point.48 The inclusion of two postoperative assessments also allowed exploration of outcome trajectories and comparison of rates of improvement between different outcome domains (e.g. pain, function, participation).
At the initial pre-operative appointment, eligibility was confirmed, informed written consent was obtained and a questionnaire was given to participants to return by post before their operation. For the postoperative assessments, a questionnaire was sent out prior to the research appointment. At each time point, participants underwent a clinical assessment. These assessments were conducted by trained research nurses who followed standard operating procedures to ensure consistency and standardisation in data collection and who were assessed for competency in the examination procedures by a senior research nurse and orthopaedic surgeon. Risk assessments of the functional tests were undertaken and safe operating procedures specified. The data collected during the assessments were recorded by the research nurses on standardised proformas.
Selection of outcome measures
Table 16 provides a summary of the functional assessment measures used in the ADAPT study. The table also provides an overview of the ICF domains included within each functional assessment measure, with classification of self-completed PROMs and clinician-administered measures based on the results of an expert consensus study by Pollard and colleagues.60 We also conducted a gait analysis using a single inertial sensor to derive motion parameters during ADL.305
There are a number of other measures that could have been included in this study such as the OHS and OKS,117,274 HSS,113 KOOS306 and NHP.307 However, to avoid participation burden and fatigue only a selection of measures was chosen.
Patient-reported outcome measures
The following validated measures were used to provide disease-specific and generic self-reported measures of outcome.
The WOMAC function scale.114 This disease-specific subscale, validated in osteoarthritis patients, consists of 17 questions assessing the extent of function limitations when performing a range of daily activities. Responses are provided on a 5-point Likert-type scale.
Aberdeen impairment, activity limitation and participation restriction measure (Ab-IAP).285 This 35-item disease-specific measure, validated in osteoarthritis patients, uses the ICF framework to assess disability and produces scores for impairment, activities limitations and participation restrictions. Responses are provided on a 5-point Likert-type scale.
Short Form questionnaire-12 items.116 This 12-item general health measure produces a PCS and a mental component score. Responses are provided as binary options (yes/no) or on a Likert-type scale.
Measure Yourself Medical Outcome Profile 2 (MYMOP2).287 This patient-generated instrument allows participants to generate and rate the severity of two symptoms that are concerning them and one activity important to them that is restricted by the symptoms. Participants also rate their general well-being, duration of symptom 1 and medication usage for symptom 1. At follow-up, participants are asked to rate the severity of the symptoms and degree of restriction of the activity that they identified at the first data collection point. Ratings are provided on scales of 0–6. The MYMOP2 was completed during research appointments by participants with the assistance of research nurses.
Participants also completed a number of other questionnaires to assess factors that have been found to influence outcomes after total hip or knee replacement (Table 17). At each assessment time, participants completed the HADS281 and the WOMAC pain and stiffness subscales.114 Participants were also asked to rate how disabled they perceived themselves because of their joint problems and why, and to list three things that they were hoping for from their total hip or knee replacement. Pre-operatively, medical comorbidities were recorded using the Functional Comorbidity Index279 and information was collected about socioeconomic status (marital status, living arrangements, educational attainment, employment status), joints affected by arthritis and previous surgery on other joints. In the 1-year postoperative questionnaire, satisfaction with the outcome of surgery was assessed using the Self-Administered Patient Satisfaction Scale for Primary Hip and Knee Arthroplasty.284
Clinician-administered measures
The HHS was completed with patients receiving hip replacement.268 This assessment measure provides a total score of between 0 and 100 (worst to best) collected over four domains. Function, which includes limp, use of assistive devices, walking distance, managing stairs, using public transport, sitting comfortably and putting on shoes and socks, is weighted the most heavily and is assigned 47 points. Pain is assigned 44 points. The physical examination involves assessing deformity (4 points) and ROM (5 points).
The AKSS was completed with patients receiving knee replacement.113 This assessment consists of a knee score and a function score, both with a total score ranging from 0 to 100 (worst to best). The knee score incorporates examiner’s rating of patients’ pain (50 points) and a clinical assessment of stability (25 points) and ROM (25 points), with deductions for flexion contracture, extension lag and misalignment. The function score consists of questions about walking distance (50 points) and stair climbing ability (50 points), with deductions for the use of walking aids.
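The point allocations described for the HHS and AKSS can be checked with a little arithmetic. The sketch below is an illustration only: the dictionary and function names are hypothetical, and because the text does not give the size of the AKSS deductions, they are left as a caller-supplied placeholder.

```python
# Maximum points per domain, as stated in the text above
HHS_MAX = {"function": 47, "pain": 44, "deformity": 4, "range_of_motion": 5}
AKSS_KNEE_MAX = {"pain": 50, "stability": 25, "range_of_motion": 25}
AKSS_FUNCTION_MAX = {"walking_distance": 50, "stair_climbing": 50}


def total_score(points, maxima, deductions=0):
    """Sum domain points after checking each against its stated maximum.

    `deductions` is a placeholder for the AKSS deductions; the text does
    not give their sizes, so the caller must supply them.
    """
    if set(points) != set(maxima):
        raise ValueError("points must cover exactly the listed domains")
    for domain, value in points.items():
        if not 0 <= value <= maxima[domain]:
            raise ValueError(f"{domain} must lie between 0 and {maxima[domain]}")
    return sum(points.values()) - deductions


# Each set of domain maxima adds up to the 100-point ceiling
assert sum(HHS_MAX.values()) == 100
assert sum(AKSS_KNEE_MAX.values()) == 100
assert sum(AKSS_FUNCTION_MAX.values()) == 100
```

The domain maxima show why function and pain dominate the HHS: together they account for 91 of the 100 available points.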
Performance tests
Before performing each of these tasks, participants were asked if they thought that they would be able to perform the task and estimate how difficult they thought the test would be to perform on a 0–10 scale (no difficulty at all to impossible). After they had completed the test, they were asked to rate how difficult the task actually was to perform on the same 0–10 scale. The research nurse conducting the assessment also provided a rating of how difficult it appeared to be for the participant to perform the task. If participants were unwilling to attempt the test or the research nurse was unhappy to proceed because of safety concerns, the test was not performed. All tests were performed without the use of supportive aids except the timed 20-metre walk and are described in the order in which they were performed.
Timed 20-metre walk288
Participants were timed as they walked a 20-metre straight distance on level ground at their normal, comfortable speed. If the participant normally used a walking aid they were asked to try without it but, if they felt unable to do so, they completed the test using the walking aid. The recorded outcome was the time taken to complete the test.
Timed get-up-and-go test289
Participants sat on a height-adjustable chair such that a 90° angle was formed when the femur was horizontal and the tibia vertical, with their feet shoulder width apart and their arms crossed against their chest. Participants were timed as they stood up from the chair without using their hands, walked at a normal pace past a marker 3 metres away, turned around, walked back and sat down again. The recorded outcome was whether or not participants were able to complete the activity and how long it took.
Sit-to-stand-to-sit290
Participants sat on a height-adjustable chair as described for the previous test. Participants then stood up, waited 2 seconds and sat down again without using their hands. The recorded outcome was whether or not participants were able to complete the activity.
Step test291
Participants stepped up onto a 20-cm-high block leading with the contralateral leg, waited 2 seconds and then stepped down from the block with the index leg leading, without using their arms. The test was then repeated with the index leg leading. If participants successfully completed this test, the test was repeated with a 30-cm-high block. The recorded outcome was whether or not participants were able to complete the activity.
Single stance balance test292
Participants stood with their feet together facing the research nurse and placed their palms gently on top of the research nurse’s palms. Participants then lifted their index leg and attempted to balance on their contralateral leg for 15 seconds. If the participant lost balance within 3 seconds, then the test was reattempted. If the participant lost balance before 15 seconds, the length of time was recorded. This test was then repeated while balancing on the index leg. If these tests were completed successfully, the participant repeated the tests with no stability support from the research nurse. The recorded outcome was whether or not participants were able to maintain the stance for 15 seconds.
Inertial sensor-based motion and gait analyses
Movement analysis by body-fixed inertial sensors containing accelerometers and gyroscopes enables the objective assessment of the translational and angular movements of body segments outside a gait laboratory.293,308 We used a single three-dimensional (3D) inertial sensor [41 × 63 × 24 mm; 39 g; Microstrain Inertia Link (Williston, VT)] containing accelerometers (± 5 g) and gyroscopes (± 300°/second) along the three orthogonal axes in the frontal, sagittal and transverse planes, positioned centrally between both posterior superior iliac spines to measure trunk movements near the centre of gravity. Based on the 3D linear accelerations, angular rates and angular positions output by the sensor and sent wirelessly to a computer at a 100 Hz sampling frequency via a Bluetooth® (Bluetooth SIG, Inc., Kirkland, WA) link, analysis algorithms calculated motion parameters such as step frequency, step asymmetry or trunk sway.
The inertial sensor was used to derive motion parameters from a battery of movement tasks which were clinically feasible to perform during a routine outpatient visit and which challenged the patient’s functional capacity in different ways: (1) locomotion (walking), (2) transfers (sit-to-stand-to-sit test, get-up-and-go test), (3) rising and descending (step test) and (4) balance tests (single-leg stance). The walk test309 and the step test were repeated twice and the sit-to-stand-to-sit test was repeated three times to derive representative mean values or study possible effects of fatigue or warming up.
Data collection
Information on BMI, diagnosis, side of surgery, type of surgery and surgical approach was extracted from participants’ medical records.
Data recording
All data were entered into a password-protected database by research nurses or study administrators. The study was overseen by an independent Steering Committee which met every 6 months to discuss the progress of the study. Data monitoring, which involved double data entry and quality checks, was conducted every 3 months. All data were cleaned before data analysis was performed. Any inconsistencies were collegially discussed by an internal board of researchers involved in the data collection.
Sample size
This study involved exploratory analysis to compare different measures to assess function after total hip or knee replacement. Therefore, no formal sample size calculation was performed, although we aimed to recruit a sufficient number of patients to allow meaningful data analysis. We approached all patients listed for surgery with participating surgeons between February 2010 and November 2011. Previous longitudinal studies comparing measures of function in an orthopaedic population have included between 30 and 200 patients.310–318
Analysis
The statistical methods used to analyse the data are described in each section of the results, which have been divided into three sections:
- baseline demographic data
- cross-sectional correlations of baseline data
- analysis of change from longitudinal data including preliminary results from the gait analysis.
Results
Demography of the cohort
A total of 130 patients receiving hip replacement and 133 patients receiving knee replacement were recruited to the ADAPT study.
The 130 patients listed for hip surgery were scheduled to receive a primary THR (n = 78), revision THR (n = 44) or hip resurfacing (n = 8). The 133 patients listed for knee surgery were scheduled to receive a primary TKR (n = 51), revision TKR (n = 45), unicompartmental knee replacement (n = 32) or patellofemoral replacement (n = 5). The five patients awaiting patellofemoral joint replacement were excluded from the cross-sectional analysis owing to the isolated nature of their knee disease.
Not all data were available on all 258 patients and at each measurement point, so the subsequent analyses reported below are often on slightly smaller groups.
Patient demographics for the 249 participants with available pre-operative data (125 listed for hip replacement and 124 listed for knee replacement) are summarised in Table 18.
Cross-sectional analysis of the different measures of function
Introduction
As noted above, one of the main aims of this study, within the overall programme, was to improve our understanding of the best ways of measuring function before and after hip or knee joint replacement. In this section of the results, we describe the correlations between the different measures of function for which we have data at the baseline visit. We also describe the disparities and similarities in the associations between these measures and patient characteristics. Investigating these issues provides insight into how the outcomes measured by these various tools can be interpreted and sheds light on the comparability of the tests. The data should also help those investigating disability caused by severe hip and knee pathology to make an informed choice of measurement tool.
Statistical analysis
The relationships between the different functional measures were assessed with correlation statistics. Spearman’s rank-order coefficients were used to assess correlations between continuous variables and point-biserial coefficients to assess correlations between continuous and dichotomous variables. These measures range from –1 to 1. The strength of correlation was interpreted as |0.00|–|0.25| = none–little, |0.26|–|0.49| = low, |0.50|–|0.69| = moderate, |0.70|–|0.89| = high, |0.90|–|1.00| = very high. Correlations between two binary measures were assessed with Cramér’s V-statistic, which ranges from 0 to 1. A value > 0.3 was considered very strong.
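The interpretation bands above can be expressed as a small lookup. The sketch below is a standard-library illustration rather than the Stata routines used in the study: it computes Spearman’s coefficient as the Pearson correlation of average ranks and maps the result to a band. The function names are hypothetical, and rounding the coefficient to two decimal places before banding is an assumption made to close the gaps between the stated bands.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's coefficient: Pearson correlation of the ranks.

    Not defined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def strength_band(r):
    """Map a coefficient to the bands above (rounding to 2 d.p. is an assumption)."""
    size = round(abs(r), 2)
    if size <= 0.25:
        return "none-little"
    if size <= 0.49:
        return "low"
    if size <= 0.69:
        return "moderate"
    if size <= 0.89:
        return "high"
    return "very high"
```

Because the bands are defined on the absolute value, a strong negative correlation (for example, between a timed test and a score where higher means better) falls into the same band as a positive one of equal size.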
The association between participants’ characteristics and functional measures was investigated with linear regression for HHS, AKSS, WOMAC function, Aberdeen activity limitation subscale (Ab-A), Aberdeen participation restriction subscale (Ab-P) [transformed as the square root of the score], walking speed and get-up-and-go tests (transformed as 1/time). The step and balance tests produced dichotomous outcomes (able/unable to do test) and were investigated with a modified Poisson regression with robust variance estimation.
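The transformations mentioned above are simple to state in code. The sketch below is illustrative only: the function names are hypothetical, reading the score transformation as a square root is an assumption, and the 20-metre default comes from the timed walk described earlier. It also shows that, for the walk, 1/time is walking speed up to a constant factor.

```python
import math


def sqrt_transform(score):
    """Square-root transform for a non-negative questionnaire score."""
    if score < 0:
        raise ValueError("score must be non-negative")
    return math.sqrt(score)


def reciprocal_time(seconds):
    """1/time transform for a timed test; larger values mean faster."""
    if seconds <= 0:
        raise ValueError("time must be positive")
    return 1.0 / seconds


def walking_speed(seconds, distance_m=20.0):
    """Speed in m/s over the 20-metre walk; equals distance * (1/time)."""
    return distance_m * reciprocal_time(seconds)
```

A reciprocal transform reverses the direction of a timed outcome, so after transformation higher values indicate better performance.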
Individual patient characteristics were first considered in a univariate model. Factors that were found to be significant (p ≤ 0.05) were then investigated in a multivariate analysis to determine if their effects were confounded by other factors.
The analyses were conducted separately for hip and knee patients. Although few participants had missing information for one or more of the considered variables, missing data were addressed using a multiple imputation by chained equations approach. Ten imputations were generated and estimates were combined using Rubin’s rules. Statistical analyses were performed in Stata 12.1.
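Rubin’s rules combine the results from each imputed data set into one estimate and one variance. The sketch below shows the standard combination for a single coefficient in plain Python; it is not the Stata code used in the study, and the function name `pool_rubin` and any input values are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates and squared standard errors.

    Returns (pooled_estimate, total_variance), where
    total = within + (1 + 1/m) * between.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching inputs")
    pooled = mean(estimates)
    within = mean(variances)        # average sampling variance
    between = variance(estimates)   # sample variance across imputations (m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total
```

With ten imputations, as used here, the between-imputation variance is inflated by a factor of 1.1 before being added to the within-imputation variance.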
Results
The mean/median and range of scores for each of the functional measures are shown in Table 19. These data show that most participants had significant functional limitations, although the wide range of each of the measures suggests considerable variability.
Relationships between functional measures
Correlations between the different functional measures were all statistically significant, but some were much stronger than others (Table 20). The HHS correlated relatively well with PROMs and with walk time in patients with hip disease, but not so well with the other performance tests. The AKSS correlated poorly with all other types of functional measures in patients with knee disease. The highest correlations, in both hip and knee patients, were between the WOMAC and Ab-A scores – the two PROMs for disability; between the walking speed and the get-up-and-go test – the two timed tests; and between the balance test and 30-cm step test.
Associations between patient characteristics and functional measures
Associations between patient characteristics and the different functional measures are shown in Tables 21 (hip) and 22 (knee). Pain was an important determinant of all measures of function in both patient groups. In contrast, age, sex and comorbidities discriminated between hip and knee patients as well as between the different methods of assessing disability. Sex affected most measures of disability in hip patients, but not in knee patients. Age affected the performance tests more than the PROMs or clinician-administered tests, whereas anxiety and depression had much more effect on the PROMs and clinician-administered measures than on the performance tests. BMI did not appear to be important, and other comorbidities had more effect on tests of function in people with knee disease than in those with hip disease.
Discussion
This study compared different ways of assessing function in patients awaiting hip or knee replacement. Correlations were stronger within the same type of measures (PROMs, clinician-administered or performance test) than between approaches. Correlations that were usually < 0.9 imply that each of these measures describes a slightly different construct of function and that several of them would be needed to provide an accurate and exhaustive assessment of function. Nevertheless, the WOMAC function, Ab-A, HHS, walking test and the get-up-and-go test had satisfactory convergent validity (correlation ≥ 0.3). This suggests that each of these measures can individually provide a reasonably comprehensive description of function if it is possible to conduct only one test. However, the AKSS and the balance test correlated poorly with most of the measures and should not be used alone. These findings are consistent with previous research which found moderate to strong (> 0.4) correlations between the WOMAC function and stair climbing, walking or the get-up-and-go test.319–323 Moderate to strong correlations have also been reported pre-operatively between the WOMAC function and the HHS or its components,319,321,324 and small to moderate correlations have been found between the WOMAC function and the AKSS.319,325,326 Our finding of a strong relationship between the Ab-A and WOMAC function score is not surprising as the Ab-A is based on several items of the WOMAC.285 In this study, the Ab-A measure had slightly better correlations than the WOMAC function with all the other measures. This suggests that it may be the preferred tool for assessment of function in this population.
Previous inconclusive studies exploring the association between the HHS or AKSS and performance tests were based on moderate sample sizes, and mainly focused on associations between joint ROM and performance tests.285,327–329 Our study highlighted that associations between patient characteristics and function differed according to the measurement approach used. For example, obesity was associated with poor AKSS but not with functional outcomes as measured any other way.
Responses to PROMs are influenced by factors including age, sex, mental health or socioeconomic characteristics.322,330–333 Clinical assessments can also be influenced by patients’ characteristics; for instance, fat mass and bony structure affect the reliability and validity of extremity measurements,334 while age and vulnerability may influence communication with health professionals or interviewers.335 Performance testing may not always assess ADL of relevance to an individual and may not take into account environmental or behavioural adaptations.336 Tests are also likely to be confounded by factors such as sarcopenia, which in turn can be influenced by other patient characteristics such as activity or self-efficacy.337
Although pain is a major determinant of function irrespective of measurement method, we found that psychological health influenced self-assessment more than performance-based methods. In addition, age affected performance measures, but not self-assessment. This has several implications. First, a causal investigation of function will be accurate, exhaustive and corroborative only if conducted simultaneously with several measures of function. Second, the investigation of any risk factor of function should be adjusted for the patient’s psychological status (if a self-assessment measure is used) or for patient age (if a performance test is used), and in both cases for pain. Third, any comparison of measures of function obtained with different measurement methods is flawed unless the effects of pain, age and psychological status are considered. Fourth, there is an age-related decline in function when measured objectively, but this is not evident on PROMs. Fifth, the effect of psychological factors on self-reported function, but not on objective measures, indicates that psychological status influences the perception of function more than the ability to do something; patients may be able to do more than they say they can do and may need encouragement to overcome anxiety. Finally, it seems that any assessment of function should be accompanied by pain assessment to obtain unconfounded assessment. The association of pain with function, even after taking into account the age and psychological status of the patients, confirms the lack of discriminant validity of currently used functional measures. This is to be expected with the clinician-completed HHS and AKSS, which include a pain component. It has also been observed previously between self-reported measures of pain and function.318 However, the association with the performance tests is more problematic as even the most ‘objective’ measures of function are confounded by pain.
These findings were obtained from patients at a single-centre orthopaedic unit; however, this is a large sample with a representative age range undergoing a diversity of procedures. The study also focused on a discrete number of assessment measures and did not include measures such as the OHS, OKS, KOOS or HOOS. Measures were selected to include a broad range of tools that could be administered at the same time alongside demographic information. Through this, we were able to compare the measures and the influence of patient characteristics effectively.
Conclusion
Our study shows that associations between patient characteristics and function differed according to the measurement approach used. Measures of pain and psychological health could be routinely used alongside self-report of activity limitations to enable appropriate adjustments. Performance-based tests are strongly influenced by age, possibly owing to age-related sarcopenia. If this is correct, for research purposes the inclusion of a simple muscle strength test, such as grip strength, alongside performance-based methods might aid interpretation of the findings.
Cross-sectional analysis of joint range of motion and its relevance to functional measures
Introduction
Range of motion is routinely assessed in orthopaedic surgery. Measures of ROM are included in both the AKSS113 and HHS.268 However, the relationship between ROM and function is contested, with some authors regarding ROM as a good determinant of function329 but others reporting poor correlations.317,338 In view of ongoing use of ROM and continuing uncertainty about its relationship with function, this analysis of the ADAPT data was undertaken to investigate the relationship between ROM and our other measures of function. In this analysis we have also specifically looked at the different domains of function as described in the WHO ICF, that is, we have analysed the relationships between ROM and impairment, activity limitations and participation restriction separately.
Statistical analysis
Analyses were conducted separately for patients listed for hip and knee replacement. Spearman’s rank-order correlation coefficients were used to assess correlations between continuous variables. Point-biserial correlation coefficients were used to assess correlations between continuous and dichotomous variables. These correlation measures range from –1 to 1. The strength of correlation was interpreted as |0.00|–|0.25| = none–little, |0.26|–|0.49| = low, |0.50|–|0.69| = moderate, |0.70|–|0.89| = high, |0.90|–|1.00| = very high. Linear regression was conducted to adjust for the effect of demographic factors (age, sex, socioeconomic status, joints affected by arthritis, comorbidities and psychological status) on the relationship between WOMAC pain and self-report activity limitations. To adjust for the effect of demographic factors on the relationship between WOMAC pain and participation restrictions, the participation restrictions scale of the Aberdeen impairment, activity limitation and participation restriction measure (Ab-P) was transformed with a square root function to comply with the assumptions of the linear model.
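The interpretation bands for correlation strength can be expressed as a small lookup, sketched here in Python. The function name `correlation_strength` is hypothetical, and rounding the coefficient to two decimal places before banding is an assumption made because the bands are stated to two decimals.

```python
def correlation_strength(r):
    """Label a correlation coefficient using the bands stated in the text."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    size = round(abs(r), 2)  # assumption: band on the two-decimal reported value
    if size <= 0.25:
        return "none-little"
    if size <= 0.49:
        return "low"
    if size <= 0.69:
        return "moderate"
    if size <= 0.89:
        return "high"
    return "very high"


# The sign is ignored: only the magnitude determines the band.
labels = [correlation_strength(r) for r in (-0.71, -0.53, 0.43, 0.11)]
```

For the coefficients above, the labels are high, moderate, low and none-little, respectively.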
To compare functional measures between patients with low and high active flexion, patients were dichotomised into those with low flexion (< 110° for knee patients and < 95° for hip patients) and those with high flexion (≥ 110° for knee patients and ≥ 95° for hip patients). This cut-off was chosen because 90° of hip and knee flexion is required when rising from sitting to standing in order for the centre of gravity in the sagittal plane to transfer from behind the midline (in sitting) to in front of the midline (in standing). Continuous variables were compared between these two groups using unpaired t-tests or Mann–Whitney U-tests for non-normally distributed variables. Categorical variables were compared using chi-squared tests. Statistical analysis was performed using Stata 12.
Results
Relationship between measures of impairment and activity limitations
Correlations between the measures of impairment (ROM and WOMAC pain) and measures of activity limitations [WOMAC function, activity limitations scale of the Aberdeen impairment, activity limitation and participation restriction measure (Ab-A), performance tests] are displayed in Table 23.
Hip and knee ROM correlated weakly with self-report (Spearman’s rank-order correlation coefficients ranging from |0.11| to |0.43|) and observed activity limitations (|0.09| to |0.38|). In comparison, correlations between pain and self-report activity limitations were moderate to high (|0.63| to |0.80|) and remained so after adjustment for demographic factors (data not shown). However, correlations between pain and observed activity limitations were low (|0.13| to |0.44|). Correlations between individual WOMAC function items and ROM measurements were investigated to determine if ROM correlated with specific functions. All correlations were found to be low (|0.01| to |0.40|). The highest correlation in patients listed for hip replacement was between flexion and getting on/off toilet (–0.37) and in patients listed for knee replacement it was between flexion and getting in/out of a car (–0.40) and putting on socks/stockings (–0.40).
Relationship between measures of impairment and participation restrictions
Correlations between measures of impairment and participation restrictions (Ab-P) are displayed in Table 23. Hip and knee ROM correlated poorly with participation restrictions (|0.06| to |0.32|). In comparison, correlations between pain and participation restrictions were high in patients listed for hip replacement (–0.71) and moderate in patients listed for knee replacement (–0.53), and these correlations remained strong after adjustment for demographic factors.
Comparison of functional measures between patients with low and high active flexion
Patients listed for knee replacement with low flexion had significantly worse results on all measures of impairment, activity limitations and participation restrictions than patients with high flexion (Table 24). Patients listed for hip replacement with low flexion had significantly worse activity limitations than patients with high flexion.
Discussion
The WHO ICF model offers a theoretical framework for describing and assessing disability. The data from this study show that in patients listed for joint replacement, there is a poor relationship between ROM and any of the disability measures used in this study, which contrasts with the strong relationship found between pain, activity limitations and participation restrictions. Previous studies have arrived at discordant conclusions about the relationship between function and ROM. Some reports suggest that ROM is an important determinant of function321,329 while others disagree.317,338 Furthermore, it is suggested that ROM is important for some specific functions, or that a threshold of around 95–100° of flexion is required for adequate function.338 Our data suggest that there may be such a threshold, but that ROM does not correlate with specific activities on the WOMAC function and modest restrictions of ROM are of little relevance to functional outcomes.
These findings are important for two reasons. First, commonly used methods of assessing patients’ disability, such as the AKSS and HHS, include ROM. Second, many orthopaedic surgeons consider the achieved ROM of a replaced joint to be an important measure of surgical outcomes and discuss this with their patients. We suggest that as a measure of impairment, ROM is of little relevance to function and the only concern should be whether or not knee flexion is restricted to < 110° and, to a lesser extent, whether or not hip flexion is limited to < 95°.
Weaknesses of the study were the lack of randomisation of the order of the performance tests and inclusion of patients from only one specialist orthopaedic unit. However, by including patients listed for a range of joint replacement procedures, a diverse and varied sample was achieved. Strengths include the study’s relatively large size, the extent of and care taken with the measures of ROM and disability, and the good interobserver and intraobserver reliability for ROM.
Conclusion
These findings suggest that measuring ROM adds little value to assessment of impairment in patients undergoing joint replacement, unless hip or knee flexion is restricted to < 90°, and that ROM should therefore not be used to assess disability in a pre-operative context.
Longitudinal analysis of changes in the different measures of function over time
Introduction
One of the main aims of the ADAPT study was to assess the responsiveness of different measures of function and, in particular, to contrast the value of the three main approaches (self-assessment, clinician-administered tools and functional tests) in assessing change after joint replacement. An important part of such an analysis is to assess which measures might have an important ceiling effect, that is, which measures often reach their limits after joint replacement surgery, such that they cannot detect further improvement.
In this section we describe the changes seen in function between the pre-operative and 12-month postoperative assessment.
Statistical analysis
Analyses were conducted separately for patients listed for hip and knee replacement.
Change in functional ability (as measured by the WOMAC function subscale, the SF-12 physical function subscale, the Ab-IAP, the MYMOP2 score, the get-up-and-go test, the timed 20-metre walk, the HHS and the AKSS) was defined as the 12-month postoperative score minus the pre-operative score. Patients were categorised into three groups: those with deteriorated function, those with unchanged function and those with improved function.
To determine whether or not the individual changes were due to chance, we used the following approach. After transformation of the pre- and postoperative scores (using inverse, square root or logarithm function), the changes in physical function were normally distributed. Therefore, we used linear mixed models with random intercept and slope to regress the transformed outcomes on the time of assessment.339 We then determined if the individual changes between pre- and post-surgery assessments were different from 0 using the 95% CI around each participant’s trajectory. This was derived from the post-estimation of the above models using fixed effects and subject-specific random effects. A change whose 95% CI includes 0 is not statistically significant, that is, the observed change is no different from a flat trajectory. This approach was preferred over the traditional reliable change index measure as it does not depend on a deterministic external measure of test–retest reliability, which was not available for the studied scores. It also allows a better control of the regression to the mean effect by assuming that all scores are drawn from the same population distribution (shrinkage effect).
We also derived the minimal clinically important improvement (MCII) of each score, that is, the improvement in functional score between two time points likely to be important from the patient’s perspective. We used an anchoring question about participant satisfaction with recreational activities at 12 months post surgery284 to dichotomise the ADAPT participants into two groups (very or somewhat satisfied vs. somewhat or very dissatisfied). We then calculated the cut-off point (MCII) on the distribution of score change, using a receiver operating characteristic curve analysis, to determine the threshold maximising the sensitivity and specificity. Patients experiencing an improvement greater than or equal to this threshold were defined as having a clinically meaningful change in function.
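The anchor-based derivation of a cut-off can be sketched in Python as follows. The report does not state the exact optimisation criterion, so this sketch assumes Youden's index (sensitivity + specificity – 1); the function name `mcii_cutoff` and the data are hypothetical.

```python
def mcii_cutoff(changes, satisfied):
    """Return the change-score cut-off with the highest Youden's index, and that index.

    changes   -- score change per patient (higher = more improvement)
    satisfied -- True if the patient was satisfied on the anchoring question
    """
    n_satisfied = sum(satisfied)
    n_dissatisfied = len(satisfied) - n_satisfied
    if n_satisfied == 0 or n_dissatisfied == 0:
        raise ValueError("both anchor groups must contain at least one patient")
    best_cutoff, best_index = None, -1.0
    for cutoff in sorted(set(changes)):
        # A patient is classed as 'improved' when the change is >= cutoff.
        true_pos = sum(1 for c, s in zip(changes, satisfied) if s and c >= cutoff)
        true_neg = sum(1 for c, s in zip(changes, satisfied) if not s and c < cutoff)
        index = true_pos / n_satisfied + true_neg / n_dissatisfied - 1
        if index > best_index:  # ties keep the lowest cut-off
            best_cutoff, best_index = cutoff, index
    return best_cutoff, best_index


# Hypothetical change scores and anchor responses for seven patients.
changes = [5, 10, 15, 20, 30, 40, 50]
satisfied = [False, False, True, False, True, True, True]
cutoff, youden = mcii_cutoff(changes, satisfied)
```

In this invented example, a cut-off of 30 points gives a sensitivity of 0.75 and a specificity of 1.0.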
We also determined the observed ceiling effect for each score at 12 months post operation. The ceiling effect was defined as the percentage of patients with a score equal to the highest possible score. For example, a patient with a WOMAC function subscale score of 100 [score ranges from 0 to 100 (worst to best)] has reached the ‘ceiling’ of the score and is considered to have the maximum possible functional ability. For ease of interpretation, the lowest values of scores ranging in the opposite direction to the WOMAC function scoring system such as the Ab-IAP measures [Ab-I 0–36, Ab-A 0–68, Ab-P 0–36; best to worst] or the MYMOP2 score [0–6 (best to worst)] were considered as the ‘maximum score’.
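The ceiling-effect definition above amounts to a simple proportion, sketched here in Python. The function name `ceiling_percentage` and the example scores are hypothetical; the `best_score` argument covers scales that run in either direction (e.g. 100 for the WOMAC function subscale, 0 for the Ab-A or MYMOP2). Excluding missing scores from the denominator is an assumption.

```python
def ceiling_percentage(scores, best_score):
    """Percentage of observed scores equal to the best possible score."""
    observed = [s for s in scores if s is not None]  # assumption: missing scores excluded
    if not observed:
        raise ValueError("no observed scores")
    at_ceiling = sum(1 for s in observed if s == best_score)
    return 100.0 * at_ceiling / len(observed)


# Hypothetical 12-month scores.
womac_ceiling = ceiling_percentage([100, 90, 100, 75], best_score=100)
mymop_ceiling = ceiling_percentage([0, 3, None, 6, 0], best_score=0)
```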
Results
This analysis was undertaken on patients who participated in both the baseline and 12-month assessments and provided information on at least one of the functional measures. Of the total cohort (n = 263, 130 hip replacement and 133 knee replacement patients), there were 104 hip replacement and 101 knee replacement patients who had this full data set and were included in these analyses.
Change in scores
The scores for the various functional measures, at baseline and 12 months post operation, are shown in Tables 25 and 26. Tables 27 and 28 also record the changes in each of the scores between baseline and 12 months.
Table 25 demonstrates that, as expected, scores on the function measures improve from baseline to 12 months. People in this cohort had better function after surgery than before they had the surgical intervention.
Table 26 shows similar results, suggesting that people’s function improves after surgery. However, a particular feature of these data is the number of participants who were able to complete certain tasks. Nearly everyone could complete most of the tests, both before and after surgery. However, two of the tests appear to be more discriminatory – the 30-cm step test and the balance test. As shown in Table 26, only 61.5% of people with hip disease could do the 30-cm step test prior to surgery, and this improved to 82.4% postoperatively; the equivalent figures for those with knee disease were 40.0% and 70.3%, respectively. The balance test was also difficult for patients with knee disease; only 38.6% of patients could perform the test pre-operatively and 46.0% postoperatively. Those with hip disease could do a little better – 48.1% were able to do it pre-operatively, and 60.6% postoperatively.
Change for better or worse and clinically important improvements
The above scores tell only a part of the story. They do not provide information about what proportion of people got better or worse, or whether or not the improvements were clinically, as well as statistically, important. In order to answer these questions, we investigated the following:
- the number of patients with scores that indicated improvement, deterioration or no change in function between baseline and 12 months
- the numbers of patients in the improved or deteriorated function categories whose changes were statistically significant
- the proportion of patients for whom these changes reached clinical significance, using the anchoring satisfaction question.
The results of these analyses are shown in Tables 27 and 28. The data also allow us to further compare the degree of improvement following knee or hip replacement.
As shown in Tables 27 and 28, there were a small number of people whose function did not change or deteriorated after surgery. Overall, each of the functional assessment methods is telling a similar story of a few people getting worse (but rarely significantly so) and most getting better, often significantly better. The walking time was more likely to show deterioration than other tests and arguably the ‘get-up-and-go test’ showed more differentiation between changes for the better or worse than other tests. Overall, patients having hip surgery were more likely to improve functionally than those having knee replacement, with most of the measures used, although it is interesting to note that walking speed changes were very similar in both groups.
We also looked for ceiling effects on each of the functional measures at baseline and 12 months, as shown in Tables 29 and 30.
The striking finding here is that many of the self-assessment questionnaires that are routinely used to assess function in people with lower limb osteoarthritis show an important ceiling effect in response to joint replacement. The problem is particularly evident in the assessment of function after hip replacement; the WOMAC function subscale, SF-12 physical function measure and all three domains of the Aberdeen scale all reach a maximum score in between 20% and 50% of patients postoperatively. Rather fewer patients reached the maximum score after knee surgery, but the problem is a very real one for this intervention as well (between 8% and 22% of patients reaching the maximum on one or other of the scores). Timed tests such as the walk time or ‘get-up-and-go’ test, cannot, by definition, suffer from this problem and it is interesting to note that very few people achieved ‘top marks’ on either the HHS or AKSS.
Discussion
It is well known that, on average, people undergoing a hip or knee joint replacement get some functional benefit.340 Our data support this, showing that, on average, there was a large improvement in the functional scores between baseline and 12 months after surgery. Our data also confirm findings from previous research that those undergoing hip surgery can, on average, expect more improvement in function than those having a knee replacement.341,342
However, average scores obscure the fact that there can be big differences in the change and that some people may experience a decline in function. Our data suggest that very few people get worse, although quite a lot of those who improve are not achieving a level of improvement that can be called clinically, rather than statistically, significant. For example, if we examine the data in Table 27 carefully, they tell us that the WOMAC function score improved in 93% of people having hip replacement and 86% of those having knee replacement, and that this change was statistically significant in 90% and 82%, respectively. However, the change for the better was clinically significant in only some 70% of those having hip replacement and 61% of the knee replacement patients. Clearly, there is a need to be cautious in relation to the information given to people when they have a joint replacement regarding expected functional outcomes (as opposed to pain improvement).
The data do suggest that there are important differences in what is being assessed by self-report, clinician-administered tools and functional assessments, as was apparent in our cross-sectional data. An important finding in relation to that is the fact that the self-assessment tests often used by rheumatologists to detect change in response to non-surgical interventions (e.g. WOMAC and SF-12) show a big ceiling effect when used in the surgical setting.
Assessment of the trajectories of change
Introduction
The analysis presented in the previous section (see Longitudinal analysis of changes in the different measures of function over time) showed that the majority of participants experienced an improvement in function after joint replacement. To supplement and extend this work, we undertook further analyses to investigate participants’ trajectories of recovery after joint replacement surgery, using data collected pre-operatively and at 3 months and 12 months after surgery.
First, we explored the trajectory of recovery, in terms of pain and function, in the first year postoperatively, with a particular focus on patients undergoing revision joint surgery. By including patients listed for different sorts of surgical procedures, the ADAPT cohort study allowed us to investigate the specificities of pain and function changes following revision surgery. We then investigated how recovery of pain and function were interrelated. In these analyses, we used both a self-reported (WOMAC function) and an objective measure of function (time to complete a 20-metre walk test) to identify any disparities in recovery pattern induced by the assessment measure. Pain is a subjective experience and, therefore, we used the self-report WOMAC pain score for assessing pain severity. Finally, we present findings from a gait analysis study conducted on primary total hip replacement to shed light on the above comparison of self-reported and objective findings.
Statistical analysis
Analyses were conducted separately for patients undergoing hip and knee replacement surgery using Stata 13.1 (StataCorp LP, College Station, TX, USA).
Pain and function were analysed jointly using a multivariate linear mixed (MLM) regression with random intercepts and slopes. This approach allows the modelling of the longitudinal trajectory (patients’ trajectory) of each outcome measure and the assessment of the correlations within and between these outcome measures (correlation structure) in a single regression framework while providing unbiased estimations in a context of missing data (under the missing at random assumption). Postoperative change over time was modelled as two linear splines: one spline for the ‘immediate change’ occurring between the pre-operative assessment and the first postoperative assessment (3 months) and another spline for the ‘long-term change’ occurring between the two postoperative assessments (3 and 12 months). These changes were normally distributed for each outcome allowing the use of MLM regression. However, the postoperative function and pain scores were not normally distributed preventing the use of this modelling framework to compare participants’ scores at specific postoperative time points. Mann–Whitney U-tests were used for this purpose. The strength of correlation between parameters was interpreted as |0.00|–|0.25| = none–little, |0.26|–|0.49| = low, |0.50|–|0.69| = moderate, |0.70|–|0.89| = high, |0.90|–|1.00| = very high. p-values of ≤ 0.05 were considered statistically significant.
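The two-spline parameterisation of time described above can be sketched in Python. This is an illustration only, not the Stata model used in the study: the knot at day 90 is an assumption standing in for the 3-month assessment, the function names are hypothetical, and the intercept and long-term slope in the usage lines are invented. The immediate slope of 0.24 points/day is taken from the hip WOMAC function result reported in this section.

```python
KNOT_DAYS = 90  # assumption: the 3-month assessment is taken as day 90


def spline_terms(days, knot=KNOT_DAYS):
    """Split time since the pre-operative assessment into two linear spline terms."""
    immediate = min(days, knot)      # accrues only up to the first postoperative assessment
    long_term = max(days - knot, 0)  # accrues only after the first postoperative assessment
    return immediate, long_term


def predicted_mean(days, intercept, slope_immediate, slope_long_term, knot=KNOT_DAYS):
    """Fixed-effects mean trajectory implied by the two spline slopes."""
    immediate, long_term = spline_terms(days, knot)
    return intercept + slope_immediate * immediate + slope_long_term * long_term


# Hypothetical intercept of 50 points and a flat long-term slope.
score_at_3_months = predicted_mean(90, intercept=50.0, slope_immediate=0.24, slope_long_term=0.0)
score_at_12_months = predicted_mean(365, intercept=50.0, slope_immediate=0.24, slope_long_term=0.0)
```

With a long-term slope of zero, the predicted mean at 12 months equals the predicted mean at 3 months, which corresponds to the plateau pattern described for the self-reported scores.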
The MLM models were conducted on WOMAC pain and WOMAC function and replicated on WOMAC pain and time to complete the 20-metre walk test to investigate disparities in findings according to the method of functional assessment (self-reported vs. objective). The inverse of the 20-metre walk test completion time (Time⁻¹) was derived to facilitate the comparison of the effects between outcomes (lower scores: worse function/pain score/time of test completion; higher scores: better function/pain score/time of test completion).
To assess if the pattern of changes differed by surgery type, patients were split into two groups: primary surgeries (including primary total hip and knee surgeries, knee unicompartmental and patellofemoral surgeries) and revision surgeries. WOMAC pain, WOMAC function and Time⁻¹ were modelled separately with univariable linear mixed (ULM) regressions. It was not possible to model all these outcomes in a single multivariate framework as the numbers of patients in the subgroups were not sufficient to fit an MLM (hip: revision, n = 43; primary, n = 80; knee: revision, n = 42; primary, n = 84). The ULM models were adjusted for the two time splines defined above, surgery type, their interaction and random effects on each of these parameters. Differences in the immediate- and long-term changes by surgery type (primary vs. revision) were tested using appropriate contrasts. A similar approach was used to investigate the influence of pre-operative pain/function score on the postoperative recovery pattern: patients were split into groups of high or low level of pre-operative pain using the pre-operative WOMAC pain median as a cut-off point (hip: median = 55; knee: median = 40); they were also split into groups of high or low level of pre-operative functional ability using the pre-operative WOMAC function and time to complete the 20-metre walk test medians as cut-off points (hip: median WOMAC function = 56, time to complete the 20-metre walk test = 22⁻¹ seconds; knee: 50 and 22⁻¹ seconds, respectively).
Results
A total of 123 hip replacement participants were considered. Of these patients, 80 (65%) had a primary THR and 43 (35%) had a revision hip replacement. They had a mean age of 65 years (SD 11 years) and BMI of 28 kg/m² (SD 5 kg/m²). Approximately half were female (n = 62), approximately half were retired or unemployed (n = 67) and 24% (n = 30) were living alone. Approximately 75% (n = 91) had osteoarthritis in at least one other joint.
Of the 123 hip replacement participants with a pre-operative assessment, 121 (98%) completed a WOMAC pain and function measure and 118 (96%) performed the timed 20-metre walk test. Of the 112 (91%) patients who participated in a 3-month assessment, all completed the WOMAC pain and function scores and 107 (87%) completed the 20-metre walk test. At 12 months, 110 (89%) hip patients were still in the study, 109 (89%) completed the WOMAC function score, and 108 (88%) completed the WOMAC pain score and the 20-metre walk test.
A total of 126 knee replacement participants were considered. Of these patients, 48 (38%) had a primary TKR, 31 (25%) a unicompartmental knee replacement, 5 (4%) a patellofemoral knee replacement and 42 (33%) a revision knee replacement. They had a mean age of 67 years (SD 10 years) and a BMI of 31 kg/m² (SD 6 kg/m²). Approximately 55% (n = 69) were female, 66% (n = 83) were retired or unemployed and 29% (n = 37) were living alone. Approximately 83% (n = 104) had osteoarthritis in at least one other joint.
Of the 126 knee replacement participants with a pre-operative assessment, 123 (98%) completed the three outcome measures. Of the 115 (91%) patients who participated in a 3-month assessment, 113 (90%) completed the WOMAC function score and 114 (91%) the WOMAC pain score and the 20-metre walk test. At 12 months, 112 (89%) patients were still in the study, 111 (88%) completed the WOMAC scores and 102 (81%) the 20-metre walk test.
Pain and function measures at the different assessment points are presented in Table 31.
Hip replacement: change in function and pain
As expected, both function (Figure 13a) and pain scores (Figure 13b) improved after surgery. These improvements occurred mainly within the first 3 months following surgery [Table 32: +0.24 WOMAC function points/day (p < 0.001); +0.28 WOMAC pain points/day (p < 0.001)]. There was no evidence of further changes after 3 months (see Table 32: WOMAC function and WOMAC pain, p-values = not significant). A similar pattern of recovery was observed with the objective measure of function (Figure 13c and Table 32) with a statistically significant mean immediate change (+0.002 seconds⁻¹/month; p < 0.001) but no significant long-term mean change (p = 0.057). This latter effect is close to the 0.05 significance level, suggesting a marginal effect, but a larger sample would be required to provide a more definitive answer.
The mean trajectories are derived from the fixed effects of linear mixed models regressing WOMAC pain, WOMAC function and Time⁻¹ to perform the 20-metre walk test on the time of assessment parameterised as two linear splines (to assess immediate changes and long-term changes, see Table 32, footnotes f and i).
This overall pattern of change in self-reported function was observed in both primary and revision surgery patient groups (Figure 14a). However, the immediate change was twice as large (p < 0.001) for the primary (+0.28/day, 95% CI 0.24 to 0.33; p < 0.001) as for the revision (+0.15/day, 95% CI 0.10 to 0.20; p < 0.001) surgery group. No significant long-term improvement in function was observed after 3 months for either group. As a result of the different pace in immediate recovery, and despite similar levels of pre-operative WOMAC function scores, the median level of functional ability observed at 12 months post operation was higher in the primary surgery group (p = 0.01).
Similar results were observed for WOMAC pain (Figure 14b). Those patients listed for a primary surgery had more pain pre-operatively than those in the revision surgery group (p = 0.03), but their immediate improvement was twice as large (+0.33/day, 95% CI 0.29 to 0.38, vs. +0.17/day, 95% CI 0.11 to 0.22); long-term mean changes in pain were not significant for either group. At 12 months, the primary surgery group had caught up with the revision group and had similar pain score levels (p = not significant).
In contrast to the two other outcomes, the immediate improvements in walking time were similar for both surgical groups but their changes in objective function after 3 months were different (p = 0.05), being nearly flat for the revision group (p = not significant) whereas patients in the primary surgery group continued to experience an improvement in their function (p < 0.01). Both groups had the same pre-operative walking speed, but at 12 months the primary surgery group did better (p < 0.01).
Hip replacement: correlation structure between and within pain and function
Two sets of correlations are presented in Table 32, one relating to the joint MLM modelling of pain and self-reported function (‘self-reported model’) and another relating to the modelling of pain and objective measure of function (‘objective model’).
Participants tended to report concordant levels of pre-operative pain and functional disability (correlation: +0.78), and their immediate (+0.76) and long-term (+0.64) improvements in pain and function moved in the same direction. When an objective measure of function was considered, these correlations were weaker or non-existent (0.39, 0.49 and 0.05). With regard to the ‘functional improvement journey’, high pre-operative functional disability was correlated with large functional improvement within the first 3 months following surgery and those with more favourable pre-operative functional scores had smaller immediate functional gain (–0.61 in the ‘self-reported’ and –0.47 in the ‘objective’ models). This evidence is illustrated in Figure 15a and b. Participants in the low pre-operative function group had an immediate functional improvement nearly 2.5 times larger than those who were in the high function group (difference in slope between high/low groups: p < 0.0001 for both self-reported and objective function). As reported in Table 31, the two groups of patients had statistically significantly different levels of pre-operative function but had similar levels of functional ability 12 months after surgery (WOMAC function). A significant difference was still observed after surgery when an objective measure of function was considered but the gap had reduced (see Figure 15b).
No relationship was observed between the pre-operative functional scores and long-term changes (see Table 32, correlations < 0.25). Large/small immediate functional improvements were correlated with small/large long-term functional improvement (–0.36 in the ‘self-reported’ and –0.33 in the ‘objective’ models).
With regard to the pain improvement trajectory, a similar picture was found, with evidence of relationships between the pre-operative scores and immediate changes (–0.56) and between the immediate- and long-term changes (–0.56). Participants with a high level of pre-operative pain had an immediate change that was twice as large as those in the low-pain group (Figure 15c, difference in slope between high/low groups: p < 0.001) and the pre-operative difference in pain severity was no longer present at 12 months postoperatively (see Table 31).
Pain and function were inter-related and appeared to influence each other’s ‘recovery journey’, as shown by the correlations between them. The patient-reported pre-operative level of functional ability was related to immediate change in pain (–0.45), with higher pain improvement for patients with worse pre-operative function and lower pain improvement for those with better baseline functional ability. This relationship was not observed when function was objectively measured (objective model). Similarly, the level of pre-operative pain was related to the immediate changes in self-reported function (–0.36) as well as with immediate changes in objective function (–0.45). Pre-operative self-reported function did not seem to be correlated with long-term changes in pain and pre-operative pain did not seem to be correlated with long-term changes in self-reported function. Immediate self-reported functional improvement was correlated with long-term pain improvements (–0.38), with smaller long-term changes for those who had large immediate changes and larger long-term changes for those who had small immediate changes. Similarly, immediate improvement in pain was correlated with long-term improvement in perceived functional ability (–0.25). These relationships were not observed in the ‘objective model’.
Knee replacement: change in function and pain
Patients experienced an improvement in function (Figure 16a) and pain (Figure 16b) after their knee surgery. The improvements occurred mainly within the first 3 months following surgery (Table 33: +0.18 of WOMAC function point/day, p < 0.001; +0.21 WOMAC pain point/day, p < 0.001). There was no evidence of further changes after 3 months for WOMAC function (see Table 33: p = not significant) but there was some suggestion of further long-term improvement for pain (p = 0.051). Contrary to the self-reported measure of function, the time to complete the 20-metre walk test (Figure 16c and see Table 33) improved significantly until the 12-month assessment [immediate change +0.001 seconds⁻¹/month (p < 0.001); long-term change +0.0002 seconds⁻¹/month (p < 0.01)], with steeper improvement in the first 3 months (difference between the two slopes, p = 0.04).
The differences in improvement patterns by surgical type group are presented in Figure 17. Patients in both the revision and primary groups experienced significant improvements within the first 3 months following their surgery in their subjective (Figure 17a; p < 0.0001 for both groups) and objective (Figure 17b, revision surgery, p = 0.02; other surgery, p = 0.01) measures of function. Although these immediate improvements in function were higher in the primary surgery group, the difference was not statistically significant (p > 0.05). Pain improved for both groups during the first 3 postoperative months (Figure 17c) but at a slower pace for the revision surgery group (+0.14/day, 95% CI 0.08 to 0.20; p < 0.0001) than for the primary surgery group (+0.25/day, 95% CI 0.21 to 0.29; p < 0.0001). No evidence of function or pain change was observed between 3 and 12 months, except in patients in the primary surgery group who experienced an improvement in their 20-metre walk test time completion (+0.0009 seconds⁻¹/month; p = 0.01). Pre-operative levels of pain and function (subjective and objective) were similar in both surgery groups (p > 0.05), but at 12 months those who had a revision surgery had worse median scores (WOMAC function, p < 0.02; WOMAC pain, p < 0.01; 20-metre walk test time, p = 0.03).
Knee replacement: correlation structure between and within pain and function
The correlation structures of the ‘self-reported’ and ‘objective’ models are presented in Table 33.
Participants tended to report concordant levels of pre-operative pain and functional disability (+0.81), and their immediate (+0.80) and long-term (+0.64) improvements in pain and function moved in the same direction. When an objective measure of function was considered, the corresponding correlations were much weaker or negligible (+0.29, +0.33 and –0.04).
With regard to the ‘functional improvement journey’, low/high pre-operative function scores were correlated with high/low immediate improvement in function (self-reported model –0.26, objective model –0.31). Figure 18a and b show steeper immediate improvements for those in the low pre-operative function group than for those in the high pre-operative function group (difference in slope between high/low groups: WOMAC function, p < 0.001; Time⁻¹, p = 0.012).
The long-term changes in function did not seem to be related to the pre-operative scores (see Table 33). A weak relationship between the self-reported immediate and long-term function change was observed (–0.19), whereas large/small immediate improvement in the 20-metre walk test completion time was correlated with small/large long-term improvement (–0.49).
With regard to the ‘pain improvement journey’, large immediate pain improvements were correlated with high pre-operative pain level and small immediate improvement with low pre-operative level of pain (–0.30). This is illustrated in Figure 18c, which shows that immediate improvement was steeper for patients in the high pre-operative pain group than for those in the low pre-operative pain group (difference in slope between high/low groups, p < 0.01).
The long-term pain change was not related to the pre-operative pain scores (see Table 33). Those with larger/smaller immediate pain improvements also had smaller/larger long-term pain improvement (–0.38).
With regard to the pain–function inter-relation, that is, the influence on each other’s ‘journey’, no evidence of relationship between the pre-operative function level and the postoperative changes in pain was observed. Pre-operative pain level was weakly correlated (–0.22) with immediate self-reported function change. This relation was not found with the objective measure of function. Long-term change in function appeared independent of the pre-operative pain severity. The self-reported immediate functional change was weakly correlated (–0.24) with long-term pain change, although this relationship was not observed in the objective model. The immediate pain change was not related to long-term change in function.
Assessment of function using accelerometry in patients receiving hip replacement
Patients were also invited to wear an inertial ambulatory motion sensor incorporating accelerometers and gyroscopes during the completion of the 20-metre walk test (see Inertial sensor-based motion and gait analyses). Participants were asked to walk along a straight flat corridor at their own preferred speed. Participants wore their own clothes and shoes but high-heeled shoes were not permitted. After crossing the finish line, one last step was allowed to establish a complete stop avoiding a significant slowdown within the 20 metres. The exact distance covered (20 metres + the last step) was measured and used for the following analyses. The test was conducted on the 36 patients listed for a primary THR without any history of previous lower limb joint surgery and with available longitudinal WOMAC function and ambulatory gait analysis data.
The collected measures are presented in Table 34 and longitudinal changes are reported in Figure 19.
The 36 participants had the same pattern of WOMAC function recovery (see Figure 19) as the pattern observed in the overall sample (see Figure 13) – a large significant improvement within the first 3 months following the surgery (p < 0.0001) but no further improvement between 3 months and 12 months postoperatively (p > 0.05).
Pre-operatively, all the gait parameters had some weak to moderate correlations with WOMAC function (see Table 34). Apart from the ‘range of motion pelvic obliquity’ gait parameter, all the others maintained some weak correlations 12 months after the surgery. These findings suggest that the patient-reported measure partially reflects functional aspects captured by the objective gait analysis parameters.
Step cadence and time to complete a step had the same course of longitudinal changes as the WOMAC function scores (see Figure 19).
Conversely, speed, ROM pelvic obliquity and step length continued to improve after the 3-month postoperative assessment (p < 0.0001).
The postoperative average changes in step irregularity and asymmetry were not statistically significant, suggesting that these aspects of function are not altered by THR surgery. However, there is a large heterogeneity in the patterns of individual changes (see Figure 19).
Discussion
These findings indicate that the pain and function trajectories in the first year following hip or knee surgery are similar, with most of the improvement occurring within the first 3 postoperative months. No clear indication of further improvement was observed after the 3-month postoperative assessment for those undergoing hip surgery. However, for those who had knee surgery, function measured objectively (20-metre walk test) continued to improve until 12 months post operation.
The absence of improvement after 3 months post operation could be viewed as an artefact of the ceiling effect inherent to score-bounded PROMs such as WOMAC, which limits the ability to detect improvement in patients who recover very quickly.343,344
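The ceiling artefact described above can be illustrated with a short sketch. It assumes a score bounded between 0 and 100 with higher values meaning better function, and it introduces a hypothetical unbounded "latent" level of function that the bounded score can only report up to its maximum. The function names and the example values are illustrative and are not taken from the ADAPT data.

```python
def observed_score(latent_score, floor=0.0, ceiling=100.0):
    """Score reported by a bounded instrument for a given latent level."""
    return max(floor, min(ceiling, latent_score))


def observed_change(latent_before, latent_after, floor=0.0, ceiling=100.0):
    """Change that the bounded instrument can register between two visits."""
    return (observed_score(latent_after, floor, ceiling)
            - observed_score(latent_before, floor, ceiling))


# Hypothetical patient who is already near the top of the scale at 3 months
# and keeps improving up to 12 months.
latent_3_months, latent_12_months = 95.0, 110.0
print("latent change:", latent_12_months - latent_3_months)
print("observed change:", observed_change(latent_3_months, latent_12_months))
```

In this hypothetical case, the instrument registers only the part of the improvement that fits below its maximum, so a fast-recovering patient can appear to plateau.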
However, the long-term mean changes associated with the objective function measures were marginal, much smaller than those that occurred before 3 months, and observed only among patients undergoing knee surgery. The gait analysis also revealed steeper slopes before 3 months and not all gait parameters had a statistically significant improvement beyond 3 months. Only residual changes might have to be expected after 3 months in proportion to those occurring before. The modest sample size of ADAPT limited our ability to adjust for factors known to be associated with the postoperative outcome such as age, sex, mental health and other comorbidities, and adjusted findings might have provided a slightly different picture. Our results are consistent with the existing literature.48,341,342,345–350 Improvements in WOMAC physical function beyond 3 months were observed by Bachmeier and colleagues48 in patients who had undergone hip replacement. However, changes in WOMAC pain were marginal. For patients with knee replacement, changes beyond 3 months in both WOMAC function and pain were marginal. Heiberg and colleagues,347 in Norway, found a small but significant improvement after 3 months post operation among hip surgery patients for pain and subjective and objective measures of function [HOOS and 6-minute walk test (6MWT)]. Kennedy and colleagues348,349 found some further improvement between 3 and 6 months post knee or hip surgery but none thereafter using objective or PROM measures of function [Lower Extremity Functional Scale (LEFS) and 6MWT] in a Canadian population. Halket and colleagues,341 in Canada, and Naylor and colleagues,350 in Australia, found hardly any improvement after 3 months in PROM measures of pain including WOMAC pain.
Interesting relationships have been found between the pre-operative score levels and postoperative changes. The pre-operative situation is negatively related to the immediate postoperative changes. The worse the situation before the surgery, the more likely the participants are to improve within the first few postoperative months; conversely, the better the pre-operative situation, the lower the immediate postoperative improvement. These findings can be induced by the ceiling effect inherent in the scoring system of PROMs in which those doing well have less room for improvement, magnifying the effect of those starting with lower scores. However, these negative correlations were also observed with the objective measure of function, suggesting that more can be expected from the surgery when the patients have very poor pre-operative pain and function. Twelve months after their hip surgery, patients with poor pre-operative scores had caught up with those who had better pre-operative scores. However, after their knee surgery, and despite a faster immediate improvement, participants with low function or high pain before their surgery still had significantly poorer function or pain level, even if the gap had reduced. This suggests that, even if there is a lot to expect from the joint surgery, any delay in the knee surgery might be associated with functional and/or pain degradation, which cannot be corrected by the operation. It is unlikely that these findings are driven by the revision/non-revision status of the knee participants. The prevalence of patients listed for a knee revision did not differ between high- and low-score groups; their pain or function pre-operative median scores did not differ from those listed for a first-time joint surgery.
For both groups, any intervention modifying the pre-operative pain level is likely to affect its course of immediate postoperative improvement, and the same is true for function. However, the relationship between pain and function differed between those having hip or knee surgery. For the hip patients, pain and function were interrelated whereas, for knee disease, no correlation between patients’ pre-operative score levels and postoperative changes was found. This could suggest that separate interventions specific to pain and function need to be designed for knee pre- and postoperative rehabilitation whereas more generic hip intervention could affect both domains simultaneously.
Our findings suggest that patients undergoing primary or revision surgery will experience an improvement in pain and function but the pattern of recovery will differ between these two types of surgery. Despite similar or better pre-operative function and pain scores for patients undergoing a revision surgery than for those undergoing a primary joint surgery, a revision surgery does not seem to bring as much pain and functional improvement 12 months later. This finding is congruent with the existing literature351,352 and needs to be kept in mind when patients and clinicians discuss post-surgical expectations.
Finally, the pattern of functional recovery seems to be influenced by the method used to assess function, with significant long-term improvement observed in objective measure of function, but not with the self-reported measure. This could reflect the well-documented ceiling effect of the WOMAC function score. However, some of the gait analysis parameters, less subject to a ceiling effect, have a similar course of postoperative improvement as the self-reported measure of function. As the WOMAC function score is capturing information on several daily activities, this might also suggest that the potential loss of information induced by the use of a score-bounded instrument might not be as important as we think. It might also reflect the actual improvement pattern of some specific dimensions of function. Moreover, pain appears more correlated with the self-reported than with the objective measure of function. This might confirm previous work by Stratford and Kennedy353 which suggested an internal limitation of the WOMAC index scores: ‘activity overlap on the pain and function subscales plays a causal role in limiting the WOMAC physical function subscale’s ability to detect change’.
In all cases, these findings imply that it is valuable to use both self-reported and objective measures of function whenever possible. Doing so will capture a comprehensive longitudinal functional ability picture of patients undergoing joint replacement.
Discussion and conclusions
The ADAPT study aimed to investigate the different measures used to assess function in people undergoing hip or knee replacement and their responsiveness to the change resulting from joint replacement. The cohort included 263 people undergoing a mixture of hip and knee replacement, and primary and revision surgeries. This provided a mix of patients with a wide variety of different levels of disability, but the disadvantage of lacking homogeneity.
Our theoretical basis for the assessment of disability was the ICF, which differentiates between impairments, activities limitations and participation restrictions. We deliberately chose a number of different types of approach to functional assessment:
- standard self-report measures widely used in rheumatology practice (the WOMAC and SF-12)
- the Aberdeen measure, a recently developed self-assessment tool which differentiates between impairments, activities limitations and participation restrictions
- clinician-administered tools widely used by orthopaedic practitioners (HHS and AKSS)
- performance-based tests widely used by geriatricians (‘get-up-and-go test’, step tests, balance tests and walking time)
- accelerometry tests, which are a recent development and have the promise of providing us with a more objective way of assessing function.
Measures were made immediately prior to surgery, at the standard 3-month follow-up visit and at 1 year.
Our key findings are:
1. There is no ‘right’ way to assess function in patients undergoing joint replacement.
We had hoped at the outset of this study that we might be able to conclude that some measures should be used, and others discarded, but the data do not support this. Arguably the ‘knee score’ component of the AKSS is of questionable value because it correlates poorly with other measures. Each of the different methods of assessing function appears to be measuring something a little different and is influenced by different covariates, so nothing is ‘right’ and nothing is ‘wrong’. The strongest correlations were between the different self-assessment measures and also between the different performance tests. However, the correlations between self-assessment measures and performance tests were much lower. This suggests that it might be wise to use one of each type of measure to obtain a satisfactory picture of the degree of functional loss in any individual patient. This is also confirmed by our comparisons of the longitudinal changes between patient-reported and performance test functional measures.
2. Self-assessment measures and functional tests are influenced by different factors.
We have shown that:
- Pain affects every type of functional assessment measure.
- Mental health status has a large influence on self-assessment measures but little effect on functional testing.
- Age (and sex in the case of the hip replacement) affects laboratory tests of function but not self-assessment measures.
We interpret this as confirming previous research that suggests pain and function are inextricably linked in musculoskeletal disability, that people with anxiety or depression may assess themselves as being worse off than they objectively are and that the influence of age on functional tests may be mediated by sarcopenia (a hypothesis that requires further investigation).
The implication is that measures of function may need adjustment for pain, psychological status, age and perhaps muscle strength if we are to obtain a satisfactory picture of functional loss.
3. Range of joint motion is not a satisfactory surrogate measure for function.
It is relatively easy to assess the ROM of the hip or knee and this measure is commonly carried out in clinical practice. Health-care professionals and patients often assume that it provides a useful surrogate measure of osteoarthritis severity and/or functional problems. It constitutes an important part of both the HHS and the AKSS.
Our data indicate that ROM does not correlate well with other measures of disease severity and we would suggest that it should not be given any weight in patient assessment.
The ROM, within the ICF classification, is a measure of impairment. Pain is generally considered to be an impairment measure as well300 but in contrast to ROM correlates well with measures of activities limitations and participation restrictions. We would argue that it may be inappropriate to classify pain as an impairment measure in this context.
4. Function is improved 1 year after surgery in most, but not all, people.
Data on the outcomes of joint replacement are usually presented simply as the average difference in pain or function before and after surgery.
However, ‘averages’ do not tell us how many people might have got significantly (for them) better and, conversely, how many did not change or got worse. Our data are presented in such a way as to make these aspects of outcome totally explicit. They show that improvement that is both clinically and statistically significant will occur in some 90% of patients having a hip replacement and 70% of those having a knee replacement (an important difference between the two joint sites) and that, in contrast, some 5% of those having a hip replacement and 10% of those having a knee replacement will stay much the same or experience a deterioration in function 1 year after surgery. However, the degree of deterioration is rarely of clinical significance.
These data are important to patients and surgeons counselling them about the likely outcome of a joint replacement.
5. ‘Ceiling effects’ are a major problem for many measures of function.
A possible limitation of a measure of function, and one that we were keen to explore, is that it reaches a ceiling, so that patients cannot improve further on that score, even if their clinical status does improve. Our data indicate that this is a significant problem for self-assessment measures such as WOMAC in the context of joint replacement and, to a lesser extent, for the clinician-administered HHS and AKSS. The longitudinal and gait analyses revealed that some ‘objective’ functional parameters were still improving after the surgery when the WOMAC function scores were reaching a plateau. However, other ‘objective’ gait parameters did have a similar pattern of improvement, suggesting that perhaps this ceiling effect is not necessarily as extensive as we might think and the WOMAC function score is still providing an acceptable reflection of functional change.
6. Walking speeds tell a different story.
We have put a reasonable amount of emphasis on the walking speed of patients in this study for three main reasons: first, it is a reasonably objective measure; second, it does not have a ceiling effect; and, third, it is a widely used surrogate measure of participation.354–356 It also has some correlation with life expectancy.357–362
Our data indicate that walking time does not show as good a response to joint replacement as most of the other measures used. Patients are more likely to be worse 1 year after surgery on their walking time than on other measures and the amount of improvement is rarely large. We believe that walking time is more dependent on other variables, many of which are age related, than the other measures.
7. Patients with hip and knee disease respond differently to joint replacement.
It is widely thought that people have a better outcome after a hip replacement than after a knee replacement, and our data support this idea. The likelihood of improvement and the amount of improvement are much greater for people having hip replacement than knee replacement, and there are subtle but important differences in the nature of the response and its determinants.
Patients and joint replacement surgeons need to consider hip and knee osteoarthritis as different diseases. Pain and function seem also to be differently inter-related over time between these two diseases. A more tailored course of intervention may be required for knee osteoarthritis to tackle pain and function, whereas an intervention tackling one of these domains is also likely to affect the other one for hip osteoarthritis.
8. The chances of a good response to joint replacement depend on the severity of the disease at the time of surgery.
Our data show this very clearly. This is not a new finding, but the ADAPT cohort does shed some new light on this important aspect of joint replacement.
Our findings suggest that we should think about the journey (the amount of change after surgery) and the destination (the ‘final’ point reached 1 year after surgery). Patients with very severe disease at the time of surgery are more likely to have a good journey (i.e. pain and functional ability will probably improve substantially), whether patients have hip or knee disease. But the destination differs for the two joint sites. Those with hip disease can have a similar good destination, irrespective of the starting point, whereas those with knee disease can never ‘catch up’ (i.e. have as good a final outcome or destination) if they start off with very severe disease at the time of surgery. This is an important finding with the possibility that we may be delaying surgery too long for many people with knee disease.
Finally, our findings show that patients listed for a revision surgery had slightly better pre-operative pain scores than, and similar functional ability to, those listed for primary surgery. However, their postoperative gains do not seem to be as large as the improvement experienced by patients with primary joint surgery. Clinicians and patients should be aware of this when discussing and setting expectations for revision surgery.