NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

National Clinical Guideline Centre (UK). Osteoarthritis: Care and Management in Adults. London: National Institute for Health and Care Excellence (UK); 2014 Feb. (NICE Clinical Guidelines, No. 177.)

  • Update information: December 2020: in the recommendation on adding opioid analgesics NICE added links to other NICE guidelines and resources that support discussion with patients about opioid prescribing and safe withdrawal management. For the current recommendations, see www.nice.org.uk/guidance/CG177/chapter/recommendations.

3. Methods

The updated guidance was developed in accordance with the methods outlined in the NICE Guidelines Manual 2012 [327]. This applies to the clinical and cost evidence presented in chapters 5, 12 and 13 and sections 7.2, 8.4, 8.5, and 10.2.

NICE methods have evolved since the development of CG59. A key change in this update is that recommendations are based on whether an intervention makes a clinically important difference to patients, rather than on the statistical significance of its effect against an appropriate comparator, which was the approach applied in CG59. Because of this difference in methodological approach, the recommendations from CG59 and those made as part of this update rest on different decision thresholds. This chapter outlines the methods used in this update; the methods used to develop CG59 can be found in Appendix O.

3.1. Developing the review questions and outcomes

Review questions were developed in a PICO framework (patient, intervention, comparison and outcome) for intervention reviews, and with a framework of population, index tests, reference standard and target condition for reviews of diagnostic test accuracy. This was to guide the literature searching process and to facilitate the development of recommendations by the guideline development group (GDG). They were drafted by the NCGC technical team and refined and validated by the GDG. The questions were based on the key clinical areas identified in the scope (Appendix A). Further information on the outcome measures examined follows this section.

Table 2. Review questions for guideline update

Chapter | Review questions | Outcomes
Diagnosis | In a person with suspected clinical OA (including knee pain) when would the addition of imaging be indicated to confirm additional or alternative diagnoses (particularly to identify red flags) such as:
  - Crystal arthritis (gout or CPPD)
  - Inflammatory arthritis (including rheumatoid arthritis, psoriatic arthritis)
  - Infection
  - Cancer, usually secondary metastases

  • Sensitivity
  • Specificity
  • Likelihood ratio
  • Diagnostic accuracy
  • Other clinical management outcomes (e.g. referral)
Acupuncture | What is the clinical and cost effectiveness of acupuncture versus sham treatment (placebo) and other interventions in the management of osteoarthritis?
  • Global joint pain (WOMAC, VAS, or NRS pain subscale, WOMAC for knee and hip only, AUSCAN subscale for hand)
  • Function (WOMAC function subscale for hip or knee or equivalent such as AUSCAN function subscale or Cochin or FIHOA for hand and change from baseline)
  • Stiffness (WOMAC stiffness score change from baseline)
  • Time to joint replacement
  • Quality of life (EQ5D, SF 36)
  • Patient global assessment
  • OARSI responder criteria
  • Adverse events
  • Measure Yourself Medical Outcome Profile
Nutraceuticals | What is the clinical and cost effectiveness of glucosamine and chondroitin alone or in compound form versus placebo or other treatments in the management of osteoarthritis?
  • Global joint pain (VAS, NRS or WOMAC pain subscale, WOMAC for knee and hip only, AUSCAN subscale for hand)
  • Function (WOMAC function subscale for hip or knee or equivalent such as AUSCAN function subscale or Cochin or FIHOA for hand and change from baseline)
  • Stiffness (WOMAC stiffness score change from baseline)
  • Structure modification
  • Time to joint replacement
  • Quality of life (EQ5D, SF 36)
  • Patient global assessment
  • OARSI responder criteria
  • Adverse events (GI, renal and cardiovascular)
Hyaluronan Injections | What is the clinical and cost effectiveness of intra-articular injections of hyaluronic acid/hyaluronans in the management of OA in the knee, hand, ankle, big toe and hip?
  • Global joint pain (VAS or NRS, WOMAC pain subscale, WOMAC for knee and hip only, AUSCAN for hand)*
  • Function (WOMAC function subscale for hip or knee or equivalent such as AUSCAN function subscale and change from baseline)
  • Stiffness (WOMAC stiffness score change from baseline)
  • Time to joint replacement
  • Minimum joint space width
  • Quality of life (EQ5D, SF 36)*
  • Patient global assessment
  • OARSI responder criteria
  • Adverse events*
    • Post-injection flare
Decision aids | What is the clinical and cost-effectiveness of decision aids in the management of OA?
  • Attributes of the choice
  • Attributes of the decision making process
  • Decisional conflict
  • Patient-practitioner communication
  • Participation in decision making
  • Proportion undecided
  • Satisfaction
  • Choice (actual choice implemented, option preferred as surrogate measure)
  • Adherence to chosen option
  • Health status and quality of life (generic and condition specific)
  • Anxiety, depression, emotional distress, regret, confidence
  • Consultation length
Follow-up | What is the clinical and cost effectiveness of regular follow-up/review in reinforcing core treatments (information, education, exercise, weight reduction) care in the management of OA? Which patients with OA will benefit the most from reinforcement of core treatment as part of regular follow-up/review?
  • Global joint pain (WOMAC, VAS, or NRS pain subscale, WOMAC for knee and hip only, AUSCAN subscale for hand)
  • Function (WOMAC function subscale for hip or knee or equivalent such as AUSCAN function subscale or Cochin or FIHOA for hand and change from baseline)
  • Stiffness (WOMAC stiffness score change from baseline)
  • Time to joint replacement
  • Quality of life (EQ5D, SF 36)
  • Patient global assessment
  • OARSI responder criteria
  • Improvement in depression/psychological outcomes
Timing of surgery | What information should people with OA receive to inform consideration of the appropriate timing of referral for surgery as part of their OA management?
  • Patient views/experiences
  • Patient preference/satisfaction
  • Patient knowledge

3.2. Searching for evidence

3.2.1. Clinical literature search

Systematic literature searches were undertaken in accordance with the Guidelines Manual 2012 [327] to identify evidence within published literature in order to answer the review questions. Clinical databases were searched using relevant medical subject headings, free-text terms and study type filters where appropriate. Studies published in languages other than English were not reviewed. Where possible, searches were restricted to articles published in the English language. All searches were conducted on three core databases: Medline, Embase and the Cochrane Library. An additional subject specific database (Allied and Complementary Medicine database) was used for the question on acupuncture. All searches were updated on 7th May 2013. No papers added to the above databases after this date were considered.

Search strategies were checked by looking at reference lists of relevant key papers, checking search strategies in other systematic reviews and asking the GDG for known studies. The questions, the study type filters applied, the databases searched and the years covered can be found in Appendix F.

During the scoping stage, a search was conducted for guidelines and reports on the websites listed below and on organisations relevant to the topic. Searching for grey literature or unpublished literature was not undertaken. All references sent by stakeholders were considered.

3.2.2. Health economic literature search

Systematic literature searches were also undertaken to identify health economic evidence within published literature relevant to the review questions. The evidence was identified by conducting a broad search relating to osteoarthritis in the NHS economic evaluation database (NHS EED), the Health Economic Evaluations Database (HEED) and health technology assessment (HTA) databases from 2007, the date of searches conducted for the previous osteoarthritis guideline [322]. Additionally, the search was run on Medline and Embase, with an economic filter, from 2010, to ensure recent publications that had not yet been indexed by the health economics databases were identified. Studies published in languages other than English were not reviewed. Where possible, searches were restricted to articles published in the English language.

The search strategies for health economics are included in Appendix F. All searches were updated on 7th May 2013. No papers published after this date were considered.

3.3. Evidence of effectiveness

The Research Fellow:

  • Identified potentially relevant studies for each review question from the relevant search results by reviewing titles and abstracts – full papers were then obtained.
  • Reviewed full papers against pre-specified inclusion/exclusion criteria to identify studies that addressed the review question in the appropriate population and reported on outcomes of interest (review protocols are included in Appendix C).
  • Critically appraised relevant studies using the appropriate checklist as specified in The Guidelines Manual 2012 [327].
  • Extracted key information about the study’s methods and results into evidence tables (evidence tables are included in Appendix G).
  • Generated summaries of the evidence by outcome (included in the relevant chapter write-ups):
    • Randomised studies: meta-analysed, where appropriate, and reported in GRADE profiles (for clinical studies) – see below for details
    • Observational studies: data presented as a range of values in GRADE profiles
    • Diagnostic studies: data presented as a range of values in adapted GRADE profiles
    • Qualitative studies: each study summarised in a table where possible, otherwise presented in a narrative.

3.3.1. Inclusion/exclusion

See the review protocols in Appendix C for full details.

The guideline population was defined to be adults with osteoarthritis.

The temporomandibular joint was excluded as this is an area predominantly managed by dentists and dental specialists, who are not the target audience of this guideline.

Shoulders were excluded because the vast majority of shoulder pain is not due to OA but to tendonitis and bursitis problems. The GDG also pointed out that the number of studies in true shoulder OA is very small.

Spine and back were excluded because there are other NICE guidelines looking at back pain. The back pain literature is extensive and separate from the OA literature.

Randomised trials, non-randomised trials, and observational studies were included in the evidence reviews as appropriate. Conference abstracts were not automatically excluded from the review but were initially assessed against the inclusion criteria and then further processed only if no other full publication was available for that review question, in which case the authors of the selected abstracts were contacted for further information. Conference abstracts included in Cochrane reviews were included when they met the review inclusion criteria and authors were not contacted. Literature reviews, letters and editorials, foreign language publications and unpublished studies were excluded.

3.3.2. Methods of combining clinical studies

Data synthesis for intervention reviews

Where possible, meta-analyses were conducted to combine the results of studies for each review question using Cochrane Review Manager (RevMan5) software. Fixed-effects (Mantel-Haenszel) techniques were used to calculate risk ratios (relative risk) for the binary outcomes: OARSI responder criteria; adverse events; and withdrawal from trial. The continuous outcomes (global joint pain; function; stiffness; time to joint replacement; patient global assessment and quality of life) were analysed using an inverse variance method for pooling weighted mean differences. Because studies used different sub-scales, standardised mean differences were used on the advice of the GDG. Final values were reported where available for continuous outcomes in preference to change scores. However, if only change scores were available, these were reported and meta-analysed with final values. Stratified analyses were predefined for some review questions at the protocol stage when the GDG expected the strata to show a different effect (for example, differences in efficacy of interventions when used for differing joints such as knee, hip or ankle).
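As an illustration of the fixed-effect Mantel-Haenszel pooling named above, the following is a minimal Python sketch. It is not the RevMan5 implementation; the function name and the trial counts are hypothetical and chosen only to show the arithmetic.

```python
from typing import List, Tuple

# Each study: (events_treatment, n_treatment, events_control, n_control).
Study = Tuple[int, int, int, int]


def mantel_haenszel_rr(studies: List[Study]) -> float:
    """Fixed-effect Mantel-Haenszel pooled risk ratio.

    RR_MH = sum(a_i * n2_i / N_i) / sum(c_i * n1_i / N_i),
    where a_i and c_i are events in the treatment and control arms,
    n1_i and n2_i are the arm sizes, and N_i = n1_i + n2_i.
    """
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n2 in studies:
        total = n1 + n2
        numerator += a * n2 / total
        denominator += c * n1 / total
    if denominator == 0:
        raise ValueError("no control-arm events: pooled risk ratio is undefined")
    return numerator / denominator


# Hypothetical counts, for illustration only (not data from the guideline).
example_trials = [(12, 100, 20, 100), (8, 50, 10, 50)]
print(round(mantel_haenszel_rr(example_trials), 3))
```

Each study contributes to the numerator and denominator in proportion to its size, so larger trials carry more weight in the pooled estimate.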

Statistical heterogeneity was assessed by considering the chi-squared test for significance at p<0.1 or an I-squared inconsistency statistic of >50% to indicate significant heterogeneity. Where significant heterogeneity was present, we carried out predefined subgroup analyses (for example, in acupuncture, including only trials with adequate blinding; see the individual protocols in Appendix C for further details).

Assessments of potential differences in effect between subgroups were based on the chi-squared tests for heterogeneity statistics between subgroups. If no sensitivity analysis was found to completely resolve statistical heterogeneity then a random effects (DerSimonian and Laird) model was employed to provide a more conservative estimate of the effect.
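The heterogeneity statistics and the random-effects fallback described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the RevMan5 code: it takes per-study effect estimates and standard errors as given, computes Cochran's Q and I-squared, and applies the DerSimonian and Laird estimate of between-study variance. The chi-squared p-value is omitted because it needs a chi-squared distribution function that the Python standard library does not provide. All names and inputs are hypothetical.

```python
from typing import Dict, List


def pool_with_heterogeneity(effects: List[float], std_errors: List[float]) -> Dict[str, float]:
    """Inverse-variance fixed-effect pooling, Cochran's Q, I-squared and a
    DerSimonian-Laird random-effects estimate."""
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need at least two studies with matching effects and standard errors")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q and the I-squared inconsistency statistic (as a percentage).
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau_squared = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau-squared to each study's variance.
    re_weights = [1.0 / (se ** 2 + tau_squared) for se in std_errors]
    random_effect = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)

    return {
        "fixed": fixed,
        "q": q,
        "i_squared": i_squared,
        "tau_squared": tau_squared,
        "random": random_effect,
    }


# Hypothetical SMDs and standard errors, for illustration only.
result = pool_with_heterogeneity([0.2, 0.8], [0.1, 0.1])
print(result["i_squared"] > 50.0)  # flags heterogeneity under the >50% rule
```

Because the random-effects weights are more nearly equal across studies, the resulting confidence interval is typically wider, which is why the text describes the estimate as more conservative.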

The means and standard deviations of continuous outcomes were required for meta-analysis. However, in cases where standard deviations were not reported, the standard error was calculated if the p-values or 95% confidence intervals were reported and meta-analysis was undertaken with the mean and standard error using the generic inverse variance method in Cochrane Review Manager (RevMan5) software. Where p values were reported as “less than”, a conservative approach was undertaken. For example, if a p value was reported as “p ≤0.001”, the calculations for standard deviations were based on a p value of 0.001. If these statistical measures were not available, then the methods described in section 16.1.3 of the Cochrane Handbook (September 2009), ‘Missing standard deviations’, were applied as a last resort.
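The back-calculation of a standard error from a reported confidence interval or p-value can be sketched as below. This is a minimal sketch that assumes a normal approximation and a two-sided p-value; for small trials a t-distribution would be more appropriate, and the source does not state which was used. The function names are hypothetical.

```python
from statistics import NormalDist


def se_from_ci(lower: float, upper: float, level: float = 0.95) -> float:
    """Standard error implied by a symmetric confidence interval
    (normal approximation): SE = (upper - lower) / (2 * z)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return (upper - lower) / (2.0 * z)


def se_from_p(mean_difference: float, p_value: float) -> float:
    """Standard error implied by a two-sided p-value for a mean difference
    (normal approximation): SE = |difference| / z."""
    if not 0.0 < p_value < 1.0:
        raise ValueError("p_value must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf(1.0 - p_value / 2.0)
    return abs(mean_difference) / z


# A p-value reported only as "p <= 0.001" is treated as exactly 0.001,
# which yields the largest standard error consistent with the report.
print(se_from_p(2.0, 0.001) > se_from_p(2.0, 0.0001))
```

Taking the reported bound as the exact p-value is conservative because a larger p-value implies a larger standard error and therefore less weight in the meta-analysis.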

For binary outcomes, absolute event rates were also calculated using the GRADEpro software using event rate in the control arm of the pooled results.
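The arithmetic behind an absolute effect derived from a pooled risk ratio and the control-arm event rate can be sketched as follows. This mirrors the general approach only; it is not GRADEpro's code, and the function name and figures are hypothetical.

```python
def absolute_effect_per_1000(control_events: int, control_total: int, risk_ratio: float) -> float:
    """Absolute difference in events per 1000 patients implied by a risk ratio,
    using the pooled control-arm event rate as the baseline risk."""
    if control_total <= 0:
        raise ValueError("control_total must be positive")
    control_risk = control_events / control_total
    intervention_risk = control_risk * risk_ratio
    return (intervention_risk - control_risk) * 1000.0


# Hypothetical example: 20% control risk and RR 0.75 gives about 50 fewer per 1000.
print(round(absolute_effect_per_1000(20, 100, 0.75)))
```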

Data synthesis for diagnostic test accuracy review

For diagnostic test accuracy studies, the following outcomes were reported: sensitivity, specificity, positive predictive value, negative predictive value, likelihood ratio and correlations/associations between clinical and radiological features. In cases where the outcomes were not reported, 2 by 2 tables were constructed from raw data to allow calculation of these accuracy measures.
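The accuracy measures listed above follow directly from the four cells of a 2 by 2 table. The sketch below shows the standard definitions; the function name and the counts are hypothetical, and undefined ratios are returned as None rather than guessed.

```python
from typing import Dict, Optional


def _ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the ratio is undefined."""
    return numerator / denominator if denominator != 0 else None


def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int) -> Dict[str, Optional[float]]:
    """Accuracy measures from a 2 by 2 table of index test against reference standard.

    tp: test positive, disease present    fp: test positive, disease absent
    fn: test negative, disease present    tn: test negative, disease absent
    """
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    lr_positive = None
    lr_negative = None
    if sensitivity is not None and specificity is not None:
        lr_positive = _ratio(sensitivity, 1.0 - specificity)
        lr_negative = _ratio(1.0 - sensitivity, specificity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
    }


# Hypothetical counts, for illustration only.
print(diagnostic_accuracy(tp=90, fp=20, fn=10, tn=80))
```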

3.3.3. Appraising the quality of evidence by outcomes

The international consensus group OMERACT (Outcome measures in Rheumatology), using a process involving patients, recommended that pain, physical function and patient global assessment should be core outcome measures for OA clinical trials. Pain is also prioritised by patients and other international groups. Patient global assessment is assessed using a wide variety of tools, whereas pain and function outcomes are commonly collected using a more restricted number of tools, especially the WOMAC instrument, which also captures the lesser prioritised domain of stiffness. The GDG agreed therefore that the critical outcomes for decision-making for the intervention evidence reviews were: joint pain, function, and stiffness. The GDG agreed that joint pain was the most important outcome to assess analgesic effect.

The following outcomes were also considered important to decision-making: quality of life, OARSI responder criteria, adverse events, withdrawal from trial, time to joint replacement, and patient global assessment.

The evidence for outcomes from the included RCT and observational studies was evaluated and presented using an adaptation of the ‘Grading of Recommendations Assessment, Development and Evaluation (GRADE) toolbox’ developed by the international GRADE working group (http://www.gradeworkinggroup.org/). The software (GRADEpro) developed by the GRADE working group was used to assess the quality of each outcome, taking into account individual study quality and the meta-analysis results. The summary of findings was presented as two separate tables in this guideline. The “Clinical/Economic evidence profile” table includes details of the quality assessment while the “Clinical/Economic evidence summary of Findings” table includes pooled outcome data, where appropriate, an absolute measure of intervention effect and the summary of quality of evidence for that outcome. In this table, the columns for intervention and control indicate the sum of the sample size for continuous outcomes. For binary outcomes such as number of patients with an adverse event, the event rates (n/N: number of patients with events divided by sum of number of patients) are shown with percentages. Reporting or publication bias was only taken into consideration in the quality assessment and included in the Clinical evidence profile table if it was apparent. This was taken into consideration for randomised trial evidence in the review of paracetamol versus placebo.

Each outcome was examined separately for the quality elements listed and defined in Table 3 and each graded using the quality levels listed in Table 4. The main criteria considered in the rating of these elements are discussed below (see section 3.3.4 Grading the quality of clinical evidence). Footnotes were used to describe reasons for grading a quality element as having serious or very serious problems. The ratings for each component were summed to obtain an overall assessment for each outcome.

Table 3. Description of quality elements in GRADE for intervention studies.

Table 4. Levels of quality elements in GRADE.

Table 5. Overall quality of outcome evidence in GRADE.

3.3.4. Grading the quality of clinical evidence

After results were pooled, the overall quality of evidence for each outcome was considered. The following procedure was adopted when using GRADE:

  1. A quality rating was assigned, based on the study design. RCTs started as HIGH, observational studies as LOW, and uncontrolled case series as LOW or VERY LOW.
  2. The rating was then downgraded for the specified criteria: study limitations, inconsistency, indirectness, imprecision and reporting bias. These criteria are detailed below. Observational studies were upgraded if there was a large magnitude of effect, a dose-response gradient, or if all plausible confounding would reduce a demonstrated effect or suggest a spurious effect when results showed no effect. Each quality element considered to have “serious” or “very serious” risk of bias was rated down by 1 or 2 points respectively.
  3. The downgraded/upgraded marks were then summed and the overall quality rating was revised. For example, all RCTs started as HIGH and the overall quality became MODERATE, LOW or VERY LOW if 1, 2 or 3 points were deducted respectively.
  4. The reasons or criteria used for downgrading were specified in the footnotes.
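The four steps above amount to simple point arithmetic, sketched below. This is an illustrative simplification, not GRADEpro: the function and dictionary names are hypothetical, upgrades are accepted only for observational studies as the text describes, and the result is clamped to the range VERY LOW to HIGH.

```python
from typing import Dict

LEVELS = ["VERY LOW", "LOW", "MODERATE", "HIGH"]
START_LEVEL = {"rct": 3, "observational": 1}  # index into LEVELS


def grade_quality(design: str, downgrades: Dict[str, int], upgrades: int = 0) -> str:
    """Overall GRADE rating for one outcome.

    downgrades maps a quality element (e.g. "imprecision") to 0, 1 ("serious")
    or 2 ("very serious"). upgrades counts upgrade points for observational studies.
    """
    if design not in START_LEVEL:
        raise ValueError("design must be 'rct' or 'observational'")
    if any(points not in (0, 1, 2) for points in downgrades.values()):
        raise ValueError("each element is downgraded by 0, 1 or 2 points")
    if upgrades and design != "observational":
        raise ValueError("upgrading is applied to observational studies only")
    score = START_LEVEL[design] - sum(downgrades.values()) + upgrades
    score = max(0, min(score, len(LEVELS) - 1))  # clamp to VERY LOW..HIGH
    return LEVELS[score]


# An RCT outcome with serious imprecision and serious inconsistency.
print(grade_quality("rct", {"imprecision": 1, "inconsistency": 1}))
```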

The details of the criteria used for each of the main quality elements are discussed further in the following sections 3.3.5 to 3.3.8.

3.3.5. Study limitations

The main limitations for randomised controlled trials are listed in Table 6.

Table 6. Study limitations of randomised controlled trials.

3.3.6. Inconsistency

Inconsistency refers to an unexplained heterogeneity of results. When estimates of the treatment effect across studies differ widely (i.e. heterogeneity or variability in results), this suggests true differences in underlying treatment effect. When heterogeneity existed (chi-squared p<0.1 or I-squared inconsistency statistic of >50%) but no plausible explanation could be found, the quality of evidence was downgraded by one or two levels, depending on the extent of uncertainty to the results contributed by the inconsistency in the results. In addition to the I-squared and chi-squared values, the decision for downgrading was also dependent on factors such as whether the intervention is associated with benefit in all other outcomes or whether the uncertainty about the magnitude of benefit (or harm) of the outcome showing heterogeneity would influence the overall judgment about net benefit or harm (across all outcomes).

3.3.7. Indirectness

Directness refers to the extent to which the populations, intervention, comparisons and outcome measures are similar to those defined in the inclusion criteria for the reviews. Indirectness is important when these differences are expected to contribute to a difference in effect size, or may affect the balance of harms and benefits considered for an intervention.

3.3.8. Imprecision

Imprecision in guidelines concerns whether the uncertainty (confidence interval) around the effect estimate means that we don’t know whether there is a clinically important difference between interventions. Imprecision therefore differs from the other aspects of evidence quality in that it is not concerned with whether the point estimate is accurate or correct (has internal or external validity); instead, it is concerned with the uncertainty about what the point estimate is. This uncertainty is reflected in the width of the confidence interval.

The 95% confidence interval is defined as the range of values that contains the population value with 95% probability. The larger the trial, the narrower the confidence interval and the more certain we are in the effect estimate.

Imprecision in the evidence reviews was assessed by considering whether the width of the confidence interval of the effect estimate is relevant to decision making, considering each outcome in isolation. Figure 1 considers a positive outcome for the comparison of treatment A versus B. Three decision making zones can be identified, bounded by the thresholds for clinical importance (minimal important difference [MID]) for benefit and for harm. The MID for harm for a positive outcome is the threshold at which drug A is less effective than drug B and this difference is clinically important to patients (favours B).

Figure 1. Imprecision illustration. Source: Figure adapted from GRADEPro software.

  • When the confidence interval of the effect estimate is wholly contained in one of the three zones (e.g. clinically important benefit), we are not uncertain about the size and direction of effect (whether there is a clinically important benefit or the effect is not clinically important or there is a clinically important harm), so there is no imprecision.
  • When a wide confidence interval lies partly in each of two zones, it is uncertain in which zone the true value of effect estimate lies, and therefore there is uncertainty over which decision to make (based on this outcome alone); the confidence interval is consistent with two decisions and so this is considered to be imprecise in the GRADE analysis and the evidence is downgraded by one (“serious imprecision”).
  • If the confidence interval of the effect estimate crosses into three zones, this is considered to be very imprecise evidence because the confidence interval is consistent with three clinical decisions and there is a considerable lack of confidence in the results. The evidence is therefore downgraded by two in the GRADE analysis (“very serious imprecision”).
  • Implicitly, assessing whether the confidence interval is in, or partially in, a clinically important zone, requires the GDG to estimate an MID or to say whether they would make different decisions for the two confidence limits.
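The zone-counting rule in the bullets above can be sketched as a small function: the number of MID thresholds that fall strictly inside the confidence interval equals the number of levels by which the evidence is downgraded. This is a minimal sketch with hypothetical names; the MID values are passed in as parameters, and the example thresholds (an SMD of -0.5 and 0.5, or a risk ratio of 0.75 and 1.25) are the GRADE defaults that this chapter adopts.

```python
def imprecision_downgrade(ci_lower: float, ci_upper: float,
                          mid_lower: float, mid_upper: float) -> int:
    """Number of levels to downgrade for imprecision (0, 1 or 2).

    mid_lower and mid_upper are the two clinical-importance thresholds that
    separate the three decision zones. A confidence interval that lies in one
    zone crosses no threshold; one that spans two zones crosses one threshold;
    one that spans all three zones crosses both.
    """
    if ci_lower > ci_upper:
        raise ValueError("ci_lower must not exceed ci_upper")
    if mid_lower >= mid_upper:
        raise ValueError("mid_lower must be below mid_upper")
    return sum(1 for threshold in (mid_lower, mid_upper)
               if ci_lower < threshold < ci_upper)


# Continuous outcome on the SMD scale with thresholds at -0.5 and 0.5.
print(imprecision_downgrade(0.2, 0.7, -0.5, 0.5))    # spans two zones
# Binary outcome on the risk ratio scale with thresholds at 0.75 and 1.25.
print(imprecision_downgrade(0.80, 1.10, 0.75, 1.25))  # wholly within one zone
```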

The literature was searched for established MIDs for the selected outcomes in the evidence reviews. The following studies were retrieved and reviewed by the GDG:

The Revicki 2008 study summarised information on evaluating responsiveness and generation of MID estimates in general for patient reported outcomes not specific to OA.

The Pham 2003 study concerned the generation of the OMERACT-OARSI responder criteria, a composite outcome of pain, function and patient global assessment. The GDG selected this as an important outcome and, where reported, it has been included throughout the guideline.

The Tubach 2005 study calculated MIDs for WOMAC function which corresponded to SMDs of 0.33 (knee OA) and 0.16 (hip OA). Patients rated an improvement in their pain symptoms of 0.67 SMD (knee OA) or 0.44 SMD (hip OA) as “good”. The GDG agreed not to use the MIDs proposed in the Tubach 2005 study. The group consensus was that the Tubach MIDs were challenging to use in the context of clinical guideline development as they were developed for an individual RCT and would not be appropriate for the purposes of meta-analysis in guideline development. The GDG felt that MIDs from single research studies should not routinely be used for decision-making. Current NICE guidance is that the best source of an MID for use in clinical decision making is a systematic review of the evidence or an international consensus statement that is established within the relevant clinical community. Established MIDs are likely to be published widely and should be seen, accepted and utilised by that community. In addition to the review of the literature relating to MIDs for the OA field, the GDG was asked whether they were aware of any MIDs accepted by the osteoarthritis clinical community; they confirmed the lack of international consensus on specific thresholds for the selected outcomes. The GDG was aware of work being done in this area, in particular planned work by OMERACT in 2014, but felt that MIDs were not as yet established for use in this clinical guideline.

As there are no validated MIDs for SMDs, the GDG agreed to use the empirical cut-off suggested by the GRADE working group as part of the NICE methodological process. Therefore, the GDG agreed to use the following GRADE default thresholds to assess imprecision: an MID of 0.5 SMD for continuous outcomes, and a 25% relative risk reduction or relative risk increase, which corresponds to an RR clinically important threshold of 0.75 or 1.25 respectively, for binary outcomes. These default MIDs were used for all the outcomes across the evidence reviews.

The GDG accepted that there are limitations in applying an MID of 0.5 SMD. They acknowledged that there are very few interventions for OA that would reach this cut-off for clinical effectiveness. However, there was limited published or international consensus evidence available to provide firm cut-offs. An MID of 0.2 SMD was also considered when weighing up individual therapy benefits. For a few therapies, occasional results changed from an intervention being similarly effective to being more clinically effective, but all still demonstrated uncertainty.

The GDG also agreed to draft a research recommendation on minimal important differences (MID) for the main clinical outcomes in OA because of the challenges in this area. Further details on the research recommendations can be found in appendix N.

Assessing clinical importance

The GDG assessed the evidence by outcome in order to determine if there was, or was potentially, a clinically important benefit, a clinically important harm or no clinically important difference between interventions.

The assessment of benefit/harm/no benefit or harm was based on the point estimate of the standardised mean difference for intervention studies which was standardized across the reviews and against the MID thresholds described above. This assessment was carried out by the GDG for each outcome. The GDG used the assessment of clinical importance for the outcomes alongside the evidence quality and the uncertainty in the effect estimates to make an overall judgement on the balance of benefit and harms of an intervention.

Publication bias

Downgrading for publication bias would only be carried out if the GDG were aware that there was serious publication bias for that particular outcome. Such downgrading was not carried out for this guideline.

Evidence statements

Evidence statements are summary statements that are presented after the GRADE profiles, summarizing the key features of the clinical effectiveness evidence presented. The wording of the evidence statements reflects the certainty/uncertainty in the estimate of effect. The evidence statements are presented by outcome and encompass the following key features of the evidence:

  • The number of studies and the number of participants for a particular outcome.
  • An indication of the direction of clinical importance (if one treatment is beneficial or harmful compared to the other, or whether there is no difference between two tested treatments).

3.4. Evidence of cost-effectiveness

The GDG is required to make decisions based on the best available evidence of both clinical and cost effectiveness. Guideline recommendations should be based on the expected costs of the different options in relation to their expected health benefits (that is, their ‘cost effectiveness’) rather than the total implementation cost [327]. Thus, if the evidence suggests that a strategy provides significant health benefits at an acceptable cost per patient treated, it should be recommended even if it would be expensive to implement across the whole population.

Evidence on cost-effectiveness related to the key clinical issues being addressed in the guideline was sought. The health economist undertook:

  • A systematic review of the published economic literature.
  • New cost-effectiveness analysis in priority areas.

3.4.1. Literature review

The health economist:

  • Identified potentially relevant studies for each review question from the economic search results by reviewing titles and abstracts – full papers were then obtained.
  • Reviewed full papers against pre-specified inclusion/exclusion criteria to identify relevant studies (see below for details).
  • Critically appraised relevant studies using the economic evaluations checklist as specified in The Guidelines Manual [327].
  • Extracted key information about the studies’ methods and results into evidence tables (included in Appendix H).
  • Generated summaries of the evidence in NICE economic evidence profiles (included in the relevant chapter write-ups) – see below for details.

3.4.1.1. Inclusion/exclusion

Full economic evaluations (studies comparing costs and health consequences of alternative courses of action: cost–utility, cost-effectiveness, cost-benefit and cost-consequence analyses) and comparative costing studies that addressed the review question in the relevant population were considered potentially includable as economic evidence.

Studies that only reported cost per hospital (not per patient), or only reported average cost effectiveness without disaggregated costs and effects, were excluded. Abstracts, posters, reviews, letters/editorials, foreign language publications and unpublished studies were excluded. Studies judged to have an applicability rating of ‘not applicable’ were excluded (this included studies that took the perspective of a non-OECD country).

Remaining studies were prioritised for inclusion based on their relative applicability to the development of this guideline and the study limitations. For example, if a high quality, directly applicable UK analysis was available other less relevant studies may not have been included. Where exclusions occurred on this basis, this is noted in the relevant section.

For more details about the assessment of applicability and methodological quality, see the economic evaluation checklist in The Guidelines Manual327 and the health economics research protocol in Appendix C.

3.4.1.2. NICE economic evidence profiles

The NICE economic evidence profile has been used to summarise cost and cost-effectiveness estimates. The economic evidence profile shows, for each economic study, an assessment of applicability and methodological quality, with footnotes indicating the reasons for the assessment. These assessments were made by the health economist using the economic evaluation checklist from The Guidelines Manual.327 The profile also shows incremental costs, incremental effects (for example, quality-adjusted life years [QALYs]) and the incremental cost-effectiveness ratio, as well as information about the assessment of uncertainty in the analysis. See Table 7 for more details.

Table 7. Content of NICE economic profile.


If a non-UK study was included in the profile, the results were converted into pounds sterling using the appropriate purchasing power parity.336
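The conversion described above can be sketched as follows. This is an illustrative sketch only: the guideline states that purchasing power parities were used but does not give the formula or the rates, so the function name `convert_to_gbp`, the convention that each rate is expressed in national currency units per US dollar, and the numeric rates in the usage example are all assumptions made for illustration.

```python
def convert_to_gbp(cost_local: float,
                   ppp_local_per_usd: float,
                   ppp_gbp_per_usd: float) -> float:
    """Convert a cost reported in a non-UK currency into pounds sterling.

    Assumption: both purchasing power parity (PPP) rates are expressed as
    national currency units per US dollar and refer to the same price year.
    """
    if ppp_local_per_usd <= 0 or ppp_gbp_per_usd <= 0:
        raise ValueError("PPP rates must be positive")
    # Step 1: express the cost in the common reference currency (US dollars).
    cost_usd = cost_local / ppp_local_per_usd
    # Step 2: express that amount in pounds sterling.
    return cost_usd * ppp_gbp_per_usd


if __name__ == "__main__":
    # Hypothetical rates, NOT taken from the guideline or any published table.
    hypothetical_ppp_local = 0.8   # local currency units per US dollar
    hypothetical_ppp_gbp = 0.7     # pounds sterling per US dollar
    cost = convert_to_gbp(1000.0, hypothetical_ppp_local, hypothetical_ppp_gbp)
    print(f"Converted cost: £{cost:.2f}")
```

A PPP rate is used here rather than a market exchange rate because it adjusts for differences in price levels between countries. The rates actually applied would come from the source cited in the guideline.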

3.4.2. Undertaking new health economic analysis

As well as reviewing the published economic literature for each review question, as described above, new economic analysis was undertaken by the health economist in selected areas. Priority areas for new health economic analysis were agreed by the GDG after formation of the review questions and consideration of the available health economic evidence.

The GDG identified oral NSAIDs/COX-2 inhibitors as the highest priority area for original economic modelling. The GDG felt that updating the CG59 model was a priority in order to incorporate the updated review data on the effectiveness and adverse events of paracetamol, and also to include the fixed-dose combination pills.

The following general principles were adhered to in developing the cost-effectiveness analysis:

  • Methods were consistent with the NICE reference case.325
  • The GDG was involved in the design of the model, selection of inputs and interpretation of the results.
  • Model inputs were based on the systematic review of the clinical literature supplemented with other published data sources where possible.
  • When published data were not available, GDG expert opinion was used to populate the model.
  • Model inputs and assumptions were reported fully and transparently.
  • The results were subject to sensitivity analysis and limitations were discussed.
  • The model was peer-reviewed by another health economist at the NCGC.

Full methods for the cost-effectiveness analysis for oral NSAIDs/COX-2 inhibitors are described in Appendix L.

3.4.3. Cost-effectiveness criteria

NICE’s report ‘Social value judgements: principles for the development of NICE guidance’ sets out the principles that GDGs should consider when judging whether an intervention offers good value for money.326,327 In general, an intervention was considered to be cost effective if either of the following criteria applied (given that the estimate was considered plausible):

  1. The intervention dominated other relevant strategies (that is, it was both less costly in terms of resource use and more clinically effective compared with all the other relevant alternative strategies), or
  2. The intervention cost less than £20,000 per QALY gained compared with the next best strategy.

If the GDG recommended an intervention that was estimated to cost more than £20,000 per QALY gained, or did not recommend one that was estimated to cost less than £20,000 per QALY gained, the reasons for this decision are discussed explicitly in the ‘from evidence to recommendations’ section of the relevant chapter with reference to issues regarding the plausibility of the estimate or to the factors set out in the ‘Social value judgements: principles for the development of NICE guidance’.326 When QALYs or life years gained are not used in the analysis, results are difficult to interpret unless one strategy dominates the others with respect to every relevant health outcome and cost.
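The two criteria above can be illustrated with a minimal sketch for a single pairwise comparison. It assumes the conventional definition of the incremental cost-effectiveness ratio (incremental cost divided by incremental QALYs). The names `Strategy` and `assess`, and the costs and QALYs in the usage example, are hypothetical and are not taken from the guideline's model. Cases that the two criteria do not cover (for example, an option that is both cheaper and less effective) are flagged rather than decided, and the GDG's judgement about plausibility is not represented.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

THRESHOLD_GBP_PER_QALY = 20_000  # threshold stated in section 3.4.3


@dataclass(frozen=True)
class Strategy:
    name: str
    cost: float   # expected cost per patient in pounds (illustrative)
    qalys: float  # expected QALYs per patient (illustrative)


def assess(intervention: Strategy,
           comparator: Strategy,
           threshold: float = THRESHOLD_GBP_PER_QALY
           ) -> Tuple[str, Optional[float]]:
    """Return a label and, where defined, the ICER in pounds per QALY gained."""
    d_cost = intervention.cost - comparator.cost
    d_qaly = intervention.qalys - comparator.qalys

    if d_qaly > 0:
        if d_cost < 0:
            # Criterion 1: less costly and more effective.
            return ("dominant", None)
        # Criterion 2: cost per QALY gained against the comparator.
        icer = d_cost / d_qaly
        label = "cost effective" if icer < threshold else "not cost effective"
        return (label, icer)

    if d_cost >= 0:
        # No health gain at equal or higher cost.
        return ("not cost effective", None)

    # Cheaper but no more effective: not addressed by the two criteria.
    return ("outside the two criteria", None)


if __name__ == "__main__":
    # Hypothetical strategies for illustration only.
    usual_care = Strategy("usual care", cost=1_000.0, qalys=5.0)
    new_option = Strategy("new option", cost=7_000.0, qalys=5.5)
    print(assess(new_option, usual_care))
```

When more than two strategies are compared, each option would be assessed against the next best strategy, as the second criterion states, which requires ordering the options and removing dominated ones first. That step is omitted here for brevity.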

3.4.4. In the absence of economic evidence

When no relevant published studies were found, and a new analysis was not prioritised, the GDG made a qualitative judgement about cost effectiveness by considering expected differences in resource use between options and relevant UK NHS unit costs alongside the results of the clinical review of effectiveness evidence.

3.5. Developing recommendations

Over the course of the guideline development process, the GDG was presented with:

  • Evidence tables of the clinical and economic evidence reviewed from the literature. All evidence tables are in Appendices G and H.
  • Summary of clinical and economic evidence and quality (as presented in chapters 5 to 13)
  • Forest plots and summary ROC curves (Appendix I)
  • A description of the methods and results of the cost-effectiveness analysis undertaken for the guideline (Appendix L)

Recommendations were drafted on the basis of the GDG interpretation of the available evidence, taking into account the balance of benefits, harms and costs. When clinical and economic evidence was of poor quality, conflicting or absent, the GDG drafted recommendations based on their expert opinion. The considerations for making consensus-based recommendations include the balance between potential harms and benefits, economic implications compared to the benefits, current practices, recommendations made in other relevant guidelines, patient preferences and equality issues. The consensus recommendations were reached through discussion within the GDG. The main considerations specific to each recommendation are outlined in the Evidence to Recommendation Section preceding the recommendation section.

3.5.1. Research recommendations

When areas were identified for which good evidence was lacking, the guideline development group considered making recommendations for future research. Decisions about inclusion were based on factors such as:

  • the importance to patients or the population
  • national priorities
  • potential impact on the NHS and future NICE guidance
  • ethical and technical feasibility

3.5.2. Validation process

The guidance is subject to a six-week public consultation for feedback as part of the quality assurance and peer review of the document. All comments received from registered stakeholders are responded to in turn and posted on the NICE website when the full guideline is published.

3.5.3. Updating the guideline

A formal review of the need to update a guideline is usually undertaken by NICE after its publication. NICE will conduct a review to determine whether the evidence base has progressed significantly enough to alter the guideline recommendations and warrant an update.

3.5.4. Disclaimer

Health care providers need to use clinical judgement, knowledge and expertise when deciding whether it is appropriate to apply guidelines. The recommendations cited here are a guide and may not be appropriate for use in all situations. The decision to adopt any of the recommendations cited here must be made by the practitioners in light of individual patient circumstances, the wishes of the patient, clinical expertise and resources.

The National Clinical Guideline Centre disclaims any responsibility for damages arising out of the use or non-use of these guidelines and the literature used in support of these guidelines.

3.5.5. Funding

The National Clinical Guideline Centre was commissioned by the National Institute for Health and Care Excellence to undertake the work on this guideline.

Copyright © National Clinical Guideline Centre, 2014.
Bookshelf ID: NBK333071
