Guthrie B, Yu N, Murphy D, et al. Measuring prevalence, reliability and variation in high-risk prescribing in general practice using multilevel modelling of observational data in a population database. Southampton (UK): NIHR Journals Library; 2015 Oct. (Health Services and Delivery Research, No. 3.42.)

Measuring prevalence, reliability and variation in high-risk prescribing in general practice using multilevel modelling of observational data in a population database.

Chapter 3 Variation in high-risk non-steroidal anti-inflammatory drug prescribing in quarter 4 2006: two-level model of patients within practices

Introduction

Prescribing is typically done by an individual clinician for an individual patient in a particular setting (a practice, a hospital, a clinic) in a particular region or local health economy in a particular health-care system. All of these are potentially sources of variation which may be of interest to researchers and practitioners. As in most analyses of this nature, not all sources of variation are of primary interest or feasible to model. From this perspective, all such models ignore some sources of variation either because they are not relevant (all practices in this analysis come from the same health-care system – NHS Scotland) or because they cannot be measured (in this data set we have no information about region or local health economy). The focus of the analyses described in this chapter is on variation between practices, and models for two outcomes are presented.

Methods

Data sources

Data were extracted by PCCIU for 38 practices participating in NHS Scotland PTI in 2006 (i.e. practices with higher than average coding quality, and where it is more straightforward to identify what kind of clinicians IT system users are and whether or not an ‘encounter’ is a clinical encounter as opposed to another kind of record opening). Data were complete up to 31 March 2007.

Outcome measures

As described above, we defined the five measures of high-risk NSAID prescribing in a way suitable for implementation in a cross-sectional analysis. Peptic ulceration (PCCIU supplied code-set) and heart failure (QOF-defined code set116) were defined as the presence of any relevant Read Code ever, and the date of the first Read Code was taken as the date of first diagnosis. Oral NSAIDs were defined as all drugs in BNF section 10.1.1. Oral anticoagulants were defined as relevant oral drugs in BNF section 2.8.2. Aspirin and clopidogrel were defined as relevant drugs in BNF section 2.9.113

Patients were eligible for analysis if they were permanently registered with the practice continuously between 1 October and 31 December 2006, and if they were defined as being vulnerable to adverse effects of a NSAID on the basis of age, existing disease or coprescription. For age and existing disease, patients had to be aged 75 years or over or have a diagnosis of peptic ulcer or heart failure before 1 October 2006. For coprescription, patients had to have been prescribed an anticoagulant or antiplatelet drug either on the same day as a NSAID during the final quarter of 2006, or to have received an anticoagulant and antiplatelet drug both before and after the final quarter of 2006 (data for 1 January 2007 to 31 March 2007 were used to define coprescription to ensure that patients were on a coprescribed drug at the time of any NSAID prescription in Q4 2006). For the coprescription measures, NSAID prescription only triggered the outcome if the relevant coprescribed drug had been issued on the same day as the NSAID or in the 84 days before and the 84 days after the NSAID.
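The coprescription trigger rule described above can be sketched in a few lines of Python. This is only an illustration, not the extraction logic actually used: the function name `nsaid_triggers_outcome`, the representation of prescriptions as plain dates, and the choice to treat day 84 as inside each window are all assumptions made for the example.

```python
from datetime import date, timedelta

# Window length taken from the text; inclusive boundaries are an assumption.
WINDOW = timedelta(days=84)


def nsaid_triggers_outcome(nsaid_date, codrug_dates, window=WINDOW):
    """Return True if an NSAID issue counts as coprescription.

    The NSAID counts if the coprescribed drug (anticoagulant or antiplatelet)
    was issued on the same day, or was issued both in the `window` days
    before AND in the `window` days after the NSAID.
    """
    same_day = any(d == nsaid_date for d in codrug_dates)
    before = any(nsaid_date - window <= d < nsaid_date for d in codrug_dates)
    after = any(nsaid_date < d <= nsaid_date + window for d in codrug_dates)
    return same_day or (before and after)


# Hypothetical patient: NSAID issued on 1 November 2006.
nsaid = date(2006, 11, 1)
print(nsaid_triggers_outcome(nsaid, [date(2006, 9, 1), date(2007, 1, 10)]))
print(nsaid_triggers_outcome(nsaid, [date(2006, 9, 1)]))
```

Requiring an issue on both sides of the NSAID date is what makes the January to March 2007 data necessary: without them, a coprescribed drug stopped before the NSAID was issued could not be distinguished from one that was continued.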

The outcome variable examined was a binary variable defined as the receipt of any oral NSAID between 1 October and 31 December 2006 inclusive. This replicates the methods used in our previous analysis of a broader basket of high-risk prescribing.5

Additional data were extracted to examine how high-risk NSAID prescribing was associated with patient and practice characteristics:

  • At patient level: sex, age, socioeconomic status (measured by quintiles of postcode-derived Carstairs Score117), number of active repeat drugs (a measure of overall morbidity and resource use118–120) and the number of indicators that an individual was eligible for (a partial measure of how risky a NSAID might be for them).
  • At practice level: list size (in quartiles), practice rurality/remoteness (three aggregated categories of the Scottish Executive Urban Rural Classification), whether or not the practice is accredited for post-graduate training of GPs, whether or not the practice is a dispensing practice, whether or not the practice holds a General Medical Services contract (the standard national contract) or a locally specified contract.
  • At practice level: two practice-level variables were created by aggregating from patient-level data, namely the rate at which practices prescribed new acute NSAIDs to patients at risk in 2006 and the rate at which practices prescribed new repeat NSAIDs in 2006. Both variables were analysed by quartile of practice rate. The rationale for these is that three-quarters of high-risk NSAID prescribing in Q4 is a result of repeat prescribing, and first initiation of an acute NSAID and first initiation of a repeat NSAID are key transition points that lead to repeat prescribing, which is subject to less oversight than acute prescribing. These two variables are therefore measures of recent rates of such transitions at practice level.

Descriptive analysis and multilevel modelling

For each indicator and the composite we calculated the percentage [and exact confidence intervals (CIs)] of eligible patients with high-risk prescribing and how prevalence varied at practice level. Figure 1 illustrates the model structure, with the outcome measured at patient level (a light-green circle indicating that a patient did not receive a high-risk prescription and a dark-green circle with a cross indicating that they did), and with patients clustered within practices in a many-to-one relationship (each practice has many patients, but each patient has only one practice).

FIGURE 1. Two-level model of patients within practices. A light-green circle indicates that a patient did not receive a high-risk prescription and a dark-green circle with a cross indicates that they did.
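The text says only that "exact" CIs were calculated for the prevalences. A common reading of that term is the Clopper–Pearson interval, and the sketch below assumes that interpretation; it is not confirmed by the report. The function name `exact_ci` and the example counts are hypothetical, and the bounds are found by simple bisection on the binomial tail probabilities using only the standard library.

```python
import math


def _binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(total, 1.0)


def exact_ci(k, n, alpha=0.05, iterations=100):
    """Clopper-Pearson interval for k events among n patients (assumed method)."""
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")

    # Lower bound: the p at which P(X >= k) equals alpha/2 (increasing in p).
    lower = 0.0
    if k > 0:
        lo, hi = 0.0, 1.0
        for _ in range(iterations):
            mid = (lo + hi) / 2
            if 1.0 - _binom_cdf(k - 1, n, mid) < alpha / 2:
                lo = mid
            else:
                hi = mid
        lower = (lo + hi) / 2

    # Upper bound: the p at which P(X <= k) equals alpha/2 (decreasing in p).
    upper = 1.0
    if k < n:
        lo, hi = 0.0, 1.0
        for _ in range(iterations):
            mid = (lo + hi) / 2
            if _binom_cdf(k, n, mid) > alpha / 2:
                lo = mid
            else:
                hi = mid
        upper = (lo + hi) / 2

    return lower, upper


# Hypothetical practice: 12 of 150 eligible patients received a NSAID.
print(exact_ci(12, 150))
```

For counts as large as the whole study population a library routine based on the beta distribution would be faster; the bisection form is used here only to keep the example self-contained.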

As the outcome is binary at patient level (a patient particularly vulnerable to NSAID ADEs either receives a NSAID in Q4 2006 or does not), we fitted a two-level hierarchical multilevel model of patients clustered within practices using the xtmelogit command in Stata Intercooled version 11 (StataCorp LP, College Station, TX, USA).121–123 The empty model is defined as:

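$$\operatorname{logit}(y_{ij}) = B_{1j} + e_{0ij} \tag{1}$$

$$B_{1j} = B_1 + u_{1j} \tag{2}$$

$$u_{1j} \sim N(0, \Omega_u), \quad \Omega_u = [\sigma^2_{u1}] \tag{3}$$

$$e_{0ij} \sim (0, \Omega_e), \quad \Omega_e = [\pi^2/3]. \tag{4}$$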

$B_{1j}$ is the intercept, which has an overall fixed component ($B_1$) but is allowed to vary randomly by practice ($u_{1j}$). $u_{1j}$ is assumed to be normally distributed with mean zero and variance $\sigma^2_{u1}$, which is estimated. $e_{0ij}$ is the patient-level residual and is assumed to have mean zero with a variance of $\pi^2/3$. Fixed effects at either level are straightforwardly fitted as in single-level regression models.

The ICC at practice level is defined as the proportion of total variance that is at practice level:

$$\mathrm{ICC} = \frac{\sigma^2_{u1}}{\sigma^2_{u1} + \pi^2/3}. \tag{5}$$
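As a minimal sketch of the ICC definition above, the conversion between the practice-level variance and the ICC can be written as two small Python functions. The function names are illustrative and not taken from the report; the only fixed quantity is the level 1 variance of π²/3 used in a logistic model.

```python
import math

# Patient-level (level 1) variance on the logit scale, fixed at pi^2 / 3.
LEVEL1_VARIANCE = math.pi ** 2 / 3


def icc_from_variance(sigma2_u):
    """Proportion of total variance that lies at practice level."""
    return sigma2_u / (sigma2_u + LEVEL1_VARIANCE)


def variance_from_icc(icc):
    """Invert the ICC definition to recover the practice-level variance."""
    return icc * LEVEL1_VARIANCE / (1 - icc)


# Hypothetical practice-level variance of 0.1 on the logit scale.
print(icc_from_variance(0.1))
```

Because the level 1 variance is fixed, the ICC in a logistic model is a one-to-one function of the estimated practice-level variance, so either quantity can be reported and converted to the other.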

The ICC was initially calculated in the empty model with no fixed effects fitted, defined as the level 2 (practice-level) variance divided by the total variance [the sum of the level 2 variance and the level 1 (patient-level) variance, which in a logistic model is fixed at π²/3]. For each of the patient and practice characteristics described above, we calculated univariate odds ratios (ORs) and 95% CIs using the xtmelogit command to account for clustering, and then fitted an ‘adjusted’ model retaining only those variables which were statistically significant. Two adjusted models were fitted, one with only patient characteristics (a case-mix adjusted model) and one with both patient and practice characteristics. Selection of variables to include was based on testing the significance of individual variables, and on the overall fit of the model assessed using the Akaike information criterion (AIC), for which smaller values indicate better fit. Model assumptions were checked, including checking that the number of integration points used in the estimation was appropriate using the quadchk command (the Stata default of seven integration points was shown to produce stable estimates) and by examining graphically whether level 2 residuals were normally distributed.122,123 Example plots for the main outcomes can be found in Figure 17 in Appendix 6. Variation between practices after accounting for patient characteristics was examined by estimating the ICC, calculating median odds ratios124 and by creating caterpillar plots of shrunken practice-level residuals with 95% CIs from the adjusted model including patient characteristics.125,123 Initial data management and descriptive analysis were carried out in IBM Statistical Product and Service Solutions (SPSS) Statistics version 21 (IBM Corporation, Armonk, NY, USA) and regression modelling in Stata Intercooled version 11.

Reliability analysis

In this context, reliability is the extent to which the practice mean of a patient-level outcome differentiates between practices, that is, how confident we can be that observed differences between practices are a result of true differences in prescribing safety.62,79,126–128 Reliability increases as variation between practices increases (measured by the ICC) and as the number of patients being measured in the practice increases. The ICC is estimated from all the data and does not vary by practice, but the number of patients being measured in each practice varies widely and an indicator may therefore be reliable for a large practice but not for a small one. Reliability varies between 0 (completely unreliable) and 1 (completely reliable). Values greater than 0.7 are generally considered to indicate acceptable reliability in the sense that observed differences between practices can be attributed reasonably confidently to true differences in quality or safety, but for high-stakes evaluation such as for governance purposes, reliabilities of > 0.8 are generally believed to be required and > 0.9 may be preferable. In a two-level model, reliability can be estimated with the Spearman–Brown prophecy formula, which is commonly used in this context both to estimate the reliability of a measure for a given number of patients and to estimate the minimum number of patients that a practice must have being measured to achieve a prespecified level of reliability.79,126

$$\text{Reliability } \lambda = \frac{n\rho}{1 + (n-1)\rho}, \tag{6}$$

where n = the number of patients and ρ = the ICC.

The same formula can be rearranged so that for an observed ICC ρ, the required number of patients, n, to measure with a desired reliability, λ, is given by:

$$n = \frac{\lambda(1-\rho)}{\rho(1-\lambda)}. \tag{7}$$
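The rearrangement is not spelled out in the text; the intermediate steps are short and are sketched here for completeness, starting from the Spearman–Brown formula with n the number of patients, ρ the ICC and λ the reliability.

```latex
\begin{aligned}
\lambda\,[1 + (n-1)\rho] &= n\rho \\
\lambda - \lambda\rho &= n\rho - \lambda n\rho \\
\lambda(1-\rho) &= n\rho\,(1-\lambda) \\
n &= \frac{\lambda(1-\rho)}{\rho\,(1-\lambda)}
\end{aligned}
```

As a worked example using the composite ICC reported in this chapter (ρ = 0.034), a target reliability of λ = 0.8 gives n = (0.8 × 0.966)/(0.034 × 0.2) ≈ 113.6, so roughly 114 eligible patients per practice would be needed.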

For the two-level models examined in this chapter, we used the estimated ICC to calculate the reliability for a practice with the median number of patients and to calculate the number of practices in the study for which the proportion of their patients with high-risk prescribing had reliability > 0.7, > 0.8 and > 0.9.
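The reliability calculations described above can be sketched in Python as follows. The two formulas are the Spearman–Brown expressions given in the text; the function names and the list of practice sizes are hypothetical and serve only to illustrate counting how many practices exceed each threshold.

```python
import math


def reliability(n, icc):
    """Spearman-Brown reliability for a practice with n measured patients."""
    return n * icc / (1 + (n - 1) * icc)


def patients_needed(target, icc):
    """Number of patients at which reliability equals `target` (may be fractional)."""
    return target * (1 - icc) / (icc * (1 - target))


def practices_above(sizes, icc, threshold):
    """Count practices whose reliability strictly exceeds `threshold`."""
    return sum(1 for n in sizes if reliability(n, icc) > threshold)


icc = 0.034  # composite ICC in the empty model, as reported in this chapter

# Hypothetical numbers of eligible patients in five practices.
sizes = [90, 240, 610, 850, 1500]

for threshold in (0.7, 0.8, 0.9):
    n_min = math.ceil(patients_needed(threshold, icc))
    count = practices_above(sizes, icc, threshold)
    print(f"reliability > {threshold}: about {n_min} patients needed; "
          f"{count} of {len(sizes)} example practices qualify")
```

Because the ICC is common to all practices, the threshold on reliability translates directly into a threshold on the number of eligible patients, which is why small practices are the ones that fail to reach the higher reliability levels.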

Results

Analysis was for 31,646 patients particularly vulnerable to NSAID adverse drug effects throughout Q4 2006, registered with 38 practices.

Prevalence of high-risk prescribing for all five indicators and the composite

Table 5 shows the prevalence of high-risk prescribing in the last quarter of 2006 for the five individual indicators and for the composite and the variation between practices. Between 4.3% and 11.1% of those patients particularly vulnerable to NSAID adverse drug effects as defined by the individual indicators received one or more NSAIDs, with 9.5% (95% CI 8.2% to 10.7%) of patients in any indicator receiving a NSAID. The highest prevalence was found for coprescription of a NSAID with aspirin or clopidogrel, and the lowest for coprescription of a NSAID with an oral anticoagulant.

TABLE 5. Prevalence and variation between practices of any high-risk NSAID prescribing in Q4 2006 for all five indicators and the composite

There was evidence of considerable variation between practices: the prevalence of high-risk NSAID prescribing for the composite ranged from 4.2% to 21.0% across practices, with an interquartile range of 7.3–11.2%. As is commonly found,79 ICCs were relatively small, being < 0.05 for all indicators except for the NSAID and oral anticoagulation measure, where the ICC was 0.184 (95% CI 0.086 to 0.351). The interpretation of the ICC is that it shows the proportion of variation in patient outcome attributable to variation between practices. For example, the ICC for the composite indicator is 0.034 (95% CI 0.020 to 0.056), indicating that 3.4% of variation in outcome between patients is attributable to the practice. The remaining 96.6% of variation is either between patients or is random, but in a multilevel logistic regression model it is not possible to disentangle patient and random variation.79,121,129

Composite indicator univariate associations

Univariate associations between high-risk NSAID prescribing measured by the composite indicator and patient characteristics are shown in Appendix 3, Table 22. Men were less likely than women to receive a NSAID (8.9% vs. 10.0%, OR 0.88, 95% CI 0.82 to 0.95), as were people aged under 50 years compared with older people. There was no association with socioeconomic deprivation but NSAID prescribing was less likely in people who had more than one risk factor (6.8% in those eligible for three or more of the five individual indicators vs. 9.8% in those eligible for one, OR 0.66, 95% CI 0.55 to 0.79). There was a strong association between high-risk NSAID prescribing and the number of repeat drugs that a patient is being currently prescribed, with the prevalence of high-risk NSAIDs increasing from 4.4% in those with no active repeats to 17.9% in those with 11 or more active repeats (OR 4.53, 95% CI 3.73 to 5.49).

Associations with practice-level characteristics were generally weaker (see Appendix 3, Table 23). None of the practice structural characteristics was associated with high-risk prescribing (list size, urban/rural location, type of contract, training or dispensing status). A practice’s rate of new acute NSAID prescribing earlier in 2006 was associated with any NSAID use in Q4 2006 [prevalence 12.2% in practices in the highest quartile of new acute NSAID prescribing compared with 7.1% in the lowest quartile, OR 1.83 (95% CI 1.37 to 2.43), and a consistent gradient]. There was a weaker association with practice rates of new repeat NSAID prescribing although this is a rarer event [prevalence 12.0% in practices in the highest quartile of new repeat NSAID prescribing compared with 7.9% in the lowest quartile, OR 1.42 (95% CI 1.03 to 1.96), but a less clear gradient across quartiles].

Composite indicator adjusted analysis

In multivariate analysis, only four variables were significantly associated with high-risk NSAID prescribing (Table 6) – patient age, the number of indicators patients were eligible for, the number of active repeat drugs and practice new acute NSAID prescribing earlier in 2006. After adjustment, there were significant differences only between the under-50-year-olds (reference category) and people aged 80 years and over, who were less likely to be prescribed a high-risk NSAID (OR 0.71, 95% CI 0.73 to 0.85, compared with the under-50-year-olds). The association with the number of indicators that a patient was eligible for was strengthened somewhat (OR 0.49, 95% CI 0.41 to 0.59, in those with three or more risk factors vs. those with one), as was the association with the number of active repeat drugs a patient was prescribed (OR 5.62, 95% CI 4.59 to 6.89, in those with 11 or more active repeats vs. those with none). Practice rates of new acute NSAID prescribing earlier in 2006 remained significantly associated with any high-risk NSAID use in Q4 2006 (OR 1.80, 95% CI 1.41 to 2.30, in practices in the highest quartile compared with the lowest quartile).

TABLE 6. Variation in any high-risk NSAID prescribing in Q4 2006 for the composite indicator, with adjusted associations with patient and practice characteristics (only variables included in the final model are shown)

The ICC in the empty model was 0.034 (95% CI 0.020 to 0.056), compared with 0.026 (95% CI 0.015 to 0.044) in the adjusted model with only patient-level fixed effects and 0.015 (95% CI 0.008 to 0.026) in the adjusted model with both patient-level and practice-level fixed effects. An alternative way of expressing variation between practices is the median odds ratio, which is on the same scale as the fixed effects and is the median odds ratio between a patient randomly selected from one practice and another randomly selected from another practice.124 The median odds ratio in the empty model was 1.38 (95% CI 1.28 to 1.53) and in the fully adjusted model 1.23 (95% CI 1.17 to 1.33), indicating that variation because of differences in patient characteristics is much greater than variation owing to differences in the practice that a patient is registered with. Figure 2 shows variation by practice based on estimated shrunken residuals for each practice from the model adjusted for patient characteristics only. Six of the 38 practices had statistically significantly higher rates of high-risk NSAID prescribing compared with the population average, and nine had significantly lower rates.

FIGURE 2. Residual variation between practices after accounting for patient characteristics significantly associated with any high-risk NSAID prescribing in Q4 2006 (practice-level shrunken residuals estimated from the ‘patient characteristics only’ adjusted model in Table 6).
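The median odds ratio reported above can be related to the practice-level variance with a widely used closed form for random-intercept logistic models, MOR = exp(√(2σ²) × Φ⁻¹(0.75)). The sketch below assumes that this is the form used in the report, which the text does not state explicitly; the function names are illustrative, and the variance is recovered from the ICC using the fixed level 1 variance of π²/3.

```python
import math
from statistics import NormalDist


def variance_from_icc(icc):
    """Practice-level variance implied by an ICC in a logistic model."""
    return icc * (math.pi ** 2 / 3) / (1 - icc)


def median_odds_ratio(sigma2_u):
    """Median odds ratio for a given practice-level variance (assumed closed form)."""
    z75 = NormalDist().inv_cdf(0.75)  # 75th centile of the standard normal
    return math.exp(math.sqrt(2 * sigma2_u) * z75)


# ICCs reported in this chapter for the empty and fully adjusted models.
for label, icc in (("empty model", 0.034), ("fully adjusted model", 0.015)):
    mor = median_odds_ratio(variance_from_icc(icc))
    print(f"{label}: ICC {icc} -> median odds ratio {mor:.2f}")
```

Under this assumption the empty-model ICC of 0.034 gives a median odds ratio of about 1.38, in line with the reported value, and the fully adjusted ICC of 0.015 gives about 1.24, close to the reported 1.23; the small difference is plausibly due to rounding of the published ICC.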

Individual and composite indicator reliability

Table 7 shows the estimated reliability of the five individual indicators and the composite, both before and after accounting for patient characteristics. Reliability varies with both the ICC and the number of patients being measured in each practice. Reliability for a practice with the median number of patients was 0.9 or above for all indicators except for NSAID prescribing in heart failure where it was 0.67. All included practices could be measured with reliability > 0.7 for three of the individual indicators and the composite, but the majority of practices could not be reliably measured at this level for the NSAID in heart failure measure. All included practices could be measured with reliability > 0.8 for the composite indicator, and all but one for NSAIDs prescribed to older people and to people prescribed an oral anticoagulant. None of the individual indicators had reliability > 0.9 in more than three-quarters of included practices, but the composite had reliability > 0.9 in 35 of the 38. Adjusting for patient characteristics made no meaningful difference to the reliability of the composite indicator.

TABLE 7. Reliability of the composite indicator in distinguishing between practices

Discussion

Summary of findings

Prescription of NSAIDs to patients at particularly high risk of adverse drug effects was relatively common, with just under 1 in 10 such patients receiving a NSAID in Q4 2006. In the full multilevel model, high-risk NSAID prescribing was associated with patient age (over-80-year-olds being less likely to receive a NSAID), the number of indicators that a patient was eligible for (those with more risk factors being less likely to receive a NSAID), and the number of repeat drugs a patient was taking (with a progressive increase in NSAID prescribing as the number of repeat drugs increased). At practice level, the only significantly associated variable was the practice rate of new acute NSAID prescribing earlier in 2006 (practices with higher rates of such prescribing had higher rates of total prescribing in Q4 2006).

Variation between practices was fairly large in absolute terms (varying from 4.1% to 21.0% for the composite) but fairly small when measured using the ICC, where less than 5% of variation in patient outcome was attributable to variation between practices. This applied to the composite (ICC in the empty model 0.034) and four of the five individual measures (ICCs varying from 0.026 to 0.043), although for NSAID prescription to people already prescribed oral anticoagulants the ICC was much larger (0.184). Of note is that when variation between practices is expressed as a median odds ratio (the difference in odds of receiving a high-risk NSAID between two patients randomly selected from different practices), the fixed effects in the model generally had larger associations with high-risk prescribing than the practice the patient happened to be in (the median odds ratio was 1.38 in the empty model and 1.23 in the fully adjusted model, both smaller than the most weakly associated patient or practice characteristic). Our overall interpretation is that for total high-risk NSAID prescribing, there is small but statistically and clinically significant variation between practices.

Despite the relatively low ICC, four of the five individual indicators were reasonably reliable (the exception being NSAIDs in heart failure which was not) in their ability to distinguish practices from each other, at least to the level where the indicators could be used in more formative ways (reliability > 0.7), although only for NSAIDs in older people and NSAIDs coprescribed with oral anticoagulants was reliability considered adequate for higher-stakes evaluation (reliability > 0.8) in nearly all included practices. Reflecting the large numbers of patients eligible for the measure (i.e. particularly vulnerable to NSAID adverse drug effects), the composite indicator had reliability > 0.8 in all included practices and > 0.9 in 35 of the 38 practices.

Strengths and limitations

The analysis uses routinely collected data from GP electronic medical records, which has a number of strengths and a number of matching weaknesses. Key strengths are that GP records are generally near complete for prescribing and have reasonable accuracy for common diagnostic coding such as heart failure and peptic ulcer. Key weaknesses are that some prescriptions are hand-written and so will not be recorded electronically (particularly prescriptions written during home visits which are most commonly for frailer patients who are most at risk of ADEs), and that the practices being studied are relatively unusual in that they contribute to a national morbidity recording data set (although this does mean that their diagnostic coding will likely be of higher quality than average). However, the findings are similar to our previous work in a representative one-third of Scottish practices, in terms of the prevalence of high-risk prescribing measured by the indicators examined, the amount of variation between practices and the reliability of the indicators.5 It is also important to note that the practice characteristics data that are available predominantly relate to practice structural characteristics rather than to indicators of how practices organise their prescribing systems or how practitioners within them conceive the risk of the measured prescribing. The partial exception to that is the measure of new acute NSAID prescribing in the previous year, which is associated with total prescribing in Q4 2006.

In addition, composite indicators have a number of potential disadvantages, including that they may combine indicators with very different implications and prevalences. In this case, the composite consists of people prescribed a single drug class where each indicator is clearly associated with risk, although given the numbers of people affected it will be little influenced by NSAID prescribing to people taking oral anticoagulation and to lesser extent NSAID prescribing to people with heart failure. Finally, we do not know the specific indication for the NSAID, which will of course be critical for understanding the risk–benefit balance, as the benefits in people with inflammatory arthritis are likely to be greater (and harder to achieve with other drugs) than in people with mild to moderate osteoarthritis.

Interpretation and comparison with existing literature

This analysis replicates our previously applied methods for the set of indicators and practices used in this study. Of note is that the lower rates of NSAID use in the oldest patients and those with more than one risk factor (i.e. included in more than one indicator) are consistent with there being a general recognition by prescribing GPs that NSAIDs are risky. Rising rates of NSAID use as the number of repeat drugs increases likely reflect greater need for analgesics in sicker people, although of course the risks also increase. The association between prior new acute NSAID use and total NSAID use in Q4 2006 is not unexpected, but reinforces the belief that new acute NSAID use is a reasonable outcome to use in the analysis which follows. Ideally, data would be available on how practices organise their prescribing systems and their knowledge of and attitudes to prescribing risk, although such data are unlikely to ever be routinely available.

The observed absolute variation between practices and the ICCs are similar to our previous study of high-risk prescribing,5 the exception being the NSAIDs coprescribed with oral anticoagulants measure, for which the ICC was 0.184 (95% CI 0.086 to 0.351), which is at the upper end of observed inter-practice variation in other studies, consistent with practices (or GPs within practices) varying considerably in their perception of the risk of this prescribing.79 Small ICCs of the kind observed for the other indicators and the composite are the norm in studies of variation in quality, particularly for binary outcomes where the level 1 variance combines both variation between patients and chance variation. ICCs of this magnitude are not infrequently assumed to mean that there is little or only trivial variation in technical quality between practices or other higher-level organisations or areas.62,85–87,130 However, as others have pointed out, small ICCs at practice or other higher level are quite compatible with large absolute variation between practices if total variation is also large,79,131,132 and the absolute variation in high-risk NSAID use between practices was large for all indicators including those where there were large numbers of patients being measured.

The reliability of an indicator is a measure of how well it distinguishes between practices, or more formally it is the proportion of the observed variation between practices which is because of true differences between practices.62,79,126,128 Reliability varies between 0 (completely unreliable) and 1 (completely reliable) and it is an arbitrary decision as to how reliable a measure must be to be useful. Commonly applied rules of thumb are that reliability > 0.7 is adequate for formative evaluation, but values > 0.8 and ideally > 0.9 are required for summative or high-stakes evaluation, with Streiner and Norman128 suggesting that > 0.75 is the minimum requirement under most circumstances but also noting that few instruments have reliability > 0.9. In part, this depends on the context and in particular the use that a measure will be put to. For example, the reliability required for feeding data back to practices for reflection as part of a formative appraisal process is less than that required for a high-stakes evaluation such as identification of practices for clinical governance investigation or intervention or tournament-based pay-for-performance schemes where there are both winners and losers.62,126

From this perspective, four of the five individual measures examined here would be suitable for formative assessment (the exception being NSAIDs in heart failure) with reliabilities > 0.7 in almost all practices, and the composite indicator would be suitable for high-stakes evaluation with reliability > 0.8 in all practices and > 0.9 in all but the smallest practices. These high reliabilities are achieved despite a relatively small ICC, reflecting that at practice level there are a large number of patients particularly vulnerable to NSAID ADEs, with reliability varying with both the ICC (the same in all practices) and the number of patients being measured (varying between practices). Adjusting for the available patient characteristics made little difference to the reliability, which would simplify implementation in NHS or other health-care settings.

Copyright © Queen’s Printer and Controller of HMSO 2015. This work was produced by Guthrie et al. under the terms of a commissioning contract issued by the Secretary of State for Health. This issue may be freely reproduced for the purposes of private research and study and extracts (or indeed, the full report) may be included in professional journals provided that suitable acknowledgement is made and the reproduction is not associated with any form of advertising. Applications for commercial reproduction should be addressed to: NIHR Journals Library, National Institute for Health Research, Evaluation, Trials and Studies Coordinating Centre, Alpha House, University of Southampton Science Park, Southampton SO16 7NS, UK.

Included under terms of UK Non-commercial Government License.

Bookshelf ID: NBK322062
