
NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Gliklich RE, Dreyer NA, Leavy MB, editors. Registries for Evaluating Patient Outcomes: A User's Guide [Internet]. 3rd edition. Rockville (MD): Agency for Healthcare Research and Quality (US); 2014 Apr.


3. Registry Design

1. Introduction

This chapter is intended as a high-level practical guide to the application of epidemiologic methods that are particularly useful in the design of registries that evaluate patient outcomes. Since it is not intended to replace a basic textbook on epidemiologic design, readers are encouraged to seek more information from textbooks and scientific articles. Table 3–1 summarizes the key considerations for study design that are discussed in this chapter. Throughout the design process, registry planners may want to discuss options and decisions with the registry stakeholders and relevant experts to ensure that sound decisions are made. The choice of groups to be consulted during the design phase generally depends on the nature of the registry, the registry funding source and funding mechanism, and the intended audience for registry reporting. More detailed discussions of registry design for specific types of registries are provided in Chapters 19, 20, 21, 22, and 23.

Table 3–1. Considerations for study design.


2. Research Questions Appropriate for Registries

The questions typically addressed in registries range from purely descriptive questions aimed at understanding the characteristics of people who develop the disease and how the disease generally progresses, to highly focused questions intended to support decisionmaking. Registries focused on determining clinical effectiveness or cost-effectiveness or assessing safety or harm are generally hypothesis driven and concentrate on evaluating the effects of specific treatments on patient outcomes. Research questions should address the registry's purposes, as broadly described in Table 3–2.

Table 3–2. Overview of registry purposes.


Observational studies derived from registries (or “registry-based studies”) are an important part of the research armamentarium alongside interventional studies, such as randomized controlled trials (RCTs) or pragmatic trials, and retrospective studies, such as studies derived exclusively from administrative claims data. Each of these study designs has strengths and limitations, and the selection of a study design should be guided by the research questions of interest. (See Chapter 2, Section 2.2, for a discussion of the factors that influence the study design decision.) In some cases, multiple studies with different designs, or a hybrid study that combines study designs, will be necessary to address a research question. In fact, this more comprehensive approach to evidence development is likely to become more common as researchers strive to address multiple questions for multiple stakeholders as efficiently as possible. Observational studies and interventional studies are more complementary than competitive, precisely because some research questions are better answered by one method than the other. Interventional studies are considered by many to provide the highest-grade evidence for evaluating whether a drug can bring about an intended effect in optimal or “ideal world” situations, a concept also known as “efficacy.”1 Observational designs, on the other hand, are particularly well suited to studying broader populations, to understanding actual results (e.g., some safety outcomes) in real-world practice (see Case Example 2), and to obtaining more representative quality-of-life information. This is particularly true when the factors surrounding the decision to treat are an important aspect of understanding treatment effectiveness.2

In many situations, nonrandomized comparisons either are sufficient to address the research question or, in some cases, may be necessary because of the following issues with randomizing patients to a specific treatment:

  • Equipoise: Can providers ethically introduce randomization between treatments when the treatments may not be clinically equivalent?
  • Ethics: If reasonable suspicion about the safety of a product has become known, would it be ethical to conduct a trial that deliberately exposes patients to potential harm? For example, can pregnant women be ethically exposed to drugs that may be teratogenic? (See Chapter 21 and Case Examples 49, 50, 51, and 52.)
  • Practicality: Will patients enroll in a study where they might not receive the treatment, or might not receive what is likely to be the best treatment? How can compliance and adherence to a treatment be studied, if not by observing what people do in real-world situations?

Registries are particularly suitable for some types of research questions, such as:

  • Natural history studies where the goal is to observe clinical practice and patient experience but not to introduce any intervention.
  • Measures of clinical effectiveness, especially as related to compliance, where the purpose is to learn about what patients and practitioners actually do and how their actions affect real-world outcomes. This is especially important for treatments that have poor compliance.
  • Studies of effectiveness and safety for which clinician training and technique are part of the study of the treatment (e.g., a procedure such as placement of carotid stent).
  • Studies of heterogeneous patient populations, since unlike randomized trials, registries generally have much broader inclusion criteria and fewer exclusion criteria. These characteristics lead to studies with greater generalizability (external validity) and may allow for assessment of subgroup differences in treatment effects.
  • Followup for delayed or long-term benefits or harm, since registries can extend over much longer periods than most clinical trials (because of their generally lower costs to run and lesser burden on participants).
  • Surveillance for rare events or of rare diseases.
  • Studies for treatments in which randomization is unethical, such as intentional exposure to potential harm (as in safety studies of marketed products that are suspected of being harmful).
  • Studies for treatments in which randomization is not necessary, such as when certain therapies are only available in certain places owing to high cost or other restrictions (e.g., proton beam therapy).
  • Studies for which blinding is challenging or unethical (e.g., studies of surgical interventions, acupuncture).
  • Studies of rapidly changing technology.
  • Studies of conditions with complex treatment patterns and treatment combinations.
  • Studies of health care access and barriers to care.
  • Evaluations of actual standard medical practice. (See Case Example 58.)

Registry studies may also include embedded substudies as part of their overall design. These substudies can themselves have various designs (e.g., highly detailed prospective data collection on a subset of registry participants, or a case-control study focused on either incident or prevalent cases identified within the registry). (See Case Examples 3 and 47.) Registries can also be used as sampling frames for RCTs.

3. Translating Clinical Questions Into Measurable Exposures and Outcomes

The specific clinical questions of interest in a registry will guide the definitions of study subjects, exposure, and outcome measures, as well as the study design, data collection, and analysis. In the context of registries, the term “exposure” is used broadly to include treatments and procedures, health care services, diseases, and conditions.

The clinical questions of interest can be defined by reviewing published clinical information, soliciting experts' opinions, and evaluating the expressed needs of the patients, health care providers, and payers. Examples of research questions, key outcome and exposure variables, and sources of data are shown in Table 3–3. As these examples show, the outcomes (generally beneficial or deleterious outcomes) are the main endpoints of interest posed in the research question. These typically represent measures of health or onset of illness or adverse events, but also commonly include quality of life measures, and measures of health care utilization and costs.

Table 3–3. Examples of research questions and key exposures and outcomes.


Relevant exposures also derive from the main research question and relate to why a patient might experience benefit or harm. Evaluation of an exposure includes collection of information that affects or augments the main exposure, such as dose, duration of exposure, route of exposure, or adherence. Other variables of interest include independent risk factors for the outcomes of interest (e.g., comorbidities, age), as well as potential confounding variables, which are related to both the exposure and the outcome and must be measured to conduct valid statistical analyses. Confounding can result in inaccurate estimates of the association between the study exposure and outcome through a mixing of effects. For example, in a study of asthma medications, prior history of treatment resistance should be collected; otherwise, results may be biased. The bias could occur because treatment resistance may relate both to the likelihood of receiving the new drug (i.e., doctors may be more likely to try a new drug in patients who have failed other therapies) and to the likelihood of a poorer outcome (e.g., hospitalization). Refer to Chapter 4 for a discussion of selecting data elements and Chapter 5 for a discussion of selecting patient-reported outcomes.
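The asthma example can be made concrete with a toy calculation (hypothetical counts, not registry data). Within each stratum of prior treatment resistance, the new drug shows no association with hospitalization, yet the crude comparison that ignores resistance suggests harm:

```python
def odds_ratio(a, b, c, d):
    """OR for a 2x2 table: a = exposed cases, b = exposed noncases,
    c = unexposed cases, d = unexposed noncases."""
    return (a * d) / (b * c)

# Hypothetical counts of (hospitalized, not hospitalized) by drug,
# within strata of prior treatment resistance. Resistant (sicker)
# patients are both more often given the new drug and more often
# hospitalized, regardless of drug.
resistant     = {"new": (32, 48), "old": (8, 12)}
non_resistant = {"new": (2, 18),  "old": (8, 72)}

# Stratum-specific ORs: no association within either stratum.
for name, s in [("resistant", resistant), ("non-resistant", non_resistant)]:
    a, b = s["new"]
    c, d = s["old"]
    print(name, "OR =", odds_ratio(a, b, c, d))        # both print 1.0

# Crude OR, collapsing over the confounder: spuriously suggests harm.
a = resistant["new"][0] + non_resistant["new"][0]      # 34 hospitalized, new drug
b = resistant["new"][1] + non_resistant["new"][1]      # 66 not hospitalized, new drug
c = resistant["old"][0] + non_resistant["old"][0]      # 16 hospitalized, old drug
d = resistant["old"][1] + non_resistant["old"][1]      # 84 not hospitalized, old drug
print("crude OR =", round(odds_ratio(a, b, c, d), 2))  # 2.7
```

Collecting the confounder makes the stratified (unconfounded) comparison possible; omitting it leaves only the distorted crude estimate.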

4. Finding the Necessary Data

The identification of key outcome and exposure variables and patients will drive the strategy for data collection, including the choice of data sources. A key challenge to registries is that it is generally not possible to collect all desired data. As discussed in Chapter 4, data collection should be both parsimonious and broadly applicable. For example, while experimental imaging studies may provide interesting data, if the imaging technology is not widely available, the data will not be available for enough patients to be useful for analysis. Moreover, the registry findings will not be generalizable if only sophisticated centers that have such technology participate. Instead, registries should focus on collecting relevant data with relatively modest burden on patients and clinicians. Registry data can be obtained from patients, clinicians, medical records, and linkage with other sources (in particular, extant databases), depending on the available budget. (See Chapters 6, 15, and 16.)

Examples of patient-reported data include health-related quality of life; utilities (i.e., patient preferences); symptoms; use of over-the-counter (OTC), complementary, and alternative medication; behavioral data (e.g., smoking and alcohol use); family history; and biological specimens. These data may rely on the subjective interpretation and reporting of the patient (e.g., health-related quality of life, utilities, symptoms such as pain or fatigue); may be difficult to otherwise track (e.g., use of complementary and alternative medication, smoking, and alcohol use); or may be unique to the patient (e.g., biological specimens). Health care resource utilization is another important construct that reflects both cost of care (burden of illness) and health-related quality of life. For example, more frequent office visits, procedures, or hospitalizations may result in reduced health-related quality of life for the patient. The primary advantage of this form of data collection is that it provides direct information from the entity that is ultimately of the most interest—the patient. The primary disadvantages are that the patient is not necessarily a trained observer and that various forms of bias, such as recall bias, may influence subjective information. For example, people may selectively recall certain exposures because they believe they have a disease that was caused by that exposure, or their recall may be influenced by recent news stories claiming cause-and-effect relationships. (See Case Example 4.)

Examples of clinician data include clinical impressions, clinical diagnoses, clinical signs, differential diagnoses, laboratory results, and staging. The primary advantage of clinician data is that clinicians are trained observers. Even so, the primary disadvantages are that clinicians are not necessarily accurate reporters of patient perceptions, and their responses may also be subject to recall bias. Moreover, the time that busy clinicians can devote to registry data collection is often limited.

Medical records also are a repository of clinician-derived data. Certain data about treatments, risk factors, and effect modifiers are often not consistently captured in medical records of any type, but where available, can be useful. Examples of such data that are difficult to find elsewhere include OTC medications, smoking and alcohol use, complementary and alternative medicines, and counseling activities by the clinician on lifestyle modifications. Medical records are often relied upon as a source of detailed clinical information for adjudication by external reviewers of medical diagnoses corresponding to study endpoints.

Electronic medical records, increasingly available, improve access to the data within medical records. The increasing use of electronic health records has facilitated the development of a number of registries within large health plans. Kaiser Permanente has created several registries of patients receiving total joint replacement, bariatric surgery, and nonsurgical conditions (e.g., diabetes), all of which rely heavily on existing electronic health record data. As discussed further in Chapter 15, the availability of medical records data in electronic format does not, by itself, guarantee consistency of terminology and coding.

Examples of other data sources include health insurance claims, pharmacy data, laboratory data, other registries, and national data sets, such as Medicare claims data and the National Death Index. These sources can be used to supplement registries with data that may otherwise be difficult to obtain, subject to recall bias, not collected because of loss to followup, or likely inaccurate by self-report (e.g., in those patients with diseases affecting recall, cognition, or mental status). See Table 6–1 (in Chapter 6) for more information on data sources.

5. Resources and Efficiency

Ideally, a study is designed to optimally answer the research question of interest and is funded adequately to achieve its objectives based on the requirements of the design. Frequently, however, the finite resources available at the outset of a project constrain the approaches that may be pursued. Often, through efficiencies in the selection of a study design and patient population (observational study vs. RCT, case-control vs. prospective cohort), selection of data sources (e.g., medical-records–based studies vs. information collected directly from clinicians or patients), restriction of the number of study sites, or other approaches, studies may be planned that provide adequate evidence for addressing a research question in spite of limited resources.

Section 6 below discusses how certain designs may be more efficient for addressing some research questions.

6. Study Designs for Registries

Although studies derived from registries are, by definition, observational studies, the framework for how the data will be analyzed drives the data collection and choices of patients for inclusion in the study.

The study models of case series, cohort, case-control, and case-cohort are commonly applied to registry data and are described briefly here. When case-control or case-cohort designs are applied to registry data, additional data may be collected to facilitate examination of questions that arise. Before adding new data elements, whether in a nested substudy or for a new objective, several of the steps outlined in Chapter 2, including assessing feasibility, considering the necessary scope and rigor, and evaluating the regulatory/ethical impact, should be undertaken. Other models that are also useful in some situations, but are not covered here, include: case-crossover studies, which are efficient designs for studying the effects of intermittent exposures (e.g., use of erectile dysfunction drugs) on conditions with sudden onset, and quasi-experimental studies or “pragmatic trials.” For example, in a pragmatic trial, providers may be randomized as to which intervention or quality improvement tools they use, but patients are observed without further intervention. Also, there has been recent interest in applying the concept of adaptive clinical trial design to registries. An adaptive design has been defined as a design that allows adaptations or modifications to some aspects of a clinical trial after its initiation without undermining the validity and integrity of the trial.3 While many long-term registries are modified after initiation, the more formal aspects of adaptive trial design have yet to be applied to registries and observational studies.

Determining what framework will be used to analyze the data is important in designing the registry and the registry data collection procedures. Readers are encouraged to consult textbooks of epidemiology and pharmacoepidemiology for more information. Many of the references in Chapters 13 and 18 relate to study design and analysis.

6.1. Case Series Design

Using a registry population to develop case series is a straightforward application that does not require sophisticated analytics. Depending on the generalizability of the registry itself, case series drawn from the registry can be used to describe the characteristics to be used in comparison to other case series (e.g., from spontaneous adverse event reports). Self-controlled methods, including self-controlled case series, are a relatively new set of methods that lends itself well to registry analyses as it focuses on only those subjects who have experienced the event of interest and uses an internal comparison to derive the relative (not absolute) incidence of the event during the time the subject is “exposed” compared with the incidence during the time when they are “unexposed.”4 This design implicitly controls for all confounders that do not vary over the followup time (e.g., gender, genetics, geographic area), as the subject serves as his or her own control. The self-controlled case series design may also be very useful in those circumstances where a comparison group is not available. Self-controlled case series require that the probability of exposure is not affected by the occurrence of an outcome; in addition, for non-recurrent events, the method works only when the event risk is small and varies over the followup time. Derivative methods, grouped as self-controlled cohort methods, include observational screening5 and temporal pattern discovery.6 These methods compare the rate of events post-exposure with the rate of events pre-exposure among patients with at least one exposure.
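As a sketch of the core arithmetic, the following computes a crude pooled relative incidence from hypothetical exposed and unexposed person-time. Published self-controlled case series analyses fit a conditional Poisson model rather than pooling in this way, so this is illustrative only:

```python
# Each record: events and person-days observed while exposed and while
# unexposed, for one subject who experienced at least one event
# (hypothetical data; every subject serves as his or her own control).
subjects = [
    {"events_exp": 1, "days_exp": 30, "events_unexp": 1, "days_unexp": 335},
    {"events_exp": 2, "days_exp": 60, "events_unexp": 0, "days_unexp": 305},
    {"events_exp": 0, "days_exp": 45, "events_unexp": 2, "days_unexp": 320},
]

# Pool events and person-time across subjects, then compare rates.
events_exp   = sum(s["events_exp"] for s in subjects)    # 3 events
days_exp     = sum(s["days_exp"] for s in subjects)      # 135 exposed days
events_unexp = sum(s["events_unexp"] for s in subjects)  # 3 events
days_unexp   = sum(s["days_unexp"] for s in subjects)    # 960 unexposed days

rate_exp = events_exp / days_exp       # events per exposed person-day
rate_unexp = events_unexp / days_unexp # events per unexposed person-day
relative_incidence = rate_exp / rate_unexp
print(round(relative_incidence, 2))    # 7.11
```

Because only within-person comparisons contribute, fixed characteristics such as gender, genetics, and geographic area drop out of the contrast, as described above.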

6.2. Cohort Design

Cohort studies follow, over time, a group of people who possess a characteristic, to see if individuals in the group develop a particular endpoint or outcome. The cohort design is used for descriptive studies as well as for studies seeking to evaluate comparative effectiveness and/or safety or quality of care. Cohort studies may include only people with exposures (such as to a particular drug or class of drugs) or disease of interest. Cohort studies may also include one or more comparison groups for which data are collected using the same methods during the same period. A single cohort study may in fact include multiple cohorts, each defined by a common disease or exposure. Cohorts may be small, such as those focused on rare diseases, but often they target large groups of people (e.g., in safety studies), such as all users of a particular drug or device. Some limitations of registry-based cohort studies may include limited availability of treatment data and underreporting of outcomes if a patient leaves the registry or is not adequately followed up.7 These pitfalls should be considered and addressed when planning a study.

6.3. Case-Control Design

A case-control study gathers patients who have a particular outcome or who have suffered an adverse event (“cases”) and “controls” who have not but are representative of the source population from which the cases arise.8 If properly designed and conducted, it should yield results similar to those expected from a cohort study of the population from which the cases were derived. The case-control design is often employed for understanding the etiology of rare diseases9 because of its efficiency. In studies where expensive data collection is required, such as some genetic analyses or other sophisticated testing, the case-control design is more efficient and cost effective than a cohort study because a case-control design collects information only from cases and a sample of noncases. However, if no de novo data collection is required, the use of the cohort design may be preferable since it avoids the challenge of selecting a suitable control group and the concomitant danger of introducing more bias.
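The efficiency argument can be illustrated numerically. In the hypothetical cohort below, sampling only 1 percent of the noncases as controls reproduces the cohort odds ratio while requiring data collection on far fewer subjects, because the sampling fraction cancels out of the odds ratio:

```python
def odds_ratio(a, b, c, d):
    """OR for a 2x2 table: a = exposed cases, b = exposed noncases,
    c = unexposed cases, d = unexposed noncases."""
    return (a * d) / (b * c)

# Hypothetical source cohort: 100,000 people, rare outcome.
exposed_cases, exposed_noncases = 100, 19_900      # 20,000 exposed
unexposed_cases, unexposed_noncases = 200, 79_800  # 80,000 unexposed

cohort_or = odds_ratio(exposed_cases, exposed_noncases,
                       unexposed_cases, unexposed_noncases)

# Case-control design: keep every case, but only a 1% sample of noncases.
f = 0.01
cc_or = odds_ratio(exposed_cases, exposed_noncases * f,
                   unexposed_cases, unexposed_noncases * f)

subjects_needed = (exposed_cases + unexposed_cases
                   + (exposed_noncases + unexposed_noncases) * f)
print(round(cohort_or, 3), round(cc_or, 3), round(subjects_needed))
# The sampled OR equals the cohort OR, but expensive data (e.g., genetic
# testing) would be collected on ~1,300 subjects instead of 100,000.
```

The cancellation holds only when controls are a representative sample of the source population's noncases, which is exactly the control-selection requirement described above.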

Depending on the outcome or event of interest, cases and controls may be identifiable within a single registry. For example, in the evaluation of restenosis after coronary angioplasty in patients with end-stage renal disease, investigators identified both cases and controls from an institutional percutaneous transluminal coronary angioplasty registry; in this example, controls were randomly selected from the registry and matched by age and gender.10 Alternatively, cases can be identified in the registry and controls chosen from outside the registry. Care must be taken, however, that the controls from outside the registry meet the requirement of arising from the same source population as the cases to which they will be compared. Matching in case-control designs—for example, ensuring that patient characteristics such as age and gender are similar in the cases and their controls—may yield additional efficiency, in that a smaller number of subjects may be required to answer the study question with a given power. However, matching does not eliminate confounding and must be undertaken with care. Matching variables must be accounted for in the analysis, because a form of selection bias similar to confounding will have been introduced by the matching.11

Properly executed, a case-control study can add efficiency to a registry if more extensive data are collected by the registry only for the smaller number of subjects selected for the case-control study. This design is sometimes referred to as a “nested” case-control study, since subjects are taken from a larger cohort. It is generally applied because of budgetary or logistical concerns relating to the additional data desired. Nested case-control studies have been conducted in a wide range of patient registries, from studying the association between oral contraceptives and various types of cancer using the Surveillance Epidemiology and End Results (SEER) program12-14 to evaluating the possible association of depression with Alzheimer's disease. As an example, in the latter case-control study design, probable cases were enrolled from an Alzheimer's disease registry and compared with randomly selected nondemented controls from the same base population.15

Case-control studies present special challenges with regard to control selection. More information on considerations and strategies can be found in a set of papers by Wacholder.16-18

6.4. Case-Cohort Design

The case-cohort design is a variant of the case-control study. As in a case-control study, a case-cohort study enrolls patients who have a particular outcome or who have suffered an adverse event (“cases”), and “controls” who have not, but who are representative of the source population from which the cases arise. In nested case-control studies where controls are selected via risk-set sampling, each person in the source population has a probability of being selected as a control that is, ideally, in proportion to his or her person-time contribution to the cohort. In a case-cohort study, however, each control has an equal probability of being sampled from the source population.19 This allows for collection of pertinent data for cases and for a sample of the full cohort, instead of the whole cohort. For example, in a case-cohort study of histopathologic and microbiological indicators of chorioamnionitis, which included identification of specific microorganisms in the placenta, cases consisted of extreme preterm infants with cerebral palsy. Controls, which can be thought of as a randomly selected subcohort of subjects at risk of the event of interest, were selected from among all infants enrolled in a long-term study of preterm infants.20

With the assumptions that competing risks and loss to followup are not associated with the exposure or the risk of disease, the case-cohort design allows for the selection of one control group that can be compared with various case series since the controls are selected at the beginning of followup. Analogous to a cohort study where every subject in the source population is at risk for the disease at the start of followup, the control series in a case-cohort design represents a sample of the exposed and unexposed in the source population who are disease-free at the start of followup.
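The sampling contrast with a nested case-control study can be sketched schematically (hypothetical IDs and case sets): the subcohort is drawn once, at baseline, with equal probability for every cohort member, and can then be reused as the comparison group for more than one case series:

```python
import random

random.seed(7)  # fixed seed so the sketch is reproducible

# Hypothetical source cohort at the start of followup (subject IDs).
cohort = list(range(1, 1001))

# Case-cohort design: draw one subcohort at baseline, giving every
# member an equal probability of selection, regardless of how much
# person-time each later contributes.
subcohort = random.sample(cohort, 100)

# The same subcohort serves as the comparison group for any case
# series that arises during followup, e.g., two different outcomes.
cases_outcome_a = {12, 87, 345, 670}   # hypothetical cases, outcome A
cases_outcome_b = {5, 345, 902}        # hypothetical cases, outcome B

for label, cases in [("A", cases_outcome_a), ("B", cases_outcome_b)]:
    analysis_ids = cases | set(subcohort)  # cases plus reused subcohort
    print("outcome", label, "analysis set size:", len(analysis_ids))
```

A nested case-control study would instead draw a fresh control set for each case series via risk-set sampling, in proportion to person-time at risk.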

7. Choosing Patients for Study

The purpose of a registry is to provide information or describe events and patterns, and often to generate hypotheses about a specific patient population to whom study results are meant to apply. Studies can be conducted of people who share common characteristics, with or without the inclusion of comparison groups. For example, studies can be conducted of:

  • People with a particular disease/outcome or condition. (These are focused on characteristics of the person.)

    Examples include studies of the occurrence of cancer or rare diseases, pregnancy outcomes, and recruitment pools for clinical trials.

  • Those with a particular exposure. (These exposures may be to a product, procedure, or other health service.)

    Examples include general surveillance registries, pregnancy registries for particular drug exposures, and studies of exposure to medications and to devices such as stents.21 They also include studies of people who were treated under a quality improvement program, as well as studies of a particular exposure that requires controlled distribution, such as drugs with serious safety concerns (e.g., isotretinoin, clozapine, natalizumab [Tysabri®]), where the participants in the registry are identified because of their participation in a controlled distribution/risk management program.

  • Those who were part of a program evaluation, disease management effort, or quality improvement project.

    An example is the evaluation of the effectiveness of evidence-based program guidelines on improving treatment.

7.1. Target Population

Selecting patients for registries can be thought of as a multistage process that begins with understanding the target population (the population to which the findings are meant to apply, such as all patients with a disease or a common exposure) and then selecting a sample of this population for study. Some registries will enroll all, or nearly all, of the target population, but most registries will enroll only a sample of the target population. The accessible study population is that portion of the target population to which the participating sites have access. The actual study population is the subset of those who can actually be identified and invited and who agree to participate.22 While it is desirable for the patients who participate in a study to be representative of the target population, it is rarely possible to study groups that are fully representative from a statistical sampling perspective, either for budgetary reasons or for reasons of practicality. An exception is registries composed of all users of a product (as in postmarketing surveillance studies where registry participation is required as a condition of receiving an intervention), an approach which is becoming more common to manage expensive interventions and/or to track potential safety issues.

Certain populations pose greater difficulties in assembling an actual study population that is truly representative of the target population. Children and other vulnerable populations present special challenges in recruitment, as they typically will have more restrictions imposed by institutional review boards and other oversight groups.

As with any research study, very clear definitions of the inclusion and exclusion criteria are necessary and should be well documented, including the rationale for these criteria. A common feature of registries is that they typically have few inclusion and exclusion criteria, which enhances their applicability to broader populations. Restriction, the strategy of limiting eligibility for entry to individuals within a certain range of values for a confounding factor, such as age, may be considered in order to reduce the effect of a confounding factor when it cannot otherwise be controlled, but this strategy may reduce the generalizability of results to other patients.

These criteria will largely be driven by the study objectives and any sampling strategy. For a more detailed description of target populations and their subpopulations, and how these choices affect generalizability and interpretation, see Chapter 13.

Once the patient population has been identified, attention shifts to selecting the institutions and providers from which patients will be selected. For more information on recruiting patients and providers, see Chapter 10.

7.2. Comparison Groups

Once the target population has been selected and the mechanism for their identification (e.g., by providers) is decided, the next decision involves determining whether to collect data on comparators (sometimes called parallel cohorts). Depending on the purpose of the registry, internal, external, or historical groups can be used to strengthen the understanding of whether the observed effects are real and in fact different from what would have occurred under other circumstances. Comparison groups are most useful in registries where it is important to distinguish between alternative decisions or to assess differences, the magnitude of differences, or the strength of associations between groups. Registries without comparison groups can be used for descriptive purposes, such as characterizing the natural history of a disease or condition, or for hypothesis generation. The addition of a comparison group may add significant complexity, time, and cost to a registry.

Although it may be appealing to use more than one comparison group in an effort to overcome the limitations that may result from using a single group, multiple comparison groups pose their own challenges to the interpretation of registry results. For example, the results of comparative safety and effectiveness evaluations may differ depending on the comparison group used. Generally, it is preferable to make judgments about the “best” comparison group for study during the design phase and then concentrate resources on these selected subjects. Alternatively, sensitivity analyses can be used to test inferences against alternative reference groups to determine the robustness of the findings. (See Chapter 13, Section 5.)

The choice of comparison groups is more complex in registries than in clinical trials. Whereas clinical trials use randomization to try to achieve an equal distribution of known and unknown risk factors that can confound the treatment-outcome association, registry studies must use various design and analytic strategies to control for the confounders that they have measured. The concern for observational studies is that people who receive a new drug or device may have different risk factors for adverse events than those who choose other treatments or receive no treatment at all. In other words, treatment choices are often related to demographic and lifestyle characteristics and to the presence of coexisting conditions that affect clinician decisionmaking about whom to treat.23

One design strategy frequently used to ensure comparability of groups is individual matching of exposed patients and comparators on key demographic factors such as age and gender. Comparability can also be achieved through inclusion criteria that, for example, restrict the registry focus to patients who have had the disease for a similar duration or who are receiving their first drug treatment for a new condition. These inclusion criteria make the patient groups more similar but may limit external validity by defining the target population more narrowly. Other design techniques include matching study subjects on a large number of risk factors, or using statistical techniques (e.g., propensity scoring) to create strata of patients with similar risks. As an example, consider a recent study of a rare side effect in coronary artery surgery for patients with acute coronary syndrome. In this instance, the main exposure of interest was the use of antifibrinolytic agents during revascularization surgery, a practice that had become standard for such surgeries. The sickest patients, who were most likely to have adverse events, were much less likely to be treated with antifibrinolytic agents. To address this, the investigators measured more than 200 covariates (by drug and outcome) per patient and used this information in a propensity score analysis. The results of this large-scale observational study revealed that the traditionally accepted practice (aprotinin) was associated with serious end-organ damage and that less expensive generic medications were safe alternatives.24 Incorporation of propensity scores in analysis is discussed further in Chapter 13, Section 5.

An internal comparison group refers to simultaneous data collection for patients who are similar to the focus of interest (i.e., those with a particular disease or exposure in common), but who do not have the condition or exposure of interest. For example, a registry might collect information on patients with arthritis who are using acetaminophen for pain control. An internal comparison group could be arthritis patients who are using other medications for pain control. Data regarding similar patients, collected during the same calendar period and using the same data collection methods, are useful for subgroup comparisons, such as for studying the effects in certain age categories or among people with similar comorbidities. However, the information value and utility of these comparisons depend largely on having adequate sample sizes within subgroups, and such analyses may need to be specified a priori to ensure that recruitment supports them. Internal comparisons are particularly useful because data are collected during the same observation period as for all study subjects, which will account for time-related influences that may be external to the study. For example, if an important scientific article is published that affects general clinical practice, and the publication occurs during the period in which the study is being conducted, clinical practice may change. The effects may be comparable for groups observed during the same period through the same system, whereas information from historical comparisons, for example, would be expected to reflect different practices.

An external comparison group is a group of patients similar to those who are the focus of interest, but who do not have the condition or exposure of interest, and for whom relevant data that have been collected outside of the registry are available. For example, the SEER program maintains national data about cancer and has provided useful comparison information for many registries where cancer is an outcome of interest.25 External comparison groups can provide informative benchmarks for understanding effects observed, as well as for assessing generalizability. Additionally, large clinical and administrative claims databases can contribute useful information on comparable subjects for a relatively low cost. A drawback of external comparison groups is that the data are generally not collected the same way and the same information may not be available. The underlying populations may also be different from the registry population. In addition, plans to merge data from other databases require the proper privacy safeguards to comply with legal requirements for patient data; Chapter 7 covers patient privacy rules in detail.

A historical comparison group refers to patients who are similar to the focus of interest, but who do not have the condition or exposure of interest, and for whom information was collected in the past (such as before the introduction of an exposure or treatment or development of a condition). Historical controls may actually be the same patients who later become exposed, or they may consist of a completely different group of patients. For example, historical comparators are often used for pregnancy studies since there is a large body of population-based surveillance data available, such as the Metropolitan Atlanta Congenital Defects Program (MACDP).26 This design provides weak evidence because symmetry is not assured (i.e., the patients in different time periods may not be as similar as desired). Historical controls are susceptible to bias by changes over time in uncontrollable, confounding risk factors, such as differences in climate, management practices, and nutrition. Bias stemming from differences in measuring procedures over time may also account for observed differences.

An approach related to the use of historical comparisons is the use of Objective Performance Criteria (OPC) as a comparator. This research method has been described as an alternative to randomized trials, particularly for the study of devices.27 OPC are “performance criteria based on broad sets of data from historical databases (e.g., literature or registries) that are generally recognized as acceptable values. These criteria may be used for surrogate or clinical endpoints in demonstrating the safety or effectiveness of a device.”28 A U.S. Food and Drug Administration guidance document on medical devices includes a description of study designs that should be considered as alternatives to randomized clinical trials and that may meet the statutory criteria for preapproval as well as postapproval evidence.28 Registries serve as a source of reliable historical data in this context. New registries with safety or effectiveness endpoints may also be planned that will incorporate previously existing OPC as comparators (e.g., for a safety endpoint for a new cardiac device). Such registries might use prior clinical study data to set the “complication-free rate” for comparison.

There are several situations in which conventional prospective design for comparison selection is impossible and a historical comparison may be considered:

  • When one cannot ethically continue the use of older treatments or practices, or when clinicians and/or patients refuse to continue their use, so that the researcher cannot identify relevant sites using the older treatments.
  • When uptake of a new medical practice has been so rapid that concurrent comparators differ markedly from treated patients, with regard to factors related to outcomes of interest, and therefore cannot serve as valid comparison subjects due to intractable confounding.
  • When conventional treatment has been consistently unsuccessful and the effect of new intervention is obvious and dramatic (e.g., first use of a new product for a previously untreatable condition).
  • When collecting the comparison data is too expensive.
  • When the Hawthorne effect (a phenomenon that refers to changes in the behavior of subjects because they know they are being studied or observed) makes it impossible to replicate actual practice in a comparison group during the same period.
  • When the desired comparison is to usual care or “expected” outcomes at a population level, and data collection is too expensive due to the distribution or size of that population.

8. Sampling

Various sampling strategies for patients and sites can be considered. Each of these has tradeoffs in terms of validity and information yield. The representativeness of the sample, with regard to the range of characteristics that are reflective of the broader target population, is often a consideration, but representativeness mainly affects generalizability rather than the internal validity of the results. Representativeness should be considered in terms of patients (e.g., men and women, children, the elderly, different racial or ethnic groups) and sites (academic medical centers, community practices). For sites (health care providers, hospitals, etc.), representativeness is often considered in terms of geography, practice size, and academic or private practice type. Reviewing and refining the research question can help researchers define an appropriate target population and a realistic strategy for subject selection.

To ensure that enough meaningful information will be available for analysis, registry studies often restrict eligibility for entry to individuals within a certain range of characteristics. Alternatively, they may use some form of sampling: random selection, systematic sampling, or a nonrandom approach. Often-used sampling strategies include the following:

  • Probability sampling: Some form of random selection is used, wherein each person in the population must have a known (often equal) probability of being selected.29-32

    Census: A census sample includes every individual in a population or group (e.g., all known cases). A census is not feasible when the group is large relative to the costs of obtaining information from individuals.

    Simple random sampling: The sample is selected in such a way that each person has the same probability of being sampled.

    Stratified random sampling: The group from which the sample is to be taken is first stratified into subgroups on the basis of an important, related characteristic (e.g., age, parity, weight) so that each individual in a subgroup has the same probability of being included in the sample, but the probabilities for different subgroups or strata are different. Stratified random sampling ensures that the different categories of characteristics that are the basis of the strata are sufficiently represented in the sample. However, the resulting data must be analyzed using more complicated statistical procedures (such as Mantel-Haenszel) in which the stratification is taken into account.

    Systematic sampling: Every nth person in a population is sampled.

    Cluster (area) sampling: The population is divided into clusters, these clusters are randomly sampled, and then some or all patients within selected clusters are sampled. This technique is particularly useful in large geographic areas or when cluster-level interventions are being studied.

    Multistage sampling: Multistage sampling can include any combination of the sampling techniques described above.

  • Nonprobability sampling: Selection is systematic or haphazard but not random. The following sampling strategies affect the type of inferences that can be drawn; for example, it would be preferable to have a random sample if the goal were to estimate the prevalence of a condition in a population. However, systematic sampling of “typical” patients can generate useful data for many purposes, and is often used in situations where probability sampling is not feasible.33

    Case series or consecutive (quota) sampling: All consecutive eligible patients treated at a given practice or by a given clinician are enrolled until the enrollment target is reached. This approach is intended to reduce conscious or unconscious selection bias on the part of clinicians as to whom to enroll in the study, especially with regard to factors that may be related to prognosis.

    Haphazard, convenience, volunteer, or judgmental sampling: This includes any sampling not involving a truly random mechanism. A hallmark of this form of sampling is that the probability that a given individual will be in the sample is unknown before sampling. The theoretical basis for statistical inference is lost, and the result is inevitably biased in unknown ways.

    Modal instance: The most typical subject is sampled.

    Purposive: Several predefined groups are deliberately sampled.

    Expert: A panel of experts judges the representativeness of the sample or is the source that contributes subjects to a registry.

Individual matching of cases and controls is sometimes used as a sampling strategy for controls. Controls are matched with individual cases who have similar confounding factors, such as age, to reduce the effect of the confounding factors on the association being investigated.

Patients may be recruited in a fashion that allows for individual matching. For example, if a 69-year-old “case” participates in the registry, a control near in age will be sought. Individual matching for prospective recruitment is challenging and not customarily used. More often, matching is used to create subgroups for supplemental data collection for case-control studies and cohort studies when subjects are limited and/or stratification is unlikely to provide enough subjects in each stratum for meaningful evaluation.

A number of other sampling strategies have arisen from survey research (e.g., snowball, heterogeneity), but they are of less relevance to registries.
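The stratified random sampling design described above can be sketched in a few lines of Python. The registry pool, the age-group stratification, and the 10-percent sampling fraction below are all hypothetical, chosen only to illustrate the mechanics:

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, fraction, seed=0):
    """Draw a simple random sample of the given fraction within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)
    sample = []
    for members in strata.values():
        # Same within-stratum sampling fraction guarantees each subgroup
        # is represented in proportion to its size in the pool.
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical registry pool: (patient id, age group) pairs.
pool = [(i, "65+" if i % 4 == 0 else "<65") for i in range(1000)]
sampled = stratified_sample(pool, stratum_of=lambda p: p[1], fraction=0.10)
print(len(sampled))  # 100: 25 from the "65+" stratum, 75 from "<65"
```

Because the sampling probabilities can differ across strata in general, analyses of such data must account for the stratification (e.g., with Mantel-Haenszel methods), as noted above.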

9. Registry Size and Duration

Precision in measurement and estimation corresponds to the reduction of random error; it can be improved by increasing the size of the study and modifying the design of the study to increase the efficiency with which information is obtained from a given number of subjects.29

During the registry design stage, it is critical to explicitly state how large the registry will be, how long patients should be followed, and what the justifications are for these decisions. These decisions are based on the overall purpose of the registry. For example, in addressing specific questions of product safety or effectiveness, the desired level of precision to confirm or rule out the existence of an important effect should be specified, and ideally should be linked to policy or practice decisions that will be made based on the evidence. For registries with aims that are descriptive or hypothesis generating, study size may be arrived at through other considerations.

The duration of registry enrollment and followup should be determined both by required sample size (number of patients or person-years to achieve the desired power) and by time-related considerations. The induction period for some outcomes of interest must be considered, and sufficient followup time allowed for the exposure under study to have induced or promoted the outcome. Biological models of disease etiology and causation usually indicate the required time period of observation for an effect to become apparent. Calendar time may be a consideration in studies of changes in clinical practice or interventions that have a clear beginning and end. The need for evidence to inform policy may also determine a timeframe within which the evidence must be made available to decisionmakers.

A detailed discussion of the topic of sample size calculations for registries is provided in Appendix A. For present purposes it is sufficient to briefly describe some of the critical inputs to these calculations that must be provided by the registry developers:

  • The expected timeframe of the registry and the time intervals at which analyses of registry data will be performed.
  • Either the size of clinically important effects (e.g., minimum clinically important differences) or the desired precision associated with registry-based estimates.
  • Whether or not the registry is intended to support regulatory decisionmaking. If the results from the registry will affect regulatory action—for example, the likelihood that a product may be pulled from the market—then the precision of the overall risk estimate is important, as is the necessity to predict and account for attrition.

In a classical calculation of sample size, the crucial inputs that must be provided by the investigators include either the size of clinically important effects or their required precision. For example, suppose that the primary goal of the registry is to compare surgical complication rates in general practice with those in randomized trials. The inputs to the power calculations would include the complication rates from the randomized trials (e.g., 4 percent) and the complication rate in general practice, which would reflect a meaningful departure from this rate (e.g., 6 percent). If, on the other hand, the goal of the registry is simply to track complication rates (and not to compare the registry with an external standard), then the investigators should specify the required width of the confidence interval associated with those rates. For example, in a large registry, the 95-percent confidence interval for a 5-percent complication rate might extend from 4.5 percent to 5.5 percent. If all of the points in this confidence interval lead to the same decision, then an interval of ±0.5 percent is considered sufficiently precise, and this is the input required for the estimation of sample size.
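Both calculations in the example above can be sketched with the standard normal-approximation formulas. The function names and the 80-percent power assumption are illustrative, not part of the text:

```python
import math

# Standard-normal quantiles for a two-sided alpha = 0.05 test at 80% power.
Z_ALPHA = 1.959964  # Phi^-1(0.975)
Z_BETA = 0.841621   # Phi^-1(0.80)

def n_per_group(p1, p2, z_alpha=Z_ALPHA, z_beta=Z_BETA):
    """Approximate per-group size to detect a difference between two proportions."""
    p_bar = (p1 + p2) / 2
    root = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
            + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    return math.ceil(root ** 2 / (p1 - p2) ** 2)

def n_for_halfwidth(p, halfwidth, z_alpha=Z_ALPHA):
    """Size so the (1 - alpha) confidence interval around p has a given half-width."""
    return math.ceil(z_alpha ** 2 * p * (1 - p) / halfwidth ** 2)

# Chapter example: trial complication rate 4%, meaningful departure 6%.
print(n_per_group(0.04, 0.06))       # about 1,863 patients per group
# Tracking-only goal: a 5% rate estimated to within +/- 0.5 percentage points.
print(n_for_halfwidth(0.05, 0.005))  # about 7,299 patients
```

Note how much the required size depends on the question: ruling out a two-percentage-point difference takes far fewer patients per group than pinning a single rate down to a half-percentage-point interval.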

Specifying the above inputs to sample size calculations is a substantial matter and usually involves a combination of quantitative and qualitative reasoning. The issues involved in making this specification are essentially similar for registries and other study designs, though for registries designed to address multiple questions of interest, one or more primary objectives or endpoints must be selected that will drive the selection of a minimum sample size to meet those objectives.

Other considerations that should sometimes be taken into account when estimating sample sizes include—

  • whether individual patients can be considered “independent,” or whether they share factors that would lead to correlation in measures between them;
  • whether multiple comparisons are being made and subjected to statistical testing; and
  • whether levels of expected attrition or lack of adherence to therapy may require a larger number of patients to achieve the desired number of person-years of followup or exposure.

In some cases, patients under study who share some group characteristics, such as patients treated by the same clinician or practice, or at the same institution, may not be entirely independent from one another with regard to some outcomes of interest or when studying a practice-level intervention. To the extent that they are not independent, a measure of interdependence, the intraclass correlation (ICC), and the resulting “design effect” must be considered in generating the overall sample size calculation. A reference addressing sample size considerations for a study incorporating a cluster-randomized intervention is provided.34 A hierarchical or multilevel analysis may be required to account for one or more levels of “grouping” of individual patients, as discussed further in Chapter 13, Section 5.

One approach to addressing multiple comparisons in the surgical complication rate example above is to use control chart methodology, a statistical approach used in process measurement to examine observed variability and determine whether out-of-control conditions are occurring. Control chart methodology is also used in sample size estimation, largely for studies with repeated measurements, to adjust the sample size as needed and thereby maintain reasonably precise confidence limits around the point estimate. Accordingly, for registries that involve ongoing evaluation, sample size per time interval could be determined by the precision associated with the related confidence interval, and decision rules for identifying problems could then be based on control chart methodology.
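The design-effect adjustment mentioned above follows the standard formula DEFF = 1 + (m − 1) × ICC, where m is the average cluster size. The ICC, cluster size, and starting sample size below are illustrative values, not from the text:

```python
import math

def design_effect(icc, cluster_size):
    """Variance inflation from clustering: DEFF = 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc

def clustered_n(n_independent, icc, cluster_size):
    """Inflate a sample size computed under independence to allow for
    within-cluster correlation."""
    return math.ceil(n_independent * design_effect(icc, cluster_size))

# Illustrative: 1,863 patients needed under independence, an ICC of 0.02,
# and an average of 40 patients enrolled per site.
print(design_effect(0.02, 40))      # DEFF = 1.78
print(clustered_n(1863, 0.02, 40))  # about 3,317 patients
```

Even a small ICC can nearly double the required sample size when clusters are large, which is why site-level clustering should be considered at the design stage rather than discovered at analysis.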

Although most of the emphasis in estimating study size requirements is focused on patients, it is equally important to consider the number of sites needed to recruit and retain enough patients to achieve a reasonably informative number of person-years for analysis. Many factors are involved in estimating the number of sites needed for a given study, including the number of eligible patients seen in a given practice during the relevant time period, desired representativeness of sites with regard to geography, practice size, or other features, and the timeframe within which study results are required, which may also limit the timeframe for patient recruitment.

In summary, the aims of a registry, the desired precision of information sought, and the hypotheses to be tested, if any, determine the process and inputs for arriving at a target sample size and specifying the duration of followup.

Registries with mainly descriptive aims, or those that provide quality metrics for clinicians or medical centers, may not require the choice of a target sample size to be arrived at through power calculations. In either case, the costs of obtaining study data, in monetary terms and in terms of researcher, clinician, and patient time and effort, may set upper as well as lower limits on study size. Limits to study budgets and the number of sites and patients that could be recruited may be apparent at the outset of the study. However, an underpowered study involving substantial data collection that is ultimately unable to satisfactorily answer the research question(s) may prove to be a waste of finite monetary as well as human resources that could better be applied elsewhere.

10. Internal and External Validity

The potential for bias refers to opportunities for systematic errors to influence the results. Internal validity is the extent to which study results are free from bias, and the reported association between exposure and outcome is not due to unmeasured or uncontrolled-for variables. Generalizability, also known as external validity, is a concept that refers to the utility of the inferences for the broader population that the study subjects are intended to represent. In considering potential biases and generalizability, we discuss the differences between RCTs and registries, since these are the two principal approaches to conducting clinically relevant prospective research.

The strong internal validity that earns RCTs high grades for evidence comes largely from the randomization of exposures that helps ensure that the groups receiving the different treatments are similar in all measured or unmeasured characteristics, and that, therefore, any differences in outcome (beyond those attributable to chance) can be reasonably attributed to differences in the efficacy or safety of the treatments. However, it is worth noting that RCTs are not without their own biases, as illustrated by the “intent-to-treat” analytic approach, in which people are considered to have used the assigned treatment, regardless of actual compliance. The intent-to-treat analyses can minimize a real difference—generating a distortion known as “bias toward the null”—by including the experience of people who did not adhere to the recommended study product along with those who did.

Another principal difference between registries and RCTs is that RCTs are often focused on a relatively homogeneous pool of patients from which significant numbers of patients are purposefully excluded at the cost of external validity—that is, generalizability to the target population of disease sufferers. Registries, in contrast, usually focus on generalizability so that their population will be representative and relevant to decisionmakers.

10.1. Generalizability

The strong external validity of registries derives from the fact that they include typical patients, often more heterogeneous populations than those participating in RCTs (e.g., a wide range of ages, ethnicities, and comorbidities). Therefore, registry data can provide a good description of the course of disease and the impact of interventions in actual practice and, for some purposes, may be more relevant for decisionmaking than data derived from the artificial constructs of the clinical trial. In fact, even though registries have more opportunities to introduce bias (systematic error) because of their nonexperimental methodology, well-designed observational studies can approximate the effects of interventions observed in RCTs on the same topic,35, 36 particularly in the evaluation of health care effectiveness.37

The choice of groups from which patients will be selected directly affects generalizability. No particular method will ensure that an approach to patient recruitment is adequate, but it is worthwhile to note that the way in which patients are recruited, classified, and followed can either enhance or diminish the external validity of a registry. Some examples of how these methods of patient recruitment and followup can lead to systematic error follow.

10.2. Information Bias

If the registry's principal goal is the estimation of risk, it is possible that adverse events or the number of patients experiencing them will be underreported if the reporter will be viewed negatively for reporting them. It is also possible for those collecting data to introduce bias by misreporting the outcome of an intervention if they have a vested interest in doing so. This type of bias is referred to as information bias (also called detection, observer, ascertainment, or assessment bias), and it addresses the extent to which the data that are collected are valid (represent what they are intended to represent) and accurate. This bias arises if the outcome assessment can be interfered with, intentionally or unintentionally. On the other hand, if the outcome is objective, such as whether or not a patient died or the results of a lab test, then the data are unlikely to be biased.

10.3. Selection Bias

A registry may create an incentive to enroll only patients who are at low risk of complications or who are known not to have suffered such complications, biasing the results of the registry toward lower event rates. Registries whose participants derive some benefit from reporting low complication rates (e.g., registries in which surgeons report on their own patients) are at particularly high risk for this type of bias. Another example of how patient selection methods can lead to bias is the use of patient volunteers, a practice that may lead to selective participation by subjects most likely to perceive a benefit, distorting results for studies of patient-reported outcomes.

Enrolling patients who share a common exposure history, such as having used a drug that has been publicly linked to a serious adverse effect, could distort effect estimates for cohort and case-control analyses. Registries can also selectively enroll people who are at higher risk of developing serious side effects, since having a high-risk profile can motivate a patient to participate in a registry.

The term selection bias refers to situations where the procedures used to select study subjects lead to an effect estimate among those participating in the study that is different from the estimate that is obtainable from the target population.38 Selection bias may be introduced if certain subgroups of patients are routinely included or excluded from the registry.

10.4. Channeling Bias (Confounding by Indication)

Channeling bias, also called confounding by indication, is a form of selection bias in which drugs with similar therapeutic indications are prescribed to groups of patients with prognostic differences.39 For example, physicians may prescribe new treatments more often to those patients who have failed on traditional first-line treatments.

One approach to designing studies to address channeling bias is to conduct a prospective review of cases, in which external reviewers are blinded as to the treatments that were employed and are asked to determine whether a particular type of therapy is indicated and to rate the overall prognosis for the patient.40 This method of blinded prospective review was developed to support research on ruptured cerebral aneurysms, a rare and serious situation. The results of the blinded review were used to create risk strata for analysis so that comparisons could be conducted only for candidates for whom both therapies under study were indicated, a procedure much like the application of additional inclusion and exclusion criteria in a clinical trial.

A computed “propensity score” (i.e., the predicted probability of use of one therapy over another based on medical history, health care utilization, and other characteristics measured prior to the initiation of therapy) is increasingly incorporated into study designs to address this type of confounding.41, 42 Propensity scores may be used to create cohorts of initiators of two different treatments matched with respect to probability of use of one of the two therapies, for stratification or for inclusion as a covariate in a multivariate analysis. Studies incorporating propensity scores as part of their design may be planned prior to and implemented shortly following launch of a new drug as part of a risk management program, with matched comparators being selected over time, so that differences in prescribing patterns following drug launch may be taken into account.43
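The matched-cohort step described above can be sketched as a greedy nearest-neighbor match on the propensity score. The scores here are assumed to have been estimated already (e.g., from a logistic model of treatment on covariates), and the patient IDs and the 0.05 caliper are hypothetical:

```python
def greedy_match(treated, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbor match on propensity score within a caliper.

    `treated` and `controls` are lists of (patient_id, propensity_score).
    Each control is used at most once; treated patients with no control
    inside the caliper are left unmatched.
    """
    pairs = []
    available = sorted(controls, key=lambda c: c[1])
    for pid, score in sorted(treated, key=lambda t: t[1]):
        if not available:
            break
        best = min(available, key=lambda c: abs(c[1] - score))
        if abs(best[1] - score) <= caliper:
            pairs.append((pid, best[0]))
            available.remove(best)  # without-replacement matching
    return pairs

# Hypothetical propensity scores for three initiators and four comparators.
treated = [("T1", 0.31), ("T2", 0.52), ("T3", 0.90)]
controls = [("C1", 0.30), ("C2", 0.50), ("C3", 0.33), ("C4", 0.61)]
print(greedy_match(treated, controls))  # [('T1', 'C1'), ('T2', 'C2')]
```

T3 is dropped because no comparator has a sufficiently similar score, illustrating the tradeoff between comparability of the matched cohorts and retention of the full treated population.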

Instrumental variables, or factors strongly associated with treatment but related to outcome only through their association with treatment, may provide additional means of adjustment for confounding by indication, as well as unmeasured confounding.44 Types of instrumental variables include providers' preferences for one therapy over another—a variable which exploits variation in practice as a type of natural experiment; variation or changes in insurance coverage or economic factors (e.g., cigarette taxes) associated with an exposure; or geographic distance from a specific type of service.45, 46 Variables that serve as effective instruments of this nature are not always available and may be difficult to identify. While use of clinician or study site may, in some specific cases, offer potential as an instrumental variable for analysis, the requirement that use of one therapy over another be very strongly associated with the instrument is often difficult to meet in real-world settings. In most cases, instrumental variable analysis provides an alternative for secondary analysis of study data. Instrumental variable analysis may either support the conclusions drawn on the basis of the initial analysis, or it may raise additional questions regarding the potential impact of confounding by indication.42
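A minimal sketch of the provider-preference idea uses the Wald estimator for a binary instrument, which compares outcomes and treatment rates across levels of the instrument. The simulated data, the true effect of 2.0, and all variable names here are hypothetical:

```python
import random

def wald_iv_estimate(data):
    """Wald IV estimator for a binary instrument Z (e.g., provider preference).

    data: list of (z, treated, outcome) tuples. Effect =
    (mean outcome | z=1  -  mean outcome | z=0) /
    (treatment rate | z=1  -  treatment rate | z=0).
    """
    def means(z):
        rows = [(t, y) for (zz, t, y) in data if zz == z]
        n = len(rows)
        return sum(t for t, _ in rows) / n, sum(y for _, y in rows) / n
    t1, y1 = means(1)
    t0, y0 = means(0)
    return (y1 - y0) / (t1 - t0)

# Simulated cohort with a true treatment effect of 2.0 and an unmeasured
# confounder u (severity) that drives both treatment choice and outcome.
rng = random.Random(1)
data = []
for _ in range(20000):
    z = rng.random() < 0.5            # provider prefers the new therapy
    u = rng.random()                  # unmeasured severity
    t = int(rng.random() < 0.2 + 0.4 * z + 0.3 * u)
    y = 2.0 * t + 3.0 * u + rng.gauss(0, 1)
    data.append((int(z), t, y))
print(round(wald_iv_estimate(data), 2))  # close to the true effect, 2.0
```

The estimator recovers the treatment effect despite the unmeasured confounder because the instrument influences the outcome only through treatment; in practice, as noted above, instruments that satisfy this condition strongly are hard to find.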

In some cases, however, differences in disease severity or prognosis between patients receiving one therapy rather than another may be so extreme and/or unmeasurable that confounding by indication is not remediable in an observational design.47 This represents special challenges for observational studies of comparative effectiveness, as the severity of underlying illness may be a strong determinant of both choice of treatment and treatment outcome.

10.5. Bias from Study of Existing Rather Than New Product Users

If there is any potential for tolerance to affect the use of a product, such that only those who perceive benefit from it or are free from harm continue using it, then recruiting existing users rather than new users will include only those who have tolerated or benefited from the intervention and will not capture the full spectrum of experience and outcomes. Selecting only existing users may introduce any number of biases, including incidence/prevalence bias, survivorship bias, and followup bias. By enrolling new users (an inception or incidence cohort), a study ensures that the longitudinal experience of all users is captured and that their experience and outcomes are ascertained comparably.48

10.6. Loss to Followup

Loss to followup or attrition of patients and sites threatens generalizability as well as internal validity if there is differential loss; for example, loss of participants with a particular exposure or disease, or with particular outcomes. Loss to followup and attrition are generally a serious concern only when they are nonrandom (that is, when there are systematic differences between those who leave or are lost and those who remain). The magnitude of loss to followup or attrition determines the potential impact of any bias. Given that the differences between patients who remain enrolled and those who are lost to followup are often unknown (unmeasurable), preventing loss to followup in long-term studies to the fullest extent possible will increase the credibility and validity of the results.49 Attrition should be considered with regard to both patients and study sites, as results may be biased or less generalizable if only some sites (e.g., teaching hospitals) remain in the study while others discontinue participation.
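One way to gauge how much loss to followup could matter is a simple worst-case bound: if the reasons for loss are unknown, the true event rate can be bracketed by assuming that none, or all, of the lost patients had the event. The counts in this sketch are hypothetical:

```python
# Worst-case bounds on an event rate under loss to followup (all counts illustrative)
enrolled = 1000
lost = 150
events_observed = 85  # events among the 850 patients with complete followup

followed = enrolled - lost
observed_rate = events_observed / followed
lower_bound = events_observed / enrolled            # no lost patient had the event
upper_bound = (events_observed + lost) / enrolled   # every lost patient had the event

print(f"Observed rate: {observed_rate:.1%}; "
      f"true rate could lie anywhere in [{lower_bound:.1%}, {upper_bound:.1%}]")
```

Here a 15-percent loss turns an observed 10.0-percent rate into a plausible range of 8.5 to 23.5 percent, which makes concrete why the magnitude of attrition determines the potential impact of any bias.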

10.7. Assessing the Magnitude of Bias

Remaining alert for any source of bias is important, and the value of a registry is enhanced by its ability to provide a formal assessment of the likely magnitude of all potential sources of bias. Any information that can be generated regarding nonrespondents, missing respondents, and the like, is helpful, even if it is just an estimation of their raw numbers. As with many types of survey research, an assessment of differential response rates and patient selection can sometimes be undertaken when key data elements are available for both registry enrollees and nonparticipants. Such analyses can easily be undertaken when the initial data source or population pool is that of a health care organization, employer, or practice that has access to data in addition to key selection criteria (e.g., demographic data or data on comorbidities). Another tool is the use of sequential screening logs, in which all subjects fitting the inclusion criteria are enumerated and a few key data elements are recorded for all those who are screened. This technique allows some quantitative analysis of nonparticipants and assessments of the effects, if any, on representativeness. Whenever possible, quantitative assessment of the likely impact of bias is desirable to determine the sensitivity of the findings to varying assumptions. A text on quantitative analysis of bias through validation studies, and on probabilistic approaches to data analysis, provides a guide for planning and implementing these methods.50
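A minimal example of such a quantitative assessment, in the spirit of the bias-analysis methods cited above, adjusts an observed odds ratio for differential selection into the registry. All counts and selection probabilities here are hypothetical assumptions, not data from any registry:

```python
# Simple quantitative bias analysis for selection bias (counts and selection
# probabilities are hypothetical). Cells: a/b = exposed cases/noncases,
# c/d = unexposed cases/noncases.
a, b, c, d = 240, 760, 100, 900
or_observed = (a * d) / (b * c)

# Assumed probabilities that each kind of subject was selected into the registry
s_case_exposed, s_noncase_exposed = 0.90, 0.70
s_case_unexposed, s_noncase_unexposed = 0.80, 0.70

# The selection-bias factor is the "odds ratio" of the selection probabilities;
# dividing the observed OR by it gives the selection-adjusted estimate
bias_factor = ((s_case_exposed * s_noncase_unexposed) /
               (s_case_unexposed * s_noncase_exposed))
or_adjusted = or_observed / bias_factor

print(f"Observed OR: {or_observed:.2f}, adjusted for assumed selection: {or_adjusted:.2f}")
```

Repeating the calculation over a plausible range of selection probabilities shows how sensitive the observed association is to the assumed pattern of nonparticipation.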

Qualitative assessments, although not as rigorous as quantitative approaches, may give users of the research a framework for drawing their own conclusions regarding the effects of bias on study results if the basis for the assessment is made explicit in reporting the results.

Accordingly, two items that can be reported to help the user assess the generalizability of research results based on registry data are a description of the criteria used to select the registry sites, and the characteristics of these sites, particularly those characteristics that might have an impact on the purpose of the registry. For example, if a registry designed for the purpose of assessing adherence to lipid screening guidelines requires that its sites have a sophisticated electronic medical record in order to collect data, it will probably report better adherence than usual practice because this same electronic medical record facilitates the generation of real-time reminders to engage in screening. In this case, a report of rates of adherence to other screening guidelines (for which there were no reminders), even if these are outside the direct scope of inquiry, would provide some insight into the degree of overestimation.

Finally, and most importantly, whether or not study subjects need to be evaluated on their representativeness depends on the purpose and kind of inference needed. For example, sampling in proportion to the underlying distribution in the population is not necessary to understand biological effects. However, if the study purpose were to estimate a rate of occurrence of a particular event, then sampling would be necessary to reflect the appropriate underlying distributions.

11. Summary

In summary, the key points to consider in designing a registry include study design, data sources, patient selection, comparison groups, sampling strategies, and considerations of possible sources of bias and ways to address them, to the extent that is practical and achievable.

Case Examples for Chapter 3

Case Example 2: Designing a registry for a health technology assessment

Description: The Nuss procedure registry was a short-term registry designed specifically for the health technology assessment of the Nuss procedure, a novel, minimally invasive procedure for the repair of pectus excavatum, a congenital malformation of the chest. The registry collected procedure outcomes, patient-reported outcomes, and safety outcomes.
Sponsor: National Institute for Health and Clinical Excellence (NICE), United Kingdom
Year Started: 2004
Year Ended: 2007
No. of Sites: 13 hospitals
No. of Patients: 260

Challenge

The Nuss procedure is a minimally invasive intervention for the repair of pectus excavatum. During a review of the evidence supporting this procedure conducted in 2003, the National Institute for Health and Clinical Excellence (NICE) determined that the existing data included relatively few patients and few quality of life outcomes, and did not sufficiently address safety concerns. NICE concluded in the 2003 review that the evidence was not adequate for routine use and that more evidence was needed to make a complete assessment of the procedure.

Proposed Solution

Gathering additional evidence through a randomized controlled trial was not feasible for several reasons. First, a blinded trial would be difficult because the other procedures for the repair of pectus excavatum produce much larger scars than the Nuss procedure. Surgeons also tend to perform either only the Nuss procedure or only another procedure, a factor that would complicate randomization efforts. In addition, only a small number of procedures are done in the United Kingdom. The sample for a randomized trial would likely be very small, making it difficult to detect rare adverse events.

Due to these limitations, NICE decided to develop a short-term registry to gather evidence on the Nuss procedure. The advantages of a registry were its ability to gather data on all patients undergoing the procedure in the United Kingdom to provide a more complete safety assessment, and its ability to collect patient-reported outcomes.

The registry was developed by an academic partner, with input from clinicians. Hospitals performing the procedure were identified and asked to enter into the registry data on all patients undergoing the intervention. Once the registry was underway, the cases in the registry were compared against cases included in the Hospital Episodes Statistics (HES) database, a nationwide source of routine data on hospital activity, and nonparticipating hospitals were identified and prompted to enter their data.

Results

NICE conducted a reassessment of the Nuss procedure in 2009, comparing data from the registry with other published evidence on safety and efficacy. The quantity of published literature had increased substantially between 2003 and 2009. The new publications primarily focused on technical and safety outcomes, while the registry included patient-reported outcomes. The literature and the registry reported similar rates of major adverse events such as bar displacement (from 2 to 10 percent). Based on the registry data and the new literature, the review committee found that the evidence was now sufficient to support routine use of the Nuss procedure, and no further review of the guidance is planned. Committee members considered that the registry made a useful contribution to guidance development.

Key Point

The Nuss registry demonstrated that a small, short-term, focused registry with recommended (but not automatic or mandatory) submission can produce useful data, both about safety and about patient-reported outcomes.

Case Example 3: Developing prospective nested studies in existing registries

Description: The Consortium of Rheumatology Researchers of North America (CORRONA) is a national disease registry of patients with rheumatoid arthritis (RA) and psoriatic arthritis (PsA).
Sponsor: CORRONA Investigators and Genentech
Year Started: 2001
Year Ended: Ongoing
No. of Sites: Over 100 sites in the United States
No. of Patients: As of March 31, 2012: 36,922 (31,701 RA patients and 5,221 PsA patients)

Challenge

In 2001, the CORRONA data collection program was established to collect longitudinal, physician- and patient-reported safety and effectiveness data for the treatment and management of RA and PsA. Any patient diagnosed with RA or PsA can participate in the registry, and participation is lifelong unless the patient withdraws consent. With its existing infrastructure and its representative, real-world nature, the disease registry offers a robust platform for nested studies at sites that have been trained in data collection and verification.

Proposed Solution

In collaboration with Genentech, the CORRONA investigators are utilizing the registry in two separate prospective, nested substudies: the Comparative Effectiveness Registry to study Therapies for Arthritis and Inflammatory CoNditions (CERTAIN) and the Treat to Target (T2T) study. Based on the study eligibility criteria and the capabilities of CORRONA sites, different patients and sites are being selected to participate in CERTAIN and T2T.

The CERTAIN study is a nested comparative effectiveness and safety study evaluating real-world differences between classes of biologic agents among RA patients initiating either tumor necrosis factor (TNF) antagonists or non-TNF-inhibitor biologic agents. The study is enrolling approximately 2,750 patients over three years to address comparative effectiveness questions. Long-term safety followup data will be collected through lifelong patient participation in the CORRONA registry after CERTAIN study completion. Data are collected at mandated 3-month intervals and include standard validated physician- and patient-derived outcomes and centrally processed laboratory measures such as complete blood counts, metabolic panels, high-sensitivity CRP, lipids with direct (nonfasting) LDL, immunoglobulin levels, and serology (CCP and RF). Serum, plasma, DNA, and RNA will be stored for future research. In addition, adverse event data are being obtained with inclusion of primary “source” documents, followed by a robust process of verification and adjudication.

The T2T study is a cluster-randomized, open-label study comparing treatment acceleration (i.e., monthly visits with a change in therapeutic agent, dosage or route of administration in order to achieve a target metric of disease activity) against usual care (i.e., no mandated changes to therapy or visit frequencies beyond what the treating physician considers appropriate for the patient). This study will attempt to determine both the feasibility and outcomes of treating to target in a large U.S. population. This one-year study is enrolling 888 patients. Data collection includes standard measures of disease activity such as Clinical-Disease Activity Index (CDAI) score, Disease Activity Score-28 (DAS28), and Routine Assessment of Patient Index Data-3 (RAPID 3), as well as rates of acceleration, frequency of visits, and suspected RA-drug-related toxicities. The purpose of the trial is to test the hypothesis that accelerated aggressive therapy of RA correlates with better long-term patient outcomes.

Results

The CERTAIN and T2T studies, now in the enrollment phase, exemplify the key advantages and the unique operational synergies of successfully nesting studies within an existing disease registry. This design approach has the advantage of minimizing the usual study start-up and implementation challenges. The registry allows real-time identification of eligible patients typically seen in a U.S. clinical practice, a capability that can facilitate patient recruitment. Both CERTAIN and T2T have broad inclusion criteria to increase representativeness of the population enrolled. Established registry sites include investigators, staff, and patients already experienced with the registry questionnaires and research activities.

The two nested substudies require additional patient consent and site reimbursement, as they collect blood samples that increase the time required to complete a study visit. CORRONA collaborates with an academic institution to collect personal identifiers and patient consent to release medical records, thereby facilitating verification of serious adverse events for patients participating in CERTAIN. While this feature adds value to CERTAIN's ability to address long-term safety questions, it entailed establishing a new mechanism to ensure that the two databases (CORRONA and a database for personal identifiers) remain separate from each other in a highly secure way. New enrollment and screening instructions were developed for each substudy, with mandated completion of required training for participating physicians and research coordinators.

Key Point

Designing a prospective, nested study within an established disease registry has many benefits: the study leverages existing infrastructure, patient and site staff are familiar with the registry, and site relationships are already in place. Substudies need to be well planned and address a compelling clinical issue. Registry personnel must provide sufficient guidance, instructions, and rationale to sites to ensure that the transition is smooth and that the distinction from core registry operations is maintained in order to achieve the goal of high-quality research.

For More Information

Kremer J. The CORRONA database. Ann Rheum Dis. 2005 Nov;64 Suppl 4:iv, 37–41. [PMC free article: PMC1766903] [PubMed: 16239384].

The Consortium of Rheumatology Researchers of North America, Inc. (CORRONA). http://www.corrona.org.

US National Institutes of Health. ClinicalTrials.gov. http://www.clinicaltrials.gov/ct2/show/NCT01407419?term=CORRONA&rank=2.

Case Example 4: Designing a registry to address unique patient enrollment challenges

Description: The Anesthesia Awareness Registry is a survey-based registry that collects detailed data about patient experiences of anesthesia awareness. Patient medical records are used to assess anesthetic factors associated with the patient's experience. An optional set of psychological assessment instruments measures potential trauma-related sequelae, including depression and post-traumatic stress disorder (PTSD).
Sponsor: American Society of Anesthesiologists
Year Started: 2007
Year Ended: Ongoing
No. of Sites: Not applicable
No. of Patients: 265

Challenge

Anesthesia awareness is a recognized complication of general anesthesia, defined as the unintended experience and explicit recall of events during surgery. The incidence of anesthesia awareness has been estimated at 1–2 patients per 1,000 anesthetics and may result in development of serious and long-term psychological sequelae including PTSD. The causes of the phenomenon and preventive strategies have been studied, but there is disagreement in the scientific community about the effectiveness of monitoring devices for prevention of anesthesia awareness.

The population of patients experiencing anesthesia awareness is difficult to identify. Although standard short questionnaires designed to identify anesthesia awareness are sometimes administered to patients postoperatively, many patients experience delayed recollection and do not realize that they were awake during their procedure until several weeks later. These patients may or may not report their experience to their provider. In addition, because of the often unsettling and traumatic nature of their experience, even patients who recognize their anesthesia awareness before being discharged from the hospital may not feel comfortable reporting it to their surgeon or other health care providers.

With ongoing coverage in the media, anesthesiologists were facing increasing concern and fear about anesthesia awareness among their patients. The American Society of Anesthesiologists sought a patient-oriented approach to this problem.

Proposed Solution

Because this population of patients is not always immediately recognized in the health care setting, the registry was created to collect case reports of anesthesia awareness directly from patients. A patient advocate was invited to consult in the registry's development and provides ongoing advice from the patient perspective. The registry hosts a Web site that provides information about anesthesia awareness and directions for enrolling in the registry. Any patient who believes they have experienced anesthesia awareness may voluntarily submit a survey and medical records to the registry. Psychological assessments are optional. An optional open-ended discussion about the patient's anesthesia awareness experience provides patients with an opportunity to share information that may not be elicited through the survey.

Results

The registry has enrolled 265 patients since 2007. Patients who enroll are self-selected, and the sample is likely biased toward patients with emotional sequelae. While the information provided to potential enrollees clearly states that eligibility is restricted to awareness during general anesthesia, a surprising number of enrollees are patients who were supposed to be awake during regional anesthesia or sedation. This revealed a different side of the problem of anesthesia awareness: clearly, some patients did not understand the nature of the anesthetic planned for their procedure, or had expectations that were not met by their anesthesia providers. Most enrollees experienced long-term psychological sequelae regardless of anesthetic technique.

Key Point

Allowing the registry's purpose to drive its design produces a registry that is responsive to the expected patient population. Employing direct-to-patient recruitment can be an effective way of reaching a patient population that otherwise would not be enrolled in the registry, and can yield surprising and important insights into patient experience.

For More Information

http://www.awaredb.org

Domino KB. Committee on Professional Liability opens anesthesia awareness registry. ASA Newsletter. 2007. p. 29.p. 34.

Domino KB. Update on the Anesthesia Awareness Registry. ASA Newsletter. 2008;72(11):32, 36.

Kent CD, Bruchas RR, Posner KL, et al. Anesthesia Awareness Registry Update. Anesthesiology. 2009;111:A1518.

Kent CD. Awareness during general anesthesia: ASA Closed Claims Database and Anesthesia Awareness Registry. ASA Newsletter. 2010. pp. 14–16.

Kent CD, Metzger NA, Posner KL, et al. Anesthesia Awareness Registry: psychological impacts for patients. Anesthesiology. 2011:A003.

Domino KB, Metzger NA, Mashour GA. Anesthesia Awareness Registry: patient responses to awareness. Br J Anaesth. 2012;108(2):338P.

References for Chapter 3

1.
Strom BL. Pharmacoepidemiology. 3rd ed. Chichester, England: John Wiley; 2000.
2.
Gliklich R. A New Framework For Comprehensive Evidence Development. IN VIVO. 2012 Oct 22 [June 17, 2013]; http://www.elsevierbi.com/publications/in-vivo/30/9/a-new-framework-for-comprehensive-evidence-development.
3.
Chow SC, Chang M, Pong A. Statistical consideration of adaptive methods in clinical development. J Biopharm Stat. 2005;15(4):575–91. [PubMed: 16022164]
4.
Farrington CP. Relative incidence estimation from case series for vaccine safety evaluation. Biometrics. 1995 Mar;51(1):228–35. [PubMed: 7766778]
5.
Ryan PB, Powell GE, Pattishall EN, et al. Performance of screening multiple observational databases for active drug safety surveillance. Providence, RI: International Society of Pharmacoepidemiology; 2009.
6.
Norén GN, Hopstadius J, Bate A, et al. Temporal pattern discovery in longitudinal electronic patient records. Data Min Knowl Discov. 2010;20(3):361–87.
7.
Travis LB, Rabkin CS, Brown LM, et al. Cancer survivorship—genetic susceptibility and second primary cancers: research strategies and recommendations. J Natl Cancer Inst. 2006 Jan 4;98(1):15–25. [PubMed: 16391368]
8.
Sackett DL, Haynes RB, Tugwell P. Clinical epidemiology. Boston: Little, Brown and Company; 1985. p. 228.
9.
Hennekens CH, Buring JE. Epidemiology in medicine. 1st ed. Boston: Little, Brown and Company; 1987.
10.
Schoebel FC, Gradaus F, Ivens K, et al. Restenosis after elective coronary balloon angioplasty in patients with end stage renal disease: a case-control study using quantitative coronary angiography. Heart. 1997 Oct;78(4):337–42. [PMC free article: PMC1892250] [PubMed: 9404246]
11.
Rothman K, Greenland S. Modern Epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins; 1998. pp. 175–9.
12.
Oral contraceptive use and the risk of endometrial cancer. The Centers for Disease Control Cancer and Steroid Hormone Study. JAMA. 1983 Mar 25;249(12):1600–4. [PubMed: 6338265]
13.
Oral contraceptive use and the risk of ovarian cancer. The Centers for Disease Control Cancer and Steroid Hormone Study. JAMA. 1983 Mar 25;249(12):1596–9. [PubMed: 6338264]
14.
Long-term oral contraceptive use and the risk of breast cancer. The Centers for Disease Control Cancer and Steroid Hormone Study. JAMA. 1983 Mar 25;249(12):1591–5. [PubMed: 6338262]
15.
Speck CE, Kukull WA, Brenner DE, et al. History of depression as a risk factor for Alzheimer's disease. Epidemiology. 1995 Jul;6(4):366–9. [PubMed: 7548342]
16.
Wacholder S, McLaughlin JK, Silverman DT, et al. Selection of controls in case-control studies. I. Principles. Am J Epidemiol. 1992 May 1;135(9):1019–28. [PubMed: 1595688]
17.
Wacholder S, Silverman DT, McLaughlin JK, et al. Selection of controls in case-control studies. II. Types of controls. Am J Epidemiol. 1992 May 1;135(9):1029–41. [PubMed: 1595689]
18.
Wacholder S, Silverman DT, McLaughlin JK, et al. Selection of controls in case-control studies. III. Design options. Am J Epidemiol. 1992 May 1;135(9):1042–50. [PubMed: 1595690]
19.
Rothman K, Greenland S. Modern Epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins; 1998. p. 108.
20.
Vigneswaran R, Aitchison SJ, McDonald HM, et al. Cerebral palsy and placental infection: a case-cohort study. BMC Pregnancy Childbirth. 2004 Jan 27;4(1):1. [PMC free article: PMC343280] [PubMed: 15005809]
21.
Ong AT, Daemen J, van Hout BA, et al. Cost-effectiveness of the unrestricted use of sirolimus-eluting stents vs. bare metal stents at 1 and 2-year follow-up: results from the RESEARCH Registry. Eur Heart J. 2006 Dec;27(24):2996–3003. [PubMed: 17114234]
22.
Hulley SB, Cumming SR. Designing clinical research. Baltimore: Williams & Wilkins; 1988.
23.
Hunter D. First, gather the data. N Engl J Med. 2006 Jan 26;354(4):329–31. [PubMed: 16436764]
24.
Mangano DT, Tudor IC, Dietzel C, et al. The risk associated with aprotinin in cardiac surgery. N Engl J Med. 2006 Jan 26;354(4):353–65. [PubMed: 16436767]
25.
National Cancer Institute. Surveillance Epidemiology and End Results. [August 27, 2012]. http://seer.cancer.gov.
26.
Metropolitan Atlanta Congenital Defects Program (MACDP). National Center on Birth Defects and Developmental Disabilities. Centers for Disease Control and Prevention; [August 27, 2012]. http://www.cdc.gov/ncbddd/birthdefects/MACDP.html.
27.
Chen E, Sapirstein W, Ahn C, et al. FDA perspective on clinical trial design for cardiovascular devices. Ann Thorac Surg. 2006 Sep;82(3):773–5. [PubMed: 16928481]
28.
U.S. Food and Drug Administration; Center for Devices and Radiological Health. The Least Burdensome Provisions of the FDA Modernization Act of 1997; Concept and Principles: Final Guidance for FDA and Industry. [August 14, 2012]. Document issued October 4, 2002. http://www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/GuidanceDocuments/ucm085994.htm#h3.
29.
Cochran WG. Sampling Techniques. 3rd ed. New York: John Wiley & Sons; 1977.
30.
Lohr SL. Sampling: Design and Analysis. Boston: Duxbury; 1999.
31.
Sudman S. Applied Sampling. New York: Academic Press; 1976.
32.
Henry GT. Practical Sampling. Newbury Park, CA: Sage; 1990.
33.
Rothman K, Greenland S. Modern Epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins; 1998. p. 116.
34.
Raudenbush SW. Statistical analysis and optimal design for cluster randomized trials. Psychological Methods. 1997;2(2):173–85.
35.
Concato J, Shah N, Horwitz RI. Randomized, controlled trials, observational studies, and the hierarchy of research designs. N Engl J Med. 2000 Jun 22;342(25):1887–92. [PMC free article: PMC1557642] [PubMed: 10861325]
36.
Benson K, Hartz AJ. A comparison of observational studies and randomized, controlled trials. N Engl J Med. 2000 Jun 22;342(25):1878–86. [PubMed: 10861324]
37.
Black N. Why we need observational studies to evaluate the effectiveness of health care. BMJ. 1996 May 11;312(7040):1215–8. [PMC free article: PMC2350940] [PubMed: 8634569]
38.
Rothman K. Modern Epidemiology. Boston: Little Brown and Company; 1986. p. 83.
39.
Petri H, Urquhart J. Channeling bias in the interpretation of drug effects. Stat Med. 1991 Apr;10(4):577–81. [PubMed: 2057656]
40.
Johnston SC. Identifying confounding by indication through blinded prospective review. Am J Epidemiol. 2001 Aug 1;154(3):276–84. [PubMed: 11479193]
41.
Sturmer T, Joshi M, Glynn RJ, et al. A review of the application of propensity score methods yielded increasing use, advantages in specific settings, but not substantially different estimates compared with conventional multivariable methods. J Clin Epidemiol. 2006 May;59(5):437–47. [PMC free article: PMC1448214] [PubMed: 16632131]
42.
Glynn RJ, Schneeweiss S, Sturmer T. Indications for propensity scores and review of their use in pharmacoepidemiology. Basic Clin Pharmacol Toxicol. 2006 Mar;98(3):253–9. [PMC free article: PMC1790968] [PubMed: 16611199]
43.
Loughlin J, Seeger JD, Eng PM, et al. Risk of hyperkalemia in women taking ethinylestradiol/ drospirenone and other oral contraceptives. Contraception. 2008 Nov;78(5):377–83. [PubMed: 18929734]
44.
Brookhart MA. Instrumental Variables for Comparative Effectiveness Research: A Review of Applications. Rockville, MD: Agency for Healthcare Research and Quality; Jan, 2009. [August 14, 2012]. Slide Presentation from the AHRQ 2008 Annual Conference (Text Version). http://www.ahrq.gov/about/annualmtg08/090908slides/Brookhart.htm.
45.
Evans WN, Ringel JS. Can higher cigarette taxes improve birth outcomes? J Public Economics. 1999;72(1):135–54.
46.
Schneeweiss S, Seeger JD, Landon J, et al. Aprotinin during coronary-artery bypass grafting and risk of death. N Engl J Med. 2008 Feb 21;358(8):771–83. [PubMed: 18287600]
47.
Bosco JL, Silliman RA, Thwin SS, et al. A most stubborn bias: no adjustment method fully resolves confounding by indication in observational studies. J Clin Epidemiol. 2010 Jan;63(1):64–74. [PMC free article: PMC2789188] [PubMed: 19457638]
48.
Ray WA. Evaluating medication effects outside of clinical trials: new-user designs. Am J Epidemiol. 2003 Nov 1;158(9):915–20. [PubMed: 14585769]
49.
Kristman V, Manno M, Cote P. Loss to follow-up in cohort studies: how much is too much? Eur J Epidemiol. 2004;19(8):751–60. [PubMed: 15469032]
50.
Lash TL, Fox MP, Fink AK. Applying quantitative bias analysis to epidemiologic data. Springer; 2009.
