Structured Abstract


International standards include recommendations that systematic reviews be comprehensive, but time and resources may render it impractical to search for and extract data from all possible sources of information. For example, many data sources exist for randomized controlled trials (RCTs), a number of which are not publicly available or are difficult or impossible to access. Searching nonpublic sources of RCTs may improve the impact of systematic reviews by identifying information that was recorded but not included in public sources, thus reducing the need for additional RCTs and improving research efficiency.


To determine whether multiple data sources about RCTs affected systematic reviews and meta-analyses of patient-centered outcomes (PCOs) research (eg, trial quality assessment, pooled effect estimates) and to produce open access guidance about using multiple data sources for producers of systematic reviews of PCOs research.


We conducted a methods study related to the conduct of systematic reviews using 2 case studies: (1) gabapentin for neuropathic pain and (2) quetiapine for bipolar depression. Applying comprehensive searching methods as if we were conducting 2 systematic reviews, we attempted to identify all data sources for eligible RCTs. We extracted data and compared the information about trial design and trial quality (ie, risk of bias) using the multiple data sources. Using these multiple data sources, we extracted information about prespecified outcome domains, including the definition of each outcome (ie, the outcome domain, measure, metric, method of aggregation, and time point) and the associated results (ie, numerical estimates of treatment effectiveness). Then, we compared the results of meta-analyses using multiple data sources by conducting multiple meta-analyses in which we systematically added and replaced data from various sources. We compared the outcomes that matter most to patients with the outcomes that were examined in clinical trials.


Most clinical trials in our case studies were associated with multiple data sources, including public sources (eg, journal articles, conference abstracts, trial registrations, and Food and Drug Administration [FDA] reviews) and nonpublic sources (eg, clinical study reports and individual patient data). We found 21 gabapentin RCTs (74 reports, 6 individual participant data) and 7 quetiapine RCTs (50 reports, 1 individual participant data); we found nonpublic sources for 6 of 21 (29%) gabapentin and 4 of 7 (57%) quetiapine RCTs. We identified hundreds of PCOs reported in the sources we found that could be included in our meta-analyses. We surmised that there were many opportunities for selective outcome reporting by trialists and systematic reviewers. The process of identifying all sources of information, and of extracting and analyzing data, required considerable time and skill. Nonpublic sources were especially difficult to identify. Clinical study reports were time consuming to extract, and individual participant data required time and skill to prepare the information for analysis.

Data sources differed in completeness. Most gabapentin and quetiapine RCTs (18/21 [86%] and 6/7 [86%], respectively) were reported in journal articles, which often presented unclear information related to trial quality. When nonpublic sources were available for RCTs, clinical study reports contained the most information about trial design and trial quality, and clinical study reports and individual participant data contained the most results. In these case studies, individual participant data were not accompanied by any metadata (eg, codebooks or description of trial methods).

For a single RCT reported in a single source, 1 outcome domain (eg, pain intensity) could be associated with multiple outcome definitions (eg, using multiple measures) and results. Thus, multiple results for an outcome could be presented selectively, even when an outcome domain was prespecified. Multiple data sources associated with a single RCT sometimes contained contradicting information about trial design.

In the series of meta-analyses using results for a single outcome domain, measure, and time point for each case study, the effect of gabapentin on pain intensity ranged from an SMD of −0.46 (95% CI −0.64 to −0.28) to an SMD of −0.31 (95% CI −0.47 to −0.15), and the effect of quetiapine on depression ranged from SMD = −0.42 (95% CI −0.62 to −0.22) to SMD = −0.44 (95% CI −0.63 to −0.26). The range of results might have been wider had we included other outcome measures and time points in our meta-analyses.

Clinical study reports and individual participant data contained hundreds of harms (adverse events) that were not included in public sources. Further analyses would be required to determine if conclusions about the relative benefits and harms of treatment would be affected by a systematic synthesis of harms included only in nonpublic sources.


There is tremendous variation in the information available across multiple data sources from individual trials. Reported estimates from individual trials were vulnerable to selective reporting at both the trial and the systematic review levels.

Although an “open science” environment is a positive move forward, it will not solve all the problems that we identified with how trial information is currently made available. For example, an open trial data set for the purpose of making research reproducible allows for the investigation of new research questions, systematic reviews, and individual participant data meta-analysis; however, the steps required to share a data set and the steps required to utilize an open data set are resource intensive and demand skills that are not taught routinely to trialists or systematic reviewers, such as creating, storing, accessing, synthesizing, and analyzing information from multiple data sources. Furthermore, patients and stakeholders are often unfamiliar with the wide variety of existing data sources and what can be gained from each of them. Thus, current proposals to expand data sharing require careful and community-wide decision making about implementation.

Although guidance for systematic reviewers suggests searching extensively for information sources and including all data sources found, this approach is neither practical nor possible for many systematic reviews. Our research provides practical insights into a complex area that needs further research and discussion.


Objectives and Overview

Our objectives were to determine whether multiple data sources about randomized controlled trials (RCTs) affected systematic reviews and meta-analyses of patient-centered outcomes (PCOs) research and to produce open access guidance about using multiple data sources for producers of systematic reviews of PCOs research.

To meet these objectives, we conducted a methods study related to the conduct of systematic reviews using 2 case studies: (1) gabapentin for neuropathic pain and (2) quetiapine for bipolar depression. We did not conduct systematic reviews; instead, we examined whether and how the results of systematic reviews might be affected by how trial data are made available, and how meta-analyses might be affected by those who use data from multiple sources for a single trial. Although we investigated methods used in conducting systematic reviews and the effect of those methods on meta-analyses, we did not consider our work to comprise multiple systematic reviews because we did not evaluate the benefits and harms of the interventions per se. Our findings may be of greatest interest to trialists, systematic reviewers, journal editors, and guideline developers.

Patient-Centered Outcomes

We aimed to examine whether we found additional PCOs (meaningful to patients) by looking through all reporting sources for each trial, not just a few sources. Systematic reviewers may rely on easy-to-find sources (eg, public sources) for trial outcomes, and they may not list PCOs that could be available from other sources (eg, nonpublic sources). Searching nonpublic sources may improve the impact of systematic reviews by identifying PCOs that were recorded but not included in public sources. Examining nonpublic data sources for PCOs could reduce the need for additional RCTs and improve the efficiency of PCOs research. As far as we are aware, this possibility has never been addressed in research.

Reporting Biases

Reporting biases threaten the reliability of RCTs and systematic reviews. Failure to report trials at all and selective reporting of design, methods, or results represent an ethical breach with research participants who believed that by participating in a trial they were contributing to knowledge. Reporting bias means that information from a trial is not transmitted to the public and thus also constitutes wasted information.1 When failure to report trials (or failure to report trials completely) is not random, it biases the trial findings and what is generally believed to be true.2

Multiple Data Sources

To decrease the threat of reporting biases, current best practices for the conduct of systematic reviews include a comprehensive search for trial information.3,4 Standards for such a search have focused on public sources of information about trials and include going beyond journal articles and searching the “gray” literature such as conference abstracts, FDA reviews, and trial registries. This is mainly because studies of reporting bias have shown that not all information about trial design and quality can be found in journals, and publication of trial results in a journal is associated with positive trial results.5,6

In addition, studies have shown that additional information about a trial can be found in nonpublic sources.7-9 Although systematic reviewers may wish to search for and include all relevant and reliable data in systematic reviews, comprehensive searching adds to the time and resources required to complete systematic reviews. In addition, information from the various sources may not always agree,8 and systematic reviewers must decide which source is correct.10 The value of searching for nonpublic information is unclear, partly because the best methods for searching are unknown.

Assuming that comprehensive searching for trial information provides additional data, questions remain about which public and nonpublic sources should be searched for and used in systematic reviews. Empirically grounded guidance is needed to help systematic reviewers make choices about the use of data from multiple sources. Increasingly, the move to open science is making nonpublic sources available.11 Even if all that is currently hidden becomes public, we would still need guidance to address the gains and challenges from accessing multiple data sources for a single trial.


We published a protocol describing detailed methods for our study.12 What follows is a summary of our methods.

Selection of Case Studies

To evaluate the effect of using data from multiple sources in systematic reviews and meta-analyses, we examined 2 case studies regarding the effectiveness and safety of (1) gabapentin (Neurontin) for neuropathic pain in adults and (2) quetiapine (Seroquel) for the treatment of depression in adults with bipolar disorder following a published protocol.12 We used methods consistent with the National Academy of Medicine (formerly the Institute of Medicine) standards for conducting systematic reviews.4

We selected these cases for several reasons. First, gabapentin and quetiapine are used commonly for these respective conditions, so the included studies are clinically important. Second, pain and depression are associated with several patient-reported outcomes. Methodologically, these cases also allow us to consider different, typical situations that systematic reviewers might face with respect to data from multiple sources.

Gabapentin for Neuropathic Pain

Neuropathic pain occurs when there is damage to the peripheral nervous system, which is the array of nerves that transmit information from the central nervous system (brain and spinal cord) to other parts of the body. Between 3% and 10% of the population may be living with neuropathic pain, which affects 30% to 50% of people with diabetes in particular.13 Other consequences include loss of sleep, reduced quality of life, and high health care costs.

The FDA approved gabapentin for the treatment of epilepsy in 1993, and it approved gabapentin for the treatment of postherpetic neuralgia (ie, residual pain in people who have had shingles) in 2002. Gabapentin is used off label for a variety of symptoms, including for neuropathic and other types of pain.

Quetiapine for Bipolar Depression

Bipolar disorder is characterized by episodes of depression and at least 1 episode of mania or hypomania14 and has a lifetime prevalence of 1% to 4%.15,16 Work, family life, and social life are impaired significantly by depressive episodes.17,18 People with bipolar disorder are at increased risk of suicide compared with the general population and compared with people who have other mental health problems.19,20

Quetiapine is an antipsychotic medication originally used for the treatment of acute psychotic episodes. Quetiapine is currently recommended as a first-line choice for treatment of acute bipolar depression,21,22 although it is associated with outcomes that are undesirable to patients, including daytime sleepiness, cognitive impairment, loss of libido, and rapid weight gain. Quetiapine and other antipsychotics are also associated with serious adverse events, including cardiac and metabolic effects, and extrapyramidal symptoms.23,24

Methods for Identifying Patient-Centered Outcomes

A goal of this project was to examine what matters most to patients who take gabapentin for neuropathic pain and to patients who take quetiapine for bipolar depression. We used existing resources and consulted patient coinvestigators. We compared the list of outcomes that matter most to patients with the outcomes that were examined in clinical trials.

We identified PCOs by reviewing existing information sources, and we worked with our patient coinvestigators and clinical experts to identify those outcomes they consider patient centered. Specifically, we examined the following:

  1. Core outcome sets (COSs) using the Core Outcome Measures in Effectiveness Trials database25 (http://www.comet-initiative.org) and the James Lind Alliance (http://www.lindalliance.org) in April and May 2014, respectively. We asked coinvestigators and colleagues to identify COSs related to neuropathic pain and bipolar disorder.
  2. A social media website (http://www.patientslikeme.com/) in April 2014. PatientsLikeMe allows patients to enter information about their medical history and interventions they have used.
  3. DRUGDEX (http://micromedex.com/) in January 2015. DRUGDEX is a compendium health care providers use to make treatment decisions. The Centers for Medicare and Medicaid may use DRUGDEX ratings to make reimbursement decisions related to medically accepted treatments. DRUGDEX contains a list, by organ system, of each drug's adverse effects.
  4. FDA prescribing information (“package inserts”) accessed through the Drugs@FDA website in April 2014.

After examining existing sources, we asked patient and stakeholder partners to identify outcomes they thought should be included in each case study. We discussed the outcomes identified through all sources, and we selected those outcomes that patient and stakeholder partners deemed most important. We then generated a list of outcomes reported in clinical trials for each case study, and we examined whether PCOs were included in sources of clinical trials.

Effect of Multiple Data Sources on Meta-analyses

Eligibility Criteria for a Trial's Inclusion in Our Study

Our published protocol includes detailed inclusion and exclusion criteria for each case study.12 Table 1 describes the eligibility criteria for participants, interventions, outcomes, comparisons, and time points.

Table 1. Eligibility Criteria for the Case Studies.

Search Methods for Identification of Trials

We conducted electronic and additional searches to identify trials as described in our study protocol.12 We searched for the following types of sources, which are listed by their approximate level of expected detail (ie, less detail to more detail):

  1. Trial registrations without results posted on trial registries (eg, www.ClinicalTrials.gov)
  2. Short reports (eg, conference abstracts, posters)
  3. Trial registrations with summary data posted on trial registries
  4. Peer reviewed journal articles
  5. Dissertations (eg, Master's or doctoral theses)
  6. Unpublished manuscripts and reports (eg, reports to funders, clinical study reports)
  7. Information from regulators (ie, FDA medical and statistical reviews)
  8. Individual patient data

Selection of Trials

Two investigators independently screened titles and abstracts to determine which were eligible for inclusion in the review, as described in our study protocol.12 We independently screened full text reports of all potentially relevant trials and resolved disagreements through consensus and by asking a third investigator as needed.

Data Collection

We extracted information about trial design as described in our study protocol.12 We recorded all outcomes within the prespecified outcome domains, including the 5 elements described in Table 2.26,27

Table 2. The 5 Elements of an Outcome.

For sources of aggregate data

For trials associated with multiple sources (eg, journal articles, clinical study reports), we extracted data from each source, including trial design, trial quality (ie, risk of bias), outcomes, and results. We developed data collection forms and entered data in the Systematic Review Data Repository (SRDR).28,29

From single sources about more than 1 trial, we extracted information about each individual trial; we did not extract results from multiple trials when the data had already been combined (ie, we were unable to separate data for an individual trial from the combined data).

We extracted results from figures and tables that included numerical data (eg, counts and means included in graphs), including statistical significance (eg, P value above or below .05). We did not meta-analyze information that was presented only graphically. We extracted text from sources describing results as “statistically significant” or “not significant.”

Two investigators extracted data independently and resolved discrepancies by consensus and through discussion with a third investigator if necessary. We used Stata (version 14)31 to reconcile the records and to produce a final data set that we will share publicly through SRDR at the end of the study.
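The reconciliation step can be illustrated with a minimal sketch. The study used Stata for this purpose; the Python below is only an illustration, and the record layout (a dictionary keyed by a hypothetical trial/source identifier, with hypothetical field names) is an assumption rather than the structure of the actual data collection forms.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]


def find_discrepancies(
    first: Dict[str, Record], second: Dict[str, Record]
) -> List[Tuple[str, str, object, object]]:
    """List every field on which two independent extractions disagree.

    Each result is (item_id, field, value_from_first, value_from_second).
    A field or item missing from one extraction is reported with None.
    """
    discrepancies = []
    for item_id in sorted(set(first) | set(second)):
        rec_a = first.get(item_id, {})
        rec_b = second.get(item_id, {})
        for field in sorted(set(rec_a) | set(rec_b)):
            val_a = rec_a.get(field)
            val_b = rec_b.get(field)
            if val_a != val_b:
                discrepancies.append((item_id, field, val_a, val_b))
    return discrepancies
```

Each reported discrepancy would then go to the two investigators for consensus, with a third investigator consulted if necessary.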

For individual patient data

We received individual participant data for gabapentin trials from Pfizer as Microsoft Access database files. When codebooks were not available, we compared the individual participant data with case report forms (which often showed how and when data were recorded) and statistical analysis plans (which often showed how data were coded and analyzed) to identify the variables contained in the individual participant data.

We did not receive any data sets about quetiapine from AstraZeneca. In 1 quetiapine clinical study report, we identified an appendix with individual participant data, which we extracted using ABBYY FineReader (version 11).32


Assessment of Trial Quality

To compare the completeness of sources with respect to trial quality, we assessed each source using the Cochrane Collaboration Risk of Bias Tool3 as described in our study protocol.12 Two investigators independently rated each source for risk of bias (low, unclear, or high risk of bias) related to sequence generation, allocation concealment, masking of participants, masking of outcome assessors, masking of providers, and missing data. Investigators resolved discrepancies by consensus and through discussion with a third investigator if necessary.

Series of Meta-analyses

For each case study, we conducted a series of meta-analyses to explore the impact of multiple data sources on the overall results. We analyzed these results by adding or replacing data as follows:

  1. Including only results from publications in peer reviewed journals
  2. Adding results from trial registries to step 1
  3. Adding results from short reports to step 2
  4. Adding or replacing results (from step 3) using results obtained from regulators (eg, FDA reviews)
  5. Adding or replacing results (from step 4) using summary results obtained from the authors or manufacturers (eg, clinical study reports)
  6. Replacing the best available aggregate results for all trials (step 5) with individual patient data where available

For example, if there was no source for a trial in steps 1 to 4 (ie, no journal article, trial registration, conference abstract, or FDA review) but there was a clinical study report, we added the results from the clinical study report in step 5. If there was a journal article, an FDA review, and a clinical study report, we replaced the results from the journal article with the results from the FDA review in step 4, and we replaced the results from the FDA review with data from the clinical study report in step 5.
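The add-or-replace sequence can be sketched as a small selection rule that returns, for one trial, the source whose results would enter the meta-analysis at a given step. This is an illustrative reading of the 6 steps, not the study's code: the source labels are hypothetical, and the sketch assumes that registry results and short reports (steps 2 and 3) are used only for trials that have no earlier source, whereas regulator reviews, clinical study reports, and individual participant data (steps 4 to 6) replace whatever was used before.

```python
from typing import Optional, Set

# (source label, mode) in step order; labels are hypothetical names.
STEPS = [
    ("journal_article", "add"),
    ("trial_registry", "add"),
    ("short_report", "add"),
    ("regulator_review", "replace"),
    ("clinical_study_report", "replace"),
    ("individual_participant_data", "replace"),
]


def source_used(available: Set[str], step: int) -> Optional[str]:
    """Return the source contributing a trial's results at a step (1 to 6).

    Returns None when the trial has no usable source yet at that step.
    """
    chosen = None
    for source, mode in STEPS[:step]:
        if source not in available:
            continue
        # "add" fills a gap only; "replace" overrides an earlier source.
        if chosen is None or mode == "replace":
            chosen = source
    return chosen
```

For a trial with a journal article, an FDA review, and a clinical study report, this rule selects the journal article at steps 1 to 3, the FDA review at step 4, and the clinical study report at step 5, matching the example above.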

For each outcome in each source, we determined if results included enough information to include them in meta-analyses. We considered results meta-analyzable if they included a point estimate and measure of precision (eg, SE, CI). We meta-analyzed only data for which there was sufficient information in a source to calculate the treatment effect; for example, we did not impute precision if it was not reported.
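A minimal sketch of this criterion follows. It assumes, for illustration only, that a reported CI is symmetric and based on the normal distribution, so that an SE can be recovered from the CI width; the function and parameter names are hypothetical. Nothing is imputed: a result with no reported precision is not meta-analyzable.

```python
from statistics import NormalDist
from typing import Optional


def standard_error(
    se: Optional[float] = None,
    ci_low: Optional[float] = None,
    ci_high: Optional[float] = None,
    level: float = 0.95,
) -> Optional[float]:
    """Return the reported SE, or derive it from a normal-based CI."""
    if se is not None:
        return se
    if ci_low is not None and ci_high is not None:
        z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
        return (ci_high - ci_low) / (2 * z)
    return None  # precision not reported; do not impute


def is_meta_analyzable(
    estimate: Optional[float],
    se: Optional[float] = None,
    ci_low: Optional[float] = None,
    ci_high: Optional[float] = None,
) -> bool:
    """True when a point estimate and a measure of precision are present."""
    return estimate is not None and standard_error(se, ci_low, ci_high) is not None
```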

For each case study, we combined results for the most common measure using the standardized mean difference (SMD) for the difference between groups. In meta-analyses that included both aggregate data and individual participant data, we analyzed individual participant data to obtain an aggregate result for each trial and then we combined the aggregate results across all trials.33-36
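For readers unfamiliar with the SMD, the sketch below computes one from group-level summary statistics. The report does not state which SMD estimator was used, so the pooled-SD form here, the optional small-sample (Hedges) correction, and the large-sample approximation for the SE are all assumptions made for illustration.

```python
import math
from typing import Tuple


def standardized_mean_difference(
    mean_t: float, sd_t: float, n_t: int,
    mean_c: float, sd_c: float, n_c: int,
    small_sample_correction: bool = False,
) -> Tuple[float, float]:
    """Return (SMD, approximate SE) for treatment versus control."""
    df = n_t + n_c - 2
    # Pooled within-group standard deviation.
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    smd = (mean_t - mean_c) / pooled_sd
    if small_sample_correction:
        smd *= 1 - 3 / (4 * df - 1)  # Hedges' correction factor
    # Common large-sample approximation to the standard error.
    se = math.sqrt((n_t + n_c) / (n_t * n_c) + smd**2 / (2 * (n_t + n_c)))
    return smd, se
```

With individual participant data, the group means, SDs, and sample sizes would first be computed from the participant-level records and then passed to the same calculation, which mirrors the two-stage approach described above.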

We conducted all meta-analyses using random effects, and we report all SMDs with 95% CIs.37-39 We prespecified that we would use random effects because we anticipated that eligible trials might include differences (eg, dosage, participant characteristics) that would violate the assumptions required for the fixed effect model. We compared information using the DerSimonian-Laird method, a very common method of synthesis, assuming the random effects model. We have no reason to think that differences in meta-analytic effect estimates using multiple data sources would be limited to whether we used 1 statistical model or another to perform the meta-analysis.
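The DerSimonian-Laird estimator named above can be sketched in a few lines. This is a textbook version written for illustration, not the code used in the study; it takes one effect estimate and SE per trial and uses a normal-based 95% CI (z = 1.96).

```python
import math
from typing import Sequence, Tuple


def dersimonian_laird(
    effects: Sequence[float], std_errors: Sequence[float]
) -> Tuple[float, float, float, float]:
    """Return (pooled effect, CI lower, CI upper, tau squared)."""
    k = len(effects)
    w = [1 / se**2 for se in std_errors]  # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi**2 for wi in w) / sum_w
    # Method-of-moments between-trial variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1 / (se**2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1 / sum(w_star))
    return pooled, pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled, tau2
```

Rerunning such a function on the per-trial results selected at each of the 6 steps would produce the series of pooled estimates that the study compares.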

Additional analyses—including other outcome measures, metrics, and methods of aggregation—are included in manuscripts submitted for publication.


Manuscripts describing our analyses and findings are currently under review. What follows is a summary of our findings.

Patient-Centered Outcomes Identified Through Existing Sources and Stakeholder Coinvestigators

As our protocol12 describes, we selected outcomes that matter most to patients, using existing sources and in collaboration with patient and stakeholder partners.

Core Outcome Sets

We identified a core outcome set for pain: the Initiative on Methods, Measurement, and Pain Assessment in Clinical Trials (IMMPACT) consensus recommendations.40 IMMPACT recommends that the following 6 core domains be assessed in clinical trials of chronic pain: pain, physical functioning, emotional functioning, participant ratings of global improvement, symptoms and adverse events, and participant disposition.

We did not identify a core outcome set for bipolar depression. From the James Lind Alliance, we learned that a working group had been formed to identify priorities for bipolar disorder in conjunction with the Oxford Patients Active in Research project; no relevant outputs had been produced at the time of our study.

Social Media

Information on social media could not be used to identify PCOs. PatientsLikeMe reported the percentage of patients who rated each intervention's subjective effectiveness and side effects. Although patients could list specific outcomes related to effectiveness (eg, quality of life) or side effects (eg, dizziness), PatientsLikeMe does not include a function that allows patients to rate those specific side effects. PatientsLikeMe reports information about the 6 most commonly rated side effects for each medication.


DRUGDEX

DRUGDEX entries for gabapentin and quetiapine included information about the use of the drugs for multiple conditions. We identified information about the outcome domains pain intensity and depression, respectively; however, we found little information about other potential benefits of treatment for either on-label or off-label uses of the drugs. DRUGDEX did not always state whether harms were associated with all conditions for which a drug might be used or with only specific conditions.

Food and Drug Administration Prescribing Information

Prescribing information included the outcome domains pain intensity for trials of gabapentin and depression for trials of quetiapine. Prescribing information contained little information about other potential benefits. For gabapentin, prescribing information did not include any information about potential benefits for off-label uses (ie, types of neuropathic pain for which gabapentin is not approved by FDA).

Prescribing information included many potential harms, and it did not always state whether information about harms was applicable to any condition or to only specific conditions.

Patient and Stakeholder Investigators

Patient and stakeholder partners identified outcomes they thought should be included in each review. We created a questionnaire with a list of potential outcomes, and patients rated how much the outcomes mattered to them. We discussed the outcomes identified through all sources and selected those outcomes that patient and stakeholder partners thought were most patient centered.

Effects of Adding and Replacing Data From Multiple Data Sources on the Results of Meta-analyses

Results of the Search

We identified 21 eligible trials of gabapentin (from 1997 to 2013) and 7 eligible trials of quetiapine (from 2003 to 2014). As Tables 3 and 4 show, most of these trials were associated with more than 1 source (71% and 100%, respectively). We identified 74 (gabapentin) and 50 (quetiapine) reports and extracted information from them. Journal articles and conference abstracts were the most common reports. We identified 1 gabapentin trial only through a conference abstract, and 1 gabapentin trial only on ClinicalTrials.gov (www.clinicaltrials.gov).

Table 3. Multiple Sources of Trials of Gabapentin for Neuropathic Pain.

Table 4. Multiple Sources of Trials of Quetiapine for Bipolar Depression.

FDA reviews included information about 2 trials of gabapentin for postherpetic neuralgia. FDA reviews available online did not include trials of quetiapine for bipolar depression, which was added to the label in a supplemental new drug application.

We identified few registrations for trials of gabapentin on ClinicalTrials.gov and other registries, and none for trials conducted by the manufacturer (Pfizer). In contrast, all eligible trials of quetiapine were registered.

We received clinical study reports and individual participant data for 6 gabapentin trials as a result of the litigation we describe above7,41; we received no additional information from Pfizer as a result of our requests.

By searching online, we identified clinical study reports for 2 quetiapine trials that were released during litigation (http://psychrights.org/); 1 of the clinical study reports included individual participant data in an appendix. AstraZeneca declined our request for more information about quetiapine trials, and our correspondence with AstraZeneca has been published.42 Thus, all sources that we describe as nonpublic in our study were previously nonpublic: they were available to us online from litigation sources (ie, they were disclosed during patient lawsuits), and we received no additional sources from the trials' sponsors.

In summary, we found multiple sources for most trials, the largest contributors to multiple reports being journal articles and conference abstracts.

Completeness of Information

Trial design (eg, participant and intervention characteristics) was best described in clinical study reports and journal articles. In contrast, conference abstracts were incomplete and sometimes contradicted other sources. Trial registrations could also be incomplete: 5 trial registrations submitted to ClinicalTrials.gov by AstraZeneca described a single intervention, “quetiapine fumarate,” while other sources indicated that 2 groups received quetiapine at different doses (ie, 300 mg and 600 mg).

Trial quality (defined as our assessment of the risk of bias for each trial) was often unclear in the sources we identified (see Tables 5 and 6). When we located enough information to assess risk of bias (usually in a journal article or clinical study report), that information almost always led from an unclear to a low risk of bias rating. The only exception was information about missing data, where identification of additional information led from an unclear to a high risk of bias rating for trials with large amounts of missing data.

Table 5. Trial Quality (Risk of Bias) for Sources of Gabapentin for Neuropathic Pain Trials.

Table 6. Trial Quality (Risk of Bias) by Source of Quetiapine for Bipolar Depression Trials.

Statistical information about many outcomes described in journal articles and short reports did not include sufficient detail for results to be included in meta-analysis. In contrast, data in clinical study reports were usually complete and could almost always be used for analysis.

For these reasons, clinical study reports—when they were available—were the most complete sources of information about methods and results.

Locating Patient-Centered Outcomes in Multiple Data Sources

Across all sources for all gabapentin trials, we located 4 of 5 prespecified outcome domains as described in section “Patient-centered outcomes identified through existing sources and stakeholder coinvestigators” (Tables 7 and 8). Across all sources for all quetiapine trials, we identified 7 of 8 prespecified domains. Quetiapine sources included about 6 (median) outcome domains, and they were also highly variable (range, 2-6 domains per source). To our knowledge, none of the trials measured the prespecified outcome domains for pain interference (gabapentin) or psychiatric hospitalization (quetiapine).

Table 7. Number of Prespecified Outcome Domains in Each Source of Gabapentin for Neuropathic Pain Trials.

Table 8. Number of Prespecified Outcome Domains in Each Source of Quetiapine for Bipolar Depression Trials.

Conference abstracts and other short reports included little useful information about PCOs. Journal articles contained more PCOs than did other public sources, although clinical study reports were consistently the best source of aggregate information about PCOs. No outcome measures appeared in public sources that did not also appear in corresponding clinical study reports; however, we identified methods of aggregation in public sources (ie, journal articles and an FDA review) for gabapentin trials that were not included in clinical study reports.

In addition to the prespecified outcomes related to treatment benefits, clinical study reports and individual participant data described hundreds of harms that were not included in public sources.167

Multiple Outcome Definitions

As noted earlier, it is also possible for a single prespecified outcome domain to be assessed in multiple ways (ie, by varying the measure, metric, and method of aggregation). For this reason, an outcome might appear to be reported as described in the protocol when in fact there were multiple opportunities for changing the associated results (Figure 1).

Figure 1. Number of Outcomes in a Trial Is a Function of the Number of Definitions of Each of the 5 Elements.
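
The multiplication that Figure 1 describes can be sketched in a few lines. This is an illustrative example only: the element definitions below are hypothetical and are not taken from the gabapentin or quetiapine trials. It assumes that every combination of definitions could in principle be reported, so the count is an upper bound rather than the number of outcomes a given trial actually reports.

```python
from itertools import product
from math import prod

# Hypothetical definitions for the 5 elements of an outcome (illustrative only).
elements = {
    "domain": ["pain intensity"],
    "measure": ["numeric rating scale", "visual analog scale"],
    "metric": ["value at time point", "change from baseline"],
    "aggregation": ["mean", "proportion with at least 50% decrease"],
    "time_point": ["week 4", "week 8"],
}


def count_outcomes(elements):
    """Upper bound on distinct outcomes: product of the options per element."""
    return prod(len(options) for options in elements.values())


def enumerate_outcomes(elements):
    """List every fully specified outcome (one option chosen per element)."""
    keys = list(elements)
    return [dict(zip(keys, combo)) for combo in product(*elements.values())]


n_outcomes = count_outcomes(elements)
outcomes = enumerate_outcomes(elements)
print(n_outcomes, "possible outcomes for 1 prespecified domain")
```

Even with a single domain and only 2 options for each of the other 4 elements, the number of fully specified outcomes grows multiplicatively, which is why a prespecified domain alone does not pin down a single result.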

For each case study, we identified hundreds of outcomes for the prespecified domains (Figure 2). By varying 4 elements, gabapentin RCTs reported 214 outcomes and quetiapine RCTs reported 81 outcomes. For pain intensity and depression, the most common outcome domains for gabapentin and quetiapine RCTs, we found 119 and 44 outcomes, respectively. Multiple methods of analysis also contributed to multiple results for the same defined outcome. For these reasons, many meta-analyses were possible for each outcome domain using the same included trials.167

Figure 2. Number of Outcomes Observed Increased as a Consequence of Multiple Definitions of the Elements.

Impact of Adding or Replacing Multiple Data Sources on Pooled Effect Sizes and Precision

For each case study, we conducted a series of meta-analyses in which we added or replaced data from each of the multiple sources for a single outcome domain, measure, and time point to evaluate the effect of adding or replacing data. In the presence of reporting bias, whereby negative or null results are reported in nonpublic sources but not in public sources, we would expect the addition of nonpublic sources to shift the pooled effect estimate toward the null.
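
The add-and-replace logic can be illustrated with a minimal sketch. It rests on several assumptions that are not from the study: it uses fixed-effect inverse-variance pooling (the model used in the actual analyses is not stated in this section), the trial names, source names, SMDs, and standard errors are all hypothetical, and the helper names `pool_fixed_effect` and `select` are invented for illustration.

```python
import math


def pool_fixed_effect(estimates):
    """Inverse-variance fixed-effect pooling of (smd, se) pairs.

    Returns (pooled SMD, lower 95% limit, upper 95% limit).
    """
    if not estimates:
        raise ValueError("at least one estimate is required")
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * smd for w, (smd, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se


# Hypothetical data: trial -> source -> (SMD, standard error).
trials = {
    "trial_A": {"journal_article": (-0.50, 0.15), "csr": (-0.40, 0.15)},
    "trial_B": {"journal_article": (-0.45, 0.20)},
    "trial_C": {"csr": (-0.10, 0.18)},  # no public source reports usable data
}


def select(trials, priority):
    """For each trial, take the first source in `priority` that has data.

    Trials with no source in `priority` are left out of the analysis.
    """
    chosen = []
    for sources in trials.values():
        for name in priority:
            if name in sources:
                chosen.append(sources[name])
                break
    return chosen


# Public sources only, then nonpublic data added, then nonpublic data replacing.
public_only = pool_fixed_effect(select(trials, ["journal_article"]))
csr_added = pool_fixed_effect(select(trials, ["journal_article", "csr"]))
csr_replacing = pool_fixed_effect(select(trials, ["csr", "journal_article"]))

for label, result in [("public only", public_only),
                      ("CSR added", csr_added),
                      ("CSR replacing", csr_replacing)]:
    print(f"{label}: SMD {result[0]:.2f} (95% CI {result[1]:.2f} to {result[2]:.2f})")
```

In this made-up example, the trial available only in a nonpublic source has a smaller effect, so adding it moves the pooled estimate toward the null, which is the pattern one would expect under reporting bias.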

Not all trials could contribute data to a meta-analysis (Tables 9 and 10). Of the 21 gabapentin and 7 quetiapine trials, 9 (43%) and 4 (57%), respectively, could not be included in an analysis of pain intensity or depression using continuous data (eg, the mean) because no source for those trials included meta-analyzable data (see “Series of Meta-analyses”).

Table 9. Series of Meta-analyses and the Effect of Adding or Replacing Data Sources on the Pooled Estimate of Pain Intensity in Trials of Gabapentin for Neuropathic Pain (SMD [95% CI]).

Table 10. Series of Meta-analyses and the Effect of Adding or Replacing Data Sources on the Pooled Estimate of Depression in Trials of Quetiapine for Bipolar Depression (SMD [95% CI]).

When restricted to the most common outcome, defined as the same outcome domain, measure, and time point, results in both case studies were largely consistent across sources. For gabapentin, including data from previously nonpublic sources (ie, clinical study reports and individual participant data) reduced the magnitude of the effect estimate; adding sources did not affect the statistical significance of the treatment effect. In our series of meta-analyses, the effect of gabapentin on pain intensity ranged from an SMD of −0.46 (95% CI −0.64 to −0.28) to an SMD of −0.31 (95% CI −0.47 to −0.15). For quetiapine, there were no meaningful differences as a result of adding and replacing data from multiple sources. The effect of quetiapine on depression ranged from an SMD of −0.42 (95% CI −0.62 to −0.22) to an SMD of −0.44 (95% CI −0.63 to −0.26).
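
As a rough check on the statement about statistical significance, a standard error can be back-calculated from each reported 95% CI. The sketch below assumes the intervals are symmetric normal-approximation intervals (estimate ± 1.96 × SE); the function names are illustrative, and because the reported values are rounded, the results are approximate.

```python
def se_from_ci(lower, upper, z=1.96):
    """Approximate standard error implied by a symmetric 95% CI."""
    return (upper - lower) / (2 * z)


def z_score(estimate, lower, upper):
    """Approximate z statistic for the estimate, using the implied SE."""
    return estimate / se_from_ci(lower, upper)


# (SMD, lower limit, upper limit) as reported in the text above.
reported = {
    "gabapentin, largest effect": (-0.46, -0.64, -0.28),
    "gabapentin, smallest effect": (-0.31, -0.47, -0.15),
    "quetiapine, first estimate": (-0.42, -0.62, -0.22),
    "quetiapine, second estimate": (-0.44, -0.63, -0.26),
}

for label, (smd, lower, upper) in reported.items():
    z = z_score(smd, lower, upper)
    # |z| > 1.96 corresponds to a 95% CI that excludes zero.
    print(f"{label}: SE ~ {se_from_ci(lower, upper):.3f}, |z| > 1.96: {abs(z) > 1.96}")
```

All four intervals exclude zero, so the implied z statistics exceed 1.96 in absolute value at both ends of each range, in line with the observation that adding sources changed the magnitude of the gabapentin estimate but not its statistical significance.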

Resources Required to Use Multiple Data Sources in Systematic Reviews and Meta-analyses

Available guidance for performing systematic reviews (eg, from the Cochrane Collaboration and the National Academy of Medicine [formerly the Institute of Medicine]) suggests that reviewers should use all data sources for clinical trials.3,4 Identifying and extracting data from multiple data sources was resource intensive and required many skills, and our findings suggest that existing guidance could be improved to help systematic reviewers use multiple data sources efficiently (Table 11).

Table 11. Main Findings From MUDS and Our Recommendations for Application of the Findings.

In general, we were able to extract information from conference abstracts quickly; journal articles took several hours to extract.

Clinical study reports, which included thousands of pages of appendices and hundreds of outcomes and analyses, required dozens of hours. Two extractors each spent at least five 8-hour days extracting data from each clinical study report.

Because we did not receive codebooks for individual participant data, each data set was a unique puzzle we had to solve even before we could synthesize the data. Members of our team spent months developing codebooks and creating a common data format to combine data across trials. Even with this investment, there were some variables we could not identify with certainty.
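
The idea of a codebook that maps each trial's variables onto a common data format can be sketched as follows. Everything here is hypothetical: the trial identifiers, raw variable names, and target field names are invented for illustration and do not describe the data sets or codebooks used in this study.

```python
# Hypothetical codebooks: raw variable name -> common field name, per trial.
CODEBOOKS = {
    "trial_A": {"PID": "participant_id", "TRT": "arm",
                "PAINSC": "pain_score", "WK": "week"},
    "trial_B": {"subj": "participant_id", "group": "arm",
                "nrs": "pain_score", "visit_wk": "week"},
}


def harmonize(trial_id, rows, codebooks):
    """Rename each row's variables to the common format.

    Variables missing from the codebook are kept under "unmapped" so that
    unidentified fields stay visible instead of being silently dropped.
    """
    mapping = codebooks[trial_id]
    harmonized = []
    for row in rows:
        record = {"trial_id": trial_id}
        unmapped = {}
        for name, value in row.items():
            if name in mapping:
                record[mapping[name]] = value
            else:
                unmapped[name] = value
        record["unmapped"] = unmapped
        harmonized.append(record)
    return harmonized


# Hypothetical raw rows from two trials with different naming conventions.
raw_a = [{"PID": 1, "TRT": "gabapentin", "PAINSC": 4, "WK": 8}]
raw_b = [{"subj": 7, "group": "placebo", "nrs": 5, "visit_wk": 8, "x1": 0}]

combined = harmonize("trial_A", raw_a, CODEBOOKS) + harmonize("trial_B", raw_b, CODEBOOKS)
print(combined)
```

Keeping unidentified variables in a separate field, rather than discarding them, is one way to record the kind of residual uncertainty described above.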

Guidance for Using Multiple Data Sources in Systematic Reviews

As a final step in our investigation, we developed guidance for systematic reviewers (Table 12). For the open science movement to succeed, systematic reviewers need to make efficient use of evidence from clinical trials, including previously nonpublic information. Although it would have been desirable to identify a best source of information for clinical trials, we did not find a single source or combination of sources that provided complete information about trial design, trial quality, outcomes, and results. Our recommendations incorporate the main findings for each step in the review process.

Table 12. Recommendations for Systematic Reviews Using Multiple Data Sources.


Identifying Patient-Centered Outcomes

Our efforts to identify PCOs revealed little formal research about what patients with neuropathic pain and bipolar disorder consider important outcomes. Better evidence is needed to determine what should be measured by clinical trials in these areas and to determine how those outcomes should be assessed (eg, choice of specific measures and time points).

Effects of Adding and Replacing Data From Multiple Data Sources on the Results of Meta-analyses

In both case studies, most trials were associated with multiple sources of information about their methods and results (mostly reports about the trials and some individual participant data). The most common sources, short reports (eg, conference abstracts), were often the least informative about trial design, trial quality, outcomes, and results. Journal articles contained information that was consistent with nonpublic sources, but they did not include all of the information available from nonpublic sources. FDA reviews contained less information than hoped, largely because the drugs and indications we examined did not meet FDA's criteria for inclusion in its public information. Trial registries were also less useful than we had hoped because trials supporting current uses of the drugs in our case studies were performed before legislation mandated trial registration and results reporting. The nonpublic sources we obtained were made public through litigation, an approach that has been used in previous studies to compare data sources.168

Because many sources were incomplete and inconsistent regarding trial design (eg, number of groups), sometimes we found it difficult to understand whether sources were about the same trial or different trials.

Most sources for a single trial, viewed on their own, were unclear about whether there was risk of bias in a trial's design (also referred to as trial quality). An advantage of identifying multiple data sources for each trial was that information was often provided in 1 source but not another; thus, when all sources were taken together, we were able to discern more about the trial quality.

The effects of multiple data sources on meta-analysis for gabapentin were consistent with evidence that using FDA reviews and nonpublic sources may change the results of meta-analyses, both because these sources allow additional trials to be included and because their results sometimes differ from results in journal articles and conference abstracts.169-172 Findings for quetiapine were consistent with evidence from other case studies indicating that the results reported in multiple data sources sometimes agree.173-175 We are not aware of any other cases in which researchers have had access to this range and number of sources.

Although they sometimes identified trials not reported in other sources (eg, trials not published in a journal article), short reports such as conference abstracts did not contribute new information about methods and results when other data sources were available.

Most of the outcomes in each trial we examined were associated with a few domains (ie, they measured a few underlying concepts, such as pain intensity and depression), and these domains were assessed using a variety of measures, metrics, methods of aggregation, and time points both within individual trials and across all eligible trials.167 For this reason, we believe it is possible for investigators to prespecify an outcome domain, assess it in multiple ways, and choose the most favorable result for public presentation.

In the 2 case studies we examined, inferences about effectiveness derived from meta-analyses were only slightly different when we added or replaced data following the process described above. However, some results associated with outcome definitions in each outcome domain were statistically significant and some were not; such differences might lead trial authors to report outcomes selectively or lead systematic reviewers to include outcomes selectively in their meta-analyses. Reporting biases related to differences in outcome measures, metrics, and methods of aggregation could lead decision makers to recommend or select 1 treatment over another.

Potential harms were viewed by our patient coinvestigators as among the most important outcomes. There was limited information about harms in publicly available sources. Further investigation of harmful outcomes as reported in public and nonpublic sources is warranted; findings could have important implications for PCOs research.

Implications for Data Sharing

Discussions about open access have largely focused on developing systems for archiving and sharing data from new trials. The drugs in our study are widely prescribed even though the trials were completed before trial registration and reporting requirements came into effect.176 Open access to trial data would make what are now nonpublic sources (eg, clinical study reports, individual participant data) more widely available. Our findings highlight the importance of open access to legacy trial data (eg, trials conducted before requirements to share data) because many older drugs enjoy current widespread use. Before data similar to those we examined are lost forever, the research community should develop a system for sharing legacy trials with future generations.


We were not funded to analyze potential harms, so we do not know whether the relative benefits and harms of these interventions differ between public and nonpublic data sources. Further research is needed to explore the benefits of using clinical study reports and individual participant data meta-analysis for identifying the frequency of specific harms and of groups of harms.

While many studies have shown that selective reporting is a problem, until now only a few studies have explored the potential for systematic reviewers to select outcomes for meta-analysis,177-179 and none have used both public and nonpublic sources. Although our results were consistent with previous studies comparing various sources with trial registrations,180,181 conference abstracts,6,182 FDA reviews,174,183 journal articles,184 protocols,185-187 clinical study reports,8,188 and individual participant data,9,189,190 we do not know the extent of applicability of our results to other CER questions. These case studies included trials of relatively short duration, and there may be even greater benefits to using nonpublic sources for trials with multiple assessments over long periods of time.


For systematic reviewers and meta-analysts, using multiple data sources for the same trial is time consuming and comes with both rewards (eg, more complete data) and challenges (eg, the need to resolve differences among contradictory sources). We found no advantage in using conference abstracts except for identifying the existence of unregistered and otherwise unpublished trials. Trial registrations in these case studies were relatively old (eg, we only found information in ClinicalTrials.gov, not other registries), and they provided less information than users might find in more recent registrations. Systematic reviewers should anticipate the vast amount of information that could be extracted from nonpublic sources and plan accordingly. Further research is needed to determine if the results of these case studies are representative of all trials and to determine how multiple data sources might affect meta-analyses of potential harms.


