NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

A Process for Robust and Transparent Rating of Study Quality: Phase 1

Methods Research Reports

Investigators: S. Ip, MD, Project Lead; G. D. Kitsios, MD, PhD; M. Chung, PhD, MPH; and J. Lau, MD.

Author Information and Affiliations
Rockville (MD): Agency for Healthcare Research and Quality (US); 2011 Nov.
Report No.: 12-EHC004-EF

Structured Abstract

Background:

Critical appraisal of individual studies, with a formal summary judgment of methodological quality, and subsequent assessment of the strength of a body of evidence addressing a specific question are essential activities in conducting comparative effectiveness reviews (CERs). Uncertainty about the optimal approach to quality assessment has given rise to wide variations in practice. A well-defined and transparent methodology for evaluating the robustness of quality assessments is critical to the interpretation of systematic reviews as well as the larger CER process.

Purpose:

To complete the first phase of a project to develop such a methodology, we aimed to examine the extent and potential sources of inter- and intra-rater variations in quality assessments, as conducted in our Evidence-based Practice Center (EPC).

Methods:

We conducted three sequential exercises: (1) quality assessment of randomized controlled trials (RCTs) based on the default quality item checklist used in EPC reports without further instruction; (2) quality assessment of RCTs guided by explicit definitions of quality items; and (3) quality assessment of RCTs based on manuscripts stripped of identifying information, and performance of sensitivity analyses of quality items. The RCTs used in these exercises had been included in a previous CER on sleep apnea. Three experienced systematic reviewers participated in these exercises.

Data synthesis:

In exercise 1, an initial set of 11 RCTs was subjected to a quality assessment process without any guidance, conducted in parallel by three independent reviewers. The overall study quality ratings were discordant among the reviewers 64 percent of the time. In exercise 2, quality assessments were performed on a second set of RCTs, guided by explicit quality item definitions. The overall study quality ratings were discordant in 55 percent of the cases. In exercise 3, the provenance (i.e., title, authors, journal, etc.) of the published papers used in exercise 2 was concealed, and, simultaneously, “influential” factors such as study dropout rate and blinding were variably modified in a sensitivity analysis scheme. Comparing inter-rater disagreements between exercises 2 and 3, we observed that reviewers disagreed less often on the overall study quality rating (54.5 percent in exercise 2 vs. 45.5 percent in exercise 3). Anonymization of the papers resulted in an increased proportion of disagreements for several items (e.g., “definition of outcomes,” “appropriate statistics”). We also observed that for certain items with a less subjective interpretation (e.g., blinding of outcome assessors or patients), the extent of disagreement was consistent between exercises 2 and 3.
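The discordance figures above follow from a simple rule: an overall rating counts as discordant when the three reviewers' grades for a study are not all identical. A minimal sketch of that calculation, using invented ratings (the actual per-study grades are not reported in this abstract):

```python
# Sketch of the discordance-rate calculation described in Data synthesis.
# A study's overall rating is "discordant" when the three reviewers'
# grades are not unanimous. All ratings below are invented for illustration.

def discordance_rate(ratings_per_study):
    """Fraction of studies whose reviewer ratings are not unanimous."""
    discordant = sum(1 for grades in ratings_per_study if len(set(grades)) > 1)
    return discordant / len(ratings_per_study)

# Invented example: 11 RCTs graded A/B/C by three independent reviewers.
ratings = [
    ("A", "A", "A"), ("A", "B", "A"), ("B", "B", "C"), ("B", "B", "B"),
    ("A", "C", "B"), ("C", "C", "C"), ("B", "A", "B"), ("A", "A", "B"),
    ("B", "B", "B"), ("C", "B", "C"), ("A", "A", "A"),
]
print(f"{discordance_rate(ratings):.0%} of studies discordant")  # 6 of 11
```

With these invented grades, 6 of 11 studies are non-unanimous, i.e., roughly the 55 percent discordance scale the abstract reports for exercise 2; the example only illustrates the arithmetic, not the study's actual data.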

Limitations:

The results presented here are based on a small sample of RCTs, selected from a single CER and assessed by three reviewers from a single EPC. The definitions of the items in our checklist were not evaluated for adequacy and clarity, other than for their face validity as assessed by the reviewers of this study. We acknowledge that this default checklist may not be in widespread use across evidence synthesis practices and is not directly aligned with the current trend of shifting the focus from methodological (and reporting) quality to explicit assessment of the risk of bias of studies. For these reasons, the generalizability and the target audience of this research activity may be limited. Furthermore, we did not examine how our quality assessment tool compares with other available tools or how our assessments would differ if applied to a different clinical question. Thus, our findings are preliminary, and no definitive conclusions can or should be drawn from this pilot study.

Conclusions:

We identified extensive variations in overall study ratings among three experienced reviewers. Discrepancies among reviewers in the assignment of individual items were common. While it may be desirable to have a single rating produced by multiple reviewers through a process of reconciliation, in the absence of a gold standard method it may be even more important to report the variations in assessments among different reviewers. A study with large variations in quality assessment may be fundamentally different from one with little variation, even though both are assigned the same consensus quality rating. Further investigations are needed to evaluate these hypotheses.

Prepared for: Agency for Healthcare Research and Quality, U.S. Department of Health and Human Services,1 Contract No. HHSA 290-2007-100551. Prepared by: Tufts Medical Center Evidence-based Practice Center, Boston, Massachusetts.

Suggested citation:

Ip S, Kitsios GD, Chung M, Lau J. A Process for Robust and Transparent Rating of Study Quality: Phase 1. Methods Research Report. (Prepared by the Tufts Medical Center Evidence-based Practice Center under Contract No. HHSA 290-2007-100551.) AHRQ Publication No. 12-EHC004-EF. Rockville, MD: Agency for Healthcare Research and Quality; November 2011. effectivehealthcare.ahrq.gov.

This report is based on research conducted by the Tufts Medical Center Evidence-based Practice Center under contract to the Agency for Healthcare Research and Quality (AHRQ), Rockville, MD (Contract No. HHSA 290-2007-100551). The findings and conclusions in this document are those of the author(s), who are responsible for its contents; the findings and conclusions do not necessarily represent the views of AHRQ. Therefore, no statement in this report should be construed as an official position of AHRQ or of the U.S. Department of Health and Human Services.

The information in this report is intended to help health care decisionmakers—patients and clinicians, health system leaders, and policymakers, among others—make well-informed decisions and thereby improve the quality of health care services. This report is not intended to be a substitute for the application of clinical judgment. Anyone who makes decisions concerning the provision of clinical care should consider this report in the same way as any medical reference and in conjunction with all other pertinent information, i.e., in the context of available resources and circumstances presented by individual patients.

This report may be used, in whole or in part, as the basis for development of clinical practice guidelines and other quality enhancement tools, or as a basis for reimbursement and coverage policies. AHRQ or U.S. Department of Health and Human Services endorsement of such derivative products may not be stated or implied.

No investigators have any affiliations or financial involvement (e.g., employment, consultancies, honoraria, stock options, expert testimony) that conflict with material presented in this report.

1 540 Gaither Road, Rockville, MD 20850; www.ahrq.gov

Bookshelf ID: NBK82248; PMID: 22191113
