
Seida JC, Schouten JR, Mousavi SS, et al. First- and Second-Generation Antipsychotics for Children and Young Adults [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2012 Feb. (Comparative Effectiveness Reviews, No. 39.)

This publication is provided for historical reference only and the information may be out of date.

Methods

In this chapter, we describe the topic refinement process and our a priori methods for reviewing, assessing, and synthesizing the evidence on first-generation antipsychotics (FGAs) and second-generation antipsychotics (SGAs) for the treatment of children and young adults.

Topic Refinement and Technical Expert Panel

The University of Alberta Evidence-based Practice Center (EPC) was commissioned to conduct a preliminary literature review to gauge the availability of evidence and to draft key research questions for a comparative effectiveness review. Investigators from our EPC developed the Key Questions in consultation with the Agency for Healthcare Research and Quality (AHRQ), the Scientific Resource Center, and a panel of key informants. AHRQ posted the Key Questions on its website for public comment for a period of 1 month. Our EPC revised the Key Questions based on the public feedback received, and AHRQ approved the final Key Questions.

We assembled a technical expert panel to provide content and methodological expertise throughout the development of the comparative effectiveness review. The technical experts are identified in the front matter of this report.

Literature Search Strategy

Our research librarian systematically searched the following bibliographic databases for studies published from 1987 to May 2010: MEDLINE, Embase, CENTRAL, PsycINFO, International Pharmaceutical Abstracts (IPA), Cumulative Index to Nursing and Allied Health Literature (CINAHL), Scopus, and ProQuest Dissertations International. In February 2011, we performed an update search in MEDLINE, Embase, CENTRAL, and PsycINFO. The 1987 cutoff was chosen to coincide with the publication of the Diagnostic and Statistical Manual of Mental Disorders, Third Edition, Revised (DSM-III-R). We searched MedEffect Canada and TOXLINE to identify additional data on adverse events. We restricted the search results to studies published in English and to children and young adults ≤24 years of age. We applied filters for randomized controlled trials (RCTs) and cohort studies (see Appendix A for the detailed search strategies).

We selected search terms by scanning the search strategies of systematic reviews on similar topics and examining the index terms of potentially relevant studies. We adapted a combination of subject headings and text words for each electronic resource. Text words related to the conditions of interest included: child development disorders, Asperger syndrome, autism, Rett syndrome, childhood schizophrenia, aggression, psychomotor agitation, sleep disorders, mood disorders, personality disorders, affective dysregulation, mood lability, irritability, self-injurious behavior, attention deficit hyperactivity disorder, conduct disorder, oppositional defiant disorder, psychotic disorders, bipolar disorder, depressive disorder, obsessive-compulsive disorder, anorexia nervosa, Tourette syndrome, and post-traumatic stress disorder. We included terms for the Food and Drug Administration (FDA)-approved FGAs and SGAs (see Table 5 for a listing of the drugs). A second research librarian independently peer reviewed the search strategy.

Table 5. Eligibility criteria for the review.

We screened the reference lists to identify additional studies and searched online trial registries (World Health Organization and ClinicalTrials.gov) to identify unpublished and ongoing trials. We hand searched conference proceedings of the following scientific meetings that were identified by our clinical experts: American Academy of Child and Adolescent Psychiatry (2007, 2008), International College of Neuropsychopharmacology (2007–2009), and International Society for Bipolar Disorders (2007–2009). These proceedings were selected because of their close match to the content of this report. We reviewed FDA documents related to the drugs of interest for additional data. In addition, the Scientific Resource Center contacted drug manufacturers to request published and unpublished study data.

We used Reference Manager for Windows, version 11.0 (Thomson ResearchSoft, 2004–2005), to manage the results of our literature searches.

Criteria for Study Selection

The eligibility criteria were developed in consultation with the technical expert panel and are provided in Table 5. Our population of interest was children, adolescents, and young adults ≤24 years of age with psychiatric disorders or behavioral disturbances. Studies that enrolled adults were included only when at least 80 percent of patients were ≤24 years of age or when subgroup analyses or individual data for patients within the eligible age range were provided. Studies that enrolled patients with different conditions (e.g., pervasive developmental disorder and schizophrenia) were included only if they reported efficacy data separately by condition. However, we included studies that aggregated adverse event data across patients with various conditions.
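
To make the age criterion concrete, the 80 percent rule can be expressed as the following minimal sketch (in Python; the function name and inputs are hypothetical illustrations, not part of the review protocol):

    def study_age_eligible(ages, has_eligible_subgroup_data=False):
        """A study that enrolled adults qualifies only if at least 80
        percent of patients are <=24 years old, or if subgroup analyses
        or individual data are reported for the eligible age range."""
        share_young = sum(a <= 24 for a in ages) / len(ages)
        return share_young >= 0.80 or has_eligible_subgroup_data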

Molindone was removed from the list of FDA-approved drugs due to its discontinuation in January 2010. Patients were not excluded for polypharmacy. However, studies in which cotreatments were systematically given to only one treatment group (e.g., olanzapine and citalopram vs. ziprasidone) were excluded.

We screened the eligibility of articles in two phases. In the first phase, two reviewers independently screened the titles, keywords, and abstracts (when available) to determine whether an article met the broad screening criteria. We rated each article as “include,” “exclude,” or “unclear.” We retrieved the full-text article for any study that was classified as “include” or “unclear” by at least one reviewer. Once the article was retrieved, two reviewers independently assessed each study using a detailed form (Appendix B1). We resolved disagreements by consensus or third-party adjudication.

A single reviewer screened FDA reports for relevance. Based on an a priori decision, studies were considered for inclusion only if patients were 18 years of age or younger, due to the complexities of determining the age of study participants referenced in the FDA reports.

Assessment of Methodological Quality

Two reviewers independently assessed the methodological quality of the studies and resolved discrepancies through consensus. We pilot tested each quality assessment tool on a sample of studies and developed guidelines for assessing the remaining studies.

Quality Assessment of Trials

We assessed the internal validity of RCTs and nonrandomized controlled trials using the Cochrane Collaboration risk of bias tool (Appendix B2).33 This tool consists of six domains of potential bias (sequence generation, allocation concealment, blinding, incomplete outcome data, selective outcome reporting, and “other” sources of bias) and a categorization of the overall risk of bias. Each domain was rated as having a “low,” “unclear,” or “high” risk of bias. We assessed blinding and incomplete outcome data separately for subjective outcomes (e.g., health-related quality of life) and objective clinical outcomes (e.g., laboratory measures). For “other” sources of bias, we assessed baseline imbalances between groups, early stopping for benefit, and funding source. Industry funding was considered a potential source of bias because studies have shown that sponsorship influences published results.34

The overall assessment was based on the responses to individual domains. If one or more of the individual domains had a high risk of bias, we rated the overall score as high risk of bias. We rated the overall risk of bias as low only if all components were assessed as having a low risk of bias. The overall risk of bias was unclear for all other studies.
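
This decision rule can be summarized as a short sketch (hypothetical Python illustration; not part of the review's tooling):

    def overall_risk_of_bias(domain_ratings):
        """Overall rating from the six Cochrane domain ratings:
        any 'high' domain -> high risk; all domains 'low' -> low risk;
        otherwise the overall risk of bias is unclear."""
        if "high" in domain_ratings:
            return "high"
        if all(rating == "low" for rating in domain_ratings):
            return "low"
        return "unclear"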

Quality Assessment of Cohort Studies

We used a modified version of the Newcastle-Ottawa Quality Assessment Scale (Appendix B2)35 to assess cohort studies. The scale comprises seven items that evaluate three domains of quality: sample selection, comparability of cohorts, and assessment of outcomes. Each item that is adequately addressed is awarded one star, except for the “comparability of cohorts” item, for which a maximum of two stars can be given. The overall score is calculated by tallying the stars. We considered a total score of 6 to 8 stars to indicate high quality, 4 or 5 stars to indicate moderate quality, and 3 or fewer stars to indicate poor quality. In addition, we extracted the source of funding for each study.36
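
The scoring logic amounts to the following sketch (hypothetical Python rendering of the star tally described above):

    def nos_quality_rating(item_stars):
        """Tally the stars awarded across the seven Newcastle-Ottawa
        items (the comparability item may earn up to two stars, for a
        maximum total of 8) and map the total to a quality rating."""
        total = sum(item_stars)
        if total >= 6:
            return "high"      # 6-8 stars
        if total >= 4:
            return "moderate"  # 4-5 stars
        return "poor"          # 3 or fewer stars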

Data Extraction

We extracted data using a structured, electronic form and imported the data into a Microsoft Excel 2007 spreadsheet (Microsoft Corp., Redmond, WA) (Appendix B3). One reviewer extracted data, and a second reviewer checked the data for accuracy and completeness. Reviewers resolved discrepancies by consensus or in consultation with a third party. We extracted the following data: study and participant characteristics (including inclusion and exclusion criteria, age, sex, ethnicity, and diagnosis), intervention and co-intervention characteristics (including dose, frequency, and duration), and outcomes.

We reported outcomes only if quantitative data were reported or could be derived from graphs. We did not include outcomes that were described only qualitatively (e.g., “there was no difference between the groups”) or reported only as a p-value. We classified studies that directly compared one antipsychotic with another antipsychotic as “head-to-head” studies and studies that compared an antipsychotic with placebo as “placebo” studies. Studies with three or more treatment groups could provide data for both “head-to-head” and “placebo” comparisons.

When more than one publication reported the results of a single study, we considered the earliest published report of the main outcome data to be the primary publication. We extracted data from the primary publication first and then added outcome data reported in the secondary publications. We reference the primary publication throughout the evidence report. A list of the references for the companion articles is provided at the end of this report.

We made decisions regarding which outcome measures to extract for Key Question 1 in consultation with clinical experts. A list of the outcome measures that we extracted for each condition is provided in Table 6. We extracted the total score for each outcome measure; when the total score was not provided, we extracted all of the reported subscores. For each study arm, we extracted the mean baseline and endpoint or change scores, standard deviations, and sample size. We did not extract outcome data from studies that provided neither a followup change or endpoint mean nor data from which followup scores could be calculated.

Table 6. Outcome measures extracted in the comparative effectiveness review.

Using the adverse event monitoring guidelines proposed by Correll et al.,37 the lead investigators decided which adverse event data would be extracted for each of the categories specified in Key Question 2 (see Table 7). We reported adverse events as they were reported by the authors of the study. For each adverse event, we recorded the number of patients in each treatment or placebo group and the number of patients with an adverse event. We counted each event as if it corresponded to a unique individual. Because an individual patient may have experienced more than one event during the course of the study, this assumption may have overestimated the number of patients who experienced an adverse event. We extracted only quantitative adverse event data describing the number of patients who experienced an event; that is, studies that reported only p-values or reported only that one arm had fewer events than another were not included in the analysis. For continuous adverse event measures (e.g., weight or prolactin levels), we extracted the mean change or endpoint score, standard deviation, and sample size.

Table 7. Adverse event outcome data extracted in the comparative effectiveness review.

For other short- and long-term efficacy outcomes (Key Question 3), we reported treatment response and remission rates as defined by the study authors. For outcomes measured using scales (e.g., health-related quality of life or cognitive outcomes), we extracted the total score. When the total score was not provided, we extracted all the subscores. We did not calculate total scores based on the subscore data provided by studies.

To assess whether the efficacy of antipsychotics varied in different subpopulations (Key Question 4), we extracted information on the subpopulations (independent variables), the type of analysis (e.g., subgroup or regression analysis), the outcomes assessed (dependent variables), and the authors' conclusions. Age categories were defined as <6 years (preschool), 6 to 12 years (preadolescent), 13 to 18 years (adolescent), and ≥19 years (young adult). Age 12 was chosen as the cutoff between childhood and adolescence because this age is traditionally considered to be the onset of puberty.
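
The age grouping reduces to a simple mapping (hypothetical Python; for illustration only):

    def age_category(age):
        """Age categories used in the subpopulation analyses for
        Key Question 4."""
        if age < 6:
            return "preschool"
        if age <= 12:
            return "preadolescent"
        if age <= 18:
            return "adolescent"
        return "young adult"  # 19 years and older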

Applicability

Applicability of evidence distinguishes effectiveness studies, which are conducted in primary care settings, use less stringent eligibility criteria, assess health outcomes, and have longer followup periods, from efficacy studies.38 The results of effectiveness studies are more applicable to the spectrum of patients in the community than those of efficacy studies, which usually involve highly selected populations. We assessed the applicability of the body of evidence using the PICOTS (population, intervention, comparator, outcomes, timing of outcome measurement, and setting) framework. Factors that may weaken the applicability of the studies are reported in the results.

Grading the Strength of a Body of Evidence

Two independent reviewers graded the strength of the evidence for major outcomes and comparisons using the EPC GRADE approach39 and resolved discrepancies by consensus. A list of the outcomes that were assessed is provided in Table 8.

Table 8. Key outcomes assessed for strength of evidence.

For each outcome, we assessed four major domains: risk of bias (rated as low, moderate, or high), consistency (rated as consistent, inconsistent, or unknown), directness (rated as direct or indirect), and precision (rated as precise or imprecise). No additional domains were used.

Based on the individual domains, we assigned an overall evidence grade for each outcome and each comparison of interest: high, moderate, or low confidence that the evidence reflects the true effect. When no studies were available for an outcome, or the evidence did not permit estimation of an effect, we rated the strength of evidence as insufficient.

To determine the overall strength of evidence grade, we first considered the risk of bias domain. RCTs with a low risk of bias were initially considered to provide “high” strength of evidence, whereas RCTs with a high risk of bias and well-conducted cohort studies received an initial grade of “moderate.” Low-quality cohort studies received an initial grade of “low.” The grade was then raised or lowered according to the assessments of the body of evidence on the consistency, directness, and precision domains.
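
The starting point of this procedure can be sketched as follows (hypothetical Python; the text does not specify how RCTs with an unclear risk of bias were handled, so grouping them with high risk of bias here is our assumption):

    def initial_strength_of_evidence(design, quality):
        """Starting grade before adjustment for consistency,
        directness, and precision."""
        if design == "RCT":
            # Low risk of bias starts 'high'; other RCTs start
            # 'moderate' (unclear risk grouped with high: an assumption).
            return "high" if quality == "low risk of bias" else "moderate"
        # Cohort studies: well conducted -> 'moderate'; otherwise 'low'.
        return "moderate" if quality == "well-conducted" else "low"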

Some outcomes, such as autistic symptoms, were assessed using a variety of outcome measures (e.g., Aberrant Behavior Checklist, Childhood Autism Rating Scale) across the studies. We graded these outcomes taking into account the findings of each of these relevant measures and their meta-analyses.

Data Analysis

We made the following assumptions and performed the following imputations to transform reported data into the form required for analysis. We extracted data from graphs using the measurement tool of Adobe Acrobat 9 Pro (Adobe Systems Inc., California, U.S.) when data were not reported in text or tables. If necessary, we approximated means by medians and used 95 percent confidence intervals (CI) to calculate approximate standard deviations. We calculated p-values when they were not reported.
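
For example, because a 95 percent CI for a mean has half-width z × SD/√n, the standard deviation can be back-calculated as in this minimal sketch (assuming a normal approximation; the function name is ours):

    import math

    def sd_from_ci(lower, upper, n, z=1.96):
        """Approximate the standard deviation of a mean from its 95%
        confidence interval: SD = sqrt(n) * (upper - lower) / (2 * z)."""
        return math.sqrt(n) * (upper - lower) / (2 * z)

For instance, a mean reported with a 95 percent CI of (1.2, 3.4) from 50 patients gives sd_from_ci(1.2, 3.4, 50), approximately 3.97.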

For all studies, we present qualitative data in the results section and in the evidence table in Appendix D. When appropriate, we performed meta-analyses to synthesize the available data on the efficacy of antipsychotic medications. We considered it appropriate to pool studies that were sufficiently similar in terms of their study design, population (i.e., condition and patient ages), interventions being compared, and outcomes.

We summarized the evidence for efficacy separately for each condition. Within each condition category, we present data both by individual drug comparison and across the drug class (e.g., all SGAs). We summarized adverse event data separately for each drug across all conditions.

We used Review Manager Version 5.0 (The Cochrane Collaboration, Copenhagen, Denmark) to perform meta-analyses. For continuous variables, we calculated mean differences (MD) for individual studies. For dichotomous outcomes, we computed relative risks (RR) to estimate between-group differences. If no event was reported in one treatment arm, a correction factor of 0.5 was added to each cell of the two by two table in order to obtain estimates of the RR. All results are reported with 95% CIs.
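
The zero-event correction works as in the sketch below (hypothetical Python; adding 0.5 to each cell of the 2 × 2 table also adds 1 to each arm's denominator):

    def relative_risk(events_trt, n_trt, events_ctl, n_ctl):
        """Relative risk of treatment vs. control. When one arm reports
        no events, 0.5 is added to each cell of the two-by-two table so
        that the estimate remains defined."""
        if events_trt == 0 or events_ctl == 0:
            events_trt, events_ctl = events_trt + 0.5, events_ctl + 0.5
            n_trt, n_ctl = n_trt + 1, n_ctl + 1
        return (events_trt / n_trt) / (events_ctl / n_ctl)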

All meta-analyses used a random effects model. We quantified statistical heterogeneity using the I-squared (I2) statistic. A priori, we considered an I2 value of 80 percent or greater to represent considerable heterogeneity, thereby precluding the pooling of studies.40,41
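
The I2 statistic derives from Cochran's Q; a minimal sketch of the computation (hypothetical Python, using inverse-variance fixed-effect weights to form Q, as is standard):

    def i_squared(effects, std_errors):
        """Cochran's Q and I-squared for k study effect estimates
        (e.g., log relative risks or mean differences) with standard
        errors. I2 = max(0, (Q - df) / Q) * 100; values of 80 percent
        or more were taken to preclude pooling."""
        weights = [1.0 / se ** 2 for se in std_errors]
        pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
        q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
        df = len(effects) - 1
        i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
        return q, i2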
