NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Balk E, Adam GP, Kimmel H, et al. Nonsurgical Treatments for Urinary Incontinence in Women: A Systematic Review Update [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2018 Aug. (Comparative Effectiveness Review, No. 212.)

Discussion

Summary of Findings

This review updated the Agency for Healthcare Research and Quality’s (AHRQ) 2012 systematic review with new literature searches from 2011 through December 4, 2017. It includes urinary incontinence (UI) outcomes (cure, improvement, satisfaction), quality of life, and adverse events. For UI outcomes, we conducted network meta-analyses since studies have compared a large number of specific interventions (53) and categories of interventions (16) and the majority of these interventions have not been directly compared with each other. The main findings of this systematic review update and the associated strength of evidence for each conclusion are summarized in Table 27.

The conclusions in Table 27 are general and do not cover all the analyses we explored. We estimated effects for 202 possible comparisons among intervention categories and 1514 possible comparisons among individual interventions for the UI outcomes, not counting information on quality of life and (limited comparative) information on adverse events. Providing conclusions and rating the “strength of the evidence” for each of these hundreds of comparisons is not productive. Users of our report who have specific interests should consult the pertinent results.

Briefly, with regard to patient-centered outcomes including cure, improvement, and satisfaction with UI symptoms, evidence of variable strength supports that almost all the examined active interventions are better than sham, placebo, or no treatment for at least one of these outcomes; the exceptions were hormones and periurethral bulking agents. Based on moderate to high strength of evidence, the first-line intervention behavioral therapy generally resulted in better UI outcomes (cure, improvement, satisfaction) than second-line interventions (medications). For women with stress UI requiring third-line interventions, intravesical pressure release may be more effective to achieve improvement than combination neuromodulation and behavioral therapy; and triple combination neuromodulation, hormones, and behavioral therapy may be more effective than either periurethral bulking or combination neuromodulation and behavioral therapy; all based on low strength of evidence. For women with urgency UI requiring third-line interventions, onabotulinum toxin A (BTX) may be more effective to achieve cure than neuromodulation, also based on low strength of evidence.

Regarding quality of life outcomes, there is low strength of evidence that behavioral therapy, anticholinergics, and neuromodulation are each more effective than no treatment. There is also low strength of evidence that supervised pelvic floor muscle training is more effective to improve quality of life than unsupervised training.

Serious adverse events were generally rare, with the notable exception of periurethral bulking agents, which resulted in erosion or need for surgical removal of the agents in about 5 percent of women (but only 1.5% with the agent available in the U.S.; reported in one study, low strength of evidence). The most commonly reported adverse event was dry mouth, which occurred in 24 percent of women on anticholinergics (36% of women on oxybutynin) and 13 percent of women using the alpha agonist duloxetine (high strength of evidence). Among women who received BTX, about one-third had urinary tract infections and between 10 and 20 percent had episodes of urinary retention or voiding dysfunction (moderate strength of evidence). Women taking the alpha agonist duloxetine reported common occurrences of constitutional adverse events (e.g., nausea 23%, insomnia 12%, fatigue 10%); moderate strength of evidence.

The evidence base did not provide adequate information to suggest which women would most benefit from which intervention (or interventions) based on the etiology or severity of their UI or on their personal characteristics (such as age or involvement with athletic activities). The studies covered a large range of women, across adult ages, geographic regions, and types of UI (urgency, stress, mixed, or undefined) that as a whole are likely applicable to the general population of nonpregnant women with UI. However, extremely few studies reported subgroup analyses. Across studies, no clear differences in the comparative effectiveness of interventions were found based on patient age or comparing studies of women with urgency UI (alone) and studies of women with stress UI (alone). With regard to subpopulations of particular interest to stakeholders, studies did not specifically analyze or report on women engaging in athletic activities or women in the military. Studies also did not report subgroup analyses based on race or ethnicity, nor were there studies restricted to ethnic minorities to allow across-study comparisons.

The clinical importance of the effect sizes between interventions likely varies among women with UI based on their personal preferences or values and may further differ related to their severity of symptoms, the UI type, intervention, and other factors. For example, those with more severe UI may be more satisfied with partial improvement than those with milder UI; similarly, women using simpler, less invasive interventions may be more satisfied with partial improvement than women using invasive, intensive, or expensive interventions. For these reasons, we would again direct readers to read and evaluate the pertinent results in this report based on their specific interests in particular interventions and outcomes.

Clinical Implications

There is evidence to support the use of most of the interventions—nonpharmacological, pharmacological, and combination interventions—in contrast to no intervention (or, in clinical practice, watchful waiting), with the exceptions of hormones and periurethral bulking agents, for which there is low strength of evidence of no difference in relative rates of cure and improvement.

For women with stress UI or with urgency UI, the first-line intervention behavioral therapy is highly effective compared with no treatment. It is also generally more effective than second-line pharmacological therapies when used alone. Nevertheless, compared with no treatment, alpha agonists (used for stress UI) significantly improve UI, although with complaints of dry mouth (13%) and constitutional adverse events (including nausea in 23%). Similarly, for urgency UI, anticholinergics increase rates of cure, improvement, and satisfaction with degree of incontinence, but with associated complaints of dry mouth (24% overall). Sparse evidence specific to women with mixed UI is consistent with the rest of the evidence base regarding effectiveness of alpha agonists and anticholinergics.

For women moving on to third-line interventions, intravesical pressure release and neuromodulation are effective options for women with stress UI, with rare adverse events. Sparse evidence specific to women with mixed UI had similar findings for neuromodulation related to UI improvement. For women with urgency UI who are interested in trying BTX (and for whom it may be indicated; e.g., those with proven detrusor overactivity who have not responded to first- and second-line intervention7), the evidence suggests it is the most effective pharmacological intervention; however, it is associated with urinary tract infections and urinary dysfunction after treatment. But BTX may also be considered to have the advantage of being a one-time treatment with trial evidence of effectiveness for up to 6 months. Neuromodulation may also be effective for this population. Notably, periurethral bulking agents are less effective than most other interventions and are associated with risk of erosion and need for surgical removal of the bulking agents.

Although the evidence did not adequately evaluate heterogeneity of treatment effects (how treatment effectiveness may vary in different individuals or groups of women), the relatively high satisfaction rates for all evaluated intervention categories (at least 50%) suggests that each intervention is potentially appropriate for different women, depending on their symptoms, severity of disease, prior treatment history, and their own goals and preferences.

It is also interesting to note that the rates of satisfaction (51% to 76%) are mostly higher than rates of cure (15% to 45%) or improvement (30% to 79%). Thus, women who are not reporting categorical improvement in symptoms are still reporting satisfaction with treatment. As discussed in the evaluation of the contextual question, women’s treatment goals vary widely, but emphasize improvements in activities of daily living and resultant improvements in psychological, interpersonal, and related impacts. For many women, actual cure or a researcher-defined threshold of improvement is of lesser importance than ability to return to normal activities. Furthermore, women have described differing interest and tolerance for different types of interventions (e.g., daily drugs, invasive interventions, behavioral therapy), in part related to differences in concern about the types of adverse events associated with each intervention.

There are many variations of how UI manifests in different women, of what aspects of UI women find most bothersome, and in the preferences and goals, including tolerance for potential adverse events, across both women and clinicians. Available interventions also vary substantially in how they function, their frequency and duration, their degree of invasiveness, and the amount of effort required by the women. These differences combined with the finding that all the interventions are effective to a lesser or greater degree suggest that each of the interventions may be most appropriate for different women. Thus, for example, while one might argue that third-line BTX is more effective (for cure and satisfaction) than second-line anticholinergics and thus should be preferentially recommended, it is possible that women may prefer one over the other intervention based on their own values, preferences, lifestyle, work schedule, and concerns about adverse events and receiving a more invasive intervention.

Furthermore, what effect size is clinically significant likely varies among women with UI and may further differ related to the severity of symptoms, UI type, intervention, failure of prior interventions, and other factors. For example, those with more severe UI may be more satisfied with partial improvement than those with milder UI; similarly, women using simpler, less invasive interventions may be more satisfied with partial improvement than women using invasive, intensive, or expensive interventions. Thus, overall, women and their clinicians will likely be choosing among a limited set of options based on the women’s severity of symptoms, prior treatment history, preferences for daily or one-time treatments, concerns about adverse events, etc. For example, some women may be considering only oral medications to add on to their current behavioral therapy, while other women may be considering BTX because of concerns about adverse events of daily medications. Given the large number of possible comparisons across categories of intervention (and the very large number of comparisons of specific interventions), we direct readers to read and evaluate the pertinent results in this report found in the “odds ratio tables” (e.g., Table 6 for cure; and equivalent tables for specific interventions, e.g., Appendix G, Table G-1) based on their specific interests regarding particular interventions and outcomes.

In clinical practice, the pragmatic approach of many clinicians is to start with behavioral therapy as a first-line intervention. For patients who do not respond or experience suboptimal improvement, it is common to then consider oral medications, depending on the type of UI, as second-line intervention; for example alpha agonists for stress UI or anticholinergics for urgency UI. Finally, neuromodulation or bladder BTX are commonly considered third-line interventions, depending on UI type. The comparative effectiveness of the various interventions (with each other) provided by the evidence, together with other considerations (such as ease of implementation, availability, and resource use), broadly supports this approach.

Although not evident among the studies of outpatient women specifically with UI, concern has recently increased regarding cognitive changes from the continued use of anticholinergic medications in frail or elderly patients.247–251 Based on this concern, the American Urogynecologic Society issued the following consensus statement recommendations: 1) patients should be counselled about the risks associated with anticholinergic medications, such as cognitive impairment, dementia, and Alzheimer disease; 2) the lowest effective dose should be prescribed, and consideration should be given for alternative medications; 3) particular consideration should be taken with patients using other anticholinergic medications; and 4) bladder BTX or neuromodulation should be considered for patients at risk for adverse effects from anticholinergics.15 In addition, evidence suggests that the majority of patients (>70%) stop using anticholinergics within 5 months, mostly because of side effects.16–18

In reviewing the contextual question, we identified success as defined by physicians (informants) and patients based on published survey and focus group data. As might be expected, these goals were similar with respect to domains of importance including physical symptoms and the associated impact on relationships, quality of life, activities of daily living, interpersonal relationships and psychological distress, economic implications, and sleep disturbance. Based on the literature review, we also identified that patients want to know about the balance between adverse events and symptom improvement. However, our informants did not comment on this. This finding also highlights the importance of the adverse event data described in this review. Clinicians should remember that patients are interested in possible adverse events and want to know this information to help them make informed decisions about treatment options.

Our findings are consistent with previously published systematic reviews of nonsurgical treatment of UI in women but are more complete because we have evaluated additional classes of medications and additional interventions. Furthermore, we conducted network meta-analyses to combine direct evidence, from head-to-head comparisons, with indirect evidence. We thus estimate treatment effects for all possible comparisons between intervention categories (and individual interventions). Based on the network meta-analysis model, we are able to obtain the predicted mean outcome rates per intervention, in an effort to simplify the interpretation of the available evidence.
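The relationship between a relative effect (an odds ratio) and a predicted outcome rate can be illustrated with a small calculation. The sketch below is only an illustration under stated assumptions: it converts an assumed outcome rate in a reference group and an odds ratio versus that reference into an absolute rate. The function name and the input values are hypothetical, and the calculation is not taken from the report's model, which this section does not describe in detail.

```python
def rate_from_odds_ratio(reference_rate, odds_ratio):
    """Convert a reference-group rate and an odds ratio into an absolute rate.

    Hypothetical helper for illustration; not the report's actual method.
    """
    if not 0.0 < reference_rate < 1.0:
        raise ValueError("reference_rate must be strictly between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    reference_odds = reference_rate / (1.0 - reference_rate)
    treated_odds = reference_odds * odds_ratio
    return treated_odds / (1.0 + treated_odds)


# Hypothetical inputs: a 50% rate in the reference group and an odds ratio of 3.
predicted = rate_from_odds_ratio(0.5, 3.0)
print(f"Predicted rate: {predicted:.2f}")
```

An odds ratio of 1 returns the reference rate unchanged, which is a quick sanity check on the conversion.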

Strength of Evidence

The strength of evidence for each conclusion presented in Table 27 is based on a qualitative combination of primarily the summary risk of bias across all relevant studies, the consistency of the studies, the precision of the available estimates, and the directness of the evidence. The large majority (83%) of studies were deemed to be of low risk of bias; therefore, for each conclusion, the evidence base usually had low risk of bias. Exceptions included the effect of neuromodulation versus no treatment on quality of life and most of the conclusions regarding adverse events, which were generally poorly and inconsistently reported. For most analyses, studies reported consistent results regarding the comparative effectiveness of interventions or the risk of adverse events. The primary exception related to quality of life, for which studies reported some inconsistent results both within and across studies. Given the extremely large number of possible comparisons among both intervention categories and specific interventions, we provide strength of evidence ratings only for those comparisons for which summary conclusions are possible. In most instances where comparative effectiveness estimates were imprecise, no conclusions are possible, and these comparisons are omitted. However, where feasible, conclusions were made for quality of life and adverse event outcomes despite some instances of imprecision mostly due to sparse data. For the UI outcomes, directness was summarized as variable. The directness metric covers various concepts including whether the conclusions are based on direct (head-to-head) comparisons and whether the reported outcomes are direct (true) measures of the outcome of interest. For all UI outcomes, the conclusions are based on both direct and indirect evidence, per the network meta-analysis. As noted, all network and direct comparisons were congruent and were consistent between the networks of all studies and of the subsets of stress or urgency UI, so the overall strength of evidence was not downgraded due to indirectness. Although there was some variability in the definitions of cure, improvement, and satisfaction, these were deemed to be sufficiently minor to not affect the overall directness. In contrast, some adverse event conclusions were downgraded for being indirect in that the outcomes (“any,” “moderate,” or “severe” adverse events) were generally not defined and likely varied across studies.

Table 27. Evidence profile for nonpharmacological and pharmacological interventions for urinary incontinence.

Limitations of the Evidence Base

With few exceptions and for most outcomes, individual studies were deemed to have, at most, moderate risk of confounding, selection, or measurement biases. The risk of bias of individual studies was not a major determinant for the conclusions in Table 27. Assessing the impact of the risk of bias of individual studies on the conclusions of a network meta-analysis is not straightforward.252,253 The comparison effects estimated from a network meta-analysis are a combination of the estimated effects from head-to-head studies and from studies contributing through indirect comparisons. For example, assume that there is a highly biased study in a network meta-analysis: this study may raise concerns primarily regarding the comparison it directly informs on; however, it would cause little (even negligible) concern regarding comparisons that it informs indirectly. It will be of no concern for comparisons to which it contributes zero information.253

The major limitation identified by this review is the relative dearth of direct (head-to-head trial) evidence when one considers the richness of the clinical questions that can be posed. In general, comparisons across intervention categories are not as informative as comparisons between individual interventions. However, given the limitations of the evidence comparing specific interventions, we have provided analyses at the individual intervention level only in the Appendixes, and opted not to draw conclusions based on them. Most comparisons of individual interventions are based on indirect data and small numbers of studies. In addition, the generally small sample sizes of included studies lead to concerns about generalization.

Most studies included both women with stress UI and women with urgency UI or did not adequately describe their eligibility criteria. Very few studies explicitly evaluated only women with mixed UI (with symptoms of both stress and urgency UI). Relatively few studies based their eligibility criteria on whether women had already taken (and/or failed to improve with) prior treatments or described which treatments had already been used by study participants. Also, relatively few studies described or based eligibility criteria on symptom severity. Thus, it was difficult to evaluate subgroup analyses or to summarize across studies based on most of these descriptors.

We found no new information on the effectiveness of treatments among women who engage in athletic activity. It is known from previous research that incontinence is more common among women who engage in athletic activity.254 Urinary incontinence depends on the type of activity, with no leakage reported with golf, and up to 80 percent among trampolinists.254 Gymnasts and other athletes of high impact sports report more incontinence than age-related controls. It has been postulated that elite athletes need to have a stronger than normal pelvic floor to help mitigate the increased abdominal pressure that occurs with strenuous physical activity. The lack of additional data identified for this subset of women again highlights the need for additional studies specific to this group of women.

We did not identify any information regarding different treatment strategies between young and old patients. In 2015, the International Consultation on Incontinence-Research Society Think Tank met, discussed, and published their opinion on the best treatment options for stress UI in the “very young” and “very old”.255 They defined very young as premenopausal patients less than 40 years old and very old as more than 70 or 75 years of age. They included discussions of surgical options and did not comment on urgency UI. They reported that minimal data exist to guide the treatment in those less than 40 or more than 70 years. For young women, they recommend that risks associated with pregnancy and childbirth need to be considered and special considerations should be given regarding comorbidities in elderly women.

We found a paucity of data regarding specific treatment efficacy for additional subgroups of interest including race/ethnicity, or active/veteran military personnel. Research is clearly needed to help guide treatment strategies for these women.

In addition to the sparseness, or complete lack, of data for subpopulations of interest, we found the inconsistent reporting of adverse events to be a challenge in this report. The specific adverse events reported and their definitions varied greatly among studies and treatment modalities. It is important to recognize that the evidence basis for effectiveness (primarily cure, improvement, and satisfaction from randomized trials) differs markedly from that for adverse events (which generally could not be adequately compared across interventions). However, interestingly, for mirabegron, no studies have reported comparative effectiveness for categorical outcomes in women with UI but several studies have reported adverse events in this population; most mirabegron studies have been conducted either in men only or both men and women together. Further decision analysis modeling would be needed to yield a more explicit balance between comparative benefits and harms of different interventions. Further complicating this issue is the particular importance of patient preferences and values regarding the many intervention choices and the differing concerns about specific harms.

Limitations of the Analytic Approach

In our analyses we used indirect data to inform comparisons between interventions. However, indirect comparisons rely on an assumption that there are no influential systematic differences in the distribution of effect modifiers in the synthesized studies.
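To make the idea of an indirect comparison concrete, the following is a minimal sketch of an adjusted indirect comparison of two interventions, A and B, through a common comparator C (for example, sham or no treatment). It assumes the two sets of trials are independent and comparable in their effect modifiers, which is the assumption described above. The function names and all input values are hypothetical; the report itself used a full network meta-analysis model rather than this simplified two-step calculation.

```python
import math


def indirect_log_odds_ratio(log_or_ac, se_ac, log_or_bc, se_bc):
    """Indirect estimate of A vs. B from A vs. C and B vs. C (log odds ratio scale).

    Assumes the A-vs-C and B-vs-C estimates come from independent sets of trials,
    so their variances add.
    """
    log_or_ab = log_or_ac - log_or_bc
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)
    return log_or_ab, se_ab


def odds_ratio_with_ci(log_or, se, z=1.96):
    """Back-transform a log odds ratio to an odds ratio with an approximate 95% interval."""
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical inputs, not taken from the report:
# A vs. C odds ratio 3.0 (SE of log OR 0.25); B vs. C odds ratio 2.0 (SE 0.30).
log_or_ab, se_ab = indirect_log_odds_ratio(math.log(3.0), 0.25, math.log(2.0), 0.30)
or_ab, ci_low, ci_high = odds_ratio_with_ci(log_or_ab, se_ab)
print(f"Indirect OR, A vs. B: {or_ab:.2f} ({ci_low:.2f} to {ci_high:.2f})")
```

Because the variances add, the indirect estimate is always less precise than either of the comparisons it is built from, which is one reason sparse direct evidence limits the conclusions that can be drawn.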

Conceptually, the corpus of studies on UI in women includes heterogeneous samples of women based on UI type (stress, urgency, and mixed), UI severity (e.g., frequency and volume of incontinence), and prior treatment history (e.g., treatment-naïve, incomplete resolution with behavioral therapy, failed medication therapy). However, as noted, most studies failed to provide data to distinguish comparative effects of interventions based on UI type, UI severity, past treatment history, or other potential effect modifiers. Thus, implicitly, they were not considering the heterogeneity of treatment effects based on these factors among their included study participants.

The overall network meta-analysis, thus, makes the same general assumptions as the majority of studies, namely that the comparative effectiveness of interventions is consistent across different subgroups. This assumption does not imply that the actual effectiveness (e.g., incidence of cure) for a given intervention is similar among different groups of women, but instead that the comparative effectiveness compared to other treatments is similar. As noted, the network meta-analysis does compare interventions used for stress UI with interventions used for urgency UI. Third-line interventions (which in theory are used primarily in women who have failed to improve with second-line intervention) are also compared with first-line or second-line interventions (which in theory are used primarily in women who have not failed to improve with prior therapies). This approach is consistent with studies of women with UI that have, for example, evaluated neuromodulation (which is primarily used to treat urgency UI) in studies of women with only stress UI. Furthermore, studies have directly compared BTX (3rd line intervention) and anticholinergics (2nd line), neuromodulation (3rd line) and behavioral therapy (1st line), and, as mentioned, neuromodulation in women with stress UI. Such direct comparisons are consistent with the overall structure of the full network meta-analysis. We tested the appropriateness of the network meta-analysis model in a number of ways and found no evidence that the assumptions necessary for the indirect comparisons are violated. Split-node analyses, which compare direct (head-to-head) comparisons with indirect comparisons (through another intervention) for each comparison of two interventions, were consistent with a valid network model. Equivalently, network meta-analysis results were consistent with pairwise (direct) meta-analysis results in those comparisons for which there were head-to-head comparisons available. 
In addition, network meta-analyses that included the more homogeneous studies of women with only stress UI, urgency UI, or older women all yielded similar results as the overall network meta-analysis, providing additional evidence of the validity of the network. The network meta-analytic approach allowed us to learn across studies by aggregating the full corpus of evidence as opposed to parsing the evidence into specific subcategories of comparisons each of which have only sparse direct evidence.
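The logic of comparing direct with indirect evidence can be sketched in simplified form. The example below takes a direct and an indirect estimate of the same comparison on the log odds ratio scale, treats them as independent, and computes their difference with a two-sided p-value from a normal approximation. This is an illustration of the general idea only; the function name and inputs are hypothetical, and the split-node analyses in the report were carried out within the network model rather than with this standalone test.

```python
import math


def direct_vs_indirect(direct, se_direct, indirect, se_indirect):
    """Compare a direct and an indirect estimate of the same comparison.

    Returns the difference (direct minus indirect), its z statistic, and a
    two-sided p-value from a normal approximation. Assumes the two estimates
    are based on separate, independent evidence.
    """
    difference = direct - indirect
    se_difference = math.sqrt(se_direct ** 2 + se_indirect ** 2)
    z = difference / se_difference
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return difference, z, p_value


# Hypothetical log odds ratios and standard errors, not taken from the report.
difference, z, p_value = direct_vs_indirect(0.60, 0.30, 0.45, 0.40)
print(f"difference={difference:.2f}, z={z:.2f}, p={p_value:.2f}")
```

A small difference relative to its standard error, as in this hypothetical case, would be read as no evidence of inconsistency; it does not prove that the direct and indirect evidence agree.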

Most of the comparisons between intervention categories, and between specific interventions, are indirect, through sham or no treatment. Comparisons between active interventions are sparse. Several active interventions (e.g., raloxifene, duloxetine, magnetic stimulation, and autologous fat implantation as a periurethral bulking agent) have not been directly compared with another active intervention. This observation is important because for interventions that are generally reserved as second- or third-line treatment, comparisons versus no treatment are not as informative as comparisons between active interventions.

Recommendations for Future Research

We identified gaps in the literature that merit consideration for future research. They are described briefly in the following paragraphs.

There is a need to adopt a set of core outcome measures, for effectiveness and for safety outcomes. As an example, the included studies used a wide range of quality of life instruments and reported them inconsistently. The large number of instruments, and the even larger number of subscales, hinders drawing of conclusions across studies. In addition, studies inconsistently reported clearly defined UI outcomes (cure, improvement, satisfaction) and defined them variously. If all studies had consistently reported all outcomes, our summary findings would have been much more robust and precise. A core outcome set would be maximally useful if it included standardized definitions for patient-centered outcomes and if it had been demonstrated to capture the outcomes directed toward patient, rather than clinician or researcher, interests. Based on the survey and focus group studies that have been reported, future studies should be collecting data on those adverse events about which patients are concerned. More data, however, are needed to determine what those adverse events may be, and to what degree patients balance potential benefits and harms.

Information to further clarify whether specific subpopulations may benefit more from, or have differential adherence to, specific interventions is still lacking. Specifically, information regarding the differential effects of interventions in women from all of the identified subgroups of interest for this review is relatively sparse. Studies should either include only women with a specific type of UI (stress, urgency, mixed) or report subgroup results for all outcomes. Studies should also report UI severity (e.g., frequency or volume) and past treatment history for included participants and, where feasible, again provide subgroup results based on severity and/or past treatment history. Additional studies are needed regarding efficacy of the various interventions, including patient-specific outcome measures, for athletes, young and old women, women in the military, and women of diverse racial/ethnic backgrounds. The possibilities for future research in these subsets of women are particularly rich and untapped.

Several specific intervention comparisons of interest have no or limited direct evidence. Future studies are needed to allow more robust comparisons. Notably lacking are trials of mirabegron specific to women with UI. Existing trials of mirabegron that included a sufficient number of women with UI should publish these results. Other available interventions that are not included in the evidence should be evaluated if they are promising treatments.

To allow better interpretation of the evidence, studies need to more clearly describe prior treatments used by study participants. Ideally, studies should either include only women with a particular treatment history (e.g., treatment naïve, failed to improve with a first-line therapy, failed to improve with a specific intervention) or complete subgroup data for each treatment category should be reported.

Conclusions

Based on combined direct and indirect comparisons and with respect to patient-centered outcomes including cure, improvement, satisfaction with treatment, and quality of life, most examined active intervention categories appear to be better than sham or no treatment, and for many or most comparisons, statistically significantly so (with the exception of hormones and periurethral bulking agents). Behavioral therapy, alone or in combination with other interventions, is generally more effective than other first- and second-line interventions alone for both stress and urgency UI.

The third-line interventions BTX, neuromodulation, and intravesical pressure release are generally more effective than other interventions, but with increased risk of urinary tract infections and urinary dysfunction with BTX. Second-line pharmacological interventions, particularly when used alone, are generally less effective and are associated with nonserious but bothersome adverse events, such as dry mouth, nausea, and fatigue. However, adverse events are generally nonserious, except for erosion and need for surgical removal in about 5% of those who received periurethral bulking agents (1.6% with the agent available in the U.S.).

Large gaps remain in the literature regarding the comparison of individual interventions, and very little or no information is available on women who engage in athletic activity or women in the military or who are veterans, or about differences between older and younger women or women of different ethnicities or races. Standardized quality of life and adverse event reporting would allow significant improvement for conclusions from future systematic reviews as between-study comparisons would be more robust and conclusive.

For clinicians, patients and payers to make informed decisions, specifically for patient subgroups with sparse evidence, new evidence from studies comparing interventions is needed.
