NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Committee on Review Data Systems for Monitoring HIV Care; Institute of Medicine; Ford MA, Spicer CM, editors. Monitoring HIV Care in the United States: Indicators and Data Systems. Washington (DC): National Academies Press (US); 2012 Mar 15.

Monitoring HIV Care in the United States: Indicators and Data Systems.

6 Efficient Analysis of HIV Care Indicators and Dissemination of Data by Federal Agencies

In this chapter the committee describes how federal agencies can efficiently analyze indicators and disseminate data to improve the quality of HIV care (statement of task question 5). The chapter begins with an overview of the challenges to the analysis of indicators, including those related to combining data drawn from multiple sources, and how to address those challenges. The committee then describes how federal agencies can efficiently disseminate HIV care data to improve care quality. The chapter ends with the committee’s conclusions and recommendations.

EFFICIENT ANALYSIS OF HIV CARE INDICATORS BY FEDERAL AGENCIES

As discussed in Chapter 3, no single data system can be used to gauge the impact of the National HIV/AIDS Strategy (NHAS) and the Patient Protection and Affordable Care Act (ACA) on improvements in HIV care. Rather, estimates of the indicators of clinical HIV care and mental health, substance use, and supportive services recommended by the committee often will require the use of data elements from two or more data systems. Combining data from multiple systems may also be necessary to compensate for the weaknesses of any individual data system, such as a lack of representativeness of the population of people living with HIV/AIDS (PLWHA) or incompleteness of data (e.g., due to a low response rate).

The committee was asked to describe how federal agencies can efficiently analyze indicators. The data systems described in Chapter 3 that are maintained by federal entities represent a mix of surveillance (e.g., the National HIV Surveillance System [NHSS]), claims (e.g., Medicaid Statistical Information System), and programmatic (e.g., Ryan White HIV/AIDS Program) data sources, as well as epidemiologic studies of PLWHA (e.g., the Medical Monitoring Project). Efficient analysis of the indicators will require overcoming challenges to combining data across these disparate systems.

One analytic challenge to the efficient analysis of indicators relates to differences in the way that data systems operationalize data elements or define concepts to allow them to be measured. An area in which this may be relevant is in the calculation of indicators for subgroups of PLWHA, because data systems may vary in how they define certain demographic data such as income, geographic marker of residence, race or ethnicity, and sex or gender. Another challenge is differences across data systems in the periodicity for particular data elements. Although claims systems will have continuous data on dispensing of antiretroviral drugs, the Ryan White HIV/AIDS Program collects information on whether antiretroviral drugs were prescribed within a 12-month reporting period. This presents an obstacle to combining data from these systems for purposes of estimating the proportion of PLWHA who were or were not on antiretroviral therapy (ART) during a given period. Although technically difficult, there are approaches to deal with the analytic challenges of combining data, as discussed below.

Additional impediments to the efficient analysis of the indicators by federal agencies that relate to combining data from multiple systems include the current lack of an infrastructure to support the secure exchange of health information across health information technology systems (e.g., electronic health records) and organizations, and other barriers to data sharing. These issues are discussed in Chapters 4 and 5 of this report.

An Example of Challenges to the Efficient Analysis of an Indicator for Clinical HIV Care

One of the core indicators for clinical HIV care recommended by the committee (see Recommendation 2-1 in Chapter 2) is the proportion of people with diagnosed HIV infection and a CD4+ cell count <500 cells/mm3 who are not on ART among all patients who receive such counts. To define this indicator more precisely, one must take timing into account. For example, one might ask: What proportion of individuals who received a CD4+ cell count measurement of <500 cells/mm3 in 2011 also received ART at any point in 2011? Although this definition is clear, it suffers from the problem that those who received such a count late in 2011 had less opportunity to receive ART in that year. Therefore, it may make more sense to rephrase the question: How many individuals who received a CD4+ cell count <500 cells/mm3 in 2011 received ART within a fixed window of time (e.g., 6 months) of receipt of that measurement? In addition to estimating a population average, there is also interest in estimating the effects of demographic factors or insurance status on this indicator.

To estimate such an indicator requires information on date of measurement and level of CD4 count as well as date of ART prescriptions given or filled. Some data sources, such as health maintenance organizations (HMOs; e.g., Kaiser Permanente), the Department of Veterans Affairs (VA), and federal prisons provide all of the relevant information needed, permitting a relatively straightforward estimation for subsets of the population. However, analytic issues arise from the fact that patients may leave these systems at any point, possibly after a CD4+ cell count <500 cells/mm3 is measured but before the prescription is provided or 6 months have elapsed. Furthermore, delays in reporting (e.g., of HIV/AIDS cases, CD4 counts) must be taken into account, particularly if the goal is to investigate trends over time. In addition, patients may die within 6 months of receiving a CD4 count, a situation that makes it impossible to obtain the indicator. For patients who leave a system before their contribution to the indicator can be assessed, it is important to make use of the available partial follow-up information in an attempt to avoid, or at least reduce, bias. This is fairly straightforward using methods for failure-time data if the loss to follow-up is not informative (i.e., unassociated with greater or lower risk of starting treatment). If it is informative, appropriate methods must be used to minimize bias; however, unbiased estimation is possible only if all potentially confounding variables are available (a very unlikely situation). To investigate the effect of demographic and other factors on the risk of not receiving appropriate ART, regression methods can be used. Limitations arise from losses to follow-up, as described above, as well as from the fact that with the exception of the NHSS, which captures data on the vast majority of people identified with HIV/AIDS in the United States, none of the data sources is representative of either the American population as a whole or any particular demographic group.
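
The use of partial follow-up information under noninformative loss to follow-up can be illustrated with a product-limit (Kaplan-Meier) calculation, one common failure-time method. The following is a minimal sketch, not a method prescribed by the committee: the function name and the follow-up records are hypothetical, loss to follow-up is assumed to be noninformative, and death is not treated separately as a competing event.

```python
def probability_started_art(records, horizon_months):
    """Kaplan-Meier estimate of the probability of starting ART by the horizon.

    records: list of (months_of_follow_up, started_art) pairs, where
    started_art is False for patients who left the system (censored).
    Assumes censoring is noninformative.
    """
    # Distinct times at which at least one patient started ART within the window.
    event_times = sorted({t for t, started in records
                          if started and t <= horizon_months})
    not_yet_started = 1.0
    for t in event_times:
        at_risk = sum(1 for u, _ in records if u >= t)
        events = sum(1 for u, started in records if started and u == t)
        not_yet_started *= 1 - events / at_risk
    return 1 - not_yet_started

# Hypothetical follow-up after a CD4+ cell count <500 cells/mm3 (months, started ART?).
follow_up = [(1, True), (2, False), (3, True), (7, False)]
km_estimate = probability_started_art(follow_up, horizon_months=6)

# Naive proportion that counts the patient lost at month 2 as "never started".
naive_estimate = sum(1 for t, s in follow_up if s and t <= 6) / len(follow_up)
```

In this invented example the naive proportion treats the patient lost at month 2 as untreated, whereas the product-limit estimate uses that patient's two months of follow-up and then removes the patient from the risk set.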

The limitation of representativeness can be addressed by making use of other sources of data that have broader coverage. To do so, however, one must make use of data systems that provide only part of the necessary information by combining them in some way. For example, the NHSS provides dates of measurements and CD4 counts but not (reliably) the time of receiving ART. By contrast, Medicare and Medicaid databases provide information about dates of ART prescriptions filled but not CD4 counts. In the absence of unique identifiers, no direct linkage between databases can be made. However, combining across sources is still feasible through linkage by demographic factors. For example, suppose one knew that for one demographic group in a given state, 400 people had CD4+ cell counts <500 cells/mm3 at some point in 2011 among 600 people who had CD4+ cell counts drawn. Suppose one also knew for this group that 300 people received or filled prescriptions for ART. One then would know that at a minimum, there had to be 100 patients who should have been on ART but were not. In fact, however, the number might be considerably greater because some of the ART use may have been among patients who had CD4+ cell counts of 500 cells/mm3 or greater. However, if one could estimate this number from other sources (for example, from people within HMO-type systems who are similar in demographic category to those under study), the estimate could be refined further. Suppose that of the 200 people who never had a CD4+ cell count <500 cells/mm3 during the year, one estimates, from some other data source, that about 100 of them were on ART. Then one could estimate that about 200 of the 400 patients who should have been receiving ART were not.
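
The arithmetic in this example can be written out as a short calculation. This is only a sketch of the logic in the paragraph above; the function name is hypothetical, and the counts are the illustrative figures from the text, not real data.

```python
def estimated_untreated(n_low_cd4, n_on_art, n_art_with_high_cd4=None):
    """Estimate how many people with CD4+ <500 cells/mm3 were not on ART.

    If the number of ART recipients whose counts stayed at 500 or above is
    unknown, assume every ART recipient had a low count; this yields the
    minimum possible number of untreated people.
    """
    if n_art_with_high_cd4 is None:
        art_among_low = min(n_on_art, n_low_cd4)
    else:
        art_among_low = n_on_art - n_art_with_high_cd4
    return n_low_cd4 - art_among_low

# 400 of 600 people had a count <500; 300 people received or filled ART prescriptions.
minimum_untreated = estimated_untreated(400, 300)

# Refinement: an outside source suggests about 100 of the 200 people
# who never had a count <500 were on ART.
refined_untreated = estimated_untreated(400, 300, n_art_with_high_cd4=100)
```

The first call reproduces the lower bound of 100 and the second the refined estimate of 200; neither conveys the uncertainty in the inputs.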

The above discussion illustrates the underlying logic for making inference but does not address the question of uncertainty in estimation. Of course there would be errors associated with all of these estimates. How to calculate the variability in estimates obtained by combining data from different sources is an area of active research. Bayesian methods have been used in a variety of settings to characterize the uncertainty associated with such estimates, reflecting the limitations of the data and the need to combine across sources. Similarly, Bayesian methods can also be used to conduct regression analyses that would allow for estimation of the effect of demographic factors on risk of receiving inadequate treatment.

Issues in Combining Information

Many problems can bedevil analyses of data sets that are derived from clinical program or public health systems and from which treatment or intervention effects are being estimated; these include missing data, unknown population sizes and denominators, and sampling bias. Analysis of randomized studies generally also suffers from these challenges, since they are subject to some level of participant attrition, unplanned crossovers, and inadvertent unblinding. Combining sources of information can help to overcome shortcomings in each source but creates new challenges for the analyst, as described in the illustration above. These challenges arise from the fact that linkage between sources at the individual subject level may be uncertain or impossible, and even when linkages with high levels of certainty are possible, all of the relevant information may not be available on all subjects. Furthermore, the level of precision of information may not be equal across studies, and optimal estimation may have to take this factor into account as well. A large and growing body of work regarding strategies and methods for combining information is now available.

In 1992, the National Research Council issued an important report titled Combining Information: Statistical Issues and Opportunities for Research which described many of the principles and methods associated with combining information (NRC, 2002). Since that time, a number of developments in methods for combining data from different sources have occurred that could be applied to HIV research. For example, Bayesian two-stage hierarchical models have been employed in environmental health studies that relate air pollution to mortality. The first stage of such studies estimates the impact on mortality of air pollution for different cities of interest, after controlling for confounding factors. The second stage combines the estimates across cities using a Bayesian hierarchical model (Lindley and Smith, 1972; Morris and Normand, 1992) to obtain an overall estimate and to explore whether some of the geographic variation can be explained by site-specific explanatory variables (Dominici et al., 2000). Such techniques would also be useful if, for example, there was interest in relating community-level factors—such as prevalence or incidence of disease, access to health care, poverty or homelessness rates—to such health outcomes as HIV morbidity or mortality.

Many of the problems that arise in combining information can be viewed as related to the issue of missing data. For example, the indicator for a link between individuals may be seen as missing. Missing data are handled in a wide variety of ways from the ad hoc (analyze only complete cases) to sophisticated methods for accommodating incomplete observations.

One approach to dealing with missing data is imputation—replacement of the missing observation with the best estimate of what it would have been had it not been missing. Such methods, however, tend to underestimate the uncertainty that arises from the missing data. Multiple imputation addresses this concern using Bayesian methods (Little and Rubin, 1987). Likelihood-based methods are also useful; these involve the development of a likelihood for just the observed data. In either case, one must have a statistical model for the generation process of the data, including the probability of its being observed. Given the importance of such models, considerable effort has been made to expand their flexibility, by allowing not only fully parametric but also semiparametric models (Tsiatis, 2006).

In some cases it may be possible to make inferences about the sizes of populations of interest using capture-recapture methods; these are useful in settings when collection of complete data (i.e., a full enumeration of the populations) is not feasible or affordable. For example, as described below, these have been used to estimate the size of an injection drug-using population.

In addition to the problem of missing data, analyses of observational data intended to produce causal estimates of the impact of factors, such as demographics or insurance status, on outcomes must take into account confounding factors. There is an enormous literature on adjustment for confounding factors, as well as increased interest in causal modeling for this purpose. One approach—use of marginal structural models—has received increasing attention because of its ability to handle confounding factors that vary over time (Suarez et al., 2008). All of these techniques are relevant to the charge to the committee to explore the opportunities and limits of data sources for HIV program and outcome evaluation in the United States, but they by no means capture the breadth of methodology available to cope with data limitations that may bias or confound results and distort conclusions.

Multiple Imputation for Missing Data

Missing data can arise from settings in which people are asked about sensitive personal data, when resource constraints limit the completeness of data collection, or when certain items of information are not routinely collected. When a given variable is essential for a particular evaluation, analysis of complete cases only introduces many threats to inference: (1) bias can be introduced because persons with missing data may be systematically different from those with complete data; (2) statistical power can be reduced when many cases are deleted from analyses due to missing data; (3) resources can be wasted, for example, when 95 percent of data are collected on someone, but due to the 5 percent missing data, the entire data block is left unused; and (4) ethical obligations to research subjects can be compromised when they have inconvenienced themselves under the assumption that they were doing this for biomedical or behavioral research, but the investigator discards their data due to missing variables.

Data on sensitive topics such as sexual risk behaviors or drug use may be limited by nonresponse bias or biases stemming from socially sensitive responding. These biases present a special challenge to the collection of data for surveillance and for epidemiologic research studies where sexual behaviors or drug use may be relevant (Fenton et al., 2001). Multiple imputation helps to circumvent the need to eliminate subjects with partially observed data by imputing (predicting) values for missing variables. Such imputation requires a statistical model for the complete data (including the unobserved portions) and for the process that led to the observed pattern of missing observations. This model is used to predict the missing observation based on the individuals for whom the data were observed. The posterior distributions of the unobserved values given the observed data can then be calculated. Since such calculations may be difficult, Little and Rubin (1987) propose a resampling-based approach for their calculation. Like any approach for handling missing data, its validity depends on the correct specification of a model for the process that generated the missing data.
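
The general idea of a resampling-based multiple imputation can be sketched for the simplest case, estimating a mean. The sketch below is an illustration only, not the specific procedure in Little and Rubin (1987): it assumes values are missing completely at random, draws imputations with an approximate Bayesian bootstrap, and pools the results with the usual combining rules (average within-imputation variance plus inflated between-imputation variance). The function name and the responses are hypothetical.

```python
import random
import statistics

def multiply_imputed_mean(values, n_imputations=20, seed=0):
    """Pooled mean and total variance from multiple imputation.

    values: list of numbers, with None marking a missing response.
    Assumes missingness is unrelated to the value itself.
    """
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    n_missing = len(values) - len(observed)
    estimates, variances = [], []
    for _ in range(n_imputations):
        # Resample the observed values first so that the imputations also
        # reflect uncertainty about the distribution they are drawn from.
        donors = rng.choices(observed, k=len(observed))
        completed = observed + rng.choices(donors, k=n_missing)
        estimates.append(statistics.fmean(completed))
        variances.append(statistics.variance(completed) / len(completed))
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)
    between = statistics.variance(estimates)
    total = within + (1 + 1 / n_imputations) * between
    return pooled, total

# Hypothetical yes/no answers to a sensitive question; None = no response.
responses = [1, 0, None, 1, 1, None, 0, 1, 0, None]
pooled_mean, total_variance = multiply_imputed_mean(responses)
```

The between-imputation term is what a single imputation omits, which is why single imputation tends to understate uncertainty.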

Capture-Recapture Methods

When the population of interest has not been enumerated and a survey of the prevalence of a condition or size of a subgroup in a community is impractical or otherwise unfeasible, the capture-recapture method may be used. This technique is derived from the field of population ecology (Stephen, 1996). For example, one can capture mosquitoes, dust them with harmless fluorescent material, and release them. The proportion of recaptured mosquitoes in a day or two (allowing sufficient time for remixing but not allowing time for significant mortality) can be used to estimate the total number of mosquitoes in the local population, assuming random mixing and equal probability of selecting labeled and unlabeled mosquitoes. Similarly, small mammals may be trapped, tagged, and released, and then a second trapping recaptures new and old (tagged) mammals, enabling a population estimate.

In human biology and epidemiology, the completeness of population ascertainment can be indirectly estimated using capture-recapture, as with estimations of persons who need HIV therapy, drug addiction services, or other social or medical services. Thus, persons must be “captured” and “marked,” to borrow the ecology model, such that they are available for recapture after release. Sometimes in epidemiology, this is literal, as with prisoners who are injection drug users (IDUs), who are arrested but released after a short time in jail. The proportion who return to jail may be used to estimate the proportion of drug users at risk of being arrested (presumably a large proportion of IDUs); combined with population HIV estimates, the absolute number of HIV-infected drug users can be estimated (Drucker and Vermund, 1989; Dunn and Ferri, 1999).

One may estimate the size of a population from just two samples or through multiple samples. Capture histories may be analyzed to estimate migration, life span, or size in the population of interest. A simple formula reflects the core principle of the basic capture-recapture approach. This simple model requires strong assumptions such as full mixing of persons who have been “captured” and “released” (as with hospitalized patients who go home) with the general population. The time-to-recapture estimation must be long enough to permit remixing and short enough for estimation to be relatively unaffected by deaths, out- and in-migrations, and failure to identify “marks.” The latter may occur, for example, when rehospitalized patients use different names when entering an institution. If the assumptions are met, the formula is expressed as: N = MC/R (where N = total population size estimated; M = total number of persons “captured” and “marked” [i.e., identified] on the first occasion; C = total number of persons “captured” on the second visit; and R = number of identified persons “marked” from the first occasion that were then reidentified on the second occasion) (Chao et al., 2001; Hook and Regal, 1995; International Working Group for Disease Monitoring and Forecasting, 1995a,b; Stephen, 1996).
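
The formula N = MC/R translates directly into a short calculation. The sketch below simply evaluates that formula under the stated assumptions; the function name and the counts are hypothetical.

```python
def capture_recapture_estimate(marked_first, captured_second, recaptured):
    """Two-sample population estimate N = M * C / R.

    marked_first (M): persons identified on the first occasion
    captured_second (C): persons identified on the second occasion
    recaptured (R): persons identified on both occasions
    """
    if recaptured <= 0:
        raise ValueError("at least one person must be identified on both occasions")
    if recaptured > min(marked_first, captured_second):
        raise ValueError("recaptured cannot exceed either sample size")
    return marked_first * captured_second / recaptured

# Hypothetical counts: 200 identified first, 150 second, 30 on both occasions.
estimated_population = capture_recapture_estimate(200, 150, 30)
```

With these invented counts, 30 of the 150 persons in the second sample (20 percent) were already marked, so the 200 marked persons are taken to be about 20 percent of the whole population, giving an estimate of 1,000.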

Marginal Structural Methods and Models

Marginal structural models estimate treatment or intervention effects in observational studies by statistical strategies of controlling for selection bias and confounding variables (Robins, 1999). The fundamental concept behind marginal structural modeling can be explained as follows: Suppose one wishes to compare exposures A and B, which may vary over time, in some population and suppose that, at each time, one could create an identical copy of each study subject. If the actual subject had exposure A at a given time, we would give the copy exposure B and vice versa. One could then compare each subject to his or her copy. We refer to the outcomes for each of the imaginary copies as “counterfactuals” and treat them as missing data. Inverse-probability weighting (IPW) is an approach to handling missing data that reweights observations by the inverse of the probability that they are made. Marginal structural models use IPW to deal with the unobserved (“missing”) counterfactuals. IPW and marginal structural model procedures reweight data sets so that treatment and covariates are not confounded.
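
The reweighting idea can be shown in a deliberately simplified setting: one time point, one binary confounder, and a binary exposure. This is a sketch of inverse-probability weighting only, not a full marginal structural model for time-varying exposures; the function names and data are hypothetical, and the example assumes the single measured covariate is the only confounder.

```python
from collections import Counter, defaultdict

def crude_means(records):
    """Unweighted mean outcome in each exposure group."""
    totals, counts = defaultdict(float), defaultdict(int)
    for _, exposure, outcome in records:
        totals[exposure] += outcome
        counts[exposure] += 1
    return {a: totals[a] / counts[a] for a in counts}

def ipw_means(records):
    """Weighted mean outcome per exposure group.

    records: list of (confounder, exposure, outcome) tuples. Each record is
    weighted by 1 / P(observed exposure | confounder), with that probability
    estimated from the frequencies within each confounder stratum.
    """
    stratum_sizes = Counter(c for c, _, _ in records)
    cell_sizes = Counter((c, a) for c, a, _ in records)
    weighted_sum, weight_total = defaultdict(float), defaultdict(float)
    for c, a, y in records:
        weight = stratum_sizes[c] / cell_sizes[(c, a)]
        weighted_sum[a] += weight * y
        weight_total[a] += weight
    return {a: weighted_sum[a] / weight_total[a] for a in weight_total}

# Invented data: the outcome equals the confounder, so exposure has no effect,
# but exposure is far more common when the confounder is present.
data = ([(1, 1, 1)] * 8 + [(1, 0, 1)] * 2 +
        [(0, 1, 0)] * 2 + [(0, 0, 0)] * 8)

crude = crude_means(data)
weighted = ipw_means(data)
crude_difference = crude[1] - crude[0]
weighted_difference = weighted[1] - weighted[0]
```

In this constructed example the crude comparison suggests a large exposure effect, while the weighted comparison shows none, because weighting makes the confounder equally distributed in both exposure groups.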

In “confounding by indication,” an “exposure” is linked to a true causal exposure (e.g., condom use and commercial sex work) but does not itself contribute to the outcome. For example, condom use may be statistically and positively linked to HIV risk, which is counterintuitive (Holmes et al., 2004), but this association may arise because of confounding by indication (e.g., disproportionate use of condoms by sex workers in the population studied). When this occurs in an observational study, the association of the putative risk factor cannot be accurately attributed to the outcome of interest unless one has measured all of the exposures and the relevant confounding factors.

Estimation of causality must take into account time-dependent confounding, and marginal structural models can address selection bias and/or confounding in such analyses. However, inclusion of such factors as time-varying covariates in longitudinal models does not correct for this bias. Such bias occurs most often when “(1) conditional on past treatment history, a time-dependent variable is a predictor of the subsequent outcome and [is] also a predictor of subsequent treatment; and (2) [when] past treatment history is an independent predictor of the time-dependent variable” (Suarez et al., 2008).

Marginal structural models can be used for causal inference even from nonexperimental designs, comparing treatments or interventions, as long as information is reasonably accurate, all confounders are measured, and censoring either is noninformative or can be modeled accurately as a function of known covariates. Better control of confounding than available from simple parametric regression models alone may bring some observational data closer to values that would be measured in a randomized controlled clinical trial. One recent example comes from a study showing that hormonal contraception is a risk factor for HIV acquisition in African women (Heffron et al., 2011). Marginal structural model analyses were used to assess the validity of the Cox proportional hazards regression from this large observational couples study.

Here the committee describes only a few of the challenges that arise in the use of observational data to make inferences about outcomes or service coverage (Teresi, 1994) and the approaches to dealing with them. Nonetheless, the committee seeks to illustrate a few modern statistical methods to make surveillance and programmatic data more useful for evaluation purposes and to illustrate the inherent challenges, both to data collection and to analyses. Correct application of these and other relevant techniques can improve the chances that inferences drawn from imperfect data are valid.

Analysis of Indicators Involving Small Subgroups of People Living with HIV/AIDS

Tracking reductions in HIV-related health disparities will require analysis of indicators by race and ethnicity, sexual orientation, and other demographic variables. The NHAS is aimed at improving access to care and health outcomes for PLWHA and reducing HIV-related health disparities at the national level. Yet, analysis of indicators may occur at a local level, such as to disseminate information to local health departments and HIV care providers on the status of the HIV epidemic in their jurisdictions. In some communities of the United States, the number of individuals who comprise a specific demographic group (e.g., racial and ethnic minority men who have sex with men) may be small. Because the statistical power of an indicator estimate is linked to the number of observations in a sample, small subgroups limit the precision of estimates of care indicators and the ability to compare them with other subpopulations of PLWHA. In epidemiologic studies, investigators may have little choice but to pool very small subpopulations with the larger study population because there is insufficient power to extract the effects of defining subgroup characteristics. With respect to the NHAS, however, this would defeat the purpose of using indicators to track improvements in HIV-related disparities.

Statistical methods for inference may be used for the analysis of indicators involving small subgroups of PLWHA. In general, Bayesian methods are useful for combining information about prevalence, incidence, or treatment effects across different population subgroups (Han and Chaloner, 2005). Group-specific Bayesian estimates are “shrunk” toward (moved closer to) the mean of the quantity of interest over the population included in the combined data set. Because the amount of shrinkage depends on the amount of available information, the smaller the size of the subgroup, the greater will be the reliance on the estimate of the mean. In addition, group-specific Bayesian estimates are sensitive to assumptions regarding the distributions of the random effects; the most common approach of assuming normal distributions leads to the greatest shrinkage. Using other types of random effects distributions, such as Student t or mixtures, can reduce the amount of shrinkage, since they have longer tails and, therefore, allow for a greater probability of outlying values. As an alternative, one can base inference on nonparametric approaches, which can achieve the same goal. Posterior distributions may also tend to be flatter—implying lower precision in estimates—because the strong normal assumption can convey a sense that there is more information on which to base inference than is truly the case if the distributions are nonnormal. While Bayesian methods provide posterior distributions for any subgroup, no matter its size, the inference for that group will rest most heavily on the mean and on underlying assumptions if the subgroup is small. More advanced statistical methods, such as those that do not require parametric assumptions for the distributions of the random effects, can provide more reliable and robust results in this setting.
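
The dependence of shrinkage on subgroup size can be illustrated with the simplest normal random-effects case, in which the shrunken estimate is a precision-weighted average of the subgroup estimate and the overall mean. This is a sketch under assumed values: the function name, the variances, and the indicator values are hypothetical, and the normal assumption is exactly the one the text cautions may shrink too much.

```python
def shrunken_estimate(group_estimate, group_size, overall_mean,
                      within_variance, between_variance):
    """Posterior mean for a subgroup under a normal-normal model.

    The weight placed on the overall mean grows as the subgroup's sampling
    variance (within_variance / group_size) grows relative to the
    between-group variance.
    """
    sampling_variance = within_variance / group_size
    weight_on_mean = sampling_variance / (sampling_variance + between_variance)
    return weight_on_mean * overall_mean + (1 - weight_on_mean) * group_estimate

# Hypothetical indicator: 70 percent overall, 40 percent observed in a subgroup.
small_group = shrunken_estimate(0.40, 10, 0.70, 0.21, 0.01)
large_group = shrunken_estimate(0.40, 1000, 0.70, 0.21, 0.01)
```

With the same observed value, the subgroup of 10 is pulled much closer to the overall mean than the subgroup of 1,000, which is how a real disparity in a small subgroup can be understated.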

Growing numbers of studies indicate that social status modifiers such as race and ethnicity, nativity (place of birth), sexual orientation, geographic location, and drug use status often have an impact on important measures of HIV care (e.g., Kempf et al., 2010; Lillie-Blanton et al., 2009; McGowan et al., 2011). For these subpopulations, among whom social status contributes to their risk environment (Farley, 2006; Rhodes, 2009) and treatment access, assumptions of normality of the distribution of random effects may be especially problematic, and approaches that allow for the existence of outliers are particularly needed. One potential consequence of overshrinkage is underestimation of the impact of indicators of social status, such as geographic location, economic status, or drug status, on care experiences. And so although parametric approaches can provide some care data on individuals in these subpopulation groups, they lie within a more restrictive set of assumptions that could temper the use of the results for policy changes.

Epidemiologic studies are an important source of data on care and supportive services received by PLWHA. Health research in general has historically been plagued by an inability to recruit and retain large numbers of racial and ethnic and socioeconomically diverse populations, particularly of sexual minorities or the homeless (Levkoff and Sanchez, 2003; Moreno-John et al., 2004; Sengupta et al., 2000), although studies of PLWHA may do better than studies of other populations in terms of representativeness. Although helpful, statistical methods cannot make up for a lack of sufficient data to estimate indicators for small populations of PLWHA. The development of precise indicator estimates would be facilitated to a greater degree by inclusion of those groups in greater numbers in HIV-related studies.

In a given community, there may be subpopulations of PLWHA that are small in number and have complex health care and supportive service needs for whom the ability to maintain health care regimens depends on access to supportive services. Improvements in linkages between data systems that collect information on clinical care and those that collect information on supportive services (e.g., housing and transportation services) would help to ensure the availability of the full range of data needed to estimate indicators for these subpopulations. Although data system linkages will not address the problem of low statistical power in analyses designed specifically to provide estimates for small subpopulations of PLWHA, nonparametric methods can be used to provide some insights into care needs.

Increased support for training of HIV/AIDS researchers in statistics and methodologies may facilitate the development of expertise in the analysis of data for subpopulations of PLWHA. Such investment could speed the provision of effective treatment to all communities and thereby improve control of HIV transmission.

DISSEMINATION OF DATA TO IMPROVE HIV CARE QUALITY

Analysis of the HIV care and related indicators identified by the committee will generate data of interest to a number of stakeholders, including federal and state agencies and policy makers, state and local health departments, health care systems (e.g., HMOs, VA, prisons), individual providers, consumers (patients), and academic researchers. Properly presented, the information provided to each audience has the potential to improve the quality of HIV care in the United States. Policy makers, agencies, and health departments may use the information to direct resources and policies toward areas that are most problematic (e.g., access to health care or mental health, substance use, or supportive services to improve linkage to or retention in care). Health care systems and individual providers may use the information to inform their provision of quality HIV care and to target patient education efforts. Individual patients, patient groups, and patient advocates could use the information to direct personal and group advocacy efforts for access to needed services. Academic researchers could use the information to support research proposals and projects that might generate additional information to further improve the quality of HIV care.

The committee’s review of the existing systems that capture data relevant to HIV care shows that many data on various aspects of HIV care currently exist. However, existing data often are not used to the fullest extent possible. Although government agencies and some state and local health departments make some de-identified data available publicly, in other cases the data reside with the agencies that require the reporting and are not made accessible to the public (HHS, 2010), including to the programs and providers who reported the data in the first place. Not only is broad dissemination of data on HIV care important for improving care by engaging as many stakeholders as possible; the return of information to reporting programs and providers increases the collaborative nature of the relationship, provides them with useful feedback, and may motivate them to further increase reporting compliance (CDC, 2011, p. 5-32).

Data Dissemination by Federal Agencies

Federal agencies, including the Centers for Disease Control and Prevention (CDC), the Health Resources and Services Administration (HRSA), and the National Institutes of Health (NIH), have been disseminating health-related information for decades. Until the advent of the Internet, which enabled agencies to disseminate large amounts of information, dissemination had primarily involved making paper copies of documents available to the public (OMB, 2011). In the context of increasing federal information dissemination, Congress passed the Information Quality Act (IQA), also referred to as the Data Quality Act, in December 2000. The IQA required the Office of Management and Budget (OMB) to issue guidance to federal agencies to ensure the “quality, objectivity, utility, and integrity” of information disseminated to the public. In response, the OMB issued Guidelines for Ensuring and Maximizing the Quality, Objectivity, Utility, and Integrity of Information Disseminated by Federal Agencies, effective October 2001. These guidelines require that information quality be treated as an integral step in the information development process. Federal agencies must adopt a basic standard of quality as a performance goal and take steps to incorporate information quality criteria into agency information dissemination. In addition, agencies are to develop a process for reviewing the quality of information before it is disseminated. OMB designed the guidelines to apply to a variety of government dissemination activities and to be generic enough to fit all media (HHS, 2006a).

The IQA also required that government agencies issue their own information quality guidelines and establish mechanisms to allow individuals to seek correction of information maintained and disseminated by federal agencies that does not comply with OMB guidance (OMB, 2011). Therefore, Guidelines for Ensuring the Quality of Information Disseminated to the Public have been issued for several agencies of the U.S. Department of Health and Human Services (HHS). These guidelines describe the types of information disseminated by the agency to the public; types of dissemination methods; agency standards for ensuring the quality of information disseminated; agency administrative complaint procedures; influential scientific, financial, and statistical information; and any special considerations for agency dissemination. As one HIV-specific example, the types of information disseminated by HRSA listed in its guidelines include HRSA HIV/AIDS Bureau State Profiles that describe spending and service information for Ryan White HIV/AIDS Programs, including provider characteristics (e.g., the number and types of organizations in the state that receive Ryan White HIV/AIDS Program funding), client demographic information, service utilization information (e.g., number of patient visits for core medical services), and characteristics of AIDS Drug Assistance Program clients (HHS, 2006b; HRSA, 2012). Under “dissemination methods” the guidelines say that the state profiles are available through the HRSA HIV/AIDS Bureau website and that further requests or feedback can be made by phone or email (HHS, 2006c). Within CDC, dissemination guidance applies to HIV/AIDS Surveillance Reports and reports for other infectious and noninfectious conditions (HHS, 2006b).

Considerations in Data Dissemination

Effective and efficient dissemination of data requires careful attention to several considerations, including audience, definition and presentation of the message, data quality and interpretation, and method of dissemination (CDC, 2009; Marriott et al., 2000; Sofaer and Hibbard, 2010a,b).

In HRSA guidelines for ensuring the quality of information disseminated to the public, CAREWare, a software package used by Ryan White HIV/AIDS Program providers to track clients and services, is listed as a means to ensure the quality of information disseminated to the public (HHS, 2006c). According to the guidelines, CAREWare helps to ensure the quality of data because it contains consistency and edit checks on input data. HIV/AIDS Bureau State Profiles, which present state-level data derived from these data, provide—for each data element—information on data limitations, rounding, and restrictions where appropriate (HHS, 2006c). CDC guidance notes that surveillance information is often obtained from third parties, such as states and grantees, which places limits on quality assurance. However, the accuracy, completeness, and timeliness of the information are subject to sample audits, site visits, and an “evaluation for completeness and consistency with trends and external controls” (HHS, 2006b).

Audience

Defining and understanding the target audience is one of the first steps in developing a plan for data dissemination (Marriott et al., 2000; Sofaer and Hibbard, 2010b). Potential audiences for data derived from the full set of HIV care indicators identified by the committee already have been identified (federal and state agencies and policy makers, state and local health departments, health care systems, individual providers, consumers [patients], and academic researchers). Selection of the appropriate audience involves consideration of what the data show, the purpose for which the data are being disseminated, and the message that is to be conveyed.

Federal, state, or local policy makers and agencies would be the primary target audience if the purpose is to increase or redirect the allocation of resources or to effect policy changes, including the development of new programs to address specific areas of need. Such programs could focus on points in the HIV care continuum that the data might indicate are particularly problematic (e.g., continuity of care) or mediators known to affect those areas (e.g., access to stable housing). Information about improvements on indicators would be useful as well, by showing which current policies and programs are working.

Public and private health care systems, as well as individual providers, might be interested in the data for the purpose of evaluation of, and possible changes in, the HIV care they provide. Such information could permit systems and providers to identify their areas of strength, as well as areas for improvement, in the provision of quality HIV care. Research indicates that dissemination of clinical practice guidelines alone has a minimal effect on provider knowledge and performance, while combination strategies, including those with an education component, are more effective (Marriott et al., 2000). Results of performance indicators also may be more effective in changing provider behavior (Marriott et al., 2000).

PLWHA, and advocacy groups for PLWHA, are other potential audiences for the information on indicators. The information could be used to educate individuals regarding areas in which increased attention and advocacy could improve HIV care.

Depending on what the data show, the dissemination process might target any of these general audiences or a more specific audience within a group, such as policy makers representing a particular region of the United States, HIV care providers who serve patients in a specific demographic group, or patients of a particular race or ethnicity. Ultimately, audience selection should depend on the applicability of the data for that audience and the purpose the data serve (Sofaer and Hibbard, 2010b). Once the audience is defined, the message and the remainder of the dissemination process should be geared to that audience (CDC, 2009).

Definition and Presentation of the Message

Another critical consideration in effective data dissemination is the message to be conveyed. Many audiences are not equipped to understand and process vast quantities of data (Sofaer and Hibbard, 2010a). Data provided without the expertise to interpret them can cause more harm than good. Even the language used to present the information may result in unanticipated misinterpretation (Hibbard and Sofaer, 2010). It is important therefore for an agency to have a clear understanding of the message it wants to transmit and then relay that message to the target audience clearly and concisely, along with the data to support it (Hibbard and Sofaer, 2010; Marriott et al., 2000; Sofaer and Hibbard, 2010a). The details of the message may vary depending on the target audience, as will the way in which the message is presented.

Presentation of the message in the most appropriate way for the target audience is critical to ensure that the message the agency wants to convey is the one that is received by the audience (CDC, 2009; Marriott et al., 2000). Considerations of health literacy and numeracy are important when preparing information for dissemination (Hibbard and Sofaer, 2010; IOM, 2004). Presentations of data from the HIV care and supportive services indicators and trends in the quality of HIV care over time will use different language depending on the audience (e.g., clinical care professionals, policy makers, program administrators, members of the public). Clinical indicators of HIV care that are fully comprehensible to HIV care providers may be incomprehensible to patients or to policy makers. It is important to make the information relevant to what the audience understands and the purpose for which it will use the data. Three papers on “best practices in public reporting” on health care performance data, prepared for the Agency for Healthcare Research and Quality, discuss a number of the pitfalls in and solutions to presenting performance data to health care consumers (Hibbard and Sofaer, 2010; Sofaer and Hibbard, 2010a,b). Although the papers focus on a specific type of data and target audience, the concepts presented may be generalized to other audiences and types of information.

Data Quality and Interpretation

As discussed, the IQA mandates that federal agencies develop quality assurance guidelines for information releases to the public, and a number of HHS agencies have issued Guidelines for Ensuring the Quality of Information Disseminated to the Public. Although it is important for agencies to present the message clearly, concisely, and in language that is understood by and resonates with the target audience, it is also important that they include information about the quality of the data that support the message and the methods used to interpret them (Sofaer and Hibbard, 2010a).

Factors affecting data quality include the source of the data, quality within the system, coverage of the data, confidence range, use of proxies, and analytic methodology applied. The challenge lies in providing sufficient information to permit independent assessment of the data, while not overwhelming the target audience with information that it cannot or will not use (Marriott et al., 2000; Sofaer and Hibbard, 2010a). One approach is to include with the disseminated information a summary, presented in language accessible to the target audience, of the data and the data analysis, including discussion of limitations or gaps in the data and any other relevant information that would enhance the audience’s understanding and evaluation of the data (HHS, 2006b; Marriott et al., 2000; Sofaer and Hibbard, 2010a). At the same time, the agency could make available to interested parties full information on the data set and the methodologies used to assess it (Sofaer and Hibbard, 2010a). CDC, for example, clearly documents and makes publicly available the statistical processes and methodologies used to derive published information, which allows independent statisticians to replicate the results (HHS, 2006b).

Evidence suggests that an audience’s acceptance of data is affected by its perception of the credibility of the data source and the source reporting the information (e.g., professional medical journal versus popular press), as well as proximity of the source to the target audience (Marriott et al., 2000; Sofaer and Hibbard, 2010a). Information intermediaries can help in this regard. Engagement with organizations knowledgeable about and trusted by the target audience may assist in the dissemination of information and help to support the credibility of the information and its source (Sofaer and Hibbard, 2010b).

Methods of Dissemination

A final consideration for effective and efficient data dissemination is selection of the most appropriate and cost-effective method of dissemination. As previously mentioned, federal agencies have a variety of dissemination methods at their disposal, including traditional print media (e.g., reports, peer-reviewed articles, fact sheets, newsletters), electronic media (e.g., websites, podcasts), and public forums (e.g., conferences, planned meetings) (CDC, 2009), and frequently more than one method may be employed.

The target audience and the message and data to be conveyed are factors in the choice of dissemination method. An agency might choose to prepare a report or paper for a peer-reviewed professional journal if the goal is to transmit the information to health care systems or providers. Reports, newsletters, and fact sheets might be more effective in reaching policy makers or other agencies. Websites will reach a larger and broader audience, including members of the public. The type of information and style of presentation used for a professional journal will differ markedly from that prepared for dissemination on the agency website. The speed or urgency with which a message needs to be conveyed is another consideration in the selection of dissemination method.

CONCLUSIONS AND RECOMMENDATIONS

  • Estimation of the committee’s recommended indicators for HIV care and supportive services will often require combining data from multiple data systems. Making valid inferences about the indicators across different populations and over time using data from multiple data systems presents a range of analytic and logistical challenges. Such challenges will change over time and will have to be reevaluated periodically.
    Recommendation 6-1. At least once every 2 years, the Department of Health and Human Services should reevaluate mechanisms for combining data elements to estimate key indicators of HIV care and access to supportive services, analyze the combined data, and identify and address barriers to the efficient analysis of such data, including relevant statistical methodologies. To facilitate this process, HHS should engage a center of excellence representing broad areas of expertise that include information technology, statistical methodologies for combining data, and data system content.
    The center of excellence might also include experts in epidemiology and surveillance; laws and policies that affect access to HIV-related data; health services research, including insurance; medical informatics, including integration of public and private data sources to estimate population-level parameters; clinical HIV care and relevant social services; and community and patient perspectives. The center of excellence could address questions such as the extent to which proxy data elements can be used to estimate indicators; whether knowledge of an indicator for a subpopulation rather than the whole cohort of PLWHA might be acceptable for some indicators; and the level of accuracy to be demanded for any given indicator (e.g., whether estimates are needed within 1, 5, or 10 percentage points) given the potential costs of data collection and of obtaining very accurate indicator estimates.
  • Information on the indicators recommended by the committee will be of interest to a variety of stakeholders, including policy makers, health departments, HIV care providers, patients, and researchers. The disseminated information can be used in numerous ways, from informing policy decisions to supporting the development of research projects, that have the potential to improve the quality of HIV care.
    Recommendation 6-2. The Department of Health and Human Services should report to the public at least once every 2 years on indicators of HIV care and access to supportive services to foster improvements in the quality of HIV care and in monitoring progress toward meeting the goals of the National HIV/AIDS Strategy.
    The reporting interval of at least once every 2 years allows for regular reporting of the indicator data to monitor the NHAS while minimizing reporting burden and associated costs. To facilitate understanding and use of the indicator information by stakeholders, dissemination products and strategies may vary depending on the target audience and message to be conveyed. Information about the quality of the indicator data (e.g., confidence ranges for indicator estimates, use of proxy data elements) might be included in the dissemination product so that stakeholders are aware of the limitations of the data.
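
As a rough illustration of the trade-off between accuracy and cost raised under Recommendation 6-1 (estimates within 1, 5, or 10 percentage points), the sketch below computes the sample size needed to reach a given margin of error for a proportion-type indicator. It is not the committee's method. It assumes simple random sampling, a normal approximation, roughly 95 percent confidence (z = 1.96), and a worst-case proportion of 0.5; the function names are hypothetical. Real HIV data systems rely on complex designs and combined sources, so actual requirements would differ.

```python
import math


def sample_size_for_margin(margin, p=0.5, z=1.96):
    """Smallest n whose normal-approximation half-width is at most `margin`.

    Assumes a simple random sample. `p` is the anticipated proportion and
    `z` the standard normal quantile (1.96 for about 95% confidence).
    """
    if not 0 < margin < 1:
        raise ValueError("margin must be a proportion between 0 and 1")
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    raw = z ** 2 * p * (1 - p) / margin ** 2
    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(raw, 6))


def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the normal-approximation interval for a proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    # Accuracy targets named in the text: 1, 5, and 10 percentage points.
    for points in (1, 5, 10):
        n = sample_size_for_margin(points / 100)
        print(f"within {points} percentage point(s): n >= {n}")
```

Under these assumptions, tightening the target from 5 percentage points to 1 raises the required sample roughly 25-fold (from 385 to 9,604), which is the kind of cost consideration the center of excellence would need to weigh for each indicator.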

REFERENCES

  • CDC (Centers for Disease Control and Prevention). Basic Concepts for Disseminating and Communicating Surveillance Data. 2009. http://www.cdc.gov/pednss/how_to/disseminate_data/basic_concepts.htm (accessed January 19, 2012)
  • CDC. Principles of Epidemiology in Public Health Practice. 3rd ed. Atlanta, GA: CDC; 2011. Self-Study Course SS-1000. http://www.cdc.gov/training/products/ss1000/ss1000-ol.pdf (accessed January 19, 2012)
  • Chao A., Tsay P. K., Lin S., Shau W., Chao D. The applications of capture-recapture models to epidemiologic data. Statistics in Medicine. 2001;20:3123–3157. [PubMed: 11590637]
  • Dominici F., Samet J. M., Zeger S. L. Combining evidence on air pollution and daily mortality from the 20 largest U.S. cities: A hierarchical modelling strategy. Journal of the Royal Statistical Society. 2000;163(3):263–302.
  • Drucker E., Vermund S. Estimating population prevalence of human immunodeficiency virus infection in urban areas with high rates of intravenous drug use: A model of the Bronx in 1988. American Journal of Epidemiology. 1989;130(1):133–142. [PubMed: 2787104]
  • Dunn J., Ferri C. P. Epidemiologic methods for research with drug misusers: Review of methods for studying prevalence and morbidity. Revista de Saúde Pública. 1999;33(2):206–215. [PubMed: 10413939]
  • Farley T. A. Sexually transmitted diseases in the southeastern United States: Location, race, and social context. Sexually Transmitted Diseases. 2006;33(7):S58–S64. [PubMed: 16432486]
  • Fenton K. A., Johnson A. M., McManus S., Erens B. Measuring sexual behavior: Methodological challenges in survey research. Sexually Transmitted Infections. 2001;77:84–92. [PMC free article: PMC1744273] [PubMed: 11287683]
  • Han C., Chaloner K. Design of population studies of HIV dynamics. In: Tan W., Wu H., editors. Deterministic and Stochastic Models of HIV Epidemics and HIV Infections with Intervention. Hackensack, NJ: World Scientific Publishing Co.; 2005.
  • Heffron R., Donnell D., Rees H., Celum C., Mugo N., Were E., de Bruyn G., Nakku-Joloba E., Ngure K., Kiarie J., Coombs R. W., Baeten J. M. Use of hormonal contraceptives and risk of HIV-1 transmission: A prospective cohort study. Lancet Infectious Diseases. 2011. [PMC free article: PMC3266951] [PubMed: 21975269]
  • HHS (U.S. Department of Health and Human Services). HHS Guidelines for Ensuring and Maximizing the Quality, Objectivity, Utility, and Integrity of Information Disseminated to the Public. Part 1: HHS overview. 2006a. http://aspe.hhs.gov/infoquality/guidelines/part1.shtml (accessed January 19, 2012)
  • HHS. Guidelines for Ensuring the Quality of Information Disseminated to the Public. Centers for Disease Control and Prevention and Agency for Toxic Substances and Disease Registry. 2006b. http://aspe.hhs.gov/infoquality/guidelines/cdcinfo2.shtml (accessed January 19, 2012)
  • HHS. Guidelines for Ensuring the Quality of Information Disseminated to the Public. Health Resources and Services Administration. 2006c. http://aspe.hhs.gov/infoquality/guidelines/HRSAinfo2.shtml (accessed January 19, 2012)
  • HHS. News Release. 2010. Putting data and innovation to work to help communities and consumers improve health. http://www.hhs.gov/news/press/2010pres/06/20100602a.html (accessed January 19, 2012)
  • Hibbard J., Sofaer S. Best Practices in Public Reporting No. 1: How to Effectively Present Health Care Performance Data to Consumers. Bethesda, MD: Agency for Healthcare Research and Quality; 2010. AHRQ Publication No. 10-0082-EF.
  • Holmes K. K., Levine R., Weaver M. Effectiveness of condoms in preventing sexually transmitted infections. Bulletin of the World Health Organization. 2004;82(6):454–461. [PMC free article: PMC2622864] [PubMed: 15356939]
  • Hook E. B., Regal R. R. Capture-recapture methods in epidemiology: Methods and limitations. Epidemiologic Reviews. 1995;17(2):243–264. [PubMed: 8654510]
  • HRSA (Health Resources and Services Administration). Ryan White HIV/AIDS Program. 2009 State Profiles. 2012. http://hab.hrsa.gov/stateprofiles/index.htm (accessed March 6, 2012)
  • International Working Group for Disease Monitoring and Forecasting. Capture-recapture and multiple-record systems estimation I: History and theoretical development. American Journal of Epidemiology. 1995a;142(10):1047–1058. [PubMed: 7485050]
  • International Working Group for Disease Monitoring and Forecasting. Capture-recapture and multiple-record systems estimation II: Applications in human diseases. American Journal of Epidemiology. 1995b;142(10):1059–1068. [PubMed: 7485051]
  • IOM (Institute of Medicine). Health Literacy: A Prescription to End Confusion. Washington, DC: The National Academies Press; 2004. [PubMed: 25009856]
  • Kempf M., McLeod J., Boehme A. K., Walcott M. W., Wright L., Seal P., Norton W. E., Schumacher J. E., Mugavero M., Moneyham L. A qualitative study of the barriers and facilitators to retention-in-care among HIV-positive women in the rural southeastern United States: Implications for targeted interventions. AIDS Patient Care and STDs. 2010;24(8):515–520. [PubMed: 20672971]
  • Levkoff S., Sanchez H. Lessons learned about minority recruitment and retention from the Centers on Minority Aging and Health Promotion. Gerontologist. 2003;43:18–26. [PubMed: 12604742]
  • Lillie-Blanton M., Stone V. E., Jones A. S., Levi J., Golub E. T., Cohen M. H., Hessol N. A., Wilson T. E. Association of race, substance abuse, and health insurance coverage with use of highly active antiretroviral therapy among HIV-infected women, 2005. American Journal of Public Health. 2009;100:1493–1499. [PMC free article: PMC2901300] [PubMed: 19910347]
  • Lindley D. V., Smith A. F. M. Bayes estimates for the linear model. Journal of the Royal Statistical Society. 1972;34(1):1–41.
  • Little R. J. A., Rubin D. B. Statistical Analysis with Missing Data. 1st ed. Hoboken, New Jersey: Wiley and Sons, Inc.; 1987.
  • Marriott C., Palmer C., Lelliott P. Disseminating healthcare information: Getting the message across. Quality in Health Care. 2000;9:58–62. [PMC free article: PMC1743500] [PubMed: 10848372]
  • McGowan C. C., Weinstein D. D., Samenow C. P., Stinnette S. E., Barkinic G., Rubiero P. F., Sterling T. R., Moore R. D., Hulgan T. Drug use and receipt of highly active antiretroviral therapy among HIV-infected persons in two U.S. clinic cohorts. PLoS One. 2011;6(4) [PMC free article: PMC3081810] [PubMed: 21541016]
  • Moreno-John G., Gachie A., Fleming C. M., Napoles-Springer A., Mutran E. Ethnic minority older adults participating in clinical research: Developing trust. Journal of Aging and Health. 2004;16:93S–123S. [PubMed: 15448289]
  • Morris C. N., Normand S. L. Hierarchical models for combining information and for meta-analysis. In: Bernardo J. M., Berger J. O., Dawid A. P., Smith A. F. M., editors. Bayesian Statistics 4. Oxford, UK: Oxford University Press; 1992. pp. 321–344. (with discussion)
  • NRC (National Research Council). Combining Information: Statistical Issues and Opportunities for Research. Washington, DC: National Academy Press; 2002.
  • OMB (Office of Management and Budget). Guidelines for Ensuring and Maximizing the Quality, Objectivity, Utility, and Integrity of Information Disseminated by Federal Agencies. 2011. http://www.whitehouse.gov/omb/fedreg_final_information_quality_guidelines (accessed January 19, 2012)
  • Rhodes T. Risk environments and drug harms: A social science for harm reduction approach. International Journal of Drug Policy. 2009;20(3):193–201. [PubMed: 19147339]
  • Robins J. M. Association, causation, and marginal structural models. Synthese. 1999;121:151–179.
  • Sengupta S., Strauss R. P., DeVellis R., Quinn S. C., DeVellis B., Ware W. B. Factors affecting African-American participation in AIDS research. Journal of Acquired Immunodeficiency Syndromes. 2000;24(3):275–284. [PubMed: 10969353]
  • Sofaer S., Hibbard J. Best Practices in Public Reporting No. 2: Maximizing Consumer Understanding of Public Comparative Quality Reports: Effective Use of Explanatory Information. Bethesda, MD: Agency for Healthcare Research and Quality; 2010a. AHRQ Publication No. 10-0082-1-EF.
  • Sofaer S., Hibbard J. Best Practices in Public Reporting No. 3: How to Maximize Public Awareness and Use of Comparative Quality Reports Through Effective Promotion and Dissemination Strategies. Bethesda, MD: Agency for Healthcare Research and Quality; 2010b. AHRQ Publication No. 10-0082-2-EF.
  • Stephen C. Capture-recapture methods in epidemiological studies. Infection Control and Hospital Epidemiology. 1996;17(4):262–266. [PubMed: 8935735]
  • Suarez D., Haro J. M., Novick D., Ochoa S. Marginal structural models might overcome confounding when analyzing multiple treatment effects in observational studies. Journal of Clinical Epidemiology. 2008;61:525–530. [PubMed: 18471655]
  • Teresi J. Overview of methodological issues in the study of chronic care populations. Alzheimer’s Disease and Associated Disorders. 1994;8(Suppl 1):S247–S273. [PubMed: 8068268]
  • Tsiatis A. A. Semiparametric Theory and Missing Data. New York: Springer Science and Business Media, LLC; 2006.
Copyright 2012 by the National Academy of Sciences. All rights reserved.
Bookshelf ID: NBK201375
