NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

National Research Council (US) Committee on Population; Finch CE, Vaupel JW, Kinsella K, editors. Cells and Surveys: Should Biological Measures Be Included in Social Science Research? Washington (DC): National Academies Press (US); 2001.

Cells and Surveys: Should Biological Measures Be Included in Social Science Research?

10. Applying Genetic Study Designs to Social and Behavioral Population Surveys

It is clear that population surveys comprising social and behavioral issues and hypotheses can be extended to include and explore epidemiological, public health, genetic, and other biologically oriented scientific themes, in part through the collection of genetic and other bioindicators that will inform these questions (Wallace, 1997). Well-defined, geographically representative cohorts, particularly those that are national in scope, are expensive and uncommon, which makes each one an important opportunity for many types of population research. They often contain a wealth of personal and family information that is important to understanding the causes and management of health problems. With the rapid advance of genetic knowledge and measurement technology, behaviorally oriented surveys may become important “laboratories” to seek genetic and other biological explanations for personal and social behaviors as well as to answer a broad range of scientific questions.

It is also clear that such surveys will not lend themselves well to all possible scientific questions equally. As much forethought and planning as possible should be undertaken before such surveys go into the field, so that there will be a maximum ability to shape design and data-collection methods. Surveys that are already in place will present special design challenges for enhancement with genetic marker or other bioindicator-collection activities, for both scientific and logistical reasons. For these same reasons, one must not assume that routine collection of particular specimen types, such as blood or urine, will automatically address a wide range of yet-to-be-defined general scientific questions, since type of specimen and preliminary handling and processing may not be appropriate for many cogent hypotheses that may be identified later. Also, no matter what the primary purposes of an existing survey and investigator expertise, there should be wide interdisciplinary consultation in order to assure that genetic or other bioindicator-related hypotheses are precise and feasibly assessed, and that the effort has a reasonable chance of providing useful scientific outcomes.

The purpose of this chapter is to introduce design and other methodological considerations on how to optimally extend geographically referent health and social survey cohorts to enhance their application to additional biological questions, focusing on genetic issues and collection of genetic bioindicators. While genetic studies may address any human trait or characteristic, the emphasis will be on health problems. It begins with a consideration of identifying and defining the basic age-related diseases, conditions, and other traits that are potential objects of genetic study. Next is a discussion on how to assess family structures within household surveys and the value of determining the familiality of diseases, conditions, and other traits of interest. This is followed by a review of genetic study designs and methods that could be applied to representative household surveys in order to address the contribution of genes and inheritance to disease etiology or progression and to age-related physiological and functional change. Finally, there is an overview of logistical considerations in approaching community-dwelling survey participants for specimen collection. It should be noted that there are many ethical issues related to the acquisition, processing, and interpretation of genetic and other bioindicators; these are addressed elsewhere in this volume by Botkin and by Durfy.

It is axiomatic that most human illnesses are due to both genetic and environmental causes. Thus, while this chapter emphasizes genetic bioindicators and their relation to health outcomes among older persons, it should be emphasized that many community-based surveys offer the opportunity to make environmental observations and collect environmental specimens that may be potentially important to defining health outcomes. For example, even before an interviewer approaches and enters a household, it is possible to collect outdoor air and soil samples for pollutants and to make observations on the quality and maintenance of the residence as well as impediments to outdoor mobility and the general challenges of the geographic terrain. Once inside the household, with appropriate consent it may be possible to collect (a) air samples within or near specific households for general air pollutants or specific contaminants such as radon or carbon monoxide; (b) peeling paint samples for lead or other heavy metal content; (c) household dust samples for toxic agents or pollutants; (d) temperature measurements in times of extreme weather conditions; (e) industrial hygiene monitors worn by inhabitants to assess physicochemical exposures over an extended period; or (f) biological specimens from household pets, which may reflect a wide variety of physicochemical exposures in common with the occupants, including diet. Environmental observations are critical in their own right, but are often important cofactors for genetically related diseases and other outcomes.

DEFINING THE PHENOTYPIC OUTCOMES IN POPULATION STUDIES

All rigorously observed living things undergo age-related changes in physiology, metabolism, structure, and behavior. The measures of these changes are the bioindicators of aging. Pursuing the biological mechanisms of age-related change is extremely important, and in less complex species (e.g., bacteria, yeast, roundworms), where a substantial amount of basic research is conducted, a variety of predictable age-related changes in physiology and metabolism have been observed. On the other hand, the concept of “disease” is an extremely complex notion, generally reserved for more complex organisms higher up on the evolutionary tree, since diseases often involve dimensions such as anatomic change, altered function, and suffering. Species in which diseases have received attention and study experience age-related increases in the rates of an important series of chronic conditions, many of which may lead to death directly or to a cascade of secondary events that are ultimately fatal. Of great interest, it appears that the rates of age-related changes and of diseases can vary widely within a species, and there is a large body of experimental evidence demonstrating that these changes are modifiable by both environmental and genetic manipulation.

However, one of several important dilemmas when studying aging mechanisms in simpler organisms is the problem of understanding how the biology of aging relates, if at all, to the occurrence of diseases and conditions in humans and other more complex organisms. An important question, as one explores the association of genetics and bioindicators with health outcomes, is whether the distinction between aging processes and formally conceived and designated diseases can be made. In genetic explorations, the question may be whether there are aging measures (in genetic parlance, phenotypes) that are conceptually, statistically, and pathogenetically independent of disease processes. Much useful study has been devoted to the processes of aging, but biological aging remains an extremely difficult commodity to define. The thesis here is that, for purposes of studying empirical associations between genetic bioindicators and health outcomes, it is largely immaterial and possibly distracting whether the genetic exposures and phenotypic outcomes are part of some underlying, unique biological process called aging or are abnormal clinical conditions such as atherosclerotic heart disease and cancer.

There are several reasons for the contention that distinguishing between biological aging and disease processes may be problematic. There is little agreement on a precise definition of aging, although many have offered general characteristics; this is usefully discussed by Arking (1998). Most scientific papers on the study of aging, basic or applied, do not offer definitions of aging as an explicit biological process separate from disease and dysfunction. Survivorship and longevity, among the most widely studied attributes of aging across species, are insufficient outcomes for the study of complex animal processes, particularly in humans or other mammals; nearly all humans die of one or more discrete, identifiable medical conditions. Further, most if not all hypothesized biological mechanisms of aging encompass concepts that have also been applied to disease causation and progression. For example, age-related shortening of chromosomal telomeres has been related both to aging processes and to carcinogenesis (Shay, 1997), as have cumulative somatic mutations (Vijg, 2000; Hernandez-Boussard et al., 1999) and age-related, progressively inefficient DNA repair processes (de Boer and Hoeijmakers, 2000). Even an environmental factor that experimentally has been shown to dramatically prolong mammalian survivorship as well as decrease the occurrence of age-related physiological change and disease, caloric restriction, has been shown to alter the rate of change in age-related gene function (Lee et al., 1999).

In summary, until further understanding of the biology of aging emerges, it seems best to consider that organisms have a set of complex, interactive, biologically malleable cell and tissue/organ machinery that is responsible for both diseases and age-related change. This malleability suggests that all age-related phenomena are not obligate, whether physiological or pathogenetic, and that interventions targeted to these phenomena may become very important for the enhancement of successful aging. Genetically related bioindicators of age-related change should be equally suited for their associations with endpoints that include disease and dysfunction as well as other age-related changes. Because the number of cell mechanisms that might be related to genetic bioindicators is so large, some way to conceptualize them may be helpful. Holliday (1998) has suggested one approach. He divides the energy flow that allows cells to survive into three categories: normal cell functions, reproduction, and organism maintenance. Examples of each are shown in Table 10-1, which provides a working taxonomy for considering categories of bioindicators for future study.

TABLE 10-1. Energy Resource Allocation in Mammals.

As a caution when applying genetic bioindicators in the search for the causes of disease outcomes in populations, it should be noted that almost all phenotypes of interest are biologically complex and related to multiple body systems and processes, including human and animal behaviors. Broad survival traits in aging studies, such as longevity, active life expectancy, or rates of change in important age-related functional activities, are emblematic of this issue. Age-related disease outcomes, including most of the major chronic illnesses, are also extremely complex and difficult to define as homogeneous phenotypes. Ellsworth and Manolio (1999) explain why this is the case:

  • Complex diseases have a high level of genetic complexity, where multiple genes, each with a relatively small effect, act independently or interact in important ways. For example, essential hypertension, proximal to many cardiovascular conditions, starts with 4 to 10 genes interacting with several environmental factors, leading to several intermediate phenotypes and finally to diseases in association with other genetically related factors such as obesity and alcohol use (Carretero and Oparil, 2000).
  • A single disease may have multiple manifestations with varying relationships to genetic influences.
  • Apparently homogeneous diseases may have multiple causes and pathogenetic mechanisms. For example, atherosclerosis may be due to combinations of lipid accumulation, endothelial injury, inflammation, and clotting abnormalities.
  • Individuals with preclinical illnesses may be indistinguishable from otherwise healthy persons because early detection methods may be inadequate. Most chronic illnesses in western societies have variable ages-at-onset of clinical symptoms.
  • Environmental factors may alter incidence rates and interact with important genes so that the severity of the condition may range from almost imperceptible to debilitating.
  • At times, the failure to find an association between a bioindicator and a condition with genetic causes may be due to the misclassification and aggregation of heterogeneous conditions that appear phenotypically similar.

ASSESSING THE FAMILIALITY AND HERITABILITY OF DISEASES, CONDITIONS, AND AGE-RELATED CHANGE

One of the most important preliminary ways to determine the likelihood that genetic factors play an important role in the genesis of disease, dysfunction, and age-related change is to assess the extent to which these phenotypes (diseases and other traits) occur more frequently within families than in the general population. Once this is determined, more detailed studies may be justified. While the clustering or unusual occurrence of age-related conditions or functional alterations within certain families does not per se distinguish between environmental and genetic explanations, their absence makes genetic factors less likely.
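As a minimal sketch of how familial clustering might be summarized from survey data, the trait frequency among relatives of affected persons can be compared with the general population frequency. The function name and all counts below are hypothetical illustrations, not values from any survey, and the ratio by itself cannot separate shared environment from shared genes.

```python
def recurrence_risk_ratio(affected_relatives, total_relatives, population_prevalence):
    """Trait frequency among relatives of affected persons divided by
    the trait frequency in the general population."""
    if total_relatives <= 0:
        raise ValueError("total_relatives must be positive")
    if not 0 < population_prevalence <= 1:
        raise ValueError("population_prevalence must be in (0, 1]")
    risk_in_relatives = affected_relatives / total_relatives
    return risk_in_relatives / population_prevalence


# Hypothetical example: 30 of 400 siblings of affected respondents have
# the trait, against an assumed population prevalence of 2.5 percent.
ratio = recurrence_risk_ratio(30, 400, 0.025)
print(f"Sibling recurrence risk ratio: {ratio:.1f}")
```

A ratio well above 1 would support more detailed study; a ratio near 1 would make a major genetic contribution less likely.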

Other common approaches to preliminarily assessing whether a trait has a genetic component are twin studies and adoption studies. In twin studies, some degree of genetic inheritance is inferred if a trait has a higher rate of concordance among monozygotic (i.e., identical) than among dizygotic (i.e., fraternal) twins. Despite some discussion on the meaning and interpretation of twin studies, evidence from these studies has served to suggest a role for genetic factors in conditions important to elders, such as macular degeneration (Gorin et al., 1999) and cognitive disability (Plomin and DeFries, 1998), but they have not fully resolved the role of inheritance in Parkinson's disease (Langston, 1998). In general, most representative population surveys, even those of substantial size, do not contain sufficient numbers of twin pairs to conduct twin studies per se. Rather, surveys may identify some twins that can be reported to regional twin registries for later studies, with appropriate permission of the participants.
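The twin comparison described above can be sketched as a pairwise concordance calculation. This is an illustration under simplifying assumptions (pairwise rather than probandwise concordance, no adjustment for how pairs were ascertained), and the pair counts are hypothetical.

```python
def pairwise_concordance(both_affected, one_affected):
    """Among twin pairs with at least one affected member,
    the share of pairs in which both members are affected."""
    pairs = both_affected + one_affected
    if pairs == 0:
        raise ValueError("no pairs with an affected twin")
    return both_affected / pairs


# Hypothetical pair counts for one trait.
mz = pairwise_concordance(both_affected=18, one_affected=42)  # monozygotic
dz = pairwise_concordance(both_affected=6, one_affected=54)   # dizygotic

# Higher concordance among monozygotic pairs is what suggests
# (but does not prove) some degree of genetic inheritance.
suggests_genetic_component = mz > dz
print(f"MZ {mz:.2f}, DZ {dz:.2f}, MZ > DZ: {suggests_genetic_component}")
```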

Similarly, adoption studies can shed light on genetic contributions to disease. Here, siblings raised in different households are contrasted for disease concordance and rates of familiality with those raised in the same household. This method, although used less than twin studies, has been fruitful in several disease domains, including mental illnesses such as schizophrenia (Heston, 1966). A recently suggested variation is to explore familiality among unrelated children of the same age raised in the same household (Anonymous, 2000).

These study models are not efficiently addressed using population methods, but knowing the population-referent character of study twins and adoptive families adds to their credibility. Other population applications, such as migration studies, have been used to explore the role of environmental exposures. Once there is reasonable evidence that genetic factors are important for a given health problem or other trait, examining family members of known relation to each other (i.e., a pedigree) using a variety of complex genetic study designs and models helps determine whether genetic markers are statistically linked to each other and are environmentally interactive as well.

Most pedigrees containing an unusual occurrence of a particular disease or trait are not discovered in population surveys, but rather in the clinical setting. This is mostly a matter of efficiency, because if population cohorts or registers were large enough and sufficiently well documented, informative pedigrees could be obtained from them. However, population surveys that emphasize demographic, family, and social hypotheses and collect family structure data and associated histories of diseases or other age-related phenomena can serve genetic study designs in other ways. They allow determination of the distribution of pedigree sizes available for more detailed genetic study. They provide detailed ethnic and other demographic characterization of the population, which can help avoid confounding in genetic studies (see below). They can also provide methods for locating participants and ascertaining vital status with improved accuracy. Finally, for certain studies, they can provide substantial samples of geographically referent sib pairs and other core family groups that form the basis for some genetic inquiries and models. Other uses of assembled population cohorts are described below.

If collection of pedigree information is undertaken, there are computer programs that can assist with this activity, available at regional genetics clinics or population genetics research programs. A typical pedigree contains the full names (including maiden names) of individuals, as well as their date and location of birth, vital status, residential location, gender, nature of relation to the index household individual or couple, and the presence of the particular disease, condition, or trait of interest. Useful pedigree data can be obtained from postal surveys (McKinley et al., 1996) as well as from personal interviews. It may be of value within population surveys to collect pedigree information in stages, taking more detailed information from subsampled families that meet certain eligibility criteria, such as having a member with a given condition or comprising a certain size or composition.
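The pedigree fields listed above, and the idea of staged collection, might be represented along the following lines. The class, field names, and eligibility thresholds are assumptions made for illustration; they do not describe any particular pedigree software.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class PedigreeMember:
    full_name: str
    gender: str
    relation_to_index: str              # relation to the index individual or couple
    maiden_name: Optional[str] = None
    birth_date: Optional[date] = None
    birth_place: Optional[str] = None
    alive: Optional[bool] = None        # vital status; None if unknown
    residence: Optional[str] = None
    has_trait: Optional[bool] = None    # None if the informant does not know


def eligible_for_detailed_stage(family: List[PedigreeMember], min_size: int = 4) -> bool:
    """Hypothetical second-stage rule: at least one member reported
    with the trait and a minimum number of recorded relatives."""
    any_affected = any(member.has_trait is True for member in family)
    return any_affected and len(family) >= min_size


# Hypothetical family reported by one survey respondent.
family = [
    PedigreeMember("Respondent One", "F", "index", has_trait=True),
    PedigreeMember("Sibling One", "M", "brother", has_trait=False),
    PedigreeMember("Sibling Two", "F", "sister", has_trait=None),
    PedigreeMember("Parent One", "F", "mother", alive=False, has_trait=False),
]
print(eligible_for_detailed_stage(family))
```

Keeping unknown values as `None` rather than `False` preserves the distinction between "not affected" and "not known", which matters given the reporting problems discussed in the text.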

There are potential problems in collecting family structure and concomitant disease occurrence information, most of which are common to general survey data collection. Putative blood relatives may in fact not be, and issues of paternity may be present. Survey respondents or other informants may not have accurate information on the age or disease status of even close family members. This has been observed in the study of Alzheimer's disease and dementia, where responses on familial presence of dementia were found to be sensitive but nonspecific (Kukull and Larson, 1989). A similar problem was observed in the reporting of familial orofacial birth defects (Romitti et al., 1997). In addition, many potentially informative family members are deceased or living remote from the site of the informant interview, making contact or specimen collection logistically more difficult. Some family members may choose not to participate in genetic studies. Thus, there are many reasons to be cautious in interpreting pedigree information.
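Where informant reports can be checked against a confirmed source, the "sensitive but nonspecific" pattern mentioned above can be quantified. The sketch below assumes each relative has both an informant report and a confirmed status; the data are hypothetical.

```python
def sensitivity_specificity(pairs):
    """pairs: iterable of (reported_by_informant, confirmed) booleans.
    Returns (sensitivity, specificity) of the informant reports."""
    tp = fn = tn = fp = 0
    for reported, confirmed in pairs:
        if confirmed and reported:
            tp += 1
        elif confirmed and not reported:
            fn += 1
        elif not confirmed and not reported:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need both confirmed-positive and confirmed-negative relatives")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical validation sample: 10 relatives with confirmed disease
# (9 reported), 20 without (6 wrongly reported as affected).
reports = ([(True, True)] * 9 + [(False, True)] * 1
           + [(False, False)] * 14 + [(True, False)] * 6)
sens, spec = sensitivity_specificity(reports)
print(f"sensitivity {sens:.2f}, specificity {spec:.2f}")
```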

However, there are ways to confirm and expand pedigree health information. One important method is to link pedigree members, when ethically feasible, to other data sources such as medical records, church or administrative records, or regional disease registries. In the United States, linkage to Health Care Financing Administration (HCFA) records, at least for persons 65 years of age and older, may confirm or refute the presence of many medical conditions. Similarly, linkage to regional vital record sources, the National Death Index, or the mortality file of the Social Security Administration (Hussey and Elo, 1997) may confirm deaths of relatives as well as help establish family relationships. Some family information may be contained within extended genealogy record systems that are available and suitable for analytic use (Thomas et al., 1999) or other more specialized registers, such as geographic disease or twin registers.

STUDY DESIGNS AND GENETIC BIOINDICATOR APPLICATIONS IN POPULATION STUDIES

There are a substantial number of genetic study designs used to find associations between gene markers or gene function and health outcomes. If one includes all bioindicators that might be collected using simple techniques, such as from venipuncture or urine collection, there would be nearly limitless opportunities for associational studies between these indicators and health, functional, or other aging outcomes. These could span a large number of social science, health, biological, ecological, and environmental disciplines and could be quite fruitful with the appropriate collaboration. The discussion below begins with some methodological cautions and continues with the population intersection with genetic studies.

General Methodological Issues

Sample Size Considerations

Approaching genetic applications in population surveys often begins with sample size considerations. In general, specific genes, alleles, or gene markers must be common enough to make a study informative, even if there are thousands of participants. This issue becomes more problematic when there is substantial heterogeneity (i.e., allelic variation) at a given genetic locus or when multiple genes may be involved. Even robust, geographically representative (generally household) survey populations may or may not be suitable for many hypotheses; this needs to be evaluated on an individual basis. This issue arises also when exploring gene-environment interactions, where occurrence rates for both the environmental exposures and the genetic markers of interest may be uncommon. Thus, some exposures cannot be adequately addressed, even in large survey cohorts.
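A rough sense of why uncommon markers and uncommon exposures strain even large cohorts comes from the expected number of participants who have both. The sketch below assumes, purely for illustration, that marker and exposure occur independently; the frequencies are hypothetical.

```python
def expected_joint_count(n_participants, marker_freq, exposure_freq):
    """Expected number of participants who both carry the marker and
    have the exposure, ASSUMING the two occur independently."""
    for freq in (marker_freq, exposure_freq):
        if not 0 <= freq <= 1:
            raise ValueError("frequencies must be between 0 and 1")
    return n_participants * marker_freq * exposure_freq


# Hypothetical cohort of 10,000 with a 2 percent marker frequency
# and a 5 percent exposure frequency.
expected = expected_joint_count(10_000, 0.02, 0.05)
print(f"Expected jointly exposed carriers: {expected:.0f}")
```

With only a handful of jointly exposed carriers expected, and fewer still developing any given outcome, an interaction hypothesis of this kind would be difficult to address in such a cohort.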

The same issues apply to various health and functional outcomes. They must be common enough to provide adequate statistical power, that is, to avoid Type II errors (failing to detect true associations) in models of exposure/outcome associations. For example, among older western cohorts of substantial size, some conditions such as clinical coronary artery disease; breast, colon and prostate cancer; and Alzheimer's disease, are likely to be quite common if the observation period is long enough. However, other conditions such as ovarian cancer, Parkinson's disease, or even hip fracture, may not be suitably assessed in prospective population studies unless the sample size and the follow-up interval are large. One way to enhance the number of outcomes for genetic studies of diseases is to consider intermediate outcomes. For example, very low bone density may serve as a surrogate for hip fracture, and adenomatous colon polyps may serve the same role for colon cancer. The use of intermediate outcomes requires consensus on their biological appropriateness. If the outcomes are age-related physiological or functional characteristics or traits that can be assessed in all or most survey participants over time, such as decreased renal or cognitive function, then smaller sample sizes may be feasible. Methods for addressing these sample size considerations are available and improving, and are very dependent on the particular genetic analytical models employed.
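One simplified way to see how marker frequency drives sample size is the textbook normal approximation for comparing two proportions, here the carrier frequency in cases versus controls. This is only a sketch under that assumption; it is not one of the specialized genetic analytical models referred to above, and the frequencies are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def cases_needed_per_group(p_cases, p_controls, alpha=0.05, power=0.80):
    """Approximate number of cases (and of controls) needed to detect a
    difference in carrier frequency with a two-sided test."""
    if p_cases == p_controls:
        raise ValueError("frequencies must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_cases + p_controls) / 2
    term_null = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    term_alt = z_beta * sqrt(p_cases * (1 - p_cases) + p_controls * (1 - p_controls))
    return ceil((term_null + term_alt) ** 2 / (p_cases - p_controls) ** 2)


# Hypothetical carrier frequencies: a common marker and a rare one,
# each 1.5 times as frequent in cases as in controls.
n_common = cases_needed_per_group(0.30, 0.20)
n_rare = cases_needed_per_group(0.03, 0.02)
print(n_common, n_rare)
```

For the same relative difference, the rarer marker requires many more cases, which is why uncommon outcomes combined with uncommon alleles may be out of reach for a single survey cohort.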

Incident Versus Prevalent Outcomes in Populations

When exploring genetic markers for associations with diseases or other aging outcomes, the distinction between prevalent and incident outcomes needs to be considered, or biases and spurious findings may occur as in epidemiological and social studies that explore environment-disease associations (Goldman et al., 1983). For example, cross-sectional analyses can be difficult to interpret because a gene or allele may be associated with surviving a disease rather than with its etiology. Further, the influence of a particular gene on a disease may cause clinical emergence at an earlier age, making the age structure and competing causes of death important determinants of the associations found. Even in prospective cohort studies, temporal associations between specific alleles present for a lifetime and disease occurrence can be complex to interpret because many clinical conditions develop in a preclinical state over many years, and many participants may be misclassified as normal when in fact nascent illnesses will emerge given enough time.
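The survival problem in cross-sectional analyses can be illustrated with simple arithmetic. In the sketch below the allele is assumed to have no effect on who develops the disease, only on who survives until the survey; all probabilities are hypothetical.

```python
def carrier_share_among_prevalent_cases(carrier_share_incident,
                                        survival_carriers,
                                        survival_noncarriers):
    """Share of carriers among cases still alive at the time of a
    cross-sectional survey, given the share among newly occurring cases
    and the probability of surviving to the survey in each group."""
    surviving_carriers = carrier_share_incident * survival_carriers
    surviving_noncarriers = (1 - carrier_share_incident) * survival_noncarriers
    total = surviving_carriers + surviving_noncarriers
    if total == 0:
        raise ValueError("no surviving cases")
    return surviving_carriers / total


# Hypothetical: 20 percent of new cases carry the allele (the same as
# the general population), but carriers are twice as likely to survive.
share = carrier_share_among_prevalent_cases(0.20, 0.80, 0.40)
print(f"Carrier share among prevalent cases: {share:.2f}")
```

Under these assumptions the allele appears enriched among prevalent cases even though it plays no role in etiology, which is the kind of spurious finding an incident-case design avoids.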

The Intersection of Defined Survey Populations with Genetic Studies

The following are some approaches to utilizing population-based cohorts for answering scientific questions related to genetics and health outcomes among older persons. Social science surveys are taking place in many countries, industrialized and developing, and the opportunities may be different according to the geographical locale and the prevalent environmental exposures and clinical conditions at hand. These approaches are not exhaustive, but highlight existing and emerging applications.

The Distribution of Gene Markers in Demographically Defined Populations—General Research Applications

Assessing the fundamental distributions of various gene markers in a population according to basic demographic characteristics can be extremely important for genetic research applications. Population surveys may supplement ongoing population registers in constructing genealogies by reconstructing biological kinship relationships among inhabitants (Gaimard et al., 1998). The study of populations with an increased occurrence of a known genetic disease may lead to identification of the ancestors and related demographic and geographic factors that were responsible for the increased population disease rates, such as has been done for cystic fibrosis (deBraekeleer et al., 1996). Repeated genetic sampling of spatially diverse populations may also provide substantial information on patterns of geographic migration (Epperson, 1998) and the population age of rare, nonrecurrent mutants (Rannala, 1997).
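Describing marker distributions by demographic characteristics starts from simple allele counting. The sketch below assumes a locus with two alleles (labeled A and a) and uses hypothetical genotype counts by age group.

```python
def allele_frequency(n_AA, n_Aa, n_aa):
    """Frequency of allele A from genotype counts at a biallelic locus:
    each AA person contributes two copies, each Aa person one."""
    n_people = n_AA + n_Aa + n_aa
    if n_people == 0:
        raise ValueError("no genotyped participants")
    return (2 * n_AA + n_Aa) / (2 * n_people)


# Hypothetical genotype counts (AA, Aa, aa) by age group.
counts_by_group = {
    "65-74": (90, 420, 490),
    "75-84": (40, 200, 260),
}
frequencies = {group: allele_frequency(*counts)
               for group, counts in counts_by_group.items()}
for group, freq in frequencies.items():
    print(f"{group}: allele A frequency {freq:.2f}")
```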

As noted above, gene-marker distributions characterized according to age, gender, ethnicity, or geographic locale can provide important information about population genetic heterogeneity and support sample-size calculations for more detailed genetic inquiries. Determining accurate population allele frequencies is important in the application of various genetic analytical models and for meeting their assumptions. In some of these models, selecting population controls for persons with a particular trait or condition for gene marker-disease association studies can lead to a kind of confounding bias called “population stratification,” where gene frequencies and penetrances at a locus vary between subpopulations and are not adequately controlled for by usual methods (Caporaso et al., 1999). Demographically well-characterized populations may help avoid some of this confounding by providing information on factors associated with these various gene frequencies.
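Population stratification can be illustrated with two hypothetical subpopulations in which the marker is unrelated to disease, yet pooling them produces an apparent association. The sketch below also shows a Mantel-Haenszel summary as one standard way of adjusting once subpopulation membership is known; the counts are invented for illustration.

```python
def odds_ratio(a, b, c, d):
    """a: carriers with disease, b: carriers without,
    c: non-carriers with disease, d: non-carriers without."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Summary odds ratio across strata of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical subpopulations of 1,000 persons each.
# A: marker common (50 percent), disease risk 10 percent in everyone.
# B: marker uncommon (10 percent), disease risk 2 percent in everyone.
stratum_a = (50, 450, 50, 450)
stratum_b = (2, 98, 18, 882)

pooled = tuple(x + y for x, y in zip(stratum_a, stratum_b))
crude = odds_ratio(*pooled)
adjusted = mantel_haenszel_or([stratum_a, stratum_b])
print(f"crude OR {crude:.2f}, stratum-adjusted OR {adjusted:.2f}")
```

The crude odds ratio is well above 1 although the marker has no effect within either subpopulation. The adjustment is only possible because each participant's subpopulation is recorded, which is where demographically well-characterized surveys can help.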

One ancillary research application of gene frequencies derived from population surveys is to assess the quality and completeness of disease registries for conditions that are known to be caused by major genes. Such population gene frequencies can supplement population-based disease registers by providing corroborating evidence on the changing occurrence of these conditions, such as the eye tumor retinoblastoma (Moll et al., 1997).

Founder Populations

Some populations are chosen for special genetic study, particularly for complex genetic traits, because they are more genetically homogeneous than other populations, often because of social or geographic isolation over many generations. At least in theory, this may allow easier detection of gene-phenotype associations because there is less genetic “noise.” Such approaches have several applications, such as in determining mutant mitochondrial DNA associated with dementia in a French-Canadian founder population (Chagnon et al., 1999) or in searching for genetic linkages to susceptibility genes for asthma (Ober et al., 1998). The founder population within Iceland has also received much attention, and the combined genetic and clinical information available is being used for many purposes (Enserink, 2000), despite ethical and political concerns. Founder populations that are well characterized according to demographic, social, and environmental factors can be extremely helpful in addressing genetic issues.

Gene Frequencies and Disease Occurrence—Association Studies

It is possible to utilize defined population cohorts to identify specific genes that may cause chronic conditions and disability among older persons. Standard population methods for exploring associations between gene markers and phenotypes, particularly disease outcomes, can be applied here as in other situations, including standard case-control and cohort analytic approaches that have been used for studies of environment-disease associations. Case-control studies may be less subject to bias when cases (i.e., persons with particular diseases or traits) and controls are obtained from geographically referent populations. Nesting case-control studies from within well-defined cohorts is an important way to better understand a study population.

Cohort studies addressing genetically defined “exposures” have an advantage over case-control studies in that many hypothesized outcomes can be evaluated simultaneously. Participants' genetic markers may be defined in different ways. They may be particular “candidate” (i.e., hypothesized) markers followed to determine whether they predict a particular outcome, or markers known to cause a condition followed to determine the general population impact. For example, a particular homozygous mutation has been detected in 85-90 percent of northern Europeans with the iron storage disease hemochromatosis. However, when this mutation was determined in a general Australian cohort (Olynyk et al., 1999), only half of the homozygous persons had clinical or serological features of the disease over a four-year period. This emphasizes one potential contribution of population-based studies relative to other designs.
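The proportion of marker carriers in a cohort who show features of the disease is an estimate of penetrance, and because carriers of a given genotype are usually few, its uncertainty is worth reporting. The sketch below uses the Wilson score interval as one reasonable choice; the counts are hypothetical and are not those of the study cited above.

```python
from math import sqrt
from statistics import NormalDist


def penetrance_with_interval(n_expressing, n_carriers, confidence=0.95):
    """Proportion of carriers expressing the condition, with a Wilson
    score confidence interval. Returns (estimate, lower, upper)."""
    if n_carriers <= 0 or not 0 <= n_expressing <= n_carriers:
        raise ValueError("need 0 <= n_expressing <= n_carriers and n_carriers > 0")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = n_expressing / n_carriers
    denom = 1 + z ** 2 / n_carriers
    centre = (p + z ** 2 / (2 * n_carriers)) / denom
    half_width = z * sqrt(p * (1 - p) / n_carriers
                          + z ** 2 / (4 * n_carriers ** 2)) / denom
    return p, centre - half_width, centre + half_width


# Hypothetical cohort: 16 homozygous participants, 8 with disease features.
estimate, lower, upper = penetrance_with_interval(8, 16)
print(f"penetrance {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```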

In the past most population genetic studies were designed, for reasons of efficiency, to evaluate candidate genes or gene markers selected for study based on prior linkage or associational methods, molecular or other studies, or on a priori reasoning. With 50,000-100,000 genes and wide allelic variation in the human genome and a much larger number of gene markers, including the potential to exploit the exact nucleotide sequence of the entire genome, it would seem difficult under the best of circumstances to determine and test all potential markers. Moreover, most chronic conditions of older persons are thought to be due to multiple, interactive genes as well as important environmental exposures (Collins, 1999), making the study of one or a few candidate genes increasingly likely to be unproductive. Due to rapidly advancing gene measurement technology, alternative approaches are emerging. One suggested approach, in essence, is to screen the entire genome using closely spaced genetic markers for associations to disease outcomes. This is still a resource-intensive activity that has not yet borne fruit, but it holds future promise.

Other approaches to defining genetic roles in disease and identifying the chromosomal location of these genes are based more on pedigrees and families. Linkage studies are performed in large pedigrees, where genetic heterogeneity is limited and the condition occurs with a high frequency. There are variations of pedigree studies performed to determine if particular gene markers are related to phenotypes when large pedigrees are not available, using related individuals in nuclear families, such as sib-pairs (Goring and Terwilliger, 2000), trios (i.e., two parents and an offspring), and co-twin, half-sib or other models that employ various combinations of family members (Nance, 1993). Participants for many of these study models can be recruited from population surveys if appropriate disease or phenotype information is available. It is not the intent of this chapter to review these methods; each must be performed with care and has incumbent problems in conduct, analysis, and interpretation.

Thus, there are many potential applications for population studies in the discovery and characterization of genes related to diseases and age-related change. In the past, population studies were more effectively employed to verify whether gene-disease associations found in pedigrees through the study of nuclear families were relevant to representative population groups, and to examine the clinical and public health implications of the associations. Important recent examples include studies where particular genes have been shown to have direct relevance to the risk of common disabling conditions of older persons in the community, such as the relation of apolipoprotein E alleles to Alzheimer's dementia (Saunders et al., 1993) and HPC alleles to the occurrence of prostate cancer (Xu et al., 1998). However, technological and analytical advances have given new importance to gene discovery studies in defined populations. It has been the history of gene-disease searches that many valid associations found within individual, high-occurrence families often are not relevant to many other families or to the large number of persons with that condition in the community. This does not diminish the importance of the findings, particularly for addressing the clinical and genetic concerns of certain high-risk pedigrees or for understanding the pathogenesis of the disease, but the further application to population studies may be limited.

Gene-Environment Interaction

Since many population surveys have collected a variety of environmental data on individuals or environments, there are opportunities to use this information in association with genetic bioindicators. Some examples include the search for genetic susceptibility to conditions such as lung cancer associated with cigarette smoking (Shields, 1999), the demonstration of genetically based interindividual differences in the metabolism of environmental toxicants (Guengerich et al., 1999), and the demonstration of the genetic role in the regulation and clinical presentation of environmentally acquired infections (Garcia et al., 1999). Case-control methodology has been used to address the genetic susceptibility to conditions where environmental exposures play a major role (Brennan, 1999). Community-based behavioral and psychiatric studies have profitably evaluated genetic and environmental effects on various mental illnesses (McGuffin and Martin, 1999).
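To make the case-control logic concrete, the following minimal sketch computes odds ratios for genotype alone, exposure alone, and both together, each relative to unexposed noncarriers, and then a ratio that summarizes departure from a multiplicative joint effect. All counts are hypothetical, and the multiplicative scale is an assumption of this sketch; other scales (e.g., additive) are also used in practice.

```python
def gene_environment_summary(counts):
    """Summarize a 2x2x2 case-control table.

    `counts` maps (carrier, exposed) -> (cases, controls), where
    `carrier` and `exposed` are booleans. Odds ratios are computed
    against the (False, False) reference group.
    """
    ref_cases, ref_controls = counts[(False, False)]

    def or_vs_reference(key):
        cases, controls = counts[key]
        return (cases * ref_controls) / (controls * ref_cases)

    or_gene = or_vs_reference((True, False))
    or_exposure = or_vs_reference((False, True))
    or_both = or_vs_reference((True, True))
    return {
        "or_gene_only": or_gene,
        "or_exposure_only": or_exposure,
        "or_both": or_both,
        # A value of 1.0 means the joint effect equals the product
        # of the separate effects (no multiplicative interaction).
        "interaction_ratio": or_both / (or_gene * or_exposure),
    }


# Hypothetical counts: (carrier, exposed) -> (cases, controls)
hypothetical_table = {
    (False, False): (50, 100),
    (True, False): (30, 50),
    (False, True): (80, 80),
    (True, True): (72, 30),
}
ge_summary = gene_environment_summary(hypothetical_table)
for label, value in ge_summary.items():
    print(f"{label}: {value:.2f}")
```

In this invented example the joint odds ratio is larger than the product of the two separate odds ratios, which is the pattern one would look for when asking whether a genotype modifies susceptibility to an exposure. A real analysis would also require confidence intervals and adjustment for confounders.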

Public Health and Clinical Applications of Genetically Characterized Populations

Determining the genetic characteristics of defined populations can have immediate and important clinical or public health implications. Such information is critical for planning and locating genetic screening and counseling programs, such as for neonatal metabolic disorders or conditions common among older persons. Characterization of gene frequencies by age, gender, ethnicity, or other factors will allow more efficiently targeted screening. As genes for disease or dysfunction susceptibility within individuals are identified, clinical prevention programs (Coughlin, 1999) can be invoked with greater intensity, or various environmental exposures can be more rigorously avoided. Knowing the distribution of relevant gene markers in populations can improve disease diagnosis and prognosis. In the future it may be possible to use the information obtained for targeting various local populations or individual families for disease prevention or treatment through gene therapy itself (Collins, 1999).
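As a small worked example of characterizing gene frequencies by subgroup, the sketch below derives allele and carrier frequencies from genotype counts within survey strata. The strata labels and counts are hypothetical, and a single two-allele marker is assumed for simplicity.

```python
def allele_frequency(homozygous_ref, heterozygous, homozygous_alt):
    """Frequency of the alternate allele from genotype counts."""
    n_people = homozygous_ref + heterozygous + homozygous_alt
    if n_people == 0:
        raise ValueError("stratum has no genotyped participants")
    # Each person carries two alleles; homozygotes contribute two copies.
    return (heterozygous + 2 * homozygous_alt) / (2 * n_people)


def carrier_frequency(homozygous_ref, heterozygous, homozygous_alt):
    """Proportion of people with at least one alternate allele."""
    n_people = homozygous_ref + heterozygous + homozygous_alt
    if n_people == 0:
        raise ValueError("stratum has no genotyped participants")
    return (heterozygous + homozygous_alt) / n_people


def summarize_strata(genotype_counts):
    """Compute per-stratum allele and carrier frequencies."""
    return {
        stratum: {
            "n": sum(counts),
            "allele_frequency": allele_frequency(*counts),
            "carrier_frequency": carrier_frequency(*counts),
        }
        for stratum, counts in genotype_counts.items()
    }


# Hypothetical counts: (homozygous_ref, heterozygous, homozygous_alt)
hypothetical_counts = {
    "age 65-74": (810, 180, 10),
    "age 75+": (425, 70, 5),
}
strata_summary = summarize_strata(hypothetical_counts)
for stratum, stats in strata_summary.items():
    print(stratum, stats)
```

In a representative survey these proportions would normally be computed with sampling weights; the unweighted version here is only meant to show the arithmetic.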

THE LOGISTICS OF SPECIMEN COLLECTION IN POPULATION SURVEYS

There are many issues involved in specimen collection in population surveys, from respondents' willingness and ability to participate to the handling and biological determinations of materials collected. All of these have to be considered when approaching populations; each potential specimen and bioindicator will have its own set of logistical challenges and methodological difficulties. The following are some of the major considerations.

The Impact of Specimen Collection on Survey Participation

There is very little formally published information or synthesis on how specimen collection affects participation rates. In general, specimen acquisition rates are somewhat lower than interview response rates, even when specimens are collected at a separate session. Some interviewees may be less willing to participate, even if prior detailed study explanations and informed consent procedures suggest minimum risk and high potential scientific payoff. This reticence may be based on unlikely fears (e.g., revealing fatal illnesses, surreptitious testing for the Human Immunodeficiency Virus) or on prior untoward experiences in survey or clinical settings (e.g., fainting after venipuncture). Such situations should be anticipated as much as possible and avoided by appropriate staff training and pretesting of presentation techniques to potential respondents. The overriding issue is whether specimen collection imperils long-term participation rates in panel (cohort) studies, where further waves of data collection are planned and long-term participation is paramount. More field experiments and better documentation of field experience are needed on this issue, particularly among older persons.

Sites of Specimen Collection

An important early logistical consideration is to determine where biological specimens should be collected. Experience dictates that blood and urine specimens can reliably be obtained in the home, although in the latter case explicit participant instruction and monitoring are necessary. Hair clippings, skin scrapings, cheek swabs, and washes/rinses for genetic analysis can usually be obtained without difficulty in the home. Successful self-collection of DNA from oral epithelial cells has been reported (Harty et al., 2000). However, there are other types of specimens that require more equipment or a medical setting to acquire, such as semen specimens, skin or adipose tissue biopsies, multiple blood specimens over many hours, or specimens collected in association with complex physiological testing. Here, transporting participants to facilities structured to deal with complex specimen issues may be necessary. In certain circumstances, an additional option for venipuncture is to send a voucher, instructions, and a mailing container to participants and request that blood be drawn by their local physician, clinic, or laboratory.

Irrespective of where the specimens are collected, close attention to rigorous protocols is needed to maximize the scientific yield. Table 10-2 highlights the problems that may occur at almost any stage of the acquisition and determination process. Specimen acquisition from older populations may pose special problems. Older persons may be living in institutional or other long-term care settings or in guarded residential environments, and may not be easily available for study. Even among community-dwelling elders, cognitive impairment or the absence of assistance and supervision may lead to failures in specimen collection protocols at home. Older persons may not tolerate withdrawal of the amounts of blood that would be tolerated by younger participants, or venipuncture may be more complex due to poor vein anatomy or access, perhaps partly due to prior intensive medical care. Frequent incontinence may make urine collection procedures more difficult. The extent of these and related problems may be anticipated by appropriate pretesting procedures.

TABLE 10-2. Sources of Problems in the Acquisition of Bioindicator Specimens in Field Studies.

Specimen Transport and Storage

As more sophisticated bioindicators have been applied to population surveys, the logistics of specimen acquisition have similarly become more challenging. Various bioindicators have different levels of chemical stability after removal from the body and require different processing techniques; this must be known in advance. Some specimens require immediate icing or freezing for transport to the laboratory (a type of “cold chain”) for long-term storage or processing; others are relatively stable and can be handled with less rigor. Cells that will be kept alive in culture will require immediate special-handling techniques.

These same issues pertain to long-term storage of specimens as well. Some molecules, such as serum immunoglobulins (e.g., antibodies to infectious agents), steroid hormones (e.g., estrogen or testosterone), and DNA, are relatively stable and can be profitably processed and stored (frozen) with relatively little loss or degradation. The same is true for DNA adducts (DNA to which environmental chemicals are bound) and for other environmental, elemental chemicals, such as heavy metals. Other molecules, however, such as certain enzymes, lipids, fatty acids, or small peptides (proteins), may require precise and rapid processing; long-term storage may not be feasible. If the bioindicators are to include complex cell functions, such as for the study of gene function, living cells such as lymphocytes must be rapidly isolated and placed in cell cultures. Cells obtained for cytogenetic studies will also require rapid processing. It is also possible to “immortalize” and store, for future study, living cells that are necessary for assessment of gene function, but fastidious procedures and protocol adherence are necessary. Specimen processing and storage issues are paramount and must be planned well in advance of fieldwork.

Investigators should also be alert to the challenges of long-term specimen storage. One issue is that the long-term stability of many chemicals in storage is not always known and must be determined by trial and error. This is particularly a problem with analyses that are not anticipated at the time of specimen collection. Additional problems may include lack of storage space or of the long-term funding required to maintain the specimens, enhanced degradation of specimens from repeated freezing and thawing (usually avoidable by aliquoting a given specimen into multiple, small containers), and contamination of specimens due to chemicals in the storage containers. Ethical issues also exist with respect to long-term specimen ownership and competing scientific themes for specimen disposition, particularly after the main activities of the study have been completed.

Alternative Sources for Genetic Bioindicators

In some circumstances, such as when a potentially informative individual is deceased or otherwise not available for genetic study, it may be possible to acquire genetic or other bioindicator specimens from alternative sources. As noted by Martin and Hu in this volume, stored tissue specimens on which genetic or other determinations can be performed may be archived in hospitals or pathology laboratories. Such sources may include surgical and autopsy specimens, cytology specimens (e.g., Pap smears), and other blood specimens obtained for hematological or chemical determinations. Unfortunately, these specimens are not being retained as long as might be desired for investigational purposes. In some cases, there may be additional genetic specimen sources other than from routine clinical care, if the scientific needs are compelling. For example, blood may have been archived in certain occupational settings, such as where employees may be exposed to infectious agents, and it is routinely collected from those serving in military careers. It is also possible to extract analyzable DNA from some serum specimens (Goessl, 2000), even if no cellular material is available. Some research institutions maintain long-term tissue archives for specific organs or conditions, often related to brain disease or cancer. An individual may have stored ova or semen that could be retrieved and analyzed. However, one must be alert to selection bias in the acquisition of specimens from tissue banks (Winn and Gunter, 1993), possibly leading to spurious analytical results. In summary, there may be special opportunities to enhance specimen acquisition if appropriate investigational and ethical hurdles can be overcome.

What Is the Optimal Age for Genetic Specimen Acquisition?

The optimal age is not always clear, but there is a reasonable argument for collecting genetic information on pedigree members before the senium. Often in human aging studies the phenotype is longevity, and the shorter-lived members (i.e., the control group) are by definition not available when the longest-lived persons are studied. This is one justification for routine collection of DNA at the start of panel studies at an age when most participants are still available for study.

CONCLUSION

Substantial opportunities exist for important scientific contributions when specimen collection for genetic and environmental bioindicators is applied to existing or planned representative population surveys originally intended for behavioral, social, or economic purposes. In some instances, questions of import can be answered only by this approach. Attention to study design and specimen collection, as well as to the ethical dimensions of human participation, is critical. Partnering with geneticists, environmental scientists, epidemiologists, and molecular biologists should be highly productive.

REFERENCES

  • Anonymous. Offbeat twins. Science. 2000;288:1735.
  • Arking R. Biology of Aging. Second Edition. Sinauer Associates, Inc.; 1998.
  • Brennan P. Design and analysis issues in case-control studies addressing genetic susceptibility. IARC Scientific Publications (Lyon). 1999;148:123–132. [PubMed: 10493254]
  • Caporaso N, Rothman N, Wacholder S. Case-control studies of common alleles and environmental factors. Monographs of the National Cancer Institute. 1999;26:25–30. [PubMed: 10854482]
  • Carretero OA, Oparil S. Essential hypertension. Part I: Definition and etiology. Circulation. 2000;101:329–335. [PubMed: 10645931]
  • Chagnon P, Gee M, Filion M, Robitaille Y, Belouchi M, Gauvreau D. Phylogenetic analysis of the mitochondrial genome indicates significant differences between patients with Alzheimer's disease and controls in a French-Canadian founder population. American Journal of Medical Genetics. 1999;85:20–30. [PubMed: 10377009]
  • Collins FS. Shattuck Lecture: Medical and societal consequences of the Human Genome Project. New England Journal of Medicine. 1999;341:28–37. [PubMed: 10387940]
  • Coughlin SS. The intersection of genetics, public health, and preventive medicine. American Journal of Preventive Medicine. 1999;16(2):89–90. [PubMed: 10343883]
  • de Boer J, Hoeijmakers JH. Nucleotide excision repair and human syndromes. Carcinogenesis. 2000;21(3):453–460. [PubMed: 10688865]
  • deBraekeleer M, Daigneault J, Allard C, Simard F, Aubin G. Genealogy and geographical distribution of CFTR mutations in Saguenay Lac-Saint-Jean (Quebec, Canada). Annals of Human Biology. 1996;23:345–352. [PubMed: 8886242]
  • Ellsworth DL, Manolio T. The emerging importance of genetics in epidemiological research II. Issues in study design and gene mapping. Annals of Epidemiology. 1999;9:75–90. [PubMed: 10037550]
  • Enserink M. Start-up claims piece of Iceland's gene pie. Science. 2000;287:951. [PubMed: 10691565]
  • Epperson BK. Gene genealogies in geographically structured populations. Genetics. 1998;86:156–161.
  • Gaimard M, Dilumbu I, Louame P, Bellis G, Assouan A, Chaventre A. Registers and follow-up methods of populations in a public health survey: The example of the village Glanle in Ivory Coast. Collegium Anthropologicum. 1998;22:63–75. [PubMed: 10097421]
  • Garcia A, Abel L, Cot M, Richard P, Ranque S, Feingold J, Demenais F, Boussinesq M, Chippaux JP. Genetic epidemiology of host predisposition to microfilaraemia in human loiasis. Tropical Medicine and International Health. 1999;4:565–574. [PubMed: 10499080]
  • Goessl C. Laser-fluorescence microsatellite analysis and new results in microsatellite analysis of plasma/serum DNA of cancer patients. Annals of the New York Academy of Sciences. 2000;906:63–66. [PubMed: 10818598]
  • Goldman L, Mudge GH Jr., Cook EF. The changing “natural history” of symptomatic coronary artery disease: Basis versus bias. American Journal of Cardiology. 1983;51(3):449–454. [PubMed: 6401909]
  • Gorin MB, Breitner JC, De Jong PT, Hageman GS, Klaver CC, Kuehn MH, Seddon JM. The genetics of age-related macular degeneration. Molecular Vision. 1999;5:29. [PubMed: 10562653]
  • Goring HH, Terwilliger JD. Linkage analysis in the presence of errors IV: Joint pseudomarker analysis of linkage and/or linkage disequilibrium on a mixture of pedigrees and singletons when the mode of inheritance cannot be accurately specified. American Journal of Human Genetics. 2000;66:1310–1327. [PMC free article: PMC1288197] [PubMed: 10731466]
  • Guengerich FP, Parikh A, Turesky RJ, Josephy PD. Inter-individual differences in the metabolism of environmental toxicants: Cytochrome P450 1A2 as a prototype. Mutation Research. 1999;428:115–124. [PubMed: 10517985]
  • Harty LC, Shields PG, Winn DM, Caporaso NE, Hayes RB. Self-collection of oral epithelial cells under instruction from epidemiological interviewers. American Journal of Epidemiology. 2000;151:199–205. [PubMed: 10645823]
  • Hernandez-Boussard T, Montesano R, Hainaut P. Analysis of somatic mutations of the p53 gene in human cancers: A tool to generate hypotheses about the natural history of cancer. IARC Scientific Publications (Lyon). 1999;146:43–53. [PubMed: 10353383]
  • Heston LL. Psychiatric disorders in foster home reared children of schizophrenic mothers. British Journal of Psychiatry. 1966;112:819–825. [PubMed: 5966555]
  • Holliday R. Causes of aging. Annals of the New York Academy of Sciences. 1998;854:61–71. [PubMed: 9928420]
  • Hussey JM, Elo IT. Cause-specific mortality among older African-Americans: Correlates and consequences of age misreporting. Social Biology. 1997;44(3-4):227–246. [PubMed: 9446963]
  • Kukull WA, Larson EB. Distinguishing Alzheimer's disease from other dementias. Questionnaire responses of close relatives and autopsy results. Journal of the American Geriatrics Society. 1989;37:521–527. [PubMed: 2654258]
  • Langston JW. Epidemiology versus genetics in Parkinson's disease: Progress is solving the age-old debate. Annals of Neurology. 1998;44(Suppl 1):S45–52. [PubMed: 9749572]
  • Lee CK, Klopp RG, Weindruch R, Prolla TA. Gene expression profile of aging and its retardation by caloric restriction. Science. 1999;285(5432):1390–1393. [PubMed: 10464095]
  • McGuffin P, Martin N. Science, medicine and the future: Behavior and genes. British Medical Journal. 1999;319:37–40. [PMC free article: PMC1116141] [PubMed: 10390460]
  • McKinley AG, Russell SE, Spence RA, Odling-Smee W, Nevin NC. Hereditary breast cancer in Northern Ireland. Ulster Medical Journal. 1996;65(2):113–117. [PMC free article: PMC2448579] [PubMed: 8979776]
  • Moll AC, Kuik DJ, Bouter LM, Den Otter W, Bezemer PD, Koten JW, Imhof SM, Kuyt BP, Tan KE. Incidence and survival of retinoblastoma in The Netherlands: A register-based study 1962-1995. British Journal of Ophthalmology. 1997;81:559–562. [PMC free article: PMC1722238] [PubMed: 9290369]
  • Nance WE. 1992 American Society of Human Genetics presidential address: Back to the future. American Journal of Human Genetics. 1993;53:6–15. [PMC free article: PMC1682233] [PubMed: 8317499]
  • Ober C, Cox NJ, Abney M, Di Rienzo A, Lander ES, Changyaleket B, Gidley H, Kurtz B, Lee J, Nance M, Pettersson A, Prescott J, Richardson A, Schlenker E, Summerhill E, Willadsen S. Genome-wide search for asthma susceptibility loci in a founder population. Human Molecular Genetics. 1998;7:1393–1398. [PubMed: 9700192]
  • Olynyk JK, Cullen DJ, Aquilia S, Rossi E, Summerville L, Powell LW. A population-based study of the clinical expression of the hemochromatosis gene. New England Journal of Medicine. 1999;341:718–724. [PubMed: 10471457]
  • Plomin R, DeFries JC. The genetics of cognitive abilities and disabilities. Scientific American. 1998;278:62–69. [PubMed: 9569675]
  • Rannala B. On the genealogy of a rare allele. Theoretical Population Biology. 1997;52:216–223. [PubMed: 9466962]
  • Romitti PA, Burns TL, Murray JC. Maternal interview reports of family history of birth defects: Evaluation from a population-based case-control study of orofacial clefts. American Journal of Medical Genetics. 1997;72:422–429. [PubMed: 9375725]
  • Saunders AM, Strittmatter WH, Schmechel D, et al. Association of the apolipoprotein E allele E4 with late-onset familial and sporadic Alzheimer's disease. Neurology. 1993;43:1467–1472. [PubMed: 8350998]
  • Shay JW. Telomerase in human development and cancer. Journal of Cellular Physiology. 1997;173(2):266–270. [PubMed: 9365534]
  • Shields PG. Molecular epidemiology of lung cancer. Annals of Oncology. 1999;10(Suppl 5):S7–S11. [PubMed: 10582132]
  • Thomas A, Cannon-Albright L, Bansal A, Skolnick MH. Familial associations between cancer sites. Computers & Biomedical Research. 1999;32(6):517–529. [PubMed: 10587469]
  • Vijg J. Somatic mutations and aging: A re-evaluation. Mutation Research. 2000;447(1):117–135. [PubMed: 10686308]
  • Wallace RB. The potential of population surveys for genetic studies. In: Wachter KW, Finch CE, editors. Between Zeus and the Salmon: The Biodemography of Longevity. Washington, DC: National Academy Press; 1997. pp. 234–244. [PubMed: 22973581]
  • Winn DM, Gunter EW. Biological specimen banks: A resource for molecular epidemiologic studies. In: Schulte PA, Perera FP, editors. Molecular Epidemiology: Principles and Practices. New York: Academic Press; 1993. pp. 217–234.
  • Xu J, Meyers D, Freije D, et al. Evidence for a prostate cancer susceptibility locus on the X chromosome. Nature Genetics. 1998;20(2):175–179. [PubMed: 9771711]
Copyright © 2001, National Academy of Sciences.
Bookshelf ID: NBK110052
