Tracing Lung Cancer Risk Factors Through Mutational Signatures in Never-Smokers

Am J Epidemiol. 2021 Jun 1;190(6):962-976. doi: 10.1093/aje/kwaa234.

Abstract

Epidemiologic studies often rely on questionnaire data, exposure measurement tools, and/or biomarkers to identify risk factors and the underlying carcinogenic processes. An emerging and promising complementary approach to investigating cancer etiology is the study of somatic "mutational signatures" that endogenous and exogenous processes imprint on the cellular genome. These signatures can be identified from a complex web of somatic mutations thanks to advances in DNA sequencing technology and analytical algorithms. This approach is at the core of the Sherlock-Lung study (2018-ongoing), a retrospective case-only study of over 2,000 lung cancers in never-smokers (LCINS), which uses the different patterns of mutations observed within LCINS tumors to trace back possible exposures or endogenous processes. Whole-genome and transcriptome sequencing, genome-wide methylation, microbiome, and other analyses are integrated with data from histological and radiological imaging, lifestyle, demographic characteristics, environmental and occupational exposures, and medical records to classify LCINS into subtypes that could reveal distinct risk factors. To date, we have received samples and data from 1,370 LCINS cases from 17 study sites worldwide, and whole-genome sequencing has been completed on 1,257 samples. Here, we present the Sherlock-Lung study design and analytical strategy and illustrate some empirical challenges and the potential of this approach for future epidemiologic studies.

Keywords: genomic analyses; histology; lung cancer; mutational signatures; never-smokers; radiological imaging.

Publication types

  • Research Support, N.I.H., Intramural

MeSH terms

  • Causality
  • DNA Mutational Analysis / methods*
  • Genetic Predisposition to Disease / epidemiology*
  • Humans
  • Lung Neoplasms / genetics*
  • Retrospective Studies
  • Risk Assessment / methods*
  • Risk Factors
  • Whole Genome Sequencing / methods*