An integrative bioinformatics approach reveals coding and non-coding gene variants associated with gene expression profiles and outcome in breast cancer molecular subtypes

Br J Cancer. 2018 Apr;118(8):1107-1114. doi: 10.1038/s41416-018-0030-0. Epub 2018 Mar 21.

Abstract

Background: Sequence variations in coding and non-coding regions of the genome can affect gene expression and signalling pathways, which in turn may influence disease outcome.

Methods: In this study, we integrated somatic mutations, gene expression and clinical data from 930 breast cancer patients included in the TCGA database. Genes associated with single mutations in molecular breast cancer subtypes were identified by the Mann-Whitney U-test, and their prognostic value was evaluated by Kaplan-Meier and Cox regression analyses. Results were confirmed using gene expression profiles from the METABRIC data set (n = 1988) and whole-genome sequencing data from the TCGA cohort (n = 117).
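
As an illustration of two of the statistical tools named in the Methods, the following minimal Python sketch computes a two-sided Mann-Whitney U-test (normal approximation with tie correction, no continuity correction) and a Kaplan-Meier survival estimate. It is not the authors' pipeline: the function names, the toy expression values and the toy follow-up times are all hypothetical, and a real analysis would more likely rely on an established statistics package.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U-test via the normal approximation.

    Returns (U statistic for x, p-value). Ties receive average ranks and the
    variance is tie-corrected; no continuity correction is applied.
    Assumes the pooled values are not all identical.
    """
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1..j
        t = j - i                     # size of this tie group
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u_x, p_value


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival probability).

    events[i] is 1 if the event (e.g. death) was observed at times[i] and
    0 if the patient was censored at that time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored cases both leave the risk set
    return curve


# Hypothetical toy data: expression of one gene in mutated vs wild-type tumours.
mutated = [5.1, 6.3, 7.0, 6.8, 5.9]
wild_type = [4.2, 4.8, 5.0, 3.9, 4.5, 5.2]
u_stat, p = mann_whitney_u(mutated, wild_type)
print(f"U = {u_stat:.1f}, two-sided p = {p:.4f}")

# Hypothetical follow-up times (months) and event indicators (1 = death).
follow_up = [12, 20, 20, 35, 40, 52]
observed = [1, 0, 1, 1, 0, 1]
for t, s in kaplan_meier(follow_up, observed):
    print(f"t = {t:>3} months, S(t) = {s:.3f}")
```

In a workflow of the kind described, a test like `mann_whitney_u` would be applied gene by gene within a subtype, and the resulting gene signatures would then be assessed with survival curves and Cox regression; multiple-testing correction and the Cox model itself are outside the scope of this sketch.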

Results: The overall mutation rates in coding and non-coding regions were significantly higher in ER-negative/HER2-negative tumours (P = 2.8E-03 and P = 2.4E-07, respectively). Recurrent sequence variations were identified in non-coding regulatory regions of several cancer-associated genes, including NBPF1, PIK3CA and TP53. After multivariate regression analysis, gene signatures associated with three coding mutations (CDH1, MAP3K1 and TP53) and two non-coding variants (CRTC3 and STAG2) in cancer-related genes predicted prognosis in ER-positive/HER2-negative tumours.

Conclusions: These findings demonstrate that sequence alterations influence gene expression and oncogenic pathways, possibly affecting the outcome of breast cancer patients. Our data provide potential opportunities to identify non-coding variations with functional and clinical relevance in breast cancer.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers, Tumor / genetics*
  • Breast Neoplasms / classification
  • Breast Neoplasms / diagnosis*
  • Breast Neoplasms / genetics*
  • Breast Neoplasms / mortality
  • Computational Biology / methods
  • Diagnosis, Differential
  • Female
  • Follow-Up Studies
  • Gene Expression Regulation, Neoplastic
  • Humans
  • Open Reading Frames / genetics*
  • Prognosis
  • RNA, Untranslated / genetics*
  • Survival Analysis
  • Systems Integration
  • Transcriptome*
  • Treatment Outcome

Substances

  • Biomarkers, Tumor
  • RNA, Untranslated