Bioinformatics analysis of the genes involved in the extension of prostate cancer to adjacent lymph nodes by supervised and unsupervised machine learning methods: The role of SPAG1 and PLEKHF2

Genomics. 2020 Nov;112(6):3871-3882. doi: 10.1016/j.ygeno.2020.06.035. Epub 2020 Jun 30.

Abstract

The present study aimed to identify genes associated with the involvement of adjacent lymph nodes in patients with prostate cancer (PCa) and to provide information for identifying potential diagnostic biomarkers and pathological genes in PCa metastasis. The most important candidate genes were identified through several machine learning approaches, including K-means clustering, neural networks, Naïve Bayesian classification, and principal component analysis (PCA), with or without downsampling. In total, 21 genes associated with lymph node involvement were identified. Among them, nine have been identified in metastatic prostate cancer, six in other metastatic cancers, and four in other local cancers. The amplification of the candidate genes was evaluated in other PCa datasets. In addition, we identified a validated set of genes involved in PCa metastasis. Amplification of the SPAG1 and PLEKHF2 genes was associated with decreased survival in patients with PCa.

Keywords: Gene expression analysis; Machine learning; Metastasis; Prostate cancer.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Antigens, Surface / genetics*
  • Cluster Analysis
  • Computational Biology / methods*
  • Datasets as Topic
  • GTP-Binding Proteins / genetics*
  • Humans
  • Lymphatic Metastasis / genetics*
  • Male
  • Prostatic Neoplasms / genetics
  • Prostatic Neoplasms / pathology*
  • Supervised Machine Learning*
  • Unsupervised Machine Learning*
  • Vesicular Transport Proteins / genetics*

Substances

  • Antigens, Surface
  • PLEKHF2 protein, human
  • Vesicular Transport Proteins
  • GTP-Binding Proteins
  • SPAG1 protein, human