Gene selection for brain cancer classification

Conf Proc IEEE Eng Med Biol Soc. 2006:2006:5846-9. doi: 10.1109/IEMBS.2006.260197.

Abstract

With the introduction of microarrays, cancer classification, diagnosis and prediction have become more accurate and effective. However, the outcome of the data analysis depends heavily on the large number of genes relative to the small number of samples in each experiment. It is therefore crucial to select relevant genes that can serve as markers for specific cancers. Many feature selection methods have been proposed, but none classifies all kinds of microarray data accurately, especially multi-class datasets. We propose a one-versus-one comparison method for selecting discriminatory features, instead of performing the statistical test in a one-versus-all manner. Brain cancer is chosen as an example. Three statistics are used: signal-to-noise ratio (SNR), t-statistic and Pearson correlation coefficient. Results are verified by hierarchical and k-means clustering. With one-versus-one comparisons, the best accuracies are 90.48% and 97.62% for hierarchical and k-means clustering, respectively. With one-versus-all comparisons, the best accuracies are 88.10% and 80.95%, respectively. On this dataset, one-versus-one comparison is therefore superior.
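The difference between the two comparison schemes can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the SNR formula, so the code assumes the commonly used form (mean_a - mean_b) / (std_a + std_b), and the function names, the per-comparison cutoff `k` and the synthetic data are all hypothetical.

```python
from itertools import combinations

import numpy as np


def snr(a, b, eps=1e-12):
    """Per-gene signal-to-noise ratio between two sample groups.

    Assumed form: (mean_a - mean_b) / (std_a + std_b); rows are samples,
    columns are genes. eps guards against division by zero.
    """
    spread = a.std(axis=0, ddof=1) + b.std(axis=0, ddof=1) + eps
    return (a.mean(axis=0) - b.mean(axis=0)) / spread


def select_one_vs_all(X, y, k):
    """Union of the top-k genes for each class against all other classes."""
    selected = set()
    for c in np.unique(y):
        scores = np.abs(snr(X[y == c], X[y != c]))
        selected.update(np.argsort(scores)[-k:].tolist())
    return sorted(selected)


def select_one_vs_one(X, y, k):
    """Union of the top-k genes for each pair of classes."""
    selected = set()
    for c1, c2 in combinations(np.unique(y), 2):
        scores = np.abs(snr(X[y == c1], X[y == c2]))
        selected.update(np.argsort(scores)[-k:].tolist())
    return sorted(selected)


# Synthetic data (hypothetical): 30 samples, 50 genes, 3 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 50))
y = np.repeat([0, 1, 2], 10)
X[y == 0, 0] += 5.0  # gene 0 is raised in class 0 only
X[y == 1, 1] += 5.0  # gene 1 is raised in class 1 only

ova_genes = select_one_vs_all(X, y, k=2)
ovo_genes = select_one_vs_one(X, y, k=2)
print("one-versus-all:", ova_genes)
print("one-versus-one:", ovo_genes)
```

In the one-versus-all loop each class is scored against the pooled remaining samples, so the pooled group mixes several classes and its spread can mask a gene that separates only two of them. The one-versus-one loop scores each pair of classes directly. The selected gene indices could then be passed to a clustering routine, as the abstract describes, to compare the two schemes.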

MeSH terms

  • Algorithms
  • Biomarkers, Tumor / metabolism
  • Brain Neoplasms / diagnosis*
  • Brain Neoplasms / genetics*
  • Cluster Analysis
  • Computational Biology / methods*
  • Diagnosis, Computer-Assisted
  • Gene Expression Profiling
  • Gene Expression Regulation, Neoplastic*
  • Humans
  • Models, Statistical
  • Models, Theoretical
  • Neoplasm Proteins / metabolism
  • Oligonucleotide Array Sequence Analysis
  • Pattern Recognition, Automated

Substances

  • Biomarkers, Tumor
  • Neoplasm Proteins