Sparse boosting for high-dimensional survival data with varying coefficients

Stat Med. 2018 Feb 28;37(5):789-800. doi: 10.1002/sim.7544. Epub 2017 Nov 19.

Abstract

Motivated by high-throughput profiling studies in biomedical research, variable selection methods have been a focus for biostatisticians. In this paper, we consider semiparametric varying-coefficient accelerated failure time models for right-censored survival data with high-dimensional covariates. Instead of adopting traditional regularization approaches, we propose a novel sparse boosting (SparseL2 Boosting) algorithm to conduct model-based prediction and variable selection. One main advantage of the new method is that it avoids the time-consuming selection of tuning parameters. Extensive simulations are conducted to examine the performance of our sparse boosting feature selection techniques. We further illustrate our methods with an analysis of lung cancer data.

Keywords: accelerated failure time model; boosting; high-dimensional data; minimum description length; varying-coefficient model.
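
To make the general idea of boosting-based variable selection concrete, the following is a minimal Python sketch of componentwise L2 boosting on an uncensored linear model. It is not the algorithm of the paper: it ignores right censoring and varying coefficients, and it uses a BIC-type score as a simple stand-in for the minimum description length criterion named in the keywords. The function name `componentwise_l2_boost`, the step size `nu`, the number of steps, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def componentwise_l2_boost(X, y, n_steps=200, nu=0.1):
    """Componentwise L2 boosting with a BIC-type stopping rule (illustrative only)."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)            # center each covariate
    intercept = y.mean()
    resid = y - intercept              # residuals of the current fit
    coef = np.zeros(p)
    col_ss = (Xc ** 2).sum(axis=0)     # sum of squares of each column
    best_score, best_coef = np.inf, coef.copy()
    for _ in range(n_steps):
        # Least-squares slope of the residuals on each single covariate.
        b = Xc.T @ resid / col_ss
        # Pick the covariate that reduces the residual sum of squares most.
        j = int(np.argmax(b ** 2 * col_ss))
        # Take a small step (shrinkage nu) in that coordinate.
        coef[j] += nu * b[j]
        resid -= nu * b[j] * Xc[:, j]
        # Assumed stopping score: BIC-type penalty on the number of
        # selected covariates (a stand-in, not the paper's criterion).
        rss = float(resid @ resid)
        k = int(np.count_nonzero(coef))
        score = n * np.log(rss / n) + np.log(n) * k
        if score < best_score:
            best_score, best_coef = score, coef.copy()
    return intercept, best_coef

# Simulated example (hypothetical data): only covariates 0 and 3 matter.
rng = np.random.default_rng(0)
n, p = 100, 50
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(scale=0.5, size=n)

intercept, coef = componentwise_l2_boost(X, y)
selected = np.flatnonzero(coef)        # indices of covariates kept by the score
print("selected covariates:", selected)
```

Because the stopping iteration is chosen by the penalized score along the boosting path, the sketch needs no cross-validation, which mirrors in a simplified way the advantage the abstract describes. Handling censored outcomes and coefficient functions would require additional machinery that is not shown here.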

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computer Simulation
  • Humans
  • Risk Assessment / methods
  • Statistics, Nonparametric*
  • Survival Analysis*