Alignment and classification of time series gene expression in clinical studies

Bioinformatics. 2008 Jul 1;24(13):i147-55. doi: 10.1093/bioinformatics/btn152.

Abstract

Motivation: Classification of tissues using static gene-expression data has received considerable attention. Recently, a growing number of expression datasets have been measured as time series. Methods that are specifically designed for such temporal data can both utilize its unique features (temporal evolution of profiles) and address its unique challenges (different response rates of patients in the same class).

Results: We present a method that utilizes hidden Markov models (HMMs) for the classification task. We use HMMs with fewer states than time points, which leads to an alignment of the different patient response rates. To focus on the differences between the two classes, we develop a discriminative HMM classifier. Unlike the traditional generative HMM, the discriminative HMM can use examples from both classes when learning the model for a specific class. We have tested our method on both simulated and real time series expression data. As we show, our method improves upon prior methods and can suggest markers for specific disease and response stages that are not found when using traditional classifiers.
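
The alignment idea can be illustrated with a minimal Python sketch. It is not the authors' Matlab implementation and it does not reproduce their discriminative training. It assumes a single gene, a left-to-right HMM with hand-set Gaussian state means, a shared variance, and the same stay probability in every state (including the last one); classification uses the Viterbi score of each class model as a simple generative stand-in. The names viterbi_align, classify, and the example class models are hypothetical. Because the model has three states for six time points, patients who respond at different rates are mapped onto the same sequence of states.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a univariate Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def viterbi_align(series, state_means, var=1.0, stay_prob=0.5):
    """Align a 1-D time series to a left-to-right HMM (illustrative sketch).

    Each state either repeats or advances to the next state, so a model with
    fewer states than time points absorbs differences in response rate.
    Returns (best_log_score, state_path).
    """
    log_stay = math.log(stay_prob)
    log_move = math.log(1.0 - stay_prob)
    neg_inf = float("-inf")
    n_states = len(state_means)

    # score[s]: best log score of any path ending in state s at the current time.
    score = [neg_inf] * n_states
    score[0] = gaussian_logpdf(series[0], state_means[0], var)  # start in state 0
    backpointers = []

    for x in series[1:]:
        new_score = [neg_inf] * n_states
        pointers = [0] * n_states
        for s in range(n_states):
            stay = score[s] + log_stay
            move = score[s - 1] + log_move if s > 0 else neg_inf
            if move > stay:
                best, pointers[s] = move, s - 1
            else:
                best, pointers[s] = stay, s
            new_score[s] = best + gaussian_logpdf(x, state_means[s], var)
        backpointers.append(pointers)
        score = new_score

    # Allow the path to end in any state (a slow responder may not finish).
    state = max(range(n_states), key=lambda s: score[s])
    best_score = score[state]
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return best_score, path


def classify(series, class_models):
    """Pick the class whose model gives the highest Viterbi score.

    This is a generative comparison, used here only for illustration.
    """
    scores = {label: viterbi_align(series, means)[0]
              for label, means in class_models.items()}
    return max(scores, key=scores.get)


# Hypothetical class models: expression rises in one class, falls in the other.
class_models = {"up": [0.0, 2.0, 4.0], "down": [0.0, -2.0, -4.0]}

fast_responder = [0.1, 2.1, 3.9, 4.0, 4.1, 3.9]
slow_responder = [0.0, 0.1, -0.1, 2.0, 2.1, 4.0]

for name, series in [("fast", fast_responder), ("slow", slow_responder)]:
    _, path = viterbi_align(series, class_models["up"])
    print(name, "state path:", path, "class:", classify(series, class_models))
```

In this toy setting both patients follow the same three-stage trajectory but at different speeds, and the state path records when each patient enters each stage. The paper's discriminative classifier additionally uses examples from both classes when fitting each class model, which this sketch does not attempt.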

Availability: A Matlab implementation is available from http://www.cs.cmu.edu/~thlin/tram/.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Artificial Intelligence*
  • Biomedical Research / methods*
  • Gene Expression Profiling / methods*
  • Pattern Recognition, Automated / methods*
  • Sequence Alignment / methods*
  • Time Factors
  • Tissue Array Analysis / methods*