Comparison of reversible-jump Markov-chain-Monte-Carlo learning approach with other methods for missing enzyme identification

J Biomed Inform. 2008 Apr;41(2):272-81. doi: 10.1016/j.jbi.2007.09.002. Epub 2007 Sep 15.

Abstract

Computational identification of missing enzymes plays a significant role in the accurate and complete reconstruction of metabolic networks for both newly sequenced and well-studied organisms. For a metabolic reaction, given a set of candidate enzymes identified from biological evidence, a powerful mathematical model is required to predict the actual enzyme(s) catalyzing the reaction. In this study, several plausible predictive methods are considered for the classification problem in missing enzyme identification, and they are compared with the aim of identifying a method that performs better than the Bayesian model used in previous work. In particular, a regression model consisting of a linear term and a nonlinear term is proposed for the problem, in which the reversible jump Markov-chain-Monte-Carlo (MCMC) learning technique (developed in [Andrieu C, Freitas Nando de, Doucet A. Robust full Bayesian learning for radial basis networks 2001;13:2359-407.]) is adopted to estimate the model order and the parameters. We evaluated the models using known reactions in the bacteria Escherichia coli, Mycobacterium tuberculosis, Vibrio cholerae and Caulobacter crescentus, as well as one eukaryotic organism, Saccharomyces cerevisiae. The proposed model achieves favorable prediction performance, particularly in sensitivity, compared with the Bayesian method, although support vector regression exhibits comparable performance in this application.
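
As an illustration of the kind of model described above, the following is a minimal sketch of a regression function with a linear term plus a nonlinear radial basis term. It is not the authors' implementation: the Gaussian basis function, the names `rbf_linear_predict` and `classify`, the `width` parameter, and the 0.5 decision threshold are all assumptions made for the example. The number of centers plays the role of the model order; in the paper, the order and the parameters are estimated by reversible jump MCMC, which is not shown here.

```python
import math

def rbf_linear_predict(x, bias, beta, centers, weights, width=1.0):
    """Evaluate y = bias + beta . x + sum_j weights[j] * phi(||x - centers[j]||).

    phi is assumed to be a Gaussian basis function (an illustrative choice).
    len(centers) is the model order k; k = 0 reduces to a purely linear model.
    """
    # Linear term: bias plus the dot product of beta and x.
    linear = bias + sum(b * xi for b, xi in zip(beta, x))
    # Nonlinear term: weighted sum of Gaussian bumps around each center.
    nonlinear = 0.0
    for a, mu in zip(weights, centers):
        sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
        nonlinear += a * math.exp(-sq_dist / (2.0 * width ** 2))
    return linear + nonlinear

def classify(x, params, threshold=0.5):
    """Label a candidate enzyme as positive if the regression output
    reaches the (hypothetical) threshold."""
    return rbf_linear_predict(x, **params) >= threshold

# Hypothetical parameters for a two-feature candidate with one basis function.
params = {
    "bias": 0.1,
    "beta": [1.0, 2.0],
    "centers": [[0.0, 0.0]],
    "weights": [0.5],
    "width": 1.0,
}
score = rbf_linear_predict([0.0, 0.0], **params)
label = classify([0.0, 0.0], params)
```

In a setting like the one described, each candidate enzyme for a reaction would be represented by a feature vector `x` derived from the biological evidence, and the candidates scoring above the threshold would be predicted to catalyze the reaction.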

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Artificial Intelligence*
  • Computer Simulation
  • Gene Expression Profiling / methods*
  • Models, Biological*
  • Models, Statistical
  • Monte Carlo Method
  • Multienzyme Complexes / metabolism*
  • Pattern Recognition, Automated / methods*
  • Signal Transduction / physiology*

Substances

  • Multienzyme Complexes