Predicting protein complexes from PPI data: a core-attachment approach

J Comput Biol. 2009 Feb;16(2):133-44. doi: 10.1089/cmb.2008.01TT.

Abstract

Protein complexes play a critical role in many biological processes. Identifying the component proteins of a complex is an important step toward understanding the complex and the biological activities it takes part in. This paper addresses the problem of computationally predicting protein complexes from the protein-protein interaction (PPI) network of a single species. Most previous methods rely on the assumption that proteins within the same complex have relatively more interactions with one another, which translates into dense subgraphs in the PPI network. However, the existing software tools have had limited success. Recently, Gavin et al. (2006) provided a detailed study of the organization of protein complexes and suggested that a complex consists of two parts: a core and an attachment. Based on this core-attachment concept, we developed a novel approach that identifies complexes from the PPI network by identifying their cores and attachments separately. We evaluated the effectiveness of our approach on three different datasets and compared the quality of our predicted complexes with that of the complexes predicted by three existing tools. The evaluation results show that we predict many more complexes, and with higher accuracy, than these tools, with an improvement of over 30%. To verify the cores we identified in each complex, we compared them with the mediators produced by Andreopoulos et al. (2007), which were claimed to be the cores, using the benchmark results of Gavin et al. (2006). We found that our cores are of much higher quality, with 10- to 30-fold more correctly predicted cores and better accuracy.
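The core-attachment idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the greedy density-based core growth, the rule that an attachment must interact with more than half of the core, the threshold values, and all function names (`build_graph`, `grow_core`, `find_attachments`) are assumptions made here for illustration only.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def density(graph, nodes):
    """Fraction of possible protein pairs in `nodes` that actually interact."""
    n = len(nodes)
    if n < 2:
        return 0.0
    internal = sum(1 for u, v in combinations(nodes, 2) if v in graph[u])
    return internal / (n * (n - 1) / 2)


def grow_core(graph, seed, min_density=0.9):
    """Greedily grow a dense core around `seed` (assumed heuristic).

    Stops as soon as adding the best-connected candidate would push
    the density of the core below `min_density`.
    """
    core = {seed}
    candidates = set(graph[seed])
    while candidates:
        # Candidate with the most interactions into the current core;
        # sorting makes tie-breaking deterministic.
        best = max(sorted(candidates), key=lambda p: len(graph[p] & core))
        if density(graph, core | {best}) < min_density:
            break
        core.add(best)
        candidates = (candidates | graph[best]) - core
    return core


def find_attachments(graph, core, min_fraction=0.5):
    """Proteins outside the core that interact with more than
    `min_fraction` of the core members (assumed attachment rule)."""
    neighbours = set().union(*(graph[p] for p in core)) - core
    return {p for p in neighbours
            if len(graph[p] & core) / len(core) > min_fraction}


# Toy PPI network (hypothetical protein names).
edges = [("A", "B"), ("A", "C"), ("B", "C"),   # dense triangle
         ("D", "A"), ("D", "B"),               # D touches 2 of 3 core proteins
         ("E", "C"), ("E", "F")]               # E touches only 1
graph = build_graph(edges)
core = grow_core(graph, "A")
attachments = find_attachments(graph, core)
print(sorted(core))                # ['A', 'B', 'C']
print(sorted(attachments))         # ['D']
print(sorted(core | attachments))  # predicted complex: ['A', 'B', 'C', 'D']
```

Separating the two steps means the density requirement applies only to the core, so a protein such as D, which interacts with most but not all core members, can still be reported as part of the complex without diluting the core itself.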

Availability: http://alse.cs.hku.hk/complexes/

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Markov Chains
  • Mathematics
  • Models, Theoretical*
  • Multiprotein Complexes* / chemistry
  • Multiprotein Complexes* / metabolism
  • Protein Interaction Mapping*
  • Proteins / chemistry
  • Proteins / metabolism
  • Software*

Substances

  • Multiprotein Complexes
  • Proteins