hoDCA: higher order direct-coupling analysis

BMC Bioinformatics. 2018 Dec 29;19(1):546. doi: 10.1186/s12859-018-2583-6.

Abstract

Background: Direct-coupling analysis (DCA) is a method for protein contact prediction from sequence information alone. Its underlying principle is parameter estimation for a Hamiltonian interaction function derived from a maximum entropy model with one- and two-point interactions. Rapidly growing sequence databases enable the construction of large multiple sequence alignments (MSAs). Thus, enough data now exist to include higher order terms, such as three-body correlations.
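For orientation, the maximum entropy model with one- and two-point interactions is commonly written as a Potts-type Hamiltonian. The form below is a sketch under assumed notation that is not given in the abstract: \sigma = (\sigma_1, ..., \sigma_L) is an aligned sequence of length L, h_i are single-site fields, J_{ij} are pairwise couplings, and Z is the normalizing partition function. The sign convention is also an assumption.

```latex
P(\sigma) = \frac{1}{Z} \exp\bigl(-H(\sigma)\bigr),
\qquad
H(\sigma) = -\sum_{i=1}^{L} h_i(\sigma_i)
            \;-\; \sum_{1 \le i < j \le L} J_{ij}(\sigma_i, \sigma_j)
```

In this notation, a higher order model would add a term of the form -\sum_{i<j<k} V_{ijk}(\sigma_i, \sigma_j, \sigma_k) to H, where V_{ijk} is a hypothetical symbol for the three-body couplings.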

Results: We present an implementation of hoDCA, which extends DCA by including three-body interactions in the inverse Ising problem posed by parameter estimation. In a previous study, these three-body interactions improved contact prediction accuracy for the PSICOV benchmark dataset. Our implementation can be executed in parallel, which results in fast runtimes and makes it suitable for large-scale application.
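To make the role of three-body interactions concrete, the following is a minimal Python sketch of how the energy of a single sequence could be evaluated once one-, two-, and three-body parameters are known. It is an illustration only and not the hoDCA implementation: the function name `hamiltonian`, the array names `h`, `J`, and `V`, their layouts, and the sign convention are all assumptions, and the parameter estimation step itself (the inverse problem) is not shown.

```python
import itertools

import numpy as np


def hamiltonian(seq, h, J, V):
    """Energy of one integer-encoded sequence under a Potts-type model
    with one-, two-, and three-body terms.

    Assumed (hypothetical) layouts, with L positions and q states:
      h: (L, q)            single-site fields
      J: (L, L, q, q)      pairwise couplings, read for i < j
      V: (L, L, L, q, q, q) three-body couplings, read for i < j < k
    Assumed sign convention: H = -(sum of all fields and couplings).
    """
    L = len(seq)
    energy = 0.0
    # One-body terms: one field per position.
    for i in range(L):
        energy -= h[i, seq[i]]
    # Two-body terms: every unordered pair of positions once.
    for i, j in itertools.combinations(range(L), 2):
        energy -= J[i, j, seq[i], seq[j]]
    # Three-body terms: every unordered triple of positions once.
    for i, j, k in itertools.combinations(range(L), 3):
        energy -= V[i, j, k, seq[i], seq[j], seq[k]]
    return energy


# Toy example with random parameters (not fitted to any alignment).
rng = np.random.default_rng(0)
L, q = 4, 3
h = rng.normal(size=(L, q))
J = rng.normal(size=(L, L, q, q))
V = rng.normal(size=(L, L, L, q, q, q))
seq = [0, 2, 1, 1]
print(hamiltonian(seq, h, J, V))
```

The number of three-body terms grows with the number of position triples, L(L-1)(L-2)/6, and each carries q^3 parameters, which illustrates why large alignments and parallel execution matter for models of this kind.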

Conclusion: Our hoDCA software, written in the Julia language, enables improved contact prediction and automatically leverages the power of multi-core machines.

Keywords: Contact prediction; DCA; Proteins.

MeSH terms

  • Humans
  • Proteins / metabolism*
  • Sequence Analysis, Protein / methods*

Substances

  • Proteins