Identification of biophysical interaction patterns in direct coupling analysis

Phys Rev E. 2021 Apr;103(4-1):042418. doi: 10.1103/PhysRevE.103.042418.

Abstract

Direct-coupling analysis is a statistical learning method for protein contact prediction based on sequence information alone. The maximum entropy principle leads to an effective inverse Potts model. Contact predictions are based on local fields and couplings fitted to an empirical multiple sequence alignment. Typically, the l_{2} norm of the resulting two-body couplings is used for contact prediction. However, this procedure discards important information. In this paper we show that using the full field and coupling information improves prediction accuracy.
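
To make the conventional scoring step concrete, the following is a minimal sketch of the norm-based contact score the abstract refers to. It is an illustration under stated assumptions, not the authors' implementation: the couplings are assumed to be stored in a NumPy array J of shape (L, L, q, q) for L alignment columns and q symbols, the function names frobenius_contact_scores and top_pairs are hypothetical, and the toy couplings are random numbers rather than fitted parameters. Real pipelines commonly fix a gauge and apply further corrections before scoring; those steps are omitted here.

```python
import numpy as np


def frobenius_contact_scores(J):
    """Collapse each q x q coupling block J[i, j] to a single number.

    J is assumed to have shape (L, L, q, q). The result is an (L, L)
    symmetric matrix whose entry (i, j) is the l2 (Frobenius) norm of
    the coupling block between positions i and j.
    """
    J = np.asarray(J, dtype=float)
    # Sum of squares over the two symbol indices, then square root.
    scores = np.sqrt((J ** 2).sum(axis=(2, 3)))
    # Average the two orientations so the score matrix is symmetric.
    scores = 0.5 * (scores + scores.T)
    # A position is not a contact partner of itself.
    np.fill_diagonal(scores, 0.0)
    return scores


def top_pairs(scores, k, min_separation=1):
    """Return the k highest-scoring pairs (i, j) with j - i >= min_separation."""
    L = scores.shape[0]
    i, j = np.triu_indices(L, k=min_separation)
    order = np.argsort(scores[i, j])[::-1][:k]
    return [(int(i[n]), int(j[n])) for n in order]


# Toy data (an assumption for illustration): weak random couplings
# with one artificially strong block planted between positions 1 and 4.
rng = np.random.default_rng(0)
L, q = 6, 21
J = 0.01 * rng.normal(size=(L, L, q, q))
J[1, 4] += 0.5
J[4, 1] = J[1, 4].T

scores = frobenius_contact_scores(J)
best = top_pairs(scores, k=1)
print(best)
```

Each pair of positions is reduced from q x q coupling values to one scalar, and the local fields do not enter the score at all. This reduction is the step the abstract describes as discarding information.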