TAPO: A combined method for the identification of tandem repeats in protein structures

FEBS Lett. 2015 Sep 14;589(19 Pt A):2611-9. doi: 10.1016/j.febslet.2015.08.025. Epub 2015 Aug 29.

Abstract

In recent years, improved expression and crystallization strategies have produced a growing number of new 3D structures of proteins containing tandem repeats (TRs). Structure-classification databases (PDB, SCOP, CATH) do not offer an easy way to select these structures from the PDB. Several approaches have been developed, but no single approach is best at identifying the whole range of 3D TRs. Here we describe the TAndem PrOtein detector (TAPO), which uses periodicities of atomic coordinates and of other structural representations, including strings generated by conformational alphabets, residue contact maps, and arrangements of vectors of secondary structure elements. Benchmarking shows that TAPO outperforms existing programs. According to our analysis of the PDB using TAPO, 19% of proteins contain 3D TRs. This analysis allowed us to identify new families of 3D TRs, suggesting that TAPO can be used to regularly update the collection and classification of existing repetitive structures.
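
To illustrate what detecting periodicity in a string generated by a conformational alphabet can look like, the following minimal sketch scores each candidate period by the fraction of positions that match the letter one period further along. It is not TAPO's algorithm: the function name `best_period`, the scoring rule, and the example letters (H, L, E) are assumptions made only for illustration.

```python
def best_period(s, min_period=2, max_period=None):
    """Return (period, score) for the best self-matching offset of s.

    score is the fraction of positions i with s[i] == s[i + period].
    Illustrative assumption only; not the scoring used by TAPO.
    """
    n = len(s)
    if max_period is None:
        # A repeat must occur at least twice to count as tandem.
        max_period = n // 2
    best = (0, 0.0)  # (0, 0.0) means no period matched at all
    for p in range(min_period, max_period + 1):
        matches = sum(1 for i in range(n - p) if s[i] == s[i + p])
        score = matches / (n - p)
        # Strict '>' keeps the shortest period among equal scores.
        if score > best[1]:
            best = (p, score)
    return best


if __name__ == "__main__":
    # Hypothetical conformational string: an 8-letter unit repeated 4 times.
    print(best_period("HHHHLLEE" * 4))
```

For the perfectly repeated string above, the sketch returns `(8, 1.0)`. Real structures contain insertions and imperfect units, so a practical detector would need tolerance for gaps and mismatches, which this example deliberately omits.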

Keywords: 3D protein structure; Non-globular protein; Prediction of repetitive unit; Prediction pipeline; Proteome; Tandem repeat; Webserver.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Databases, Protein
  • Models, Molecular
  • Protein Conformation
  • Protein Structure, Secondary
  • Proteins / chemistry*
  • Repetitive Sequences, Amino Acid*
  • Reproducibility of Results
  • Tandem Repeat Sequences*

Substances

  • Proteins