Characteristic motifs for families of allergenic proteins

Mol Immunol. 2009 Feb;46(4):559-68. doi: 10.1016/j.molimm.2008.07.034. Epub 2008 Oct 31.

Abstract

Potential allergenic proteins are usually identified by scanning a database of allergenic proteins for known allergens with high sequence similarity. However, there is no universally accepted cut-off value for sequence similarity that indicates potential IgE cross-reactivity. Further, overall sequence similarity may be less important than discrete areas of similarity in proteins with homologous structure. To identify such areas, we first assigned all allergens and their subdomains in the Structural Database of Allergenic Proteins (SDAP, http://fermi.utmb.edu/SDAP/) to their closest protein families as defined in Pfam, and identified conserved physicochemical property motifs characteristic of each group of sequences. Allergens populate only a small subset of all known Pfam families: all allergenic proteins in SDAP could be grouped into only 130 of the 9318 Pfam families, and 31 families contain more than four allergens. Conserved physicochemical property motifs for the aligned sequences of the most populated Pfam families were identified with the PCPMer program suite and catalogued in the webserver MotifMate (http://born.utmb.edu/motifmate/summary.php). We also determined motifs specific to the allergenic members of a family that could distinguish them from non-allergenic ones. These allergen-specific motifs should be most useful in database searches for potential allergens. We found that sequence motifs unique to the allergens in three families (seed storage proteins, Bet v 1, and tropomyosin) overlap with known IgE epitopes, providing evidence that our motif-based approach can be used to assess the potential allergenicity of novel proteins.
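
As a rough illustration of what a conserved physicochemical property motif in a set of aligned sequences can look like, the following minimal Python sketch flags runs of alignment columns in which a residue property varies little across the sequences. It is not the PCPMer algorithm. The Kyte-Doolittle hydropathy scale is swapped in as a single stand-in property, and the function names `column_spread` and `conserved_motifs`, the standard-deviation threshold `max_spread`, and the minimum run length `min_length` are all assumptions made for this example.

```python
from statistics import pstdev

# Kyte-Doolittle hydropathy values: an ASSUMED stand-in property,
# not the descriptors used by PCPMer.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def column_spread(alignment, scale=HYDROPATHY):
    """Return the per-column population standard deviation of the property.

    A column containing a gap or an unknown residue yields None.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must all have the same length")
    spreads = []
    for column in zip(*alignment):
        if all(residue in scale for residue in column):
            spreads.append(pstdev([scale[residue] for residue in column]))
        else:
            spreads.append(None)
    return spreads


def conserved_motifs(alignment, max_spread=0.5, min_length=3):
    """Return half-open (start, end) column ranges with low property spread.

    A range is reported when at least `min_length` consecutive columns
    each have a spread of at most `max_spread` (hypothetical threshold).
    """
    spreads = column_spread(alignment)
    motifs = []
    start = None
    # The trailing None is a sentinel that closes a run ending at the last column.
    for index, spread in enumerate(spreads + [None]):
        conserved = spread is not None and spread <= max_spread
        if conserved and start is None:
            start = index
        elif not conserved and start is not None:
            if index - start >= min_length:
                motifs.append((start, index))
            start = None
    return motifs


if __name__ == "__main__":
    # Toy alignment invented for illustration; not taken from SDAP or Pfam.
    toy_alignment = ["ALIVGKDE", "AIIVWRDE", "ALLVDKDE"]
    print(conserved_motifs(toy_alignment))
```

In this toy alignment the fifth column (G, W, D) mixes residues of very different hydropathy, so it splits the alignment into two conserved runs. A sketch of allergen-specific motifs could apply the same function separately to the allergenic and non-allergenic members of a family and compare the resulting ranges, though that comparison is likewise an assumption and not the published procedure.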

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Allergens / chemistry*
  • Allergens / classification*
  • Allergens / immunology
  • Amino Acid Motifs / immunology
  • Computational Biology
  • Cross Reactions / immunology
  • Databases, Protein
  • Epitopes / chemistry
  • Epitopes / immunology
  • Humans
  • Immunoglobulin E / chemistry
  • Immunoglobulin E / immunology
  • Information Storage and Retrieval
  • Protein Structure, Tertiary
  • Sequence Homology, Amino Acid
  • Software
  • Structure-Activity Relationship

Substances

  • Allergens
  • Epitopes
  • Immunoglobulin E