Conserved domains on  [gi|171916097|ref|NP_001116437|]

repetin [Homo sapiens]

List of domain hits

Name Accession Description Interval E-value
S-100 cd00213
S-100: S-100 domain, which represents the largest family within the superfamily of proteins ...
3-89 9.40e-34

S-100: S-100 domain, which represents the largest family within the superfamily of proteins carrying the Ca-binding EF-hand motif. Note that this S-100 hierarchy contains only S-100 EF-hand domains; other EF-hands have been modeled separately. S100 proteins are expressed exclusively in vertebrates and are implicated in intracellular and extracellular regulatory activities. Intracellularly, S100 proteins act as Ca-signaling or Ca-buffering proteins. The most unusual characteristic of certain S100 proteins is their occurrence in extracellular space, where they act in a cytokine-like manner through RAGE, the receptor for advanced glycation end products. Structural data suggest that many S100 members exist within cells as homo- or heterodimers and even oligomers; oligomerization contributes to their functional diversification. Upon binding calcium, most S100 proteins change conformation to a more open structure, exposing a hydrophobic cleft. This hydrophobic surface represents the interaction site of S100 proteins with their target proteins. There is experimental evidence showing that many S100 proteins have multiple binding partners, with diverse modes of interaction with different targets. In addition to S100 proteins (such as S100A1, -3, -4, -6, -7, -10, -11, and -13), this group includes the "fused" gene family, a group of calcium-binding S100-related proteins. The "fused" gene family includes multifunctional epidermal differentiation proteins (profilaggrin, trichohyalin, repetin, hornerin, and cornulin); functionally, these proteins are associated with keratin intermediate filaments and are partially crosslinked to the cell envelope. These "fused" gene proteins contain an N-terminal sequence with two Ca-binding EF-hand motifs, which may be associated with calcium signaling in epidermal cells and with autoprocessing in a calcium-dependent manner.
In contrast to S100 proteins, "fused" gene family proteins contain an extraordinarily high number of almost perfect peptide repeats with a regular array of polar and charged residues, similar to many known cell envelope proteins.



Pssm-ID: 238131 [Multi-domain]  Cd Length: 88  Bit Score: 124.14  E-value: 9.40e-34
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097   3 QLLNSILSVIDVFHKYAKGNGDCALLCKEELKQLLLAEFGDILQRPNDPETVETILNLLDQDRDGHIDFHEYLLLVFQLV 82
Cdd:cd00213    2 ELEKAIETIIDVFHKYSGKEGDKDTLSKKELKELLETELPNFLKNQKDPEAVDKIMKDLDVNKDGKVDFQEFLVLIGKLA 81

                 ....*..
gi 171916097  83 QACYHKL 89
Cdd:cd00213   82 VACHEFF 88
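
As a rough way to read the alignment above, the fraction of identical columns between the query row and the Cdd row can be computed directly from the two strings. The sketch below is only an illustration: the function name percent_identity is hypothetical, and it assumes that "-" marks a gap and that lowercase letters mark unaligned residues, so both are skipped. It is applied here to the short final block of the S-100 alignment (query residues 83-89).

```python
def percent_identity(query: str, subject: str) -> float:
    """Percent of aligned columns in which both rows carry the same residue.

    Assumptions (not taken from the page itself): '-' is a gap and
    lowercase letters are unaligned residues; such columns are ignored.
    """
    if len(query) != len(subject):
        raise ValueError("alignment rows must have equal length")
    aligned = [
        (q, s)
        for q, s in zip(query, subject)
        if q != "-" and s != "-" and not q.islower() and not s.islower()
    ]
    if not aligned:
        return 0.0
    matches = sum(q == s for q, s in aligned)
    return 100.0 * matches / len(aligned)


# Final block of the S-100 alignment (query 83-89 vs. cd00213 82-88).
query_tail = "QACYHKL"
cdd_tail = "VACHEFF"
print(f"{percent_identity(query_tail, cdd_tail):.1f}% identity "
      f"over {len(query_tail)} columns")
```

The same function can be run on each 80-column block and the match counts combined to get a figure for the whole hit; identity computed this way is a simple column count and is not the score the search itself reports.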
ser_rich_anae_1 super family cl41472
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 ...
284-581 1.79e-14

serine-rich protein; This serine-rich protein belongs to a family of large proteins (over 1000 amino acids) with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.


The actual alignment was detected with superfamily member NF033849:

Pssm-ID: 468206 [Multi-domain]  Cd Length: 1122  Bit Score: 77.74  E-value: 1.79e-14
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  284 GQTDRQGQSSHYGQTDRQDQSYHYGQTDRQGQSSHYSQTDRQGQSSHysqpdrQGQSShygqmdrkgqcyhydqTNRQGQ 363
Cdd:NF033849  236 GQSAGTGYGESVGHSTSQGQSHSVGTSESHSVGTSQSQSHTTGHGST------RGWSH----------------TQSTSE 293
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  364 GSHYSQPNRQGQSSHYGQPDTQDQSShyGQTDRQDQSSHYGQTERQGQSSHYSQMDRQGQGSHYGQTDRQGQSSHYGQpd 443
Cdd:NF033849  294 SESTGQSSSVGTSESQSHGTTEGTST--TDSSSHSQSSSYNVSSGTGVSSSHSDGTSQSTSISHSESSSESTGTSVGH-- 369
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  444 rqGQNSHYGQTDRQGQSSHYGQTDRQG----------QSSHYSQPDKQGQSSHYGkIDRQDQSYH--------YGQPDGQ 505
Cdd:NF033849  370 --STSSSVSSSESSSRSSSSGVSGGFSggiagggvtsEGLGASQGGSEGWGSGDS-VQSVSQSYGsssstgtsSGHSDSS 446
                         250       260       270       280       290       300       310
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 171916097  506 GQSSHYGQTDRQGQSFHYGQPDRQGQSSHYSQMdrQGQSSHYGQTDRQGQSSHYGQTDRQGQSYHYGQTDRQGQSS 581
Cdd:NF033849  447 SHSTSSGQADSVSQGTSWSEGTGTSQGQSVGTS--ESWSTSQSETDSVGDSTGTSESVSQGDGRSTGRSESQGTSL 520
Glutenin_hmw super family cl26620
High molecular weight glutenin subunit; Members of this family include high molecular weight ...
109-714 8.26e-12

High molecular weight glutenin subunit; Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.


The actual alignment was detected with superfamily member pfam03157:

Pssm-ID: 367362 [Multi-domain]  Cd Length: 786  Bit Score: 68.82  E-value: 8.26e-12
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  109 QDCKFPGNTGRQHRQRHEEERQNSHHSQPERQDGDSHHGQPERQDRDSHHGQSEKQDRDSHHSQPERQDRDSHHNQSERQ 188
Cdd:pfam03157 141 QQWYYPTSPQQPGQWQQPGQGQQGYYPTSPQQSGQRQQPGQGQQLRQGQQGQQSGQGQPGYYPTSSQQPGQLQQTGQGQQ 220
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  189 DKDFSFDQSERQSQDSSSGKKVSHKSTSGQAKWQGHIFALNRCEKPIQDSHYGQSERhtqqsetlGQASHFNQTNQQKSG 268
Cdd:pfam03157 221 GQQPERGQQGQQPGQGQQPGQGQQGQQPGQPQQLGQGQQGYYPISPQQPRQWQQSGQ--------GQQGYYPTSLQQPGQ 292
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  269 SYCGQSERLGQELGCGQTDRQGQSSHYGQTDRQDQSYHYGQTDRQGQSSHYSQTDRQGQSSHYSQPDRQGQSSHYGQMDR 348
Cdd:pfam03157 293 GQSGYYPTSQQQAGQLQQEQQLGQEQQDQQPGQGRQGQQPGQGQQGQQPAQGQQPGQGQPGYYPTSPQQPGQGQPGYYPT 372
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  349 KGQCYHYDQTNRQGQGSHYSQPNRQGQSSHYGQPDTQDQSSHYGQTDRQDQSSHYGQTERQGQSSHYSQMDRQGQGSHYG 428
Cdd:pfam03157 373 SQQQPQQGQQPEQGQQGQQQGQGQQGQQPGQGQQPGQGQPGYYPTSPQQSGQGQPGYYPTSPQQSGQGQQPGQGQQPGQE 452
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  429 QTDrQGQSSHYGQPDRQGQNSHYGQTDRQGQSSHYG---QTDRQGQSSHYSQPDKQGQSSHYGKIDRQDqsyhygqpdGQ 505
Cdd:pfam03157 453 QPG-QGQQPGQGQQGQQPGQPEQGQQPGQGQPGYYPtspQQSGQGQQLGQWQQQGQGQPGYYPTSPLQP---------GQ 522
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  506 GQSSHYGQTDRQGQSfhyGQPDRQGQSSHYSQMDRQGQSSHYGQTDRQGQSSHYGQTDRQGQSYHYGQTDRQGQSSHYIQ 585
Cdd:pfam03157 523 GQPGYYPTSPQQPGQ---GQQLGQLQQPTQGQQGQQSGQGQQGQQPGQGQQGQQPGQGQQGQQPGQGQQPGQGQPGYYPT 599
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  586 SQTGEIQGQNKYFQGTEGTRKASYVEQSGRSGRLSQQTPGQEGYQNQGQGFQSRDSQQN--GHQVWEPEEDSQHHQHKLL 663
Cdd:pfam03157 600 SPQQSGQGQQPGQWQQPGQGQPGYYPTSSLQLGQGQQGYYPTSPQQPGQGQQPGQWQQSgqGQQGYYPTSPQQSGQAQQP 679
                         570       580       590       600       610
                  ....*....|....*....|....*....|....*....|....*....|.
gi 171916097  664 AQIQQERPLCHKGRDWQSCSSEQGHRQAQtrqshGEGLSHWAEEEQGHQTW 714
Cdd:pfam03157 680 GQGQQPGQWLQPGQGQQGYYPTSPQQPGQ-----GQQLGQGQQSGQGQQGY 725
PRK12678 super family cl36163
transcription termination factor Rho; Provisional
689-784 2.14e-03

transcription termination factor Rho; Provisional


The actual alignment was detected with superfamily member PRK12678:

Pssm-ID: 237171 [Multi-domain]  Cd Length: 672  Bit Score: 41.43  E-value: 2.14e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097 689 RQAQTRQSHGEGLSHWAEEEQGHQTwDRHSHESQEGPCGTQDRRTHKDEQNHQRRDRQTHEHEQSHQRRDRQTHEDKQNR 768
Cdd:PRK12678 137 ARRGAARKAGEGGEQPATEARADAA-ERTEEEERDERRRRGDREDRQAEAERGERGRREERGRDGDDRDRRDRREQGDRR 215
                         90
                 ....*....|....*.
gi 171916097 769 QRRDRQTHEDEQNHQR 784
Cdd:PRK12678 216 EERGRRDGGDRRGRRR 231
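
Each hit in this listing carries a statistics line with the same layout (Pssm-ID, an optional bracketed flag, Cd Length, Bit Score, E-value). A minimal sketch of pulling those fields out of the text and filtering hits by E-value follows. The names STATS_RE and parse_stats and the 1e-10 cutoff are illustrative assumptions, not part of the search tool.

```python
import re

# Pattern for the per-hit statistics line as it appears in this listing.
# The bracketed flag (e.g. "[Multi-domain]") is optional.
STATS_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_stats(line: str):
    """Return the statistics of one hit as a dict, or None if no match."""
    m = STATS_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "flag": m.group("flag"),  # None when the line has no bracketed flag
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


# Statistics lines copied from the four hits listed above.
lines = [
    "Pssm-ID: 238131 [Multi-domain]  Cd Length: 88  Bit Score: 124.14  E-value: 9.40e-34",
    "Pssm-ID: 468206 [Multi-domain]  Cd Length: 1122  Bit Score: 77.74  E-value: 1.79e-14",
    "Pssm-ID: 367362 [Multi-domain]  Cd Length: 786  Bit Score: 68.82  E-value: 8.26e-12",
    "Pssm-ID: 237171 [Multi-domain]  Cd Length: 672  Bit Score: 41.43  E-value: 2.14e-03",
]
hits = [h for h in (parse_stats(s) for s in lines) if h is not None]

# Keep only hits below an arbitrary, illustrative E-value cutoff.
strong = [h["pssm_id"] for h in hits if h["e_value"] < 1e-10]
print(strong)
```

The cutoff is a choice for the example only; hits against low-complexity, repeat-rich regions such as the ones above usually call for a closer look at the alignment itself rather than a threshold alone.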
 
S_100 pfam01023
S-100/ICaBP type calcium binding domain; The S-100 domain is a subfamily of the EF-hand ...
4-48 1.44e-14

S-100/ICaBP type calcium binding domain; The S-100 domain is a subfamily of the EF-hand calcium binding proteins.


Pssm-ID: 460028  Cd Length: 45  Bit Score: 68.23  E-value: 1.44e-14
                          10        20        30        40
                  ....*....|....*....|....*....|....*....|....*
gi 171916097    4 LLNSILSVIDVFHKYAKGNGDCALLCKEELKQLLLAEFGDILQRP 48
Cdd:pfam01023   1 LERAIETIIDVFHKYAGKEGDKDTLSKKELKELLEKELPNFLKNQ 45
ser_rich_anae_1 NF033849
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 ...
272-581 2.76e-13

serine-rich protein; This serine-rich protein belongs to a family of large proteins (over 1000 amino acids) with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.


Pssm-ID: 468206 [Multi-domain]  Cd Length: 1122  Bit Score: 73.89  E-value: 2.76e-13
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  272 GQSERLGQELGCGQTDRQGQSSHYGQTDRQ--DQSYHYGQTDRQGQSSHYSQTDRQGQSSHYSQPDRQGQSSHYGQMDRK 349
Cdd:NF033849  236 GQSAGTGYGESVGHSTSQGQSHSVGTSESHsvGTSQSQSHTTGHGSTRGWSHTQSTSESESTGQSSSVGTSESQSHGTTE 315
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  350 G----------QCYHYDQTNRQGQGSHYSQPNRQGQSSHYGQPDTQDQSSHYGQTdrqdQSSHYGQTERQGQSSHYSQMD 419
Cdd:NF033849  316 GtsttdssshsQSSSYNVSSGTGVSSSHSDGTSQSTSISHSESSSESTGTSVGHS----TSSSVSSSESSSRSSSSGVSG 391
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  420 RQGQGSHYGQTDRQGQSSHygqpdrQGQNSHYGQTD-RQGQSSHYGQTDRQGQSShySQPDKQGQSSHYGkidrqdqsyh 498
Cdd:NF033849  392 GFSGGIAGGGVTSEGLGAS------QGGSEGWGSGDsVQSVSQSYGSSSSTGTSS--GHSDSSSHSTSSG---------- 453
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  499 YGQPDGQGQSSHYGQTDRQGQSFhygqpdRQGQSSHYSQMDRQGQSSHYGQTDRQGQSSHYGQTDRQGQSYHYGQTDRQG 578
Cdd:NF033849  454 QADSVSQGTSWSEGTGTSQGQSV------GTSESWSTSQSETDSVGDSTGTSESVSQGDGRSTGRSESQGTSLGTSGGRT 527

                  ...
gi 171916097  579 QSS 581
Cdd:NF033849  528 SGA 530
Glutenin_hmw pfam03157
High molecular weight glutenin subunit; Members of this family include high molecular weight ...
97-629 9.03e-10

High molecular weight glutenin subunit; Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.


Pssm-ID: 367362 [Multi-domain]  Cd Length: 786  Bit Score: 62.27  E-value: 9.03e-10
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097   97 RTSQQERGQEGAQDCKFPGNTGRQHRQRHEEERQNSHHSQPERQDGDSHHGQPERQDRDSHHGQSEKQDRDSHHSQPERQ 176
Cdd:pfam03157 186 RQGQQGQQSGQGQPGYYPTSSQQPGQLQQTGQGQQGQQPERGQQGQQPGQGQQPGQGQQGQQPGQPQQLGQGQQGYYPIS 265
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  177 DRDSHHNQSERQDKDFSFDQSERQSQDSSSGKKVSHKSTSGQAKWQGHIFALNRCEKPIQDSHYGQSERHTQ-------Q 249
Cdd:pfam03157 266 PQQPRQWQQSGQGQQGYYPTSLQQPGQGQSGYYPTSQQQAGQLQQEQQLGQEQQDQQPGQGRQGQQPGQGQQgqqpaqgQ 345
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  250 SETLGQASHFNQTNQQKSGSYCGQSERLGQELGCGQTDRQGQSSHYGQTDRQDQSYHYGQTDRQGQSSHYS---QTDRQG 326
Cdd:pfam03157 346 QPGQGQPGYYPTSPQQPGQGQPGYYPTSQQQPQQGQQPEQGQQGQQQGQGQQGQQPGQGQQPGQGQPGYYPtspQQSGQG 425
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  327 QSSHY-SQPDRQGQSSHYGQMDRKGQcyhydQTNRQGQGSHYSQPNRQGQSSHYGQPDTQDQSSHYG---QTDRQDQSSH 402
Cdd:pfam03157 426 QPGYYpTSPQQSGQGQQPGQGQQPGQ-----EQPGQGQQPGQGQQGQQPGQPEQGQQPGQGQPGYYPtspQQSGQGQQLG 500
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  403 YGQTERQGQSSHYSQMDRQGQGSHYGQTDRQGQSSHYGQPDRQGQNSHYGQTDRQGQSSHYGQTDRQGQSSHYSQPDKQG 482
Cdd:pfam03157 501 QWQQQGQGQPGYYPTSPLQPGQGQPGYYPTSPQQPGQGQQLGQLQQPTQGQQGQQSGQGQQGQQPGQGQQGQQPGQGQQG 580
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  483 QSSHYGKIDRQDQSYHYG---QPDGQGQSSHYGQTDRQGQSFHYGQPDRQ---GQSSHYS---QMDRQGQSSHYGQTDRQ 553
Cdd:pfam03157 581 QQPGQGQQPGQGQPGYYPtspQQSGQGQQPGQWQQPGQGQPGYYPTSSLQlgqGQQGYYPtspQQPGQGQQPGQWQQSGQ 660
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  554 GQSSHY---------GQTDRQGQSYHYGQTDRQGQSSHYIQSQTGEIQGQNKYFQGTEGTRKASYVEQSGRSGRLSQQtp 624
Cdd:pfam03157 661 GQQGYYptspqqsgqAQQPGQGQQPGQWLQPGQGQQGYYPTSPQQPGQGQQLGQGQQSGQGQQGYYPTSPGQGQQSGQ-- 738

                  ....*
gi 171916097  625 GQEGY 629
Cdd:pfam03157 739 GQQGY 743
ser_rich_anae_1 NF033849
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 ...
380-635 9.68e-10

serine-rich protein; This serine-rich protein belongs to a family of large proteins (over 1000 amino acids) with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.


Pssm-ID: 468206 [Multi-domain]  Cd Length: 1122  Bit Score: 62.33  E-value: 9.68e-10
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  380 GQPDTQDQSSHYGQTDRQDQSSHYGQTERQGQSSHYSQMDRQGQGSHYGQTDRQGQS--SHYGQPDRQGQNSHYGQTDRQ 457
Cdd:NF033849  236 GQSAGTGYGESVGHSTSQGQSHSVGTSESHSVGTSQSQSHTTGHGSTRGWSHTQSTSesESTGQSSSVGTSESQSHGTTE 315
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  458 G----------QSSHYGQTDRQGQSSHYSQPDKQGQSSHYGkidrQDQSYHYGQPDGQGQSSHYGQTDRQGQSFHYGQPD 527
Cdd:NF033849  316 GtsttdssshsQSSSYNVSSGTGVSSSHSDGTSQSTSISHS----ESSSESTGTSVGHSTSSSVSSSESSSRSSSSGVSG 391
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  528 R------------------QGQSSHYSQMD-RQGQSSHYGQTdrQGQSSHYGQTDRQGQSYHYGQTDRQGQSSHYIQSqT 588
Cdd:NF033849  392 GfsggiagggvtseglgasQGGSEGWGSGDsVQSVSQSYGSS--SSTGTSSGHSDSSSHSTSSGQADSVSQGTSWSEG-T 468
                         250       260       270       280       290
                  ....*....|....*....|....*....|....*....|....*....|.
gi 171916097  589 GEIQGQ---NKYFQGTEGTRKASYVEQSGRSGRLSQQTPGQEGY-QNQGQG 635
Cdd:NF033849  469 GTSQGQsvgTSESWSTSQSETDSVGDSTGTSESVSQGDGRSTGRsESQGTS 519
ser_rich_anae_1 NF033849
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 ...
134-461 1.92e-07

serine-rich protein; This serine-rich protein belongs to a family of large proteins (over 1000 amino acids) with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.


Pssm-ID: 468206 [Multi-domain]  Cd Length: 1122  Bit Score: 55.01  E-value: 1.92e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  134 HSQPERQDGDSHHGQPERQDRDSHHGQSEKQ-DRDSH---HSQPERQDRDSHHNQSErqdkdfSFDQSERQSQDSSSGKK 209
Cdd:NF033849  235 LGQSAGTGYGESVGHSTSQGQSHSVGTSESHsVGTSQsqsHTTGHGSTRGWSHTQST------SESESTGQSSSVGTSES 308
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  210 VSHKSTSGQAKWQGhifalnrcekpiqdSHYGQSERHTQQSETLGQASHFNQTNQQKSGsycGQSERLGQelGCGQTDRQ 289
Cdd:NF033849  309 QSHGTTEGTSTTDS--------------SSHSQSSSYNVSSGTGVSSSHSDGTSQSTSI---SHSESSSE--STGTSVGH 369
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  290 GQSSHYGQTDRQDQSYHYGQTDRQGQSSHYSQTDRQGQSSHysqpdrQGQSSHYGQMDrKGQCYHYDQTNRQGQGSHYSQ 369
Cdd:NF033849  370 STSSSVSSSESSSRSSSSGVSGGFSGGIAGGGVTSEGLGAS------QGGSEGWGSGD-SVQSVSQSYGSSSSTGTSSGH 442
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  370 PNRQGQSSHYGQPDTQDQsshyGQTDRQDQSSHYGQTERQGQSSHYSQMDRQGQGSHYGQTDRQGQSSHYGQPDRQGQNS 449
Cdd:NF033849  443 SDSSSHSTSSGQADSVSQ----GTSWSEGTGTSQGQSVGTSESWSTSQSETDSVGDSTGTSESVSQGDGRSTGRSESQGT 518
                         330
                  ....*....|..
gi 171916097  450 HYGQTDRQGQSS 461
Cdd:NF033849  519 SLGTSGGRTSGA 530
PRK12678 PRK12678
transcription termination factor Rho; Provisional
92-203 2.36e-05

transcription termination factor Rho; Provisional


Pssm-ID: 237171 [Multi-domain]  Cd Length: 672  Bit Score: 47.98  E-value: 2.36e-05
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  92 KSHGGRTSQQERGQEGAQDCKFPGNTGRQHRQRHEEERQNSHHSQPERQDGDSHHGQPERQDRDSHHGQSEKQDRDSHHS 171
Cdd:PRK12678 143 RKAGEGGEQPATEARADAAERTEEEERDERRRRGDREDRQAEAERGERGRREERGRDGDDRDRRDRREQGDRREERGRRD 222
                         90       100       110
                 ....*....|....*....|....*....|..
gi 171916097 172 QPERQDRDSHHNQSERQDKDFSFDQSERQSQD 203
Cdd:PRK12678 223 GGDRRGRRRRRDRRDARGDDNREDRGDRDGDD 254
PRK12678 PRK12678
transcription termination factor Rho; Provisional
689-784 2.14e-03

transcription termination factor Rho; Provisional


Pssm-ID: 237171 [Multi-domain]  Cd Length: 672  Bit Score: 41.43  E-value: 2.14e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097 689 RQAQTRQSHGEGLSHWAEEEQGHQTwDRHSHESQEGPCGTQDRRTHKDEQNHQRRDRQTHEHEQSHQRRDRQTHEDKQNR 768
Cdd:PRK12678 137 ARRGAARKAGEGGEQPATEARADAA-ERTEEEERDERRRRGDREDRQAEAERGERGRREERGRDGDDRDRRDRREQGDRR 215
                         90
                 ....*....|....*.
gi 171916097 769 QRRDRQTHEDEQNHQR 784
Cdd:PRK12678 216 EERGRRDGGDRRGRRR 231
 
Name Accession Description Interval E-value
S-100 cd00213
S-100: S-100 domain, which represents the largest family within the superfamily of proteins ...
3-89 9.40e-34

S-100: S-100 domain, which represents the largest family within the superfamily of proteins carrying the Ca-binding EF-hand motif. Note that this S-100 hierarchy contains only S-100 EF-hand domains; other EF-hands have been modeled separately. S100 proteins are expressed exclusively in vertebrates and are implicated in intracellular and extracellular regulatory activities. Intracellularly, S100 proteins act as Ca-signaling or Ca-buffering proteins. The most unusual characteristic of certain S100 proteins is their occurrence in extracellular space, where they act in a cytokine-like manner through RAGE, the receptor for advanced glycation end products. Structural data suggest that many S100 members exist within cells as homo- or heterodimers and even oligomers; oligomerization contributes to their functional diversification. Upon binding calcium, most S100 proteins change conformation to a more open structure, exposing a hydrophobic cleft. This hydrophobic surface represents the interaction site of S100 proteins with their target proteins. Experimental evidence shows that many S100 proteins have multiple binding partners, with diverse modes of interaction with different targets. In addition to S100 proteins (such as S100A1, -3, -4, -6, -7, -10, -11, and -13), this group includes the "fused" gene family, a group of calcium-binding S100-related proteins. The "fused" gene family includes multifunctional epidermal differentiation proteins (profilaggrin, trichohyalin, repetin, hornerin, and cornulin); functionally, these proteins are associated with keratin intermediate filaments and partially crosslinked to the cell envelope. These "fused" gene proteins contain an N-terminal sequence with two Ca-binding EF-hand motifs, which may be associated with calcium signaling in epidermal cells and autoprocessing in a calcium-dependent manner.
In contrast to S100 proteins, "fused" gene family proteins contain an extraordinarily high number of almost perfect peptide repeats with a regular array of polar and charged residues, similar to many known cell envelope proteins.


Pssm-ID: 238131 [Multi-domain]  Cd Length: 88  Bit Score: 124.14  E-value: 9.40e-34
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097   3 QLLNSILSVIDVFHKYAKGNGDCALLCKEELKQLLLAEFGDILQRPNDPETVETILNLLDQDRDGHIDFHEYLLLVFQLV 82
Cdd:cd00213    2 ELEKAIETIIDVFHKYSGKEGDKDTLSKKELKELLETELPNFLKNQKDPEAVDKIMKDLDVNKDGKVDFQEFLVLIGKLA 81

                 ....*..
gi 171916097  83 QACYHKL 89
Cdd:cd00213   82 VACHEFF 88
S-100A10_like cd05031
S-100A10_like: S-100A10 domain found in proteins similar to S100A10. S100A10 is a member of ...
2-85 2.71e-18

S-100A10_like: S-100A10 domain found in proteins similar to S100A10. S100A10 is a member of the S100 family within the EF-hand superfamily of calcium-binding proteins. Note that the S-100 hierarchy, to which this S-100A10_like group belongs, contains only S-100 EF-hand domains; other EF-hands have been modeled separately. S100 proteins are expressed exclusively in vertebrates and are implicated in intracellular and extracellular regulatory activities. A unique feature of S100A10 is that it contains mutations in both of the calcium-binding sites, making it calcium insensitive. S100A10 has been detected in brain, heart, gastrointestinal tract, kidney, liver, lung, spleen, testes, epidermis, aorta, and thymus. Structural data support the homo- and hetero-dimeric as well as hetero-tetrameric nature of the protein. S100A10 has multiple binding partners in its calcium-free state and is therefore involved in many diverse biological functions.


Pssm-ID: 240157 [Multi-domain]  Cd Length: 94  Bit Score: 80.55  E-value: 2.71e-18
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097   2 AQLLNSILSVIDVFHKYAKGNGDCALLCKEELKQLLLAEFGDILQRPNDPETVETILNLLDQDRDGHIDFHEYLLLVFQL 81
Cdd:cd05031    1 SELEHAMESLILTFHRYAGKDGDKNTLSRKELKKLMEKELSEFLKNQKDPMAVDKIMKDLDQNRDGKVNFEEFVSLVAGL 80

                 ....
gi 171916097  82 VQAC 85
Cdd:cd05031   81 SIAC 84
S-100A1 cd05025
S-100A1: S-100A1 domain found in proteins similar to S100A1. S100A1 is a calcium-binding ...
2-85 9.73e-17

S-100A1: S-100A1 domain found in proteins similar to S100A1. S100A1 is a calcium-binding protein belonging to a large S100 vertebrate-specific protein family within the EF-hand superfamily of calcium-binding proteins. Note that the S-100 hierarchy, to which this S-100A1 group belongs, contains only S-100 EF-hand domains; other EF-hands have been modeled separately. As is the case with many other members of the S100 protein family, S100A1 is implicated in intracellular and extracellular regulatory activities, including interaction with myosin-associated twitchin kinase, actin-capping protein CapZ, synapsin I, and tubulin. Structural data suggest that S100A1 proteins exist within cells as antiparallel homodimers, while heterodimers with S100A4 and S100B have also been reported. Upon binding calcium, S100A1 changes conformation to expose a hydrophobic cleft, which is the interaction site of S100A1 with its more than 20 known target proteins.


Pssm-ID: 240152 [Multi-domain]  Cd Length: 92  Bit Score: 76.08  E-value: 9.73e-17
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097   2 AQLLNSILSVIDVFHKYAKGNGDCALLCKEELKQLLLAEFGDILQRPNDPETVETILNLLDQDRDGHIDFHEYLLLVFQL 81
Cdd:cd05025    2 SELETAMETLINVFHAHSGKEGDKYKLSKKELKDLLQTELSDFLDAQKDADAVDKIMKELDENGDGEVDFQEFVVLVAAL 81

                 ....
gi 171916097  82 VQAC 85
Cdd:cd05025   82 TVAC 85
S_100 pfam01023
S-100/ICaBP type calcium binding domain; The S-100 domain is a subfamily of the EF-hand ...
4-48 1.44e-14

S-100/ICaBP type calcium binding domain; The S-100 domain is a subfamily of the EF-hand calcium binding proteins.


Pssm-ID: 460028  Cd Length: 45  Bit Score: 68.23  E-value: 1.44e-14
                          10        20        30        40
                  ....*....|....*....|....*....|....*....|....*
gi 171916097    4 LLNSILSVIDVFHKYAKGNGDCALLCKEELKQLLLAEFGDILQRP 48
Cdd:pfam01023   1 LERAIETIIDVFHKYAGKEGDKDTLSKKELKELLEKELPNFLKNQ 45
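
The paired rows in each alignment block can be compared column by column. The following is a minimal sketch, not part of the CDD output, that computes percent identity for the pfam01023 alignment shown above. The function name `percent_identity` is hypothetical, and two conventions are assumed here: `-` marks a gap column, and letter case is ignored when comparing residues.

```python
def percent_identity(query: str, subject: str) -> float:
    """Percent of gap-free aligned columns in which both rows carry the same residue.

    Assumptions (not taken from the CDD page): '-' marks a gap and
    residues are compared case-insensitively.
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    aligned = 0
    identical = 0
    for q, s in zip(query, subject):
        if q == "-" or s == "-":
            continue  # gap columns do not count toward the aligned length
        aligned += 1
        if q.upper() == s.upper():
            identical += 1
    return 100.0 * identical / aligned if aligned else 0.0


# Rows copied from the pfam01023 alignment above (query residues 4-48).
query = "LLNSILSVIDVFHKYAKGNGDCALLCKEELKQLLLAEFGDILQRP"
subject = "LERAIETIIDVFHKYAGKEGDKDTLSKKELKELLEKELPNFLKNQ"
print(f"{percent_identity(query, subject):.1f}% identity over {len(query)} columns")
```

For alignments that span several blocks, the query rows and the subject rows would each need to be concatenated before calling the function.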
S-100Z cd05026
S-100Z: S-100Z domain found in proteins similar to S100Z. S100Z is a member of the S100 domain ...
3-85 1.55e-14

S-100Z: S-100Z domain found in proteins similar to S100Z. S100Z is a member of the S100 domain family within the EF-hand Ca2+-binding protein superfamily. Note that the S-100 hierarchy, to which this S-100Z group belongs, contains only S-100 EF-hand domains; other EF-hands have been modeled separately. S100 proteins exhibit unique patterns of tissue- and cell type-specific expression and have been implicated in the Ca2+-dependent regulation of diverse physiological processes, including cell cycle regulation, differentiation, growth, and metabolic control. S100Z is normally expressed in various tissues, with its highest level of expression in spleen and leukocytes. The function of S100Z remains unclear. Preliminary structural data suggest that S100Z is a homodimer; however, a heterodimer with S100P has been reported. S100Z is capable of binding calcium ions. When calcium binds to S100Z, the protein undergoes a conformational change, which exposes hydrophobic surfaces on the protein. S100Z gene expression appears to be deregulated in some tumor tissues compared with their normal tissue counterparts.


Pssm-ID: 240153 [Multi-domain]  Cd Length: 93  Bit Score: 69.52  E-value: 1.55e-14
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097   3 QLLNSILSVIDVFHKYAKGNGDCALLCKEELKQLLLAEFGDILQRPNDPETVETILNLLDQDRDGHIDFHEYLLLVFQLV 82
Cdd:cd05026    4 QLEGAMDTLIRIFHNYSGKEGDRYKLSKGELKELLQRELTDFLSSQKDPMLVDKIMNDLDSNKDNEVDFNEFVVLVAALT 83

                 ...
gi 171916097  83 QAC 85
Cdd:cd05026   84 VAC 86
ser_rich_anae_1 NF033849
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 ...
284-581 1.79e-14

serine-rich protein; This serine-rich protein belongs to a family of large proteins (over 1000 amino acids) with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.


Pssm-ID: 468206 [Multi-domain]  Cd Length: 1122  Bit Score: 77.74  E-value: 1.79e-14
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  284 GQTDRQGQSSHYGQTDRQDQSYHYGQTDRQGQSSHYSQTDRQGQSSHysqpdrQGQSShygqmdrkgqcyhydqTNRQGQ 363
Cdd:NF033849  236 GQSAGTGYGESVGHSTSQGQSHSVGTSESHSVGTSQSQSHTTGHGST------RGWSH----------------TQSTSE 293
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  364 GSHYSQPNRQGQSSHYGQPDTQDQSShyGQTDRQDQSSHYGQTERQGQSSHYSQMDRQGQGSHYGQTDRQGQSSHYGQpd 443
Cdd:NF033849  294 SESTGQSSSVGTSESQSHGTTEGTST--TDSSSHSQSSSYNVSSGTGVSSSHSDGTSQSTSISHSESSSESTGTSVGH-- 369
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  444 rqGQNSHYGQTDRQGQSSHYGQTDRQG----------QSSHYSQPDKQGQSSHYGkIDRQDQSYH--------YGQPDGQ 505
Cdd:NF033849  370 --STSSSVSSSESSSRSSSSGVSGGFSggiagggvtsEGLGASQGGSEGWGSGDS-VQSVSQSYGsssstgtsSGHSDSS 446
                         250       260       270       280       290       300       310
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 171916097  506 GQSSHYGQTDRQGQSFHYGQPDRQGQSSHYSQMdrQGQSSHYGQTDRQGQSSHYGQTDRQGQSYHYGQTDRQGQSS 581
Cdd:NF033849  447 SHSTSSGQADSVSQGTSWSEGTGTSQGQSVGTS--ESWSTSQSETDSVGDSTGTSESVSQGDGRSTGRSESQGTSL 520
ser_rich_anae_1 NF033849
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 ...
272-581 2.76e-13

serine-rich protein; This serine-rich protein belongs to a family of large proteins (over 1000 amino acids) with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.


Pssm-ID: 468206 [Multi-domain]  Cd Length: 1122  Bit Score: 73.89  E-value: 2.76e-13
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  272 GQSERLGQELGCGQTDRQGQSSHYGQTDRQ--DQSYHYGQTDRQGQSSHYSQTDRQGQSSHYSQPDRQGQSSHYGQMDRK 349
Cdd:NF033849  236 GQSAGTGYGESVGHSTSQGQSHSVGTSESHsvGTSQSQSHTTGHGSTRGWSHTQSTSESESTGQSSSVGTSESQSHGTTE 315
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  350 G----------QCYHYDQTNRQGQGSHYSQPNRQGQSSHYGQPDTQDQSSHYGQTdrqdQSSHYGQTERQGQSSHYSQMD 419
Cdd:NF033849  316 GtsttdssshsQSSSYNVSSGTGVSSSHSDGTSQSTSISHSESSSESTGTSVGHS----TSSSVSSSESSSRSSSSGVSG 391
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  420 RQGQGSHYGQTDRQGQSSHygqpdrQGQNSHYGQTD-RQGQSSHYGQTDRQGQSShySQPDKQGQSSHYGkidrqdqsyh 498
Cdd:NF033849  392 GFSGGIAGGGVTSEGLGAS------QGGSEGWGSGDsVQSVSQSYGSSSSTGTSS--GHSDSSSHSTSSG---------- 453
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  499 YGQPDGQGQSSHYGQTDRQGQSFhygqpdRQGQSSHYSQMDRQGQSSHYGQTDRQGQSSHYGQTDRQGQSYHYGQTDRQG 578
Cdd:NF033849  454 QADSVSQGTSWSEGTGTSQGQSV------GTSESWSTSQSETDSVGDSTGTSESVSQGDGRSTGRSESQGTSLGTSGGRT 527

                  ...
gi 171916097  579 QSS 581
Cdd:NF033849  528 SGA 530
S-100B cd05027
S-100B: S-100B domain found in proteins similar to S100B. S100B is a calcium-binding protein ...
2-86 7.44e-13

S-100B: S-100B domain found in proteins similar to S100B. S100B is a calcium-binding protein belonging to a large S100 vertebrate-specific protein family within the EF-hand superfamily of calcium-binding proteins. Note that the S-100 hierarchy, to which this S-100B group belongs, contains only S-100 EF-hand domains; other EF-hands have been modeled separately. S100B is most abundant in glial cells of the central nervous system, predominantly in astrocytes. S100B is involved in signal transduction via the inhibition of protein phosphorylation, regulation of enzyme activity, and effects on calcium homeostasis. Upon calcium binding, the S100B homodimer changes conformation to expose a hydrophobic cleft, which represents the interaction site of S100B with its more than 20 known target proteins. These target proteins include several cellular architecture proteins such as tubulin and GFAP; S100B can inhibit polymerization of these oligomeric molecules. Furthermore, S100B inhibits the phosphorylation of multiple kinase substrates, including the Alzheimer protein tau and neuromodulin (GAP-43), through a calcium-sensitive interaction with the protein substrates.


Pssm-ID: 240154 [Multi-domain]  Cd Length: 88  Bit Score: 64.88  E-value: 7.44e-13
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097   2 AQLLNSILSVIDVFHKYAKGNGDCALLCKEELKQLLLAEFGDILQRPNDPETVETILNLLDQDRDGHIDFHEYLLLVFQL 81
Cdd:cd05027    1 SELEKAMVALIDVFHQYSGREGDKHKLKKSELKELINNELSHFLEEIKEQEVVDKVMETLDSDGDGECDFQEFMAFVAMV 80

                 ....*
gi 171916097  82 VQACY 86
Cdd:cd05027   81 TTACH 85
Glutenin_hmw pfam03157
High molecular weight glutenin subunit; Members of this family include high molecular weight ...
109-714 8.26e-12

High molecular weight glutenin subunit; Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.


Pssm-ID: 367362 [Multi-domain]  Cd Length: 786  Bit Score: 68.82  E-value: 8.26e-12
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  109 QDCKFPGNTGRQHRQRHEEERQNSHHSQPERQDGDSHHGQPERQDRDSHHGQSEKQDRDSHHSQPERQDRDSHHNQSERQ 188
Cdd:pfam03157 141 QQWYYPTSPQQPGQWQQPGQGQQGYYPTSPQQSGQRQQPGQGQQLRQGQQGQQSGQGQPGYYPTSSQQPGQLQQTGQGQQ 220
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  189 DKDFSFDQSERQSQDSSSGKKVSHKSTSGQAKWQGHIFALNRCEKPIQDSHYGQSERhtqqsetlGQASHFNQTNQQKSG 268
Cdd:pfam03157 221 GQQPERGQQGQQPGQGQQPGQGQQGQQPGQPQQLGQGQQGYYPISPQQPRQWQQSGQ--------GQQGYYPTSLQQPGQ 292
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  269 SYCGQSERLGQELGCGQTDRQGQSSHYGQTDRQDQSYHYGQTDRQGQSSHYSQTDRQGQSSHYSQPDRQGQSSHYGQMDR 348
Cdd:pfam03157 293 GQSGYYPTSQQQAGQLQQEQQLGQEQQDQQPGQGRQGQQPGQGQQGQQPAQGQQPGQGQPGYYPTSPQQPGQGQPGYYPT 372
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  349 KGQCYHYDQTNRQGQGSHYSQPNRQGQSSHYGQPDTQDQSSHYGQTDRQDQSSHYGQTERQGQSSHYSQMDRQGQGSHYG 428
Cdd:pfam03157 373 SQQQPQQGQQPEQGQQGQQQGQGQQGQQPGQGQQPGQGQPGYYPTSPQQSGQGQPGYYPTSPQQSGQGQQPGQGQQPGQE 452
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  429 QTDrQGQSSHYGQPDRQGQNSHYGQTDRQGQSSHYG---QTDRQGQSSHYSQPDKQGQSSHYGKIDRQDqsyhygqpdGQ 505
Cdd:pfam03157 453 QPG-QGQQPGQGQQGQQPGQPEQGQQPGQGQPGYYPtspQQSGQGQQLGQWQQQGQGQPGYYPTSPLQP---------GQ 522
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  506 GQSSHYGQTDRQGQSfhyGQPDRQGQSSHYSQMDRQGQSSHYGQTDRQGQSSHYGQTDRQGQSYHYGQTDRQGQSSHYIQ 585
Cdd:pfam03157 523 GQPGYYPTSPQQPGQ---GQQLGQLQQPTQGQQGQQSGQGQQGQQPGQGQQGQQPGQGQQGQQPGQGQQPGQGQPGYYPT 599
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  586 SQTGEIQGQNKYFQGTEGTRKASYVEQSGRSGRLSQQTPGQEGYQNQGQGFQSRDSQQN--GHQVWEPEEDSQHHQHKLL 663
Cdd:pfam03157 600 SPQQSGQGQQPGQWQQPGQGQPGYYPTSSLQLGQGQQGYYPTSPQQPGQGQQPGQWQQSgqGQQGYYPTSPQQSGQAQQP 679
                         570       580       590       600       610
                  ....*....|....*....|....*....|....*....|....*....|.
gi 171916097  664 AQIQQERPLCHKGRDWQSCSSEQGHRQAQtrqshGEGLSHWAEEEQGHQTW 714
Cdd:pfam03157 680 GQGQQPGQWLQPGQGQQGYYPTSPQQPGQ-----GQQLGQGQQSGQGQQGY 725
calgranulins cd05030
Calgranulins: S-100 domain found in proteins belonging to the Calgranulin subgroup of the S100 ...
3-89 9.42e-12

Calgranulins: S-100 domain found in proteins belonging to the Calgranulin subgroup of the S100 family of EF-hand calcium-modulated proteins, including S100A8, S100A9, and S100A12. Note that the S-100 hierarchy, to which this Calgranulin group belongs, contains only S-100 EF-hand domains; other EF-hands have been modeled separately. These proteins are expressed mainly in granulocytes and are involved in inflammation, allergy, and neuritogenesis, as well as in host-parasite response. Calgranulins are modulated not only by calcium, but also by other metals such as zinc and copper. Structural data suggest that calgranulins may exist in multiple structural forms, including homodimers and hetero-oligomers. For example, the S100A8/S100A9 complex called calprotectin plays important roles in regulating inflammatory processes, wound repair, zinc-dependent enzymes, and microbial growth.


Pssm-ID: 240156 [Multi-domain]  Cd Length: 88  Bit Score: 61.59  E-value: 9.42e-12
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097   3 QLLNSILSVIDVFHKYAKGNGDCALLCKEELKQLLLAEFGDILQRPNDPETVETILNLLDQDRDGHIDFHEYLLLVFQLV 82
Cdd:cd05030    2 ELEKAIETIINVFHQYSVRKGHPDTLYKKEFKQLVEKELPNFLKKEKNQKAIDKIFEDLDTNQDGQLSFEEFLVLVIKVG 81

                 ....*..
gi 171916097  83 QACYHKL 89
Cdd:cd05030   82 VAAHEKS 88
S-100A10 cd05024
S-100A10: A subgroup of the S-100A10 domain found in proteins similar to S100A10. S100A10 is a ...
2-85 2.47e-10

S-100A10: A subgroup of the S-100A10 domain found in proteins similar to S100A10. S100A10 is a member of the S100 family within the EF-hand superfamily of calcium-binding proteins. Note that the S-100 hierarchy, to which this S-100A10 group belongs, contains only S-100 EF-hand domains; other EF-hands have been modeled separately. S100 proteins are expressed exclusively in vertebrates and are implicated in intracellular and extracellular regulatory activities. A unique feature of S100A10 is that it contains mutations in both of the calcium-binding sites, making it calcium insensitive. S100A10 has been detected in brain, heart, gastrointestinal tract, kidney, liver, lung, spleen, testes, epidermis, aorta, and thymus. Structural data support the homo- and hetero-dimeric as well as hetero-tetrameric nature of the protein. S100A10 has multiple binding partners in its calcium-free state and is therefore involved in many diverse biological functions.


Pssm-ID: 240151 [Multi-domain]  Cd Length: 91  Bit Score: 57.54  E-value: 2.47e-10
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097   2 AQLLNSILSVIDVFHKYAkgnGDCALLCKEELKQLLLAEFGDILQRPNDPETVETILNLLDQDRDGHIDFHEYLLLVFQL 81
Cdd:cd05024    1 SELEHSMEKMMLTFHKFA---GEKNYLNRDDLQKLMEKEFSEFLKNQNDPMAVDKIMKDLDDCRDGKVGFQSFFSLIAGL 77

                 ....
gi 171916097  82 VQAC 85
Cdd:cd05024   78 LIAC 81
S-100A11 cd05023
S-100A11: S-100A11 domain found in proteins similar to S100A11. S100A11 is a member of the ...
7-89 2.85e-10

S-100A11: S-100A11 domain found in proteins similar to S100A11. S100A11 is a member of the S-100 domain family within the EF-hand Ca2+-binding protein superfamily. Note that the S-100 hierarchy, to which this S-100A11 group belongs, contains only S-100 EF-hand domains; other EF-hands have been modeled separately. S100 proteins exhibit unique patterns of tissue- and cell type-specific expression and have been implicated in the Ca2+-dependent regulation of diverse physiological processes, including cell cycle regulation, differentiation, growth, and metabolic control. S100 proteins have also been associated with a variety of pathological events, including neoplastic transformation and neurodegenerative diseases such as Alzheimer's, usually via overexpression of the protein. S100A11 is expressed in smooth muscle and other tissues and is involved in calcium-dependent membrane aggregation, which is important for cell vesiculation. As is the case for many other S100 proteins, S100A11 is a homodimer that can form a heterodimer with S100B through subunit exchange. Ca2+ binding to S100A11 results in a conformational change in the protein, exposing a hydrophobic surface that interacts with target proteins. In addition to binding annexins A1 and A6, S100A11 also interacts with actin and transglutaminase.


Pssm-ID: 240150 [Multi-domain]  Cd Length: 89  Bit Score: 57.47  E-value: 2.85e-10
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097   7 SILSVIDVFHKYAKGNGDCALLCKEELKQLLLAEFGDILQRPNDPETVETILNLLDQDRDGHIDFHEYLLLVFQLVQACY 86
Cdd:cd05023    7 CIESLIAVFQKYAGKDGDSYQLSKTEFLSFMNTELASFTKNQKDPGVLDRMMKKLDLNSDGQLDFQEFLNLIGGLAVACH 86

                 ...
gi 171916097  87 HKL 89
Cdd:cd05023   87 ESF 89
Glutenin_hmw pfam03157
High molecular weight glutenin subunit; Members of this family include high molecular weight ...
97-629 9.03e-10

High molecular weight glutenin subunit; Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.


Pssm-ID: 367362 [Multi-domain]  Cd Length: 786  Bit Score: 62.27  E-value: 9.03e-10
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097   97 RTSQQERGQEGAQDCKFPGNTGRQHRQRHEEERQNSHHSQPERQDGDSHHGQPERQDRDSHHGQSEKQDRDSHHSQPERQ 176
Cdd:pfam03157 186 RQGQQGQQSGQGQPGYYPTSSQQPGQLQQTGQGQQGQQPERGQQGQQPGQGQQPGQGQQGQQPGQPQQLGQGQQGYYPIS 265
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  177 DRDSHHNQSERQDKDFSFDQSERQSQDSSSGKKVSHKSTSGQAKWQGHIFALNRCEKPIQDSHYGQSERHTQ-------Q 249
Cdd:pfam03157 266 PQQPRQWQQSGQGQQGYYPTSLQQPGQGQSGYYPTSQQQAGQLQQEQQLGQEQQDQQPGQGRQGQQPGQGQQgqqpaqgQ 345
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  250 SETLGQASHFNQTNQQKSGSYCGQSERLGQELGCGQTDRQGQSSHYGQTDRQDQSYHYGQTDRQGQSSHYS---QTDRQG 326
Cdd:pfam03157 346 QPGQGQPGYYPTSPQQPGQGQPGYYPTSQQQPQQGQQPEQGQQGQQQGQGQQGQQPGQGQQPGQGQPGYYPtspQQSGQG 425
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  327 QSSHY-SQPDRQGQSSHYGQMDRKGQcyhydQTNRQGQGSHYSQPNRQGQSSHYGQPDTQDQSSHYG---QTDRQDQSSH 402
Cdd:pfam03157 426 QPGYYpTSPQQSGQGQQPGQGQQPGQ-----EQPGQGQQPGQGQQGQQPGQPEQGQQPGQGQPGYYPtspQQSGQGQQLG 500
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  403 YGQTERQGQSSHYSQMDRQGQGSHYGQTDRQGQSSHYGQPDRQGQNSHYGQTDRQGQSSHYGQTDRQGQSSHYSQPDKQG 482
Cdd:pfam03157 501 QWQQQGQGQPGYYPTSPLQPGQGQPGYYPTSPQQPGQGQQLGQLQQPTQGQQGQQSGQGQQGQQPGQGQQGQQPGQGQQG 580
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  483 QSSHYGKIDRQDQSYHYG---QPDGQGQSSHYGQTDRQGQSFHYGQPDRQ---GQSSHYS---QMDRQGQSSHYGQTDRQ 553
Cdd:pfam03157 581 QQPGQGQQPGQGQPGYYPtspQQSGQGQQPGQWQQPGQGQPGYYPTSSLQlgqGQQGYYPtspQQPGQGQQPGQWQQSGQ 660
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  554 GQSSHY---------GQTDRQGQSYHYGQTDRQGQSSHYIQSQTGEIQGQNKYFQGTEGTRKASYVEQSGRSGRLSQQtp 624
Cdd:pfam03157 661 GQQGYYptspqqsgqAQQPGQGQQPGQWLQPGQGQQGYYPTSPQQPGQGQQLGQGQQSGQGQQGYYPTSPGQGQQSGQ-- 738

                  ....*
gi 171916097  625 GQEGY 629
Cdd:pfam03157 739 GQQGY 743
ser_rich_anae_1 NF033849
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 ...
380-635 9.68e-10

serine-rich protein; This serine-rich protein belongs to a family of large proteins (over 1000 amino acids) with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.


Pssm-ID: 468206 [Multi-domain]  Cd Length: 1122  Bit Score: 62.33  E-value: 9.68e-10
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  380 GQPDTQDQSSHYGQTDRQDQSSHYGQTERQGQSSHYSQMDRQGQGSHYGQTDRQGQS--SHYGQPDRQGQNSHYGQTDRQ 457
Cdd:NF033849  236 GQSAGTGYGESVGHSTSQGQSHSVGTSESHSVGTSQSQSHTTGHGSTRGWSHTQSTSesESTGQSSSVGTSESQSHGTTE 315
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  458 G----------QSSHYGQTDRQGQSSHYSQPDKQGQSSHYGkidrQDQSYHYGQPDGQGQSSHYGQTDRQGQSFHYGQPD 527
Cdd:NF033849  316 GtsttdssshsQSSSYNVSSGTGVSSSHSDGTSQSTSISHS----ESSSESTGTSVGHSTSSSVSSSESSSRSSSSGVSG 391
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  528 R------------------QGQSSHYSQMD-RQGQSSHYGQTdrQGQSSHYGQTDRQGQSYHYGQTDRQGQSSHYIQSqT 588
Cdd:NF033849  392 GfsggiagggvtseglgasQGGSEGWGSGDsVQSVSQSYGSS--SSTGTSSGHSDSSSHSTSSGQADSVSQGTSWSEG-T 468
                         250       260       270       280       290
                  ....*....|....*....|....*....|....*....|....*....|.
gi 171916097  589 GEIQGQ---NKYFQGTEGTRKASYVEQSGRSGRLSQQTPGQEGY-QNQGQG 635
Cdd:NF033849  469 GTSQGQsvgTSESWSTSQSETDSVGDSTGTSESVSQGDGRSTGRsESQGTS 519
ser_rich_anae_1 NF033849
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 ...
134-461 1.92e-07

serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 amino acids), with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.


Pssm-ID: 468206 [Multi-domain]  Cd Length: 1122  Bit Score: 55.01  E-value: 1.92e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  134 HSQPERQDGDSHHGQPERQDRDSHHGQSEKQ-DRDSH---HSQPERQDRDSHHNQSErqdkdfSFDQSERQSQDSSSGKK 209
Cdd:NF033849  235 LGQSAGTGYGESVGHSTSQGQSHSVGTSESHsVGTSQsqsHTTGHGSTRGWSHTQST------SESESTGQSSSVGTSES 308
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  210 VSHKSTSGQAKWQGhifalnrcekpiqdSHYGQSERHTQQSETLGQASHFNQTNQQKSGsycGQSERLGQelGCGQTDRQ 289
Cdd:NF033849  309 QSHGTTEGTSTTDS--------------SSHSQSSSYNVSSGTGVSSSHSDGTSQSTSI---SHSESSSE--STGTSVGH 369
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  290 GQSSHYGQTDRQDQSYHYGQTDRQGQSSHYSQTDRQGQSSHysqpdrQGQSSHYGQMDrKGQCYHYDQTNRQGQGSHYSQ 369
Cdd:NF033849  370 STSSSVSSSESSSRSSSSGVSGGFSGGIAGGGVTSEGLGAS------QGGSEGWGSGD-SVQSVSQSYGSSSSTGTSSGH 442
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  370 PNRQGQSSHYGQPDTQDQsshyGQTDRQDQSSHYGQTERQGQSSHYSQMDRQGQGSHYGQTDRQGQSSHYGQPDRQGQNS 449
Cdd:NF033849  443 SDSSSHSTSSGQADSVSQ----GTSWSEGTGTSQGQSVGTSESWSTSQSETDSVGDSTGTSESVSQGDGRSTGRSESQGT 518
                         330
                  ....*....|..
gi 171916097  450 HYGQTDRQGQSS 461
Cdd:NF033849  519 SLGTSGGRTSGA 530
S-100A6 cd05029
S-100A6: S-100A6 domain found in proteins similar to S100A6. S100A6 is a member of the S100 ...
1-75 3.73e-07

S-100A6: S-100A6 domain found in proteins similar to S100A6. S100A6 is a member of the S100 domain family within the EF-hand Ca2+-binding protein superfamily. Note that the S-100 hierarchy, to which this S-100A6 group belongs, contains only S-100 EF-hand domains; other EF-hands have been modeled separately. S100 proteins exhibit unique patterns of tissue- and cell type-specific expression and have been implicated in the Ca2+-dependent regulation of diverse physiological processes, including cell cycle regulation, differentiation, growth, and metabolic control. S100A6 is normally expressed in the G1 phase of the cell cycle in neuronal cells. The function of S100A6 remains unclear, but evidence suggests that it is involved in cell cycle regulation and exocytosis. S100A6 may also be involved in tumorigenesis; the protein is overexpressed in several tumors. Ca2+ binding to S100A6 leads to a conformational change in the protein, which exposes a hydrophobic surface for interaction with target proteins. Several such proteins have been identified: glyceraldehyde-3-phosphate dehydrogenase, annexins 2, 6 and 11, and Calcyclin-Binding Protein (CacyBP).


Pssm-ID: 240155 [Multi-domain]  Cd Length: 88  Bit Score: 48.68  E-value: 3.73e-07
                         10        20        30        40        50        60        70
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 171916097   1 MAQLLN-SILSVIDVFHKYAKGNGDCALLCKEELKQLLLAEFGdILQRPNDPETVEtILNLLDQDRDGHIDFHEYL 75
Cdd:cd05029    1 MASPLDqAIGLLVAIFHKYSGREGDKNTLSKKELKELIQKELT-IGSKLQDAEIAK-LMEDLDRNKDQEVNFQEYV 74
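The two rows of an alignment block like the one above can be compared column by column. The following is a minimal Python sketch, not part of the CDD output: the function name aligned_identity is hypothetical, and it assumes that "-" marks a gap and that lowercase letters mark residues outside the aligned core, so both are skipped.

```python
def aligned_identity(query: str, subject: str) -> tuple[int, int]:
    """Return (identical, aligned) column counts for two alignment rows.

    Assumption: '-' is a gap and lowercase letters are unaligned residues,
    so any column containing either is ignored.
    """
    if len(query) != len(subject):
        raise ValueError("alignment rows must have equal length")
    identical = 0
    aligned = 0
    for q, s in zip(query, subject):
        if q == "-" or s == "-" or q.islower() or s.islower():
            continue  # gap or unaligned column
        aligned += 1
        if q == s:
            identical += 1
    return identical, aligned


# Rows copied from the cd05029 alignment block above.
QUERY_ROW = "MAQLLN-SILSVIDVFHKYAKGNGDCALLCKEELKQLLLAEFGdILQRPNDPETVEtILNLLDQDRDGHIDFHEYL"
MODEL_ROW = "MASPLDqAIGLLVAIFHKYSGREGDKNTLSKKELKELIQKELT-IGSKLQDAEIAK-LMEDLDRNKDQEVNFQEYV"

identical, aligned = aligned_identity(QUERY_ROW, MODEL_ROW)
print(f"{identical} identical residues in {aligned} aligned columns")
```

Identity over aligned columns is only a rough descriptive figure; the bit score and E-value reported with each hit come from the position-specific scoring model, not from this count.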
PRK12678 PRK12678
transcription termination factor Rho; Provisional
92-203 2.36e-05

transcription termination factor Rho; Provisional


Pssm-ID: 237171 [Multi-domain]  Cd Length: 672  Bit Score: 47.98  E-value: 2.36e-05
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  92 KSHGGRTSQQERGQEGAQDCKFPGNTGRQHRQRHEEERQNSHHSQPERQDGDSHHGQPERQDRDSHHGQSEKQDRDSHHS 171
Cdd:PRK12678 143 RKAGEGGEQPATEARADAAERTEEEERDERRRRGDREDRQAEAERGERGRREERGRDGDDRDRRDRREQGDRREERGRRD 222
                         90       100       110
                 ....*....|....*....|....*....|..
gi 171916097 172 QPERQDRDSHHNQSERQDKDFSFDQSERQSQD 203
Cdd:PRK12678 223 GGDRRGRRRRRDRRDARGDDNREDRGDRDGDD 254
PRK12678 PRK12678
transcription termination factor Rho; Provisional
90-199 1.55e-04

transcription termination factor Rho; Provisional


Pssm-ID: 237171 [Multi-domain]  Cd Length: 672  Bit Score: 45.28  E-value: 1.55e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  90 DNKSHGGRTSQQERGQEGAQDckfpgnTGRQHRQRHEEERQNSHHSQPERQDGDSHHGQPERQDRDSHHGQSEKQDRDSH 169
Cdd:PRK12678 159 DAAERTEEEERDERRRRGDRE------DRQAEAERGERGRREERGRDGDDRDRRDRREQGDRREERGRRDGGDRRGRRRR 232
                         90       100       110
                 ....*....|....*....|....*....|
gi 171916097 170 HSQPERQDRDSHHNQSERQDKDFSFDQSER 199
Cdd:PRK12678 233 RDRRDARGDDNREDRGDRDGDDGEGRGGRR 262
EFh cd00051
EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal ...
13-75 4.93e-04

EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers.


Pssm-ID: 238008 [Multi-domain]  Cd Length: 63  Bit Score: 39.07  E-value: 4.93e-04
                         10        20        30        40        50        60
                 ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 171916097  13 DVFHKYAK-GNGdcaLLCKEELKQLLlAEFGDILQRpndpETVETILNLLDQDRDGHIDFHEYL 75
Cdd:cd00051    4 EAFRLFDKdGDG---TISADELKAAL-KSLGEGLSE----EEIDEMIREVDKDGDGKIDFEEFL 59
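Each hit record carries a statistics line of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...". A minimal Python sketch for pulling those fields out of the text is given below; the regular expression and the function name parse_stats are illustrative assumptions based only on the lines shown on this page, not an official parser.

```python
import re

# Pattern inferred from the statistics lines on this page (an assumption).
STATS_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<tag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_stats(line: str):
    """Parse one statistics line into a dict, or return None if it does not match."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    return {
        "pssm_id": int(match["pssm_id"]),
        "tag": match["tag"],  # e.g. "Multi-domain"; None when absent
        "cd_length": int(match["cd_length"]),
        "bit_score": float(match["bit_score"]),
        "e_value": float(match["e_value"]),
    }


# Statistics line copied from the cd00051 (EFh) hit above.
efh_stats = parse_stats(
    "Pssm-ID: 238008 [Multi-domain]  Cd Length: 63  Bit Score: 39.07  E-value: 4.93e-04"
)
print(efh_stats)
```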
PRK12678 PRK12678
transcription termination factor Rho; Provisional
95-207 5.01e-04

transcription termination factor Rho; Provisional


Pssm-ID: 237171 [Multi-domain]  Cd Length: 672  Bit Score: 43.74  E-value: 5.01e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  95 GGRTSQQERGQEGAQDcKFPGNTGRQHR----QRHEEERQNSHHSQPERQDGdshhgqPERQDRDSHHGQSEKQDRDSHH 170
Cdd:PRK12678 137 ARRGAARKAGEGGEQP-ATEARADAAERteeeERDERRRRGDREDRQAEAER------GERGRREERGRDGDDRDRRDRR 209
                         90       100       110
                 ....*....|....*....|....*....|....*..
gi 171916097 171 SQPERQDRDSHHNQSERQDKDFSFDQSERQSQDSSSG 207
Cdd:PRK12678 210 EQGDRREERGRRDGGDRRGRRRRRDRRDARGDDNRED 246
EF-hand_7 pfam13499
EF-hand domain pair;
13-75 7.09e-04

EF-hand domain pair;


Pssm-ID: 463900 [Multi-domain]  Cd Length: 67  Bit Score: 38.77  E-value: 7.09e-04
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 171916097   13 DVFHKY-AKGNGdcaLLCKEELKQLLLAEFGDIlqrPNDPETVETILNLLDQDRDGHIDFHEYL 75
Cdd:pfam13499   6 EAFKLLdSDGDG---YLDVEELKKLLRKLEEGE---PLSDEEVEELFKEFDLDKDGRISFEEFL 63
PRK12678 PRK12678
transcription termination factor Rho; Provisional
689-784 2.14e-03

transcription termination factor Rho; Provisional


Pssm-ID: 237171 [Multi-domain]  Cd Length: 672  Bit Score: 41.43  E-value: 2.14e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097 689 RQAQTRQSHGEGLSHWAEEEQGHQTwDRHSHESQEGPCGTQDRRTHKDEQNHQRRDRQTHEHEQSHQRRDRQTHEDKQNR 768
Cdd:PRK12678 137 ARRGAARKAGEGGEQPATEARADAA-ERTEEEERDERRRRGDREDRQAEAERGERGRREERGRDGDDRDRRDRREQGDRR 215
                         90
                 ....*....|....*.
gi 171916097 769 QRRDRQTHEDEQNHQR 784
Cdd:PRK12678 216 EERGRRDGGDRRGRRR 231
PRK12678 PRK12678
transcription termination factor Rho; Provisional
95-207 4.78e-03

transcription termination factor Rho; Provisional


Pssm-ID: 237171 [Multi-domain]  Cd Length: 672  Bit Score: 40.27  E-value: 4.78e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 171916097  95 GGRTSQQERGQEGAQdckfpGNTGRQHRQRhEEERQNSHHSQPERQDGDSHHGQPERQDRDSHHGQSEKQDRDSHHSQPE 174
Cdd:PRK12678 176 GDREDRQAEAERGER-----GRREERGRDG-DDRDRRDRREQGDRREERGRRDGGDRRGRRRRRDRRDARGDDNREDRGD 249
                         90       100       110
                 ....*....|....*....|....*....|...
gi 171916097 175 RQDRDSHHNQSERQDKdfsFDQSERQSQDSSSG 207
Cdd:PRK12678 250 RDGDDGEGRGGRRGRR---FRDRDRRGRRGGDG 279
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
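The E-value threshold of 0.01 can be checked against the "interval E-value" line of each hit. The Python sketch below is illustrative only: parse_hit is a hypothetical helper, the three sample lines are copied from hits on this page, and treating the threshold as inclusive is an assumption.

```python
E_VALUE_THRESHOLD = 0.01  # value taken from the search parameters above


def parse_hit(line: str) -> tuple[int, int, float]:
    """Split a line such as '13-75 4.93e-04' into (start, end, e_value)."""
    interval, e_value = line.split()
    start, end = (int(part) for part in interval.split("-"))
    return start, end, float(e_value)


# Sample interval lines copied from hits listed on this page.
sample_lines = ["380-635 9.68e-10", "13-75 4.93e-04", "95-207 4.78e-03"]
hits = [parse_hit(line) for line in sample_lines]

# Assumption: a hit is reported when its E-value does not exceed the threshold.
reported = [hit for hit in hits if hit[2] <= E_VALUE_THRESHOLD]

for start, end, e_value in reported:
    print(f"residues {start}-{end} ({end - start + 1} aa), E-value {e_value:.2e}")
```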

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D):384-8.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D):265-8.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D):200-3.