Conserved domains on  [gi|112382224|ref|NP_001036146|]

arginine-glutamic acid dipeptide repeats protein isoform a [Homo sapiens]

Protein Classification

arginine-glutamic acid dipeptide repeats protein (domain architecture ID 11562211)

Arginine-glutamic acid dipeptide repeats protein (RERE) acts as a transcriptional repressor during development.


List of domain hits

Name Accession Description Interval E-value
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
568-1565 0e+00

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly by altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is with the short glutamine repeat in the transcriptional coactivator CREB-binding protein (CBP). This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 1182.64  E-value: 0e+00
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   568 GKHSMRTRRSRGSMSTLRSGRKKQPASPDGRTSPINEDIRSSGRNSPSAASTSSNDSKAETVKKSAKKVKEEASSPLKSN 647
Cdd:pfam03154    1 GKHSMRTRRSRGSMSTLRSGRKKQTASPDGRASPTNEDLRSSGRNSPSAASTSSNDSKAESMKKSSKKIKEEAPSPLKSA 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   648 KRQREKVASDTEEADRTSSKKTKTQEISRPNSPSEGEGESSDSRSVNDEGSSDPKDIDQDNRSTSPSIPSPQDNESDSDS 727
Cdd:pfam03154   81 KRQREKGASDTEEPERATAKKSKTQEISRPNSPSEGEGESSDGRSVNDEGSSDPKDIDQDNRSTSPSIPSPQDNESDSDS 160
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   728 SAQQQMLQAQPPALQAPTGVTPAPSSAPPGTPQLPTPGPTPSATAVPPQGSPTASQAPNQPQAPTAPvpHTHIQQAPALH 807
Cdd:pfam03154  161 SAQQQILQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAP--HTLIQQTPTLH 238
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   808 PQRPPSPHPPPHPSPHPPLQPltgsagQPSAPSHAQPPLHGQGPPGPHSLQAGP-LLQHPGPPQPFGLPPQASQGQAPLG 886
Cdd:pfam03154  239 PQRLPSPHPPLQPMTQPPPPS------QVSPQPLPQPSLHGQMPPMPHSLQTGPsHMQHPVPPQPFPLTPQSSQSQVPPG 312
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   887 TSPAAAYP-HTSLQLPASQSALQSQQPPREQPLPPAPLAMPHIKPPPTTPIPQLPAPQAHKHPPHLSGPSPFSMNANLPP 965
Cdd:pfam03154  313 PSPAAPGQsQQRIHTPPSQSQLQSQQPPREQPLPPAPLSMPHIKPPPTTPIPQLPNPQSHKHPPHLSGPSPFQMNSNLPP 392
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   966 PPALKPLSSLSTHHPPSAHPPPLQLMPQSQPLPSSPAQPPGLTQSQNLPPPPASHPPT-GLHQVAPQPPFAQHPFVPGGP 1044
Cdd:pfam03154  393 PPALKPLSSLSTHHPPSAHPPPLQLMPQSQQLPPPPAQPPVLTQSQSLPPPAASHPPTsGLHQVPSQSPFPQHPFVPGGP 472
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  1045 PPITPPTCPSTSTPPAGPGTsaQPPCSGAAASGGSIAGGSSCPLPTVQIKEEALDDAEEPESPPPPPRSPSPEPTVVDTP 1124
Cdd:pfam03154  473 PPITPPSGPPTSTSSAMPGI--QPPSSASVSSSGPVPAAVSCPLPPVQIKEEALDEAEEPESPPPPPRSPSPEPTVVNTP 550
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  1125 SHASQSARFYKHLDRGYNSCARTDLYFMPLAGSKLAKKREEAIEKAKREAEQKAREEREREKEKEKEREREREREREAER 1204
Cdd:pfam03154  551 SHASQSARFYKHLDRGYNSCARTDLYFMPLAGSKLAKKREEALEKAKREAEQKAREEKEREKEKEKEREREREREREAER 630
                          650       660       670       680       690       700       710       720
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  1205 AAKASSSAHEGRLSDPQLSGPGHMRPSFEPPPTTIAAVPPYIGPDTPALRTLSEYARPHVMSPTNRNHPFYMPLNPTDPL 1284
Cdd:pfam03154  631 AAKASSSSHEGRMGDPQLAGPAHMRPSFEPPPTTIAAVPPYIGPDTPALRTLSEYARPHVMSPTNRNHPFFVPLNPTDPL 710
                          730       740       750       760       770       780       790       800
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  1285 LAYHMPGLYNVDPTIRERELREREIREREIRERELRERMKPGFEVKPPELDPLHPAANPMEHFARHSALTIPPTAGPHPF 1364
Cdd:pfam03154  711 LAYHMPGLYNVDPAIRERELREREIREREIRERELRERMKPGFEVKPPELDPLHPATNPMEHFARHGALTLPPMAGPHPF 790
                          810       820       830       840       850       860       870       880
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  1365 ASFHPGLNPLERERLALAGPQLRPEMSYPDRLAAERIHAERMASLTSDPLARLQMFNVTPHHHQHSHIHSHLHLHQQDPL 1444
Cdd:pfam03154  791 ASFHPGLNPLERERLALAGPQLRPEMSYPDRLAAERLHAERMASLTNDPLARLQMFNVTPHHHQHSHIHSHLHLHQQDPL 870
                          890       900       910       920       930       940       950       960
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  1445 HQGSAGPVHPLVDPLTAGPHLARFPYPPGTLPNPLLGQPPHEHEMLRHPVFGTPYPRDLPGAIPPPMSAAHQLQAMHAQS 1524
Cdd:pfam03154  871 HQGSGGPVHPLVDPLAAGPHLARFPYPPGAIPNPLLGQPPHEHEMLRHPVFGTPYPRDLPGGLPPPMSAAHQLQAMHAQS 950
                          970       980       990      1000
                   ....*....|....*....|....*....|....*....|.
gi 112382224  1525 AELQRLAMEQQWLHGHPHMHGGHLPSQEDYYSRLKKEGDKQ 1565
Cdd:pfam03154  951 AELQRLAMEQQWLHGHPHMHGGHLPGQEDYYSRLKKESDKQ 991
BAH_MTA cd04709
BAH, or Bromo Adjacent Homology domain, as present in MTA1 and similar proteins. The ...
102-307 4.76e-81

BAH, or Bromo Adjacent Homology domain, as present in MTA1 and similar proteins. The metastasis-associated protein MTA1 is part of the NuRD (nucleosome remodeling and deacetylase) complex and plays a role in cellular transformation and metastasis. BAH domains are found in a variety of proteins that play roles in transcriptional silencing and chromatin remodeling. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions.


Pssm-ID: 240060  Cd Length: 164  Bit Score: 263.48  E-value: 4.76e-81
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  102 DVVYRPGDCVYIESRrPNTPYFICSIQDFKLvhnsqaccrsptpalcdppacslpvasqppqhlseagrgpvgSKRDHLL 181
Cdd:cd04709     1 ANMYRVGDYVYFESS-PNNPYLIRRIEELNK------------------------------------------TARGHVE 37
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  182 MNVKWYYRQSEVPDSVYQHLVQDRHNEND-SGRELVITDPVIKNRELFISDYVDTYHAAALRGKCNISHFSDIFAAREFK 260
Cdd:cd04709    38 AKVVCYYRRRDIPDSLYQLADQHRRELEEkSDDLTPKQRHQLRHRELFLSRQVETLPATHIRGKCSVTLLNDTESARSYL 117
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....*..
gi 112382224  261 ARVDSFFYILGYNPETRRLNSTQGEIRVGPSHQAKLPDLQPFPSPDG 307
Cdd:cd04709   118 AREDTFFYSLVYDPEQKTLLADQGEIRVGPSYQAKLPDLQPFPSPDG 164
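Each hit in this record carries a one-line summary (Pssm-ID, optional "[Multi-domain]" flag, Cd Length, Bit Score, E-value). The following is a minimal Python sketch of how such a line could be parsed for downstream filtering. The regular expression and the names `SUMMARY_RE` and `parse_summary` are illustrative assumptions based only on the layout seen in this record, not an official NCBI parser.

```python
import re

# Hypothetical pattern, derived only from the summary lines shown in this record.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>\S+)"
)


def parse_summary(line):
    """Return the fields of one hit summary line as a dict."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a hit summary line: {line!r}")
    return {
        "pssm_id": int(match.group("pssm_id")),
        # The bracketed flag is absent for single-domain models.
        "multi_domain": match.group("flag") == "Multi-domain",
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        # "0e+00" parses to 0.0, i.e. below the reported precision.
        "e_value": float(match.group("e_value")),
    }


bah_hit = parse_summary(
    "Pssm-ID: 240060  Cd Length: 164  Bit Score: 263.48  E-value: 4.76e-81"
)
atrophin_hit = parse_summary(
    "Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 1182.64  E-value: 0e+00"
)
print(bah_hit)
print(atrophin_hit)
```

Parsing the E-value as a float keeps hits sortable; a value of 0.0 should be read as "smaller than the display can show", not as an exact zero.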
SANT_MTA3_like cd11661
Myb-like DNA-binding domain of MTA3 and related proteins; Members of this SANT/myb family ...
395-440 4.33e-23

Myb-like DNA-binding domain of MTA3 and related proteins; Members of this SANT/myb family include domains found in mouse metastasis-associated protein 3 (MTA3) and in arginine-glutamic acid dipeptide repeats (RERE) proteins. SANT (SWI3, ADA2, N-CoR and TFIIIB) DNA-binding domains are a diverse set of domains that share a common three-alpha-helix bundle. MTA3 has been shown to interact with the nucleosome remodeling and deacetylase (NuRD) proteins CHD4 and HDAC1 and with the core cohesin complex protein RAD21 in the ovary, and to regulate G2/M progression in proliferating granulosa cells. RERE belongs to the atrophin family and has been identified as a nuclear receptor corepressor; altered expression levels of RERE are associated with cancer in humans, while mutations of Rere in mice cause failure of anterior neural tube closure and fusion of the telencephalic and optic vesicles during embryogenesis.


Pssm-ID: 212559 [Multi-domain]  Cd Length: 46  Bit Score: 93.45  E-value: 4.33e-23
                          10        20        30        40
                  ....*....|....*....|....*....|....*....|....*.
gi 112382224  395 CWTEDEVKRFVKGLRQYGKNFFRIRKELLPNKETGELITFYYYWKK 440
Cdd:cd11661     1 EWSESEAKLFEEGLRKYGKDFHDIRQDFLPWKSVGELVEFYYMWKK 46
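The pairwise rows in these alignment blocks can be compared column by column. Below is a rough Python sketch that counts identical residues between the query row and the Cdd row of the SANT_MTA3_like hit. The function name `percent_identity` is hypothetical, and the sketch assumes that "-" marks a gap and that lowercase letters mark residues outside the aligned block (as the display in this record suggests); such columns are skipped.

```python
def percent_identity(query_row, subject_row):
    """Count identical residues over aligned (uppercase, non-gap) columns.

    Returns (identical, aligned, percent).
    """
    if len(query_row) != len(subject_row):
        raise ValueError("alignment rows must have the same length")
    aligned = 0
    identical = 0
    for q, s in zip(query_row, subject_row):
        # Assumption: gaps and lowercase residues are not aligned columns.
        if q == "-" or s == "-" or q.islower() or s.islower():
            continue
        aligned += 1
        if q == s:
            identical += 1
    if aligned == 0:
        raise ValueError("no aligned columns found")
    return identical, aligned, 100.0 * identical / aligned


# Rows copied from the SANT_MTA3_like (cd11661) hit, query residues 395-440.
query = "CWTEDEVKRFVKGLRQYGKNFFRIRKELLPNKETGELITFYYYWKK"
model = "EWSESEAKLFEEGLRKYGKDFHDIRQDFLPWKSVGELVEFYYMWKK"

identical, aligned, pct = percent_identity(query, model)
print(f"{identical}/{aligned} identical ({pct:.1f}%)")
```

Percent identity is only a coarse summary; the bit score and E-value reported for each hit come from the position-specific scoring matrix, not from this count.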
ZnF_GATA smart00401
zinc finger binding to DNA consensus sequence [AT]GATA[AG];
503-552 3.08e-15

zinc finger binding to DNA consensus sequence [AT]GATA[AG];


Pssm-ID: 214648 [Multi-domain]  Cd Length: 52  Bit Score: 71.30  E-value: 3.08e-15
                            10        20        30        40        50
                    ....*....|....*....|....*....|....*....|....*....|.
gi 112382224    503 KGYACRHCFTTTSKDWHHGGRENILLCTDCRIHFKKYGEL-PPIEKPVDPP 552
Cdd:smart00401    2 SGRSCSNCGTTETPLWRRGPSGNKTLCNACGLYYKKHGGLkRPLSLKKDGI 52
ELM2 pfam01448
ELM2 domain; The ELM2 (Egl-27 and MTA1 homology 2) domain is a small domain of unknown ...
286-336 2.45e-12

ELM2 domain; The ELM2 (Egl-27 and MTA1 homology 2) domain is a small domain of unknown function. It is found in the MTA1 protein, which is part of the NuRD complex. The domain is usually found N-terminal to a myb-like DNA-binding domain (pfam00249). ELM2 is also found associated with an ARID DNA-binding domain (pfam01388) in Swiss:O82364. This suggests that ELM2 may also be involved in DNA binding, or perhaps is a protein-protein interaction domain.


Pssm-ID: 460214  Cd Length: 53  Bit Score: 63.02  E-value: 2.45e-12
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|...
gi 112382224   286 IRVGPSHQAKLPDLQPFPSPDGDTVTQHEELVWMP--GVNDCDLLMYLRAARS 336
Cdd:pfam01448    1 IRVGPRYQAEIPELLPPSEEEDRYEEEDELLVWDPnhNLPDRKLDEYLVVARS 53
 
ZnF_GATA cd00202
Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] ...
506-560 3.28e-15

Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C


Pssm-ID: 238123 [Multi-domain]  Cd Length: 54  Bit Score: 71.25  E-value: 3.28e-15
                          10        20        30        40        50
                  ....*....|....*....|....*....|....*....|....*....|....*
gi 112382224  506 ACRHCFTTTSKDWHHGGRENILLCTDCRIHFKKYGELPPIEKPvDPPPFMFKPVK 560
Cdd:cd00202     1 ACSNCGTTTTPLWRRGPSGGSTLCNACGLYWKKHGVMRPLSKR-KKDQIKRRNRK 54
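The zinc-finger topology quoted for this model, C-X(2)-C-X(17)-C-X(2)-C, can be checked against the query row with a regular expression. The Python sketch below measures the spacing between the four cysteines instead of assuming it; the pattern names are hypothetical. On this query segment the middle spacer comes out one residue longer than the quoted consensus, so a strict X(17) pattern would miss it, which suggests treating the stated topology as a consensus and not as an exact rule.

```python
import re

# Query row of the cd00202 hit (residues 506-560); the lowercase residue
# is uppercased so the whole segment is scanned uniformly.
segment = "ACRHCFTTTSKDWHHGGRENILLCTDCRIHFKKYGELPPIEKPvDPPPFMFKPVK".upper()

# Strict form of the quoted consensus topology.
STRICT = re.compile(r"C.{2}C.{17}C.{2}C")
# Relaxed form: capture the spacers so their lengths can be measured.
MEASURE = re.compile(r"C([^C]{2})C([^C]+)C([^C]{2})C")

strict_match = STRICT.search(segment)
measured = MEASURE.search(segment)
spacings = tuple(len(group) for group in measured.groups())

print("strict consensus matches:", strict_match is not None)
print("measured spacings between cysteines:", spacings)
```

Measuring first and comparing afterwards avoids silently discarding fingers whose central loop differs from the consensus by a residue.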
GATA pfam00320
GATA zinc finger; This domain uses four cysteine residues to coordinate a zinc ion. This ...
507-542 1.25e-11

GATA zinc finger; This domain uses four cysteine residues to coordinate a zinc ion and binds to DNA. Two GATA zinc fingers are found in the GATA transcription factors. However, several proteins contain only a single copy of the domain.


Pssm-ID: 425605 [Multi-domain]  Cd Length: 36  Bit Score: 60.41  E-value: 1.25e-11
                           10        20        30
                   ....*....|....*....|....*....|....*.
gi 112382224   507 CRHCFTTTSKDWHHGGRENILLCTDCRIHFKKYGEL 542
Cdd:pfam00320    1 CSNCGTTKTPLWRRGPNGNRTLCNACGLYYKKKGLK 36
PHA03247 PHA03247
large tegument protein UL36; Provisional
540-956 1.49e-11

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 69.97  E-value: 1.49e-11
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  540 GELPPIEKPVDPPPFMFKPVKEeddglsgkhSMRTRRSRGSMSTLRSGRKKQPASPDGRTSPINE--DIRSSGRNSPSAA 617
Cdd:PHA03247 2549 GDPPPPLPPAAPPAAPDRSVPP---------PRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDrgDPRGPAPPSPLPP 2619
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  618 STSSNDSKAETVKKSAKKVKEEASSPLKSNKRQREKVASDTEEADRTSSKKTKTqeiSRPNSPSEGEGESSDSRSVNDEG 697
Cdd:PHA03247 2620 DTHAPDPPPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRA---AQASSPPQRPRRRAARPTVGSLT 2696
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  698 SS-----DPKDIDQDNRSTSPSIPSPQDNESDSDSSAQQQMLQAQPPALQAP-TGVTPAPSSAPPGTPQLPTPGPTPSAT 771
Cdd:PHA03247 2697 SLadpppPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPaTPGGPARPARPPTTAGPPAPAPPAAPA 2776
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  772 AVPPQGSPTASQAPNQPQAPTAPVPhthiqQAPALHPQRPPSPHPPPHPSPHPPLQPLTGSAGQPSAPSHAQPPLHGQGP 851
Cdd:PHA03247 2777 AGPPRRLTRPAVASLSESRESLPSP-----WDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLP 2851
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  852 PGPHSLQAGPLLQHPGPPQPFGLPPQASQGQAPLGTSPAAAYPHTSLQLPASQSALQSQQPPREQPLPPAPLAMPHIKPP 931
Cdd:PHA03247 2852 LGGSVAPGGDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQP 2931
                         410       420
                  ....*....|....*....|....*.
gi 112382224  932 PTTPIPQLPAPQA-HKHPPHLSGPSP 956
Cdd:PHA03247 2932 PPPPPPRPQPPLApTTDPAGAGEPSG 2957
BAH pfam01426
BAH domain; This domain has been called BAH (Bromo adjacent homology) domain and has also been ...
103-281 6.30e-10

BAH domain; This domain has been called the BAH (Bromo adjacent homology) domain and has also been called the ELM1 and BAM (Bromo adjacent motif) domain. The function of this domain is unknown, but it may be involved in protein-protein interactions.


Pssm-ID: 460207  Cd Length: 120  Bit Score: 58.47  E-value: 6.30e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   103 VVYRPGDCVYIESRRPNTPYFICSIQDFKlvhnsqaccrsptpalCDPPACSLPVasqppqhlseagrgpvgskrdhllm 182
Cdd:pfam01426    1 ETYSVGDFVLVEPDDADEPYYVARIEELF----------------EDTKNGKKMV------------------------- 39
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   183 NVKWYYRQSEVPdsvyqHLVQDRHNEndsgrelvitdpviknRELFISDYVDTYHAAALRGKCNISHFSDIFAAREFK-A 261
Cdd:pfam01426   40 RVQWFYRPEETV-----HRAGKAFNK----------------DELFLSDEEDDVPLSAIIGKCSVLHKSDLESLDPYKiK 98
                          170       180
                   ....*....|....*....|
gi 112382224   262 RVDSFFYILGYNPETRRLNS 281
Cdd:pfam01426   99 EPDDFFCELLYDPKTKSFKK 118
BAH smart00439
Bromo adjacent homology domain;
105-281 1.55e-09

Bromo adjacent homology domain;


Pssm-ID: 214664 [Multi-domain]  Cd Length: 121  Bit Score: 57.30  E-value: 1.55e-09
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224    105 YRPGDCVYIESRRPNTPYFICSIQDfklvhnsqaccrsptpaLCDPPACSLPVASQppqhlseagrgpvgskrdhllmnV 184
Cdd:smart00439    2 ISVGDFVLVEPDDADEPYYIGRIEE-----------------IFETKKNSESKMVR-----------------------V 41
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224    185 KWYYRQSEVPdsvyqHLVQDRHNENdsgrelvitdpviknrELFISDYVDTYHAAALRGKCNISHFSDIF--AAREFKAR 262
Cdd:smart00439   42 RWFYRPEETV-----LEKAALFDKN----------------EVFLSDEYDTVPLSDIIGKCNVLYKSDYPglRPEGSIGE 100
                           170
                    ....*....|....*....
gi 112382224    263 VDSFFYILGYNPETRRLNS 281
Cdd:smart00439  101 PDVFFCESAYDPEKGSFKK 119
MSCRAMM_ClfA NF033609
MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial ...
545-729 9.44e-09

MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules). It is heavily studied in Staphylococcus aureus both for its biological role in adhesion and for its potential for vaccination. Features of the sequence, but also of other MSCRAMM adhesins, include a long run of Ser-Asp dipeptide repeats and a C-terminal cell wall anchoring LPXTG motif.


Pssm-ID: 468110 [Multi-domain]  Cd Length: 934  Bit Score: 60.31  E-value: 9.44e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  545 IEKPVDPP----PFMFKPVKEEDDGLSGKHSMRTRRSR------GSMSTLRSGRKKQPASPDGRTSPINEDIRSSGRNSP 614
Cdd:NF033609  539 IDKPVVPEqpdePGEIEPIPEDSDSDPGSDSGSDSSNSdsgsdsGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDS 618
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  615 SAASTSSNDSKAETVKKSAKKVKEEASSPLKSNKRQREKVASDTEeADRTSSKKTKTQEISRPNSPSEGEGES-SDSRSV 693
Cdd:NF033609  619 ASDSDSASDSDSASDSDSASDSDSDSDSDSDSDSDSDSDSDSDSD-SDSDSDSDSDSDSDSDSDSDSDSDSDSdSDSDSD 697
                         170       180       190
                  ....*....|....*....|....*....|....*.
gi 112382224  694 NDEGSSDPKDIDQDNRSTSPSiPSPQDNESDSDSSA 729
Cdd:NF033609  698 SDSDSDSDSDSDSDSDSDSDS-DSDSDSDSDSDSDS 732
PspC_subgroup_1 NF033838
pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, ...
557-804 4.95e-06

pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A. The other form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site.


Pssm-ID: 468201 [Multi-domain]  Cd Length: 684  Bit Score: 51.17  E-value: 4.95e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  557 KPVKEEDDGLSGKHSMRTRRSRGSMStlrsgrkkQPASPDGRTSPINEDIRSSGRNSPSAASTSSNDSKAEtvkksAKKV 636
Cdd:NF033838  246 KEAVEKNVATSEQDKPKRRAKRGVLG--------EPATPDKKENDAKSSDSSVGEETLPSPSLKPEKKVAE-----AEKK 312
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  637 KEEAssplksnkrqrEKVASDTEEADR----TSSKKTKTQEISRPNSP-SEGE-----GESSDSRsvNDEGSSDPKDIDQ 706
Cdd:NF033838  313 VEEA-----------KKKAKDQKEEDRrnypTNTYKTLELEIAESDVKvKEAElelvkEEAKEPR--NEEKIKQAKAKVE 379
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  707 DNRSTSPSIPSPQDNESDSDSSAQQQMlqaqppALQAPTGVTPAPSSAPPGTPQLPTPGPTPSATAVPPQGSPTASQAPN 786
Cdd:NF033838  380 SKKAEATRLEKIKTDRKKAEEEAKRKA------AEEDKVKEKPAEQPQPAPAPQPEKPAPKPEKPAEQPKAEKPADQQAE 453
                         250
                  ....*....|....*....
gi 112382224  787 QPQAPTAP-VPHTHIQQAP 804
Cdd:NF033838  454 EDYARRSEeEYNRLTQQQP 472
SANT smart00717
SANT SWI3, ADA2, N-CoR and TFIIIB'' DNA-binding domains;
396-441 3.89e-05

SANT SWI3, ADA2, N-CoR and TFIIIB'' DNA-binding domains;


Pssm-ID: 197842 [Multi-domain]  Cd Length: 49  Bit Score: 42.60  E-value: 3.89e-05
                            10        20        30        40
                    ....*....|....*....|....*....|....*....|....*..
gi 112382224    396 WTEDEVKRFVKGLRQYG-KNFFRIRKElLPNKETGELITFYYYWKKT 441
Cdd:smart00717    4 WTEEEDELLIELVKKYGkNNWEKIAKE-LPGRTAEQCRERWRNLLKP 49
MSCRAMM_ClfA NF033609
MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial ...
580-729 7.10e-05

MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules). It is heavily studied in Staphylococcus aureus both for its biological role in adhesion and for its potential for vaccination. Features of the sequence, shared with other MSCRAMM adhesins, include a long run of Ser-Asp dipeptide repeats and a C-terminal cell wall-anchoring LPXTG motif.


Pssm-ID: 468110 [Multi-domain]  Cd Length: 934  Bit Score: 47.60  E-value: 7.10e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  580 SMSTLRSGRKKQPASPDGRTSPINEDIRSSGRNSPSAASTSSNDSKAETVKKSAKKVKEEASSPLKSNKRQREKVASDTE 659
Cdd:NF033609  630 SASDSDSASDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 709
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 112382224  660 -EADRTSSKKTKTQEISRPNSPSEGEGES-SDSRSVNDEGSSDPKDIDQDNRSTSPSiPSPQDNESDSDSSA 729
Cdd:NF033609  710 sDSDSDSDSDSDSDSDSDSDSDSDSDSDSdSDSDSDSDSDSDSDSDSDSDSDSDSDS-DSDSDSDSDSDSDS 780
MSCRAMM_ClfA NF033609
MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial ...
600-798 7.93e-05

MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules). It is heavily studied in Staphylococcus aureus both for its biological role in adhesion and for its potential for vaccination. Features of the sequence, shared with other MSCRAMM adhesins, include a long run of Ser-Asp dipeptide repeats and a C-terminal cell wall-anchoring LPXTG motif.


Pssm-ID: 468110 [Multi-domain]  Cd Length: 934  Bit Score: 47.60  E-value: 7.93e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  600 SPINEDIRSSGRNSPSAASTSSNDSKAETVKKSAKKVKEEASSPLKSNKRQREKVASDTE-EADRTSSKKTKTQEISRPN 678
Cdd:NF033609  704 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDsDSDSDSDSDSDSDSDSDSD 783
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  679 SPSEGEGES-SDSRSVNDEGSSDPKDIDQDNRSTSPSiPSPQDNESDSDS-----SAQQQMLQAQPPALQAPTGVTPAPS 752
Cdd:NF033609  784 SDSDSDSDSdSDSDSDSDSDSDSDSDSDSDSDSDSDS-DSDSDSDSDSDSdsdsdSDSDSDSDSDSDSDSDSDSESDSNS 862
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....*...
gi 112382224  753 SAPPGTpqlptpgptpSATAVPPQGSPTASQAPNQPQAPTA--PVPHT 798
Cdd:NF033609  863 DSESGS----------NNNVVPPNSPKNGTNASNKNEAKDSkePLPDT 900
MSCRAMM_ClfA NF033609
MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial ...
580-729 1.01e-04

MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules). It is heavily studied in Staphylococcus aureus both for its biological role in adhesion and for its potential for vaccination. Features of the sequence, shared with other MSCRAMM adhesins, include a long run of Ser-Asp dipeptide repeats and a C-terminal cell wall-anchoring LPXTG motif.


Pssm-ID: 468110 [Multi-domain]  Cd Length: 934  Bit Score: 47.21  E-value: 1.01e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  580 SMSTLRSGRKKQPASPDGRTSPINEDIRSSGRNSPSAASTSSNDSKAETVKKSAKKVKEEASSPLKSNKRQREKVASDTE 659
Cdd:NF033609  650 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 729
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 112382224  660 -EADRTSSKKTKTQEISRPNSPSEGEGES---SDSRSVNDEGSSDPKDIDQDNRSTSPSiPSPQDNESDSDSSA 729
Cdd:NF033609  730 sDSDSDSDSDSDSDSDSDSDSDSDSDSDSdsdSDSDSDSDSDSDSDSDSDSDSDSDSDS-DSDSDSDSDSDSDS 802
Myb_DNA-binding pfam00249
Myb-like DNA-binding domain; This family contains the DNA binding domains from Myb proteins, ...
396-439 1.04e-04

Myb-like DNA-binding domain; This family contains the DNA binding domains from Myb proteins, as well as the SANT domain family.


Pssm-ID: 459731 [Multi-domain]  Cd Length: 46  Bit Score: 40.95  E-value: 1.04e-04
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....
gi 112382224   396 WTEDEVKRFVKGLRQYGKNFFRIrKELLPNKETGELITFYYYWK 439
Cdd:pfam00249    4 WTPEEDELLLEAVEKLGNRWKKI-AKLLPGRTDNQCKNRWQNYL 46
PABP-1234 TIGR01628
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ...
756-916 2.16e-03

polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A tail of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens: one expressed in testis, one in platelets, one broadly expressed, and one of unknown tissue range.


Pssm-ID: 130689 [Multi-domain]  Cd Length: 562  Bit Score: 42.49  E-value: 2.16e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   756 PGTPQLPTPGPTPSATAVPPQGSPtasqapnQPQAPTAPVPHTHIQQAPalhpqrppsphppphpsphpplqplTGSAGQ 835
Cdd:TIGR01628  380 PRMRQLPMGSPMGGAMGQPPYYGQ-------GPQQQFNGQPLGWPRMSM-------------------------MPTPMG 427
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   836 PSAPSHAQ--PPLHGQGPPGPHSLQAgpllQHPGPPQPFGLPPQASQGQAPLGTSPAAAYPHTSLQLPASQSALQSqQPP 913
Cdd:TIGR01628  428 PGGPLRPNglAPMNAVRAPSRNAQNA----AQKPPMQPVMYPPNYQSLPLSQDLPQPQSTASQGGQNKKLAQVLAS-ATP 502

                   ...
gi 112382224   914 REQ 916
Cdd:TIGR01628  503 QMQ 505
COG5373 COG5373
Uncharacterized membrane protein [Function unknown];
728-805 2.17e-03

Uncharacterized membrane protein [Function unknown];


Pssm-ID: 444140 [Multi-domain]  Cd Length: 854  Bit Score: 42.68  E-value: 2.17e-03
                          10        20        30        40        50        60        70
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 112382224  728 SAQQQMLQAQPPAlqAPTGVTPAPSSAPPGTPQLPTPGPTPSATAVPPQGSPTASQAPNQPQAPTAPVPhthiQQAPA 805
Cdd:COG5373    31 EELEAELAEAAEA--ASAPAEPEPEAAAAATAAAPEAAPAPVPEAPAAPPAAAEAPAPAAAAPPAEAEP----AAAPA 102
 
Name Accession Description Interval E-value
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
568-1565 0e+00

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly by altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 1182.64  E-value: 0e+00
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   568 GKHSMRTRRSRGSMSTLRSGRKKQPASPDGRTSPINEDIRSSGRNSPSAASTSSNDSKAETVKKSAKKVKEEASSPLKSN 647
Cdd:pfam03154    1 GKHSMRTRRSRGSMSTLRSGRKKQTASPDGRASPTNEDLRSSGRNSPSAASTSSNDSKAESMKKSSKKIKEEAPSPLKSA 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   648 KRQREKVASDTEEADRTSSKKTKTQEISRPNSPSEGEGESSDSRSVNDEGSSDPKDIDQDNRSTSPSIPSPQDNESDSDS 727
Cdd:pfam03154   81 KRQREKGASDTEEPERATAKKSKTQEISRPNSPSEGEGESSDGRSVNDEGSSDPKDIDQDNRSTSPSIPSPQDNESDSDS 160
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   728 SAQQQMLQAQPPALQAPTGVTPAPSSAPPGTPQLPTPGPTPSATAVPPQGSPTASQAPNQPQAPTAPvpHTHIQQAPALH 807
Cdd:pfam03154  161 SAQQQILQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAP--HTLIQQTPTLH 238
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   808 PQRPPSPHPPPHPSPHPPLQPltgsagQPSAPSHAQPPLHGQGPPGPHSLQAGP-LLQHPGPPQPFGLPPQASQGQAPLG 886
Cdd:pfam03154  239 PQRLPSPHPPLQPMTQPPPPS------QVSPQPLPQPSLHGQMPPMPHSLQTGPsHMQHPVPPQPFPLTPQSSQSQVPPG 312
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   887 TSPAAAYP-HTSLQLPASQSALQSQQPPREQPLPPAPLAMPHIKPPPTTPIPQLPAPQAHKHPPHLSGPSPFSMNANLPP 965
Cdd:pfam03154  313 PSPAAPGQsQQRIHTPPSQSQLQSQQPPREQPLPPAPLSMPHIKPPPTTPIPQLPNPQSHKHPPHLSGPSPFQMNSNLPP 392
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   966 PPALKPLSSLSTHHPPSAHPPPLQLMPQSQPLPSSPAQPPGLTQSQNLPPPPASHPPT-GLHQVAPQPPFAQHPFVPGGP 1044
Cdd:pfam03154  393 PPALKPLSSLSTHHPPSAHPPPLQLMPQSQQLPPPPAQPPVLTQSQSLPPPAASHPPTsGLHQVPSQSPFPQHPFVPGGP 472
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  1045 PPITPPTCPSTSTPPAGPGTsaQPPCSGAAASGGSIAGGSSCPLPTVQIKEEALDDAEEPESPPPPPRSPSPEPTVVDTP 1124
Cdd:pfam03154  473 PPITPPSGPPTSTSSAMPGI--QPPSSASVSSSGPVPAAVSCPLPPVQIKEEALDEAEEPESPPPPPRSPSPEPTVVNTP 550
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  1125 SHASQSARFYKHLDRGYNSCARTDLYFMPLAGSKLAKKREEAIEKAKREAEQKAREEREREKEKEKEREREREREREAER 1204
Cdd:pfam03154  551 SHASQSARFYKHLDRGYNSCARTDLYFMPLAGSKLAKKREEALEKAKREAEQKAREEKEREKEKEKEREREREREREAER 630
                          650       660       670       680       690       700       710       720
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  1205 AAKASSSAHEGRLSDPQLSGPGHMRPSFEPPPTTIAAVPPYIGPDTPALRTLSEYARPHVMSPTNRNHPFYMPLNPTDPL 1284
Cdd:pfam03154  631 AAKASSSSHEGRMGDPQLAGPAHMRPSFEPPPTTIAAVPPYIGPDTPALRTLSEYARPHVMSPTNRNHPFFVPLNPTDPL 710
                          730       740       750       760       770       780       790       800
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  1285 LAYHMPGLYNVDPTIRERELREREIREREIRERELRERMKPGFEVKPPELDPLHPAANPMEHFARHSALTIPPTAGPHPF 1364
Cdd:pfam03154  711 LAYHMPGLYNVDPAIRERELREREIREREIRERELRERMKPGFEVKPPELDPLHPATNPMEHFARHGALTLPPMAGPHPF 790
                          810       820       830       840       850       860       870       880
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  1365 ASFHPGLNPLERERLALAGPQLRPEMSYPDRLAAERIHAERMASLTSDPLARLQMFNVTPHHHQHSHIHSHLHLHQQDPL 1444
Cdd:pfam03154  791 ASFHPGLNPLERERLALAGPQLRPEMSYPDRLAAERLHAERMASLTNDPLARLQMFNVTPHHHQHSHIHSHLHLHQQDPL 870
                          890       900       910       920       930       940       950       960
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  1445 HQGSAGPVHPLVDPLTAGPHLARFPYPPGTLPNPLLGQPPHEHEMLRHPVFGTPYPRDLPGAIPPPMSAAHQLQAMHAQS 1524
Cdd:pfam03154  871 HQGSGGPVHPLVDPLAAGPHLARFPYPPGAIPNPLLGQPPHEHEMLRHPVFGTPYPRDLPGGLPPPMSAAHQLQAMHAQS 950
                          970       980       990      1000
                   ....*....|....*....|....*....|....*....|.
gi 112382224  1525 AELQRLAMEQQWLHGHPHMHGGHLPSQEDYYSRLKKEGDKQ 1565
Cdd:pfam03154  951 AELQRLAMEQQWLHGHPHMHGGHLPGQEDYYSRLKKESDKQ 991
BAH_MTA cd04709
BAH, or Bromo Adjacent Homology domain, as present in MTA1 and similar proteins. The ...
102-307 4.76e-81

BAH, or Bromo Adjacent Homology domain, as present in MTA1 and similar proteins. The metastasis-associated protein MTA1 is part of the NuRD (nucleosome remodeling and deacetylase) complex and plays a role in cellular transformation and metastasis. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions.


Pssm-ID: 240060  Cd Length: 164  Bit Score: 263.48  E-value: 4.76e-81
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  102 DVVYRPGDCVYIESRrPNTPYFICSIQDFKLvhnsqaccrsptpalcdppacslpvasqppqhlseagrgpvgSKRDHLL 181
Cdd:cd04709     1 ANMYRVGDYVYFESS-PNNPYLIRRIEELNK------------------------------------------TARGHVE 37
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  182 MNVKWYYRQSEVPDSVYQHLVQDRHNEND-SGRELVITDPVIKNRELFISDYVDTYHAAALRGKCNISHFSDIFAAREFK 260
Cdd:cd04709    38 AKVVCYYRRRDIPDSLYQLADQHRRELEEkSDDLTPKQRHQLRHRELFLSRQVETLPATHIRGKCSVTLLNDTESARSYL 117
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....*..
gi 112382224  261 ARVDSFFYILGYNPETRRLNSTQGEIRVGPSHQAKLPDLQPFPSPDG 307
Cdd:cd04709   118 AREDTFFYSLVYDPEQKTLLADQGEIRVGPSYQAKLPDLQPFPSPDG 164
SANT_MTA3_like cd11661
Myb-like DNA-binding domain of MTA3 and related proteins; Members of this SANT/Myb family ...
395-440 4.33e-23

Myb-like DNA-binding domain of MTA3 and related proteins; Members of this SANT/Myb family include domains found in mouse metastasis-associated protein 3 (MTA3) and in arginine-glutamic acid dipeptide repeats (RERE) proteins. SANT (SWI3, ADA2, N-CoR and TFIIIB) DNA-binding domains are a diverse set of domains that share a common bundle of three alpha helices. MTA3 has been shown to interact with the nucleosome remodeling and deacetylase (NuRD) proteins CHD4 and HDAC1 and with the core cohesin complex protein RAD21 in the ovary, and to regulate G2/M progression in proliferating granulosa cells. RERE belongs to the atrophin family and has been identified as a nuclear receptor corepressor; altered expression levels of RERE are associated with cancer in humans, while mutations of Rere in mice cause failure to close the anterior neural tube and fusion of the telencephalic and optic vesicles during embryogenesis.


Pssm-ID: 212559 [Multi-domain]  Cd Length: 46  Bit Score: 93.45  E-value: 4.33e-23
                          10        20        30        40
                  ....*....|....*....|....*....|....*....|....*.
gi 112382224  395 CWTEDEVKRFVKGLRQYGKNFFRIRKELLPNKETGELITFYYYWKK 440
Cdd:cd11661     1 EWSESEAKLFEEGLRKYGKDFHDIRQDFLPWKSVGELVEFYYMWKK 46
ZnF_GATA smart00401
zinc finger binding to DNA consensus sequence [AT]GATA[AG];
503-552 3.08e-15

zinc finger binding to DNA consensus sequence [AT]GATA[AG];


Pssm-ID: 214648 [Multi-domain]  Cd Length: 52  Bit Score: 71.30  E-value: 3.08e-15
                            10        20        30        40        50
                    ....*....|....*....|....*....|....*....|....*....|.
gi 112382224    503 KGYACRHCFTTTSKDWHHGGRENILLCTDCRIHFKKYGEL-PPIEKPVDPP 552
Cdd:smart00401    2 SGRSCSNCGTTETPLWRRGPSGNKTLCNACGLYYKKHGGLkRPLSLKKDGI 52
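The DNA consensus [AT]GATA[AG] quoted for this domain maps directly onto a regular expression. The following is a minimal sketch, not part of the CDD record: the function name `find_gata_sites` and the example DNA strings are hypothetical, and only the forward strand is scanned.

```python
import re

# Consensus from the domain description: [AT]GATA[AG].
# A lookahead is used so that overlapping sites are all reported.
GATA_SITE = re.compile(r"(?=([AT]GATA[AG]))")


def find_gata_sites(dna):
    """Return (0-based start, matched hexamer) for each forward-strand site."""
    return [(m.start(), m.group(1)) for m in GATA_SITE.finditer(dna.upper())]


if __name__ == "__main__":
    # Hypothetical input, chosen only to illustrate the pattern.
    print(find_gata_sites("ccAGATAAgg"))
```

A real binding-site search would normally also scan the reverse complement and score matches against a position weight matrix; the plain consensus match here only illustrates what the bracket notation means.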
ZnF_GATA cd00202
Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] ...
506-560 3.28e-15

Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C


Pssm-ID: 238123 [Multi-domain]  Cd Length: 54  Bit Score: 71.25  E-value: 3.28e-15
                          10        20        30        40        50
                  ....*....|....*....|....*....|....*....|....*....|....*
gi 112382224  506 ACRHCFTTTSKDWHHGGRENILLCTDCRIHFKKYGELPPIEKPvDPPPFMFKPVK 560
Cdd:cd00202     1 ACSNCGTTTTPLWRRGPSGGSTLCNACGLYWKKHGVMRPLSKR-KKDQIKRRNRK 54
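The zinc-finger topology C-X(2)-C-X(17)-C-X(2)-C given in the description can be checked against a sequence with a regular expression. The sketch below is illustrative only: the function name `find_zinc_finger` and its `loop_min`/`loop_max` parameters are assumptions, not part of CDD. The loop length is left adjustable because, by direct count, the query segment shown in the GATA alignments on this page (residues 507-542) has 18 residues between the second and third cysteines, one more than the X(17) of the stated consensus.

```python
import re


def find_zinc_finger(seq, loop_min=17, loop_max=18):
    """Find the first C-X(2)-C-X(n)-C-X(2)-C match, loop_min <= n <= loop_max.

    Returns (0-based start, matched substring), or None if nothing matches.
    The default range 17-18 is an assumption made for this example.
    """
    pattern = re.compile(r"C.{2}C.{%d,%d}C.{2}C" % (loop_min, loop_max))
    m = pattern.search(seq.upper())
    return None if m is None else (m.start(), m.group())


if __name__ == "__main__":
    # Query residues 506-542, copied from the alignments on this page.
    query = "ACRHCFTTTSKDWHHGGRENILLCTDCRIHFKKYGEL"
    print(find_zinc_finger(query))          # tolerant loop length (17-18)
    print(find_zinc_finger(query, 17, 17))  # strict X(17) loop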
ELM2 pfam01448
ELM2 domain; The ELM2 (Egl-27 and MTA1 homology 2) domain is a small domain of unknown ...
286-336 2.45e-12

ELM2 domain; The ELM2 (Egl-27 and MTA1 homology 2) domain is a small domain of unknown function. It is found in the MTA1 protein, which is part of the NuRD complex. The domain is usually found N-terminal to a Myb-like DNA-binding domain (pfam00249). ELM2 is also found associated with an ARID DNA-binding domain (pfam01388) in Swiss:O82364. This suggests that ELM2 may also be involved in DNA binding, or perhaps is a protein-protein interaction domain.


Pssm-ID: 460214  Cd Length: 53  Bit Score: 63.02  E-value: 2.45e-12
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|...
gi 112382224   286 IRVGPSHQAKLPDLQPFPSPDGDTVTQHEELVWMP--GVNDCDLLMYLRAARS 336
Cdd:pfam01448    1 IRVGPRYQAEIPELLPPSEEEDRYEEEDELLVWDPnhNLPDRKLDEYLVVARS 53
GATA pfam00320
GATA zinc finger; This domain uses four cysteine residues to coordinate a zinc ion. This ...
507-542 1.25e-11

GATA zinc finger; This domain uses four cysteine residues to coordinate a zinc ion. This domain binds to DNA. Two GATA zinc fingers are found in the GATA transcription factors. However, there are several proteins that contain only a single copy of the domain.


Pssm-ID: 425605 [Multi-domain]  Cd Length: 36  Bit Score: 60.41  E-value: 1.25e-11
                           10        20        30
                   ....*....|....*....|....*....|....*.
gi 112382224   507 CRHCFTTTSKDWHHGGRENILLCTDCRIHFKKYGEL 542
Cdd:pfam00320    1 CSNCGTTKTPLWRRGPNGNRTLCNACGLYYKKKGLK 36
PHA03247 PHA03247
large tegument protein UL36; Provisional
540-956 1.49e-11

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 69.97  E-value: 1.49e-11
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  540 GELPPIEKPVDPPPFMFKPVKEeddglsgkhSMRTRRSRGSMSTLRSGRKKQPASPDGRTSPINE--DIRSSGRNSPSAA 617
Cdd:PHA03247 2549 GDPPPPLPPAAPPAAPDRSVPP---------PRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDrgDPRGPAPPSPLPP 2619
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  618 STSSNDSKAETVKKSAKKVKEEASSPLKSNKRQREKVASDTEEADRTSSKKTKTqeiSRPNSPSEGEGESSDSRSVNDEG 697
Cdd:PHA03247 2620 DTHAPDPPPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRA---AQASSPPQRPRRRAARPTVGSLT 2696
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  698 SS-----DPKDIDQDNRSTSPSIPSPQDNESDSDSSAQQQMLQAQPPALQAP-TGVTPAPSSAPPGTPQLPTPGPTPSAT 771
Cdd:PHA03247 2697 SLadpppPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPaTPGGPARPARPPTTAGPPAPAPPAAPA 2776
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  772 AVPPQGSPTASQAPNQPQAPTAPVPhthiqQAPALHPQRPPSPHPPPHPSPHPPLQPLTGSAGQPSAPSHAQPPLHGQGP 851
Cdd:PHA03247 2777 AGPPRRLTRPAVASLSESRESLPSP-----WDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLP 2851
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  852 PGPHSLQAGPLLQHPGPPQPFGLPPQASQGQAPLGTSPAAAYPHTSLQLPASQSALQSQQPPREQPLPPAPLAMPHIKPP 931
Cdd:PHA03247 2852 LGGSVAPGGDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQP 2931
                         410       420
                  ....*....|....*....|....*.
gi 112382224  932 PTTPIPQLPAPQA-HKHPPHLSGPSP 956
Cdd:PHA03247 2932 PPPPPPRPQPPLApTTDPAGAGEPSG 2957
BAH pfam01426
BAH domain; This domain has been called BAH (Bromo adjacent homology) domain and has also been ...
103-281 6.30e-10

BAH domain; This domain has been called the BAH (Bromo adjacent homology) domain and has also been called the ELM1 and BAM (Bromo adjacent motif) domain. Its function is unknown, but it may be involved in protein-protein interactions.


Pssm-ID: 460207  Cd Length: 120  Bit Score: 58.47  E-value: 6.30e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   103 VVYRPGDCVYIESRRPNTPYFICSIQDFKlvhnsqaccrsptpalCDPPACSLPVasqppqhlseagrgpvgskrdhllm 182
Cdd:pfam01426    1 ETYSVGDFVLVEPDDADEPYYVARIEELF----------------EDTKNGKKMV------------------------- 39
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   183 NVKWYYRQSEVPdsvyqHLVQDRHNEndsgrelvitdpviknRELFISDYVDTYHAAALRGKCNISHFSDIFAAREFK-A 261
Cdd:pfam01426   40 RVQWFYRPEETV-----HRAGKAFNK----------------DELFLSDEEDDVPLSAIIGKCSVLHKSDLESLDPYKiK 98
                          170       180
                   ....*....|....*....|
gi 112382224   262 RVDSFFYILGYNPETRRLNS 281
Cdd:pfam01426   99 EPDDFFCELLYDPKTKSFKK 118
BAH smart00439
Bromo adjacent homology domain;
105-281 1.55e-09

Bromo adjacent homology domain;


Pssm-ID: 214664 [Multi-domain]  Cd Length: 121  Bit Score: 57.30  E-value: 1.55e-09
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224    105 YRPGDCVYIESRRPNTPYFICSIQDfklvhnsqaccrsptpaLCDPPACSLPVASQppqhlseagrgpvgskrdhllmnV 184
Cdd:smart00439    2 ISVGDFVLVEPDDADEPYYIGRIEE-----------------IFETKKNSESKMVR-----------------------V 41
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224    185 KWYYRQSEVPdsvyqHLVQDRHNENdsgrelvitdpviknrELFISDYVDTYHAAALRGKCNISHFSDIF--AAREFKAR 262
Cdd:smart00439   42 RWFYRPEETV-----LEKAALFDKN----------------EVFLSDEYDTVPLSDIIGKCNVLYKSDYPglRPEGSIGE 100
                           170
                    ....*....|....*....
gi 112382224    263 VDSFFYILGYNPETRRLNS 281
Cdd:smart00439  101 PDVFFCESAYDPEKGSFKK 119
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
738-916 7.89e-09

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 60.38  E-value: 7.89e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  738 PPALQAPTGVTPAPSSAPPGTPQLPTPGPTPSATAVPPQGSPTASQAPNQPQAPTAPVPHTHIQQAPALHPQRPPSPHPP 817
Cdd:PRK07764  591 APGAAGGEGPPAPASSGPPEEAARPAAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWP 670
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  818 PHPSPHPPLQPLTGSAGQPSAPSHAQPPLHGQ--GPPGPHSLQA-GPLLQHPGPPQPFGLPPQASQGQAPLGTSPAA--A 892
Cdd:PRK07764  671 AKAGGAAPAAPPPAPAPAAPAAPAGAAPAQPApaPAATPPAGQAdDPAAQPPQAAQGASAPSPAADDPVPLPPEPDDppD 750
                         170       180
                  ....*....|....*....|....
gi 112382224  893 YPHTSLQLPASQSALQSQQPPREQ 916
Cdd:PRK07764  751 PAGAPAQPPPPPAPAPAAAPAAAP 774
MSCRAMM_ClfA NF033609
MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial ...
545-729 9.44e-09

MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules). It is heavily studied in Staphylococcus aureus both for its biological role in adhesion and for its potential for vaccination. Features of the sequence, shared with other MSCRAMM adhesins, include a long run of Ser-Asp dipeptide repeats and a C-terminal cell wall-anchoring LPXTG motif.


Pssm-ID: 468110 [Multi-domain]  Cd Length: 934  Bit Score: 60.31  E-value: 9.44e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  545 IEKPVDPP----PFMFKPVKEEDDGLSGKHSMRTRRSR------GSMSTLRSGRKKQPASPDGRTSPINEDIRSSGRNSP 614
Cdd:NF033609  539 IDKPVVPEqpdePGEIEPIPEDSDSDPGSDSGSDSSNSdsgsdsGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDS 618
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  615 SAASTSSNDSKAETVKKSAKKVKEEASSPLKSNKRQREKVASDTEeADRTSSKKTKTQEISRPNSPSEGEGES-SDSRSV 693
Cdd:NF033609  619 ASDSDSASDSDSASDSDSASDSDSDSDSDSDSDSDSDSDSDSDSD-SDSDSDSDSDSDSDSDSDSDSDSDSDSdSDSDSD 697
                         170       180       190
                  ....*....|....*....|....*....|....*.
gi 112382224  694 NDEGSSDPKDIDQDNRSTSPSiPSPQDNESDSDSSA 729
Cdd:NF033609  698 SDSDSDSDSDSDSDSDSDSDS-DSDSDSDSDSDSDS 732
BAH cd04370
BAH, or Bromo Adjacent Homology domain (also called ELM1 and BAM for Bromo Adjacent Motif). ...
105-277 8.70e-08

BAH, or Bromo Adjacent Homology domain (also called ELM1 and BAM for Bromo Adjacent Motif). BAH domains were first described as domains found in the polybromo protein and yeast Rsc1/Rsc2 (Remodeling of the Structure of Chromatin). They also occur in mammalian DNA methyltransferases and the MTA1 subunits of histone deacetylase complexes. A BAH domain is also found in yeast Sir3p and in the origin recognition complex protein 1 (Orc1p), where it was found to interact with the N-terminal lobe of the silent information regulator 1 protein (Sir1p), confirming the initial hypothesis that BAH plays a role in protein-protein interactions.


Pssm-ID: 239835 [Multi-domain]  Cd Length: 123  Bit Score: 52.39  E-value: 8.70e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  105 YRPGDCVYIE--SRRPNTPYFICSIQDFklvhnsqaccrsptpalcdppacslpvasqppqhlseagrgpVGSKRDHLLM 182
Cdd:cd04370     4 YEVGDSVYVEpdDSIKSDPPYIARIEEL------------------------------------------WEDTNGSKQV 41
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  183 NVKWYYRQSEVPDSVYQHlvqdrHNEndsgrelvitdpviknRELFISDYVDTYHAAALRGKCNISHFSDIF--AAREFK 260
Cdd:cd04370    42 KVRWFYRPEETPKGLSPF-----ALR----------------RELFLSDHLDEIPVESIIGKCKVLFVSEFEglKQRPNK 100
                         170
                  ....*....|....*..
gi 112382224  261 ARVDSFFYILGYNPETR 277
Cdd:cd04370   101 IDTDDFFCRLAYDPTTK 117
BAH_fungalPHD cd04710
BAH, or Bromo Adjacent Homology domain, as present in fungal proteins containing PHD domains. ...
101-278 1.92e-07

BAH, or Bromo Adjacent Homology domain, as present in fungal proteins containing PHD domains. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions.


Pssm-ID: 240061  Cd Length: 135  Bit Score: 51.60  E-value: 1.92e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  101 DDVVYRPGDCVYIESRRPNTPYFICSIQDFklvhnsqaccrsptpalcdppacsLPVASQPPQHLSEAGRGPVGSKRdhl 180
Cdd:cd04710     8 NGELLKVNDHIYMSSEPPGEPYYIGRIMEF------------------------VPKHEFPSGIHARVFPASYFQVR--- 60
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  181 lMNvkWYYRQSEVpdsvyqhlvqDRHNENDSgrelvitdpviknRELFISDYVDTYHAAALRGKCNISHFSDIFAAREFK 260
Cdd:cd04710    61 -LN--WYYRPRDI----------SRRVVADS-------------RLLYASMHSDICPIGSVRGKCTVRHRDQIPDLEEYK 114
                         170
                  ....*....|....*...
gi 112382224  261 ARVDSFFYILGYNPETRR 278
Cdd:cd04710   115 KRPNHFYFDQLFDRYILR 132
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
738-890 6.57e-07

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 54.27  E-value: 6.57e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   738 PPALQAPTGVTPAPSSAPPGTPQLP----------------TPGPTPSATAVPPQgSPTASQAPNQPQAPTAPVPHTHIQ 801
Cdd:pfam09770  166 APKKAAAPAPAPQPAAQPASLPAPSrkmmsleeveaamraqAKKPAQQPAPAPAQ-PPAAPPAQQAQQQQQFPPQIQQQQ 244
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   802 QAPALHPQRPPSPHPPPHPSPHPPLQPLTGSAGQPSAPSHAQPPLHGQGPPGPHSLQA-----------------GPLLQ 864
Cdd:pfam09770  245 QPQQQPQQPQQHPGQGHPVTILQRPQSPQPDPAQPSIQPQAQQFHQQPPPVPVQPTQIlqnpnrlsaarvgypqnPQPGV 324
                          170       180
                   ....*....|....*....|....*.
gi 112382224   865 HPGPPQPFGLPPQASQGQAPLGTSPA 890
Cdd:pfam09770  325 QPAPAHQAHRQQGSFGRQAPIITHPQ 350
PHA03247 PHA03247
large tegument protein UL36; Provisional
542-961 8.20e-07

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 54.17  E-value: 8.20e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  542 LPPIEKPVDPPPFMFKPVKEEDDGLSGKHSMRTRRSRGSMSTLR---SGRKKQPASPDGRTSPINEDIRSSGRNS--PSA 616
Cdd:PHA03247 2617 LPPDTHAPDPPPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRvsrPRRARRLGRAAQASSPPQRPRRRAARPTvgSLT 2696
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  617 ASTSSNDSKAETVKKSAKKVKEEASSPLKSNKRQREKVASDTEEADRTSSKKTKTQEISRPNSPSEGEGESSDSRSVNDE 696
Cdd:PHA03247 2697 SLADPPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPA 2776
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  697 GSSDPKDIDQDNRSTSPSIPSPQDNESDSDSSAQQQMLQAQPPALQAPTGVTPAPSSA---PPGTPQLPTPGPTPSATAV 773
Cdd:PHA03247 2777 AGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAqptAPPPPPGPPPPSLPLGGSV 2856
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  774 PPQGS----PTASQAPNQPQAPTAPVPHTHIQQAPALHPQRPPSPHPPPHPSPHPPLQPLTGSAGQPSAPSHAQPPLHGQ 849
Cdd:PHA03247 2857 APGGDvrrrPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPP 2936
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  850 GPPGPhslQAGPLLQHPGPPQPFGLPPQASQGQAPLGTSPA----AAYPHTSLQLPASQ--------------------- 904
Cdd:PHA03247 2937 PRPQP---PLAPTTDPAGAGEPSGAVPQPWLGALVPGRVAVprfrVPQPAPSREAPASStppltghslsrvsswasslal 3013
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  905 ---------SALQSQQPPREQPLPPAPLAMPHIKPPPTTPIPQLPAPQAHKHPPHLS-----------------GPSPFS 958
Cdd:PHA03247 3014 heetdpppvSLKQTLWPPDDTEDSDADSLFDSDSERSDLEALDPLPPEPHDPFAHEPdpatpeagarespssqfGPPPLS 3093

                  ...
gi 112382224  959 MNA 961
Cdd:PHA03247 3094 ANA 3096
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
660-845 1.71e-06

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 53.07  E-value: 1.71e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  660 EADRTSSKKTKTQEISRPNSPSEGEGESSDSRSVNDEGSSDPKDIDQDNRSTSPSIPSPQDNESDSDSSAQQQMLQAQPP 739
Cdd:PRK07764  598 EGPPAPASSGPPEEAARPAAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWPAKAGGAA 677
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  740 ALQAPTGVTPAPSSAPPGTPQlPTPGPTPSATAVPPQGSPTASQAPNQPQAPTAPVPHTHIQQAPALHPQRPPSPHPPPH 819
Cdd:PRK07764  678 PAAPPPAPAPAAPAAPAGAAP-AQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAADDPVPLPPEPDDPPDPAGAPA 756
                         170       180
                  ....*....|....*....|....*.
gi 112382224  820 PSPHPPLQPLTGSAGQPSAPSHAQPP 845
Cdd:PRK07764  757 QPPPPPAPAPAAAPAAAPPPSPPSEE 782
PHA03247 PHA03247
large tegument protein UL36; Provisional
708-1145 1.72e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 53.02  E-value: 1.72e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  708 NRSTSPSIPSPQDNESDSDSSAQQQMLQAQPPALQAPTG------VTPAPSSAPPGTPQLPTPGPTPSATAV-PPQGSPT 780
Cdd:PHA03247 2565 DRSVPPPRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDdrgdprGPAPPSPLPPDTHAPDPPPPSPSPAANePDPHPPP 2644
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  781 ASQAPNQPQAPTAP--VPHTHIQQAPALHPQRPPSPHPPPHPSPHPPLQPLTGSAGQPSAPSHAQPPLHGQGPPGPHSLQ 858
Cdd:PHA03247 2645 TVPPPERPRDDPAPgrVSRPRRARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPPTPEPAPHALVSATPLPPG 2724
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  859 AGPLLQ-HPGPPQPFGLPPQASQGQAPLGTSPAAAYPHTSLQLPASQSALQSQQPPREQPLPPaplamphIKPPPTTPIP 937
Cdd:PHA03247 2725 PAAARQaSPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPA-------VASLSESRES 2797
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  938 QLPAPQAHKHPPHLSGPSPFSMNANLPPPPALKPLSSLSTHHPPSAHPPPLQL--------------MPQSQPLPSSPAQ 1003
Cdd:PHA03247 2798 LPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLplggsvapggdvrrRPPSRSPAAKPAA 2877
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224 1004 PPGlTQSQNLPPPPASHPPTGLHQVAPQPPFAQHPFVPGGPPPITPPTCPSTSTPPAGPGTSAQPPCSGAAASGGSIAGG 1083
Cdd:PHA03247 2878 PAR-PPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPS 2956
                         410       420       430       440       450       460
                  ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 112382224 1084 SSCPLPTvqikeeaLDDAEEPESPPPPPRSPSPEPTVvdtPSHASQSARFYKHLDRGYNSCA 1145
Cdd:PHA03247 2957 GAVPQPW-------LGALVPGRVAVPRFRVPQPAPSR---EAPASSTPPLTGHSLSRVSSWA 3008
PHA03378 PHA03378
EBNA-3B; Provisional
739-882 2.88e-06

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 52.38  E-value: 2.88e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  739 PALQAPTGV-TPAPSSAPPGTPQLPTPGPTPsatAVPPQGSPTASQAP---NQPQAPTAPVPHTHIQQAPALHPQRPPSP 814
Cdd:PHA03378  673 PYQPSPTGAnTMLPIQWAPGTMQPPPRAPTP---MRPPAAPPGRAQRPaaaTGRARPPAAAPGRARPPAAAPGRARPPAA 749
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 112382224  815 HPPPHPSPHPPLQPLTGSAGQPSAPSHAQPPlhgQGPPGPHSL-QAGPLLQHP--GPPQPFGLPPQASQGQ 882
Cdd:PHA03378  750 APGRARPPAAAPGRARPPAAAPGAPTPQPPP---QAPPAPQQRpRGAPTPQPPpqAGPTSMQLMPRAAPGQ 817
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
680-910 4.93e-06

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 51.42  E-value: 4.93e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  680 PSEGEGESSDSRSVNDEgSSDPKDIDQDNRSTSPSIPSPQDNESDSDSSAQQQMLQAQPPALQ--APTGVTPAPSSAPPG 757
Cdd:PRK12323  365 PGQSGGGAGPATAAAAP-VAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRspAPEALAAARQASARG 443
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  758 TPQLPTPGPTPSATAVPPQGSPTASQAPNQPQAPTAPVPHTHIQQAPALHPQRPPSPHPPPHPSPHPPLQPLTGSAGQPS 837
Cdd:PRK12323  444 PGGAPAPAPAPAAAPAAAARPAAAGPRPVAAAAAAAPARAAPAAAPAPADDDPPPWEELPPEFASPAPAQPDAAPAGWVA 523
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  838 A----PSHAQPPlhGQGPPGPHSLQAGPLLQHPGPPQPFGLPPQASQGQAPLGTSPAAAYPHTSLQLP----ASQSALQS 909
Cdd:PRK12323  524 EsipdPATADPD--DAFETLAPAPAAAPAPRAAAATEPVVAPRPPRASASGLPDMFDGDWPALAARLPvrglAQQLARQS 601

                  .
gi 112382224  910 Q 910
Cdd:PRK12323  602 E 602
PspC_subgroup_1 NF033838
pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, ...
557-804 4.95e-06

pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A. The other form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site.


Pssm-ID: 468201 [Multi-domain]  Cd Length: 684  Bit Score: 51.17  E-value: 4.95e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  557 KPVKEEDDGLSGKHSMRTRRSRGSMStlrsgrkkQPASPDGRTSPINEDIRSSGRNSPSAASTSSNDSKAEtvkksAKKV 636
Cdd:NF033838  246 KEAVEKNVATSEQDKPKRRAKRGVLG--------EPATPDKKENDAKSSDSSVGEETLPSPSLKPEKKVAE-----AEKK 312
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  637 KEEAssplksnkrqrEKVASDTEEADR----TSSKKTKTQEISRPNSP-SEGE-----GESSDSRsvNDEGSSDPKDIDQ 706
Cdd:NF033838  313 VEEA-----------KKKAKDQKEEDRrnypTNTYKTLELEIAESDVKvKEAElelvkEEAKEPR--NEEKIKQAKAKVE 379
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  707 DNRSTSPSIPSPQDNESDSDSSAQQQMlqaqppALQAPTGVTPAPSSAPPGTPQLPTPGPTPSATAVPPQGSPTASQAPN 786
Cdd:NF033838  380 SKKAEATRLEKIKTDRKKAEEEAKRKA------AEEDKVKEKPAEQPQPAPAPQPEKPAPKPEKPAEQPKAEKPADQQAE 453
                         250
                  ....*....|....*....
gi 112382224  787 QPQAPTAP-VPHTHIQQAP 804
Cdd:NF033838  454 EDYARRSEeEYNRLTQQQP 472
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
677-894 5.62e-06

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 51.14  E-value: 5.62e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  677 PNSPSEGEGESSDSRSVNDEGSSDPKDIDQDNRSTSPSiPSPQDNESDSDSSAQQQMLQAqPPALQAPTGVTPAPSSAPP 756
Cdd:PRK07764  591 APGAAGGEGPPAPASSGPPEEAARPAAPAAPAAPAAPA-PAGAAAAPAEASAAPAPGVAA-PEHHPKHVAVPDASDGGDG 668
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  757 GTPQLPTPGPTPSATAVPPQGSPT-ASQAPNQPQAPTAPVPHTHIQQAPAlhpqrppsphppPHPSPHPPLQPLTGSAGQ 835
Cdd:PRK07764  669 WPAKAGGAAPAAPPPAPAPAAPAApAGAAPAQPAPAPAATPPAGQADDPA------------AQPPQAAQGASAPSPAAD 736
                         170       180       190       200       210
                  ....*....|....*....|....*....|....*....|....*....|....*....
gi 112382224  836 PSAPSHAQPPLHGQGPPGPHSLQAGPllqHPGPPQPFGLPPQASQGQAPLGTSPAAAYP 894
Cdd:PRK07764  737 DPVPLPPEPDDPPDPAGAPAQPPPPP---APAPAAAPAAAPPPSPPSEEEEMAEDDAPS 792
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
733-910 7.52e-06

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 50.48  E-value: 7.52e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  733 MLQAQPPAlqAPTGVTPAPSSAP-PGTPQLPTPGPTPSATAVPPQGSPTASQAPNQPQAPTAPVPHTHIQQAPALHPQRP 811
Cdd:PRK14951  361 LLAFKPAA--AAEAAAPAEKKTPaRPEAAAPAAAPVAQAAAAPAPAAAPAAAASAPAAPPAAAPPAPVAAPAAAAPAAAP 438
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  812 PSPHPPPHPSPhpplqpltgSAGQPSAPSHAQPPLHGQgpPGPHSLQAGPllqHPGPPQPFGLPPQASQGQAPLGTSP-- 889
Cdd:PRK14951  439 AAAPAAVALAP---------APPAQAAPETVAIPVRVA--PEPAVASAAP---APAAAPAAARLTPTEEGDVWHATVQql 504
                         170       180
                  ....*....|....*....|.
gi 112382224  890 AAAYPHTSLqlpASQSALQSQ 910
Cdd:PRK14951  505 AAAEAITAL---ARELALQSE 522
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
723-871 8.30e-06

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 50.75  E-value: 8.30e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  723 SDSDSSAQQQMLQAQPPALQAPTGVTPAPSSAPPGTPQLPTPGPTPSATAVPPQGSPTAsqapnQPQAPTAPVPHTHIQQ 802
Cdd:PRK07764  367 ASDDERGLLARLERLERRLGVAGGAGAPAAAAPSAAAAAPAAAPAPAAAAPAAAAAPAP-----AAAPQPAPAPAPAPAP 441
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 112382224  803 APALHPQRPPSPHPPPHPSPHPPLQPLTGSAGQPSAPSHAQPPLHGQGPPGPhslQAGPllQHPGPPQP 871
Cdd:PRK07764  442 PSPAGNAPAGGAPSPPPAAAPSAQPAPAPAAAPEPTAAPAPAPPAAPAPAAA---PAAP--AAPAAPAG 505
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
747-954 1.22e-05

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 49.98  E-value: 1.22e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  747 VTPAPSSAPPGTPQLPTPGPTPSATAVP-PQGSPTASQAPNQPQAPTAPVPHTHIQQAPALHPQRPPSPHPPPHPSPHPP 825
Cdd:PRK07764  588 VGPAPGAAGGEGPPAPASSGPPEEAARPaAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGD 667
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  826 LQPLTGSAGQPSAPSHAQPPLHGQGPPGphslQAGPLLQHPGPPQPFGLPPQASQGQAPLGTSPAAAyphtslqlpaSQS 905
Cdd:PRK07764  668 GWPAKAGGAAPAAPPPAPAPAAPAAPAG----AAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASA----------PSP 733
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....*....
gi 112382224  906 ALQSQQPPREQPLPPAPLAMPHIKPPPTTPIPQLPAPQAHKHPPHLSGP 954
Cdd:PRK07764  734 AADDPVPLPPEPDDPPDPAGAPAQPPPPPAPAPAAAPAAAPPPSPPSEE 782
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
592-791 1.71e-05

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 49.78  E-value: 1.71e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  592 PASPDGRTSPINEDIRSSGRNSPSAASTSSNDSKAEtvKKSAKKVKEEASSPLKSnkRQREKVASDTEEADR----TSSK 667
Cdd:PHA03307  190 PAEPPPSTPPAAASPRPPRRSSPISASASSPAPAPG--RSAADDAGASSSDSSSS--ESSGCGWGPENECPLprpaPITL 265
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  668 KTKTQEISRPNSPSEGEGESSDSRSVNDEgSSDPKDIDQDNRSTSPSiPSPQDNESDSDSSAQQQMLQAQPPALQAPTGV 747
Cdd:PHA03307  266 PTRIWEASGWNGPSSRPGPASSSSSPRER-SPSPSPSSPGSGPAPSS-PRASSSSSSSRESSSSSTSSSSESSRGAAVSP 343
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....
gi 112382224  748 TPAPSSAPPgtpqlPTPGPTPSATAVPPQGSPTASQAPNQPQAP 791
Cdd:PHA03307  344 GPSPSRSPS-----PSRPPPPADPSSPRKRPRPSRAPSSPAASA 382
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
728-1013 1.75e-05

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 49.65  E-value: 1.75e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   728 SAQQQMLQAQPPALQAPTGVTP--APSSAPPGTPQLPT---------PGPTPSATA------VPPQgspTASQAPNQPQA 790
Cdd:pfam09770  103 NRQQPAARAAQSSAQPPASSLPqyQYASQQSQQPSKPVrtgyekykePEPIPDLQVdaslwgVAPK---KAAAPAPAPQP 179
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   791 PTAPVPHTHIQ---------------QAPAlhpqrppsphppphpsphpplqpltgsAGQPSAPSHAQPPLHGQGPPGPH 855
Cdd:pfam09770  180 AAQPASLPAPSrkmmsleeveaamraQAKK---------------------------PAQQPAPAPAQPPAAPPAQQAQQ 232
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   856 SLQAGPLLQHPGPPQPFGLPPQASQGQAPlgtspaaayPHTSLQLPASQSALQSQQPPREQplppaplamphikpppttp 935
Cdd:pfam09770  233 QQQFPPQIQQQQQPQQQPQQPQQHPGQGH---------PVTILQRPQSPQPDPAQPSIQPQ------------------- 284
                          250       260       270       280       290       300       310
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 112382224   936 ipqlpaPQAHKHPPHLSGPSPFSMNANLPPPPALKPLssLSTHHPPSAHPPPLQLMPQSQplPSSPAQPPGLTQSQNL 1013
Cdd:pfam09770  285 ------AQQFHQQPPPVPVQPTQILQNPNRLSAARVG--YPQNPQPGVQPAPAHQAHRQQ--GSFGRQAPIITHPQQL 352
PRK13042 PRK13042
superantigen-like protein SSL4; Reviewed;
710-796 3.17e-05

superantigen-like protein SSL4; Reviewed;


Pssm-ID: 183854 [Multi-domain]  Cd Length: 291  Bit Score: 47.71  E-value: 3.17e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  710 STSPSIPSPQDNESDSDSSAQQQMLQAQPPALQaPTGVTPAPSSAPPGTPQLPTPGPTPSATaVPPQGSPTASQAPNQPQ 789
Cdd:PRK13042   17 TTGVITTTTQAANATTPSSTKVEAPQSTPPSTK-VEAPQSKPNATTPPSTKVEAPQQTPNAT-TPSSTKVETPQSPTTKQ 94

                  ....*..
gi 112382224  790 APTAPVP 796
Cdd:PRK13042   95 VPTEINP 101
SANT smart00717
SANT SWI3, ADA2, N-CoR and TFIIIB'' DNA-binding domains;
396-441 3.89e-05

SANT SWI3, ADA2, N-CoR and TFIIIB'' DNA-binding domains;


Pssm-ID: 197842 [Multi-domain]  Cd Length: 49  Bit Score: 42.60  E-value: 3.89e-05
                            10        20        30        40
                    ....*....|....*....|....*....|....*....|....*..
gi 112382224    396 WTEDEVKRFVKGLRQYG-KNFFRIRKElLPNKETGELITFYYYWKKT 441
Cdd:smart00717    4 WTEEEDELLIELVKKYGkNNWEKIAKE-LPGRTAEQCRERWRNLLKP 49
SANT cd00167
'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric ...
396-439 3.95e-05

'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C-rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA.


Pssm-ID: 238096 [Multi-domain]  Cd Length: 45  Bit Score: 42.18  E-value: 3.95e-05
                          10        20        30        40
                  ....*....|....*....|....*....|....*....|....*
gi 112382224  396 WTEDEVKRFVKGLRQYG-KNFFRIRKElLPNKETGELITFYYYWK 439
Cdd:cd00167     2 WTEEEDELLLEAVKKYGkNNWEKIAKE-LPGRTPKQCRERWRNLL 45
PLN02967 PLN02967
kinase
558-687 6.54e-05

kinase


Pssm-ID: 215521 [Multi-domain]  Cd Length: 581  Bit Score: 47.35  E-value: 6.54e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  558 PVKEEDDGLSGKHSMRTRRSRgsmstlRSGRKKQPASPDGRTSPINEDIRssgrNSPSAASTSSNDSKAETVKKSA---K 634
Cdd:PLN02967   57 AVDEEPDENGAVSKKKPTRSV------KRATKKTVVEISEPLEEGSELVV----NEDAALDKESKKTPRRTRRKAAaasS 126
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|...
gi 112382224  635 KVKEEASSPLKSNKRQREKVASDTEEADRTSSKKTKTQEISRPNSPSEGEGES 687
Cdd:PLN02967  127 DVEEEKTEKKVRKRRKVKKMDEDVEDQGSESEVSDVEESEFVTSLENESEEEL 179
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
735-959 6.99e-05

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 47.56  E-value: 6.99e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  735 QAQPPALQAPTGVTPAPSSAPPgTPQLPTPGPTPSATAVPPQGSPTASQAPNQPQAPT-APVPHTHIQQAPALHPQRPPS 813
Cdd:PRK12323  371 GAGPATAAAAPVAQPAPAAAAP-AAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSpAPEALAAARQASARGPGGAPA 449
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  814 PHPPPhpsphpplqpltgsagqPSAPSHAQPPlHGQGPPGPHSLQAGPllqhPGPPQPFGLPPQASQGQAP---LGTSPA 890
Cdd:PRK12323  450 PAPAP-----------------AAAPAAAARP-AAAGPRPVAAAAAAA----PARAAPAAAPAPADDDPPPweeLPPEFA 507
                         170       180       190       200       210       220
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 112382224  891 AAYPHTSLQLPASQSALQSQQPPREQPLPPAPLAMPHIKPPPTTPIPQLPAPQAHKHPPHLSGPSPFSM 959
Cdd:PRK12323  508 SPAPAQPDAAPAGWVAESIPDPATADPDDAFETLAPAPAAAPAPRAAAATEPVVAPRPPRASASGLPDM 576
MSCRAMM_ClfA NF033609
MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial ...
580-729 7.10e-05

MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules). It is heavily studied in Staphylococcus aureus both for its biological role in adhesion and for its potential for vaccination. Features of the sequence, but also of other MSCRAMM adhesins, include a long run of Ser-Asp dipeptide repeats and a C-terminal cell wall anchoring LPXTG motif.


Pssm-ID: 468110 [Multi-domain]  Cd Length: 934  Bit Score: 47.60  E-value: 7.10e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  580 SMSTLRSGRKKQPASPDGRTSPINEDIRSSGRNSPSAASTSSNDSKAETVKKSAKKVKEEASSPLKSNKRQREKVASDTE 659
Cdd:NF033609  630 SASDSDSASDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 709
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 112382224  660 -EADRTSSKKTKTQEISRPNSPSEGEGES-SDSRSVNDEGSSDPKDIDQDNRSTSPSiPSPQDNESDSDSSA 729
Cdd:NF033609  710 sDSDSDSDSDSDSDSDSDSDSDSDSDSDSdSDSDSDSDSDSDSDSDSDSDSDSDSDS-DSDSDSDSDSDSDS 780
MSCRAMM_ClfA NF033609
MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial ...
600-798 7.93e-05

MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules). It is heavily studied in Staphylococcus aureus both for its biological role in adhesion and for its potential for vaccination. Features of the sequence, but also of other MSCRAMM adhesins, include a long run of Ser-Asp dipeptide repeats and a C-terminal cell wall anchoring LPXTG motif.


Pssm-ID: 468110 [Multi-domain]  Cd Length: 934  Bit Score: 47.60  E-value: 7.93e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  600 SPINEDIRSSGRNSPSAASTSSNDSKAETVKKSAKKVKEEASSPLKSNKRQREKVASDTE-EADRTSSKKTKTQEISRPN 678
Cdd:NF033609  704 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDsDSDSDSDSDSDSDSDSDSD 783
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  679 SPSEGEGES-SDSRSVNDEGSSDPKDIDQDNRSTSPSiPSPQDNESDSDS-----SAQQQMLQAQPPALQAPTGVTPAPS 752
Cdd:NF033609  784 SDSDSDSDSdSDSDSDSDSDSDSDSDSDSDSDSDSDS-DSDSDSDSDSDSdsdsdSDSDSDSDSDSDSDSDSDSESDSNS 862
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....*...
gi 112382224  753 SAPPGTpqlptpgptpSATAVPPQGSPTASQAPNQPQAPTA--PVPHT 798
Cdd:NF033609  863 DSESGS----------NNNVVPPNSPKNGTNASNKNEAKDSkePLPDT 900
MSCRAMM_ClfA NF033609
MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial ...
580-729 1.01e-04

MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules). It is heavily studied in Staphylococcus aureus both for its biological role in adhesion and for its potential for vaccination. Features of the sequence, but also of other MSCRAMM adhesins, include a long run of Ser-Asp dipeptide repeats and a C-terminal cell wall anchoring LPXTG motif.


Pssm-ID: 468110 [Multi-domain]  Cd Length: 934  Bit Score: 47.21  E-value: 1.01e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  580 SMSTLRSGRKKQPASPDGRTSPINEDIRSSGRNSPSAASTSSNDSKAETVKKSAKKVKEEASSPLKSNKRQREKVASDTE 659
Cdd:NF033609  650 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 729
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 112382224  660 -EADRTSSKKTKTQEISRPNSPSEGEGES---SDSRSVNDEGSSDPKDIDQDNRSTSPSiPSPQDNESDSDSSA 729
Cdd:NF033609  730 sDSDSDSDSDSDSDSDSDSDSDSDSDSDSdsdSDSDSDSDSDSDSDSDSDSDSDSDSDS-DSDSDSDSDSDSDS 802
PRK10856 PRK10856
cytoskeleton protein RodZ;
686-790 1.02e-04

cytoskeleton protein RodZ;


Pssm-ID: 236776 [Multi-domain]  Cd Length: 331  Bit Score: 46.17  E-value: 1.02e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  686 ESSDSRSVNDEGSSDPKDIDQDNRSTSPSIPSPQDNESDSDSSAQQQMLQAQPPALQAPTGVTPAPSSAPPGTPQLPTPG 765
Cdd:PRK10856  149 QSSAELSQNSGQSVPLDTSTTTDPATTPAPAAPVDTTPTNSQTPAVATAPAPAVDPQQNAVVAPSQANVDTAATPAPAAP 228
                          90       100
                  ....*....|....*....|....*
gi 112382224  766 PTPSATAVPPQGSPTASQAPNQPQA 790
Cdd:PRK10856  229 ATPDGAAPLPTDQAGVSTPAADPNA 253
Myb_DNA-binding pfam00249
Myb-like DNA-binding domain; This family contains the DNA binding domains from Myb proteins, ...
396-439 1.04e-04

Myb-like DNA-binding domain; This family contains the DNA binding domains from Myb proteins, as well as the SANT domain family.


Pssm-ID: 459731 [Multi-domain]  Cd Length: 46  Bit Score: 40.95  E-value: 1.04e-04
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....
gi 112382224   396 WTEDEVKRFVKGLRQYGKNFFRIrKELLPNKETGELITFYYYWK 439
Cdd:pfam00249    4 WTPEEDELLLEAVEKLGNRWKKI-AKLLPGRTDNQCKNRWQNYL 46
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
709-906 2.34e-04

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 45.80  E-value: 2.34e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   709 RSTSPSIPSPQDNESDSDSSAQQQMLQAQPPALQAPTGVTPAPSSAPPGTPQLPTPGPTPSATAVPPQGSPTASQAPNQP 788
Cdd:pfam09770  204 RAQAKKPAQQPAPAPAQPPAAPPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPVTILQRPQSPQPDPAQPSIQP 283
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   789 -------QAPTAPVPHTHIQQapalhpqrppsphppphpsphpplqpltgsagQPSAPSHAQPPLHGQGPPGPHslqagP 861
Cdd:pfam09770  284 qaqqfhqQPPPVPVQPTQILQ--------------------------------NPNRLSAARVGYPQNPQPGVQ-----P 326
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*
gi 112382224   862 LLQHPGPPQPFGLPPQAsqgqaplgtsPAAAYPHTSLQLPASQSA 906
Cdd:pfam09770  327 APAHQAHRQQGSFGRQA----------PIITHPQQLAQLSEEEKA 361
PTZ00108 PTZ00108
DNA topoisomerase 2-like protein; Provisional
536-708 3.22e-04

DNA topoisomerase 2-like protein; Provisional


Pssm-ID: 240271 [Multi-domain]  Cd Length: 1388  Bit Score: 45.42  E-value: 3.22e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  536 FKKYGELPPIEKPVDPPPFMFKPVKEEDDglSGKHSMRTRRSRGSMSTLRSGRKKQPASPDGRTSPINedirSSGRNSPS 615
Cdd:PTZ00108 1223 SDQEDDEEQKTKPKKSSVKRLKSKKNNSS--KSSEDNDEFSSDDLSKEGKPKNAPKRVSAVQYSPPPP----SKRPDGES 1296
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  616 AASTSSNDSKAETVKKSAKKVKEEASSPLKSNKRQREKVASDTEEADRTSSKKTKTQEISRPNSPSEGEgESSDSRSVND 695
Cdd:PTZ00108 1297 NGGSKPSSPTKKKVKKRLEGSLAALKKKKKSEKKTARKKKSKTRVKQASASQSSRLLRRPRKKKSDSSS-EDDDDSEVDD 1375
                         170
                  ....*....|...
gi 112382224  696 EGSSDPKDIDQDN 708
Cdd:PTZ00108 1376 SEDEDDEDDEDDD 1388
kgd PRK12270
multifunctional oxoglutarate decarboxylase/oxoglutarate dehydrogenase thiamine ...
691-805 4.61e-04

multifunctional oxoglutarate decarboxylase/oxoglutarate dehydrogenase thiamine pyrophosphate-binding subunit/dihydrolipoyllysine-residue succinyltransferase subunit;


Pssm-ID: 237030 [Multi-domain]  Cd Length: 1228  Bit Score: 44.88  E-value: 4.61e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  691 RSVNDEGSSDPK--DIDQDNRSTSPSIPSPQDNESDSDSSAQqqmlqaqPPALQAPTGVTPAPSSAPPGTPQlPTPGPTP 768
Cdd:PRK12270   17 QYLADPNSVDPSwrEFFADYGPGSTAAPTAAAAAAAAAASAP-------AAAPAAKAPAAPAPAPPAAAAPA-APPKPAA 88
                          90       100       110
                  ....*....|....*....|....*....|....*..
gi 112382224  769 SATAVPPQGSPTASQAPNQPQAPTAPVPHTHIQQAPA 805
Cdd:PRK12270   89 AAAAAAAPAAPPAAAAAAAPAAAAVEDEVTPLRGAAA 125
PHA03264 PHA03264
envelope glycoprotein D; Provisional
698-800 5.99e-04

envelope glycoprotein D; Provisional


Pssm-ID: 223029 [Multi-domain]  Cd Length: 416  Bit Score: 44.23  E-value: 5.99e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  698 SSDPKDIDQDNRSTSPSIPSPQDNESDSDSSAQQQmlqAQPPALQAPTGVTPAPSSAPPGTPQLPTPGPT-PSATAVPPQ 776
Cdd:PHA03264  260 ESKGYEPPPAPSGGSPAPPGDDRPEAKPEPGPVED---GAPGRETGGEGEGPEPAGRDGAAGGEPKPGPPrPAPDADRPE 336
                          90       100
                  ....*....|....*....|....
gi 112382224  777 GSPTASQAPNQPQAPTAPVPHTHI 800
Cdd:PHA03264  337 GWPSLEAITFPPPTPATPAVPRAR 360
PRK14971 PRK14971
DNA polymerase III subunit gamma/tau;
691-788 6.81e-04

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237874 [Multi-domain]  Cd Length: 614  Bit Score: 44.38  E-value: 6.81e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  691 RSVNDEGSSDPKDIDQDNRSTSPSIPSPQDNESDSDSSAQQQMLQAQPPAlqAPTGVTPAPSSAPPGTPQLPTPGPTPSA 770
Cdd:PRK14971  363 TQKGDDASGGRGPKQHIKPVFTQPAAAPQPSAAAAASPSPSQSSAAAQPS--APQSATQPAGTPPTVSVDPPAAVPVNPP 440
                          90
                  ....*....|....*...
gi 112382224  771 TAVPPQGSPTASQAPNQP 788
Cdd:PRK14971  441 STAPQAVRPAQFKEEKKI 458
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
725-1038 6.81e-04

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 44.23  E-value: 6.81e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   725 SDSSAQQQMLQAQPPAlQAPTGVTPAPSSAPPGTPQLPTPGPTPSA----TAVPPQGSPTASQAPNQPQAPTAPVPHTHI 800
Cdd:pfam09606  138 GFPSQMSRVGRMQPGG-QAGGMMQPSSGQPGSGTPNQMGPNGGPGQgqagGMNGGQQGPMGGQMPPQMGVPGMPGPADAG 216
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   801 QQAPALHPQRPPSPHPPphpsphpplqpltGSAGQPSAPSHAQPPLHgQGPPGPHSLQAGPLLQHPGPPQPFGLPPQASQ 880
Cdd:pfam09606  217 AQMGQQAQANGGMNPQQ-------------MGGAPNQVAMQQQQPQQ-QGQQSQLGMGINQMQQMPQGVGGGAGQGGPGQ 282
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   881 GQAPLGTSPAAAYPHTSLQLPASQSALQSQQPPREQPLPPaplamphikpppttpipqlpaPQAHKHPPHLSGPSPFSMN 960
Cdd:pfam09606  283 PMGPPGQQPGAMPNVMSIGDQNNYQQQQTRQQQQQQGGNH---------------------PAAHQQQMNQSVGQGGQVV 341
                          250       260       270       280       290       300       310
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 112382224   961 AnlppppaLKPLSSLSTHHPPSAHPPPLQLMPQSQPLPSSPAQPPGLTQSQNLPPPPASHPPTGLHQVAPQPPFAQHP 1038
Cdd:pfam09606  342 A-------LGGLNHLETWNPGNFGGLGANPMQRGQPGMMSSPSPVPGQQVRQVTPNQFMRQSPQPSVPSPQGPGSQPP 412
PRK10856 PRK10856
cytoskeleton protein RodZ;
729-838 8.43e-04

cytoskeleton protein RodZ;


Pssm-ID: 236776 [Multi-domain]  Cd Length: 331  Bit Score: 43.48  E-value: 8.43e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  729 AQQQMLQA---QPPALQAPTGVTPAPSSAPPGTPQLPTPGPTPSATAVPPQGSPTASQAPNQPQAP---TAPVPHTHIQQ 802
Cdd:PRK10856  138 AQQEEITTmadQSSAELSQNSGQSVPLDTSTTTDPATTPAPAAPVDTTPTNSQTPAVATAPAPAVDpqqNAVVAPSQANV 217
                          90       100       110
                  ....*....|....*....|....*....|....*.
gi 112382224  803 APALHPQRPPSPHPPPHPSPHPPLQPLTGSAGQPSA 838
Cdd:PRK10856  218 DTAATPAPAAPATPDGAAPLPTDQAGVSTPAADPNA 253
PHA03169 PHA03169
hypothetical protein; Provisional
571-798 8.48e-04

hypothetical protein; Provisional


Pssm-ID: 223003 [Multi-domain]  Cd Length: 413  Bit Score: 43.81  E-value: 8.48e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  571 SMRTRRSRGSMSTLRSGRKKQPASPDGRTspinEDIRSSGRNSPSAASTSSNDSKAETVKKSAKKVKEEASSPLKSNKRQ 650
Cdd:PHA03169    2 SRQRRKAKRSRHTLRSSCRGHCKRHGGTR----EQAGRRRGTAARAAKPAPPAPTTSGPQVRAVAEQGHRQTESDTETAE 77
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  651 REKVASDTEEADRTSSKKTKTQEISRPNSPSE-GEGESSDSRSVNDEGSSDPKDIDQDNRSTSPSIP---SPQDNESDSD 726
Cdd:PHA03169   78 ESRHGEKEERGQGGPSGSGSESVGSPTPSPSGsAEELASGLSPENTSGSSPESPASHSPPPSPPSHPgphEPAPPESHNP 157
                         170       180       190       200       210       220       230
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 112382224  727 SSAQQQMLQAQPPALQAPTGVTPAPSSAPPGTPQlptPGPTPSATAVPPQGSPTASQAPNQ-PQAPTAPVPHT 798
Cdd:PHA03169  158 SPNQQPSSFLQPSHEDSPEEPEPPTSEPEPDSPG---PPQSETPTSSPPPQSPPDEPGEPQsPTPQQAPSPNT 227
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
543-892 1.17e-03

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 43.62  E-value: 1.17e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  543 PPIEKPVDPPPfmfKPVKEEDDGLSGKHSMRTRRSRGSMSTLRSGRKKQPASPDGRTSPINEDIRSSGRNSPSAASTSSN 622
Cdd:PHA03307   63 DRFEPPTGPPP---GPGTEAPANESRSTPTWSLSTLAPASPAREGSPTPPGPSSPDPPPPTPPPASPPPSPAPDLSEMLR 139
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  623 DSKAETVKKSAKKVKEEASSPLK----SNKRQREKVASDTEEADRTSSKKTKTQEISRPN---SPSEGEGESSDSRSVND 695
Cdd:PHA03307  140 PVGSPGPPPAASPPAAGASPAAVasdaASSRQAALPLSSPEETARAPSSPPAEPPPSTPPaaaSPRPPRRSSPISASASS 219
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  696 EGSSDPKDIDQDNRSTSPSIPSPQDNESDSDSsaqqqmLQAQPPALQAPTGVTPAPSSAPPGTPQLPTPGPTPSATAVPP 775
Cdd:PHA03307  220 PAPAPGRSAADDAGASSSDSSSSESSGCGWGP------ENECPLPRPAPITLPTRIWEASGWNGPSSRPGPASSSSSPRE 293
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  776 QGSPTASQAPNQPQAPTAPvphthiqqapalhpqrppsphppphpspHPPLQPLTGSAGQPSAPSHAQPPLHGQGPPGPH 855
Cdd:PHA03307  294 RSPSPSPSSPGSGPAPSSP----------------------------RASSSSSSSRESSSSSTSSSSESSRGAAVSPGP 345
                         330       340       350
                  ....*....|....*....|....*....|....*..
gi 112382224  856 SLQAGPLLQHPGPPQPFGLPPQASQGQAPLGTSPAAA 892
Cdd:PHA03307  346 SPSRSPSPSRPPPPADPSSPRKRPRPSRAPSSPAASA 382
PRK07994 PRK07994
DNA polymerase III subunits gamma and tau; Validated
739-913 1.49e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236138 [Multi-domain]  Cd Length: 647  Bit Score: 43.32  E-value: 1.49e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  739 PALQAPTGVTPAPSSAPPGTPQLPtpgPTPSATAVPPQGSPTASQAPNQPQAPTAPVPHTHIQQAPALHPQRPpsphppp 818
Cdd:PRK07994  361 PAAPLPEPEVPPQSAAPAASAQAT---AAPTAAVAPPQAPAVPPPPASAPQQAPAVPLPETTSQLLAARQQLQ------- 430
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  819 hpsphpplqpltgSAGQPSAPSHAQPPLHGQGPPGPHSLQagPLLQHPGPPQPFGLPPQASQGQAPLGTSPAAAYPHTSL 898
Cdd:PRK07994  431 -------------RAQGATKAKKSEPAAASRARPVNSALE--RLASVRPAPSALEKAPAKKEAYRWKATNPVEVKKEPVA 495
                         170
                  ....*....|....*
gi 112382224  899 QLPASQSALQSQQPP 913
Cdd:PRK07994  496 TPKALKKALEHEKTP 510
PRK07994 PRK07994
DNA polymerase III subunits gamma and tau; Validated
713-796 1.80e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236138 [Multi-domain]  Cd Length: 647  Bit Score: 42.93  E-value: 1.80e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  713 PSIPSPQDNESD----SDSSAQQQMLQAQPPALQAPTGVTPAPSSAPPGTPQLPTPGPTPSATAVPPQGSPTASQAPNQP 788
Cdd:PRK07994  361 PAAPLPEPEVPPqsaaPAASAQATAAPTAAVAPPQAPAVPPPPASAPQQAPAVPLPETTSQLLAARQQLQRAQGATKAKK 440

                  ....*...
gi 112382224  789 QAPTAPVP 796
Cdd:PRK07994  441 SEPAAASR 448
PABP-1234 TIGR01628
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ...
756-916 2.16e-03

polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens: one expressed in testis, one in platelets, one broadly expressed, and one of unknown tissue range.


Pssm-ID: 130689 [Multi-domain]  Cd Length: 562  Bit Score: 42.49  E-value: 2.16e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   756 PGTPQLPTPGPTPSATAVPPQGSPtasqapnQPQAPTAPVPHTHIQQAPalhpqrppsphppphpsphpplqplTGSAGQ 835
Cdd:TIGR01628  380 PRMRQLPMGSPMGGAMGQPPYYGQ-------GPQQQFNGQPLGWPRMSM-------------------------MPTPMG 427
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   836 PSAPSHAQ--PPLHGQGPPGPHSLQAgpllQHPGPPQPFGLPPQASQGQAPLGTSPAAAYPHTSLQLPASQSALQSqQPP 913
Cdd:TIGR01628  428 PGGPLRPNglAPMNAVRAPSRNAQNA----AQKPPMQPVMYPPNYQSLPLSQDLPQPQSTASQGGQNKKLAQVLAS-ATP 502

                   ...
gi 112382224   914 REQ 916
Cdd:TIGR01628  503 QMQ 505
COG5373 COG5373
Uncharacterized membrane protein [Function unknown];
728-805 2.17e-03

Uncharacterized membrane protein [Function unknown];


Pssm-ID: 444140 [Multi-domain]  Cd Length: 854  Bit Score: 42.68  E-value: 2.17e-03
                          10        20        30        40        50        60        70
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 112382224  728 SAQQQMLQAQPPAlqAPTGVTPAPSSAPPGTPQLPTPGPTPSATAVPPQGSPTASQAPNQPQAPTAPVPhthiQQAPA 805
Cdd:COG5373    31 EELEAELAEAAEA--ASAPAEPEPEAAAAATAAAPEAAPAPVPEAPAAPPAAAEAPAPAAAAPPAEAEP----AAAPA 102
PRK08581 PRK08581
amidase domain-containing protein;
603-763 2.53e-03

amidase domain-containing protein;


Pssm-ID: 236304 [Multi-domain]  Cd Length: 619  Bit Score: 42.47  E-value: 2.53e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  603 NEDIRSSGRNSPSAASTSSNDSKAETVKKSAKKvkeeassplksNKRQREKVASDTEEADRTSSKKTKTQEISRPNSPSe 682
Cdd:PRK08581  136 YEQPRNSEKSTNDSNKNSDSSIKNDTDTQSSKQ-----------DKADNQKAPSSNNTKPSTSNKQPNSPKPTQPNQSN- 203
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  683 gegesSDSRSVNDEGSsdpKDIDQDNRSTSPS-IPSPQDNESDsDSSAQQQMLQAQppalqaptGVTPAPSSAPPGTPQL 761
Cdd:PRK08581  204 -----SQPASDDTANQ---KSSSKDNQSMSDSaLDSILDQYSE-DAKKTQKDYASQ--------SKKDKTETSNTKNPQL 266

                  ..
gi 112382224  762 PT 763
Cdd:PRK08581  267 PT 268
SEEEED pfam14797
Serine-rich region of AP3B1, clathrin-adaptor complex; This short low-complexity, highly ...
617-728 2.72e-03

Serine-rich region of AP3B1, clathrin-adaptor complex; This short, low-complexity, highly serine-rich region lies on clathrin-adaptor complex 3 beta-1 subunit proteins, between family Adaptin_N (pfam01602) and a C-terminal domain, AP3B1_C (pfam14796).


Pssm-ID: 434218 [Multi-domain]  Cd Length: 111  Bit Score: 39.14  E-value: 2.72e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224   617 ASTSSNDSKAETVKKSAKKVKEEASSplksnkrqrekvasdtEEADRTSSKKTKTQeisrpnSPSEGEGESSDSRSVNDE 696
Cdd:pfam14797   15 SSDSSSDSESESGSESEEEGKEGSSS----------------EDSSEDSSSEQESE------SGSESEKKRTAKRNSKAK 72
                           90       100       110
                   ....*....|....*....|....*....|..
gi 112382224   697 GSSDPKDIDQDNRSTSPSIPSPQDNESDSDSS 728
Cdd:pfam14797   73 GKSDSEDGEKKNEKSKTSDSSDTESSSSEESS 104
PRK07003 PRK07003
DNA polymerase III subunit gamma/tau;
697-805 2.88e-03

DNA polymerase III subunit gamma/tau;


Pssm-ID: 235906 [Multi-domain]  Cd Length: 830  Bit Score: 42.53  E-value: 2.88e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  697 GSSDPKDIDQDNRSTSPSIPSPQDNESDSDSSAQQQMLQAQPPALQAPTgVTPAPSSAPPGT--PQLPTPGPTPSAT-AV 773
Cdd:PRK07003  429 APAPPATADRGDDAADGDAPVPAKANARASADSRCDERDAQPPADSGSA-SAPASDAPPDAAfePAPRAAAPSAATPaAV 507
                          90       100       110
                  ....*....|....*....|....*....|..
gi 112382224  774 PPQGSPTASQAPNQPQAPTAPVPHTHiQQAPA 805
Cdd:PRK07003  508 PDARAPAAASREDAPAAAAPPAPEAR-PPTPA 538
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
716-805 2.88e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 42.28  E-value: 2.88e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  716 PSPQDNESDSDSSAQQQMLQAQPPALQAPTGVTPAPSSAPPGTPQLPTPGPTPSATAVPPQGSPTASQAPNQPQAPTAPV 795
Cdd:PRK07764  410 PAPAAAAPAAAAAPAPAAAPQPAPAPAPAPAPPSPAGNAPAGGAPSPPPAAAPSAQPAPAPAAAPEPTAAPAPAPPAAPA 489
                          90
                  ....*....|
gi 112382224  796 PHTHiQQAPA 805
Cdd:PRK07764  490 PAAA-PAAPA 498
PRK10263 PRK10263
DNA translocase FtsK; Provisional
739-914 3.32e-03

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 42.38  E-value: 3.32e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  739 PALQAPTGVTPAPSSAPPGTPQLPTPgPTPSATAVP--PQGSPTASQAPNQPQAPTAPVPHTHiQQAPALHPQRPPSPHP 816
Cdd:PRK10263  319 PVAVAAAATTATQSWAAPVEPVTQTP-PVASVDVPPaqPTVAWQPVPGPQTGEPVIAPAPEGY-PQQSQYAQPAVQYNEP 396
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  817 PPHPSPHPPLQPLTGSAGQPSAPSHAQPPLHGQGPPGPhslqaGPLLQHPGPPQPFGLPPQASQGQAPLGTSPAAAYPHT 896
Cdd:PRK10263  397 LQQPVQPQQPYYAPAAEQPAQQPYYAPAPEQPAQQPYY-----APAPEQPVAGNAWQAEEQQSTFAPQSTYQTEQTYQQP 471
                         170
                  ....*....|....*...
gi 112382224  897 SLQLPASQSALQSQQPPR 914
Cdd:PRK10263  472 AAQEPLYQQPQPVEQQPV 489
PRK10856 PRK10856
cytoskeleton protein RodZ;
717-805 3.53e-03

cytoskeleton protein RodZ;


Pssm-ID: 236776 [Multi-domain]  Cd Length: 331  Bit Score: 41.55  E-value: 3.53e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  717 SPQDNES---DSDSSAQQQMLQAQPPALQAptgvTPAPSSAPPGTPQlPTPGPTPSATAVPPQGSptASQAPNQPQAPTA 793
Cdd:PRK10856  155 SQNSGQSvplDTSTTTDPATTPAPAAPVDT----TPTNSQTPAVATA-PAPAVDPQQNAVVAPSQ--ANVDTAATPAPAA 227
                          90
                  ....*....|..
gi 112382224  794 PVPHTHIQQAPA 805
Cdd:PRK10856  228 PATPDGAAPLPT 239
PRK07994 PRK07994
DNA polymerase III subunits gamma and tau; Validated
735-869 6.50e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236138 [Multi-domain]  Cd Length: 647  Bit Score: 41.00  E-value: 6.50e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  735 QAQPPALQAPTGVTPAPSSAPPGTPQLPTPGPTPSATAVPPQGSPT--ASQAPNQPQAPTAPVPHTHIQQAPALHPQRPP 812
Cdd:PRK07994  374 SAAPAASAQATAAPTAAVAPPQAPAVPPPPASAPQQAPAVPLPETTsqLLAARQQLQRAQGATKAKKSEPAAASRARPVN 453
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|....*..
gi 112382224  813 SPHPPPHPSPHPPLQPLTGSAGQPSAPSHAQPPLHGQGPPGPHSLQAGPLLQHPGPP 869
Cdd:PRK07994  454 SALERLASVRPAPSALEKAPAKKEAYRWKATNPVEVKKEPVATPKALKKALEHEKTP 510
PHA03247 PHA03247
large tegument protein UL36; Provisional
737-1292 7.88e-03

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 41.08  E-value: 7.88e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  737 QPPALQAPTGVTPAP---SSAPPGTPQLPTPG-PTPSATAVPPQGSPTASQ-----------APNQPQAPTAPVPhthiq 801
Cdd:PHA03247 2482 RPAEARFPFAAGAAPdpgGGGPPDPDAPPAPSrLAPAILPDEPVGEPVHPRmltwirgleelASDDAGDPPPPLP----- 2556
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  802 qaPALHPQRPPSPHPPPHPSPHPPLQPLTGSAGQPSAPSHAQPPLHGQGPPGPHslqagpllqhPGPPQPFGLPPQAsqg 881
Cdd:PHA03247 2557 --PAAPPAAPDRSVPPPRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDP----------RGPAPPSPLPPDT--- 2621
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  882 qAPLGTSPAAAYPHTSLqlPASQSALQSQQPPREQPLPPAPLAMPHIKPPPTTPIPQLPAPQAHKHPPHLSGP-SPFSMN 960
Cdd:PHA03247 2622 -HAPDPPPPSPSPAANE--PDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPTvGSLTSL 2698
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  961 ANLPPppalkplsslsthhppsahppplqlmPQSQPLPSSPAQPPGLTQSQNLPPPPASHPPTGLHQVAPQPPFAQ-HPF 1039
Cdd:PHA03247 2699 ADPPP--------------------------PPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPaTPG 2752
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224 1040 VPGGPPPITPPTCPSTSTPPAGPGTSAqPPCSGAAASGGSIAGGSSCPLPTVQIKEEALDDAEEPESPPPPPRSPSPEPT 1119
Cdd:PHA03247 2753 GPARPARPPTTAGPPAPAPPAAPAAGP-PRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPP 2831
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224 1120 VVDTP-SHASQSARFYKHLDRGYNSCARTDLYFMPLAGSKLAKKREEAIEKAKREAEQKAREEREREKEKEKERERERER 1198
Cdd:PHA03247 2832 TSAQPtAPPPPPGPPPPSLPLGGSVAPGGDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPPQP 2911
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224 1199 EREAERAAKASSSAHEgrLSDPQLSGPGhmRPSFEPPPTTIAAVPPYIGPDTPALR----TLSEYARPHVMSPTNRNhPF 1274
Cdd:PHA03247 2912 QAPPPPQPQPQPPPPP--QPQPPPPPPP--RPQPPLAPTTDPAGAGEPSGAVPQPWlgalVPGRVAVPRFRVPQPAP-SR 2986
                         570
                  ....*....|....*...
gi 112382224 1275 YMPLNPTDPLLAYHMPGL 1292
Cdd:PHA03247 2987 EAPASSTPPLTGHSLSRV 3004
PRK13729 PRK13729
conjugal transfer pilus assembly protein TraB; Provisional
722-804 8.06e-03

conjugal transfer pilus assembly protein TraB; Provisional


Pssm-ID: 184281 [Multi-domain]  Cd Length: 475  Bit Score: 40.58  E-value: 8.06e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  722 ESDSDSSAQQQMLQAQPPALQAPTGvTPAPSSAPPGTPQLPTPGPTPSATAVPPQGSPT-ASQAPNQPQAPTAPVPHTHI 800
Cdd:PRK13729  108 KLGQDNAALAEQVKALGANPVTATG-EPVPQMPASPPGPEGEPQPGNTPVSFPPQGSVAvPPPTAFYPGNGVTPPPQVTY 186

                  ....
gi 112382224  801 QQAP 804
Cdd:PRK13729  187 QSVP 190
PRK14950 PRK14950
DNA polymerase III subunits gamma and tau; Provisional
711-796 8.48e-03

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237864 [Multi-domain]  Cd Length: 585  Bit Score: 40.56  E-value: 8.48e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  711 TSPSIPSPQDNESDSDSSAQQQMLQAQPPALQAPTGVTPAPSSAPPGTPqlPTPGPTPSATAVP---PQGSPTASQAPNQ 787
Cdd:PRK14950  361 VPVPAPQPAKPTAAAPSPVRPTPAPSTRPKAAAAANIPPKEPVRETATP--PPVPPRPVAPPVPhtpESAPKLTRAAIPV 438

                  ....*....
gi 112382224  788 PQAPTAPVP 796
Cdd:PRK14950  439 DEKPKYTPP 447
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
658-1006 9.42e-03

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 40.92  E-value: 9.42e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  658 TEEADRTSSKKTKTQEISRPNSPSEGEGESSDSRSVNDEGSSDPKDIDQDNRSTSPSIPSPQDNESDSDSSAQQQMLQAQ 737
Cdd:PHA03307   77 TEAPANESRSTPTWSLSTLAPASPAREGSPTPPGPSSPDPPPPTPPPASPPPSPAPDLSEMLRPVGSPGPPPAASPPAAG 156
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  738 PPALQAPTGVTPAPSSAPPgTPQLPTPGPTPSATAVPPQGSPTASQAPNQPQAPTAPVPHTHIQQAPALHPQRPPSPHPP 817
Cdd:PHA03307  157 ASPAAVASDAASSRQAALP-LSSPEETARAPSSPPAEPPPSTPPAAASPRPPRRSSPISASASSPAPAPGRSAADDAGAS 235
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  818 PHPSPHPPLQP----LTGSAGQPSAPSHAQPPLHGQGPPGPHSLQAGPLLQHPGPPQPFGLPPQASQGQAPLGTSPAAAY 893
Cdd:PHA03307  236 SSDSSSSESSGcgwgPENECPLPRPAPITLPTRIWEASGWNGPSSRPGPASSSSSPRERSPSPSPSSPGSGPAPSSPRAS 315
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 112382224  894 PHTSLQLPASQSALQSQQPPREQPLPPAPLAMPHIKPPPTTPIPQLPAPQAHKHPPHLSGPSP----------------- 956
Cdd:PHA03307  316 SSSSSSRESSSSSTSSSSESSRGAAVSPGPSPSRSPSPSRPPPPADPSSPRKRPRPSRAPSSPaasagrptrrraraava 395
                         330       340       350       360       370
                  ....*....|....*....|....*....|....*....|....*....|....
gi 112382224  957 ---FSMNANLPPPPALKPLSSLSTHHPPSAHPPPLQL-MPQSQPLPSSPAQPPG 1006
Cdd:PHA03307  396 graRRRDATGRFPAGRPRPSPLDAGAASGAFYARYPLlTPSGEPWPGSPPPPPG 449
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd   Low complexity filter: no   Composition Based Adjustment: yes   E-value threshold: 0.01
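
Every hit in the list above carries a statistics line (Pssm-ID, Cd Length, Bit Score, E-value), and all of them fall under the 0.01 E-value threshold given in the preset options. As a minimal sketch, the statistics line can be parsed and checked against that threshold as shown below. The function names are hypothetical, the regular expression is inferred only from the lines in this listing, and whether the service treats the cutoff as inclusive is an assumption.

```python
import re

# Pattern inferred from statistics lines in this listing, e.g.
# "Pssm-ID: 459731 [Multi-domain]  Cd Length: 46  Bit Score: 40.95  E-value: 1.04e-04"
STATS_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*(?:\[(?P<kind>[^\]]*)\])?\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)

def parse_stats_line(line):
    """Parse one statistics line into a dict, or return None if it does not match."""
    m = STATS_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "kind": m.group("kind"),
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }

def passes_threshold(hit, threshold=0.01):
    """Assumed rule: a hit is reported when its E-value does not exceed the threshold."""
    return hit["e_value"] <= threshold

myb = parse_stats_line(
    "Pssm-ID: 459731 [Multi-domain]  Cd Length: 46  Bit Score: 40.95  E-value: 1.04e-04"
)
weakest = parse_stats_line(
    "Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 40.92  E-value: 9.42e-03"
)
```

Both example hits, including the weakest one listed (E-value 9.42e-03), satisfy the 0.01 cutoff under this assumed rule.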

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D1):D384-D388.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D1):D265-D268.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D1):D200-D203.