NCBI Conserved Domain Search

Conserved domains on [gi|1034630806|ref|XP_016861063|]

cAMP-regulated phosphoprotein 21 isoform X1 [Homo sapiens]

Protein Classification

R3H_encore_like and SUZ domain-containing protein (domain architecture ID 12927641)

protein containing domains R3H_encore_like, SUZ, and PAT1

List of domain hits

Name

Accession

Description

Interval

E-value

R3H_encore_like

cd02642

R3H domain of encore-like and DIP1-like proteins. Drosophila encore is involved in the ...

163-224

8.40e-26

R3H domain of encore-like and DIP1-like proteins. Drosophila encore is involved in the germline exit after four mitotic divisions, by facilitating SCF-ubiquitin-proteasome-dependent proteolysis. Maize DBF1-interactor protein 1 (DIP1) containing an R3H domain is a potential regulator of DBF1 activity in stress responses. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner.

Pssm-ID: 100071 Cd Length: 63 Bit Score: 100.75 E-value: 8.40e-26

                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1034630806 163 DRMILLKMEQEIIDFIADNNNHYKKFPQMSSYQRMLVHRVAAYFGLDHNVDQTG-KSVIINKT 224
Cdd:cd02642     1 DRLFVLKLEKDLLAFIKDSTRQSLELPPMNSYYRLLAHRVAQYYGLDHNVDNSGgKCVIVNKT 63
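The two alignment rows above can be compared column by column. The following is a minimal sketch, not part of the CD-Search output: the function name `alignment_stats` and the rule of treating any column with a dash as a gap column are assumptions made for illustration, and residues are compared case-insensitively.

```python
def alignment_stats(query: str, subject: str) -> dict:
    """Summarize one pairwise alignment given as two equal-length rows."""
    if len(query) != len(subject):
        raise ValueError("alignment rows must have the same number of columns")
    gap_columns = 0
    identities = 0
    for q, s in zip(query, subject):
        if q == "-" or s == "-":
            gap_columns += 1      # a dash in either row marks a gap column
        elif q.upper() == s.upper():
            identities += 1       # residues are compared case-insensitively
    columns = len(query)
    compared = columns - gap_columns
    return {
        "columns": columns,
        "gap_columns": gap_columns,
        "compared": compared,
        "identities": identities,
        "percent_identity": 100.0 * identities / compared if compared else 0.0,
    }


# Rows copied from the cd02642 alignment above, with labels and coordinates stripped.
QUERY = "DRMILLKMEQEIIDFIADNNNHYKKFPQMSSYQRMLVHRVAAYFGLDHNVDQTG-KSVIINKT"
SUBJECT = "DRLFVLKLEKDLLAFIKDSTRQSLELPPMNSYYRLLAHRVAQYYGLDHNVDNSGgKCVIVNKT"

if __name__ == "__main__":
    print(alignment_stats(QUERY, SUBJECT))
```

The same function can be applied to any other query/Cdd row pair in this report once the leading labels and the start and end coordinates are removed.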

SUZ

pfam12752

SUZ domain; The SUZ domain is a conserved RNA-binding domain found in eukaryotes and enriched ...

245-300

2.59e-13

SUZ domain; The SUZ domain is a conserved RNA-binding domain found in eukaryotes and enriched in positively charged amino acids. It was first characterized in the C. elegans protein Szy-20, where it has been shown to bind RNAs and allow their localization to the centrosome. Warning: the domain has a compositionally biased character.

Pssm-ID: 463689 [Multi-domain] Cd Length: 56 Bit Score: 65.04 E-value: 2.59e-13

                          10        20        30        40        50
                  ....*....|....*....|....*....|....*....|....*....|....*.
gi 1034630806 245 ESQKRFILKRDNSSIDKEDNQQNRMHPFRDDRRSKSIEEREEEYQRVRERIFAHDS 300
Cdd:pfam12752   1 PPPKMKILRRPSSGSSSSSSAGSSGASSSSGSDSKTLEEREAEYAEARARIFGSSE 56
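Each hit in this report carries a one-line summary of the form `Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...`, optionally with a `[Multi-domain]` flag. A minimal sketch of turning that line into typed fields is shown below; the regular expression and the name `parse_summary` are illustrative assumptions based only on the lines visible in this report, not an official parser.

```python
import re

# Pattern inferred from the summary lines in this report (an assumption).
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*"
    r"(?P<multi>\[Multi-domain\])?\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line: str) -> dict:
    """Extract the numeric fields from one hit summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a hit summary line: {line!r}")
    return {
        "pssm_id": int(match.group("pssm_id")),
        "multi_domain": match.group("multi") is not None,
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


if __name__ == "__main__":
    line = "Pssm-ID: 463689 [Multi-domain] Cd Length: 56 Bit Score: 65.04 E-value: 2.59e-13"
    print(parse_summary(line))
```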

PAT1 super family

cl37801

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...

608-795

6.37e-04

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.

The actual alignment was detected with superfamily member pfam09770:

Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 43.49 E-value: 6.37e-04

                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806 608 PPSGPVYPSSLmPQPAQQPSYVIASTG-----QQLPTGGFSGSGPPISQQVLQPPPSPqgfvQQPPPAQMPVYYYPSGQY 682
Cdd:pfam09770 166 APKKAAAPAPA-PQPAAQPASLPAPSRkmmslEEVEAAMRAQAKKPAQQPAPAPAQPP----AAPPAQQAQQQQQFPPQI 240

                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806 683 PTSTTQQYRPMAPVQYNAQrsqqmpqaaqqaGYQPVLSGQQGFQGLIGVQQPPQSQNVINNQQGTPVQS---------VM 753
Cdd:pfam09770 241 QQQQQPQQQPQQPQQHPGQ------------GHPVTILQRPQSPQPDPAQPSIQPQAQQFHQQPPPVPVqptqilqnpNR 308

                         170       180       190       200
                  ....*....|....*....|....*....|....*....|..
gi 1034630806 754 VSYPTMSSYQVPMTQGSQGLPQQSYQQPIMLPNQAGQGSLPA 795
Cdd:pfam09770 309 LSAARVGYPQNPQPGVQPAPAHQAHRQQGSFGRQAPIITHPQ 350

PHA03247 super family

cl33720

large tegument protein UL36; Provisional

428-695

1.49e-03

large tegument protein UL36; Provisional

The actual alignment was detected with superfamily member PHA03247:

Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 42.62 E-value: 1.49e-03

                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806  428 PPLQSTPLVSGVAAGSPGCVPYPENGIGGQVAPSSTSYILLPLEAATGIPPGSillnphtgqpfVNPDGTPAIYNPPTSQ 507
Cdd:PHA03247  2701 PPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGG-----------PARPARPPTTAGPPAP 2769

                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806  508 QPLRSAMVGQSQQQPPQQQPSPQPQQQVQPPQPQMAGPlvTQSVQGLQASSQSVQYPAVSFPPqhllPVSPTQhfpmrdd 587
Cdd:PHA03247  2770 APPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADP--PAAVLAPAAALPPAASPAGPLPP----PTSAQP------- 2836

                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806  588 vatqfgqmtlsrqssgETPEPPSGPVyPSSLMPQPAQQPSYVIASTG--QQLPTGGFSGSGPPISQqVLQPPPSPQGFVQ 665
Cdd:PHA03247  2837 ----------------TAPPPPPGPP-PPSLPLGGSVAPGGDVRRRPpsRSPAAKPAAPARPPVRR-LARPAVSRSTESF 2898

                          250       260       270
                   ....*....|....*....|....*....|.
gi 1034630806  666 -QPPPAQMPVYYYPSGQYPTSTTQQYRPMAP 695
Cdd:PHA03247  2899 aLPPDQPERPPQPQAPPPPQPQPQPPPPPQP 2929

Name

Accession

Description

Interval

E-value

R3H_encore_like

cd02642

R3H domain of encore-like and DIP1-like proteins. Drosophila encore is involved in the ...

163-224

8.40e-26

Pssm-ID: 100071 Cd Length: 63 Bit Score: 100.75 E-value: 8.40e-26

                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1034630806 163 DRMILLKMEQEIIDFIADNNNHYKKFPQMSSYQRMLVHRVAAYFGLDHNVDQTG-KSVIINKT 224
Cdd:cd02642     1 DRLFVLKLEKDLLAFIKDSTRQSLELPPMNSYYRLLAHRVAQYYGLDHNVDNSGgKCVIVNKT 63

R3H

smart00393

Putative single-stranded nucleic acids-binding domain;

147-224

4.43e-14

Putative single-stranded nucleic acids-binding domain;

Pssm-ID: 214647 Cd Length: 79 Bit Score: 68.10 E-value: 4.43e-14

                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806  147 IDLHEFLINTLKNNSRDRMILLKMEQEIIDFIAdNNNHYKKFPQMSSYQRMLVHRVAAYFGLDHNVDQTG--KSVIINKT 224
Cdd:smart00393   1 ADFLPVTLDALSYRPRRREELIELELEIARFVK-STKESVELPPMNSYERKIVHELAEKYGLESESFGEGpkRRVVISKK 79

SUZ

pfam12752

SUZ domain; The SUZ domain is a conserved RNA-binding domain found in eukaryotes and enriched ...

245-300

2.59e-13

Pssm-ID: 463689 [Multi-domain] Cd Length: 56 Bit Score: 65.04 E-value: 2.59e-13

                          10        20        30        40        50
                  ....*....|....*....|....*....|....*....|....*....|....*.
gi 1034630806 245 ESQKRFILKRDNSSIDKEDNQQNRMHPFRDDRRSKSIEEREEEYQRVRERIFAHDS 300
Cdd:pfam12752   1 PPPKMKILRRPSSGSSSSSSAGSSGASSSSGSDSKTLEEREAEYAEARARIFGSSE 56

R3H

pfam01424

R3H domain; The name of the R3H domain comes from the characteristic spacing of the most ...

165-223

3.80e-12

R3H domain; The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to be binding ssDNA.

Pssm-ID: 460206 Cd Length: 60 Bit Score: 61.74 E-value: 3.80e-12

                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1034630806 165 MILLKMEQEIIDFIADNNNHYKkFPQMSSYQRMLVHRVAAYFGLDHNV--DQTGKSVIINK 223
Cdd:pfam01424   1 EFLEQLAEKLAEFVKDTGKSLE-LPPMSSYERRIIHELAQKYGLESESegEEPNRRVVVYK 60

PAT1

pfam09770

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...

608-795

6.37e-04

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.

Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 43.49 E-value: 6.37e-04

                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806 608 PPSGPVYPSSLmPQPAQQPSYVIASTG-----QQLPTGGFSGSGPPISQQVLQPPPSPqgfvQQPPPAQMPVYYYPSGQY 682
Cdd:pfam09770 166 APKKAAAPAPA-PQPAAQPASLPAPSRkmmslEEVEAAMRAQAKKPAQQPAPAPAQPP----AAPPAQQAQQQQQFPPQI 240

                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806 683 PTSTTQQYRPMAPVQYNAQrsqqmpqaaqqaGYQPVLSGQQGFQGLIGVQQPPQSQNVINNQQGTPVQS---------VM 753
Cdd:pfam09770 241 QQQQQPQQQPQQPQQHPGQ------------GHPVTILQRPQSPQPDPAQPSIQPQAQQFHQQPPPVPVqptqilqnpNR 308

                         170       180       190       200
                  ....*....|....*....|....*....|....*....|..
gi 1034630806 754 VSYPTMSSYQVPMTQGSQGLPQQSYQQPIMLPNQAGQGSLPA 795
Cdd:pfam09770 309 LSAARVGYPQNPQPGVQPAPAHQAHRQQGSFGRQAPIITHPQ 350

PHA03247

large tegument protein UL36; Provisional

428-695

1.49e-03

large tegument protein UL36; Provisional

Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 42.62 E-value: 1.49e-03

                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806  428 PPLQSTPLVSGVAAGSPGCVPYPENGIGGQVAPSSTSYILLPLEAATGIPPGSillnphtgqpfVNPDGTPAIYNPPTSQ 507
Cdd:PHA03247  2701 PPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGG-----------PARPARPPTTAGPPAP 2769

                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806  508 QPLRSAMVGQSQQQPPQQQPSPQPQQQVQPPQPQMAGPlvTQSVQGLQASSQSVQYPAVSFPPqhllPVSPTQhfpmrdd 587
Cdd:PHA03247  2770 APPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADP--PAAVLAPAAALPPAASPAGPLPP----PTSAQP------- 2836

                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806  588 vatqfgqmtlsrqssgETPEPPSGPVyPSSLMPQPAQQPSYVIASTG--QQLPTGGFSGSGPPISQqVLQPPPSPQGFVQ 665
Cdd:PHA03247  2837 ----------------TAPPPPPGPP-PPSLPLGGSVAPGGDVRRRPpsRSPAAKPAAPARPPVRR-LARPAVSRSTESF 2898

                          250       260       270
                   ....*....|....*....|....*....|.
gi 1034630806  666 -QPPPAQMPVYYYPSGQYPTSTTQQYRPMAP 695
Cdd:PHA03247  2899 aLPPDQPERPPQPQAPPPPQPQPQPPPPPQP 2929

PRK10263

DNA translocase FtsK; Provisional

609-801

3.49e-03

DNA translocase FtsK; Provisional

Pssm-ID: 236669 [Multi-domain] Cd Length: 1355 Bit Score: 41.22 E-value: 3.49e-03

                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806  609 PSGPVYPSSLMPQPAQQPSYVIASTgQQLPtggfsgsGPPISQQVLQPppSPQGFVQQPPPAQmpvyyyPSGQYPTSTTQ 688
Cdd:PRK10263   336 PVEPVTQTPPVASVDVPPAQPTVAW-QPVP-------GPQTGEPVIAP--APEGYPQQSQYAQ------PAVQYNEPLQQ 399

                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806  689 QYRPMAPVQYNAQRSQQMPQAAQQAGYQPVLSGQQGFQgligVQQPPQSQNVINNQQGTPVQSvMVSYPTMSSYQVPMTQ 768
Cdd:PRK10263   400 PVQPQQPYYAPAAEQPAQQPYYAPAPEQPAQQPYYAPA----PEQPVAGNAWQAEEQQSTFAP-QSTYQTEQTYQQPAAQ 474

                          170       180       190
                   ....*....|....*....|....*....|...
gi 1034630806  769 GSQGLPQQSYQQPIMLPNQAGQGSLPATGMPVY 801
Cdd:PRK10263   475 EPLYQQPQPVEQQPVVEPEPVVEETKPARPPLY 507

SP1-4_arthropods_N

cd22553

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ...

578-821

4.56e-03

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.
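The degenerate consensus NNRCRCCYY quoted in this description uses ambiguity codes (N, R, Y) that expand to sets of nucleotides. As a rough illustration only, the sketch below expands such a motif into a regular expression and scans a single DNA strand; the helper names and the test sequence are hypothetical, only the symbols defined in the description are supported, and reverse-complement matching is not attempted.

```python
import re

# Only the symbols named in the description: N = any base, R = A/G, Y = C/T.
AMBIGUITY = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}


def motif_to_regex(motif: str) -> re.Pattern:
    """Expand a degenerate motif into a compiled regular expression."""
    return re.compile("".join(AMBIGUITY[symbol] for symbol in motif.upper()))


def find_motif(motif: str, sequence: str) -> list:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    pattern = motif_to_regex(motif).pattern
    # A lookahead keeps the scan position moving one base at a time,
    # so overlapping occurrences are reported as well.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", sequence.upper())]


if __name__ == "__main__":
    # Hypothetical sequence constructed for this example, not taken from the report.
    print(find_motif("NNRCRCCYY", "TTAAGCACCCTGG"))
```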

Pssm-ID: 411778 [Multi-domain] Cd Length: 384 Bit Score: 40.39 E-value: 4.56e-03

                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806 578 PTQHFPMRDDVATQFGQMTLSRQSSGETPEPPSGPVYPSSLMPQ--PAQQPsyVIASTGQ------QLPTGGFSGSGPPI 649
Cdd:cd22553   113 ANQQTLIRPNTVQGQANASNVLQNIAQIASGGNAVQLPLNNMTQtiPVQVP--VSTANGQtvyqtiQVPIQAIQSGNAGG 190

                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806 650 SQQVLQPPPSPQgfVQQPPPAQMPVYYYPSGQ-----YPTSTTQQYRPMAPVQYNaQRSQQMPQAAQQAGYQPVLSGQQG 724
Cdd:cd22553   191 GNQALQAQVIPQ--LAQAAQLQPQQLAQVSSQgyiqqIPANASQQQPQMVQQGPN-QSGQIIGQVASASSIQAAAIPLTV 267

                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806 725 FQGLIGvqqppqSQNVINNQQGTPVQSV--------MVSYPTMSSYQVPMTQGSQGLPQQSYQQPIMLPNQAGQGSLPAT 796
Cdd:cd22553   268 YTGALA------GQNGSNQQQVGQIVTSpiqgmtqgLTAPASSSIPTVVQQQAIQGNPLPPGTQIIAAGQQLQQDPNDPT 341

                         250       260
                  ....*....|....*....|....*
gi 1034630806 797 GMPVYCNVTPPTPQnNLRLIGPHCP 821
Cdd:cd22553   342 KWQVVADGTPGSKK-RLRRVACTCP 365

R3H

cd02325

R3H domain. The name of the R3H domain comes from the characteristic spacing of the most ...

167-223

1.89e-12

R3H domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. R3H domains are found in proteins together with ATPase domains, SF1 helicase domains, SF2 DEAH helicase domains, Cys-rich repeats, ring-type zinc fingers, and KH domains. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner.

Pssm-ID: 100064 Cd Length: 59 Bit Score: 62.63 E-value: 1.89e-12

                          10        20        30        40        50
                  ....*....|....*....|....*....|....*....|....*....|....*....
gi 1034630806 167 LLKMEQEIIDFIADNNNHYKKFPQMSSYQRMLVHRVAAYFGLDHNVDQTG--KSVIINK 223
Cdd:cd02325     1 REEREEELEAFAKDAAGKSLELPPMNSYERKLIHDLAEYYGLKSESEGEGpnRRVVITK 59

PHA03247

large tegument protein UL36; Provisional

425-844

1.54e-03

large tegument protein UL36; Provisional

Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 42.62 E-value: 1.54e-03

                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806  425 RTHPPLQSTPLVSGVAAGS----PGCVPYPENGIG---------GQVAPSStsyilLPLEAATGIPPgsillnPHTGQPF 491
Cdd:PHA03247  2566 RSVPPPRPAPRPSEPAVTSrarrPDAPPQSARPRApvddrgdprGPAPPSP-----LPPDTHAPDPP------PPSPSPA 2634

                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806  492 VNPDGTPAIYNPPTSQQPLRSAMVGQSQQQPPQQQPSPQPQQQVQPPQ-PQMAGPLVTQSVQGL------QASSQSVQYP 564
Cdd:PHA03247  2635 ANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRpRRRAARPTVGSLTSLadppppPPTPEPAPHA 2714

                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806  565 AVSFPPQHLLPVSPTQHFPMR----------DDVATQFGQMTLSRQSSGETPEPPSGPVYPSS----------------- 617
Cdd:PHA03247  2715 LVSATPLPPGPAAARQASPALpaapappavpAGPATPGGPARPARPPTTAGPPAPAPPAAPAAgpprrltrpavaslses 2794

                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806  618 -----LMPQPAQQPSYVIASTGQQLPTGGFSGSGPPISQQVLQPPPSPQGFVqqPPPAQMPVYYYPSGQY---PTSTTQQ 689
Cdd:PHA03247  2795 reslpSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPP--PPSLPLGGSVAPGGDVrrrPPSRSPA 2872

                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806  690 YRPMAPVQYNAQRSQqmpqaaqqagyQPVLSGQQGFQGLIGVQQPPQSQNVINNQQGTPVQSVMVSYPTMSsyqvPMTQG 769
Cdd:PHA03247  2873 AKPAAPARPPVRRLA-----------RPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPP----PPPPP 2937

                          410       420       430       440       450       460       470
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1034630806  770 SQGLPQQSYQQPIMLPNQAGQGSLPATGMPVYCNVTPPTpqnnlRLIGPHCPSSTVPVMSASCRTNCASMSNAGW 844
Cdd:PHA03247  2938 RPQPPLAPTTDPAGAGEPSGAVPQPWLGALVPGRVAVPR-----FRVPQPAPSREAPASSTPPLTGHSLSRVSSW 3007

Atrophin-1

pfam03154

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...

507-845

2.13e-03

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.

Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 41.68 E-value: 2.13e-03

                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806 507 QQPLRSAMVGQSQQQPPQQQPSPQPQQQVQPPQPQMAGPLVTQSVQGLQASSQSVQYPAVSFPPQHLLPVSPTQHFPMRD 586
Cdd:pfam03154 164 QQILQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTLIQQTPTLHPQRLP 243

                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806 587 DVATQFGQMTLSRQSSGETPEPPSGPVY--PSSLMPQPAQ------------QPSYVIASTGQ-QLPTGGFSGSGPPISQ 651
Cdd:pfam03154 244 SPHPPLQPMTQPPPPSQVSPQPLPQPSLhgQMPPMPHSLQtgpshmqhpvppQPFPLTPQSSQsQVPPGPSPAAPGQSQQ 323

                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806 652 QVLQPPPSPQGFVQQP------PPAQMPVyyyPSGQYPTSTtqqyrPMAPVQyNAQRSQqmpqaaqqagYQPVLSGQQGF 725
Cdd:pfam03154 324 RIHTPPSQSQLQSQQPpreqplPPAPLSM---PHIKPPPTT-----PIPQLP-NPQSHK----------HPPHLSGPSPF 384

                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806 726 QGLIGVQQPPQSQnvinnqqgtPVQSVMVSYPTmSSYQVP---MTQgSQGLPQQSYQQPIMLPNQ---AGQGSLPATGMP 799
Cdd:pfam03154 385 QMNSNLPPPPALK---------PLSSLSTHHPP-SAHPPPlqlMPQ-SQQLPPPPAQPPVLTQSQslpPPAASHPPTSGL 453

                         330       340       350       360
                  ....*....|....*....|....*....|....*....|....*.
gi 1034630806 800 VYCNVTPPTPQNnlrligPHCPSSTVPVMSASCRTNCASMSNAGWQ 845
Cdd:pfam03154 454 HQVPSQSPFPQH------PFVPGGPPPITPPSGPPTSTSSAMPGIQ 493

Atrophin-1

pfam03154

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...

485-809

2.66e-03

Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 41.68 E-value: 2.66e-03

                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806 485 PHTGQPFVNPDGTPAIYNPPTSQQPLRSAMvgQSQQQPPQQQPSPQPQQQVQPPQPQMAGPLVTQSVQGLQASSQSVQYP 564
Cdd:pfam03154 199 PTPSAPSVPPQGSPATSQPPNQTQSTAAPH--TLIQQTPTLHPQRLPSPHPPLQPMTQPPPPSQVSPQPLPQPSLHGQMP 276

                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806 565 AVSFP----PQHLLPVSPTQHFPMRDDVATQFGQMTLSRQSSGETPEPPSGPvyPSSLMPQPAQQPSYviastgQQLPTG 640
Cdd:pfam03154 277 PMPHSlqtgPSHMQHPVPPQPFPLTPQSSQSQVPPGPSPAAPGQSQQRIHTP--PSQSQLQSQQPPRE------QPLPPA 348

                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806 641 gfsgsgpPISQQVLQPPPS--------PQG-----FVQQPPPAQMPVYY-YPSGQYPTSTTQQYRPMAPVQYNAQRSQQM 706
Cdd:pfam03154 349 -------PLSMPHIKPPPTtpipqlpnPQShkhppHLSGPSPFQMNSNLpPPPALKPLSSLSTHHPPSAHPPPLQLMPQS 421

                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806 707 PQAAQQAGYQPVLSGQQGFQGLIGVQQPPQSQNVINNQQGTPVQSVMVSYPT-MSSYQVPMTQGSQGLPqqSYQQPIMLP 785
Cdd:pfam03154 422 QQLPPPPAQPPVLTQSQSLPPPAASHPPTSGLHQVPSQSPFPQHPFVPGGPPpITPPSGPPTSTSSAMP--GIQPPSSAS 499

                         330       340
                  ....*....|....*....|....
gi 1034630806 786 nqagqgslPATGMPVYCNVTPPTP 809
Cdd:pfam03154 500 --------VSSSGPVPAAVSCPLP 515

R3H_unknown_2

cd06006

R3H domain of a group of fungal proteins with unknown function. The name of the R3H domain ...

169-209

2.88e-03

R3H domain of a group of fungal proteins with unknown function. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner.

Pssm-ID: 100076 Cd Length: 59 Bit Score: 36.58 E-value: 2.88e-03

                          10        20        30        40
                  ....*....|....*....|....*....|....*....|.
gi 1034630806 169 KMEQEIIDFIADNNNHYKKFPQMSSYQRMLVHRVAAYFGLD 209
Cdd:cd06006     3 QIESTLRKFINDKSKRSLRFPPMRSPQRAFIHELAKDYGLY 43

PRK14971

DNA polymerase III subunit gamma/tau;

601-697

5.68e-03

DNA polymerase III subunit gamma/tau;

Pssm-ID: 237874 [Multi-domain] Cd Length: 614 Bit Score: 40.14 E-value: 5.68e-03

                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1034630806 601 SSGETPEPPSGP---VYPSSLMPQPAQQPSYVIASTGQQLPTGgfSGSGPPISQQVLQPPPS-PQGFVQQPPPAQMPVYY 676
Cdd:PRK14971  364 QKGDDASGGRGPkqhIKPVFTQPAAAPQPSAAAAASPSPSQSS--AAAQPSAPQSATQPAGTpPTVSVDPPAAVPVNPPS 441

                          90       100
                  ....*....|....*....|.
gi 1034630806 677 YPSGQYPTSTTQQYRPMAPVQ 697
Cdd:PRK14971  442 TAPQAVRPAQFKEEKKIPVSK 462

R3H_Smubp-2_like

cd02641

R3H domain of Smubp-2_like proteins. Smubp-2_like proteins also contain a helicase_like and ...

174-221

6.51e-03

R3H domain of Smubp-2_like proteins. Smubp-2_like proteins also contain a helicase_like and an AN1-like Zinc finger domain and have been shown to bind single-stranded DNA. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA.

Pssm-ID: 100070 Cd Length: 60 Bit Score: 35.79 E-value: 6.51e-03

                          10        20        30        40
                  ....*....|....*....|....*....|....*....|....*....
gi 1034630806 174 IIDFIADNNNHYKKFP-QMSSYQRMLVHRVAAYFGLDHNVDQTGKSVII 221
Cdd:cd02641     8 VKAFMKDPKATELEFPpTLSSHDRLLVHELAEELGLRHESTGEGSDRVI 56

Blast search parameters

Data Source: Precalculated data, version = cdd.v.3.21
Preset Options:
Database: CDSEARCH/cdd
Low complexity filter: no
Composition Based Adjustment: yes
E-value threshold: 0.01
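The E-value threshold listed above can be applied to the hit list programmatically. The sketch below is illustrative: the accessions, intervals, and E-values are copied from the hits in this report, while the `Hit` tuple, the function names, and the choice of "less than or equal" as the threshold comparison are assumptions.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    accession: str
    start: int      # first query residue of the interval
    end: int        # last query residue of the interval
    e_value: float


# Values copied from the domain hits listed in this report.
HITS = [
    Hit("cd02642", 163, 224, 8.40e-26),
    Hit("smart00393", 147, 224, 4.43e-14),
    Hit("pfam12752", 245, 300, 2.59e-13),
    Hit("pfam01424", 165, 223, 3.80e-12),
    Hit("pfam09770", 608, 795, 6.37e-04),
    Hit("PHA03247", 428, 695, 1.49e-03),
    Hit("PRK10263", 609, 801, 3.49e-03),
    Hit("cd22553", 578, 821, 4.56e-03),
]


def below_threshold(hits: List[Hit], threshold: float) -> List[Hit]:
    """Keep hits whose E-value does not exceed the threshold, best first."""
    return sorted((h for h in hits if h.e_value <= threshold), key=lambda h: h.e_value)


def overlapping(hits: List[Hit], start: int, end: int) -> List[str]:
    """Accessions of hits whose interval shares at least one residue with [start, end]."""
    return [h.accession for h in hits if h.start <= end and h.end >= start]


if __name__ == "__main__":
    print(len(below_threshold(HITS, 0.01)))
    print(overlapping(HITS, 163, 224))
```

Tightening the threshold (for example to 1e-10) keeps only the R3H and SUZ hits, which separates them from the weaker matches to the C-terminal region of the query.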