Conserved domains on [gi|74757998|sp|Q6IQ32|]

RecName: Full=Activity-dependent neuroprotector homeobox protein 2; Short=ADNP homeobox protein 2; AltName: Full=Zinc finger protein 508

List of domain hits

Name Accession Description Interval E-value
Name: ADNP_N super family  Accession: cl45031  Interval: 1-984  E-value: 2.91e-78
Description: Activity-dependent neuroprotector homeobox protein N-terminal; This entry represents the N-terminal domain of Activity-dependent neuroprotector homeobox protein (ADNP, also known as Activity-dependent neuroprotective protein), which contains zinc finger motifs. It is involved in transcriptional regulation and is vital for mammalian brain formation. In humans, de novo mutations result in a syndromic form of autism-like spectrum disorder (ASD), including cognitive and motor deficits, known as the ADNP syndrome. This protein is also related to autophagy and the pathophysiology of schizophrenia.


The actual alignment was detected with superfamily member pfam19627:

Pssm-ID: 466132 [Multi-domain]  Cd Length: 744  Bit Score: 273.26  E-value: 2.91e-78
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998      1 MFQIPVENLDNIRKVRKKVKGILVDIGLDSCKELLKDLKGFDPGEKYFHNTSWGDVSLWEPSGKKVR-YRTKPYCCGLCK 79
Cdd:pfam19627    1 MFQLPVNNLGSLRKARKNVKKILSDIGLEYCKEHIEDFKDFEPNDFYIKNTSWDDVCLWDPSLTKNQdYRTKPFCCSGCP 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998     80 YSTKVLTSFKNHLHRYHEDEIDQELVIPCPNCVFASQPKVVGRHFRMFHAPVRKVQNYTVNILGETKSSRSDVIS----- 154
Cdd:pfam19627   81 FSSKFFSAYKSHFRNVHSEDFENRILLNCPYCTYNGNKKTLETHIKLFHMPNNVVRQPSGGPVGFKDKSKQDSLKpkqgd 160
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    155 ------FTCLKCNFSNTLYYSMKKHVLVAHFHYLINSYFGlRTEEmgeqpKT-NDTVSIEKIPPPDKYYCKKCNANASSQ 227
Cdd:pfam19627  161 sveqavYYCKKCTYRDPLYNVVRKHIYREHFQHVAAPYVA-KPGE-----KSvNGAVASSNTRDDGSIHCKRCLFMPRTY 234
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    228 DALMYHiltsdihrdlenklrsVISEHIKrtgllkqthiapkpaahlaapangsapsapaqppcfhlalpqnspspaAGQ 307
Cdd:pfam19627  235 EALVQH----------------VIEDHER------------------------------------------------IGY 250
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    308 PVTVAQGapgslthsppaagqshmtlvssplpvgqnsltlqppapqpvflshgvplHQSVnppVLPLSQPvgpvnksvgt 387
Cdd:pfam19627  251 QVTAMIG-------------------------------------------------HTNV---VVPRSKP---------- 268
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    388 svlpinqtvrpgvLPLTQPVGPINRPVGpgvLPVSPSVTPGVLQAvspgvlsvsravpsgvlpagqmtpagqmtpagvip 467
Cdd:pfam19627  269 -------------LMLIAPKPQDKKSLG---VTQKGGLVTGNVRS----------------------------------- 297
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    468 gqtatsgvLPTGQMvqsgvlpvgqtapSRVLPPGqtaplrvisagqvvpSGLLSPnqtvsssavVPVNQGvnsgvlQLSQ 547
Cdd:pfam19627  298 --------LSSQQM-------------NRLSIPK---------------ANLLSN---------VHLKQG------SYGL 326
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    548 PVVSGVLPVGQPVRPGvlqlnqtvgtniLPVNQPVR-PGASQNttfltsgsiLRQLIPTGkqvNGIPTYTLAPVSVTLP- 625
Cdd:pfam19627  327 KSMPSFYVLGQQVRLS------------LPGNAQVSvPQQSQT---------VKQLLPGG---NGRPSTVGSSQSGQQPa 382
                          650       660       670       680       690       700       710       720
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    626 ---VPPGGLATVAPPQM-PIQLLPSGAAAPMAGSMPGMPSppvlvnaaqsvfvqASSSAADTNQVLKqakqWKTCPVCNE 701
Cdd:pfam19627  383 rfsVQSGNSASSSSSQLkSPPLSSSVAATRALGQGPSKSS--------------ASAAGLNTSYTQK----WKICTICNE 444
                          730       740       750       760       770       780       790       800
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    702 LFPSNVYQVHMEVAHKhsesksgeklePEKLAACAPFLKWMREKTVRCLSCKCLVSEEELIHHLLMHGLGCLFCPCTFHD 781
Cdd:pfam19627  445 LFPENVYSAHFEKEHK-----------AEKVPAVANYIMKIHNFTSKCLYCNRYLPSDTLLNHMLIHGLSCPYCRSTFND 513
                          810       820       830       840       850       860       870       880
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    782 IKGLSEHSRNRHLGKKKLPMDYSNRGFQLDVDaNGNLLFPHLDFITILPKEKLGEREVYLA---------ILAGIHSKSL 852
Cdd:pfam19627  514 VEKMVAHMRMVHPDEEVGPRTDSPLTFDLTLQ-QGNPKNIQLLVTTYNMRDAPEESVAFHAqnnspqpkkPKPKVQEKSD 592
                          890       900       910       920       930       940       950       960
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    853 VPvyVKVRPQaegTPGSTGKRV--STCPFCF----GPFvtTEAYELHLKERHHIMPTVHTVLKSPAFKCIHCCGVYTGNM 926
Cdd:pfam19627  593 VP--VKSSPQ---AAVPYKKDVgkTLCPLCFsilkGPI--SDALAHHLRERHQVIQTVHPVEKKLTYKCIHCLGVYTSNM 665
                          970       980       990      1000      1010      1020
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 74757998    927 TLAAIAVHLVRCRSAPK--DSSSDLQAQPGFIHNSELLLVSGEVMH-DSSFSVKRKLPDGH 984
Cdd:pfam19627  666 TASTITLHLVHCRGVGKtqNGQDKSAPSPRVTQSPGAAPLKRELEHvDPALPKKRKLDDEE 726
Name: PHA03247 super family  Accession: cl33720  Interval: 268-514  E-value: 1.75e-09
Description: large tegument protein UL36; Provisional


The actual alignment was detected with superfamily member PHA03247:

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 62.65  E-value: 1.75e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   268 PKPAAhLAAPANGSAPSAPAQPPCFHLALPQNSP-SPAAGQPVTVAQGAPGSLTHSPPAAGQSHMTLVSSPLPVGQNSLT 346
Cdd:PHA03247 2758 ARPPT-TAGPPAPAPPAAPAAGPPRRLTRPAVASlSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQP 2836
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   347 LQPPAPqPVFLSHGVPLHQSVNP--PVLPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGpiNRPVGPGVLPVSPS 424
Cdd:PHA03247 2837 TAPPPP-PGPPPPSLPLGGSVAPggDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFA--LPPDQPERPPQPQA 2913
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   425 VTPGVLQAVSPGVLSVSRAVPSGVLPAGQMTP----AGQMTPAGVIPgQTATSGVLPTGQMVQSGVLPvgQTAPSRVLPP 500
Cdd:PHA03247 2914 PPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPttdpAGAGEPSGAVP-QPWLGALVPGRVAVPRFRVP--QPAPSREAPA 2990
                         250
                  ....*....|....
gi 74757998   501 GQTAPLRVISAGQV 514
Cdd:PHA03247 2991 SSTPPLTGHSLSRV 3004
Name: HOX  Accession: smart00389  Interval: 1043-1096  E-value: 1.61e-06
Description: Homeodomain; DNA-binding factors that are involved in the transcriptional regulation of key developmental processes

Pssm-ID: 197696 [Multi-domain]  Cd Length: 57  Bit Score: 46.09  E-value: 1.61e-06
                            10        20        30        40        50
                    ....*....|....*....|....*....|....*....|....*....|....
gi 74757998    1043 PKKYEGRSYEEKKQFLKDYFHKKPYPSKKEIELLSSLFWVWKIDVASFFGKRRY 1096
Cdd:smart00389    1 KRRKRTSFTPEQLEELEKEFQKNPYPSREEREELAKKLGLSERQVKVWFQNRRA 54
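
The per-hit summary lines in this listing (the ones beginning with `Pssm-ID:`) follow a fixed "Key: value" layout, so they can be pulled apart mechanically. The following is a minimal sketch in Python, assuming the layout seen on this page; the function name `parse_summary` and the field names in the returned dictionary are illustrative choices, not part of any NCBI tool.

```python
import re

# Regular expression for the summary line as it appears in this listing.
# The bracketed flag (e.g. "[Multi-domain]") is treated as optional.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the fields of one summary line as a dict, or None if no match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m["pssm_id"]),
        "flag": m["flag"],
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "e_value": float(m["e_value"]),
    }


# Summary line copied from the HOX (smart00389) hit above.
line = "Pssm-ID: 197696 [Multi-domain]  Cd Length: 57  Bit Score: 46.09  E-value: 1.61e-06"
hit = parse_summary(line)
print(hit)
```

Parsing the E-value into a float makes it straightforward to sort hits or apply a significance cutoff, although the appropriate cutoff depends on the analysis.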
 
Name Accession Description Interval E-value
Name: ADNP_N  Accession: pfam19627  Interval: 1-984  E-value: 2.91e-78
Description: Activity-dependent neuroprotector homeobox protein N-terminal; This entry represents the N-terminal domain of Activity-dependent neuroprotector homeobox protein (ADNP, also known as Activity-dependent neuroprotective protein), which contains zinc finger motifs. It is involved in transcriptional regulation and is vital for mammalian brain formation. In humans, de novo mutations result in a syndromic form of autism-like spectrum disorder (ASD), including cognitive and motor deficits, known as the ADNP syndrome. This protein is also related to autophagy and the pathophysiology of schizophrenia.


Pssm-ID: 466132 [Multi-domain]  Cd Length: 744  Bit Score: 273.26  E-value: 2.91e-78
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998      1 MFQIPVENLDNIRKVRKKVKGILVDIGLDSCKELLKDLKGFDPGEKYFHNTSWGDVSLWEPSGKKVR-YRTKPYCCGLCK 79
Cdd:pfam19627    1 MFQLPVNNLGSLRKARKNVKKILSDIGLEYCKEHIEDFKDFEPNDFYIKNTSWDDVCLWDPSLTKNQdYRTKPFCCSGCP 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998     80 YSTKVLTSFKNHLHRYHEDEIDQELVIPCPNCVFASQPKVVGRHFRMFHAPVRKVQNYTVNILGETKSSRSDVIS----- 154
Cdd:pfam19627   81 FSSKFFSAYKSHFRNVHSEDFENRILLNCPYCTYNGNKKTLETHIKLFHMPNNVVRQPSGGPVGFKDKSKQDSLKpkqgd 160
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    155 ------FTCLKCNFSNTLYYSMKKHVLVAHFHYLINSYFGlRTEEmgeqpKT-NDTVSIEKIPPPDKYYCKKCNANASSQ 227
Cdd:pfam19627  161 sveqavYYCKKCTYRDPLYNVVRKHIYREHFQHVAAPYVA-KPGE-----KSvNGAVASSNTRDDGSIHCKRCLFMPRTY 234
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    228 DALMYHiltsdihrdlenklrsVISEHIKrtgllkqthiapkpaahlaapangsapsapaqppcfhlalpqnspspaAGQ 307
Cdd:pfam19627  235 EALVQH----------------VIEDHER------------------------------------------------IGY 250
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    308 PVTVAQGapgslthsppaagqshmtlvssplpvgqnsltlqppapqpvflshgvplHQSVnppVLPLSQPvgpvnksvgt 387
Cdd:pfam19627  251 QVTAMIG-------------------------------------------------HTNV---VVPRSKP---------- 268
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    388 svlpinqtvrpgvLPLTQPVGPINRPVGpgvLPVSPSVTPGVLQAvspgvlsvsravpsgvlpagqmtpagqmtpagvip 467
Cdd:pfam19627  269 -------------LMLIAPKPQDKKSLG---VTQKGGLVTGNVRS----------------------------------- 297
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    468 gqtatsgvLPTGQMvqsgvlpvgqtapSRVLPPGqtaplrvisagqvvpSGLLSPnqtvsssavVPVNQGvnsgvlQLSQ 547
Cdd:pfam19627  298 --------LSSQQM-------------NRLSIPK---------------ANLLSN---------VHLKQG------SYGL 326
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    548 PVVSGVLPVGQPVRPGvlqlnqtvgtniLPVNQPVR-PGASQNttfltsgsiLRQLIPTGkqvNGIPTYTLAPVSVTLP- 625
Cdd:pfam19627  327 KSMPSFYVLGQQVRLS------------LPGNAQVSvPQQSQT---------VKQLLPGG---NGRPSTVGSSQSGQQPa 382
                          650       660       670       680       690       700       710       720
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    626 ---VPPGGLATVAPPQM-PIQLLPSGAAAPMAGSMPGMPSppvlvnaaqsvfvqASSSAADTNQVLKqakqWKTCPVCNE 701
Cdd:pfam19627  383 rfsVQSGNSASSSSSQLkSPPLSSSVAATRALGQGPSKSS--------------ASAAGLNTSYTQK----WKICTICNE 444
                          730       740       750       760       770       780       790       800
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    702 LFPSNVYQVHMEVAHKhsesksgeklePEKLAACAPFLKWMREKTVRCLSCKCLVSEEELIHHLLMHGLGCLFCPCTFHD 781
Cdd:pfam19627  445 LFPENVYSAHFEKEHK-----------AEKVPAVANYIMKIHNFTSKCLYCNRYLPSDTLLNHMLIHGLSCPYCRSTFND 513
                          810       820       830       840       850       860       870       880
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    782 IKGLSEHSRNRHLGKKKLPMDYSNRGFQLDVDaNGNLLFPHLDFITILPKEKLGEREVYLA---------ILAGIHSKSL 852
Cdd:pfam19627  514 VEKMVAHMRMVHPDEEVGPRTDSPLTFDLTLQ-QGNPKNIQLLVTTYNMRDAPEESVAFHAqnnspqpkkPKPKVQEKSD 592
                          890       900       910       920       930       940       950       960
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    853 VPvyVKVRPQaegTPGSTGKRV--STCPFCF----GPFvtTEAYELHLKERHHIMPTVHTVLKSPAFKCIHCCGVYTGNM 926
Cdd:pfam19627  593 VP--VKSSPQ---AAVPYKKDVgkTLCPLCFsilkGPI--SDALAHHLRERHQVIQTVHPVEKKLTYKCIHCLGVYTSNM 665
                          970       980       990      1000      1010      1020
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 74757998    927 TLAAIAVHLVRCRSAPK--DSSSDLQAQPGFIHNSELLLVSGEVMH-DSSFSVKRKLPDGH 984
Cdd:pfam19627  666 TASTITLHLVHCRGVGKtqNGQDKSAPSPRVTQSPGAAPLKRELEHvDPALPKKRKLDDEE 726
Name: PHA03247  Accession: PHA03247  Interval: 270-663  E-value: 3.96e-15
Description: large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 81.14  E-value: 3.96e-15
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   270 PAAHLAAPANGSAPSAPAQPPcfhlalpqnsPSPAAGQPVTVAQGAPGSLTHSPPAA--GQSHMTLVSSPLPVGQNSLTL 347
Cdd:PHA03247 2557 PAAPPAAPDRSVPPPRPAPRP----------SEPAVTSRARRPDAPPQSARPRAPVDdrGDPRGPAPPSPLPPDTHAPDP 2626
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   348 QPPAPQPVFLSHGVPLHQSVNPPVLPLSQPVGPV------NKSVGTSVLPINQTVRPGVLPLTQPVGPINRPVGPGVLPV 421
Cdd:PHA03247 2627 PPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRvsrprrARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPP 2706
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   422 SPSVTPGVLQAVSPgvLSVSRAVPSGVLPAGQMTPAGQMTPAG-VIPGQTATSGVLPTGQMVQSGVLPVGQ-TAPSRVLP 499
Cdd:PHA03247 2707 TPEPAPHALVSATP--LPPGPAAARQASPALPAAPAPPAVPAGpATPGGPARPARPPTTAGPPAPAPPAAPaAGPPRRLT 2784
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   500 PGQTAPLRVISAGQVVPSGLLSPNQTVSS-SAVVPVNQGVNSGVlqlsqPVVSGVLPVGQPVRPGVLQLNQTVGTNILPV 578
Cdd:PHA03247 2785 RPAVASLSESRESLPSPWDPADPPAAVLApAAALPPAASPAGPL-----PPPTSAQPTAPPPPPGPPPPSLPLGGSVAPG 2859
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   579 NQPVRPGASQNTTFLTSGsilrqliPTGKQVNGIPTYTLAPVSVTLPVPPGGLATVAPPQMPIQLLPSGAAAPMAGSMPG 658
Cdd:PHA03247 2860 GDVRRRPPSRSPAAKPAA-------PARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPP 2932

                  ....*
gi 74757998   659 MPSPP 663
Cdd:PHA03247 2933 PPPPP 2937
Name: PHA03247  Accession: PHA03247  Interval: 268-514  E-value: 1.75e-09
Description: large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 62.65  E-value: 1.75e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   268 PKPAAhLAAPANGSAPSAPAQPPCFHLALPQNSP-SPAAGQPVTVAQGAPGSLTHSPPAAGQSHMTLVSSPLPVGQNSLT 346
Cdd:PHA03247 2758 ARPPT-TAGPPAPAPPAAPAAGPPRRLTRPAVASlSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQP 2836
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   347 LQPPAPqPVFLSHGVPLHQSVNP--PVLPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGpiNRPVGPGVLPVSPS 424
Cdd:PHA03247 2837 TAPPPP-PGPPPPSLPLGGSVAPggDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFA--LPPDQPERPPQPQA 2913
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   425 VTPGVLQAVSPGVLSVSRAVPSGVLPAGQMTP----AGQMTPAGVIPgQTATSGVLPTGQMVQSGVLPvgQTAPSRVLPP 500
Cdd:PHA03247 2914 PPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPttdpAGAGEPSGAVP-QPWLGALVPGRVAVPRFRVP--QPAPSREAPA 2990
                         250
                  ....*....|....
gi 74757998   501 GQTAPLRVISAGQV 514
Cdd:PHA03247 2991 SSTPPLTGHSLSRV 3004
Name: SP1-4_arthropods_N  Accession: cd22553  Interval: 384-701  E-value: 8.84e-08
Description: N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.


Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  Bit Score: 55.80  E-value: 8.84e-08
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  384 SVGTSVLPINQTVRPGVLPLTQPVGPINRPVGPGVLPVSPSVTPGVLQAVSpgVLSVSRAVPSGVLPAGQMTPAG-QMTP 462
Cdd:cd22553   35 ETHDPLILSPPLSQPQQIITAQSSGSAAGGVAYSVSPAVQTVTVDGHEAIF--IPANSGLLQTNNQQAIQLAPGGtQAIL 112
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  463 AGvipGQTATSGVLPTGQMVQSGVLPV-GQTAPSRV---LPP---GQTAPLRV-ISA--GQVVPSGLLSPNQTVSSSAVV 532
Cdd:cd22553  113 AN---QQTLIRPNTVQGQANASNVLQNiAQIASGGNavqLPLnnmTQTIPVQVpVSTanGQTVYQTIQVPIQAIQSGNAG 189
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  533 PVNQGVNSGVL-QLSQPvvsgvlpvgqpvrpGVLQLNQTVGTNILPVNQPVRPGASQNTTFL------TSGSILRQLIPT 605
Cdd:cd22553  190 GGNQALQAQVIpQLAQA--------------AQLQPQQLAQVSSQGYIQQIPANASQQQPQMvqqgpnQSGQIIGQVASA 255
                        250       260       270       280       290       300       310       320
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  606 -GKQVNGIPTYTLApvSVTLPVPPGGLATVAP-PQMPIQLLPSGAAAPMAGSMPGMPSPPVLVNAAQSVFVQASSSAADT 683
Cdd:cd22553  256 sSIQAAAIPLTVYT--GALAGQNGSNQQQVGQiVTSPIQGMTQGLTAPASSSIPTVVQQQAIQGNPLPPGTQIIAAGQQL 333
                        330       340       350
                 ....*....|....*....|....*....|....*.
gi 74757998  684 NQVLKQAKQWK------------------TCPVCNE 701
Cdd:cd22553  334 QQDPNDPTKWQvvadgtpgskkrlrrvacTCPNCRD 369
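
The SP1-4 description above writes the consensus motif NNRCRCCYY in degenerate nucleotide codes (N is any nucleotide, R is A/G, Y is C/T). As a minimal sketch of how such a code expands into a concrete pattern, the Python below translates the motif into a regular expression and scans a sequence for it. The helper names and the test sequence are hypothetical, only the forward strand is scanned, and strict matching is a simplification because the description calls the consensus "loose".

```python
import re

# Expansion of the degenerate codes used in the motif description.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any nucleotide
}


def motif_to_regex(motif):
    """Translate a degenerate motif string into a regex pattern string."""
    return "".join(IUPAC[ch] for ch in motif.upper())


# Motif taken from the description; the variable name is illustrative.
SP_KLF_PATTERN = motif_to_regex("NNRCRCCYY")


def find_sites(seq):
    """Return 0-based start positions of every (possibly overlapping) match."""
    # A lookahead lets overlapping occurrences be reported as well.
    scanner = re.compile(f"(?=({SP_KLF_PATTERN}))")
    return [m.start() for m in scanner.finditer(seq.upper())]


# Hypothetical sequence, made up purely to exercise the function.
print(find_sites("TTAAGCGCCTTGG"))
```

A fuller scan would also test the reverse complement and tolerate mismatches, which this sketch deliberately leaves out.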
Name: HOX  Accession: smart00389  Interval: 1043-1096  E-value: 1.61e-06
Description: Homeodomain; DNA-binding factors that are involved in the transcriptional regulation of key developmental processes


Pssm-ID: 197696 [Multi-domain]  Cd Length: 57  Bit Score: 46.09  E-value: 1.61e-06
                            10        20        30        40        50
                    ....*....|....*....|....*....|....*....|....*....|....
gi 74757998    1043 PKKYEGRSYEEKKQFLKDYFHKKPYPSKKEIELLSSLFWVWKIDVASFFGKRRY 1096
Cdd:smart00389    1 KRRKRTSFTPEQLEELEKEFQKNPYPSREEREELAKKLGLSERQVKVWFQNRRA 54
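
Each alignment block pairs a query row (`gi 74757998`) with a domain-model row (`Cdd:...`). A rough identity figure can be computed from such a pair, as in the Python sketch below. It assumes that `-` marks a gap and that lowercase letters mark residues not aligned to the model, which matches how these rows appear on this page; the function name `alignment_stats` is illustrative.

```python
def alignment_stats(query_row, subject_row):
    """Count aligned and identical columns in one pair of alignment rows.

    A column counts as aligned only when both rows hold an uppercase
    residue; gap columns ('-') and lowercase (unaligned) residues are skipped.
    """
    if len(query_row) != len(subject_row):
        raise ValueError("alignment rows must have the same length")
    aligned = 0
    identical = 0
    for q, s in zip(query_row, subject_row):
        if q.isupper() and s.isupper():
            aligned += 1
            if q == s:
                identical += 1
    identity = identical / aligned if aligned else 0.0
    return {"aligned": aligned, "identical": identical, "identity": identity}


# Rows copied from the HOX (smart00389) alignment above.
query = "PKKYEGRSYEEKKQFLKDYFHKKPYPSKKEIELLSSLFWVWKIDVASFFGKRRY"
model = "KRRKRTSFTPEQLEELEKEFQKNPYPSREEREELAKKLGLSERQVKVWFQNRRA"
print(alignment_stats(query, model))
```

For hits that span several blocks, the rows would need to be concatenated block by block before being passed in.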
Name: Soli_cterm  Accession: TIGR03437  Interval: 465-667  E-value: 2.84e-06
Description: Solibacter uncharacterized C-terminal domain; This model describes a protein domain found in 90 proteins of Solibacter usitatus Ellin6076, nearly always as the C-terminal domain of a much larger protein. No homologs to this domain are detected outside of S. usitatus, a member of the Acidobacteria.


Pssm-ID: 274578 [Multi-domain]  Cd Length: 215  Bit Score: 49.58  E-value: 2.84e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    465 VIPGQTAT---SGVLPTGQMVQSGVLPVgQTAPSRVLPPGQTAPLRVISAGQV---VPSGLLSPNQTVsssaVVpVNQGV 538
Cdd:TIGR03437    2 VAPGSIVSifgTNLAPATLTAAGGPLPT-SLGGVSVTVNGVAAPLLYVSPGQInaqVPYEVAPGAATV----TV-TYNGG 75
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    539 NSGVLQLS-QPVVSGVLPV-GQPVRPGVLQLNQtvGTNILPVNQPVRPGaSQNTTFLTSGSILRQLIPTGKQVNGIPTY- 615
Cdd:TIGR03437   76 ASAAVTVTvAAAAPGIFTLdGSGTGQAAALNNQ--DGSVNSAANPAAPG-DVVVLYATGLGPTSPAVADGAPAPSSPLAp 152
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 74757998    616 TLAPVSVTL-----PVPPGGLATVAPPQMPIQL-LPSGAAA---PMAGSMPGMPSPPVLVN 667
Cdd:TIGR03437  153 ALAPVTVTIggvpaTVLYAGLAPGFVGLYQVNVrVPAGLATgavPVVITVGGVTSNAVTIA 213
Name: homeodomain  Accession: cd00086  Interval: 1044-1100  E-value: 7.87e-06
Description: Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner.


Pssm-ID: 238039 [Multi-domain]  Cd Length: 59  Bit Score: 44.16  E-value: 7.87e-06
                         10        20        30        40        50
                 ....*....|....*....|....*....|....*....|....*....|....*..
gi 74757998 1044 KKYEGRSYEEKKQFLKDYFHKKPYPSKKEIELLSSLFWVWKIDVASFFGKRRYICMK 1100
Cdd:cd00086    1 RRKRTRFTPEQLEELEKEFEKNPYPSREEREELAKELGLTERQVKIWFQNRRAKLKR 57
Name: PPE  Accession: COG5651  Interval: 362-564  E-value: 2.39e-05
Description: PPE-repeat protein [Function unknown];


Pssm-ID: 444372 [Multi-domain]  Cd Length: 385  Bit Score: 47.97  E-value: 2.39e-05
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  362 PLHQSVNPPVLPLSQPVGPVNKSVGTSV--LPINQTVRPGVLPLTQPVGPINRPVGPGVLPVSPSVTPGVLQAVSPGVLS 439
Cdd:COG5651  170 PPPTITNPGGLLGAQNAGSGNTSSNPGFanLGLTGLNQVGIGGLNSGSGPIGLNSGPGNTGFAGTGAAAGAAAAAAAAAA 249
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  440 VSRAVPSGVLPAGQMTPAGQMTPAGVIPGQTATSGVLPTGQMVQSGVLPVGQTAPSRVLPPGQTAPLRVISAGqVVPSGL 519
Cdd:COG5651  250 AAGAGASAALASLAATLLNASSLGLAATAASSAATNLGLAGSPLGLAGGGAGAAAATGLGLGAGGAAGAAGAT-GAGAAL 328
                        170       180       190       200
                 ....*....|....*....|....*....|....*....|....*
gi 74757998  520 LSPNQTVSSSAVVPVNQGVNSGVLQLSQPVVSGVLPVGQPVRPGV 564
Cdd:COG5651  329 GAGAAAAAAGAAAGAGAAAAAAAGGAGGGGGGALGAGGGGGSAGA 373
Name: Atrophin-1  Accession: pfam03154  Interval: 264-434  E-value: 1.61e-04
Description: Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 45.91  E-value: 1.61e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    264 THIAPKPAAH-LAAPANGSAPSApaQPPCFHLaLPQNS--PSPAAgQPVTVAQgapgSLTHSPPAAgqshmtlvSSPLPV 340
Cdd:pfam03154  388 SNLPPPPALKpLSSLSTHHPPSA--HPPPLQL-MPQSQqlPPPPA-QPPVLTQ----SQSLPPPAA--------SHPPTS 451
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    341 GQNSLTLQPPAPQPVFLSHGVPL------HQSVNPPVLPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGPINRPV 414
Cdd:pfam03154  452 GLHQVPSQSPFPQHPFVPGGPPPitppsgPPTSTSSAMPGIQPPSSASVSSSGPVPAAVSCPLPPVQIKEEALDEAEEPE 531
                          170       180
                   ....*....|....*....|
gi 74757998    415 GPGVLPVSPSVTPGVLQAVS 434
Cdd:pfam03154  532 SPPPPPRSPSPEPTVVNTPS 551
Name: KLF10_11_N  Accession: cd21974  Interval: 321-458  E-value: 1.00e-02
Description: N-terminal domain of Kruppel-like factor (KLF) 10, KLF11, and similar proteins; This subfamily is composed of Kruppel-like factor or Krueppel-like factor (KLF) 10, KLF11, and similar proteins. KLF10 was first identified in human osteoblasts and plays a role in mediating estrogen (E2) signaling in bone and skeletal homeostasis and a regulatory role in tumor formation and metastasis. KLF11 is involved in cell growth, apoptosis, cellular inflammation and differentiation, endometriosis, and cholesterol, prostaglandin, neurotransmitter, fat, and sugar metabolism. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved α-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF10/11 belong to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF10, KLF11, and similar proteins.


Pssm-ID: 409243 [Multi-domain]  Cd Length: 229  Bit Score: 38.76  E-value: 1.00e-02
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  321 HSPPAAGQSHMTLVSSPLPV--------GQNSLTLQPPAPQPVFLSHGVPLHQSVNPPVLP---LSQPVgPVNKSVGTSV 389
Cdd:cd21974   62 YSPPFFEASHSPSVASLHPPsaassqppPEPESSEPPAASPQRAQATSVIRHTADPVPVSPppvLCQML-PVSSSSGVIV 140
                         90       100       110       120       130       140       150
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  390 LPINQTVRPGVLPLTQPVGPinrpvgPGVLPVSPSVTPG-VLQAVSPGvlsvsrAVPSGVLPAGQMTPAG 458
Cdd:cd21974  141 AFLKAPQQPSPQPQKPALPQ------PQVVLVGGQVPQGpVMLVVPQP------AVPQPYVQPTVVTPGG 198
 
Name Accession Description Interval E-value
Name: ADNP_N  Accession: pfam19627  Interval: 1-984  E-value: 2.91e-78
Description: Activity-dependent neuroprotector homeobox protein N-terminal; This entry represents the N-terminal domain of Activity-dependent neuroprotector homeobox protein (ADNP, also known as Activity-dependent neuroprotective protein), which contains zinc finger motifs. It is involved in transcriptional regulation and is vital for mammalian brain formation. In humans, de novo mutations result in a syndromic form of autism-like spectrum disorder (ASD), including cognitive and motor deficits, known as the ADNP syndrome. This protein is also related to autophagy and the pathophysiology of schizophrenia.


Pssm-ID: 466132 [Multi-domain]  Cd Length: 744  Bit Score: 273.26  E-value: 2.91e-78
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998      1 MFQIPVENLDNIRKVRKKVKGILVDIGLDSCKELLKDLKGFDPGEKYFHNTSWGDVSLWEPSGKKVR-YRTKPYCCGLCK 79
Cdd:pfam19627    1 MFQLPVNNLGSLRKARKNVKKILSDIGLEYCKEHIEDFKDFEPNDFYIKNTSWDDVCLWDPSLTKNQdYRTKPFCCSGCP 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998     80 YSTKVLTSFKNHLHRYHEDEIDQELVIPCPNCVFASQPKVVGRHFRMFHAPVRKVQNYTVNILGETKSSRSDVIS----- 154
Cdd:pfam19627   81 FSSKFFSAYKSHFRNVHSEDFENRILLNCPYCTYNGNKKTLETHIKLFHMPNNVVRQPSGGPVGFKDKSKQDSLKpkqgd 160
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    155 ------FTCLKCNFSNTLYYSMKKHVLVAHFHYLINSYFGlRTEEmgeqpKT-NDTVSIEKIPPPDKYYCKKCNANASSQ 227
Cdd:pfam19627  161 sveqavYYCKKCTYRDPLYNVVRKHIYREHFQHVAAPYVA-KPGE-----KSvNGAVASSNTRDDGSIHCKRCLFMPRTY 234
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    228 DALMYHiltsdihrdlenklrsVISEHIKrtgllkqthiapkpaahlaapangsapsapaqppcfhlalpqnspspaAGQ 307
Cdd:pfam19627  235 EALVQH----------------VIEDHER------------------------------------------------IGY 250
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    308 PVTVAQGapgslthsppaagqshmtlvssplpvgqnsltlqppapqpvflshgvplHQSVnppVLPLSQPvgpvnksvgt 387
Cdd:pfam19627  251 QVTAMIG-------------------------------------------------HTNV---VVPRSKP---------- 268
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    388 svlpinqtvrpgvLPLTQPVGPINRPVGpgvLPVSPSVTPGVLQAvspgvlsvsravpsgvlpagqmtpagqmtpagvip 467
Cdd:pfam19627  269 -------------LMLIAPKPQDKKSLG---VTQKGGLVTGNVRS----------------------------------- 297
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    468 gqtatsgvLPTGQMvqsgvlpvgqtapSRVLPPGqtaplrvisagqvvpSGLLSPnqtvsssavVPVNQGvnsgvlQLSQ 547
Cdd:pfam19627  298 --------LSSQQM-------------NRLSIPK---------------ANLLSN---------VHLKQG------SYGL 326
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    548 PVVSGVLPVGQPVRPGvlqlnqtvgtniLPVNQPVR-PGASQNttfltsgsiLRQLIPTGkqvNGIPTYTLAPVSVTLP- 625
Cdd:pfam19627  327 KSMPSFYVLGQQVRLS------------LPGNAQVSvPQQSQT---------VKQLLPGG---NGRPSTVGSSQSGQQPa 382
                          650       660       670       680       690       700       710       720
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    626 ---VPPGGLATVAPPQM-PIQLLPSGAAAPMAGSMPGMPSppvlvnaaqsvfvqASSSAADTNQVLKqakqWKTCPVCNE 701
Cdd:pfam19627  383 rfsVQSGNSASSSSSQLkSPPLSSSVAATRALGQGPSKSS--------------ASAAGLNTSYTQK----WKICTICNE 444
                          730       740       750       760       770       780       790       800
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    702 LFPSNVYQVHMEVAHKhsesksgeklePEKLAACAPFLKWMREKTVRCLSCKCLVSEEELIHHLLMHGLGCLFCPCTFHD 781
Cdd:pfam19627  445 LFPENVYSAHFEKEHK-----------AEKVPAVANYIMKIHNFTSKCLYCNRYLPSDTLLNHMLIHGLSCPYCRSTFND 513
                          810       820       830       840       850       860       870       880
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    782 IKGLSEHSRNRHLGKKKLPMDYSNRGFQLDVDaNGNLLFPHLDFITILPKEKLGEREVYLA---------ILAGIHSKSL 852
Cdd:pfam19627  514 VEKMVAHMRMVHPDEEVGPRTDSPLTFDLTLQ-QGNPKNIQLLVTTYNMRDAPEESVAFHAqnnspqpkkPKPKVQEKSD 592
                          890       900       910       920       930       940       950       960
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    853 VPvyVKVRPQaegTPGSTGKRV--STCPFCF----GPFvtTEAYELHLKERHHIMPTVHTVLKSPAFKCIHCCGVYTGNM 926
Cdd:pfam19627  593 VP--VKSSPQ---AAVPYKKDVgkTLCPLCFsilkGPI--SDALAHHLRERHQVIQTVHPVEKKLTYKCIHCLGVYTSNM 665
                          970       980       990      1000      1010      1020
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 74757998    927 TLAAIAVHLVRCRSAPK--DSSSDLQAQPGFIHNSELLLVSGEVMH-DSSFSVKRKLPDGH 984
Cdd:pfam19627  666 TASTITLHLVHCRGVGKtqNGQDKSAPSPRVTQSPGAAPLKRELEHvDPALPKKRKLDDEE 726
PHA03247 PHA03247
large tegument protein UL36; Provisional
270-663 3.96e-15

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 81.14  E-value: 3.96e-15
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   270 PAAHLAAPANGSAPSAPAQPPcfhlalpqnsPSPAAGQPVTVAQGAPGSLTHSPPAA--GQSHMTLVSSPLPVGQNSLTL 347
Cdd:PHA03247 2557 PAAPPAAPDRSVPPPRPAPRP----------SEPAVTSRARRPDAPPQSARPRAPVDdrGDPRGPAPPSPLPPDTHAPDP 2626
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   348 QPPAPQPVFLSHGVPLHQSVNPPVLPLSQPVGPV------NKSVGTSVLPINQTVRPGVLPLTQPVGPINRPVGPGVLPV 421
Cdd:PHA03247 2627 PPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRvsrprrARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPP 2706
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   422 SPSVTPGVLQAVSPgvLSVSRAVPSGVLPAGQMTPAGQMTPAG-VIPGQTATSGVLPTGQMVQSGVLPVGQ-TAPSRVLP 499
Cdd:PHA03247 2707 TPEPAPHALVSATP--LPPGPAAARQASPALPAAPAPPAVPAGpATPGGPARPARPPTTAGPPAPAPPAAPaAGPPRRLT 2784
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   500 PGQTAPLRVISAGQVVPSGLLSPNQTVSS-SAVVPVNQGVNSGVlqlsqPVVSGVLPVGQPVRPGVLQLNQTVGTNILPV 578
Cdd:PHA03247 2785 RPAVASLSESRESLPSPWDPADPPAAVLApAAALPPAASPAGPL-----PPPTSAQPTAPPPPPGPPPPSLPLGGSVAPG 2859
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   579 NQPVRPGASQNTTFLTSGsilrqliPTGKQVNGIPTYTLAPVSVTLPVPPGGLATVAPPQMPIQLLPSGAAAPMAGSMPG 658
Cdd:PHA03247 2860 GDVRRRPPSRSPAAKPAA-------PARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPP 2932

                  ....*
gi 74757998   659 MPSPP 663
Cdd:PHA03247 2933 PPPPP 2937
PHA03247 PHA03247
large tegument protein UL36; Provisional
268-662 4.43e-14

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 77.67  E-value: 4.43e-14
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   268 PKPAAHLAAPANGSA---PSAPAQP--PCFHLALPQNSPSPAAGQPVTVAQGAPgsltHSPPAAGQSHmtlvSSPLPVGQ 342
Cdd:PHA03247 2571 PRPAPRPSEPAVTSRarrPDAPPQSarPRAPVDDRGDPRGPAPPSPLPPDTHAP----DPPPPSPSPA----ANEPDPHP 2642
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   343 NSLTLQPPAPQPVFLSHGVPLHQSVNP---PVLPLSQPVGPVNKSVGTSVLPINQTVRPGvlPLTQPVGPINRPVGPGVl 419
Cdd:PHA03247 2643 PPTVPPPERPRDDPAPGRVSRPRRARRlgrAAQASSPPQRPRRRAARPTVGSLTSLADPP--PPPPTPEPAPHALVSAT- 2719
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   420 PVSPSVTPGVLQAVSPGVLSVSRAVPSG-VLPAGQMTPAGQMTPAGviPGQTATSGVLPTGQMVQSGVLPVGQTAPSRVL 498
Cdd:PHA03247 2720 PLPPGPAAARQASPALPAAPAPPAVPAGpATPGGPARPARPPTTAG--PPAPAPPAAPAAGPPRRLTRPAVASLSESRES 2797
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   499 PPGQTAPLRViSAGQVVPSGLLSPNQTVSS-----SAVVPVNQGVNSGVLQLSQPVVSGVLPVGQPVRPGVLQlnQTVGT 573
Cdd:PHA03247 2798 LPSPWDPADP-PAAVLAPAAALPPAASPAGplpppTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRRPPSR--SPAAK 2874
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   574 NILPVNQPVR----PGASQNTTFLTSGSILRQLIPTGK-QVNGIPTYTLAPVSVTLPVPPgglatvAPPQMPIQLLPSGA 648
Cdd:PHA03247 2875 PAAPARPPVRrlarPAVSRSTESFALPPDQPERPPQPQaPPPPQPQPQPPPPPQPQPPPP------PPPRPQPPLAPTTD 2948
                         410
                  ....*....|....
gi 74757998   649 AAPMAGSMPGMPSP 662
Cdd:PHA03247 2949 PAGAGEPSGAVPQP 2962
PHA03247 PHA03247
large tegument protein UL36; Provisional
268-514 1.75e-09

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 62.65  E-value: 1.75e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   268 PKPAAhLAAPANGSAPSAPAQPPCFHLALPQNSP-SPAAGQPVTVAQGAPGSLTHSPPAAGQSHMTLVSSPLPVGQNSLT 346
Cdd:PHA03247 2758 ARPPT-TAGPPAPAPPAAPAAGPPRRLTRPAVASlSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQP 2836
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   347 LQPPAPqPVFLSHGVPLHQSVNP--PVLPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGpiNRPVGPGVLPVSPS 424
Cdd:PHA03247 2837 TAPPPP-PGPPPPSLPLGGSVAPggDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFA--LPPDQPERPPQPQA 2913
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   425 VTPGVLQAVSPGVLSVSRAVPSGVLPAGQMTP----AGQMTPAGVIPgQTATSGVLPTGQMVQSGVLPvgQTAPSRVLPP 500
Cdd:PHA03247 2914 PPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPttdpAGAGEPSGAVP-QPWLGALVPGRVAVPRFRVP--QPAPSREAPA 2990
                         250
                  ....*....|....
gi 74757998   501 GQTAPLRVISAGQV 514
Cdd:PHA03247 2991 SSTPPLTGHSLSRV 3004
PHA03247 PHA03247
large tegument protein UL36; Provisional
267-600 6.05e-09

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 60.72  E-value: 6.05e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   267 APKPAAHLAAPANGSAPSAPAQPPcfhlALPQNSPSPAAGQPVTVAQGAPGS-LTHSPPAAGQSHMTLVSsPLPVGQNSL 345
Cdd:PHA03247 2688 ARPTVGSLTSLADPPPPPPTPEPA----PHALVSATPLPPGPAAARQASPALpAAPAPPAVPAGPATPGG-PARPARPPT 2762
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   346 TLQPPAPQPVFLSHGVPLHQSVNPPVLPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQ-PVGPINRPvgPGVLPVSPS 424
Cdd:PHA03247 2763 TAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAAsPAGPLPPP--TSAQPTAPP 2840
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   425 VTPGVLQAvspgvlsvSRAVPSGVLPAGqmtPAGQMTPAGVIPGQTATSGVLPTGQMVQSgvlPVGQTAPSRVLPPGQTA 504
Cdd:PHA03247 2841 PPPGPPPP--------SLPLGGSVAPGG---DVRRRPPSRSPAAKPAAPARPPVRRLARP---AVSRSTESFALPPDQPE 2906
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   505 PLRVISAGQvvPSGLLSPNQTVSSSAVVPVNQGVNSGVLQ-----LSQPVVSGVLPVGQ--PVRPGVLQLNQTvgtnILP 577
Cdd:PHA03247 2907 RPPQPQAPP--PPQPQPQPPPPPQPQPPPPPPPRPQPPLApttdpAGAGEPSGAVPQPWlgALVPGRVAVPRF----RVP 2980
                         330       340
                  ....*....|....*....|...
gi 74757998   578 VNQPVRPGASQNTTFLTSGSILR 600
Cdd:PHA03247 2981 QPAPSREAPASSTPPLTGHSLSR 3003
PHA03379 PHA03379
EBNA-3A; Provisional
255-586 5.59e-08

EBNA-3A; Provisional


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 57.38  E-value: 5.59e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   255 IKRTGllKQTHIAPKPAAHLAAPANGSaPSAPAQPPcfHLALPQNSPSPAAGQPVTVAQGAPGSLTHSPPAAGQSHMtlv 334
Cdd:PHA03379  391 LMRAG--KLTERAREALEKASEPTYGT-PRPPVEKP--RPEVPQSLETATSHGSAQVPEPPPVHDLEPGPLHDQHSM--- 462
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   335 sSPLPVGQNsltlqPPAP----QPVFLSHGVPlhQSVNPPVLPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGPI 410
Cdd:PHA03379  463 -APCPVAQL-----PPGPlqdlEPGDQLPGVV--QDGRPACAPVPAPAGPIVRPWEASLSQVPGVAFAPVMPQPMPVEPV 534
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   411 NRPVGPGVLPVSPSVTPGVLQAvsPGVLSVSRAV----------PSGVLPAGQMT---------PAGQMTPAGV------ 465
Cdd:PHA03379  535 PVPTVALERPVCPAPPLIAMQG--PGETSGIVRVrerwrpapwtPNPPRSPSQMSvrdrlarlrAEAQPYQASVevqppq 612
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   466 ---IPGQTATSGVL-PTGQM------------VQSGVLPVGQtAPSRVLPPGQ-------TAPLRViSAGQVVPSGLLSP 522
Cdd:PHA03379  613 ltqVSPQQPMEYPLePEQQMfpgspfsqvadvMRAGGVPAMQ-PQYFDLPLQQpisqgapLAPLRA-SMGPVPPVPATQP 690
                         330       340       350       360       370       380       390
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 74757998   523 nQTVSSSAVVPVNQGVNSGVLQLSQPVVSGVLP---------VGQPVRPGVLQlNQTVGtniLPVNQPVRPGA 586
Cdd:PHA03379  691 -QYFDIPLTEPINQGASAAHFLPQQPMEGPLVPerwmfqgatLSQSVRPGVAQ-SQYFD---LPLTQPINHGA 758
SP1-4_arthropods_N cd22553
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ...
384-701 8.84e-08

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.


Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  Bit Score: 55.80  E-value: 8.84e-08
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  384 SVGTSVLPINQTVRPGVLPLTQPVGPINRPVGPGVLPVSPSVTPGVLQAVSpgVLSVSRAVPSGVLPAGQMTPAG-QMTP 462
Cdd:cd22553   35 ETHDPLILSPPLSQPQQIITAQSSGSAAGGVAYSVSPAVQTVTVDGHEAIF--IPANSGLLQTNNQQAIQLAPGGtQAIL 112
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  463 AGvipGQTATSGVLPTGQMVQSGVLPV-GQTAPSRV---LPP---GQTAPLRV-ISA--GQVVPSGLLSPNQTVSSSAVV 532
Cdd:cd22553  113 AN---QQTLIRPNTVQGQANASNVLQNiAQIASGGNavqLPLnnmTQTIPVQVpVSTanGQTVYQTIQVPIQAIQSGNAG 189
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  533 PVNQGVNSGVL-QLSQPvvsgvlpvgqpvrpGVLQLNQTVGTNILPVNQPVRPGASQNTTFL------TSGSILRQLIPT 605
Cdd:cd22553  190 GGNQALQAQVIpQLAQA--------------AQLQPQQLAQVSSQGYIQQIPANASQQQPQMvqqgpnQSGQIIGQVASA 255
                        250       260       270       280       290       300       310       320
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  606 -GKQVNGIPTYTLApvSVTLPVPPGGLATVAP-PQMPIQLLPSGAAAPMAGSMPGMPSPPVLVNAAQSVFVQASSSAADT 683
Cdd:cd22553  256 sSIQAAAIPLTVYT--GALAGQNGSNQQQVGQiVTSPIQGMTQGLTAPASSSIPTVVQQQAIQGNPLPPGTQIIAAGQQL 333
                        330       340       350
                 ....*....|....*....|....*....|....*.
gi 74757998  684 NQVLKQAKQWK------------------TCPVCNE 701
Cdd:cd22553  334 QQDPNDPTKWQvvadgtpgskkrlrrvacTCPNCRD 369
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
301-664 1.15e-07

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 56.31  E-value: 1.15e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    301 PSPAAGQPVTVAQGAPGSLTHSPPAAGQSHMTLVSSPLPVGQNSlTLQPPAPQPVFLSHGVPLHQSVNPPVLPLSQPVGP 380
Cdd:pfam03154  149 PSPQDNESDSDSSAQQQILQTQPPVLQAQSGAASPPSPPPPGTT-QAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAP 227
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    381 VNKSVGTSVLPINQ--TVRPGVLPLTQPVGPINRPVGPGVLPVSPSVTPGVLQAVSPGVLSVSRAVPSGVLPAGQMTPAG 458
Cdd:pfam03154  228 HTLIQQTPTLHPQRlpSPHPPLQPMTQPPPPSQVSPQPLPQPSLHGQMPPMPHSLQTGPSHMQHPVPPQPFPLTPQSSQS 307
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    459 QM--TPAGVIPGQTATSGVLPTGQ-MVQSGVLPVGQTAPSRVLP-----PGQTAPLRVISAGQ--------VVPSGLLSP 522
Cdd:pfam03154  308 QVppGPSPAAPGQSQQRIHTPPSQsQLQSQQPPREQPLPPAPLSmphikPPPTTPIPQLPNPQshkhpphlSGPSPFQMN 387
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    523 NQTVSSSAVVPVNQGVNSGVLQLSQPVVSgVLPVGQPVRPGVLQlnqtvgTNILPVNQPVRPGASQNTTFLTSGSILRQl 602
Cdd:pfam03154  388 SNLPPPPALKPLSSLSTHHPPSAHPPPLQ-LMPQSQQLPPPPAQ------PPVLTQSQSLPPPAASHPPTSGLHQVPSQ- 459
                          330       340       350       360       370       380
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 74757998    603 iptgkqvNGIPTYTLAPVSVTLPVPPGGLATVAPPQMPIQLLPSGAAAPMAGSMPGMPSPPV 664
Cdd:pfam03154  460 -------SPFPQHPFVPGGPPPITPPSGPPTSTSSAMPGIQPPSSASVSSSGPVPAAVSCPL 514
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
263-680 6.69e-07

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 53.62  E-value: 6.69e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    263 QTHIAPKPAAHLAAPANGSAPSaPAQPPCFHLALPQNSPSPAAGQPVTVAQGAPGSLTHSPPaagQSHMTLVSSPLPVGQ 342
Cdd:pfam03154  175 QAQSGAASPPSPPPPGTTQAAT-AGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTLIQQTP---TLHPQRLPSPHPPLQ 250
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    343 nSLTLQPPAPQPVFLSHGVPLHQSVNPPvLPLSQPVGP--VNKSVGTSVLPI-NQTVRPGVLPLTQPVGPI---NRPVGP 416
Cdd:pfam03154  251 -PMTQPPPPSQVSPQPLPQPSLHGQMPP-MPHSLQTGPshMQHPVPPQPFPLtPQSSQSQVPPGPSPAAPGqsqQRIHTP 328
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    417 GVLPVSPSVTPGVLQAVSPGVLSVSRAVPSGVLPAGQM-TPAGQMTPAGVipgqtatSGVLPTgQMvqsgvlpvgqtaPS 495
Cdd:pfam03154  329 PSQSQLQSQQPPREQPLPPAPLSMPHIKPPPTTPIPQLpNPQSHKHPPHL-------SGPSPF-QM------------NS 388
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    496 RVLPPGQTAPLRVISAGQVvPSGLLSPnqtvsssavvpvnqgvnsgvLQLsqpvvsgvLPVGQPVRPgvlqlnqtvgtni 575
Cdd:pfam03154  389 NLPPPPALKPLSSLSTHHP-PSAHPPP--------------------LQL--------MPQSQQLPP------------- 426
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    576 lPVNQPvrPGASQNTTFLTSGSilrqliptgkqvNGIPTYTLAPVSVTLPVPPGGLATVAPPQMpiqLLPSGAAAPMAGS 655
Cdd:pfam03154  427 -PPAQP--PVLTQSQSLPPPAA------------SHPPTSGLHQVPSQSPFPQHPFVPGGPPPI---TPPSGPPTSTSSA 488
                          410       420
                   ....*....|....*....|....*
gi 74757998    656 MPGMpSPPVLVNAAQSVFVQASSSA 680
Cdd:pfam03154  489 MPGI-QPPSSASVSSSGPVPAAVSC 512
HOX smart00389
Homeodomain; DNA-binding factors that are involved in the transcriptional regulation of key ...
1043-1096 1.61e-06

Homeodomain; DNA-binding factors that are involved in the transcriptional regulation of key developmental processes


Pssm-ID: 197696 [Multi-domain]  Cd Length: 57  Bit Score: 46.09  E-value: 1.61e-06
                            10        20        30        40        50
                    ....*....|....*....|....*....|....*....|....*....|....
gi 74757998    1043 PKKYEGRSYEEKKQFLKDYFHKKPYPSKKEIELLSSLFWVWKIDVASFFGKRRY 1096
Cdd:smart00389    1 KRRKRTSFTPEQLEELEKEFQKNPYPSREEREELAKKLGLSERQVKVWFQNRRA 54
PHA03247 PHA03247
large tegument protein UL36; Provisional
277-681 1.62e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 52.63  E-value: 1.62e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   277 PANGSAPSAPAQPPCfhlalPQNSPSPAAGQPVTVAQGAPGSLTHSPPA-AGQSHMTLVSSPLPVGQNSLTLQPPAPQPV 355
Cdd:PHA03247 2483 PAEARFPFAAGAAPD-----PGGGGPPDPDAPPAPSRLAPAILPDEPVGePVHPRMLTWIRGLEELASDDAGDPPPPLPP 2557
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   356 FLSHGVPlHQSVnPPVLPLSQPVGPVNKSvgtsvlpinQTVRPGVLPltQPvgpiNRPVGPGVLPVSPsvtpgvlqavsP 435
Cdd:PHA03247 2558 AAPPAAP-DRSV-PPPRPAPRPSEPAVTS---------RARRPDAPP--QS----ARPRAPVDDRGDP-----------R 2609
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   436 GVLSVSRAVPSGVLPAgqmTPAGQMTPAGVIPGQTATSGVLPTGQmvqsgvlPVGQTAPSRVLPPgqtapLRVISAGQvv 515
Cdd:PHA03247 2610 GPAPPSPLPPDTHAPD---PPPPSPSPAANEPDPHPPPTVPPPER-------PRDDPAPGRVSRP-----RRARRLGR-- 2672
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   516 PSGLLSPNQTVSSSAVVPVNQGVNSgvlqLSQPVVSGVLPVGQPvRPGVLQLNQTVGTNILPVNQPVRPGASqnttfLTS 595
Cdd:PHA03247 2673 AAQASSPPQRPRRRAARPTVGSLTS----LADPPPPPPTPEPAP-HALVSATPLPPGPAAARQASPALPAAP-----APP 2742
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   596 GSILRQLIPTGKQVNGIPTYTLAPVSvtlPVPPGGLATVAPPQMPIQLLPSGAAAPMAGSMPGMPSPPVLVNAAQSVFVQ 675
Cdd:PHA03247 2743 AVPAGPATPGGPARPARPPTTAGPPA---PAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALP 2819

                  ....*.
gi 74757998   676 ASSSAA 681
Cdd:PHA03247 2820 PAASPA 2825
Soli_cterm TIGR03437
Solibacter uncharacterized C-terminal domain; This model describes a protein domain found in ...
465-667 2.84e-06

Solibacter uncharacterized C-terminal domain; This model describes a protein domain found in 90 proteins of Solibacter usitatus Ellin6076, nearly always as the C-terminal domain of a much larger protein. No homologs to this domain are detected outside of S. usitatus, a member of the Acidobacteria.


Pssm-ID: 274578 [Multi-domain]  Cd Length: 215  Bit Score: 49.58  E-value: 2.84e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    465 VIPGQTAT---SGVLPTGQMVQSGVLPVgQTAPSRVLPPGQTAPLRVISAGQV---VPSGLLSPNQTVsssaVVpVNQGV 538
Cdd:TIGR03437    2 VAPGSIVSifgTNLAPATLTAAGGPLPT-SLGGVSVTVNGVAAPLLYVSPGQInaqVPYEVAPGAATV----TV-TYNGG 75
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    539 NSGVLQLS-QPVVSGVLPV-GQPVRPGVLQLNQtvGTNILPVNQPVRPGaSQNTTFLTSGSILRQLIPTGKQVNGIPTY- 615
Cdd:TIGR03437   76 ASAAVTVTvAAAAPGIFTLdGSGTGQAAALNNQ--DGSVNSAANPAAPG-DVVVLYATGLGPTSPAVADGAPAPSSPLAp 152
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 74757998    616 TLAPVSVTL-----PVPPGGLATVAPPQMPIQL-LPSGAAA---PMAGSMPGMPSPPVLVN 667
Cdd:TIGR03437  153 ALAPVTVTIggvpaTVLYAGLAPGFVGLYQVNVrVPAGLATgavPVVITVGGVTSNAVTIA 213
DUF4813 pfam16072
Domain of unknown function (DUF4813); This family of proteins is functionally uncharacterized. ...
445-681 5.42e-06

Domain of unknown function (DUF4813); This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 345 and 672 amino acids in length.


Pssm-ID: 435117 [Multi-domain]  Cd Length: 288  Bit Score: 49.37  E-value: 5.42e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    445 PSGVLPAGqmtpaGQMTPAGVIPGqtaTSGVLPTGqmvqsGVlPVGQT----APSRVLPPGQTaplrVISAGQVVPSGLL 520
Cdd:pfam16072   13 PGGYAPAG-----ATYHPAGQVPA---GATYYPSG-----GV-PHGATyypqAPVAAVPAGAT----YLPAGAAIPAGAT 74
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    521 SPNQTVSSSAVVPVNQGVNSG---------VLQLSQPVVSGVLPVGQPVRPGVLQLNQTVGTNILPVNQPvrPGASQNTT 591
Cdd:pfam16072   75 YYPQAPKSSSGLGLGTGLIAGalggailghALTPTQTRVVEHAPSSGGGGGGGGYSNGNNEDKIIIINNG--PPGSVTTT 152
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    592 FLTSGSilrQLIPTGKQVNGiptytlAPVSVTLPVPPGGLATVAPPQMPIQlLPSGAAAPMAGSMPGMPSPPVLVNAAQS 671
Cdd:pfam16072  153 SAGSGT---TVINAGGQQPA------APAAPAYPVAPAAYPAQAPAAAPAP-APGAPQTPLAPLNPVAAAPAAAAGAAAA 222
                          250
                   ....*....|
gi 74757998    672 VFVQASSSAA 681
Cdd:pfam16072  223 PVVAAAAPAA 232
homeodomain cd00086
Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic ...
1044-1100 7.87e-06

Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner.


Pssm-ID: 238039 [Multi-domain]  Cd Length: 59  Bit Score: 44.16  E-value: 7.87e-06
                         10        20        30        40        50
                 ....*....|....*....|....*....|....*....|....*....|....*..
gi 74757998 1044 KKYEGRSYEEKKQFLKDYFHKKPYPSKKEIELLSSLFWVWKIDVASFFGKRRYICMK 1100
Cdd:cd00086    1 RRKRTRFTPEQLEELEKEFEKNPYPSREEREELAKELGLTERQVKIWFQNRRAKLKR 57
SP1-4_arthropods_N cd22553
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ...
260-601 9.20e-06

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.


Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  Bit Score: 49.25  E-value: 9.20e-06
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  260 LLKQTHIAPKPAAHLAAPANGSAPSAPAQPPCFHLALPQNSPSPAAGQP--VTVAQ---GAPGSLTHSPPAAGQShMTL- 333
Cdd:cd22553    1 FNQSQQVAPSELAQVATTASNIGGQQKQAQSDSSETHDPLILSPPLSQPqqIITAQssgSAAGGVAYSVSPAVQT-VTVd 79
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  334 ----VSSPLPVGQNSLTLQPPAPQPVFLSHGVPLHQSVnppvlpLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGP 409
Cdd:cd22553   80 gheaIFIPANSGLLQTNNQQAIQLAPGGTQAILANQQT------LIRPNTVQGQANASNVLQNIAQIASGGNAVQLPLNN 153
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  410 INRPVgPGVLPVSPSVTPGVLQAVSPGVLSVSRAVPSGVLPAGQMTPAGQMTPAGVI-PGQTATsgvlPTGQMVQSGVLP 488
Cdd:cd22553  154 MTQTI-PVQVPVSTANGQTVYQTIQVPIQAIQSGNAGGGNQALQAQVIPQLAQAAQLqPQQLAQ----VSSQGYIQQIPA 228
                        250       260       270       280       290       300       310       320
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  489 VGQTAPSRVLPPGQTaplrviSAGQVVPSGL-LSPNQTVSSSAVVPVNQGVNSGVLQLSQPVVSGVLPVGQPVRPGVLQL 567
Cdd:cd22553  229 NASQQQPQMVQQGPN------QSGQIIGQVAsASSIQAAAIPLTVYTGALAGQNGSNQQQVGQIVTSPIQGMTQGLTAPA 302
                        330       340       350
                 ....*....|....*....|....*....|....
gi 74757998  568 NQTVGTNILPvNQPVRPGASQNTTFLTSGSILRQ 601
Cdd:cd22553  303 SSSIPTVVQQ-QAIQGNPLPPGTQIIAAGQQLQQ 335
PHA03247 PHA03247
large tegument protein UL36; Provisional
266-434 1.18e-05

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 49.94  E-value: 1.18e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   266 IAPKPAAHLAAPANGSAPSAPAQPPCFHLA--------LPQNSP--SPAAGQPVTVAQGAPGSLTHSPPAAGQSHMTLVS 335
Cdd:PHA03247 2828 LPPPTSAQPTAPPPPPGPPPPSLPLGGSVApggdvrrrPPSRSPaaKPAAPARPPVRRLARPAVSRSTESFALPPDQPER 2907
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   336 SPLPVGQNSLTLQPPAPQPVFLSHGVPLHQSVNPPVLPLSQPvGPVNKSVGTSVLPINQTVRPGVLPLTQPVGPINRPVG 415
Cdd:PHA03247 2908 PPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDP-AGAGEPSGAVPQPWLGALVPGRVAVPRFRVPQPAPSR 2986
                         170
                  ....*....|....*....
gi 74757998   416 PGVLPVSPSVTPGVLQAVS 434
Cdd:PHA03247 2987 EAPASSTPPLTGHSLSRVS 3005
PPE COG5651
PPE-repeat protein [Function unknown];
362-564 2.39e-05

PPE-repeat protein [Function unknown];


Pssm-ID: 444372 [Multi-domain]  Cd Length: 385  Bit Score: 47.97  E-value: 2.39e-05
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  362 PLHQSVNPPVLPLSQPVGPVNKSVGTSV--LPINQTVRPGVLPLTQPVGPINRPVGPGVLPVSPSVTPGVLQAVSPGVLS 439
Cdd:COG5651  170 PPPTITNPGGLLGAQNAGSGNTSSNPGFanLGLTGLNQVGIGGLNSGSGPIGLNSGPGNTGFAGTGAAAGAAAAAAAAAA 249
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  440 VSRAVPSGVLPAGQMTPAGQMTPAGVIPGQTATSGVLPTGQMVQSGVLPVGQTAPSRVLPPGQTAPLRVISAGqVVPSGL 519
Cdd:COG5651  250 AAGAGASAALASLAATLLNASSLGLAATAASSAATNLGLAGSPLGLAGGGAGAAAATGLGLGAGGAAGAAGAT-GAGAAL 328
                        170       180       190       200
                 ....*....|....*....|....*....|....*....|....*
gi 74757998  520 LSPNQTVSSSAVVPVNQGVNSGVLQLSQPVVSGVLPVGQPVRPGV 564
Cdd:COG5651  329 GAGAAAAAAGAAAGAGAAAAAAAGGAGGGGGGALGAGGGGGSAGA 373
PHA02682 PHA02682
ORF080 virion core protein; Provisional
268-375 3.76e-05

ORF080 virion core protein; Provisional


Pssm-ID: 177464 [Multi-domain]  Cd Length: 280  Bit Score: 46.78  E-value: 3.76e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   268 PKPAAHLAAPANGSAPSAPAQPPCFHLALPQNSPSPAAGQPvtvAQGAPGSLTHSPPAagqshmtlvsSPLPvgqnslTL 347
Cdd:PHA02682   96 PACAPAAPAPAVTCPAPAPACPPATAPTCPPPAVCPAPARP---APACPPSTRQCPPA----------PPLP------TP 156
                          90       100
                  ....*....|....*....|....*....
gi 74757998   348 QP-PAPQPVFlshgvpLHQSVNPPVLPLS 375
Cdd:PHA02682  157 KPaPAAKPIF------LHNQLPPPDYPAA 179
PRK10263 PRK10263
DNA translocase FtsK; Provisional
268-527 6.76e-05

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 47.39  E-value: 6.76e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   268 PKPAAHLAAPAngSAPSAPAQPPCFHLALPQNSPSPAAGQPVTVAQGAPGSLTHSPPAAGQ---SHMTLVSSPLPVGQNS 344
Cdd:PRK10263  362 PVPGPQTGEPV--IAPAPEGYPQQSQYAQPAVQYNEPLQQPVQPQQPYYAPAAEQPAQQPYyapAPEQPAQQPYYAPAPE 439
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   345 LTL-----QPPAPQPVFLSHgvPLHQSVNPPVLPLSQPVG-----PVNKSVGTSVLPINQTVRPGVLPL----------- 403
Cdd:PRK10263  440 QPVagnawQAEEQQSTFAPQ--STYQTEQTYQQPAAQEPLyqqpqPVEQQPVVEPEPVVEETKPARPPLyyfeeveekra 517
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   404 ---------TQPV-GPI--NRPVGPGVLPVSPSVTPGVLQAvsPGVLSVSRAVPSGVLPAGqmTPAGQMTPAGvipgqTA 471
Cdd:PRK10263  518 rereqlaawYQPIpEPVkePEPIKSSLKAPSVAAVPPVEAA--AAVSPLASGVKKATLATG--AAATVAAPVF-----SL 588
                         250       260       270       280       290
                  ....*....|....*....|....*....|....*....|....*....|....*.
gi 74757998   472 TSGVLPTGQmVQSGVLPvGQTAPSRVlppgqtaplRVISAGQVVPSGLLSPNQTVS 527
Cdd:PRK10263  589 ANSGGPRPQ-VKEGIGP-QLPRPKRI---------RVPTRRELASYGIKLPSQRAA 633
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
264-434 1.61e-04

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 45.91  E-value: 1.61e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    264 THIAPKPAAH-LAAPANGSAPSApaQPPCFHLaLPQNS--PSPAAgQPVTVAQgapgSLTHSPPAAgqshmtlvSSPLPV 340
Cdd:pfam03154  388 SNLPPPPALKpLSSLSTHHPPSA--HPPPLQL-MPQSQqlPPPPA-QPPVLTQ----SQSLPPPAA--------SHPPTS 451
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    341 GQNSLTLQPPAPQPVFLSHGVPL------HQSVNPPVLPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGPINRPV 414
Cdd:pfam03154  452 GLHQVPSQSPFPQHPFVPGGPPPitppsgPPTSTSSAMPGIQPPSSASVSSSGPVPAAVSCPLPPVQIKEEALDEAEEPE 531
                          170       180
                   ....*....|....*....|
gi 74757998    415 GPGVLPVSPSVTPGVLQAVS 434
Cdd:pfam03154  532 SPPPPPRSPSPEPTVVNTPS 551
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
267-413 2.30e-04

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 45.41  E-value: 2.30e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    267 APKPAAhlaaPANGSAPSAPAQPPCFHLALPQNSPSPAAGQ--PVTVAQGAPGSLTHSPPAA----GQSHMTLVSSPLPV 340
Cdd:pfam09770  207 AKKPAQ----QPAPAPAQPPAAPPAQQAQQQQQFPPQIQQQqqPQQQPQQPQQHPGQGHPVTilqrPQSPQPDPAQPSIQ 282
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 74757998    341 GQNSLTLQPPAPQPVflshgVPLHQSVNPPVLPLSQPVGPVNKSVGTSVLPINQTVR-PGVLPltQPVGPINRP 413
Cdd:pfam09770  283 PQAQQFHQQPPPVPV-----QPTQILQNPNRLSAARVGYPQNPQPGVQPAPAHQAHRqQGSFG--RQAPIITHP 349
PHA03247 PHA03247
large tegument protein UL36; Provisional
270-665 2.74e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 45.31  E-value: 2.74e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   270 PAAHLAAP-ANGSAPSAPAQPPCFHLALPQNSPSPAAGQPVTV----------------AQGAPGSLTHS--PPAAGQSH 330
Cdd:PHA03247 2489 PFAAGAAPdPGGGGPPDPDAPPAPSRLAPAILPDEPVGEPVHPrmltwirgleelasddAGDPPPPLPPAapPAAPDRSV 2568
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   331 MTLVSSPLPVGqnsltlqpPAPQPVFLSHGVPLHQsvNPPVLPLSQPVGPVNKSVGTSVLPINQTVRPgvlPLTQPVGPI 410
Cdd:PHA03247 2569 PPPRPAPRPSE--------PAVTSRARRPDAPPQS--ARPRAPVDDRGDPRGPAPPSPLPPDTHAPDP---PPPSPSPAA 2635
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   411 NRPVGPGVLPVSPSVTPGvlQAVSPGVLSVSRAVPSGVLPAGQMTPAGQMTPAGVIPGqtatsgVLPTGQMVQSGVLPVG 490
Cdd:PHA03247 2636 NEPDPHPPPTVPPPERPR--DDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPT------VGSLTSLADPPPPPPT 2707
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   491 QTAPSRVLPPGQTAPLRVISAGQVVPSGLLSPnqtvsssavvpvnqgvnsgvlqLSQPVVSGVLPVGQPVRPGVLQLNQT 570
Cdd:PHA03247 2708 PEPAPHALVSATPLPPGPAAARQASPALPAAP----------------------APPAVPAGPATPGGPARPARPPTTAG 2765
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   571 VGTNILPVNQPVRPGASQNTTFLTSGSILRQLIPTGKQVngiptytlAPVSVTLPVPPGGLATVAPPQMPiqLLPSGAAA 650
Cdd:PHA03247 2766 PPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDP--------ADPPAAVLAPAAALPPAASPAGP--LPPPTSAQ 2835
                         410
                  ....*....|....*
gi 74757998   651 PMAGSMPGMPSPPVL 665
Cdd:PHA03247 2836 PTAPPPPPGPPPPSL 2850
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
313-711 3.76e-04

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 44.61  E-value: 3.76e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    313 QGAPGSLTHSPPAAGQSHMtlVSSPLPVGQN--SLTLQPPAPQPVFLSHGVPLHQSVNPPVL--PLSQPVGPVNKSVGTS 388
Cdd:pfam09606   60 QQQPQGGQGNGGMGGGQQG--MPDPINALQNlaGQGTRPQMMGPMGPGPGGPMGQQMGGPGTasNLLASLGRPQMPMGGA 137
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    389 VLPINQTvrpGVLPLTQPVGpinrpVGPGVLPVSPSVTPGVLQAvspgvlsvsravpsgvlPAGQMTPAGQMTPaGVIPG 468
Cdd:pfam09606  138 GFPSQMS---RVGRMQPGGQ-----AGGMMQPSSGQPGSGTPNQ-----------------MGPNGGPGQGQAG-GMNGG 191
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    469 QTATSGVLPTGQMVQSGVL-------PVGQTAPSRVLPP---GQTAPLRVISAGQVVPSGllsPNQTVSSSAVVPVNQgV 538
Cdd:pfam09606  192 QQGPMGGQMPPQMGVPGMPgpadagaQMGQQAQANGGMNpqqMGGAPNQVAMQQQQPQQQ---GQQSQLGMGINQMQQ-M 267
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    539 NSGVLQLSQPVVSGVLPVGQPVRPGVLQLNQTVGTNILPVNQPVRPgasqnttfltsgsilRQlipTGKQVNGIPTytlA 618
Cdd:pfam09606  268 PQGVGGGAGQGGPGQPMGPPGQQPGAMPNVMSIGDQNNYQQQQTRQ---------------QQ---QQQGGNHPAA---H 326
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    619 PVSVTLPVPPGGLATVAPPQMPIQLLPSGA-----AAPMAGSMPGMPSPPVLVNAAQSVFVQasssaadTNQVLKQAKQw 693
Cdd:pfam09606  327 QQQMNQSVGQGGQVVALGGLNHLETWNPGNfgglgANPMQRGQPGMMSSPSPVPGQQVRQVT-------PNQFMRQSPQ- 398
                          410
                   ....*....|....*...
gi 74757998    694 ktcpvcnelfPSNVYQVH 711
Cdd:pfam09606  399 ----------PSVPSPQG 406
SP4_N cd22536
N-terminal domain of transcription factor Specificity Protein (SP) 4; Specificity Proteins ...
277-670 5.62e-04

N-terminal domain of transcription factor Specificity Protein (SP) 4; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. Human SP4 is a risk gene of multiple psychiatric disorders including schizophrenia, bipolar disorder, and major depression. SP4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP4.


Pssm-ID: 411773 [Multi-domain]  Cd Length: 623  Bit Score: 44.14  E-value: 5.62e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  277 PANGSAPSAPAQppcFHLALPQNSPSPAAGQPVTVA---QGAPGSLTHSPPAAGQSHMTLVSSP--LPVGQNSLTLQPPA 351
Cdd:cd22536  115 KAGNSNASAPGQ---FQVIQVQNMQNPSGSVQYQVIpqiQTVEGQQIQISPANATALQDLQGQIqlIPAGNNQAILTTPN 191
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  352 PQP-------VFLSHGVPLHqsVNPPV-LPLSQPVGPVNKSVGTSVLPINQtvrpGVLPLTQPVgpINRPVGPG-----V 418
Cdd:cd22536  192 RTAsgniiaqNLANQTVPVQ--IRPGVsIPLQLQTIPGAQAQVVTTLPINI----GGVTLALPV--INNVAAGGgsgqlV 263
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  419 LPVSPSVTPGVLQAVSPGVLSVSRAVPSgvlpagqmtpagqmTPAGVIPGQTATSGVLPTGQMVQSGVLPVGQ--TAPSR 496
Cdd:cd22536  264 QPSDGGVSNGNQLVSTPITTASVSTMPE--------------SPSSSTTCTTTASTSLTSSDTLVSSAETGQYasTAASS 329
                        250       260       270       280       290       300       310       320
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  497 VL----PPGQTAPLRVISAGQVVPSGLLS-PNQTVSSSAVVPVNQGVNSgVLQLSQPVVSgVLPVGQPVRPgVLQLNQTV 571
Cdd:cd22536  330 ERteeePQTSAAESEAQSSSQLQSNGLQNvQDQSNSLQQVQIVGQPILQ-QIQIQQPQQQ-IIQAIQPQSF-QLQSGQTI 406
                        330       340       350       360       370       380       390       400
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  572 GTNILPVNQPVRPGASQNTT-------FLT-SGSI---------LRQLIPTGKQVNGIPTY-TLAPVSVTlpvppGGLAT 633
Cdd:cd22536  407 QTIQQQPLQNVQLQAVQSPTqvlirapTLTpSGQIswqtvqvqnIQSLSNLQVQNAGLPQQlTLTPVSSS-----AGGTT 481
                        410       420       430
                 ....*....|....*....|....*....|....*..
gi 74757998  634 VAppqmpiQLLPsgaaAPMAGSmpgmpspPVLVNAAQ 670
Cdd:cd22536  482 IA------QIAP----VAVAGT-------PITLNAAQ 501
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
263-441 5.86e-04

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 44.10  E-value: 5.86e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   263 QTHIAPKPAAHLAAPANGSAPSAPAQPPCFHLALPqnSPSPAAGQPVTVAQGAPGSLTHSPPAAgqshmtlvSSPLPVGQ 342
Cdd:PRK12323  392 PAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARR--SPAPEALAAARQASARGPGGAPAPAPA--------PAAAPAAA 461
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   343 NSLTLQPPAPQPVFLSHGVPLHQSVNPPV-----------LPLSQPV-GPVNKSVGTSVLPINQTVRPGVLPL-----TQ 405
Cdd:PRK12323  462 ARPAAAGPRPVAAAAAAAPARAAPAAAPApadddpppweeLPPEFASpAPAQPDAAPAGWVAESIPDPATADPddafeTL 541
                         170       180       190
                  ....*....|....*....|....*....|....*.
gi 74757998   406 PVGPINRPVGPGVLPVSPSVTPGVLQAVSPGVLSVS 441
Cdd:PRK12323  542 APAPAAAPAPRAAAATEPVVAPRPPRASASGLPDMF 577
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
247-533 7.87e-04

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 43.60  E-value: 7.87e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    247 LRSVISEHIKRtglLKQTHIAPKPAAHLAAPANgsAPSAPAQPPCFHLALP------QNSPS----PAAGQPVTV----- 311
Cdd:pfam03154  231 IQQTPTLHPQR---LPSPHPPLQPMTQPPPPSQ--VSPQPLPQPSLHGQMPpmphslQTGPShmqhPVPPQPFPLtpqss 305
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    312 -AQGAPGSLTHSPPAAGQSHMTLVSSPLPVGQNSLTLQPPAPQPVFLSHGVPlhqsvnPPVLPLSQpvgpvnksvgtsvL 390
Cdd:pfam03154  306 qSQVPPGPSPAAPGQSQQRIHTPPSQSQLQSQQPPREQPLPPAPLSMPHIKP------PPTTPIPQ-------------L 366
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    391 PINQTVR-PGVLPLTQPVG-PINRPVGPGVLPVS-------PSVTPGVLQAVSPGV-LSVSRAVPSGVLPAGQMTPAGQM 460
Cdd:pfam03154  367 PNPQSHKhPPHLSGPSPFQmNSNLPPPPALKPLSslsthhpPSAHPPPLQLMPQSQqLPPPPAQPPVLTQSQSLPPPAAS 446
                          250       260       270       280       290       300       310
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 74757998    461 TPAGVIPGQTATSGVLPTGQMVQSGvlpvgqtaPSRVLPPGQTAPlrviSAGQVVPSGLLSPNQTVSSSAVVP 533
Cdd:pfam03154  447 HPPTSGLHQVPSQSPFPQHPFVPGG--------PPPITPPSGPPT----STSSAMPGIQPPSSASVSSSGPVP 507
PPE COG5651
PPE-repeat protein [Function unknown];
431-684 1.10e-03

PPE-repeat protein [Function unknown];


Pssm-ID: 444372 [Multi-domain]  Cd Length: 385  Bit Score: 42.57  E-value: 1.10e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  431 QAVSpgvLSVSRAVPSGVLPAGQMTPAGQMTPAGVIPGQTATS---GVLPTGQMVQSGVLPVGQTAPSRVLPPGQTAPLR 507
Cdd:COG5651  155 AAAS---AAAVALTPFTQPPPTITNPGGLLGAQNAGSGNTSSNpgfANLGLTGLNQVGIGGLNSGSGPIGLNSGPGNTGF 231
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  508 VISAGQVVPSGLLSPNQTVSSSAVVPVNQGVNSGVLQLSQPVVSGVLPvgqpvrpgvlqlNQTVGTNILPVNQPVRPGAS 587
Cdd:COG5651  232 AGTGAAAGAAAAAAAAAAAAGAGASAALASLAATLLNASSLGLAATAA------------SSAATNLGLAGSPLGLAGGG 299
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  588 QNTTFLTSGSilrqliptgkqvNGIPTYTLAPVSVTLPVPPGGLATVAPPQMPIQLLPSGAAAPMAGSMPGMPSPPVLVN 667
Cdd:COG5651  300 AGAAAATGLG------------LGAGGAAGAAGATGAGAALGAGAAAAAAGAAAGAGAAAAAAAGGAGGGGGGALGAGGG 367
                        250
                 ....*....|....*..
gi 74757998  668 AAQSVFVQASSSAADTN 684
Cdd:COG5651  368 GGSAGAAAGAASGGGAA 384
PRK07994 PRK07994
DNA polymerase III subunits gamma and tau; Validated
269-433 1.69e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236138 [Multi-domain]  Cd Length: 647  Bit Score: 42.55  E-value: 1.69e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   269 KPAAHLAAPANGSAPSAPAqppcfhlALPQNSPSPAAGQPVTVAQGAPGSLTHSPPAAGQSHMTLVSSPLPVGQNSLTLQ 348
Cdd:PRK07994  360 HPAAPLPEPEVPPQSAAPA-------ASAQATAAPTAAVAPPQAPAVPPPPASAPQQAPAVPLPETTSQLLAARQQLQRA 432
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   349 PPAPQPvflshgvplhqsvnppvlPLSQPVGPVNKSVGTSVLPINQTVRPgvLPLTQPVGPIN------RPVGPGVLPVS 422
Cdd:PRK07994  433 QGATKA------------------KKSEPAAASRARPVNSALERLASVRP--APSALEKAPAKkeayrwKATNPVEVKKE 492
                         170
                  ....*....|.
gi 74757998   423 PSVTPGVLQAV 433
Cdd:PRK07994  493 PVATPKALKKA 503
PHA03378 PHA03378
EBNA-3B; Provisional
267-488 5.87e-03

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 40.82  E-value: 5.87e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   267 APKPAAHLAA-------PANGSAPSAPAQPPCFHLALPQNSPSPA---AGQPVTVAQGAPGSLTHSPPAAGQSHMTlvSS 336
Cdd:PHA03378  700 APTPMRPPAAppgraqrPAAATGRARPPAAAPGRARPPAAAPGRArppAAAPGRARPPAAAPGRARPPAAAPGAPT--PQ 777
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   337 PLPVGQNSLTLQP---PAPQPVflSHGVPLHQSVNPPVLPLSQpvGPVNKSVGTSVLPINQTVRPGV-----LPLTQPVG 408
Cdd:PHA03378  778 PPPQAPPAPQQRPrgaPTPQPP--PQAGPTSMQLMPRAAPGQQ--GPTKQILRQLLTGGVKRGRPSLkkpaaLERQAAAG 853
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998   409 PINRPVGPGVLPV--SPSVTPGVLQAVS-PGVLSVSRAVPSGVLP------AGQMTPAGQMTPAGVIPGQTATSGVLPTG 479
Cdd:PHA03378  854 PTPSPGSGTSDKIvqAPVFYPPVLQPIQvMRQLGSVRAAAASTVTqapteyTGERRGVGPMHPTDIPPSKRAKTDAYVES 933

                  ....*....
gi 74757998   480 QMVQSGVLP 488
Cdd:PHA03378  934 QPPHGGQSH 942
half-pint TIGR01645
poly-U binding splicing factor, half-pint family; The proteins represented by this model ...
406-522 7.07e-03

poly-U binding splicing factor, half-pint family; The proteins represented by this model contain three RNA recognition motifs (rrm: pfam00076) and have been characterized as poly-pyrimidine tract binding proteins associated with RNA splicing factors. PUF60 (GP|6176532), in complex with p54 and in the presence of U2AF, facilitates association of U2 snRNP with pre-mRNA.


Pssm-ID: 130706 [Multi-domain]  Cd Length: 612  Bit Score: 40.44  E-value: 7.07e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998    406 PVGPINRPVGPGVLPVSPSVTPGvlqAVSPGVLSVSRAVPSGVLPAGQMTPAgqmTPAGVIPGQTATSGVLPTGQMVQSG 485
Cdd:TIGR01645  284 PPDALLQPATVSAIPAAAAVAAA---AATAKIMAAEAVAGAAVLGPRAQSPA---TPSSSLPTDIGNKAVVSSAKKEAEE 357
                           90       100       110
                   ....*....|....*....|....*....|....*..
gi 74757998    486 VLPVGQTAPSRVLPPGQTAPLRVISAGQVVPSGLLSP 522
Cdd:TIGR01645  358 VPPLPQAAPAVVKPGPMEIPTPVPPPGLAIPSLVAPP 394
KLF10_11_N cd21974
N-terminal domain of Kruppel-like factor (KLF) 10, KLF11, and similar proteins; This subfamily ...
321-458 1.00e-02

N-terminal domain of Kruppel-like factor (KLF) 10, KLF11, and similar proteins; This subfamily is composed of Kruppel-like factor or Krueppel-like factor (KLF) 10, KLF11, and similar proteins. KLF10 was first identified in human osteoblasts and plays a role in mediating estrogen (E2) signaling in bone and skeletal homeostasis and a regulatory role in tumor formation and metastasis. KLF11 is involved in cell growth, apoptosis, cellular inflammation and differentiation, endometriosis, and cholesterol, prostaglandin, neurotransmitter, fat, and sugar metabolism. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved α-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF10/11 belong to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide, R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF10, KLF11, and similar proteins.


Pssm-ID: 409243 [Multi-domain]  Cd Length: 229  Bit Score: 38.76  E-value: 1.00e-02
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  321 HSPPAAGQSHMTLVSSPLPV--------GQNSLTLQPPAPQPVFLSHGVPLHQSVNPPVLP---LSQPVgPVNKSVGTSV 389
Cdd:cd21974   62 YSPPFFEASHSPSVASLHPPsaassqppPEPESSEPPAASPQRAQATSVIRHTADPVPVSPppvLCQML-PVSSSSGVIV 140
                         90       100       110       120       130       140       150
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 74757998  390 LPINQTVRPGVLPLTQPVGPinrpvgPGVLPVSPSVTPG-VLQAVSPGvlsvsrAVPSGVLPAGQMTPAG 458
Cdd:cd21974  141 AFLKAPQQPSPQPQKPALPQ------PQVVLVGGQVPQGpVMLVVPQP------AVPQPYVQPTVVTPGG 198
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D):384-8.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D):265-8.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D):200-3.